Ben's technical blog

Sun, 15 Sep 2013

The terrible state of EFI variable storage

EFI includes support for storing persistent 'variables': it implements a key-value store where keys are (UUID, string) and values are arbitrary blobs. (There are also some flags that restrict when the variables can be accessed.) The physical storage medium is normally flash. The Linux efivars module provides access to variable storage, both through a specific interface in sysfs and through the somewhat abstracted pstore interface.
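To make the (UUID, string) key shape concrete, here is a minimal Python sketch that decodes one variable as it is exposed by efivarfs. This assumes the newer efivarfs filesystem (usually mounted at /sys/firmware/efi/efivars) rather than the older sysfs interface mentioned above. There, each file is named Name-GUID and its content begins with a 4-byte little-endian attribute word followed by the value. The EfiVariable type and parse_efivarfs_entry function are names made up for this illustration.

```python
import struct
import uuid
from typing import NamedTuple

# Attribute bits defined by the UEFI specification.
EFI_VARIABLE_NON_VOLATILE = 0x1
EFI_VARIABLE_BOOTSERVICE_ACCESS = 0x2
EFI_VARIABLE_RUNTIME_ACCESS = 0x4


class EfiVariable(NamedTuple):
    """One entry in the key-value store: key is (guid, name)."""
    guid: uuid.UUID
    name: str
    attributes: int
    data: bytes


def parse_efivarfs_entry(filename: str, content: bytes) -> EfiVariable:
    """Decode an efivarfs file name and content (hypothetical helper)."""
    # The GUID is the last 36 characters; a hyphen separates it from the name.
    if len(filename) < 38 or filename[-37] != "-":
        raise ValueError("expected a file name of the form Name-GUID")
    name, guid_text = filename[:-37], filename[-36:]
    if len(content) < 4:
        raise ValueError("content is too short to hold the attribute word")
    (attributes,) = struct.unpack_from("<I", content)
    return EfiVariable(uuid.UUID(guid_text), name, attributes, content[4:])


# Example with made-up content for the global BootOrder variable.
example = parse_efivarfs_entry(
    "BootOrder-8be4df61-93ca-11d2-aa0d-00e098032b8c",
    b"\x07\x00\x00\x00\x01\x00\x00\x00",
)
```

On a real system the file name and content would come from reading a file under the efivarfs mount point; the bytes above are invented so that the sketch can run anywhere.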

An EFI implementation reads the boot order and (if applicable) Secure Boot configuration from specific EFI variables. An operating system can add arbitrary variables, using its own UUID(s). On Linux, the efibootmgr utility is used to read and write the boot order and the pstore interface is used within the kernel to dump kernel log messages or other information when particular events occur.

Since efivars is needed to set the boot order on EFI systems, it's a critical part of installation. But on Debian systems the module wasn't being loaded until efibootmgr was run. This doesn't work so well when using a rescue system, as the installation kernel and installed kernel may be different (Debian bug 703363). From Debian package version 3.2.41-1 onwards, efivars will be loaded automatically on EFI systems.

But now comes a new problem: bugs in the implementation of variable storage can prevent a system from booting, and even worse, they may prevent manual recovery without special equipment (the system is 'bricked'). Matthew Garrett recently talked about this and mentioned that some Samsung laptops are bricked if the variable storage becomes more than 50% full. efivars now has some extra checks to avoid triggering this (both in mainline and in stable branches). This isn't the end of the story, unfortunately.

Remember that nothing can be overwritten in flash without erasing a relatively large block, and that writes may fail after a few thousand erase cycles (number depending on the type of flash used). Therefore, deleting a variable usually does not free storage immediately, and overwriting a variable usually reduces the amount of free storage. In the case of the Samsung laptops, unused storage is apparently reclaimed at the next reboot.
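The effect can be illustrated with a toy model in Python. This is only a sketch of the general append-only pattern, not the behaviour of any real firmware; the AppendOnlyStore class and its method names are invented for the example, and sizes count only the value bytes.

```python
class AppendOnlyStore:
    """Toy model of a flash-backed variable store (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []  # each record is (key, value, live)

    def free(self):
        # Stale records still occupy flash until the block is erased.
        return self.capacity - sum(len(value) for _, value, _ in self.records)

    def _invalidate(self, key):
        # Mark any existing record for this key as stale; nothing is erased.
        self.records = [(k, v, live and k != key) for k, v, live in self.records]

    def write(self, key, value):
        if len(value) > self.free():
            raise OSError("no space left in variable store")
        self._invalidate(key)
        self.records.append((key, value, True))

    def delete(self, key):
        self._invalidate(key)

    def reclaim(self):
        # Models erasing the block and rewriting only the live records.
        self.records = [r for r in self.records if r[2]]


store = AppendOnlyStore(100)
store.write("a", b"x" * 30)
free_after_first_write = store.free()   # 70
store.write("a", b"y" * 30)
free_after_overwrite = store.free()     # 40: the old value is still in flash
store.delete("a")
free_after_delete = store.free()        # 40: deleting frees nothing yet
store.reclaim()
free_after_reclaim = store.free()       # 100
```

The only step that increases free space in this model is reclaim(), which stands in for whatever garbage collection the firmware chooses to perform, and whenever it chooses to perform it.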

However, while testing the backported efivars changes I found that my EFI-booting system (Asus P8Z68-V LX motherboard) does not reclaim unused storage until it's very nearly full. Since the firmware itself increments a counter variable on every boot, using 36 more bytes of storage, I was able to observe the available space dropping to under 100 bytes, and then after another boot jumping to about 42K (out of 64K). This means that well over 50% of storage was unused, yet efivars would refuse to write any variables because the firmware didn't reclaim it. This would prevent (re)installation of Linux, as the boot configuration could not be written.
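A back-of-the-envelope calculation shows how long such a system spends above the 50% mark. The Python sketch below takes the rounded figures from the paragraph above at face value and assumes the boot counter is the only thing consuming space, so the results are rough estimates rather than measurements.

```python
TOTAL = 64 * 1024                 # size of the variable store
FREE_AFTER_RECLAIM = 42 * 1024    # "about 42K" free after garbage collection
FREE_BEFORE_RECLAIM = 100         # "under 100 bytes"; used here as an upper bound
BYTES_PER_BOOT = 36               # consumed by the firmware's boot counter

# Boots needed to go from a freshly reclaimed store to nearly full.
boots_between_reclaims = (FREE_AFTER_RECLAIM - FREE_BEFORE_RECLAIM) // BYTES_PER_BOOT

# Smallest number of boots after which less than half the store is free,
# i.e. the point where a 50% limit would start refusing writes.
boots_until_below_half = (FREE_AFTER_RECLAIM - TOTAL // 2) // BYTES_PER_BOOT + 1

print(boots_between_reclaims, boots_until_below_half)
```

Under these assumptions the store is reclaimed roughly once every 1,190 boots, but a 50% limit would begin refusing writes after fewer than 300 of them, so writes would be refused for most of each cycle.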

On the assumption that both behaviours are common, I've settled on a compromise that I hope minimises overall risk.

But it's essential that we find a way to identify and work safely with both behaviours. Not only do I want to eliminate the possibility of bricking or installation failure, but I want to make EFI-backed pstore safe to enable. Crash logs are often hard to obtain and pstore can help to solve that problem.

Updated 2013-09-15: The free space check has been revised (in Linux 3.2.48, 3.5.7.17, 3.8.13.5, 3.9.7 and 3.10) based on advice from Samsung. The kernel leaves at least 5K (rather than 50%) of the EFI variable store free. If it is about to breach this limit, it attempts to write a dummy variable that is larger than the free space, which will either fail or trigger garbage collection. The normal variable can then be written if garbage collection freed up enough space. This check can be disabled using the kernel parameter efi_no_storage_paranoia, but there should be no need to do so.
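The logic of the revised check can be sketched as a small Python simulation. This is not the kernel code: ToyStore, try_write and guarded_write are invented names, and the toy firmware is deliberately simplified so that an oversized write always fails but triggers garbage collection as a side effect.

```python
MIN_FREE = 5 * 1024  # space the check tries to keep free


class ToyStore:
    """Toy variable store with unreclaimed garbage (illustrative only)."""

    def __init__(self, capacity, live, garbage):
        self.capacity = capacity
        self.live = live        # bytes held by current variables
        self.garbage = garbage  # bytes held by stale data not yet reclaimed

    def free(self):
        return self.capacity - self.live - self.garbage

    def try_write(self, size):
        # Assumption: a write that cannot fit fails, but makes this toy
        # firmware reclaim its garbage.
        if size > self.free():
            self.garbage = 0
            return False
        self.live += size
        return True


def guarded_write(store, size):
    """Write 'size' bytes only if at least MIN_FREE would remain free."""
    if store.free() - size < MIN_FREE:
        # Provoke garbage collection with a dummy variable that is larger
        # than the free space, then look at the free space again.
        store.try_write(store.free() + 1)
        if store.free() - size < MIN_FREE:
            return False
    return store.try_write(size)


# A store that looks nearly full but is mostly garbage: the write succeeds.
mostly_garbage = ToyStore(64 * 1024, live=22 * 1024, garbage=41 * 1024)
wrote_after_gc = guarded_write(mostly_garbage, 100)

# A store that is genuinely nearly full: the write is refused.
really_full = ToyStore(64 * 1024, live=60 * 1024, garbage=0)
wrote_when_full = guarded_write(really_full, 100)
```

The point of the design is that the check no longer needs to know in advance which kind of firmware it is dealing with: it asks for garbage collection only when the limit is about to be breached, and decides based on the free space reported afterwards.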

posted at: 22:24