Authenticated Boot and Disk Encryption on Linux

TL;DR: Linux has been supporting Full Disk Encryption (FDE) and technologies such as UEFI SecureBoot and TPMs for a long time. However, the way they are set up by most distributions is not as secure as they should be, and in some ways quite frankly weird. In fact, right now, your data is probably more secure if stored on current ChromeOS, Android, Windows or MacOS devices, than it is on typical Linux distributions.

Generic Linux distributions (i.e. Debian, Fedora, Ubuntu, …) adopted Full Disk Encryption (FDE) more than 15 years ago, with the LUKS/cryptsetup infrastructure. It was a big step forward to a more secure environment. Almost ten years ago the big distributions started adding UEFI SecureBoot to their boot process. Support for Trusted Platform Modules (TPMs) has been added to the distributions a long time ago as well — but even though many PCs/laptops these days have TPM chips on-board it's generally not used in the default setup of generic Linux distributions.

How these technologies currently fit together on generic Linux distributions doesn't really make too much sense to me — and falls short of what they could actually deliver. In this story I'd like to have a closer look at why I think that, and what I propose to do about it.

The Basic Technologies

Let's have a closer look what these technologies actually deliver:

LUKS/dm-crypt/cryptsetup provide disk encryption, and optionally data authentication. Disk encryption means that reading the data in clear-text form is only possible if you possess a secret of some form, usually a password/passphrase. Data authentication means that no one can make changes to the data on disk unless they possess a secret of some form. Most distributions only enable the former though — the latter is a more recent addition to LUKS/cryptsetup, and is not used by default on most distributions (though it probably should be). Closely related to LUKS/dm-crypt is dm-verity (which can authenticate immutable volumes) and dm-integrity (which can authenticate writable volumes, among other things).
1. https://linux.die.net/man/8/cryptsetup
2. https://github.com/martinjungblut/go-cryptsetup
UEFI SecureBoot provides mechanisms for authenticating boot loaders and other pre-OS binaries before they are invoked. If those boot loaders then authenticate the next step of booting in a similar fashion there's a chain of trust which can ensure that only code that has some level of trust associated with it will run on the system. Authentication of boot loaders is done via cryptographic signatures: the OS/boot loader vendors cryptographically sign their boot loader binaries. The cryptographic certificates that may be used to validate these signatures are then signed by Microsoft, and since Microsoft's certificates are basically built into all of today's PCs and laptops this will provide some basic trust chain: if you want to modify the boot loader of a system you must have access to the private key used to sign the code (or to the private keys further up the certificate chain).
TPMs do many things. For this text we'll focus one facet: they can be used to protect secrets (for example for use in disk encryption, see above), that are released only if the code that booted the host can be authenticated in some form. This works roughly like this: every component that is used during the boot process (i.e. code, certificates, configuration, …) is hashed with a cryptographic hash function before it is used. The resulting hash is written to some small volatile memory the TPM maintains that is write-only (the so called Platform Configuration Registers, "PCRs"): each step of the boot process will write hashes of the resources needed by the next part of the boot process into these PCRs. The PCRs cannot be written freely: the hashes written are combined with what is already stored in the PCRs — also through hashing and the result of that then replaces the previous value. Effectively this means: only if every component involved in the boot matches expectations the hash values exposed in the TPM PCRs match the expected values too. And if you then use those values to unlock the secrets you want to protect you can guarantee that the key is only released to the OS if the expected OS and configuration is booted. The process of hashing the components of the boot process and writing that to the TPM PCRs is called "measuring". What's also important to mention is that the secrets are not only protected by these PCR values but encrypted with a "seed key" that is generated on the TPM chip itself, and cannot leave the TPM (at least so goes the theory). The idea is that you cannot read out a TPM's seed key, and thus you cannot duplicate the chip: unless you possess the original, physical chip you cannot retrieve the secret it might be able to unlock for you. Finally, TPMs can enforce a limit on unlock attempts per time ("anti-hammering"): this makes it hard to brute force things: if you can only execute a certain number of unlock attempts within some specific time then brute forcing will be prohibitively slow.

How Linux Distributions use these Technologies

As mentioned already, Linux distributions adopted the first two of these technologies widely, the third one not so much.

So typically, here's how the boot process of Linux distributions works these days:

<aside> 💡 linux 启动的度量链

</aside>

The UEFI firmware invokes a piece of code called "shim" (which is stored in the EFI System Partition — the "ESP" — of your system), that more or less is just a list of certificates compiled into code form. The shim is signed with the aforementioned Microsoft key, that is built into all PCs/laptops. This list of certificates then can be used to validate the next step of the boot process. The shim is measured by the firmware into the TPM. (Well, the shim can do a bit more than what I describe here, but this is outside of the focus of this article.)
The shim then invokes a boot loader (often Grub) that is signed by a private key owned by the distribution vendor. The boot loader is stored in the ESP as well, plus some other places (i.e. possibly a separate boot partition). The corresponding certificate is included in the list of certificates built into the shim. The boot loader components are also measured into the TPM.
The boot loader then invokes the kernel and passes it an initial RAM disk image (initrd), which contains initial userspace code. The kernel itself is signed by the distribution vendor too. It's also validated via the shim. The initrd is not validated, though (!). The kernel is measured into the TPM, the initrd sometimes too.
The kernel unpacks the initrd image, and invokes what is contained in it. Typically, the initrd then asks the user for a password for the encrypted root file system. The initrd then uses that to set up the encrypted volume. No code authentication or TPM measurements take place.
The initrd then transitions into the root file system. No code authentication or TPM measurements take place.
When the OS itself is up the user is prompted for their user name, and their password. If correct, this will unlock the user account: the system is now ready to use. At this point no code authentication, no TPM measurements take place. Moreover, the user's password is not used to unlock any data, it's used only to allow or deny the login attempt — the user's data has already been decrypted a long time ago, by the initrd, as mentioned above.

What you'll notice here of course is that code validation happens for the shim, the boot loader and the kernel, but not for the initrd or the main OS code anymore. TPM measurements might go one step further: the initrd is measured sometimes too, if you are lucky. Moreover, you might notice that the disk encryption password and the user password are inquired by code that is not validated, and is thus not safe from external manipulation. You might also notice that even though TPM measurements of boot loader/OS components are done nothing actually ever makes use of the resulting PCRs in the typical setup.

The Basic Technologies

How Linux Distributions use these Technologies

Attack Scenarios