24. Secure Launch

24.1. Background

The Trusted Computing Group (TCG) architecture defines two methods in which the target operating system is started, aka launched, on a system for which Intel and AMD provides implementations. These two launch types are static launch and dynamic launch. Static launch is referred to as such because it happens at one fixed point, at system startup, during the defined life-cycle of a system. Dynamic launch is referred to as such because it is not limited to being done once nor bound to system startup. It can in fact happen at anytime without incurring/requiring an associated power event for the system. Since dynamic launch can happen at anytime, this results in dynamic launch being split into two type of its own. The first is referred to as an early launch, where the dynamic launch is done in conjunction with the static lunch of the system. The second is referred to as a late launch, where a dynamic launch is initiated after the static launch was fully completed and the system was under the control of some target operating system or run-time kernel. These two top-level launch methods, static launch and dynamic launch provide different models for establishing the launch integrity, i.e. the load-time integrity, of the target operating system. When cryptographic hashing is used to create an integrity assessment for the launch integrity, then for a static launch this is referred to as the Static Root of Trust for Measurement (SRTM) and for dynamic launch it is referred to as the Dynamic Root of Trust for Measurement (DRTM).

The reasoning for needing the two integrity models is driven by the fact that these models leverage what is referred to as a “transitive trust”. A transitive trust is commonly referred to as a “trust chain”, which is created through the process of an entity making an integrity assessment of another entity and upon success transfers control to the new entity. This process is then repeated by each entity until the Trusted Computing Base (TCB) of system has been established. A challenge for transitive trust is that the process is susceptible to cumulative error and in this case that error is inaccurate or improper integrity assessments. The way to address cumulative error is to reduce the number of instances that can create error involved in the process. In this case it would be to reduce then number of entities involved in the transitive trust. It is not possible to reduce the number of firmware components or the boot loader(s) involved during static launch. This is where dynamic launch comes in, as it introduces the concept for a CPU to provide an instruction that results in a transitive trust starting with the CPU doing an integrity assessment of a special loader that can then start a target operating system. This reduces the trust chain down to the CPU, a special loader, and the target operation system. It is also why it is said that the DRTM is rooted in hardware since the CPU is what does the first integrity assessment, i.e. the first measurement, in the trust chain.

24.2. Overview

Prior to the start of the TrenchBoot project, the only active Open Source project supporting dynamic launch was Intel’s tboot project to support their implementation of dynamic launch known as Intel Trusted eXecution Technology (TXT). The approach taken by tboot was to provide an exokernel that could handle the launch protocol implemented by Intel’s special loader, the SINIT Authenticated Code Module (ACM 3) and remained in memory to manage the SMX CPU mode that a dynamic launch would put a system. While it is not precluded from being used for doing a late launch, tboot’s primary use case was to be used as an early launch solution. As a result the TrenchBoot project started the development of Secure Launch kernel feature to provide a more generalized approach. The focus of the effort is twofold, the first is to make the Linux kernel directly aware of the launch protocol used by Intel as well as AMD, and the second is to make the Linux kernel be able to initiate a dynamic launch. It is through this approach that the Secure Launch kernel feature creates a basis for the Linux kernel to be used in a variety of dynamic launch use cases.

The first use case that the TrenchBoot project focused on was the ability for the Linux kernel to be started by a dynamic launch, in particular as part of an early launch sequence. In this case the dynamic launch will be initiated by a boot loader with associated support added to it, for example the first targeted boot loader in this case was GRUB2. An integral part of establishing a measurement-based launch integrity involves measuring everything that is intended to be executed (kernel image, initrd, etc) and everything that will configure that kernel to execute (command line, boot params, etc). Then storing those measurements in a protected manner. Both the Intel and AMD dynamic launch implementations leverage the Trusted Platform Module (TPM) to store those measurements. The TPM itself has been designed such that a dynamic launch unlocks a specific set of Platform Configuration Registers (PCR) for holding measurement taken during the dynamic launch. These are referred to as the DRTM PCRs, PCRs 17-22. Further details on this process can be found in the documentation for the GETSEC instruction provided by Intel’s TXT and the SKINIT instruction provided by AMD’s AMD-V. The documentation on these technologies can be readily found online; see the Resources section below for references.

Currently only Intel TXT is supported in this first release of the Secure Launch feature. AMD/Hygon SKINIT support will be added in a subsequent release. Also this first version does not support a late relaunch which allows re-establishing the DRTM at various points post boot.

To enable the kernel to be launched by GETSEC, a stub must be built into the setup section of the compressed kernel to handle the specific state that the dynamic launch process leaves the BSP in. Also this stub must measure everything that is going to be used as early as possible. This stub code and subsequent code must also deal with the specific state that the dynamic launch leaves the APs in.

24.2.1. Basic Boot Flow:

Pre-launch: Phase where the environment is prepared and configured to initiate the secure launch in the GRUB bootloader.

  • Prepare the CPU and the TPM for the launch.

  • Load the kernel, initrd and ACM 3 into memory.

  • Setup the TXT heap and page tables describing the MLE 1 per the specification.

  • Initiate the secure launch with the GETSET[SENTER] instruction.

Post-launch: Phase where control is passed from the ACM to the MLE and the secure kernel begins execution.

  • Entry from the dynamic launch jumps to the SL stub.

  • SL stub fixes up the world on the BSP.

  • For TXT, SL stub wakes the APs, fixes up their worlds.

  • For TXT, APs are left halted waiting for an NMI to wake them.

  • SL stub jumps to startup_32.

  • SL main locates the TPM event log and writes the measurements of configuration and module information into it.

  • Kernel boot proceeds normally from this point.

  • During early setup, slaunch_setup() runs to finish some validation and setup tasks.

  • The SMP bringup code is modified to wake the waiting APs. APs vector to rmpiggy and start up normally from that point.

  • SL platform module is registered as a late initcall module. It reads the TPM event log and extends the measurements taken into the TPM PCRs.

  • SL platform module initializes the securityfs interface to allow access to the TPM event log and TXT public registers.

  • Kernel boot finishes booting normally

  • SEXIT support to leave SMX mode is present on the kexec path and the various reboot paths (poweroff, reset, halt).

Note

A quick note on terminology. The larger open source project itself is called TrenchBoot, which is hosted on GitHub (links below). The kernel feature enabling the use of the x86 technology is referred to as “Secure Launch” within the kernel code.

24.3. Configuration

The settings to enable Secure Launch using Kconfig are under:

"Processor type and features" --> "Secure Launch support"

A kernel with this option enabled can still be booted using other supported methods. There are two Kconfig options for Secure Launch:

"Secure Launch Alternate PCR 19 usage"
"Secure Launch Alternate PCR 20 usage"

The help indicates their usage as alternate post launch PCRs to separate measurements for more flexibility (both disabled by default).

To reduce the Trusted Computing Base (TCB) of the MLE 1, the build configuration should be pared down as narrowly as one’s use case allows. The fewer drivers (less active hardware) and features reduces the attack surface. E.g. in the extreme, the MLE could only have local disk access and no other hardware support. Or only network access for remote attestation.

It is also desirable if possible to embed the initrd used with the MLE kernel image to reduce complexity.

The following are a few important configuration necessities to always consider:

24.3.1. KASLR Configuration

Secure Launch does not interoperate with KASLR. If possible, the MLE should be built with KASLR disabled:

"Processor type and features" -->
    "Build a relocatable kernel" -->
        "Randomize the address of the kernel image (KASLR) [ ]"

This unsets the Kconfig value CONFIG_RANDOMIZE_BASE.

If not possible, KASLR must be disabled on the kernel command line when doing a Secure Launch as follows:

nokaslr

24.3.2. IOMMU Configuration

When doing a Secure Launch, the IOMMU should always be enabled and the drivers loaded. However, IOMMU passthrough mode should never be used. This leaves the MLE completely exposed to DMA after the PMR’s 2 are disabled. First, IOMMU passthrough should be disabled by default in the build configuration:

"Device Drivers" -->
    "IOMMU Hardware Support" -->
        "IOMMU passthrough by default [ ]"

This unset the Kconfig value CONFIG_IOMMU_DEFAULT_PASSTHROUGH.

In addition, passthrough must be disabled on the kernel command line when doing a Secure Launch as follows:

iommu=nopt iommu.passthrough=0

24.4. Interface

The primary interfaces between the various components in TXT are the TXT MMIO registers and the TXT heap. The MMIO register banks are described in Appendix B of the TXT MLE 1 Developement Guide.

The TXT heap is described in Appendix C of the TXT MLE 1 Development Guide. Most of the TXT heap is predefined in the specification. The heap is initialized by firmware and the pre-launch environment and is subsequently used by the SINIT ACM. One section, called the OS to MLE Data Table, is reserved for software to define. This table is the Secure Launch binary interface between the pre- and post-launch environments and is defined as follows:

/*
 * Secure Launch defined MTRR saving structures
 */
struct txt_mtrr_pair {
        u64 mtrr_physbase;
        u64 mtrr_physmask;
} __packed;

struct txt_mtrr_state {
        u64 default_mem_type;
        u64 mtrr_vcnt;
        struct txt_mtrr_pair mtrr_pair[TXT_OS_MLE_MAX_VARIABLE_MTRRS];
} __packed;

/*
 * Secure Launch defined OS/MLE TXT Heap table
 */
struct txt_os_mle_data {
        u32 version;
        u32 boot_params_addr;
        u64 saved_misc_enable_msr;
        struct txt_mtrr_state saved_bsp_mtrrs;
        u32 ap_wake_block;
        u32 ap_wake_block_size;
        u64 evtlog_addr;
        u32 evtlog_size;
        u8 mle_scratch[64];
} __packed;

Description of structure:

Field

Use

version

Structure version, current value 1

boot_params_addr

Physical address of the zero page/kernel boot params

saved_misc_enable_msr

Original Misc Enable MSR (0x1a0) value stored by the pre-launch environment. This value needs to be restored post launch - this is a requirement.

saved_bsp_mtrrs

Original Fixed and Variable MTRR values stored by the pre-launch environment. These values need to be restored post launch - this is a requirement.

ap_wake_block

Pre-launch allocated memory block to wake up and park the APs post launch until SMP support is ready. This block is validated by the MLE before use.

ap_wake_block_size

Size of the ap_wake_block. A minimum of 16384b (4x4K pages) is required.

evtlog_addr

Pre-launch allocated memory block for the TPM event log. The event log is formatted both by the pre-launch environment and the SINIT ACM. This block is validated by the MLE before use.

evtlog_size

Size of the evtlog_addr block.

mle_scratch

Scratch area used post-launch by the MLE kernel. Fields:

  • SL_SCRATCH_AP_EBX area to share %ebx base pointer among CPUs

  • SL_SCRATCH_AP_JMP_OFFSET offset to abs. ljmp fixup location for APs

24.5. Resources

The TrenchBoot project including documentation:

https://github.com/trenchboot

TXT documentation in the Intel TXT MLE Development Guide:

https://www.intel.com/content/dam/www/public/us/en/documents/guides/intel-txt-software-development-guide.pdf

TXT instructions documentation in the Intel SDM Instruction Set volume:

https://software.intel.com/en-us/articles/intel-sdm

AMD SKINIT documentation in the System Programming manual:

https://www.amd.com/system/files/TechDocs/24593.pdf

GRUB pre-launch support patchset (WIP):

https://lists.gnu.org/archive/html/grub-devel/2020-05/msg00011.html

24.6. Error Codes

The TXT specification defines the layout for TXT 32 bit error code values. The bit encodings indicate where the error originated (e.g. with the CPU, in the SINIT ACM, in software). The error is written to a sticky TXT register that persists across resets called TXT.ERRORCODE (see the TXT MLE Development Guide). The errors defined by the Secure Launch feature are those generated in the MLE software. They have the format:

0xc0008XXX

The low 12 bits are free for defining the following Secure Launch specific error codes.

Name:

SL_ERROR_GENERIC

Value:

0xc0008001

Description:

Generic catch all error. Currently unused.

Name:

SL_ERROR_TPM_INIT

Value:

0xc0008002

Description:

The secure launch code failed to get an access to the TPM hardware interface. This is most likely to due to misconfigured hardware or kernel. Ensure the TPM chip is enabled and the kernel TPM support is built in (it should not be built as a module).

Name:

SL_ERROR_TPM_INVALID_LOG20

Value:

0xc0008003

Description:

The secure launch code failed to find a valid event log descriptor for TPM version 2.0 or the event log descriptor is malformed. Usually this indicates that incompatible versions of the pre-launch environment (GRUB) and the MLE kernel. GRUB and the kernel share a structure in the TXT heap and if this structure (the OS-MLE table) is mismatched, this error is often seen. This TXT heap area is setup by the pre-launch environment so the issue may originate there. It could be the sign of an attempted attack.

Name:

SL_ERROR_TPM_LOGGING_FAILED

Value:

0xc0008004

Description:

There was a failed attempt to write a TPM event to the event log early in the secure launch process. This is likely the result of a malformed TPM event log buffer. Formatting of the event log buffer information is done by the pre-launch environment (GRUB) so the the issue issue most likely originates there.

Name:

SL_ERROR_REGION_STRADDLE_4GB

Value:

0xc0008005

Description:

During early validation a buffer or region was found to straddle the 4Gb boundary. Because of the way TXT does DMA memory protection, this is an unsafe configuration and is flagged as an error. This is most likely a configuration issue in the pre-launch environment. It could also be the sign of an attempted attack.

Name:

SL_ERROR_TPM_EXTEND

Value:

0xc0008006

Description:

There was a failed attempt to extend a TPM PCR in the secure launch platform module. This is most likely to due to misconfigured hardware or kernel. Ensure the TPM chip is enabled and the kernel TPM support is built in (it should not be built as a module).

Name:

SL_ERROR_MTRR_INV_VCNT

Value:

0xc0008007

Description:

During early secure launch validation an invalid variable MTRR count was found. The pre-launch environment passes a number of MSR values to the MLE to restore including the MTRRs. The values are restored by the secure launch early entry point code. After measuring the values supplied by the pre-launch environment, a discrepancy was found validating the values. It could be the sign of an attempted attack.

Name:

SL_ERROR_MTRR_INV_DEF_TYPE

Value:

0xc0008008

Description:

During early secure launch validation an invalid default MTRR type was found. See SL_ERROR_MTRR_INV_VCNT for more details.

Name:

SL_ERROR_MTRR_INV_BASE

Value:

0xc0008009

Description:

During early secure launch validation an invalid variable MTRR base value was found. See SL_ERROR_MTRR_INV_VCNT for more details.

Name:

SL_ERROR_MTRR_INV_MASK

Value:

0xc000800a

Description:

During early secure launch validation an invalid variable MTRR mask value was found. See SL_ERROR_MTRR_INV_VCNT for more details.

Name:

SL_ERROR_MSR_INV_MISC_EN

Value:

0xc000800b

Description:

During early secure launch validation an invalid miscellaneous enable MSR value was found. See SL_ERROR_MTRR_INV_VCNT for more details.

Name:

SL_ERROR_INV_AP_INTERRUPT

Value:

0xc000800c

Description:

The application processors (APs) wait to be woken up by the SMP initialization code. The only interrupt that they expect is an NMI; all other interrupts should be masked. If an AP gets some other interrupt other than an NMI it will cause this error. This error is very unlikely to occur.

Name:

SL_ERROR_INTEGER_OVERFLOW

Value:

0xc000800d

Description:

A buffer base and size passed to the MLE caused an integer overflow when added together. This is most likely a configuration issue in the pre-launch environment. It could also be the sign of an attempted attack.

Name:

SL_ERROR_HEAP_WALK

Value:

0xc000800e

Description:

An error occurred in TXT heap walking code. The underlying issue is a failure to early_memremap() portions of the heap, most likely due to a resource shortage.

Name:

SL_ERROR_HEAP_MAP

Value:

0xc000800f

Description:

This error is essentially the same as SL_ERROR_HEAP_WALK but occured during the actual early_memremap() operation.

Name:

SL_ERROR_REGION_ABOVE_4GB

Value:

0xc0008010

Description:

A memory region used by the MLE is above 4GB. In general this is not a problem because memory > 4Gb can be protected from DMA. There are certain buffers that should never be above 4Gb though and one of these caused the violation. This is most likely a configuration issue in the pre-launch environment. It could also be the sign of an attempted attack.

Name:

SL_ERROR_HEAP_INVALID_DMAR

Value:

0xc0008011

Description:

The backup copy of the ACPI DMAR table which is supposed to be located in the TXT heap could not be found. This is due to a bug in the platform’s ACM module or in firmware.

Name:

SL_ERROR_HEAP_DMAR_SIZE

Value:

0xc0008012

Description:

The backup copy of the ACPI DMAR table in the TXT heap is to large to be stored for later usage. This error is very unlikely to occur since the area reserved for the copy is far larger than the DMAR should be.

Name:

SL_ERROR_HEAP_DMAR_MAP

Value:

0xc0008013

Description:

The backup copy of the ACPI DMAR table in the TXT heap could not be mapped. The underlying issue is a failure to early_memremap() the DMAR table, most likely due to a resource shortage.

Name:

SL_ERROR_HI_PMR_BASE

Value:

0xc0008014

Description:

On a system with more than 4G of RAM, the high PMR 2 base address should be set to 4G. This error is due to that not being the case. This PMR value is set by the pre-launch environment so the issue most likely originates there. It could also be the sign of an attempted attack.

Name:

SL_ERROR_HI_PMR_SIZE

Value:

0xc0008015

Description:

On a system with more than 4G of RAM, the high PMR 2 size should be set to cover all RAM > 4G. This error is due to that not being the case. This PMR value is set by the pre-launch environment so the issue most likely originates there. It could also be the sign of an attempted attack.

Name:

SL_ERROR_LO_PMR_BASE

Value:

0xc0008016

Description:

The low PMR 2 base should always be set to address zero. This error is due to that not being the case. This PMR value is set by the pre-launch environment so the issue most likely originates there. It could also be the sign of an attempted attack.

Name:

SL_ERROR_LO_PMR_MLE

Value:

0xc0008017

Description:

This error indicates the MLE image is not covered by the low PMR 2 range. The PMR values are set by the pre-launch environment so the issue most likely originates there. It could also be the sign of an attempted attack.

Name:

SL_ERROR_INITRD_TOO_BIG

Value:

0xc0008018

Description:

The external initrd provided is larger than 4Gb. This is not a valid configuration for a secure launch due to managing DMA protection.

Name:

SL_ERROR_HEAP_ZERO_OFFSET

Value:

0xc0008019

Description:

During a TXT heap walk an invalid/zero next table offset value was found. This indicates the TXT heap is malformed. The TXT heap is initialized by the pre-launch environment so the issue most likely originates there. It could also be a sign of an attempted attack. In addition, ACM is also responsible for manipulating parts of the TXT heap so the issue could be due to a bug in the platform’s ACM module.

Name:

SL_ERROR_WAKE_BLOCK_TOO_SMALL

Value:

0xc000801a

Description:

The AP wake block buffer passed to the MLE via the OS-MLE TXT heap table is not large enough. This value is set by the pre-launch environment so the issue most likely originates there. It also could be the sign of an attempted attack.

Name:

SL_ERROR_MLE_BUFFER_OVERLAP

Value:

0xc000801b

Description:

One of the buffers passed to the MLE via the OS-MLE TXT heap table overlaps with the MLE image in memory. This value is set by the pre-launch environment so the issue most likely originates there. It could also be the sign of an attempted attack.

Name:

SL_ERROR_BUFFER_BEYOND_PMR

Value:

0xc000801c

Description:

One of the buffers passed to the MLE via the OS-MLE TXT heap table is not protected by a PMR. This value is set by the pre-launch environment so the issue most likey originates there. It could also be the sign of an attempted attack.

Name:

SL_ERROR_OS_SINIT_BAD_VERSION

Value:

0xc000801d

Description:

The version of the OS-SINIT TXT heap table is bad. It must be 6 or greater. This value is set by the pre-launch environment so the issue most likely originates there. It could also be the sign of an attempted attack. It is also possible though very unlikely that the platform is so old that the ACM being used requires an unsupported version.

Name:

SL_ERROR_EVENTLOG_MAP

Value:

0xc000801e

Description:

An error occurred in the secure launch module while mapping the TPM event log. The underlying issue is memremap() failure, most likely due to a resource shortage.

Name:

SL_ERROR_TPM_NUMBER_ALGS

Value:

0xc000801f

Description:

The TPM 2.0 event log reports an unsupported number of hashing algorithms. Secure launch currently only supports a maximum of two: SHA1 and SHA256.

Name:

SL_ERROR_TPM_UNKNOWN_DIGEST

Value:

0xc0008020

Description:

The TPM 2.0 event log reports an unsupported hashing algorithm. Secure launch currently only supports two algorithms: SHA1 and SHA256.

Name:

SL_ERROR_TPM_INVALID_EVENT

Value:

0xc0008021

Description:

An invalid/malformed event was found in the TPM event log while reading it. Since only trusted entities are supposed to be writing the event log, this would indicate either a bug or a possible attack.

1(1,2,3,4)

MLE: Measured Launch Environment is the binary runtime that is measured and then run by the TXT SINIT ACM. The TXT MLE Development Guide describes the requirements for the MLE in detail.

2(1,2,3,4,5)

PMR: Intel VTd has a feature in the IOMMU called Protected Memory Registers. There are two of these registers and they allow all DMA to be blocked to large areas of memory. The low PMR can cover all memory below 4Gb on 2Mb boundaries. The high PMR can cover all RAM on the system, again on 2Mb boundaries. This feature is used during a secure launch by TXT.

3(1,2)

ACM: Intel’s Authenticated Code Module. This is the 32b bit binary blob that is run securely by the GETSEC[SENTER] during a measured launch. It is described in the Intel documentation on TXT and versions for various chipsets are signed and distributed by Intel.