The Linux Foundation Projects
Skip to main content
All Posts By

Confidential Computing Consortium

Basics of Trusted Execution Environments (TEEs): The Heart of Confidential Computing

By Blog No Comments

Authored by Sal Kimmich

Authored by Sal KimmichAs we delve deeper into our exploration of Confidential Computing, this week we turn our attention to a critical component that plays a central role in this technology: Trusted Execution Environments, or TEEs. Understanding TEEs is key to appreciating how Confidential Computing enhances data security.

What are Trusted Execution Environments (TEEs)?

At its simplest, a Trusted Execution Environment is a secure area within a processor. It guarantees that the code and data loaded inside it are protected with respect to confidentiality and integrity. Essentially, TEEs provide a kind of ‘safe room’ for sensitive operations, ensuring that even if a system is compromised, the data within the TEE remains secure.

How Do TEEs Work?

TEEs operate by isolating specific computations, data, or both, from the rest of the device or network. This isolation is hardware-based, which makes it highly resistant to external attacks, including those from the operating system itself. Within a TEE, code can run without risk of interference or snooping from other processes.

The Role of TEEs in Confidential Computing

In the context of Confidential Computing, TEEs are invaluable. They allow sensitive data to be processed in a secure environment, ensuring that it remains encrypted and inaccessible to unauthorized users or processes. This is particularly crucial when handling personal data, intellectual property, or any information requiring strict confidentiality.

Applications of TEEs

The applications of TEEs are vast and varied. They are used in mobile device security, cloud computing, IoT devices, and more. In each case, TEEs provide a layer of security that is vital in today’s interconnected and often vulnerable digital landscape.

A Look Back at Computing History

As we discuss these advanced concepts, it’s fascinating to reflect on how far we’ve come. Consider the ENIAC, unveiled in 1946 and considered the first general-purpose electronic computer. The journey from such rudimentary computing to today’s sophisticated TEEs underscores the incredible advancements in technology.

Next Steps in Our Journey

Understanding TEEs is just the beginning. As we continue our series, we’ll explore how these environments are implemented and the various challenges and solutions associated with them. 

Stay Tuned

Up next we will delve into the role of open source in Confidential Computing. Open source initiatives are pivotal in the development and adoption of TEEs, offering transparency and collaborative opportunities that are essential in today’s cybersecurity landscape.

Explore the four-part series on Confidential Computing—a vital innovation for data privacy and security. Dive in now!

Part I –  Introduction to Confidential Computing:  A Year Long Exploration

Part IIThe Evolution of Cybersecurity:  From Early Threats to Modern Challenges

Part IVCollaborative Security:  The Role of Open Source in Confidential Computing

O’Reilly Media report: Azure Confidential Computing and Zero Trust

By In The News No Comments

At the Confidential Compute Consortium, we’re committed to fostering a secure and privacy first digital future. The recently published O’Reilly Media report: Azure Confidential Computing and Zero Trust echoes the growing importance of safeguarding sensitive data across industries.
The Confidential computing Consortium stands at the forefront of this movement, championing a paradigm shift towards fortified data protection. This report underlines the non-negotiable aspect of privacy and security in our digital world. The insights shared in the O’Reilly Media report reinforce the urgency and relevance of our endeavors. By championing confidential computing, we’re reshaping the narrative, driving innovation, and setting new benchmarks for data security and privacy standards.

Confidential Computing Mini Summit at OSS EU in Bilbao

By Blog No Comments

We’re delighted to announce that the Confidential Computing Consortium is hosting a Mini Summit co-located with Open Source Summit Europe in Bilbao in September.  The Mini Summit will take place during the afternoon of Monday, 18th September, the day before the main OSS EU conference. 

Call for Proposals for the Confidential Computing Mini Summit are open! We welcome submissions on any relevant content to present at this summit. Submit your proposal here!

Important Dates:

  • CFP deadline: Aug 13, 2023
  • Speaker notification: Aug 18, 2023

Session type:

  • 30 min session

Topic area:

  • Use case deep dive
  • EU open source project & communities
  • (Open) Surprise us with a hot topic!

It’s a great opportunity to meet other members of the community, hear sessions from leaders in the industry and enjoy a little more time in Spain!  In-person registration is just $10 to your existing OSS EU ticket, and virtual registration is free.  We look forward to seeing you there!

More details are available at https://events.linuxfoundation.org/open-source-summit-europe/features/co-located-events/#confidential-computing-mini-summit

Latest SUSE Linux Enterprise goes all in with confidential computing

By In The News No Comments

SUSE’s latest release of SUSE Linux Enterprise 15 Service Pack 5 (SLE 15 SP5) has a focus on security, claiming it as the first distro to offer full support for confidential computing to protect data.

According to SUSE, the latest version of its enterprise platform is designed to deliver high-performance computing capabilities, with an inevitable mention of AI/ML workloads, plus it claims to have extended its live-patching capabilities.

The release also comes just weeks after the community release openSUSE Leap 15.5 was made available, with the two sharing a common core. The Reg’s resident open source guru noted that Leap 15.6 has now been confirmed as under development, which implies that a future SLE 15 SP6 should also be in the pipeline.

SUSE announced the latest version at its SUSECON event in Munich, along with a new report on cloud security issues claiming that more than 88 percent of IT teams have reported at least one cloud security incident over the the past year.

This appears to be the justification for the claim that SLE 15 SP5 is the first Linux distro to support “the entire spectrum” of confidential computing, allowing customers to run fully encrypted virtual machines on their infrastructure to protect applications and their associated data.

Confidential computing relies on hardware-based security mechanisms in the processor to provide this protection, so enterprises hoping to take advantage of this will need to ensure their servers have the necessary support, such as AMD’s Secure Encrypted Virtualization-Secure Nested Paging (SEV-SNP) and Intel’s Trust Domain Extensions (TDX).

SUSE also said that its cut of SLE for running SAP applications comes with improvements in High Availability (HA) and speedier deployment thanks to enhanced automation in SP5. These include automatic discovery of servers, SAP HANA databases, SAP S/4HANA, and NetWeaver applications and clusters, plus continuous checks on HA configurations with recommended fixes.

On the management side, the SUSE Manager 4.3.6 tool is now claimed to support over 15 different Linux distributions, including Rocky Linux, Alma Linux and all variations of Red Hat Enterprise Linux 9, in addition to SUSE’s own platform.

SUSE said that this will be available in the AWS marketplace on a pay-as-you-go basis later this year, allowing customers to manage their infrastructure from the cloud with a scalable instance on a metered basis.

While not strictly part of SLE, SUSE said it has added security-focused updates to its Rancher platform for managing Kubernetes and containers, such as support for hardened virtual machines and improved vulnerability and compliance management. The premium version, Rancher Prime, is getting the inevitable overhaul of its built-in AI Assistant with OpenAI and other generative AI technologies, since why not?

There is also a new release of its container security tool, with NeuVector 5.2 adding updates for common vulnerabilities, exposure database search, and NIST 800-53 report mapping.

NeuVector will apparently be available on the AWS Marketplace from July, and SUSE said it will also be available on Azure and Google Cloud later this summer.

“Every enterprise must maximize their business resilience to face increasingly sophisticated and potentially devastating digital attacks,” SUSE CTO Dr. Thomas Di Giacomo said. ®

VMware, AMD, Samsung and RISC-V push for confidential computing standards

By In The News No Comments

VMware has joined AMD, Samsung, and members of the RISC-V community to work on an open and cross-platform framework for the development and operation of applications using confidential computing hardware.

Revealing the effort at the Confidential Computing Summit 2023 in San Francisco, the companies say they aim to bring about an industry transition to practical confidential computing by developing the open source Certifier Framework for Confidential Computing project.

Among other goals, the project aims to standardize on a set of platform-independent developer APIs that can be used to develop or adapt application code to run in a confidential computing environment, with a Certifier Service overseeing them in operation.

VMware claims to have researched, developed and open sourced the Certifier Framework, but with AMD on board, plus Samsung (which develops its own smartphone chips), the group has the x86 and Arm worlds covered. Also on board is the Keystone project, which is developing an enclave framework to support confidential computing on RISC-V processors.

Confidential computing is designed to protect applications and their data from theft or tampering by protecting them inside a secure enclave, or trusted execution environment (TEE). This uses hardware-based security mechanisms to prevent access from everything outside the enclave, including the host operating system and any other application code.

Such security protections are likely to be increasingly important in the context of applications running in multi-cloud environments, VMware reckons.

Another scenario for confidential computing put forward by Microsoft, which believes confidential computing will become the norm – is multi-party computation and analytics. This sees several users each contribute their own private data to an enclave, where it can be analyzed securely to produce results much richer than each would have got purely from their own data set.

This is described as an emerging class of machine learning and “data economy” workloads that are based on sensitive data and models aggregated from multiple sources, which will be enabled by confidential computing.

However, VMware points out that like many useful hardware features, it will not be widely adopted until it becomes easier to develop applications in the new paradigm.

Cutting effort

The cloud and virtualization giant claims that this is the purpose of the Certifier Framework, which provides platform-independent support for specifying and enforcing trust policies to secure workloads across on-premises and third-party infrastructure, including multi-cloud environments, while the companies will work together on a set of developer APIs across the x86, Arm and RISC-V ecosystems.

According to VMware, the Certifier Framework comprises two parts: one is an application development library (the API) that allows a developer to either port an existing “well-written” application, or develop a fresh one with minimal effort.

The API is said to support multiple confidential computing platforms, so there is no need to rewrite an application that uses the Framework when moving to another platform, it is claimed, and porting an app to a confidential computing environment may only require “half a dozen or so calls to the API.

Open source project

The second part of the framework is the Certifier Service, made up of a number of server applications that evaluate policy and manage trust relationships in a security domain. The purpose of this Certifier Service is to provide a scalable means to deploy many confidential computing applications and enforce security policy.

The group says showed off the technology at the Confidential Computing Summit, including demos of “universal” client-cloud trust management across multiple hardware platforms.

Intel is notably absent from the Certifier Framework group, despite being a premier member of the Confidential Computing Consortium and sponsor of the Confidential Computing Summit itself.

However, AMD’s Raghu Nambiar, VP for Data Center Ecosystems and Solutions, said that working with industry players such as VMware is critical for boosting adoption of confidential computing.

“No matter the size or technical sophistication of an organization, or where a workload is deployed, the Certifier Framework will help more customers realize the benefits of confidential computing,” he said in a statement.

Yong Ho Hwang, Samsung Electronics VP and Head of Security and Privacy, also endorsed it, adding: “We are pleased to be a supporter of the Certifier Framework and share the common goal of accelerating the adoption of confidential computing through a developer-friendly API for confidential computing trust management.”

Readers interested in the initiative can have a look at the Certifier Framework for Confidential Computing on Github. ®

Broad industry representation at Confidential Computing Summit

By Blog No Comments

On Thursday, 29th June 2023, the first Confidential Computing Summit was held at the Marriott Marquis in San Francisco.  Organized by Opaque Systems and the Confidential Computing Consortium, it comprised 38 sessions delivered by 44 speakers and panelists, with 244 attendees – over twice the expected number.  Although initially planned as a single track event, the number of responses to the Call for Papers was so large that the agenda was split into three tracks, with keynotes starting and ending the event.

Sessions covered a broad range of topics, from state of the industry and outlook, to deep-dive technical discussions.  One of the key themes of the Summit, however, was the application of Confidential Computing to real-life use cases, with presentations by end users as well as suppliers of Confidential Computing technologies.  The relevance of Confidential Computing to AI was a recurring topic as data and model privacy is emerging as a major concern for many users, particularly those with requirements to share data with untrusted parties whether partners or even competitors for multi-party collaboration.  Other use cases included private messaging, anti-money laundering, Edge computing, regulatory compliance, Big Data, examination security and data sovereignty.  Use cases for Confidential Computing ranged across multiple sectors, including telecommunications, banking, insurance, healthcare and AdTech. Sessions ranged from high-level commercial use case discussions to low-level technical considerations.

There was an exhibitor hall which doubled as meeting space and included booths from the CCC and Opaque Systems plus the Summit’s premier sponsors (Microsoft, Intel, VMware, Arm, Anjuna, Fortanix, Edgeless Systems, Cosmian).  The venue also had sufficient space (and seating with branded cushions!) for a busy “hallway track”.  For many attendees, the ability to meet other industry professionals in person for the first time was as valuable a reason to attend the Summit as the session – while virtual conferences can have value, the conversations held face-to-face at the conference provided opportunities for networking that would have been impossible without real-world interactions.

Videos of many of the sessions will be made available on the conference website in the coming weeks: https://confidentialcomputingsummit.com/ (the agenda of sessions presented is also available).

The Confidential Computing Consortium would like to thank Opaque Systems and the program committee for their hard work in organizing this event.  Given the success of the Summit, plans are already underway for a larger instance next year.  Please keep an eye on this blog and other news outlets for information.  We look forward to seeing you there!

Confidential Computing: logging and debugging

By Blog No Comments

Mike Bursell

This article is a slightly edited version of an article originally published at https://blog.enarx.dev/confidential-computing-logging-and-debugging/

Debugging applications is an important part of the development process, and one of the mechanisms we use for it is logging: providing extra details about what’s going on in (and around) the application to help us understand problems, manage errors and (when we’re lucky!) monitor normal operation.  Logging then, is useful not just for abnormal, but also for normal (“nominal”) operations.  Log entries and other error messages can be very useful, but they can also provide information to other parties – sometimes information which you’d prefer they didn’t have.  This is particularly true when you are thinking about Confidential Computing: running applications or workloads in environments where you really want to protect the confidentiality and integrity of your application and its data.  This article examines some of the issues that we need to consider when designing Confidential Computing frameworks, the applications we run in them, and their operations.  It is written partly from the point of view of the Enarx project, but that is mainly to provide some concrete examples: these have been generalised where possible.  Note that this is quite a long article, as it goes into detailed discussion of some complex issues, and tries to examine as many of the alternatives as possible.

First, let us remind ourselves of one of the underlying assumptions about Confidential Computing in general which is that you don’t trust the host. The host, in this context, is the computer running your workload within a TEE instance – your Confidential Computing workload (or simply workload). And when we say that we don’t trust it, we really mean that: we don’t want to leak any information to the host which might allow it (the host) to infer information about the workload that is running, either in terms of the program itself (and any associated algorithms) or the data.

Now, this is a pretty tall order, particularly given that the state of the art at the moment doesn’t allow for strong protections around resource utilisation by the workload. There’s nothing that the workload can do to stop the host system from starving it of CPU resources, and slowing it down, or even stopping it running altogether.  This presents the host with many opportunities for artificially imposed timing attacks against which it is very difficult to protect.  In fact, there are other types of resource starvation and monitoring around I/O as well, which are also germane to our conversation.

Beyond this, the host system can also attempt to infer information about the workload by monitoring its resource utilisation without any active intervention. To give an example, let us say that the host notices that the workload creates a network socket to an external address. It (the host) starts monitoring the data sent via this socket, and notices that it is all encrypted using TLS. The host may not be able to read the data, but it may be able to infer that a specific short burst of activity just after the opening of the socket corresponds to the generation of a cryptographic key. This information on its own may be sufficient for the host to fashion passive or active attacks to weaken the strength of this key.

None of this is good news, but let’s extend our thinking beyond just normal operation of the workload and consider debugging generally and the error handling more particularly. For the sake of clarity, we will posit a tenant with a client process on a separate machine (considered trusted, unlike the host), and that TEE instance on the host has four layers, including the associated workload. This may not be true for all applications or designs, but is a useful generalisation, and covers most of the issues that are likely to arise.  This architecture models a cloud workload deployment. Here’s a picture.

TEE layers and components

These layers may be defined thus:

  1. application layer – the application itself, which may or may not be aware that it is running within a TEE instance. For many use cases, this, from the point of view of a tenant/client of the host, is the workload as defined above.
  2. runtime layer – the context in which the application runs. How this is considered is likely to vary significantly between TEE type and implementations, and in some cases (where the workload is a full VM image, including application and operating system, for instance), there may be little differentiation between this layer and the application layer (the workload includes both). In many cases, however, the runtime layer will be responsible for loading the application layer – the workload.
  3. TEE loading layer – the layer responsible for loading at least the runtime layer, and possibly some other components into the TEE instance. Some parts of this are likely to exist outside of the TEE instance, but others (such as a UEFI loader for a VM) may exist within it. For this reason, we may choose to separate “TEE-internal” from “TEE-external” components within this layer. For many implementations, this layer may disappear (cease to run and be removed from memory) once the runtime has started.
  4. TEE execution layer – the layer responsible for actually executing the runtime above it, and communicating with the host. Like the TEE loading layer, this is likely to exist in two parts – one within the TEE instance, and one outside it (again, “TEE-internal” and “TEE-external”. 

An example of relative lifecycles is shown here.

Component lifecycles

Now we consider logging for each of these.

Application layer

The application layer generally communicates via a data plane to other application components external to the TEE, including those under the control of the tenant, some of which may sit on the client machine.  Some of these will be considered trusted from the point of view of the application, and these at least will typically require an encrypted communication channel so that the host is unable to snoop on the data (others may also require encryption).  Exactly how these channels are set up will vary between implementations, but application-level errors and logging should be expected to use these communication channels, as they are relevant to the application’s operation. This is the simplest case, as long as channels to external components are available. Where they cease to be available, for whatever reason, the application may choose to store logging information for later transfer (if possible) or communicate a possible error state to the runtime layer.

The application may also choose to communicate other runtime errors, or application errors that it considers relevant or possibly relevant to runtime, to the runtime layer.

Runtime layer

It is possible that the runtime layer may have access to communication channels to external parties that the application layer does not – in fact, if it is managing the loading and execution of the runtime layer, this can be considered a control plane. As the runtime layer is responsible for the execution of the application, it needs to be protected from the host, and it resides entirely within the TEE instance. It also has access to information associated with the application layer (which may include logging and error information passed directly to it by the application), which should also be protected from the host (both in terms of confidentiality and integrity), and so any communications it has with external parties must be encrypted.

There may be a temptation to consider that the runtime layer should be reporting errors to the host, but this is dangerous. It is very difficult to control what information will be passed: not only primary information, but also inferred information. There does, of course, need to be communication between the runtime layer and the host in order to allow execution – whether this is system calls or another mechanism – but in the model described here, that is handled by the TEE execution layer.

TEE loading layer

This layer is one where we start having to make some interesting decisions.  There are, as we noted, two different components which may make up this layer: TEE-internal and TEE-external.

TEE loading – TEE-internal

The TEE-internal component may generate logging information associated either with successful or unsuccessful loading of a workload.  Some errors encountered may be recoverable, while others are unrecoverable.  In most cases, it may generally be expected that a successful loading event is considered non-sensitive and can be exposed to the TEE-external component, as the host will generally be able to infer successful loading as execution will continue onto the next phase (even when the TEE loading layer and TEE execution layer do have not explicitly separate external components), but the TEE-internal component still needs to be careful about the amount of information exposed to the host, as even information around workload size or naming may provide a malicious entity with useful information.  In such cases, integrity protection of messages may be sufficient: failure to provide integrity protection could lead the host to misreport successful loading to a remote tenant, for example – not necessarily a major issue, but a possible attack vector nevertheless.

Error events associated with failure to load the workload (or parts of it) are yet more tricky.  Opportunities may exist for the host to tamper with the loading process with the intention of triggering errors from which information may be gleaned – for instance, pausing execution at particular points and seeing what error messages are generated.  The more data exported by the TEE loading internal component, the more data the external component may be able to make available to malicious parties.  One of the interesting questions to consider is what to do with error messages generated before a communications channel (the control plane) back to the provisioning entity has been established.  Once this has been established (and is considered “secure” to the appropriate level required), then transferring error messages via it is a pretty straightforward proposition, though this channel may still be subject to traffic analysis and resource starvation (meaning that any error states associated with timing need to be carefully examined).  Before this communication channel has been established, the internal component has three viable options (which are not mutually exclusive):

  1. Pass to the external component for transmission to the tenant “out of band”, by the external component.
  2. Pass to the external component for storage and later consumption and transmission over the control plane by the internal component if the control plane can be established in the future.
  3. Consign to internal storage, assuming availability of RAM or equivalent assigned for this purpose.

In terms of attacks, options 1 and 2 are broadly similar as long as the control plane fails to exist.  Additionally, in case 1, the external component can choose not to transmit all (or any) of the data to the tenant, and in case 2, it may withhold data from the internal component when requested.

If we take the view (as proposed above) that at least the integrity, and possibly the confidentiality of error messages is of concern, then option 1 would only be viable if a shared secret has already been established between the TEE loading internal component and the tenant or the identity of the TEE loading internal component already established with the tenant, which is impossible unless the control plane has already been created.  For option 2, the internal component can generate a key which it can use to encrypt the data sent to the external component, and store this key for decryption when (and if) the external component returns the data.

TEE loading – TEE-external

Any information which is available to any TEE-external component must be assumed to be unprotected and untrusted.  The only exceptions are if data is signed (for integrity) or encrypted (for confidentiality, though integrity is typically also transparently assured when data is encrypted), as noted above.  The TEE-external may choose to store or transmit error messages from the TEE-internal component, as noted above, but it may also generate log entries of its own.  There are five possible (legitimate) consumers of these entries:

  1. The host system – the host (general logging, operating system or other components) may consume information around successful loading to know when to start billing, for instance, or consume information around errors for its own purposes or to transmit to the tenant (where the TEE loading component is not in direct contact with the client, or other communication channels are preferred).
  2. The TEE loading internal component – there may be both success and failure events which are useful to communicate to the TEE loading internal component to allow it to make decisions.  Communications to this component assume, of course, that loading was sufficiently successful to allow the TEE loading internal component to start execution.
  3. The TEE runtime external component – if the lifecycle has proceeded to the stage where the TEE runtime component is executing, the TEE loading external component can communicate logging information to it, either directly (if they are executing concurrently) or via another entity such as storage.
  4. The TEE runtime internal component – similarly to case #3 above, the TEE loading external component may be able to communicate to the TEE runtime internal component, either directly or indirectly.
  5. The client – as noted in #1 above, the host may communicate logging information to the client.  An alternative, if an appropriate communications channel exists, is for the TEE loading external component to communicate directly with it.  The client should always treat all communications with this component as untrusted (unless they are being transmitted for the internal component, and are appropriately integrity/confidentiality protected).

The TEE runtime layer

TEE runtime – TEE-internal

While the situation for this component is similar to that for the TEE loading internal component, it is somewhat simpler because the fact that this stage of the lifecycle has been reached means that the application has, by definition, been loaded and is running.  This means that there are a number of different channels for communication of error messages: the application data plane, the runtime control plane and the TEE runtime external component.  Most logging information will generally be directed either to the application (for decision making or transmission over its data plane at the application’s discretion) or to the client via the control plane. Standard practice can generally be applied as to which of these is most appropriate for which use cases.

Transmission of data to the TEE runtime external component needs to be carefully controlled, as the runtime component (unless it is closely coupled with the application) is unlikely to be in a good position to judge what information might be considered sensitive if available to components or entities external to the TEE.  For this reason, either error communication to the TEE runtime external component should be completely avoided, or standardised (and carefully designed) error messages should be employed – which makes standard debugging techniques extremely difficult.

Debugging

Any form of debugging for TEE instances is extremely difficult, and there are two fairly stark choices:

  1. Have a strong security profile and restrict debugging to almost nothing.
  2. Have a weaker security profile and acknowledge that it is almost impossible to ensure the protection of the confidentiality and integrity of the workload (the application and its data).

There are times, particularly during the development and testing of a new application when the latter is the only feasible approach.  In this case, we can recommend two principles:

  1. Create a well-defined set of error states which can be communicated via untrusted channels (that is, which are generally unprotected from confidentiality and integrity attacks), and which do not allow for “free form” error messages (which are more likely to leak information to a host).
  2. Ensure that any deployment with a weaker profile is closely controlled (and never into production).

These two principles can be combined, and a deployment lifecycle might allow for different profiles: e.g. a testing profile on local hardware allowing free form error messages and a staging profile on external hardware which only allows for “static” error messages.

Standard operation

Standard operation must assume the worst case scenario, which is that the host may block, change and interfere with all logging and error messages to which it has access, and may use them to infer information about the workload (application and associated data), affecting its confidentiality, integrity and normal execution.  Given this, the default must be that all TEE-internal components should minimise all communications to which the host may have access.

Application

To restrict application data plane communication is clearly infeasible in most cases, though all communications should generally be encrypted for confidentiality and integrity protection and designers and architects with particularly strong security policies may wish to consider how to restrict data plane communications.

Runtime component

Data plane communications from the runtime component are likely to be fewer than application data plan communications in most cases, and there may also be some opportunities to design these with security in mind.

TEE loading and TEE runtime components

These are the components where the most care must be taken, as we have noted above, but also where there may be the most temptation to lower levels of security if only to allow for easier debugging and error management.

Summary

In a standard cloud deployment, there is little incentive to consider strong security controls around logging and debugging, simply because the host has access not only to all communications to and from a hosted workload, but also to all the code and data associated with the workload at runtime.  For Confidential Computing workloads, the situation is very different, and designers and architects of the TEE infrastructure and even, to a lesser extent, of potential workloads themselves, need to consider very carefully the impact of the host gaining access to messages associated with the workload and the infrastructure components.  It is, realistically, infeasible to restrict all communication to levels appropriate for deployment, so it is recommended that various profiles are created which can be applied to different stages of a deployment, and whose use is carefully monitored, logged (!) and controlled by process.

Why is Attestation Required for Confidential Computing?

By Blog No Comments

Alec Fernandez (alfernandez@microsoft.com)

At the end of 2022, the Confidential Computing Consortium amended the definition of Confidential Computing. We added attestation as an explicit part of the definition, but beyond updating our whitepaper we did not explain to the community why we made this change.

First off, an attestation is the evidence that you use to evaluate whether or not to trust a Confidential Computing program or environment. It’s sometimes built into a common protocol as in RA-TLS / Attested TLS. In other uses it might be built into the boot flow of a Confidential VM or built into an asynchronous usage like attaching it to the result of a Confidential Process.

To many of us attestation was an implicit part of Confidential Computing architecture. However it is so central to the idea of Confidential Computing that it really needed to be part of the formal definition.

Hardware and software providers have long offered assurances of security and these assurances have oftentimes fallen short of expectations. A historical analysis of the track record for placing trust in individual organizations to protect data raises important questions for security professionals. The recurrence of data breaches has led to understandably deep skepticism of technologies that purport to provide new security protections.

Users desire to see for themselves the evidence that new technologies are actually safeguarding their data.

Attestation is the process by which customers can alleviate their skepticism by getting answers to these questions:  

  • Can the TEE provide evidence showing that its security assurances are in effect?
  • Who is providing this evidence?  
  • How is the evidence obtained?
  • Is the evidence valid, authentic, and delivered through a secure chain of custody?
  • Who judges the evidence? 
  • Is the judge separate from the evidence provider?
  • Who provides the standards against which the evidence is judged?
  • Can evidence assure that the code and data protection claims are in effect? 

Hardware based attestation evidence is produced by a trusted hardware root-of-trust component of the computing environment. The hardware root-of-trust is a silicon chip or a set of chips that have been specifically designed to be highly tamper resistant. Some have been reviewed by researchers at standards organizations such as NIST, NSA, ICO, ENISA and academic institutions around the world and the technical community at large. While a critique of the analyses behind hardware roots of trust is beyond the scope of this paper, we take them to represent the current state of the art in computer security. They represent a significant improvement over available alternatives. See reference material at the end of this blog for more information.

Providing Attestation Evidence

Attestation evidence is delivered in a message containing authentic, accurate and timely measurements of system components such as hardware, firmware, BIOS and the software and data state of the computer being evaluated. Importantly, this attestation evidence is digitally signed by a key known only to the hardware root-of-trust (often the physical CPU) and not extractable. This means that the attestation evidence is secured. It cannot be altered, once it leaves the hardware without the alteration being detected. It is impervious to attacks by the host operating system, the kernel, the cloud platform provider. This eliminates chain of custody concerns as the evidence flows from the producer to the consumer.

Validating the Authenticity of Attestation Evidence

Before examining the attestation evidence, the source of the evidence must be established. This is done by matching the digital signature in the attestation evidence with a certificate issued by the manufacturer of the hardware root of trust, for example the manufacturer of the physical CPU in the computer. If the signature on the attestation evidence matches the manufacturer’s certificate, then this proves that the attestation report was produced by the CPU hardware. This means that if you trust the company that manufactured the hardware, then you can trust the attestation report.

Who Judges the Attestation Evidence? Are they Separate from the Evidence Provider?

Having the attestation evidence delivered in a message that is digitally signed by hardware allows for TEE users to establish for themselves that the security assurances provided by the TEE are in place. This can be done without the provider of the computing infrastructure or intervening parties being able to alter the evidence during delivery.

Attestation evidence is highly technical and oftentimes it is not feasible for an organization to judge the evidence themselves. This is especially true when the organization is not specialized in computing infrastructure security. In cases such as these, having a different entity, a third party with security expertise, evaluate the evidence offers a good balance between security and complexity. In this scenario, the computing infrastructure or device user is implicitly trusting the entity that verifies the attestation evidence (the verifier). In such scenarios, it is imperative for the device user to have access to effective mechanisms to verify the authenticity and reliability of the verifier to ensure that the attestation results produced by the verifier are legitimate and trustworthy.

Who provides the standards against which the evidence is judged?

The attestation evidence contains claims about the physical characteristics and the configuration settings of the execution environment. Examples include:

  • CPU Manufacturer, model and version and identifier.
  • Microcode and firmware version.
  • Configuration settings, e.g., whether memory encryption is enabled.
  • Encryption configuration, e.g., whether a different key is used to protect each individual VM

The values supplied in the attestation evidence are compared against reference values. For example, the firmware supplier might recommend that it be patched to a specific version due to the discovery of a security vulnerability. The attestation evidence will accurately reflect the current firmware version. But who decides which are acceptable firmware versions?

  • Since the firmware is typically the responsibility of the hardware manufacturer and they have intimate knowledge of the details behind its security baseline, they should certainly be consulted.
  • The owner of the device or computing infrastructure should also be consulted since they could be responsible for any risks of data exfiltration.
  • In a public cloud environment, the computing infrastructure provider controls patching the firmware to the hardware manufacturer’s recommended version but they do not make use of the resulting environment. The user of the TEE is responsible for data placed in the environment and must ensure that firmware complies with their security policy

Remote attestation provides a way to evaluate evidence that shows the actual firmware version provided by the TEE. This evidence is provided directly by the hardware on which the TEE is executing and allows the attestation verifier to independently verify when the patching was completed.

More generally, attestation can be used to check whether all available security standards and policies have been met. This practically eliminates the possibility that a configuration error on the part of the computer or device owner will result in a security guarantee being falsely reported. The computer or device owner might be incorrectly configured in a way that goes undetected, but the attestation evidence comes directly from the hardware component that is executing the TEE and so remains accurate.

Relying on Attestation Evidence to Secure a TEE

An example of using attestation to provide data security is secure key release (SKR). One excellent use case for SKR is configuring your key management infrastructure (KMI) to evaluate the attestation evidence against a policy controlled by the verifier which is deemed to be trustworthy owner of the TEE and configuring your KMI to refuse to supply the key needed to decrypt the computer’s OS disk unless the attestation evidence shows the computer to be in compliance. In this example, the attestation evidence is generated when the computer is powered on and sent to the KMI. If the attestation evidence indicates that the TEE is not in compliance with the policy (perhaps because the CPU firmware was not an acceptable version) then the KMI would not release the decryption key to the compute infrastructure and this would prevent data from being decrypted and this prevent the risk of data exfiltration.

Conclusion

Confidential computing, through the use of hardware-based, attested TEEs and remote attestation protects sensitive data and code against an increasingly common class of threats occurring during processing while data is in use. These were previously difficult, if not impossible to mitigate. Additionally, Confidential Computing allows for protecting data against the owner of the system and public cloud platforms which traditionally had to simply be trusted to not use their elevated permissions to access the data.

 

References

https://nvlpubs.nist.gov/nistpubs/ir/2022/Nist.IR.8320.pdf

https://tools.ietf.org/html/draft-ietf-rats-architecture

CCC-A-Technical-Analysis-of-Confidential-Computing-v1.3_Updated_November_2022.pdf (confidentialcomputing.io)

Common-Terminology-for-Confidential-Computing.pdf (confidentialcomputing.io)

CCC_outreach_whitepaper_updated_November_2022.pdf (confidentialcomputing.io)