
Confidential Computing Consortium

Broad industry representation at Confidential Computing Summit


On Thursday, 29th June 2023, the first Confidential Computing Summit was held at the Marriott Marquis in San Francisco.  Organized by Opaque Systems and the Confidential Computing Consortium, it comprised 38 sessions delivered by 44 speakers and panelists, with 244 attendees – over twice the expected number.  Although initially planned as a single track event, the number of responses to the Call for Papers was so large that the agenda was split into three tracks, with keynotes starting and ending the event.

Sessions covered a broad range of topics, from the state of the industry and its outlook to deep-dive technical discussions.  One of the key themes of the Summit, however, was the application of Confidential Computing to real-life use cases, with presentations by end users as well as suppliers of Confidential Computing technologies.  The relevance of Confidential Computing to AI was a recurring topic, as data and model privacy are emerging as a major concern for many users, particularly those who need to share data with untrusted parties (whether partners or even competitors) for multi-party collaboration.  Other use cases included private messaging, anti-money laundering, Edge computing, regulatory compliance, Big Data, examination security and data sovereignty.  These use cases spanned multiple sectors, including telecommunications, banking, insurance, healthcare and AdTech.

There was an exhibitor hall which doubled as meeting space and included booths from the CCC and Opaque Systems plus the Summit’s premier sponsors (Microsoft, Intel, VMware, Arm, Anjuna, Fortanix, Edgeless Systems, Cosmian).  The venue also had sufficient space (and seating with branded cushions!) for a busy “hallway track”.  For many attendees, the ability to meet other industry professionals in person for the first time was as valuable a reason to attend the Summit as the sessions: while virtual conferences can have value, the face-to-face conversations at the conference provided opportunities for networking that would have been impossible without real-world interactions.

Videos of many of the sessions will be made available on the conference website in the coming weeks: https://confidentialcomputingsummit.com/ (the agenda of sessions presented is also available).

The Confidential Computing Consortium would like to thank Opaque Systems and the program committee for their hard work in organizing this event.  Given the success of the Summit, plans are already underway for a larger instance next year.  Please keep an eye on this blog and other news outlets for information.  We look forward to seeing you there!

Confidential Computing: logging and debugging


Mike Bursell

This article is a slightly edited version of an article originally published at https://blog.enarx.dev/confidential-computing-logging-and-debugging/

Debugging applications is an important part of the development process, and one of the mechanisms we use for it is logging: providing extra details about what’s going on in (and around) the application to help us understand problems, manage errors and (when we’re lucky!) monitor normal operation.  Logging, then, is useful not just for abnormal, but also for normal (“nominal”) operations.  Log entries and other error messages can be very useful, but they can also provide information to other parties, sometimes information which you’d prefer they didn’t have.  This is particularly true when you are thinking about Confidential Computing: running applications or workloads in environments where you really want to protect the confidentiality and integrity of your application and its data.  This article examines some of the issues that we need to consider when designing Confidential Computing frameworks, the applications we run in them, and their operations.  It is written partly from the point of view of the Enarx project, but that is mainly to provide some concrete examples: these have been generalised where possible.  Note that this is quite a long article, as it goes into detailed discussion of some complex issues, and tries to examine as many of the alternatives as possible.

First, let us remind ourselves of one of the underlying assumptions about Confidential Computing in general which is that you don’t trust the host. The host, in this context, is the computer running your workload within a TEE instance – your Confidential Computing workload (or simply workload). And when we say that we don’t trust it, we really mean that: we don’t want to leak any information to the host which might allow it (the host) to infer information about the workload that is running, either in terms of the program itself (and any associated algorithms) or the data.

Now, this is a pretty tall order, particularly given that the state of the art at the moment doesn’t allow for strong protections around resource utilisation by the workload. There’s nothing that the workload can do to stop the host system from starving it of CPU resources, and slowing it down, or even stopping it running altogether.  This presents the host with many opportunities for artificially imposed timing attacks against which it is very difficult to protect.  In fact, there are other types of resource starvation and monitoring around I/O as well, which are also germane to our conversation.

Beyond this, the host system can also attempt to infer information about the workload by monitoring its resource utilisation without any active intervention. To give an example, let us say that the host notices that the workload creates a network socket to an external address. It (the host) starts monitoring the data sent via this socket, and notices that it is all encrypted using TLS. The host may not be able to read the data, but it may be able to infer that a specific short burst of activity just after the opening of the socket corresponds to the generation of a cryptographic key. This information on its own may be sufficient for the host to fashion passive or active attacks to weaken the strength of this key.

None of this is good news, but let’s extend our thinking beyond just normal operation of the workload and consider debugging generally and error handling more particularly. For the sake of clarity, we will posit a tenant with a client process on a separate machine (considered trusted, unlike the host), and that the TEE instance on the host has four layers, including the associated workload. This may not be true for all applications or designs, but is a useful generalisation, and covers most of the issues that are likely to arise.  This architecture models a cloud workload deployment. Here’s a picture.

[Figure: TEE layers and components]

These layers may be defined thus:

  1. application layer – the application itself, which may or may not be aware that it is running within a TEE instance. For many use cases, this, from the point of view of a tenant/client of the host, is the workload as defined above.
  2. runtime layer – the context in which the application runs. How this is considered is likely to vary significantly between TEE type and implementations, and in some cases (where the workload is a full VM image, including application and operating system, for instance), there may be little differentiation between this layer and the application layer (the workload includes both). In many cases, however, the runtime layer will be responsible for loading the application layer – the workload.
  3. TEE loading layer – the layer responsible for loading at least the runtime layer, and possibly some other components into the TEE instance. Some parts of this are likely to exist outside of the TEE instance, but others (such as a UEFI loader for a VM) may exist within it. For this reason, we may choose to separate “TEE-internal” from “TEE-external” components within this layer. For many implementations, this layer may disappear (cease to run and be removed from memory) once the runtime has started.
  4. TEE execution layer – the layer responsible for actually executing the runtime above it, and communicating with the host. Like the TEE loading layer, this is likely to exist in two parts – one within the TEE instance, and one outside it (again, “TEE-internal” and “TEE-external”).

An example of relative lifecycles is shown here.

[Figure: Component lifecycles]

Now we consider logging for each of these.

Application layer

The application layer generally communicates via a data plane to other application components external to the TEE, including those under the control of the tenant, some of which may sit on the client machine.  Some of these will be considered trusted from the point of view of the application, and these at least will typically require an encrypted communication channel so that the host is unable to snoop on the data (others may also require encryption).  Exactly how these channels are set up will vary between implementations, but application-level errors and logging should be expected to use these communication channels, as they are relevant to the application’s operation. This is the simplest case, as long as channels to external components are available. Where they cease to be available, for whatever reason, the application may choose to store logging information for later transfer (if possible) or communicate a possible error state to the runtime layer.
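The "store for later transfer" behaviour described above can be sketched in a few lines. This is a minimal illustration, not taken from any particular framework: the class name `BufferedLogSender`, the `send` callback (standing in for the encrypted data-plane channel) and the use of `ConnectionError` to signal an unavailable channel are all assumptions made for the example.

```python
from collections import deque
from typing import Callable, Deque


class BufferedLogSender:
    """Forwards log entries over a channel, buffering them while it is down."""

    def __init__(self, send: Callable[[str], None], max_buffered: int = 1000):
        # `send` stands in for the encrypted data-plane channel (hypothetical);
        # it is expected to raise ConnectionError when the channel is unavailable.
        self._send = send
        # Bounded buffer: the oldest entries are dropped once it is full,
        # because memory inside the TEE instance is limited.
        self._pending: Deque[str] = deque(maxlen=max_buffered)

    def log(self, entry: str) -> None:
        self._pending.append(entry)
        self.flush()

    def flush(self) -> int:
        """Try to send buffered entries in order; return how many were sent."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # channel still down: keep the rest for later
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._pending)
```

The buffer lives inside the TEE instance, so nothing is exposed to the host while the channel is down. A growing `pending` count is also one possible trigger for reporting an error state to the runtime layer.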

The application may also choose to communicate other runtime errors, or application errors that it considers relevant or possibly relevant to runtime, to the runtime layer.

Runtime layer

It is possible that the runtime layer may have access to communication channels to external parties that the application layer does not – in fact, if it is managing the loading and execution of the application layer, this can be considered a control plane. As the runtime layer is responsible for the execution of the application, it needs to be protected from the host, and it resides entirely within the TEE instance. It also has access to information associated with the application layer (which may include logging and error information passed directly to it by the application), which should also be protected from the host (both in terms of confidentiality and integrity), and so any communications it has with external parties must be encrypted.

There may be a temptation to consider that the runtime layer should be reporting errors to the host, but this is dangerous. It is very difficult to control what information will be passed: not only primary information, but also inferred information. There does, of course, need to be communication between the runtime layer and the host in order to allow execution – whether this is system calls or another mechanism – but in the model described here, that is handled by the TEE execution layer.

TEE loading layer

This layer is one where we start having to make some interesting decisions.  There are, as we noted, two different components which may make up this layer: TEE-internal and TEE-external.

TEE loading – TEE-internal

The TEE-internal component may generate logging information associated with either successful or unsuccessful loading of a workload.  Some errors encountered may be recoverable, while others are unrecoverable.  In most cases, a successful loading event can be considered non-sensitive and exposed to the TEE-external component, as the host will generally be able to infer successful loading from the fact that execution continues to the next phase (even when the TEE loading layer and TEE execution layer do not have explicitly separate external components).  The TEE-internal component still needs to be careful about the amount of information exposed to the host, however, as even information around workload size or naming may provide a malicious entity with useful information.  In such cases, integrity protection of messages may be sufficient: failure to provide integrity protection could lead the host to misreport successful loading to a remote tenant, for example – not necessarily a major issue, but a possible attack vector nevertheless.

Error events associated with failure to load the workload (or parts of it) are yet more tricky.  Opportunities may exist for the host to tamper with the loading process with the intention of triggering errors from which information may be gleaned – for instance, pausing execution at particular points and seeing what error messages are generated.  The more data exported by the TEE loading internal component, the more data the external component may be able to make available to malicious parties.  One of the interesting questions to consider is what to do with error messages generated before a communications channel (the control plane) back to the provisioning entity has been established.  Once this has been established (and is considered “secure” to the appropriate level required), then transferring error messages via it is a pretty straightforward proposition, though this channel may still be subject to traffic analysis and resource starvation (meaning that any error states associated with timing need to be carefully examined).  Before this communication channel has been established, the internal component has three viable options (which are not mutually exclusive):

  1. Pass to the external component for transmission to the tenant “out of band”, by the external component.
  2. Pass to the external component for storage and later consumption and transmission over the control plane by the internal component if the control plane can be established in the future.
  3. Consign to internal storage, assuming availability of RAM or equivalent assigned for this purpose.

In terms of attacks, options 1 and 2 are broadly similar as long as the control plane fails to exist.  Additionally, in case 1, the external component can choose not to transmit all (or any) of the data to the tenant, and in case 2, it may withhold data from the internal component when requested.

If we take the view (as proposed above) that at least the integrity, and possibly the confidentiality of error messages is of concern, then option 1 would only be viable if a shared secret has already been established between the TEE loading internal component and the tenant or the identity of the TEE loading internal component already established with the tenant, which is impossible unless the control plane has already been created.  For option 2, the internal component can generate a key which it can use to encrypt the data sent to the external component, and store this key for decryption when (and if) the external component returns the data.
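As a rough illustration of option 2, the sketch below has the internal component generate an ephemeral key and use it to protect error messages before handing them to the external component for storage. To stay within the Python standard library it provides integrity protection only (an HMAC tag); where confidentiality is also required, an authenticated encryption scheme from a vetted cryptographic library would take its place. The class and method names are hypothetical.

```python
import hashlib
import hmac
import secrets


class SealedLogStore:
    """Held by the TEE-internal component; the key never leaves the TEE."""

    TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes

    def __init__(self) -> None:
        # Ephemeral key generated inside the TEE instance.
        self._key = secrets.token_bytes(32)

    def seal(self, message: bytes) -> bytes:
        """Return tag || message for hand-off to the untrusted external component."""
        tag = hmac.new(self._key, message, hashlib.sha256).digest()
        return tag + message

    def unseal(self, blob: bytes) -> bytes:
        """Verify and return the message; raise ValueError if it was altered."""
        tag, message = blob[:self.TAG_LEN], blob[self.TAG_LEN:]
        expected = hmac.new(self._key, message, hashlib.sha256).digest()
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(tag, expected):
            raise ValueError("log entry failed integrity check")
        return message
```

This detects modification of a returned entry, but it does nothing about the external component withholding entries altogether. Detecting gaps would need something extra, such as sequence numbers kept by the internal component.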

TEE loading – TEE-external

Any information which is available to any TEE-external component must be assumed to be unprotected and untrusted.  The only exceptions are if data is signed (for integrity) or encrypted (for confidentiality, though integrity is typically also transparently assured when data is encrypted), as noted above.  The TEE-external component may choose to store or transmit error messages from the TEE-internal component, as noted above, but it may also generate log entries of its own.  There are five possible (legitimate) consumers of these entries:

  1. The host system – the host (general logging, operating system or other components) may consume information around successful loading to know when to start billing, for instance, or consume information around errors for its own purposes or to transmit to the tenant (where the TEE loading component is not in direct contact with the client, or other communication channels are preferred).
  2. The TEE loading internal component – there may be both success and failure events which are useful to communicate to the TEE loading internal component to allow it to make decisions.  Communications to this component assume, of course, that loading was sufficiently successful to allow the TEE loading internal component to start execution.
  3. The TEE runtime external component – if the lifecycle has proceeded to the stage where the TEE runtime component is executing, the TEE loading external component can communicate logging information to it, either directly (if they are executing concurrently) or via another entity such as storage.
  4. The TEE runtime internal component – similarly to case #3 above, the TEE loading external component may be able to communicate to the TEE runtime internal component, either directly or indirectly.
  5. The client – as noted in #1 above, the host may communicate logging information to the client.  An alternative, if an appropriate communications channel exists, is for the TEE loading external component to communicate directly with it.  The client should always treat all communications with this component as untrusted (unless they are being transmitted for the internal component, and are appropriately integrity/confidentiality protected).

The TEE runtime layer

TEE runtime – TEE-internal

While the situation for this component is similar to that for the TEE loading internal component, it is somewhat simpler because the fact that this stage of the lifecycle has been reached means that the application has, by definition, been loaded and is running.  This means that there are a number of different channels for communication of error messages: the application data plane, the runtime control plane and the TEE runtime external component.  Most logging information will generally be directed either to the application (for decision making or transmission over its data plane at the application’s discretion) or to the client via the control plane. Standard practice can generally be applied as to which of these is most appropriate for which use cases.

Transmission of data to the TEE runtime external component needs to be carefully controlled, as the runtime component (unless it is closely coupled with the application) is unlikely to be in a good position to judge what information might be considered sensitive if available to components or entities external to the TEE.  For this reason, either error communication to the TEE runtime external component should be completely avoided, or standardised (and carefully designed) error messages should be employed – which makes standard debugging techniques extremely difficult.

Debugging

Any form of debugging for TEE instances is extremely difficult, and there are two fairly stark choices:

  1. Have a strong security profile and restrict debugging to almost nothing.
  2. Have a weaker security profile and acknowledge that it is almost impossible to ensure the protection of the confidentiality and integrity of the workload (the application and its data).

There are times, particularly during the development and testing of a new application, when the latter is the only feasible approach.  In this case, we can recommend two principles:

  1. Create a well-defined set of error states which can be communicated via untrusted channels (that is, which are generally unprotected from confidentiality and integrity attacks), and which do not allow for “free form” error messages (which are more likely to leak information to a host).
  2. Ensure that any deployment with a weaker profile is closely controlled (and never reaches production).

These two principles can be combined, and a deployment lifecycle might allow for different profiles: e.g. a testing profile on local hardware allowing free form error messages and a staging profile on external hardware which only allows for “static” error messages.
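One way to picture the combination of fixed error states and deployment profiles is the sketch below. The specific error states, the profile names and the function `host_visible_message` are hypothetical; a real framework would design its own vocabulary.

```python
from enum import Enum
from typing import Optional


class ErrorState(Enum):
    # Hypothetical fixed vocabulary of error states.
    LOAD_FAILED = 1
    ATTESTATION_FAILED = 2
    RESOURCE_EXHAUSTED = 3
    INTERNAL_ERROR = 4


class Profile(Enum):
    TESTING = "testing"  # local hardware: free-form detail allowed
    STAGING = "staging"  # external hardware: static messages only


def host_visible_message(profile: Profile, state: ErrorState,
                         detail: Optional[str] = None) -> str:
    """Build the only text allowed to cross an untrusted channel."""
    if profile is Profile.TESTING and detail:
        return f"{state.name}: {detail}"
    # Outside the testing profile, free-form detail is dropped, not forwarded.
    return state.name
```

Because callers can only pass an `ErrorState` member, the set of messages visible to the host under the staging profile is known in advance and can be reviewed for information leakage.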

Standard operation

Standard operation must assume the worst case scenario, which is that the host may block, change and interfere with all logging and error messages to which it has access, and may use them to infer information about the workload (application and associated data), affecting its confidentiality, integrity and normal execution.  Given this, the default must be that all TEE-internal components should minimise all communications to which the host may have access.

Application

To restrict application data plane communication is clearly infeasible in most cases, though all communications should generally be encrypted for confidentiality and integrity protection and designers and architects with particularly strong security policies may wish to consider how to restrict data plane communications.

Runtime component

Data plane communications from the runtime component are likely to be fewer than application data plane communications in most cases, and there may also be some opportunities to design these with security in mind.

TEE loading and TEE runtime components

These are the components where the most care must be taken, as we have noted above, but also where there may be the most temptation to lower levels of security if only to allow for easier debugging and error management.

Summary

In a standard cloud deployment, there is little incentive to consider strong security controls around logging and debugging, simply because the host has access not only to all communications to and from a hosted workload, but also to all the code and data associated with the workload at runtime.  For Confidential Computing workloads, the situation is very different, and designers and architects of the TEE infrastructure and even, to a lesser extent, of potential workloads themselves, need to consider very carefully the impact of the host gaining access to messages associated with the workload and the infrastructure components.  It is, realistically, infeasible to restrict all communication to levels appropriate for deployment, so it is recommended that various profiles are created which can be applied to different stages of a deployment, and whose use is carefully monitored, logged (!) and controlled by process.

Why is Attestation Required for Confidential Computing?


Alec Fernandez (alfernandez@microsoft.com)

At the end of 2022, the Confidential Computing Consortium amended the definition of Confidential Computing. We added attestation as an explicit part of the definition, but beyond updating our whitepaper we did not explain to the community why we made this change.

First off, an attestation is the evidence that you use to evaluate whether or not to trust a Confidential Computing program or environment. It’s sometimes built into a common protocol as in RA-TLS / Attested TLS. In other uses it might be built into the boot flow of a Confidential VM or built into an asynchronous usage like attaching it to the result of a Confidential Process.

To many of us, attestation was an implicit part of Confidential Computing architecture. However, it is so central to the idea of Confidential Computing that it really needed to be part of the formal definition.

Hardware and software providers have long offered assurances of security and these assurances have oftentimes fallen short of expectations. A historical analysis of the track record for placing trust in individual organizations to protect data raises important questions for security professionals. The recurrence of data breaches has led to understandably deep skepticism of technologies that purport to provide new security protections.

Users desire to see for themselves the evidence that new technologies are actually safeguarding their data.

Attestation is the process by which customers can alleviate their skepticism by getting answers to these questions:  

  • Can the TEE provide evidence showing that its security assurances are in effect?
  • Who is providing this evidence?  
  • How is the evidence obtained?
  • Is the evidence valid, authentic, and delivered through a secure chain of custody?
  • Who judges the evidence? 
  • Is the judge separate from the evidence provider?
  • Who provides the standards against which the evidence is judged?
  • Can evidence assure that the code and data protection claims are in effect? 

Hardware based attestation evidence is produced by a trusted hardware root-of-trust component of the computing environment. The hardware root-of-trust is a silicon chip or a set of chips that have been specifically designed to be highly tamper resistant. Some have been reviewed by researchers at standards organizations such as NIST, NSA, ICO, ENISA and academic institutions around the world and the technical community at large. While a critique of the analyses behind hardware roots of trust is beyond the scope of this paper, we take them to represent the current state of the art in computer security. They represent a significant improvement over available alternatives. See reference material at the end of this blog for more information.

Providing Attestation Evidence

Attestation evidence is delivered in a message containing authentic, accurate and timely measurements of system components such as hardware, firmware, BIOS and the software and data state of the computer being evaluated. Importantly, this attestation evidence is digitally signed by a key that is known only to the hardware root-of-trust (often the physical CPU) and cannot be extracted. This means that the attestation evidence is secured: once it leaves the hardware, it cannot be altered without the alteration being detected. It is impervious to tampering by the host operating system, the kernel or the cloud platform provider. This eliminates chain of custody concerns as the evidence flows from the producer to the consumer.

Validating the Authenticity of Attestation Evidence

Before examining the attestation evidence, the source of the evidence must be established. This is done by verifying the digital signature on the attestation evidence against a certificate issued by the manufacturer of the hardware root of trust, for example the manufacturer of the physical CPU in the computer. If the signature verifies against the manufacturer’s certificate, then this proves that the attestation report was produced by the CPU hardware. This means that if you trust the company that manufactured the hardware, then you can trust the attestation report.

Who Judges the Attestation Evidence? Are they Separate from the Evidence Provider?

Having the attestation evidence delivered in a message that is digitally signed by hardware allows for TEE users to establish for themselves that the security assurances provided by the TEE are in place. This can be done without the provider of the computing infrastructure or intervening parties being able to alter the evidence during delivery.

Attestation evidence is highly technical and oftentimes it is not feasible for an organization to judge the evidence themselves. This is especially true when the organization is not specialized in computing infrastructure security. In cases such as these, having a different entity, a third party with security expertise, evaluate the evidence offers a good balance between security and complexity. In this scenario, the computing infrastructure or device user is implicitly trusting the entity that verifies the attestation evidence (the verifier). In such scenarios, it is imperative for the device user to have access to effective mechanisms to verify the authenticity and reliability of the verifier to ensure that the attestation results produced by the verifier are legitimate and trustworthy.

Who provides the standards against which the evidence is judged?

The attestation evidence contains claims about the physical characteristics and the configuration settings of the execution environment. Examples include:

  • CPU Manufacturer, model and version and identifier.
  • Microcode and firmware version.
  • Configuration settings, e.g., whether memory encryption is enabled.
  • Encryption configuration, e.g., whether a different key is used to protect each individual VM.

The values supplied in the attestation evidence are compared against reference values. For example, the firmware supplier might recommend that it be patched to a specific version due to the discovery of a security vulnerability. The attestation evidence will accurately reflect the current firmware version. But who decides which are acceptable firmware versions?

  • Since the firmware is typically the responsibility of the hardware manufacturer and they have intimate knowledge of the details behind its security baseline, they should certainly be consulted.
  • The owner of the device or computing infrastructure should also be consulted since they could be responsible for any risks of data exfiltration.
  • In a public cloud environment, the computing infrastructure provider controls patching the firmware to the hardware manufacturer’s recommended version, but they do not make use of the resulting environment. The user of the TEE is responsible for data placed in the environment and must ensure that the firmware complies with their security policy.

Remote attestation provides a way to evaluate evidence that shows the actual firmware version provided by the TEE. This evidence is provided directly by the hardware on which the TEE is executing and allows the attestation verifier to independently verify when the patching was completed.
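The comparison of claims against reference values can be illustrated with a short sketch. The claim names, the reference values and the minimum firmware version below are invented for the example and do not correspond to any vendor's evidence format; the sketch also assumes that the signature on the evidence has already been verified.

```python
from typing import Dict, List, Tuple

# Hypothetical claim names and reference values, for illustration only.
REFERENCE_VALUES = {
    "memory_encryption": True,
    "per_vm_key": True,
}
MIN_FIRMWARE_VERSION = (1, 55, 3)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '1.55.3' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def appraise(claims: Dict[str, object]) -> List[str]:
    """Return the list of policy violations; an empty list means compliant."""
    problems = []
    for name, expected in REFERENCE_VALUES.items():
        if claims.get(name) != expected:
            problems.append(
                f"{name}: expected {expected!r}, got {claims.get(name)!r}")
    firmware = claims.get("firmware_version")
    if not isinstance(firmware, str) or parse_version(firmware) < MIN_FIRMWARE_VERSION:
        problems.append(f"firmware_version: {firmware!r} is below the minimum")
    return problems
```

In this sketch a missing claim counts as a violation, so evidence that omits a value is never treated as compliant by default. Who sets `REFERENCE_VALUES` and `MIN_FIRMWARE_VERSION` is exactly the policy question discussed above.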

More generally, attestation can be used to check whether all applicable security standards and policies have been met. This practically eliminates the possibility that a configuration error on the part of the computer or device owner will result in a security guarantee being falsely reported. The computer or device might be incorrectly configured in a way that goes undetected by its owner, but the attestation evidence comes directly from the hardware component that is executing the TEE and so remains accurate.

Relying on Attestation Evidence to Secure a TEE

An example of using attestation to provide data security is secure key release (SKR). One excellent use case for SKR is configuring your key management infrastructure (KMI) to evaluate the attestation evidence against a policy controlled by a verifier that the owner of the TEE deems trustworthy, and to refuse to supply the key needed to decrypt the computer’s OS disk unless the attestation evidence shows the computer to be in compliance. In this example, the attestation evidence is generated when the computer is powered on and sent to the KMI. If the attestation evidence indicates that the TEE is not in compliance with the policy (perhaps because the CPU firmware was not an acceptable version), then the KMI does not release the decryption key to the compute infrastructure. The data therefore cannot be decrypted, which removes the risk of its exfiltration from a non-compliant environment.
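The key release decision itself is simple to sketch. The `KeyManager` class and `firmware_policy` function are hypothetical stand-ins, not the API of any real key management product, and the example assumes the evidence signature has already been checked against the hardware manufacturer's certificate.

```python
from typing import Callable, Dict, Optional


class KeyManager:
    """Toy stand-in for a KMI that releases a disk key only to compliant TEEs."""

    def __init__(self, disk_key: bytes,
                 is_compliant: Callable[[Dict[str, object]], bool]) -> None:
        self._disk_key = disk_key
        self._is_compliant = is_compliant  # the release policy

    def request_key(self, verified_claims: Dict[str, object]) -> Optional[bytes]:
        # Assumes the claims come from evidence whose signature was verified
        # before reaching this point.
        if self._is_compliant(verified_claims):
            return self._disk_key
        return None  # refuse: the OS disk stays encrypted


def firmware_policy(claims: Dict[str, object]) -> bool:
    # Hypothetical policy: accept only allow-listed firmware versions.
    return claims.get("firmware_version") in {"1.55.3", "1.56.0"}
```

A request with an unlisted or missing firmware version gets `None` back, so the non-compliant machine never holds the key it would need to read the disk.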

Conclusion

Confidential Computing, through the use of hardware-based, attested TEEs and remote attestation, protects sensitive data and code against an increasingly common class of threats that occur during processing, while data is in use. These threats were previously difficult, if not impossible, to mitigate. Additionally, Confidential Computing allows for protecting data against the owner of the system and public cloud platforms, which traditionally had to simply be trusted not to use their elevated permissions to access the data.

 

References

https://nvlpubs.nist.gov/nistpubs/ir/2022/Nist.IR.8320.pdf

https://tools.ietf.org/html/draft-ietf-rats-architecture

CCC-A-Technical-Analysis-of-Confidential-Computing-v1.3_Updated_November_2022.pdf (confidentialcomputing.io)

Common-Terminology-for-Confidential-Computing.pdf (confidentialcomputing.io)

CCC_outreach_whitepaper_updated_November_2022.pdf (confidentialcomputing.io)