The Confidential Computing Consortium is delighted to announce ManaTEE, a new open-source project designed to enable secure data collaboration without compromising the privacy of individual data. Released by TikTok in June 2024 as part of its ongoing Privacy Innovation efforts, ManaTEE grew out of a core TikTok use case. Now part of the Confidential Computing Consortium, ManaTEE addresses the growing challenge of balancing privacy, usability, and accuracy in enterprise data collaboration.
The Challenge of Data Collaboration
While data collaboration is essential, designing and building a secure framework for it takes significant effort and comes with numerous caveats. Existing solutions, such as differential privacy or commercial data clean rooms, often fail to balance privacy, accuracy, and usability, particularly when handling large-scale data.
Introducing ManaTEE: A Two-Stage Data Clean Room
ManaTEE introduces a two-stage data clean room model to provide an interactive interface for exploring data while protecting private data during processing. It combines different privacy-enhancing technologies (PETs) across two stages:
- Programming Stage: Data consumers explore datasets using low-risk data, employing different PETs such as pseudonymization or differentially private synthetic data generation.
- Secure Execution Stage: Workloads run in a trusted execution environment (TEE), which provides attestable integrity and confidentiality guarantees for the workload in the cloud.
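As a rough illustration of the kind of low-risk data the Programming Stage might expose, the sketch below pseudonymizes an identifier column with a keyed hash (HMAC-SHA256) from the Python standard library. It is not ManaTEE's API: the function names, the sample rows, and the key are all hypothetical, and the assumption is simply that the data provider holds the key and hands only the pseudonymized view to data consumers.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a deterministic keyed pseudonym for an identifier."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    # Truncated for readability; a real deployment might keep the full digest.
    return digest.hexdigest()[:16]


def make_low_risk_view(rows, id_field, secret_key):
    """Copy the rows, replacing the identifier field with its pseudonym."""
    return [
        {**row, id_field: pseudonymize(row[id_field], secret_key)}
        for row in rows
    ]


# Hypothetical sample data and key, for illustration only.
rows = [
    {"user_id": "alice@example.com", "watch_minutes": 42},
    {"user_id": "bob@example.com", "watch_minutes": 17},
    {"user_id": "alice@example.com", "watch_minutes": 8},
]
key = b"demo-key-held-by-the-data-provider"

low_risk = make_low_risk_view(rows, "user_id", key)
for row in low_risk:
    print(row)
```

Because the same input always maps to the same pseudonym, joins and group-bys still work while a data consumer drafts a workload, yet the raw identifiers never appear in the exploration view.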
Key Benefits of ManaTEE
- Cloud-Ready: ManaTEE can be easily deployed to existing cloud TEE backends such as Google Confidential Space, eliminating the need to build an entire TEE infrastructure just to set up the framework. We plan to support other backends as well.
- Flexible PET: Data providers can control the protection mechanisms used at each stage to match the specific privacy requirements of their data.
- Trusted Execution Environment: By leveraging TEEs, ManaTEE gives both data providers and data consumers a high level of confidence in data confidentiality and program integrity.
- Accuracy and Utility: ManaTEE employs a two-stage design to ensure that result accuracy is not compromised for the sake of privacy.
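To make the accuracy trade-off concrete, here is a minimal sketch of the textbook Laplace mechanism applied to a counting query. It is a generic differential-privacy example, not code from ManaTEE, and the dataset, the epsilon value, and the function names are assumptions chosen for illustration. It shows how a result released under differential privacy deviates from the exact answer, which is the kind of accuracy loss the two-stage design is meant to avoid for final results.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a differentially private count of values matching predicate."""
    true_count = sum(1 for v in values if predicate(v))
    # A counting query has sensitivity 1, so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data and privacy budget, for illustration only.
rng = random.Random(0)
ages = [23, 35, 41, 29, 52, 38, 27, 45]

exact = sum(1 for a in ages if a >= 30)
noisy = dp_count(ages, lambda a: a >= 30, epsilon=0.5, rng=rng)

print("exact count:", exact)
print("noisy count:", round(noisy, 2))
```

With epsilon set to 0.5 the noise scale is 2, so the released count is off by about 2 on average, which is large relative to a true count of 5. Smaller epsilon values give stronger privacy and proportionally larger errors.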
Features of ManaTEE’s Data Clean Room
- Interactive Programming: Integrated with Jupyter Notebook, allowing data consumers to work with Python and other popular languages.
- Multiparty Collaboration: Enables collaboration with multiple data providers.
- Flexibility: Adaptable to specific enterprise needs.
ManaTEE Use Cases
- Trusted Research Environments (TREs): Secure data analysis for public health, economics, and more, while maintaining data privacy.
- Advertising & Marketing: Lookalike segment analysis and private ad tracking without compromising user data.
- Machine Learning: Enables private model training without exposing sensitive data or algorithms.
Open Collaboration and Governance
ManaTEE encourages open collaboration within its growing community. Currently led by TikTok’s founding developers, ManaTEE plans to expand its leadership through a Technical Steering Committee (TSC). Over time, project milestones and growth plans will be discussed publicly and governed transparently.
The ManaTEE project welcomes anyone interested in confidential computing and private data collaboration to participate and contribute.
Conclusion
ManaTEE is a significant step forward in secure data collaboration, balancing privacy, usability, and accuracy. Organizations can safely collaborate on sensitive datasets by leveraging TEEs and a two-stage clean room approach.
To learn more, visit the Confidential Computing Consortium or explore ManaTEE on GitHub.