Protopia AI Glossary
Protopia AI’s glossary explains the terms behind inference privacy: what protected representations are, how Stained Glass Transform works, and how the most common privacy-preserving technologies for AI inference compare. Each entry includes how the technology works, what it protects against, and what it does not.
A
Automatic Differentiation
A method used to efficiently and accurately compute the gradient of a function with respect to its inputs. The gradient provides information about the rate of change of the function in relation to its inputs and is used in optimization algorithms to update the parameters of a model.
Automatic differentiation is used in machine learning and artificial intelligence to compute gradients for training models. It is beneficial for training deep neural networks, where the computation graph can be complex, and the gradients are required for many optimization algorithms.
AI Factory
An AI Factory is a purpose-built data center environment designed to produce AI inference and training outputs at scale, combining accelerated compute, high-speed networking, storage, and a software stack into a single operational system.
The term traces to NVIDIA’s “data centers for manufacturing intelligence” framing. NVIDIA publishes Enterprise Reference Architectures as validated blueprints, and major system vendors ship AI Factory solutions on top of them, including Dell, HPE, Cisco, Lenovo, Supermicro, and Red Hat.
The economic case for AI Factories depends on running multiple tenants and workloads through the same GPU pool, driving utilization well above the 20-40% range typical of single-tenant deployments. Data sensitivity is the constraint that limits this consolidation. Protected representations remove that constraint by allowing sensitive workloads to run alongside others without exposing plaintext to the shared serving environment.
Related: Multi-tenant inference, Stained Glass Transform, Trust boundary.
See also: Protopia’s AI Factories solution page.
C
Computational Cost
A method used to efficiently and accurately compute the gradient of a function with respect to its inputs. The gradient provides information about the rate of change of the function in relation to its inputs and is used in optimization algorithms to update the parameters of a model.
Automatic differentiation is used in machine learning and artificial intelligence to compute gradients for training models. It is beneficial for training deep neural networks, where the computation graph can be complex, and the gradients are required for many optimization algorithms.
Confidential Computing
Confidential Computing is a hardware-based approach to protecting data in use by isolating computation inside a Trusted Execution Environment (TEE), where the data is encrypted in memory and decrypted only inside an enclave on the CPU or GPU. The category is defined by the Confidential Computing Consortium, a Linux Foundation project established in 2019.
On modern AI hardware, NVIDIA H100 and H200 GPUs support a Confidential Computing mode that extends the TEE boundary from a host CPU (Intel TDX or AMD SEV-SNP) to the GPU via an encrypted PCIe channel. Remote attestation lets the data owner cryptographically verify that the enclave is running expected, unmodified code.
Confidential Computing protects data from the infrastructure operator: the cloud provider, the hypervisor, and co-tenants. The entity that processes data inside the enclave, such as model operators, still see plaintext prompts and context. SGT and Confidential Computing are complementary: combined, neither operator sees plaintext.
Related: Trusted Execution Environment, Remote attestation, Model operator, Infrastructure operator.
D
Data Access
Data access is the ability for users to access their data given physical, software, or legal and policy-driven constraints. Data access can be fine-grained and allows users to carry out various operations such as creating, replacing, updating, deleting, or moving data for ML and AI operations.
The level of access granted to a user can vary depending on the data’s sensitivity and the user’s role in the organization. Data access can be managed by using authentication and authorization mechanisms, such as user accounts, passwords, and access control lists, to ensure that only authorized users can access sensitive data. Effective data access management helps to ensure the confidentiality, integrity, and availability of data and is critical for protecting sensitive information and maintaining the security of an organization’s information systems.
Data Anonymization
Data anonymization is the removal of personal data so that the data subject cannot be identified, directly or indirectly, by any reasonably available means. The standard set by GDPR Recital 26 treats data that meets this threshold as falling outside the regulation entirely because it is no longer considered personal data.
Common techniques include suppression (removing fields entirely), generalization (replacing specific values with ranges), aggregation (reporting only group-level statistics), and noise addition. Anonymization is distinct from pseudonymization, where data can still be re-identified using a separately held key; pseudonymized data remains subject to GDPR.
Anonymization is well-suited for structured datasets used in analytics, research, or training. It is less well-suited for AI inference at runtime, where sensitive context is often embedded in unstructured language. Anonymizing or stripping that context before sending it to a model can degrade output quality and still leaves the model operator with whatever sensitive content remains. SGT addresses this by transforming the entire input rather than removing fields from it.
Related: Data masking, Data tokenization (vault substitution), Stained Glass Transform.
Data Governance
Data governance is the overall management of the availability, usability, integrity, and security of the data used in an organization. It involves the development of policies, procedures, and standards for collecting, storing, processing, and sharing data, in addition to ensuring that the data is of high quality and protected from unauthorized access.
The goal of data governance is to ensure that data is effectively used to support decision-making, meet regulatory obligations, and achieve the organization’s goals. Data governance involves many stakeholders, including IT, business leaders, and data stewards, who collaborate to manage and govern data as a critical organizational asset.
Data Labeling
Data labeling identifies objects on raw data such as images, text, videos, and audio. The goal is to provide one or more informative labels to provide context so that a machine learning model can learn from it.
For example, labels might indicate whether a photo contains a train or a horse, which words were said in a voice recording, or if a lab image contains a tumor. Data labeling is required for various use cases, including computer vision, natural language processing, and speech recognition. Supervised machine learning models are built on large volumes of high-quality labeled data.
Data Masking
Data masking is the practice of obscuring sensitive data values, with the goal of preserving the structure and usability of the data while removing the sensitive content. Tokenization, redaction, scrambling, substitution, shuffling, and generalization are all techniques that fall under data masking.
In its most common form, substitute values are fictional but realistic-looking, preserving format and statistical properties so downstream systems continue to function. Masking can be reversible (using a secure vault) or destructive (one-way). NIST and major data protection regulations (GDPR, HIPAA, PCI DSS) treat masking as a primary de-identification mechanism.
For AI inference, the limitation is what masking cannot detect. Sensitive context expressed in natural language typically falls outside what masking tools can identify and replace. When destructive masking is applied, the placeholders also strip context the model needs to produce useful output. Published testing across leading LLMs reports that placeholder masking drops output quality to 54-68% of baseline, while deterministic tokenization preserves 91-96% (NoPII, April 2026).
Related: Data tokenization, Data redaction, Data anonymization, Stained Glass Transform.
Data Ownership
Data ownership refers to the rights and responsibilities of individuals or organizations in the collection, storage, use, and distribution of data. It defines who has the legal right to control the access to and use of specific data and who is responsible for ensuring its security, accuracy, and compliance with relevant laws and regulations.
In organizations, data ownership is typically assigned to specific individuals or departments, such as the chief information security officer (CISO) or the data custodian, who is responsible for managing and safeguarding the data. Data ownership can also be shared between multiple individuals or organizations, such as in the case of partnerships.
Clearly defining data ownership is important for ensuring the security, accuracy, and privacy of data, as well as for managing the risk associated with the collection, storage, and use of data. It also helps to ensure that data is used ethically and transparently and that any conflicts over data access or use are resolved fairly and efficiently
Data Perturbation
Data perturbation changes an original dataset by applying techniques that round numbers and add random noise. The range of values needs to be in proportion to the perturbation. A small base may lead to weak anonymization, while a large base can reduce the dataset’s utility.
Data Control
Data control refers to the measures and processes put in place to manage the access, use, and dissemination of data within an organization. It involves defining who can access and use specific data, what actions they can perform, and under what circumstances. Data control is important for maintaining the confidentiality, integrity, and security of data, as well as for ensuring that data is used responsibly and lawfully.
Data control can be implemented through a combination of technical, administrative, and physical controls, such as access control lists, firewalls, encryption, and data backup and recovery processes.
Data Redaction
Data redaction is a destructive form of data masking that removes sensitive information from a document or data stream, replacing it with blanks, blocks, or placeholder labels such as [REDACTED] or [NAME].
Redaction has a long history in physical document handling and is straightforward for visual documents and structured PII fields. For AI inference, redacted prompts can lose the semantic content the model needs to produce useful output. Redacting patient names from a clinical note may leave diagnosis and treatment context intact, but redacting symptoms, locations, or relationships often strips the meaning the model was being asked to reason about.
In testing across leading LLMs, placeholder redaction is reported to drop output quality to 54-68% of baseline. SGT takes a different approach: rather than removing content, it transforms the entire input into a protected representation that preserves utility for the model.
Related: Data masking, Data tokenization, Plaintext exposure.
Data Sharing
Data sharing is making data available to other individuals or organizations. It involves data exchange between individuals, groups, or organizations, either within the same organization or across multiple organizations. Data sharing can occur in many different forms, such as file sharing, database sharing, cloud sharing, and API sharing.
Data Tokenization
Data tokenization is a form of data masking that replaces sensitive data values with non-sensitive substitutes called tokens, with the originals stored in a secure vault that authorized parties can use to reverse the substitution. This is distinct from LLM tokenization, the subword segmentation step every language model performs on input text.
Tokenization in this sense is widely used in production, especially in payments, where it is a PCI DSS primary control. Vendors include Skyflow, Protegrity, and the open-source Microsoft Presidio.
For AI inference, tokenization works for structured PII such as names, SSNs, emails, and account numbers, but is more difficult to apply to sensitive context embedded in unstructured language. Vault compromise re-exposes the original data. SGT transforms the entire input rather than detecting and replacing fields.
Related: Data masking, Data anonymization, LLM tokenization, Stained Glass Transform.
Data-Type Agnostic
A data-type agnostic methodology is one that can be applied across text, code, tabular data, images, and video without per-data-type detection rules or pattern libraries.
Stained Glass Transform’s training methodology is data-type agnostic. The same approach has been used to produce SGTs for text, code, tabular, image, and video models.
Each SGT artifact is paired to a specific target model and modality. Covering a multimodal target requires a multimodal model and an SGT trained against it.
Related: Stained Glass Transform, Stained Glass Engine, Protected representation.
Data Pipeline
In data engineering and machine learning, a data pipeline refers to the steps involved in extracting, transforming, loading, and processing data. The goal of a data pipeline is to automate the flow of data from raw data sources to the final storage location, enabling organizations to quickly and easily access and analyze their data.
Data pipelines can be built using various tools and technologies, including batch processing frameworks, data processing engines, and cloud-based data platforms. Having an effective data pipeline can help organizations improve the quality and accuracy of their data, gain insights and make better decisions, and increase the efficiency and speed of their data processing.
Deployment Data
Deployment data refers to the data used to deploy and run a machine-learning model in a production environment. This data includes information about the hardware and software resources the model will be deployed on, the input and output data formats, and any pre-processing or post-processing steps that must be performed.
Deployment data for machine learning models and aI also includes information about how the model will be integrated into a larger system or application. This may include information about the APIs or services that will be used to access the model, as well as any security or access controls that need to be put in place.
Having accurate and complete deployment data is important for ensuring machine learning models are deployed correctly and perform well in production. Effective management of deployment data helps to ensure that machine learning models and AI can be deployed and used effectively in real-world applications.
Differential Privacy
Differential privacy is a mathematical framework for releasing statistical information about a dataset while bounding how much any individual record can influence the released output. Originally formalized in 2006 by Cynthia Dwork and colleagues, it provides a quantifiable privacy guarantee.
Production implementations include the Google Differential Privacy Library and OpenDP, used in deployments at the US Census Bureau, Apple, and Microsoft.
Differential privacy addresses training-data privacy, not inference-time prompt privacy. When a user sends a prompt to a DP-trained model, the model operator still sees the prompt in plaintext. DP and SGT solve different problems and are complementary: DP for training-data protection, SGT for inference-time prompt protection.
Related: Stained Glass Transform, Model operator.
Differentiable Programming
A technique in machine learning and artificial intelligence that involves training models using gradient-based optimization. Differentiable programs are programs that rewrite themselves at least one component by optimizing along a gradient like neural networks do use optimization algorithms such as gradient descent.
Differentiable programming makes it possible to train models with large amounts of data and complex architectures, such as deep neural networks.
E
Edge
An edge system is a computing system that is located at the “edge” of a network, close to the source of the data. Edge systems are built to process, analyze, and act on data in real-time, without transmitting all of the data to a centralized processing system. Edge systems are used in various applications, including the Internet of Things (IoT), where devices at the edge of the network generate and collect large amounts of data. By processing this data at the edge, edge systems can reduce the amount of data that needs to be transmitted to centralized systems, reducing latency, improving response times, and reducing the demands on centralized computing resources.
Edge systems may include a range of technologies and devices, including sensors, gateways, embedded computers, and smart devices. They may be powered by various processors and platforms, ranging from microcontrollers and single-board computers to powerful servers and cloud computing platforms. By processing data at the edge, edge systems can help organizations to make decisions in real-time, to respond quickly to changing conditions, and to provide a more responsive and effective service to their customers.
Encryption
Encryption is the process of converting plaintext data into a coded form of data that can only be unlocked by someone who has the appropriate key. The purpose of encryption is to protect sensitive information from unauthorized access and to ensure that data is transmitted securely across networks or stored safely on devices. Encryption algorithms use mathematical formulas to scramble the data, and the key is used to unscramble the data when needed.
Encryption is used in various applications, including digital communications, file storage, and secure online transactions. It is also used to protect sensitive data in transit, such as financial information and data stored on smartphones. Encryption is an essential tool for protecting privacy and data security and is a critical component of many security and privacy protocols. However, encryption can also be a source of complexity, and it is vital to understand the strengths and weaknesses of different encryption algorithms and to use encryption appropriately to protect sensitive information.
F
Federated Learning
Federated Learning is a machine learning technique that enables training on a decentralized dataset distributed across multiple devices. Instead of sending data to a central server for processing, the training occurs locally on each device, and only model updates are transmitted to a central server.
G
Gradient-based Optimization
Gradient-based optimization is a method used in machine learning and artificial intelligence to update the parameters of a model to minimize a loss function. The loss function measures the error between the model’s predictions and the actual outputs and is used in the optimization process.
In gradient-based optimization, the gradient of the loss function with respect to the model parameters is computed using automatic differentiation. The gradient provides information about how the model parameters should be updated to reduce the loss.
Generative AI
Generative AI is a type of artificial intelligence that creates new data or content based on the patterns and structures it has learned from existing data. Generative AI produces new instances that share similarities with the input data and can include text, art music, and design. By understanding the underlying patterns and structures in the input data, generative AI can develop new and unique output.
H
Homomorphic Encryption
A form of encryption that allows computations to be performed directly on ciphertext without the need first to decrypt the data. With homomorphic encryption, data remains encrypted throughout the computation, and the result of the computation is also encrypted. Homomorphic encryption enables computations on encrypted data without exposing the underlying sensitive information.
There are two main types of homomorphic encryption: fully homomorphic encryption and partial or somewhat homomorphic encryption. The first allows for any computations on encrypted data, while the latter only supports limited computations.
Homomorphic Encryption can be slow and computationally expensive, to the point that it’s not commercially currently practical.
Generative AI
Generative AI is a type of artificial intelligence that creates new data or content based on the patterns and structures it has learned from existing data. Generative AI produces new instances that share similarities with the input data and can include text, art music, and design. By understanding the underlying patterns and structures in the input data, generative AI can develop new and unique output.
Inference Data
Inference data is used to make predictions or inferences with a trained machine learning (ML) model. It is the input data that is fed into the trained model, which then produces a prediction based on the relationships it has learned from the training data.
Inference data can differ from the training data used to train the ML model. For example, the training data might be historical data from a certain period, while the inference data might be real-time data from the present.
Inference Privacy
Inference privacy is the protection of input data (prompts, queries, documents, contextual information) from exposure during AI model inference, including from parties operating the inference service.
It is distinct from training-data privacy, which addresses what a model can memorize from its training set and is typically handled with differential privacy. It is also distinct from infrastructure-level privacy, which protects data from the cloud provider or hypervisor and is typically handled with Confidential Computing. The protection target for inference privacy is the prompt itself: the question being asked, the document being summarized, the patient record being analyzed.
Several technology approaches address inference privacy with different tradeoffs. Confidential Computing isolates inference inside a hardware enclave. Homomorphic encryption performs computation on ciphertext. Tokenization and masking remove or substitute sensitive fields before inference. Protected-representation approaches transform the input mathematically before it reaches the model. Each makes different tradeoffs across hardware dependency, performance overhead, accuracy retention, and what part of the inference path it protects.
Related: Plaintext exposure, Private inference, Confidential Computing, Homomorphic encryption, Stained Glass Transform.
J
Jupyter Notebook
Jupyter Notebook is an open-source interactive computing platform that allows users to create and share documents that contain live code, equations, visualizations, and narrative text. Jupyter Notebook supports various programming languages, including Python, R, Julia, and Scala.
Jupyter Notebooks provide a convenient way to perform data analysis and scientific computing, allowing users to mix code, output, and explanations in a single document. They are widely used in data science, machine learning, and scientific computing for tasks such as data cleaning and transformation, and statistical modeling.
L
Labeled Data
Labeled data refers to data that has already been annotated or categorized with labels or tags that describe the content of the data. This data type is commonly used to train machine learning models for specific tasks, such as image classification, object detection, and natural language processing.
Machine learning algorithms use labels to learn the relationships between the inputs and outputs and to make predictions about new, unseen data. Unlabeled data is typically collected and annotated by human experts, who assign labels to each data sample based on its content.
Large language model (LLM)
A language model (LLM) is a form of artificial intelligence that relies on massive amounts of textual data to understand and produce language similar to humans. LLMs learn from the data by identifying patterns, structures, and connections among words and phrases, enabling them to answer questions, compose text, and a variety of language-related tasks. They are used for tasks such as text generation, translation, and summarization. ChatGPT is an example of a LLM.
M
Machine Learning Models
A machine learning model is a mathematical representation of a system capable of learning from data and making predictions or decisions. A machine learning model aims to capture patterns and relationships in data in a way that allows it to make accurate predictions on new and unseen data.
Some different types of standard machine learning models include linear and logistic regression, decision trees, random forests, support vector machines, and neural networks. The choice of model depends on the type of problem being solved and the data’s attributes.
A machine learning model consists of a set of parameters learned from training data and an algorithm that uses the parameters to make predictions. Once a model has been trained, it can be used to predict new and actual data. Machine learning models can be used for various applications, including image classification, speech recognition, and natural language processing (NLP). Machine learning models are being adopted as tools for automating decision-making and for discovering patterns and insights in data.
Model Drift
Model drift, also known as concept drift, refers to a phenomenon in machine learning where the distribution of the data changes over time, and the trained model’s performance degrades. This happens when the relationship between the input features and the target variable changes – causing the model to make incorrect predictions.
For example, consider a model trained to predict the performance of a football team based on its historical data. Over time, the team managers and players might change, causing the relationship between its team data and to change. In this case, the model trained on the historical data might not perform well on new data and would require retraining to maintain its accuracy.
Model drift can decrease accuracy, false predictions, and incorrect decisions. To prevent model drift, it is important to continuously monitor the performance of a model and retrain it as necessary. Techniques such as drift detection can be used to detect model drift.
Model Operator
The model operator is the entity that runs the AI model: the team or company that deploys the inference service, loads the model weights, and processes incoming prompts. The model operator sees plaintext prompts and plaintext outputs because they control the code that handles them.
Stained Glass Transform prevents plaintext exposure to the model operator. The transform is applied at the data owner’s trust boundary before the prompt leaves their environment. The model operator processes a protected representation rather than the original prompt.
Related: Infrastructure operator, Confidential Computing, Trust boundary, Stained Glass Transform.
MLOps
MLOps, an abbreviation for Machine Learning Operations, is a set of practices and processes for managing the end-to-end lifecycle of machine learning models. It includes tasks such as model development, testing, deployment, monitoring, and maintenance. The goal is to ensure the reliable and efficient delivery of machine learning models into production.
MLOps aims to address some of the challenges associated with deploying and maintaining machine learning models in production, such as model governance, model monitoring, model deployment, automating the deployment of models into production, and updating them as needed, and model lifecycle management.
ML Lifecycle
The machine learning (ML) life cycle refers to the stages in building, deploying, and maintaining a machine learning model. The problem that the ML model will solve needs to be clearly defined and understood. This includes identifying the type of data that will be used, the desired outcome, and the metrics that will be used to evaluate the model’s performance.
The ML life cycle can consist of but is not limited to the following stages: data collection, training unlabeled data, labeling, and model selection, evaluation, deployment, monitoring, and maintenance. The goal of the ML lifecycle is to develop and deploy machine learning models that are accurate, reliable, and can scale to deliver value to businesses or society.
Model Deployment
Model deployment refers to making a trained ML model accessible and usable in a real-world production environment by integrating it into a production system and monitoring its performance. The goal of model deployment is to ensure that the ML model can be used to make accurate predictions on new data in a reliable, scalable, and efficient manner.
The steps involved in deploying an ML model can vary but typically include:
- ML model packaging: The ML model is packaged in a format easily integrated into the production system, such as a REST API or Python package.
- Integration with production system: The packaged ML model is integrated into the production system to receive input data and produce predictions.
- Performance monitoring: The ML model is monitored in real-world conditions to ensure that it is making accurate predictions and functioning correctly.
- Maintaining and updating: The deployed ML model may require updates and maintenance over time to address issues such as model drift, changes in the data distribution, or changes in the underlying technology.
Multitenant Inference
Multi-tenant inference is the practice of running AI inference workloads from multiple tenants (business units, customers, partners, or use cases) on shared compute infrastructure rather than dedicated per-tenant hardware.
Multi-tenancy is central to the economics of accelerated computing. Static, single-tenant GPU allocation has been measured at 20-40% average utilization, while pooled multi-tenant deployments can reach 65-75% or higher.
Operating multi-tenant inference at scale involves several challenges, including GPU resource scheduling, performance isolation, VRAM allocation, cost attribution, and data sensitivity controls. Common technical approaches include NVIDIA Multi-Instance GPU (MIG), Kubernetes namespace and virtual cluster isolation, Confidential Computing, and Protopia Stained Glass.
Related: AI Factory, Plaintext exposure, Trust boundary.
N
Natural Language Processing (NLP)
Natural Language Processing (NLP) concerns interactions between computers and human (natural) languages. The goal is to enable computers to process, understand, and generate human language. NLP involves various tasks, such as text classification, sentiment analysis, translation, question-answering, and text summarization. These tasks are typically achieved through machine learning algorithms and deep learning models, such as neural networks.
Examples of NLP applications include chatbots, virtual assistants, language translation services, and text-to-speech systems.
Neural Network
Neural networks are a machine learning model loosely inspired by the structure and function of the human brain. Neural networks are designed to recognize patterns and relationships in data and make predictions based on this information. A neural network comprises collections of interconnected nodes (organized into layers) called artificial neurons, which process and transmit information. Each artificial neuron receives inputs, performs a computation, and produces an output passed to the subsequent layers of neurons. The connections between the neurons are associated with weights and biases, which are adjusted during the training process to improve the accuracy of the predictions.
Neural networks are widely used in various applications of AI, such as image recognition, speech recognition, and natural language processing. They are helpful in solving complex, non-linear problems and can be trained on large and diverse datasets, allowing them to learn from a wide range of data.
O
Optimization Step
The optimization step is the iteration process of finding the best set of parameters or weights for a machine learning model to predict the outputs based on the inputs accurately. The process involves adjusting the model parameters to minimize a loss function, which measures the difference between the predicted outputs and the actual outputs. The goal is to find the set of parameters that minimizes the loss function and results in the most accurate predictions.
Several optimization algorithms can be used for this purpose, including gradient, stochastic and gradient descent. The optimization algorithm depends on the specific requirements of the machine learning problem and the type of model being used. The optimization step is a key part of the machine learning life cycle, as it determines the best parameters for the model to achieve good performance and accuracy.
Open Banking
Open banking is a financial services model that allows third-party providers, such as fintech companies, to access bank customers’ financial data with their consent. The goal of open banking is to promote competition and innovation in the financial services industry by enabling customers to securely share their financial data with trusted providers who can offer a broader range of financial products and services.
Open banking is typically implemented through application programming interfaces (APIs), which allow third-party providers to access bank customers’ financial data in a secure and controlled environment. Banks must make their APIs available to third-party providers, who can develop new financial products and services that leverage this data. Open banking raises concerns about the security and privacy of customer data and the potential for financial fraud and abuse. Open banking is typically subject to strong regulatory oversight and requires banks and third-party providers to implement robust security and privacy measures.
P
Personally Identifiable Information (PII)
Personally Identifiable Information, abbreviated as PII, refers to any information that can be used to identify a specific individual, such as name, address, driver’s license number, etc. When training AI models, it is important to protect PII as it is sensitive information and must be handled in compliance with privacy regulations, such as the General Data Protection Regulation (GDPR) or the Health Insurance Portability and Accountability Act (HIPAA).
Plaintext Exposure
Plaintext exposure is the visibility of unencrypted, unprotected data to systems or operators outside the data owner’s trust boundary during AI inference. It is the gap that motivated the Confidential Computing Consortium’s definition of “data in use” as a third state of data requiring protection, alongside data at rest and data in transit.
Plaintext can be exposed across many operational surfaces during inference: request and response logs, caches, scheduler metadata, GPU and host memory, observability tools, debug snapshots, and backup archives. Encryption in transit (TLS) protects data on the wire but not on these surfaces. Multi-tenant scheduling controls who gets compute, not what data is visible to the serving environment.
Related: Trust boundary, Inference privacy, Confidential Computing, Stained Glass Transform.
Private Inference
Private inference is AI model inference performed under guarantees that the input data is not exposed to parties outside the data owner’s trust boundary, including the entity operating the inference service.
Related: Inference privacy, Confidential Computing, Homomorphic encryption, Stained Glass Transform.
Protected Representations
A protected representation is the output of applying Stained Glass Transform to an input: a stochastic, mathematically transformed version of the original data that the target AI model can process with accuracy parity to plaintext.
Protected representations are what flow to the inference endpoint when SGT is in place. The model performs inference on them as it would on any input embedding. Logs, caches, GPU memory, scheduler metadata, and observability tools on the host only see protected representations.
PyTorch
PyTorch is an open-source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing. PyTorch is a dynamic computational graph framework, which means that the user can change the graph on the spot during runtime. This makes it favorable to work with, compared to static computational graph frameworks like TensorFlow.
PyTorch is known for being user-friendly and flexible. It provides APIs for building and training neural networks, as well as tools for data loading and preprocessing. PyTorch also supports a wide range of features for computer vision, natural language processing, and reinforcement learning. PyTorch has a growing ecosystem of tools and libraries.
R
Remote Attestation
Remote attestation is the cryptographic process by which a system inside a Trusted Execution Environment proves to a remote party that it is running expected, unmodified code on genuine hardware.
Remote attestation is hardware-rooted: the proof traces back to keys provisioned at chip manufacture. This is a property specific to TEE-based approaches and is not shared by software-only inference privacy methods.
Related: Confidential Computing, Trusted Execution Environment.
R
Responsible AI
Responsible AI is a set of practices and principles designed to ensure and promote the safe and ethical use of AI, accounting for aspects such as the safety and benefit of users and organizations, compliance with legal regulations, fairness, transparency, privacy, and accountability, to name a few.
S
Sensitive Data
Sensitive data is confidential, private, or protected by regulations. This may include personal information, financial information, medical information, and other types of data that must be protected from unauthorized access or misuse.
Examples of sensitive data include
- Personal information such as names, addresses, government id numbers, and date of birth.
- Financial information such as bank account numbers and credit card numbers.
- Behavioral data such as internet search history and purchasing patterns.
- Medical data that contains diagnoses and test results.
The handling and protection of sensitive data is a key issue in the development and deployment of AI systems because any unauthorized access or misuse of this information can have catastrophic consequences, such as identity theft, financial fraud, and loss of privacy.To protect sensitive data, AI practitioners employ techniques such as Stained Glass Transformation™, data anonymization, data perturbation, or other methods to reduce the risk of unauthorized use. They may also implement strict data access controls and monitor the use of sensitive data to ensure that it is being used responsibly in accordance with legal requirements.
Stained Glass Engine
The Stained Glass Engine is the software Protopia AI uses to create Stained Glass Transforms. Given a target AI model and a representative training dataset, the Engine produces an SGT artifact paired with that model.
SGT creation is a post-training step. The output is an artifact that runs alongside the target model.
Related: Stained Glass Transform, Stained Glass Proxy.
Stained Glass Proxy
Stained Glass Transform Proxy is a forward proxy that accepts requests in the OpenAI Chat Completions API specification, transforms the prompt content using Stained Glass Transform, and forwards the transformed prompt embeddings to a downstream inference server. Applications send requests to the Proxy as they would to any OpenAI-compatible endpoint.
Stained Glass Transform
Stained Glass Transform (SGT)
Stained Glass Transform is Protopia AI’s inference privacy layer. An SGT converts sensitive input to a target AI model into a stochastic representation at the data owner’s trust boundary. The model processes that representation with accuracy parity to plaintext, and the model operator does not see the original input. The methodology has been applied across text, code, tabular data, image, and video models.
Each SGT is a software-only artifact paired with a specific target model and modality. It runs on commodity GPU infrastructure, requires no model retraining, and integrates through an OpenAI-compatible proxy.
SGT is designed to protect prompt data from the model operator. It can be deployed alongside hardware-based privacy approaches like Confidential Computing, which address different parts of the threat surface.
Related: Protected representation, Stochastic representation, Stained Glass Engine, Stained Glass Proxy, Inference privacy.
Structured Data
Structured data refers to data that is organized in a tabular format with well-defined columns and rows. It is often stored in databases, spreadsheets, or other data structures designed to be easily processed and analyzed. Examples of structured data include data from transactional systems, such as purchase histories and inventory records, and demographic data, such as age and income. Structured data can be easily input into machine learning algorithms for training and prediction. Machine learning algorithms are also optimized for structured data and can often achieve high accuracy and performance when trained on this data type.
Not all relevant information can be captured in structured data. Incorporating unstructured data, such as text, images, and audio can provide additional valuable insights and opportunities for machine learning models. Structured and unstructured data are often combined and used in machine learning for optimal results. Structured data can also be referred to as tabular data.
Synthetic Data
Synthetic data is artificially generated data that is used to mimic real-world data. Synthetic data is often used for testing and training machine learning models, for benchmarking and performance evaluation, and protecting sensitive data in applications such as healthcare and finance.
It is generated using algorithms that model real-world data’s statistical and structural properties. This allows synthetic data to closely resemble real-world data regarding its distribution, relationships, and patterns while still being wholly artificial and not containing any sensitive information. Synthetic data can generate large amounts of data quickly, and bypass the privacy and security risks associated with using real-world data.It is important to evaluate the use of synthetic data carefully and to understand its limitations in various applications.
Semi-structured data
Semi-structured data does not follow any data model because it does not have a fixed schema. Unlike structured/tabular data, it lacks any rigid form. This type of data typically contains elements of both structured and unstructured data and is often found in databases and data warehouses. Examples of semi-structured data include JSON and XML files, which have both structured elements, such as key-value pairs, and unstructured elements, such as text and multimedia data. Processing and analyzing this type of data requires specialized techniques and solutions capable of handling both structured and unstructured data.
To effectively process and analyze semi-structured data for machine learning purposes, it is often necessary to first pre-process and convert the data into a format that can be used as input for machine learning algorithms. This may involve techniques such as data labeling. Semi-structured data can provide insights and opportunities for machine learning models, as it often contains rich and diverse information.
T
Tabular Data
Tabular data refers to data organized in a table format with rows and columns. Each row represents an instance or an observation, and each column represents a feature or an attribute of the cases. Tabular data is commonly used as input for machine learning algorithms. It is easy to process and analyze using traditional data management techniques, and machine learning algorithms are often optimized for this data type. Tabular data is synonymous with structured data.
Not all relevant information can be captured in tabular data. Incorporating other data types, such as text, images, and audio, can provide additional valuable insights and opportunities for machine learning models and AI.
TensorFlow
TensorFlow is an open-source software library for dataflow and differentiable programming across various tasks. It is a platform for building machine learning models, focusing on the training and inference of deep neural networks.
TensorFlow allows users to define and execute computations as a graph of tensors, which are multi-dimensional arrays. The library provides a range of tools and libraries for implementing and training machine learning models, as well as deploying models in a production environment. TensorFlow supports a wide range of platforms, from computers to cloud-based systems, and can be used for both research and production purposes.
Torch
Torch is an open-source machine learning library written in the Lua programming language. Torch is a scientific computing framework that provides an easy-to-use and flexible platform for building and training machine learning models. It offers several tools for building and training neural networks, as well as for data loading and preprocessing.
Torch is focused on speed and efficiency, making it suitable for large-scale machine-learning tasks. Torch also provides a number of pre-trained models and a large ecosystem of packages and libraries, making it easier to build and train new models.
Training Data
Training data is a set of data used to teach a machine learning model to make predictions or perform a specific task. It includes input and output data that the model uses to learn patterns and relationships in the data. The model adjusts its parameters during the training process to minimize errors between its predictions and the actual output to achieve good performance on the task for new, unseen data.
Trust Boundary
The trust boundary is the perimeter around the systems and operators a data owner trusts with plaintext access to their sensitive data. Inside the boundary, plaintext is acceptable. Outside the boundary, only protected forms of the data should appear.
In practice, the boundary is determined by where the data owner has direct control: their corporate network, on-premise cluster, or controlled cloud account. The boundary’s placement determines what each privacy technology must defend against and what protected forms of the data have to look like when they cross it.
Different inference privacy approaches manage the boundary differently. Hardware enclaves extend a boundary around a specific compute region. Encryption protects data as it crosses a boundary in transit or at rest. Mathematical transforms change what crosses the boundary from plaintext into a representation that downstream systems can process but not interpret.
Related: Plaintext exposure, Inference privacy, Confidential Computing, Stained Glass Transform.
Training Model
A training model refers to a quantitative representation of a problem that is used to learn patterns and relationships in training data. A training model is designed to optimize its parameters for accurate predictions on unseen data. The training process involves providing the model with a set of labeled examples and adjusting the parameters of the model based on the accuracy of its predictions.
Once the model is trained, it can predict new, unseen data by applying the learned patterns and relationships to the input data. There are many different types of training models in machine learning, including linear models, decision trees, and neural networks. The choice of model will depend on the problem being solved, the nature of the data, and the desired level of accuracy. Training models are a fundamental step in developing machine learning systems, enabling the model to learn from data and make accurate predictions on new data.
Trusted Execution Environment (TEE)
Trusted Execution Environment (TEE)
A Trusted Execution Environment is a hardware-isolated region of a processor where code and data are protected from the operating system, hypervisor, and other processes on the same machine.
TEEs combine memory encryption, access control, and remote attestation to provide isolation rooted in hardware.
TEEs protect against the infrastructure operator. They do not protect against the model operator, who controls the code running inside the enclave and sees plaintext prompts. TEEs also have a hardware dependency, requiring specific GPU and CPU SKUs not available at the edge or on commodity infrastructure.
Related: Confidential Computing, Remote attestation, Model operator, Infrastructure operator.
Task-agnostic
Task-agnostic refers to algorithms or models that can be applied to various jobs, regardless of the specific task being performed. The same model or algorithm can be used for different problems, such as classification, regression, or generation. Task-agnostic approaches can be applied to a wider range of problems without additional modifications. They also make it easier to develop and implement machine learning systems, as they can be trained on multiple tasks and reused across different applications.
Examples of task-agnostic approaches in machine learning include transfer learning and multi-task learning. In transfer learning, a pre-trained model is fine-tuned on a new task, allowing the model to leverage its existing knowledge and reduce the amount of data required for training.
Trust Boundary
A trust boundary refers to a clear distinction between the parts of a system that are trusted to behave correctly and securely and those that are not. This distinction is essential in AI systems, as it helps ensure that sensitive information and critical decisions are made by trusted components of the system while reducing the risk of bad actors compromising the system or making incorrect decisions.
Examples of trust boundaries in AI systems include:
- Data processing: Trust boundaries can be established to ensure that sensitive data is only processed by trusted system components and is not accessed or manipulated by external parties.
- Model training: Trust boundaries can be established to ensure that machine learning models are only trained on trusted data and that outside parties do not compromise the training process.
- Model inference: Trust boundaries can be established to ensure that machine learning models are only used for prediction and decision-making purposes by trusted components of the system and that untrusted parties do not manipulate the predictions made by the models.
Establishing and maintaining trust boundaries in AI systems is essential for ensuring their reliability and security and for maintaining the trust of users and stakeholders in the system.