A
Automatic differentiation is a method used to efficiently and accurately compute the gradient of a function with respect to its inputs.
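As an illustration of computing a gradient exactly rather than by numerical approximation, here is a minimal sketch of forward-mode automatic differentiation using dual numbers. The `Dual` class and the `derivative` helper are hypothetical names chosen for this example, and only addition and multiplication are supported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A number that carries its value and its derivative w.r.t. the input."""
    value: float
    deriv: float

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants, whose derivative is 0."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def derivative(f, x):
    """Evaluate df/dx at x by seeding the input's derivative with 1."""
    return f(Dual(float(x), 1.0)).deriv


def f(x):
    return 3 * x * x + 2 * x + 1  # analytically, f'(x) = 6x + 2


print(derivative(f, 2.0))
```

Libraries such as PyTorch and TensorFlow typically rely on reverse-mode differentiation instead, which scales better when a function has many inputs and a single output.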
C
Computational cost refers to the computing resources required to complete a specific task. These resources can be memory, computation time, bandwidth, or
Confidential Computing offers a hardware-based security solution designed to protect data during use with application-isolation technology.
D
Data access is the ability for users to access their data given physical, software, or legal and policy-driven constraints.
Data anonymization protects private or sensitive information by erasing or encrypting identifiers that connect an individual to the data.
Data governance is the overall management of the availability, usability, integrity, and security of the data used in an organization
Data labeling identifies objects in raw data such as images, text, videos, and audio. The goal is to attach one or more informative labels that give the data context so that a machine learning model can learn from it.
Data masking obscures or replaces sensitive information in a dataset to minimize exposure while maintaining the data’s functional value.
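A minimal sketch of data masking, assuming the sensitive field is a payment card number and that keeping the last four digits is enough to preserve its functional value. The function name `mask_card_number` and its parameters are hypothetical.

```python
def mask_card_number(card_number: str, visible: int = 4, mask_char: str = "*") -> str:
    """Replace every digit except the last `visible` ones, keeping separators."""
    total_digits = sum(ch.isdigit() for ch in card_number)
    digits_seen = 0
    masked = []
    for ch in card_number:
        if ch.isdigit():
            digits_seen += 1
            # Only the trailing `visible` digits stay readable.
            keep = digits_seen > total_digits - visible
            masked.append(ch if keep else mask_char)
        else:
            # Separators such as "-" or " " are left in place so the format survives.
            masked.append(ch)
    return "".join(masked)


print(mask_card_number("4111-1111-1111-1234"))
```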
Data ownership refers to the rights and responsibilities of individuals or organizations in the collection, storage, use, and distribution of data.
Data perturbation changes an original dataset by applying techniques that round numbers and add random noise.
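The two techniques named above, rounding and random noise, can be sketched as follows. This assumes Gaussian noise and a list of numeric values; the function name `perturb` and its parameters are hypothetical, and a real deployment would need to choose the noise distribution and scale deliberately.

```python
import random


def perturb(values, decimals=0, noise_scale=1.0, seed=None):
    """Add zero-mean Gaussian noise to each value, then round the result."""
    rng = random.Random(seed)  # a seed makes the perturbation reproducible
    return [round(v + rng.gauss(0.0, noise_scale), decimals) for v in values]


salaries = [52_310.0, 61_875.0, 48_020.0]
print(perturb(salaries, decimals=-2, noise_scale=500.0, seed=7))
```

A negative `decimals` value rounds to tens, hundreds, and so on, which coarsens the data further.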
Data control refers to the measures and processes put in place to manage the access, use, and dissemination of data within an organization.
Data redaction refers to removing certain pieces of information from data, designed to keep that data from being linked to individuals or used for wrongdoing.
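A minimal sketch of data redaction, assuming the information to remove is email addresses in free text. The regular expression is a deliberately simple approximation and the name `redact_emails` is hypothetical; production redaction usually covers many more identifier types.

```python
import re

# Simplified email pattern; it does not attempt to cover every valid address.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub(placeholder, text)


print(redact_emails("Contact jane.doe@example.com today"))
```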
Data sharing is making data available to other individuals or organizations. It involves data exchange between individuals, groups, or organizations,
Data Tokenization is a process by which sensitive data is replaced by non-sensitive characters known as a token.
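A minimal sketch of tokenization, assuming an in-memory lookup table stands in for a secured token vault. The `TokenVault` class and the `tok_` prefix are hypothetical; the key property shown is that the token is random and carries no information about the original value.

```python
import secrets


class TokenVault:
    """Maps sensitive values to random tokens and back (in memory only)."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)
        while token in self._token_to_value:  # extremely unlikely collision
            token = "tok_" + secrets.token_hex(8)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only a party with access to the vault can recover the original.
        return self._token_to_value[token]


vault = TokenVault()
token = vault.tokenize("123-45-6789")
print(token, vault.detokenize(token))
```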
Data-type agnostic is the property of a system, process, or algorithm that can handle and process different types of data without any bias towards any particular type of data.
In data engineering and machine learning, a data pipeline refers to the steps involved in extracting, transforming, loading, and processing data.
Deployment data refers to the data used to deploy and run a machine-learning model in a production environment.
Differential privacy is a system for sharing information about a dataset by describing the patterns of groups within the dataset while withholding information about individuals
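One common way to achieve differential privacy for a counting query is the Laplace mechanism, sketched below. It assumes each individual contributes at most one record, so the query has sensitivity 1 and the noise scale is 1/epsilon. The function name `private_count` and its parameters are hypothetical.

```python
import random


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that match `predicate`."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    # The difference of two exponential draws with rate epsilon follows a
    # Laplace distribution with mean 0 and scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


ages = [23, 35, 41, 29, 52, 38]
print(private_count(ages, lambda age: age >= 35, epsilon=0.5))
```

A smaller `epsilon` adds more noise, which strengthens the privacy guarantee at the cost of accuracy.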
A technique in machine learning and artificial intelligence that involves training models using gradient-based optimization.
F
Federated Learning is a machine learning technique that enables training on a decentralized, distributed dataset.
G
Gradient-based optimization is a method used in machine learning and artificial intelligence to update the parameters of a model to minimize a loss function.
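The simplest form of gradient-based optimization is plain gradient descent, sketched here for a single parameter. The loss function, learning rate, and step count are illustrative assumptions, and `gradient_descent` is a hypothetical helper name.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly move the parameter against the gradient of the loss."""
    w = start
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w


# Illustrative loss: loss(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
# The minimum is at w = 3.
best_w = gradient_descent(lambda w: 2 * (w - 3), start=0.0)
print(best_w)
```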
H
Homomorphic encryption is a form of encryption that allows computations to be performed directly on ciphertext without first decrypting the data.
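To illustrate computing on ciphertext, the toy sketch below uses unpadded textbook RSA, which happens to be homomorphic for multiplication only. The tiny key (p = 61, q = 53) is an assumption chosen for readability and is completely insecure; practical systems use dedicated schemes and libraries.

```python
# Toy RSA key: n = 61 * 53, with e * d = 1 (mod (61 - 1) * (53 - 1)).
N = 3233
E = 17      # public exponent
D = 2753    # private exponent


def encrypt(message: int) -> int:
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    return pow(ciphertext, D, N)


c1 = encrypt(7)
c2 = encrypt(6)

# Multiply the ciphertexts without ever decrypting the inputs.
c_product = (c1 * c2) % N

# Decrypting the result yields the product of the plaintexts (valid while it is < N).
print(decrypt(c_product))
```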
I
Inference data is used to make predictions or inferences with a trained machine learning (ML) model.
J
Jupyter Notebook is an open-source interactive computing platform that allows users to create and share documents that contain live code, equations, visualizations, and narrative text.
L
Labeled data refers to data that has already been annotated or categorized with labels or tags that describe the content of the data.
M
A machine learning model is a mathematical representation of a system capable of learning from data and making predictions or decisions.
Model drift, also known as concept drift, refers to a phenomenon in machine learning where the distribution of the data changes over time, and the trained model’s performance degrades.
MLOps, an abbreviation for Machine Learning Operations, is a set of practices and processes for managing the end-to-end lifecycle of machine learning models.
The machine learning (ML) life cycle refers to the stages in building, deploying, and maintaining a machine learning model.
Model deployment refers to making a trained ML model accessible and usable in a real-world production environment by integrating it into a production system and monitoring its performance.
N
Natural Language Processing (NLP) concerns interactions between computers and human (natural) languages. The goal is to enable computers to process, understand, and generate human language.
Neural network is a broad term in the field of AI that refers to any type of network that is trained to process data
O
The optimization step is the iterative process of finding the set of parameters or weights that lets a machine learning model accurately predict outputs from its inputs.
Open banking is a financial services model that allows third-party providers, such as fintech companies, to access bank customers’ financial data with their consent.
P
Personally Identifiable Information, abbreviated as PII, refers to any information that can be used to identify a specific individual, such as name, address, driver’s license number, etc.
PyTorch is an open-source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing.
R
Responsible AI is a set of practices and principles designed to ensure and promote the safe and ethical use of AI,
S
Sensitive data is confidential, private, or protected by regulations. This may include personal information, financial information,
Structured data refers to data that is organized in a tabular format with well-defined columns and rows.
Synthetic data is artificially generated data that is used to mimic real-world data. Synthetic data is often used for testing and training machine learning models
Semi-structured data does not follow any data model because it does not have a fixed schema. Unlike structured/tabular data, it lacks any rigid form
T
Tabular data refers to data organized in a table format with rows and columns. Each row represents an instance or an observation, and each column represents a feature or an attribute of those observations.
TensorFlow is an open-source software library for dataflow and differentiable programming across various tasks.
Training data is the initial data fed into the system to train the ML algorithm. People (workforce), Process (business rules,
Model training is the part of the data science lifecycle in which datasets are used to train machine learning algorithms.
A training model refers to a quantitative representation of a problem that is used to learn patterns and relationships in training data.
Task-agnostic refers to algorithms or models that can be applied to various jobs, regardless of the specific task being performed.
A trust boundary refers to a clear distinction between the parts of a system that are trusted to behave correctly and securely and those that are not.