AI Dictionary

To help our community get acquainted with basic AI terminology, we have compiled the dictionary below.

Accuracy

The proportion of correctly identified positive and negative cases in a classification model. For example, an AI model that correctly identifies every positive and negative case of a condition such as Covid-19 would have an accuracy of 1.0 (100%). Note that in data protection law, accuracy refers to accurate and up to date record keeping.
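
As a minimal illustration, accuracy could be calculated in Python using scikit-learn; the labels and predictions below are made up for the example.

```python
from sklearn.metrics import accuracy_score

# Hypothetical example: 1 = patient has the condition, 0 = patient does not
true_labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 1, 0, 0, 0, 1, 1]

# Proportion of cases (positive and negative) identified correctly
print(accuracy_score(true_labels, predictions))  # 0.75, i.e. 6 of 8 cases correct
```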

Algorithm

A set of instructions that can be followed by a human or computer. For example, the NHS algorithm for detecting Acute Kidney Injury is a set of instructions that can be repeated for multiple patients. In AI, machine learning algorithms use data to make predictions or recommendations which can inform decision making.

Algorithmic Impact Assessment

Algorithmic impact assessments (often abbreviated as ‘AIAs’) are tools that set out frameworks and processes for assessing the possible societal impacts, both beneficial and adverse, of AI systems before the systems are in use (with ongoing monitoring often advised). An example AIA can be found here.

Anonymisation

The process of removing all identifiable information from data in a way which makes it theoretically infeasible to identify an individual. Anonymised data is not considered personal data under the GDPR, which means it is not subject to the same restrictions as personal data. For example, data can be anonymised by removing direct identifiers such as NHS number and name, translating exact values such as age into a range (25-40), and grouping postcodes together. You can read more about the challenges and approaches to anonymisation in the ICO code of practice.
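
A minimal sketch of those steps in Python is shown below; the record and field names are invented for illustration, and real anonymisation requires far more care than this.

```python
# A hypothetical patient record containing direct identifiers
record = {"name": "Jane Smith", "nhs_number": "000 000 0000", "age": 34, "postcode": "AB1 2CD"}

# Drop the direct identifiers, translate the exact age into a range and group the postcode
anonymised = {
    "age_range": "25-40" if 25 <= record["age"] <= 40 else "other",
    "postcode_area": record["postcode"].split(" ")[0],  # outward code only
}
print(anonymised)  # the name and NHS number are no longer present
```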

Area Under the (Receiver Operator Character) Curve

A single number calculated from a ROC curve to help summarise the performance of a classification model.

Bias

The disproportionate weighting in favour of, or against, a specific item or individual. Many types of bias exist. AI algorithms are often trained on historical data, which can contain biases that exist in society, and are trained by humans who themselves have biases. Unless the algorithms are built with fairness in mind, they may repeat these biases in their predictions. For example, an AI trained on CVs from engineers, who historically have been predominantly male, may unfairly penalise CVs from women if the algorithms are not tested for fairness and modified to remove bias. This is an example of selection bias.

Causality

The influence by which one event results in, or causes, another. For example, consuming more calories than you use up will lead to weight gain. While many AI systems use patterns found in data to make predictions, these patterns are rarely sufficient to determine the underlying cause of a behaviour.

Classification

A type of machine learning model which can predict whether or not you belong to a specific class, or label. Common approaches include logistic regression, decision trees and random forests. For example, a classification model may be built to identify individuals with diabetes. The classes could be Type-I diabetes, Type-II diabetes, gestational diabetes or no diabetes. Normally, a model would return the probability that you belonged to each class, and would assign you to the class with the highest probability.
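
A minimal sketch of this in Python, using scikit-learn's logistic regression and entirely made-up patient data, might look like this:

```python
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: [age, BMI] with a label (1 = diabetes, 0 = no diabetes)
X_train = [[25, 22], [60, 31], [45, 28], [30, 24], [70, 33], [50, 26]]
y_train = [0, 1, 1, 0, 1, 0]

model = LogisticRegression().fit(X_train, y_train)

# The model returns a probability for each class and assigns the most likely one
new_patient = [[55, 30]]
print(model.predict_proba(new_patient))  # e.g. [[0.3, 0.7]] for [no diabetes, diabetes]
print(model.predict(new_patient))        # the class with the highest probability
```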

Clustering

An application of unsupervised machine learning that seeks to 'cluster' data into groups by identifying similar patterns.
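
As an illustrative sketch, k-means clustering (one common approach) could be applied in Python with scikit-learn; the data points below are invented.

```python
from sklearn.cluster import KMeans

# Hypothetical, unlabelled data points: [age, number of GP visits per year]
data = [[25, 1], [30, 2], [68, 8], [72, 9], [27, 1], [70, 10]]

# Ask k-means to group the data into two clusters of similar points
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)
print(kmeans.labels_)  # e.g. [0, 0, 1, 1, 0, 1]: younger patients vs older, frequent attenders
```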

Confusion matrix

A table that is used to describe the performance of a classification model on a set of test data for which the true values are known. It can be used to derive a number of measures such as sensitivity and specificity. For example, a model may predict whether or not a patient has a form of diabetes. Its performance can be described by writing out where it correctly or incorrectly predicted diabetes (positive) or not (negative), given that we know the actual results.
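
A minimal sketch of building such a table in Python with scikit-learn, using invented results:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical predictions where the actual outcome is known (1 = diabetes, 0 = not)
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are the actual classes and columns the predicted classes:
# [[true negatives, false positives],
#  [false negatives, true positives]]
print(confusion_matrix(actual, predicted))
```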

Convolutional neural network

A type of neural network architecture normally used for computer vision/image processing tasks, for example classifying chest radiographs as normal/abnormal or segmenting areas of possible lung cancer.

Correlation

A measure that expresses the extent to which data are related in a direct, or linear, way. It describes a relationship between data without making any statement about cause and effect (i.e. correlation does not mean causality). However, it can describe a positive or a negative (also known as inverse) relationship between the data: positive means that if one goes up the other goes up too, while negative means that if one goes up the other goes down. For example, an individual's weight is correlated to their height, and waiting times are correlated to the number of people waiting.
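
As a small worked example, the Pearson correlation coefficient could be calculated in Python with NumPy; the height and weight measurements are made up.

```python
import numpy as np

# Hypothetical measurements for six people: heights (cm) and weights (kg)
heights = [150, 160, 165, 170, 180, 190]
weights = [55, 60, 63, 70, 78, 85]

# The coefficient ranges from +1 (perfect positive) through 0 (none) to -1 (perfect negative)
print(np.corrcoef(heights, weights)[0, 1])  # close to +1: a strong positive correlation
```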

Cross validation

An approach to reducing overfitting during model development, by iteratively selecting different portions of the data to train and validate a predictive (supervised) machine learning model. Cross validation can increase the overall performance of a model, along with data augmentation techniques.
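
A minimal sketch of 5-fold cross validation in Python with scikit-learn, using one of its bundled example datasets:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Train on four fifths of the data and validate on the remaining fifth,
# repeating so that every portion of the data is used for validation once
scores = cross_val_score(LogisticRegression(max_iter=5000), X, y, cv=5)
print(scores)         # one score per fold
print(scores.mean())  # the average performance across all folds
```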

Data augmentation

The process of artificially increasing the amount of data used to train a model, to reduce overfitting and improve model performance.
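
As a very small illustration for image data, new training examples can be created by flipping an existing image; the pixel values below are made up.

```python
import numpy as np

# A hypothetical 2x3 greyscale 'image' of pixel values
image = np.array([[0, 50, 100],
                  [150, 200, 250]])

# Create extra training examples by flipping the image left-right and up-down
augmented = [np.fliplr(image), np.flipud(image)]
print(len(augmented))  # two additional examples generated from one original
```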

Deep learning

An approach to building models using neural networks with more than one 'hidden' layer of artificial neurons. This is a common approach when working with image and text data. Deep learning models are able to capture complex relationships, but it can be difficult to interpret which data lead to a particular outcome.

Descriptive analytics

The analysis of data from the past that can help answer “What happened?” or “What is happening?”.

Explainability

AI can be built on complex algorithms and data, and explainability is a measure of how understandable, or explainable, the decisions of an AI system are to humans. XAI ("eXplainable Artificial Intelligence") is where humans can understand how the results of an AI model were obtained.

F1 score

A metric which describes the accuracy of a classification model, by combining the precision and sensitivity (recall) values into a single number, which ranges from 0 (poor accuracy) to 1 (high accuracy).
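
A minimal sketch of the calculation in Python with scikit-learn, using invented predictions; the F1 score is the harmonic mean of precision and recall.

```python
from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical predictions from a binary classifier
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

precision = precision_score(actual, predicted)
recall = recall_score(actual, predicted)

# F1 = harmonic mean of precision and recall, between 0 (poor) and 1 (high)
print(2 * precision * recall / (precision + recall))
print(f1_score(actual, predicted))  # the same value, computed directly
```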

Fairness

When individuals are not penalised by algorithms because they are part of a (sensitive) group. For example, an AI algorithm used in law enforcement in the USA was shown to unfairly increase sentence recommendations based on racial background (an example of unfairness caused by discrimination bias).

False negative

The incorrect prediction of a data point or individual not having a specific outcome or class.

False positive

The incorrect prediction of a data point or individual having a specific outcome or class.

Feature

A single type of measurement that is used as the input for a model. Synonymous with variables, covariates or columns.

Federated Learning

An approach to machine learning where a model is trained on data where the data exists (in multiple locations), rather than the traditional approach of moving the data to a central location for model training. This decentralised approach reduces the need to transfer data across different entities, and may reduce the associated information governance and security overheads that this can entail.

Foundation Model

A foundation model is an ML model trained on a large amount of data. The data is unlabelled and the model is trained using a self-supervised learning algorithm.

Generative Adversarial Network

A class of deep learning where two networks compete with each other to improve the overall performance of the model. A generator network will generate artificial examples based on the training data, and the discriminator will try to judge whether they are real or artificial. This is an iterative process which continues until the generator can adequately "fool" the discriminator. This technique is predominantly used in image generation.

Gradient descent

A common approach used in supervised machine learning, where models are trained to fit the training data, by minimising the error in predictions in an iterative fashion. In gradient descent, small 'steps' are taken in different 'directions' in search of the smallest error.
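
A minimal sketch in plain Python, minimising the made-up error function f(x) = (x - 3)^2:

```python
# The gradient of the error function f(x) = (x - 3)**2 is 2 * (x - 3)
x = 0.0              # starting guess
learning_rate = 0.1  # size of each 'step'

for step in range(100):
    gradient = 2 * (x - 3)            # direction of steepest increase in the error
    x = x - learning_rate * gradient  # take a small step in the opposite direction

print(x)  # close to 3, the value with the smallest error
```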

Graph Neural Network

A class of deep learning methods designed to make predictions on data described by graphs. Graphs are a way of representing data, relationships and their complexity. GNNs are neural networks that can be directly applied to graphs, and provide a way to generate node-level, edge-level, and graph-level predictions. In recent years, a number of variants of GNNs have been developed such as graph convolutional networks (GCN), graph attention networks (GAT) and graph recurrent networks (GRN).

Local explainability

The ability to explain why an AI prediction has been made for a specific data point.

Machine editing

Approaches to efficiently modify the behaviour of a machine learning model on certain inputs, whilst having little impact on unrelated inputs. Machine editing can be used to inject or update knowledge in the model, or to modify undesired behaviours.

Machine learning operations

The process of safely deploying, monitoring and updating machine learning models in production, or real-world, environments. Because machine learning models are built on data, and data can change, it is important to build robustness into the system so it can adapt to a changing environment without losing performance.

Machine unlearning

Approaches to efficiently remove the influence of a subset of the training data from the weights of a trained model, without retraining the model from scratch, and whilst retaining the model’s performance on downstream tasks. Machine unlearning could be used to remove the influence of personal data from a model if someone exercises their “Right to be Forgotten”.

Memorization

Machine learning models have been shown to memorize aspects of their training data during the training process. This has been demonstrated to correlate with model size (number of parameters).

Metadata

Data about data, or data that provides information on one or more aspects of the data.

Multimodal artificial intelligence

An approach to AI which incorporates multiple types of data. For example, a speech-to-text model, which is typically trained on audio and text data, could also include image data of lip movements taken from video recordings.

Overfitting

The process of building a model which is based too closely on the data. This results in a model which may be very accurate on the training data, but when tested on additional datasets such as the test data, unseen data or data from a new environment, performs badly. Approaches to reduce overfitting include cross-validation, data augmentation and ensemble techniques (which combine different models).

Precision

The proportion of positive predictions made by a classification model that are correct. It is defined as the number of true positives over the sum of true positives and false positives, and is also known as the positive predictive value.

Predictive analytics

The analysis of data where the goal is to create predictions of the future based on the past that can help answer “What will happen?"

Prescriptive analytics

The use of predictive analytics to recommend or automatically action an activity.

Pseudonymisation

A technique that separates data from direct identifiers (for example name, surname, NHS number) and replaces them with a pseudonym (for example, a reference number), so that identifying an individual from that data is not possible without additional information. The organisation that conducted pseudonymisation will be able to re-identify individuals if required.
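
A minimal sketch of the idea in Python, with invented records and a randomly generated pseudonym; real pseudonymisation must meet information governance requirements well beyond this.

```python
import uuid

# Hypothetical records containing a direct identifier (obviously fake NHS numbers)
records = [
    {"nhs_number": "000 000 0001", "diagnosis": "asthma"},
    {"nhs_number": "000 000 0002", "diagnosis": "type-II diabetes"},
]

# Replace the identifier with a pseudonym, keeping the mapping (the 'additional
# information') separately so that re-identification remains possible if required
key = {}
for record in records:
    pseudonym = str(uuid.uuid4())
    key[pseudonym] = record.pop("nhs_number")
    record["pseudonym"] = pseudonym

print(records)  # the shared data no longer contains the NHS numbers
print(key)      # held securely by the organisation that pseudonymised the data
```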

Receiver Operator Characteristic

Used to create a plotted curve which demonstrates how the trade-off between true positive rate and false positive rate changes as you vary the operating point of a classification model.
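
A minimal sketch of calculating the points on such a curve, and its area, in Python with scikit-learn, using invented labels and predicted probabilities:

```python
from sklearn.metrics import roc_curve, auc

# Hypothetical true labels and the probabilities predicted by a classification model
actual = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.55, 0.9]

# Each point on the curve is the true/false positive rate at one operating point
false_positive_rate, true_positive_rate, thresholds = roc_curve(actual, scores)
print(false_positive_rate, true_positive_rate)

# The area under this curve (AUC, defined above) summarises performance as a single number
print(auc(false_positive_rate, true_positive_rate))
```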

Recurrent neural network

A type of neural network architecture used to model sequential data.

Reinforcement learning

A type of machine learning where you define an environment and a goal, and iteratively attempt to maximise the goal by reinforcing actions that move towards it. For example, DeepMind successfully used reinforcement learning to master the board game Go by defining the game parameters and the goal, and suggesting the best moves to play against real human players.

Reproducible analytical pipeline

An automated set of code that processes data in line with best practice and ensures that it delivers the same results each time it runs.

Segmentation

A computer vision task that involves separating out regions of interest from an image, for example identifying the lung areas from a chest radiograph.

Self-supervised machine learning

An approach to machine learning where you do not have the outcomes of your data, but use the structure of your data to help determine what these outcomes are.

Semi-supervised machine learning

An approach to machine learning that combines data with known outcomes (as used in supervised machine learning) with data whose outcomes are unknown, to improve the model's ability to act on data it has not seen before.

Sensitivity

The ability of a classification model to correctly identify individuals with a condition. This is also known as the true positive rate and recall. It is defined as the number of true positives over true positives and false negatives.
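
As a small worked example with invented counts:

```python
# Hypothetical counts from a confusion matrix for a screening model
true_positives = 90   # people with the condition who were correctly identified
false_negatives = 10  # people with the condition who were missed

# Sensitivity (true positive rate, recall) = TP / (TP + FN)
sensitivity = true_positives / (true_positives + false_negatives)
print(sensitivity)  # 0.9: the model identifies 90% of people with the condition
```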

Shadow Deployment

A method of testing a candidate model for production where production data runs through the model without the model actually returning predictions to the service or customers. Essentially, simulating how the model would perform in the production environment.

Specificity

The ability of a classification model to correctly identify individuals without a condition. This is also known as the true negative rate.

Structured data

Information which is well structured, such as a spreadsheet of information. This means there are defined columns and you can expect additional data to follow the same or similar structure (perhaps with some missing values). Historically, data analysis was limited to structured data as it is well defined. More recently, advances in AI mean unstructured data is now more accessible.

Supervised machine learning

A type of machine learning where you know the outcome of the data you are looking to model. This includes regression and classification techniques.

Test data

Data that is not included in the training data, and is used to test that a model has accurately identified the patterns in the data that result in the desired behaviour.

Training

Most types of AI require a training process, which uses historical data to build a model able to predict future cases.

Training data

The data required to train, or “teach” a machine learning algorithm when developing a model. Good quality training data that is reflective of the population, unbiased and large enough to ensure a robust model is a key prerequisite for AI.

Training data leakage

Aspects of the training data can be memorized by a machine learning model during training and are consequently vulnerable to being inferred or extracted verbatim from the model alone. This is possible as the behaviour of the model on samples which were members of the training data is distinguishable from samples the model has not seen before. This leakage has been demonstrated on a range of machine learning models including Transformer-based Image and Language Models.

Trusted research environment

A secure computing environment that allows remote access to data for approved researchers. Also known as Secure Data Environment (SDE) or Data Safe Haven.

Underfitting

The process of building a model which is not based closely enough on the data. This results in a model which performs badly and fails to capture the relationships you are looking for. There is a balance to be made between underfitting and overfitting.

Unstructured data

Information which may have some structure (e.g. a name or some simple attributes or metadata), but by definition could contain a range of information. For example, images contain some structured data (image size, image format, image name) but the contents of those images can vary completely. Audio also contains some structure (length, format) but can vary considerably. Recent developments in AI approaches, computational power and the quantity of data available have led to an increase in the use of AI on unstructured data, such as medical imaging and speech to text.

Unsupervised machine learning

A type of machine learning where you do not know the outcome or definition of your data, and are looking for patterns. This includes clustering techniques such as k-means, as well as dimensionality reduction techniques such as principal component analysis (PCA).

Validation data

Data that is not included in the training data, but is used to check the performance of the model as it is being trained. This is separate to the test data used to check the final performance of the model.
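
A minimal sketch of separating data into training, validation and test sets in Python with scikit-learn, using one of its bundled example datasets; the 60/20/20 split is an arbitrary choice for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# First hold back a test set for the final performance check...
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# ...then split the remainder into training data and validation data,
# the latter used to check the model while it is being trained
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # roughly 60% / 20% / 20% of the data
```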
