These are just some of the terms you will come across on your AI journey. We’ve tapped a variety of sources to unpack what they mean.


Artificial Intelligence

Artificial intelligence (AI), in its broadest sense, is intelligence exhibited by machines, particularly computer systems. It enables machines to perceive their environment and to use learning and intelligence to take actions that maximise their chances of achieving defined goals.

Some high-profile applications of AI include advanced web search engines, recommendation systems, speech-based interaction, autonomous vehicles, generative and creative tools, and superhuman play and analysis in strategy games.


Generative AI

Generative AI refers to AI techniques that learn a representation of artifacts from data, and use it to generate brand-new, unique artifacts that resemble but don’t repeat the original data. These artifacts can serve benign or nefarious purposes. Generative AI can produce novel content (including text, images, video, audio, structures), computer code, synthetic data, workflows and models of physical objects. It can also be used in art, drug discovery or material design.

Improvements in transformer-based deep neural networks, particularly large language models (LLMs), enabled a boom in generative AI systems. These include chatbots, text-to-image generation systems, and text-to-video generators.


Diffusion models

A subset of generative AI, diffusion models create new data by iteratively making controlled random changes to an initial data sample. They start with the original data and add subtle changes (noise), progressively making it less similar to the original. This noise is carefully controlled to ensure the generated data remains coherent and realistic. After adding noise over several iterations, the diffusion model reverses the process. Reverse denoising gradually removes the noise to produce a new data sample that resembles the original.
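
The loop below is a minimal NumPy sketch of that idea. The noise schedule and the "noise predictor" (which simply peeks at the original sample) are illustrative assumptions, not a trained diffusion model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Initial data sample: a simple 1-D signal standing in for an image.
x0 = np.sin(np.linspace(0, 2 * np.pi, 64))

# Forward process: iteratively add small, controlled amounts of Gaussian noise.
T = 100
x = x0.copy()
for t in range(T):
    x = x + 0.05 * rng.normal(size=x.shape)   # each step makes x a little less like x0

# Reverse (denoising) process: a trained model would predict the noise to remove at
# each step; this stand-in "predictor" peeks at x0 purely to keep the sketch short.
for t in range(T):
    predicted_noise = x - x0                  # illustrative only, not a learned model
    x = x - predicted_noise / (T - t)         # gradually strip the noise back out

print(float(np.abs(x - x0).mean()))           # ~0: the new sample resembles the original
```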


Generative adversarial networks

Another generative AI subset, the generative adversarial network (GAN) takes a different approach, training two neural networks in a competitive manner. The first network, known as the generator, transforms random noise into fake data samples. The second network, called the discriminator, tries to distinguish between real data and the fake data produced by the generator. During training, the generator continually improves its ability to create realistic data while the discriminator becomes better at telling real from fake. This adversarial process continues until the generator produces data so convincing that the discriminator can no longer differentiate it from real data.
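
A compact PyTorch sketch of that adversarial loop, assuming the library is available; the tiny fully connected networks and the 1-D "real" data distribution are stand-ins for the image-scale networks used in practice.

```python
import torch
from torch import nn

# Generator: turns random noise into fake 1-D samples.
generator = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
# Discriminator: scores how likely a sample is to be real (1) rather than fake (0).
discriminator = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

for step in range(1000):
    real = torch.randn(64, 1) * 0.5 + 2.0     # "real" data drawn from N(2, 0.5)
    fake = generator(torch.randn(64, 8))      # fake data generated from random noise

    # Train the discriminator to tell real from fake.
    d_opt.zero_grad()
    d_loss = loss_fn(discriminator(real), torch.ones(64, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(64, 1))
    d_loss.backward()
    d_opt.step()

    # Train the generator to fool the discriminator.
    g_opt.zero_grad()
    g_loss = loss_fn(discriminator(fake), torch.ones(64, 1))
    g_loss.backward()
    g_opt.step()
```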


Variational autoencoders

Variational autoencoders (VAEs), used in generative AI, learn a compact mathematical representation of data called the latent space. VAEs use two neural networks, the encoder and the decoder. The encoder maps the input data to a mean and variance for each dimension of the latent space, and a random sample is then drawn from the Gaussian (normal) distribution defined by that mean and variance. This sample is a point in the latent space and represents a compressed, simplified version of the input data. The decoder takes this sampled point and reconstructs it back into data that resembles the original input. A loss function measures how well the reconstructed data matches the original data.
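
A minimal PyTorch sketch of a single VAE forward pass, assuming the library is available; the layer sizes and the random stand-in batch are illustrative assumptions.

```python
import torch
from torch import nn

latent_dim = 2

# Encoder: maps the input to a mean and a (log) variance per latent dimension.
encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 2 * latent_dim))
# Decoder: reconstructs data from a point sampled in the latent space.
decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, 784), nn.Sigmoid())

x = torch.rand(16, 784)                            # stand-in batch of flattened images
mean, log_var = encoder(x).chunk(2, dim=-1)

# Sample a latent point from the Gaussian defined by the predicted mean and variance.
z = mean + torch.exp(0.5 * log_var) * torch.randn_like(mean)

x_recon = decoder(z)                               # reconstruction of the input

# Loss: how well the reconstruction matches the input, plus a KL term that keeps
# the latent distribution close to a standard Gaussian.
recon_loss = nn.functional.binary_cross_entropy(x_recon, x, reduction="sum")
kl_loss = -0.5 * torch.sum(1 + log_var - mean.pow(2) - log_var.exp())
loss = recon_loss + kl_loss
```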


Transformer-based models

The transformer-based generative AI model builds upon the encoder and decoder concepts of VAEs. Transformer-based models add more layers to the encoder to improve performance on text-based tasks like comprehension, translation, and creative writing. They use a self-attention mechanism, weighing the importance of different parts of an input sequence when processing each element in the sequence. They also implement contextual embeddings, so the encoding of a sequence element depends not only on the element itself but also on its context within the sequence.
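
Self-attention itself reduces to a few matrix operations. The NumPy sketch below uses toy dimensions and random weights purely for illustration.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (seq_len, d_model)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # how strongly each element attends to the others
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
    return weights @ V                              # contextual embedding of each element

rng = np.random.default_rng(0)
seq_len, d_model = 5, 16
X = rng.normal(size=(seq_len, d_model))             # toy embeddings for a 5-token sequence
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)          # (5, 16): one context-aware vector per token
```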


Foundation models

Foundation models (FMs) are machine learning models trained on a broad spectrum of generalised and unlabeled data. They’re capable of performing a wide variety of general tasks. In general, an FM uses learned patterns and relationships to predict the next item in a sequence.
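
As a purely conceptual illustration of "predict the next item in a sequence", the toy sketch below learns simple co-occurrence counts; real foundation models learn far richer patterns from vastly larger, unlabelled datasets.

```python
from collections import Counter, defaultdict

# Toy "training" sequences standing in for a broad, unlabelled corpus.
corpus = [
    ["the", "cat", "sat"],
    ["the", "cat", "sat"],
    ["the", "cat", "ran"],
]

# Learn which item tends to follow which: a crude stand-in for learned patterns.
follows = defaultdict(Counter)
for sequence in corpus:
    for current, nxt in zip(sequence, sequence[1:]):
        follows[current][nxt] += 1

def predict_next(item):
    """Predict the most likely next item in a sequence from the learned counts."""
    return follows[item].most_common(1)[0][0]

print(predict_next("cat"))   # "sat": the item seen most often after "cat"
```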


Large language models

Large language models (LLMs) are a class of foundation models specifically focused on language-based tasks such as summarisation, text generation, classification, open-ended conversation, and information extraction. LLMs can perform multiple tasks because they contain many parameters that make them capable of learning advanced concepts.
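
A minimal usage sketch for one such task, assuming the Hugging Face transformers library is installed and its default summarisation model can be downloaded; the exact output will vary by model.

```python
from transformers import pipeline

# Downloads a default summarisation model on first use.
summariser = pipeline("summarization")

text = (
    "Large language models are foundation models trained on vast amounts of text. "
    "Their many parameters let them learn advanced concepts, so a single model can "
    "summarise, classify, extract information and hold open-ended conversations."
)
print(summariser(text, max_length=30, min_length=10)[0]["summary_text"])
```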


Machine Learning

Machine learning is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalise to unseen data, and thus perform tasks without explicit instructions.
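
A minimal scikit-learn sketch, assuming the library is installed and using made-up toy data: the model is fitted on labelled examples and then generalises to inputs it has never seen, rather than following hand-written rules.

```python
from sklearn.linear_model import LogisticRegression

# Toy labelled data: [hours studied, hours slept] -> passed the exam (1) or not (0).
X_train = [[1, 4], [2, 5], [8, 7], [9, 8], [3, 3], [10, 6]]
y_train = [0, 0, 1, 1, 0, 1]

model = LogisticRegression()
model.fit(X_train, y_train)              # learn a statistical pattern from the data

print(model.predict([[7, 7], [1, 2]]))   # generalises to examples it has never seen
```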


Deep learning

Deep learning is a subset of machine learning focused on training artificial neural networks with multiple layers. Inspired by the structure and function of the human brain, these networks consist of interconnected nodes (neurons) that transmit signals. By automatically extracting features from raw data through multiple layers of abstraction, deep learning excels at image and speech recognition, natural language processing and many other tasks. It can handle large-scale datasets with high-dimensional inputs, but the complexity of the models demands significant computational power and extensive training.
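
A minimal multi-layer network in PyTorch, assuming the library is available; real deep learning models stack many more layers and parameters than this sketch.

```python
import torch
from torch import nn

# A small network with several layers of interconnected "neurons".
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),   # first layer: extracts low-level features
    nn.Linear(256, 64), nn.ReLU(),    # second layer: combines them into higher-level features
    nn.Linear(64, 10),                # output layer: one score per class
)

images = torch.rand(32, 784)          # stand-in batch of flattened 28x28 images
scores = model(images)
print(scores.shape)                   # torch.Size([32, 10])
```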


General Intelligence (or Artificial General Intelligence)

A machine with artificial general intelligence should be able to solve a wide variety of problems with breadth and versatility similar to human intelligence. This is in contrast to narrow AI, which is designed for specific tasks. AGI is considered one of various definitions of strong AI.


Natural Language Processing

Natural language processing (NLP) allows programs to read, write and communicate in human languages such as English. Specific problems include speech recognition, speech synthesis, machine translation, information extraction, information retrieval and question answering.


Automation

Automation describes a wide range of technologies that reduce human intervention in processes, mainly by predetermining decision criteria, subprocess relationships, and related actions, as well as embodying those predeterminations in machines. Robotic process automation (RPA), sometimes referred to as software robotics, is a form of business process automation based on software robots or artificial intelligence agents. RPA should not be confused with artificial intelligence, as it is based on automation technology that follows a predefined workflow.


GPU

A graphics processing unit (GPU) is a specialised electronic circuit initially designed to accelerate computer graphics and image processing. Because of their parallel structure, GPUs were later found to be useful for non-graphics calculations involving embarrassingly parallel problems. Other non-graphical uses include the training of neural networks and cryptocurrency mining.
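
A small PyTorch sketch, assuming PyTorch and, optionally, a CUDA-capable GPU are available: a large matrix multiplication is an embarrassingly parallel workload, since every output element can be computed independently.

```python
import torch

# Use the GPU if one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.rand(4096, 4096, device=device)
b = torch.rand(4096, 4096, device=device)
c = a @ b        # each of the 4096 x 4096 output elements can be computed in parallel

print(device, c.shape)
```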


Neural Processor

An AI accelerator, deep learning processor, or neural processing unit (NPU) is a class of specialised hardware accelerator or computer system designed to accelerate artificial intelligence and machine learning applications, including artificial neural networks and machine vision. These processors excel at handling vast amounts of data in parallel, making them ideal for tasks like image recognition, natural language processing, and other AI-related functions.


Perception and Computer Vision

Machine perception is the ability to use input from sensors (such as cameras, microphones, wireless signals, active lidar, sonar, radar, and tactile sensors) to deduce aspects of the world. Computer vision is the ability to analyse visual input. The field includes speech recognition, image classification, facial recognition, object recognition, object tracking, and robotic perception.


Social intelligence

Affective computing is an interdisciplinary umbrella that comprises systems that recognise, interpret, process or simulate human feeling, emotion and mood.