
AI glossary: 40 definitions to avoid getting lost in the world of artificial intelligence

Written by Nanobaly
Published on 2024-10-24

Artificial intelligence (AI) has become an essential pillar of modern technology, impacting fields as diverse as healthcare, finance and education.


However, understanding the subtleties of AI can be complex, not least because of the technical jargon often associated with this discipline.


💡 This glossary offers a compilation of 40 key terms, aimed at clarifying essential AI concepts and making them easier to understand for both professionals and newcomers to the field.


Chatbots

Chatbots are computer programs that use artificial intelligence to simulate a conversation with users.

They can automatically answer questions, provide information or perform simple tasks by interacting via text or voice, and are often used on websites and applications.


Algorithm

An algorithm is a series of precise instructions or steps that a computer program follows to solve a problem or perform a specific task.

In AI, algorithms enable machines to make decisions, learn or process data automatically and efficiently.


Data annotation

Data annotation involves adding specific labels or descriptions to raw data (images, text, video, etc.) to make them comprehensible to AI algorithms.

This enables machine learning models to recognize objects, actions or concepts in this data.
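As a simple illustration, an annotation can be stored as a record that pairs the raw data with its labels. The field names and file path below are purely hypothetical:

# A minimal, hypothetical annotation record pairing raw data with its labels
annotation = {
    "image_path": "images/cat_001.jpg",  # the raw item being annotated (hypothetical path)
    "labels": [
        {"category": "cat", "bbox": [34, 50, 210, 180]},  # bounding box: x, y, width, height
    ],
    "annotator": "labeler_07",
}
print(annotation["labels"][0]["category"])  # -> "cat"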


Machine Learning

Machine learning is a branch of artificial intelligence in which machines learn from data without being explicitly programmed.

They identify patterns, make predictions and improve their performance over time using algorithms, as in image recognition or machine translation.


Multi-task learning

Multi-task learning is a method in which an artificial intelligence model is trained simultaneously on several related tasks.

This enables the model to learn more efficiently by sharing knowledge between tasks, thus improving its overall performance on the set of problems to be solved.


Reinforcement learning

Reinforcement learning is an AI technique in which an agent learns to make decisions by interacting with its environment.

It receives rewards or punishments depending on its actions, and adjusts its behavior to maximize long-term rewards, as in video games or robotics.


Supervised learning

Supervised learning is an AI method in which a model is trained on the basis of labeled examples.

Each piece of training data is associated with a correct response, enabling the model to learn to predict similar results for new, unseen data, such as recognizing objects in images or classifying e-mails.


Unsupervised learning

Unsupervised learning is an AI method in which a model is trained on data without predefined labels or responses.

It has to discover hidden patterns or structures on its own, such as clustering similar objects or detecting anomalies, without direct human supervision.


Algorithmic bias

Algorithmic bias occurs when an algorithm makes unfair or inequitable decisions due to biases in the data used to train it.

This can lead to discriminatory results or inequalities, affecting specific groups of people or situations, such as in recruitment or facial recognition.


Big Data

Big Data refers to large, complex data sets, often too large or varied to be processed by traditional methods.

This data comes from a variety of sources (social networks, sensors, etc.) and requires advanced techniques, such as AI and machine learning, to analyze it and extract useful information.


Classification

Classification is a machine learning technique in which a model is trained to assign predefined categories or labels to new data.

For example, classifying e-mails as "spam" or "non-spam", or recognizing objects in images, such as cats or dogs.
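As a minimal sketch, here is how a classifier could be trained and used with the scikit-learn library (assuming it is installed; the toy data and feature choices are illustrative):

# Minimal classification sketch (assumes scikit-learn is installed)
from sklearn.linear_model import LogisticRegression

# Toy training data: [message length, number of links] -> spam (1) or not (0)
X_train = [[120, 0], [30, 4], [200, 1], [15, 6]]
y_train = [0, 1, 0, 1]

model = LogisticRegression()
model.fit(X_train, y_train)      # learn from labeled examples
print(model.predict([[25, 5]]))  # predict the class of a new, unseen message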


Clustering

Clustering is an unsupervised learning method that groups similar data into sets called "clusters".

Unlike classification, there are no predefined labels: the model discovers similarities in the data on its own to create these groups, which are used for market analysis or customer segmentation, for example.
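A minimal clustering sketch with scikit-learn's KMeans, on illustrative customer data (assuming scikit-learn is installed):

# Minimal clustering sketch (assumes scikit-learn is installed)
from sklearn.cluster import KMeans

# Toy customer data: [annual purchases, average basket size]
X = [[5, 20], [6, 22], [50, 200], [52, 210], [48, 190]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)  # assign each customer to a cluster, no labels needed
print(labels)                   # two groups: small vs. large customers (label numbering may vary)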


Cross Entropy Loss

Cross entropy loss is a loss function used to evaluate the performance of a classification model. It measures the difference between model predictions and true labels.

The more incorrect the prediction, the greater the loss. The aim is to minimize this difference to improve predictions.
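For a binary classifier, cross entropy loss can be computed by hand, as in this small NumPy sketch (assuming NumPy is installed; the labels and probabilities are made up):

# Cross entropy loss for a binary classifier, computed by hand with NumPy
import numpy as np

y_true = np.array([1, 0, 1, 1])          # true labels
y_pred = np.array([0.9, 0.2, 0.7, 0.4])  # predicted probabilities for class 1

eps = 1e-12  # small constant to avoid log(0)
loss = -np.mean(y_true * np.log(y_pred + eps) + (1 - y_true) * np.log(1 - y_pred + eps))
print(round(loss, 3))  # the worse the predictions, the larger the loss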


Cross-validation

Cross-validation is a technique for evaluating machine learning models. It involves dividing a data set into several subsets (or "folds").

The model is trained on some subsets and tested on others, which makes it possible to estimate model performance more reliably and to detect overfitting.
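A minimal 5-fold cross-validation sketch using scikit-learn's cross_val_score on its built-in iris dataset (assuming scikit-learn is installed):

# 5-fold cross-validation sketch (assumes scikit-learn is installed)
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X, y, cv=5)  # train and test on 5 different folds
print(scores.mean())                         # average accuracy across the folds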


ROC and AUC curves

The ROC (Receiver Operating Characteristic) curve evaluates the performance of a classification model by plotting the true positive rate against the false positive rate.

The AUC (Area Under the Curve) measures the area under this curve. The closer the AUC is to 1, the better the model's ability to distinguish between classes.
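As an illustration, the AUC can be computed from predicted scores with scikit-learn's roc_auc_score (assuming scikit-learn is installed; the labels and scores are made up):

# Computing AUC from predicted scores (assumes scikit-learn is installed)
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8, 0.9]  # model scores for the positive class

print(roc_auc_score(y_true, y_scores))  # closer to 1.0 means better separation of the classes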


Dataset

A data set (dataset) is an organized collection of data used to train, test or validate artificial intelligence models.

It may contain text, images, videos or other types of information, usually labeled, to enable machine learning algorithms to recognize patterns and make predictions.




What if we helped you create datasets for your artificial intelligence models?
🚀 Our team of Data Labelers and Data Trainers can help you build large, high-quality datasets! Don't hesitate to contact us.


Model training

Model training involves using a dataset to teach an AI or machine learning model to perform a specific task, such as classification or prediction.

The model adjusts its parameters according to the examples provided, in order to improve its accuracy on new data.


Feature Engineering

Feature engineering is the process of selecting, transforming or creating new characteristics (or "features") from raw data, in order to improve the performance of a machine learning model.

These features provide a better representation of the data and make it easier for the model to identify patterns or make predictions.
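A minimal sketch of feature engineering with pandas, where new columns are derived from raw ones (assuming pandas is installed; the column names are illustrative):

# Feature engineering sketch (assumes pandas is installed)
import pandas as pd

df = pd.DataFrame({
    "price": [120.0, 80.0, 200.0],
    "quantity": [2, 5, 1],
    "order_date": pd.to_datetime(["2024-01-05", "2024-02-14", "2024-03-02"]),
})

# Create new features from the raw columns
df["total"] = df["price"] * df["quantity"]     # interaction feature
df["order_month"] = df["order_date"].dt.month  # date decomposed into a simpler signal
print(df[["total", "order_month"]])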


Loss function

The loss function is a tool used in machine learning to measure the difference between a model's predictions and the actual values. It evaluates the accuracy of the model.

The smaller the loss, the closer the model's predictions are to the expected results. The model learns by minimizing this loss.
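As an illustration, the mean squared error, a common loss function for regression, can be written in a few lines of Python:

# Mean squared error: average of the squared differences between predictions and targets
def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

print(mse([3.0, 5.0, 2.0], [2.5, 5.0, 4.0]))  # a smaller value means predictions closer to the targets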


False positive (false alarm)

A false alarm (or false positive) occurs when a model incorrectly predicts the presence of a condition or class when it is absent.

For example, a spam detection system would classify a legitimate e-mail as spam. This is a common error in classification models.


Natural language generation (NLG)

Natural language generation (NLG) is a sub-field of artificial intelligence that automatically produces text or speech in natural human language.

It enables a machine to transform raw data into natural sentences or paragraphs, as in automated summaries or virtual assistants.


Hyperparameters

Hyperparameters are parameters defined before an artificial intelligence model is trained, which influence its learning.

Unlike the parameters learned by the model, hyperparameters, such as learning rate or neural layer size, are set manually and adjusted to optimize model performance.


Generative artificial intelligence

Generative artificial intelligence is a branch of AI that creates new content (images, text, music, etc.) from models trained on existing data.

Using algorithms such as GANs (Generative Adversarial Networks), it generates original works by imitating patterns found in training data.


Predictive model

A predictive model is an artificial intelligence algorithm designed to anticipate future results based on historical data.

It analyzes past trends to make predictions on new data, and is used in fields such as finance, healthcare and marketing to anticipate behaviors or events.


Gradient optimization

Gradient optimization (often implemented as gradient descent) is a technique used to adjust the parameters of an AI model in order to minimize the loss function.

It consists of calculating the slope (gradient) of the loss function and moving the parameters in the direction that reduces the loss, thus improving model performance.
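A minimal sketch of gradient descent on a toy loss function, loss(w) = (w - 3)^2, whose minimum is at w = 3:

# Gradient descent on loss(w) = (w - 3)^2
w = 0.0             # initial parameter value
learning_rate = 0.1

for step in range(50):
    gradient = 2 * (w - 3)         # slope of the loss at the current w
    w -= learning_rate * gradient  # move against the slope to reduce the loss

print(round(w, 3))  # close to 3.0, the value that minimizes the loss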


Accuracy

Accuracy is a measure of the performance of a classification model. It represents the percentage of correct predictions out of all predictions made.

This is the ratio of correct predictions (true positives and true negatives) to the total number of predictions. The higher the accuracy, the better the model.
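As a small worked example, accuracy can be computed directly from lists of true and predicted labels (the values below are made up):

# Accuracy: share of correct predictions over all predictions
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)  # 4 correct out of 6 -> 0.666...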


Recall

Recall is a measure of the performance of a classification model. It indicates the model's ability to correctly identify all positive occurrences of a class.

This is the ratio of true positives to the total number of actual positives (true positives plus false negatives). A high recall means few false negatives.
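A small worked example of recall, computed from made-up labels:

# Recall: true positives divided by all actual positives (true positives + false negatives)
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives
print(tp / (tp + fn))  # 3 / (3 + 1) = 0.75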


Image recognition

Image recognition is an artificial intelligence technique in which a model analyzes images to identify objects, people, places or actions.

Used in fields such as security, health and automotive, it enables machines to "see" and visually understand the content of an image for classification or detection purposes.


Voice recognition

Speech recognition is an artificial intelligence technology that converts speech into text. It analyzes the sounds emitted by a human voice, identifies the words spoken and transcribes them.

Used in voice assistants, mobile applications and voice control systems, it facilitates human-machine interactions.


Regression

Regression is a machine learning technique used to predict continuous values from data.

Unlike classification, which assigns categories, regression estimates numerical values, such as the price of a house or future sales. It establishes relationships between input and output variables to make predictions.
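A minimal regression sketch with scikit-learn's LinearRegression, using illustrative surface/price data (assuming scikit-learn is installed):

# Linear regression sketch: predicting a price from a surface area (assumes scikit-learn is installed)
from sklearn.linear_model import LinearRegression

X = [[30], [50], [70], [100]]         # surface in m²
y = [150000, 230000, 310000, 430000]  # sale price

model = LinearRegression().fit(X, y)
print(model.predict([[80]]))          # estimated price for an 80 m² property (350000 with this toy data)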


Artificial neural network

An artificial neural network is an artificial intelligence model inspired by the functioning of the human brain. It is made up of interconnected "neurons", organized in layers, which process information.

Used for complex tasks such as image recognition or language processing, it learns by adjusting the connections between neurons to improve its performance.


Generative Adversarial Networks (GANs)

GANs (Generative Adversarial Networks) are an artificial intelligence architecture composed of two networks: a generator that creates data and a discriminator that evaluates its authenticity.

The two networks compete to improve each other's performance. GANs are used to generate images, videos and other realistic content.


Deep neural networks (Deep Learning)

Deep neural networks are AI models composed of multiple layers of interconnected neurons.

Each layer progressively extracts more complex features from the raw data, making it possible to solve difficult problems such as image recognition, natural language processing or machine translation.


Underfitting

Underfitting occurs when an AI model is too simple to capture the underlying patterns in the data.

The result is poor performance on both training and new data. The model does not learn sufficiently and makes incorrect predictions.


Overfitting

Overfitting occurs when an AI model is too complex and fits the training data too precisely, even capturing noise or anomalies.

Although it performs well on this data, it fails to generalize to new data, making it less reliable for future predictions.


Tokenization

Tokenization is a process in natural language processing that involves dividing text into smaller units called "tokens" (words, phrases or characters).

Each token represents a distinct unit that the AI can process. This step is essential to enable the models to analyze and understand the text.
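As a very naive illustration, a word-level tokenizer can be approximated in plain Python; real NLP libraries use much more robust rules:

# A very naive word-level tokenizer
text = "Tokenization splits text into smaller units."
tokens = text.lower().replace(".", "").split()
print(tokens)  # ['tokenization', 'splits', 'text', 'into', 'smaller', 'units']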


Natural language processing (NLP)

Natural language processing (NLP) is a field of artificial intelligence that enables machines to understand, analyze and generate human language.

It is used in applications such as voice assistants, machine translation and text analysis, enabling computers to interact with language in a natural, fluid way.


Transformers

Transformers are a deep learning model architecture used mainly in natural language processing (NLP).

They capture the relationships between different elements of a sequence (words, sentences) in parallel, rather than sequentially like traditional models. Transformers form the basis of high-performance models such as GPT and BERT.


Model tuning

Model tuning involves adjusting the hyperparameters of an artificial intelligence model to optimize its performance.

This process involves testing different combinations of hyperparameters (such as learning rate or layer depth) to find those that offer the best results on a given dataset.
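A minimal tuning sketch using scikit-learn's GridSearchCV to search over one hyperparameter on the built-in iris dataset (assuming scikit-learn is installed):

# Hyperparameter tuning sketch: grid search with cross-validation (assumes scikit-learn is installed)
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Try several values of the n_neighbors hyperparameter and keep the best one
grid = GridSearchCV(KNeighborsClassifier(), {"n_neighbors": [1, 3, 5, 7]}, cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)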


Computer Vision

Computer vision is a branch of artificial intelligence that enables machines to understand and interpret images and videos.

By visually analyzing data, computer vision systems can recognize objects, detect faces, analyze movements, or automate tasks such as quality inspection or autonomous driving.


We hope you have found this glossary useful in demystifying some of the key concepts of artificial intelligence. If you'd like to know more about AI, its applications, or how the creation of high-quality datasets can contribute to the success of your projects, please don't hesitate to contact Innovatiana. Our team of experts is on hand to support you in all your AI and data management initiatives.