Glossary
Activation Function
: A function applied to a neuron’s output in a deep learning model, allowing the network to capture non-linear relationships.
Actuators
: Electro-mechanical devices like motors. They help with the movement of a robot.
AI
: See Artificial Intelligence.
AI Winter
: A prolonged period of time, such as in the 1970s and 1980s, when the AI industry came under much pressure, including cutbacks in funding.
Artificial Intelligence
: Where computers are able to learn from experience, which often involves processing data using sophisticated algorithms. Artificial intelligence is a broad category, which includes subsets like machine learning, deep learning, and Natural Language Processing (NLP).
Artificial Neural Network (ANN)
: The most basic structure for a deep learning model. The ANN includes multiple hidden layers that process data through the use of sophisticated algorithms.
Automation Fatigue
: The diminishing returns that tend to occur with RPA as more and more tasks are automated.
Automated Machine Learning (AutoML)
: A digital tool or platform that allows beginners to create their own AI models.
Backpropagation
: A major breakthrough in deep learning. Backpropagation allows the weights in a neural network to be adjusted efficiently by propagating the error backward through the layers.
Bayes’ Theorem
: A statistical formula, used in machine learning, for updating the probability of a hypothesis as new evidence becomes available.
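As a quick illustration, Bayes’ theorem can be applied in a few lines of Python. The scenario and all the probabilities below are hypothetical, chosen only to show the calculation:

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
# Hypothetical example: a test for a condition that affects 1% of a population.
p_condition = 0.01          # prior P(A)
p_pos_given_cond = 0.95     # sensitivity P(B|A)
p_pos_given_healthy = 0.05  # false positive rate

# Total probability of a positive result, P(B)
p_positive = (p_pos_given_cond * p_condition
              + p_pos_given_healthy * (1 - p_condition))

# Posterior: probability of the condition given a positive test
p_cond_given_pos = p_pos_given_cond * p_condition / p_positive
print(round(p_cond_given_pos, 3))  # 0.161
```

Even with a fairly accurate test, the posterior probability is low because the condition itself is rare, which is exactly the kind of insight the theorem provides.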
Big Data
: A category of technology that involves processing huge amounts of data. Big Data is often described as having the three Vs—that is, volume, variety, and velocity.
Binning
: Involves organizing continuous data into a smaller number of groups, or bins.
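A minimal sketch of binning in Python, using made-up age data and bin boundaries:

```python
# Hypothetical ages binned into labeled groups
ages = [3, 17, 25, 40, 67, 81]
bins = [(0, 17, "minor"), (18, 64, "adult"), (65, 120, "senior")]

def bin_value(value, bins):
    """Return the label of the bin that contains value."""
    for low, high, label in bins:
        if low <= value <= high:
            return label
    return None

labels = [bin_value(a, bins) for a in ages]
print(labels)  # ['minor', 'minor', 'adult', 'adult', 'senior', 'senior']
```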
Categorical Data
: Data that does not have a numerical meaning but instead has textual meaning, say with describing race or gender.
Cerebral Cortex
: Part of the human brain that has the most similarities to AI. It helps with thinking and other cognitive activities.
Chatbot
: An AI system that communicates with people in natural language.
Clustering
: A form of unsupervised learning that takes unlabeled data and uses algorithms to put similar items into groups.
Cobot
: A robot that works alongside people.
Cognitive Robotic Process Automation (CRPA)
: An RPA system that leverages AI technologies.
Convolutional Neural Network (CNN)
: A deep learning model that goes through different variations—or convolutions—of analysis on data. CNNs are often used for complex applications like facial recognition.
Data Lake
: Allows for the storage and processing of massive amounts of structured and unstructured data. There is often little to no need to re-format the data.
Data Type
: The kind of information a variable represents, such as a Boolean, integer, string, or floating point number.
Decision Tree
: A machine learning algorithm that is a workflow of decision paths.
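The decision paths can be sketched as nested conditionals in Python. The loan-approval rules and thresholds here are entirely hypothetical; in practice, the branches of a decision tree are learned from training data rather than written by hand:

```python
# A hand-built decision tree that classifies a hypothetical loan application.
def approve_loan(income, credit_score):
    """Each if/else branch is one decision path in the tree."""
    if income >= 50_000:
        if credit_score >= 650:
            return "approve"
        return "review"
    if credit_score >= 750:
        return "review"
    return "deny"

print(approve_loan(income=60_000, credit_score=700))  # approve
print(approve_loan(income=30_000, credit_score=600))  # deny
```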
Deepfake
: Involves using deep learning models to create images or videos that are misleading or harmful.
Deep Learning
: A type of AI that uses neural networks, which mimic the processes of the brain. Much of the innovation in the field during the past decade has been with deep learning research.
Ensemble Modelling
: Involves using more than one model for generating predictions.
ETL (Extraction, Transformation, and Load)
: A form of data integration that is typically used in a data warehouse.
Ethics Board
: A committee that evaluates the issues of AI projects.
Expert System
: An early type of AI application that emerged in the 1980s. It used sophisticated logic systems to help understand certain areas like medicine, finance, and manufacturing.
Explainability
: The degree to which the inner workings and predictions of a deep learning model can be understood by humans.
False Positive
: When a model prediction shows that the result is true even though it is not.
Feature
: This is a column of data.
Feature Engineering
: See Feature Extraction.
Feature Extraction
: Describes the process of selecting the variables for an AI model.
Feed-Forward Neural Network
: A deep learning model that processes data in a linear direction through the hidden layers. There is no cycling back.
Generative Adversarial Network (GAN)
: Developed by AI researcher Ian Goodfellow, this is a next-generation deep learning model that helps to create new outputs like audio, text, or video.
GPUs (Graphics Processing Units)
: Chips that were originally used for high-speed video games because of the ability to process large amounts of data quickly. But GPUs have also proven to be adept at handling AI applications.
Hadoop
: An open source framework for managing Big Data, such as by making it possible to create sophisticated data warehouses.
Hidden Layers
: The different levels of analysis in a deep learning model.
Hidden Markov Model (HMM)
: A statistical model, often used in speech recognition, that helps decipher spoken words.
Hyperparameters
: Settings of a model that cannot be learned directly from the training process and so must be set beforehand, such as the learning rate.
Instance
: This is a row of data.
Jupyter Notebook
: A web-based app that makes it easy to write Python and R code, create visualizations, and experiment with AI models.
K-Means Clustering
: An algorithm that is effective for grouping similar unlabeled data.
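A minimal sketch of the k-means idea in plain Python, using made-up one-dimensional data and k = 2. Real projects would typically use a library such as scikit-learn:

```python
# A minimal k-means sketch on one-dimensional data (k = 2).
data = [1.0, 1.5, 2.0, 10.0, 10.5, 11.0]
centroids = [1.0, 10.0]  # initial guesses

for _ in range(10):  # iterate the assignment and update steps
    clusters = [[], []]
    for x in data:
        # assign each point to its nearest centroid
        nearest = min(range(2), key=lambda i: abs(x - centroids[i]))
        clusters[nearest].append(x)
    # move each centroid to the mean of its cluster
    centroids = [sum(c) / len(c) for c in clusters]

print(centroids)  # [1.5, 10.5]
```

The two centroids settle at the centers of the two obvious groups in the data, which is the clustering the algorithm is meant to discover.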
K-Nearest Neighbor (k-NN)
: A machine learning algorithm that classifies data based on similarities.
Lemmatization
: A process in NLP that reduces a word to its dictionary form, or lemma, by using vocabulary and morphological analysis (for example, “better” becomes “good”).
Lidar (Light Detection and Ranging)
: A device—which is usually at the top of an autonomous car—that shoots laser beams to measure the surroundings.
Linear Regression
: Shows the relationship between certain variables, which can help with predictions for machine learning systems.
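For a single variable, the least-squares line can be computed directly. The data below is made up so that the fit is exact (y = 2x):

```python
# Least-squares fit of y = slope * x + intercept on hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# slope = covariance(x, y) / variance(x)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

print(slope, intercept)  # 2.0 0.0
```

Once the slope and intercept are known, a prediction for a new x is simply `slope * x + intercept`.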
Machine Learning
: Where a computer can learn and improve by processing data without having to be explicitly programmed. Machine learning is a subset of AI.
Metadata
: This is data about data—that is, descriptions. For example, a music file can have metadata like the size, length, date of upload, comments, genre, artist, and so on.
Naïve Bayes Classifier
: A method of machine learning that uses Bayes’ theorem to make predictions under the assumption that the variables are independent of each other.
Named Entity Recognition
: In the NLP process, this involves identifying words that represent locations, persons, and organizations.
Natural Language Processing (NLP)
: A subset of AI that deals with how computers understand and manipulate language.
Neural Network
: A sophisticated AI model that mimics the brain. A neural network has various layers that attempt to find unique patterns through multiple levels of analysis.
Normal Distribution
: A bell-shaped plot of data in which the midpoint is the mean.
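Python’s standard statistics module can illustrate a well-known property of this distribution: roughly 68% of values fall within one standard deviation of the mean. The mean and standard deviation below are hypothetical:

```python
from statistics import NormalDist

# A normal distribution with mean 100 and standard deviation 15
dist = NormalDist(mu=100, sigma=15)

# Probability mass within one standard deviation of the mean
within_one_sd = dist.cdf(115) - dist.cdf(85)
print(round(within_one_sd, 2))  # 0.68
```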
NoSQL System
: A next-generation database. The information is based on a document model so as to allow for more flexibility with analysis as well as the handling of structured and unstructured data.
Ordinal Data
: A mix of numerical and categorical data, such as an Amazon.com rating for a product.
Overfitting
: Where a model fits its training data too closely, such as by focusing on the wrong features, and so fails to make accurate predictions on new data.
Pearson Correlation
: Shows the strength and direction of a correlation, from -1 to 1. The closer the value is to 1 or -1, the stronger the correlation.
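The coefficient can be computed from first principles in a few lines; the two variables below are made up:

```python
from math import sqrt

# Pearson correlation of two hypothetical variables
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))

r = cov / (sd_x * sd_y)
print(round(r, 2))  # 0.77
```

A value of about 0.77 indicates a fairly strong positive relationship between the two variables.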
Phonemes
: The most basic units of sound in a language.
Predictive Analytics
: Involves using data to make forecasts.
Python
: A computer language that has become the standard in developing AI models.
PyTorch
: A platform, developed by Facebook, that allows for the creation of sophisticated AI models.
Recurrent Neural Network (RNN)
: A deep learning model that processes prior inputs across time. A common use case is when a person types in characters in a messaging app, as the AI will predict the next word.
Reinforcement Learning
: An approach to creating an AI model where the system is rewarded for the right predictions and punished for the wrong ones.
Relational Database
: A database, whose roots go back to the 1970s, that creates relationships among tables of data and is queried with a language called SQL.
Robotic Desktop Automation (RDA)
: An RPA system that works alongside an employee to handle jobs or tasks.
Robotic Process Automation (RPA)
: A category of software that automates routine and mundane tasks within an organization. It is often an initial way to implement AI.
Robot Operating System (ROS)
: An open source middleware system that manages critical parts of a robot.
R-squared
: Provides a way to gauge how well a regression fits the data. An R-squared ranges from 0 to 1, and the closer it is to 1, the better the fit.
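R-squared compares a regression’s errors against the variation in the data itself. A sketch with hypothetical actual and predicted values:

```python
# R-squared for a hypothetical regression's predictions
actual = [3, 5, 7, 9]
predicted = [2.8, 5.3, 6.9, 9.1]

mean_actual = sum(actual) / len(actual)

# Residual sum of squares vs. total sum of squares
ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
ss_tot = sum((a - mean_actual) ** 2 for a in actual)

r_squared = 1 - ss_res / ss_tot
print(round(r_squared, 3))
```

Because the predictions are close to the actual values, the R-squared comes out near 1.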
Sentiment Analysis
: Using NLP to determine the emotional tone of text, such as by mining social media data to find trends in opinion.
Sensor
: A device that collects data about a robot’s or vehicle’s surroundings. Typical sensors include cameras and Lidar, which uses a laser scanner to create 3D images.
Sigmoid
: A common activation function for a deep learning model. It maps any input to a value between 0 and 1, which is often interpreted as a probability.
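The function itself is one line of Python:

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1 / (1 + math.exp(-x))

print(sigmoid(0))             # 0.5
print(round(sigmoid(4), 3))   # 0.982
print(round(sigmoid(-4), 3))  # 0.018
```

Large positive inputs approach 1, large negative inputs approach 0, and an input of 0 gives exactly 0.5.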
Standard Deviation
: Measures the average distance from the mean, which gives a sense of the variation in the data.
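Python’s standard statistics module computes this directly; the data below is hypothetical and chosen so the result is a round number:

```python
from statistics import pstdev

# Hypothetical data with a mean of 5
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population standard deviation: the square root of the average
# squared distance from the mean
print(pstdev(data))  # 2.0
```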
Stemming
: Describes the process of reducing a word to its root form, or stem, such as by removing affixes and suffixes (for example, “running” becomes “run”).
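A deliberately naive suffix-stripping stemmer sketches the idea; real NLP work uses full algorithms such as the Porter stemmer. Note that a stem, such as “runn” below, need not be a dictionary word:

```python
# A naive suffix-stripping stemmer (for illustration only)
def naive_stem(word):
    for suffix in ("ing", "ed", "s"):
        # only strip when a reasonably long stem would remain
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

words = ["running", "jumped", "cats", "run"]
print([naive_stem(w) for w in words])  # ['runn', 'jump', 'cat', 'run']
```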
Strong AI
: This is true AI, in which a machine is able to engage in human-like abilities like open-ended discussions.
Structured Data
: Data that is usually stored in a relational database or spreadsheet, as the information is in a preformatted structure (like Social Security numbers, addresses, and point of sale information).
Tagging Parts of Speech (POS)
: In the NLP process, this involves going through text and designating each word with its proper grammatical form, such as noun, verb, or adverb.
TensorFlow
: An open source platform, backed by Google, that allows for the creation of sophisticated AI models.
Test Data
: Data on which a model’s accuracy is evaluated.
Three Laws of Robotics
: Based on the science fiction writings of Isaac Asimov, these laws provide the basic framework for how robots should interact with society.
Tokenization
: In the NLP process, this involves parsing and segmenting text into various parts, such as words or sentences.
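A simple word-level tokenizer can be written with a regular expression; production NLP libraries handle punctuation, contractions, and other edge cases far more carefully:

```python
import re

# Split a sentence into word tokens (hypothetical example text)
text = "AI systems, like chatbots, process language."
tokens = re.findall(r"[A-Za-z]+", text)
print(tokens)  # ['AI', 'systems', 'like', 'chatbots', 'process', 'language']
```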
Topic Modelling
: In the NLP process, this involves looking for hidden patterns and clusters in the text.
Training Data
: Data that is used to create an AI algorithm.
True Positive
: When a model makes a correct prediction.
Turing Test
: Created by Alan Turing, this is a way to determine if a system has achieved true AI. The test involves a person who asks questions to two participants—one human, the other a computer. If it is not clear who is the human, then the Turing Test has been passed.
Unattended Robotic Process Automation (RPA)
: The RPA system is completely autonomous as the bot runs in the background.
Unstructured Data
: Data that does not have predefined formatting, such as images, videos, and audio files.
Supervised Learning
: An AI model that uses labeled data. This is the most common approach.
Unsupervised Learning
: Involves an AI model that uses unlabeled data. The algorithms must detect patterns on their own, such as through clustering.
Vanishing Gradient Problem
: Where the gradients used to update the weights become extremely small as they are propagated back through the layers of a deep network, making the earlier layers slow or impossible to train.
Virtual Assistant
: An AI device that helps a person with his or her daily activities.
Weak AI
: This is where AI is used for a particular use case, such as with Apple’s Siri.