
An Artificial Intelligence Glossary of Common Terms


Developing a basic understanding of the use cases, trends, and applications of Artificial Intelligence (AI) helps you understand the context in which AI is deployed. Keep in mind that “AI” is sometimes considered too broad to be a distinct “field.” Rather, it is a cluster of technology concepts that needs clarifying to properly frame discussions on the topic. We have therefore assembled broad definitions to help readers develop a basic vocabulary for communicating about the subject. Think of this glossary as similar to the old Berlitz travel guides of “essential terms” for a given language.

Advanced Analytics

[Image: AI introduces a whole new vocabulary to marketing.]

A part of data science that uses high-level methods and tools to focus on projecting future trends, events, and behaviors. This gives organizations the ability to run advanced statistical models, such as “what-if” calculations, and to future-proof various aspects of their operations. The term is an umbrella for several subfields of analytics that work together in their predictive capabilities.

Algorithm

An unambiguous specification of how to solve a class of problems. Algorithms are programmed to perform calculation, data processing, and automated reasoning tasks based on defined variables, inputs, and the data to be analyzed.
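To make the idea concrete, here is a minimal sketch in Python of a classic algorithm, binary search: an unambiguous, step-by-step procedure for finding a value in a sorted list (the function and data are our own illustration, not from any particular library).

```python
# Binary search: repeatedly halve the search interval of a sorted list
# until the target is found or the interval is empty.
def binary_search(sorted_items, target):
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid           # found: return the index
        elif sorted_items[mid] < target:
            lo = mid + 1         # discard the lower half
        else:
            hi = mid - 1         # discard the upper half
    return -1                    # target not present

print(binary_search([2, 5, 8, 12, 16, 23], 16))  # prints 4
```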

Ambient Intelligence

Refers to electronic environments that are sensitive and responsive to the presence of people. In an ambient intelligence world, devices work in concert to support people in carrying out their everyday life activities, tasks and rituals in an easy, natural way using information and intelligence that is hidden in the network connecting these devices.

Artificial Intelligence

An area of computer science that deals with programming computers to have the ability to “think” as if they have human intelligence.

Bayesian Estimator

A Bayesian estimator is an estimator of an unknown parameter that minimizes the expected loss over all observations x of X. In other words, it estimates your unknown parameter in a way that loses the least accuracy compared with having used the true value of that parameter.
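In symbols (our notation, not the article's): given a loss function L and an observation x, the Bayes estimator minimizes the expected loss under the posterior distribution of the parameter, and under squared-error loss it reduces to the posterior mean.

```latex
\[
\hat{\theta}(x) \;=\; \arg\min_{a}\; \mathbb{E}\!\left[ L(\theta, a) \mid X = x \right]
\]
% Under squared-error loss, L(\theta, a) = (\theta - a)^2, the minimizer
% is the posterior mean:
\[
\hat{\theta}(x) \;=\; \mathbb{E}\!\left[ \theta \mid X = x \right]
\]
```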

Big Data

The term is used to refer to data sets that are too large or complex for traditional data processing methods. Data may be structured, unstructured or both with the key being the volume of data, the speed of accumulation or variety of forms.

Chatbot

An application of artificial intelligence that conducts a “conversation” with humans via auditory or text-based methods. Also known as a smartbot, talkbot, chatterbot, bot, IM bot, interactive agent, conversational interface, or artificial conversational entity. Frequently deployed in customer service environments.
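For flavor, here is a toy keyword-matching chatbot in Python (the rules and replies are invented); real customer-service bots typically layer natural language processing and machine learning on top of this kind of lookup.

```python
# Toy rule-based chatbot: scan the user's text for known keywords and
# return a canned reply, with a fallback when nothing matches.
RULES = {
    "hours": "We're open 9am-5pm, Monday through Friday.",
    "price": "Pricing depends on project scope. Can you share details?",
    "hello": "Hi there! How can I help you today?",
}

def reply(user_text):
    text = user_text.lower()
    for keyword, answer in RULES.items():  # rules are checked in order
        if keyword in text:
            return answer
    return "Sorry, I didn't catch that. Could you rephrase?"

print(reply("What are your hours?"))  # prints the "hours" reply
```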

Cognitive Computing

The term refers to software and/or hardware that mimics the functioning of the human brain and helps to improve human decision-making. In this sense, it is a new type of approach to computing with the goal of more accurate models of how the human brain/mind senses, reasons, and responds to stimulus.

Data Mining

The process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.
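One common data-mining pattern is clustering: letting an algorithm discover groups in unlabeled data. Below is a brief sketch with k-means, assuming scikit-learn and NumPy are installed (the points are made up).

```python
# Discover two groups in unlabeled points with k-means clustering.
import numpy as np
from sklearn.cluster import KMeans

points = np.array([[1, 2], [1, 4], [1, 0],      # one loose group
                   [10, 2], [10, 4], [10, 0]])  # a second group
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(model.labels_)           # cluster assignment for each point
print(model.cluster_centers_)  # approximate centers of the two groups
```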

Data Science

An interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured. Data science is a “concept to unify statistics, data analysis, machine learning and their related methods” in order to “understand and analyze actual phenomena” with data. It employs techniques and theories drawn from many fields within the context of mathematics, statistics, information science, and computer science.

[Image: With structure and process, data need not be intimidating.]

Data Warehouse

A system used for reporting and data analysis. A data warehouse is a central repository of integrated data from one or more disparate sources.

Data Lake

A system or repository of data stored in its raw format. A data lake is usually a single store of all enterprise data including raw copies of source system data and transformed data used for tasks such as reporting, visualization, advanced analytics and machine learning. It can include structured data from relational databases (rows and columns), semi-structured data (CSV, logs, XML, JSON), unstructured data (emails, documents, PDFs) and binary data (images, audio, video).

Deep Learning

A branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using multiple processing layers with complex structures. It is part of the larger family of machine learning concepts that use artificial neural networks. The presence and use of multiple layers in the network are reflected in the adjective “deep.”
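A bare-bones illustration of the layered structure in NumPy follows; the weights here are random and untrained, since real frameworks such as TensorFlow or PyTorch add the training step (backpropagation) on top.

```python
# Forward pass through a small "deep" network: two hidden layers,
# each followed by a nonlinearity, then a single output.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))         # one input with 4 features
W1 = rng.normal(size=(4, 8))        # layer 1 weights: 4 -> 8
W2 = rng.normal(size=(8, 8))        # layer 2 weights: 8 -> 8
W3 = rng.normal(size=(8, 1))        # output weights:  8 -> 1

relu = lambda z: np.maximum(0, z)   # a common activation function
hidden1 = relu(x @ W1)
hidden2 = relu(hidden1 @ W2)
output = hidden2 @ W3               # the network's (untrained) prediction
print(output.shape)                 # (1, 1)
```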

Eager Learning

A learning method in artificial intelligence in which the system tries to construct a general, input-independent target function during training of the system. It is the opposite of “lazy learning,” where generalization beyond the training data is delayed until a query is made to the system. The main advantage gained is that the target function will be approximated globally during training, thus requiring much less space than using a lazy learning system. Eager learning systems also deal much better with noise in the training data. The main disadvantage with eager learning is that it is generally unable to provide good local approximations in the target function.

Heuristic

A technique designed for solving a problem more quickly or for finding an approximate solution when traditional methods fail. This is achieved by trading optimality, completeness, accuracy, or precision for speed. In a way, it can be considered a shortcut.
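Here is a sketch of one well-known heuristic, the nearest-neighbor rule for routing: always visit the closest unvisited stop next. It is fast and usually good, but not guaranteed optimal (the cities and coordinates are invented).

```python
# Nearest-neighbor routing heuristic: trade optimality for speed.
import math

cities = {"A": (0, 0), "B": (1, 5), "C": (5, 2), "D": (6, 6)}

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def nearest_neighbor_route(start="A"):
    route, remaining = [start], set(cities) - {start}
    while remaining:
        here = cities[route[-1]]
        nxt = min(remaining, key=lambda c: dist(here, cities[c]))
        route.append(nxt)           # greedily take the closest stop
        remaining.remove(nxt)
    return route

print(nearest_neighbor_route())     # ['A', 'B', 'C', 'D']
```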

Knowledge Extraction

The creation of knowledge from structured (relational databases, XML) and unstructured (text, documents, images) sources. The resulting knowledge needs to be in a machine-readable format and must represent knowledge in a manner that facilitates inferencing. It requires either the reuse of existing formal knowledge (reusing identifiers or ontologies) or the generation of a schema based on the source data.

Latent Variables

Sometimes referred to as “hidden variables” because they are not seen or perceived without analysis. These variables are meaningful to the outcomes but are inferred through mathematical models using observable variables. The use of latent variables can serve to reduce the dimensionality of data: many observable variables can be aggregated in a model to represent an underlying concept, making it easier to understand the data.
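A sketch of recovering a latent factor with principal component analysis, one standard technique for this (assumes scikit-learn and NumPy; the data are synthetic):

```python
# Five noisy observed variables driven by one hidden factor; PCA
# recovers a single component that explains nearly all the variance.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
latent = rng.normal(size=(100, 1))                # one true hidden factor
observed = latent @ rng.normal(size=(1, 5)) \
           + 0.1 * rng.normal(size=(100, 5))      # five observed variables

pca = PCA(n_components=1).fit(observed)
print(pca.explained_variance_ratio_)  # close to 1.0: one latent dimension
```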

Latent Effects Modeling

The modeling of Latent Variables (see above) to identify the effects of previously applied stimuli. This type of advanced modeling can identify previously unobservable effects over time, and can be helpful in determining which unobservable variables were most influential on the outcomes.

Lazy Learning

In machine learning, lazy learning is a method in which generalization beyond the training data is, in theory, delayed until a query is made to the system, as opposed to eager learning, where the system tries to generalize from the training data before receiving queries.
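A side-by-side sketch of the eager/lazy distinction above, assuming scikit-learn: linear regression builds its global model when fit() is called, while k-nearest neighbors mostly stores the data and defers the work to predict().

```python
# Eager vs. lazy: compare where the "learning" happens.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

X = np.arange(10).reshape(-1, 1)   # inputs 0..9
y = 2 * X.ravel() + 1              # outputs follow y = 2x + 1

eager = LinearRegression().fit(X, y)                 # generalizes now
lazy = KNeighborsRegressor(n_neighbors=3).fit(X, y)  # mostly stores data

print(eager.predict([[12]]))  # extrapolates the global fit: [25.]
print(lazy.predict([[12]]))   # averages the 3 nearest stored points: [17.]
```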

Machine Learning

A method of data analysis, and a subset of the more encompassing “Artificial Intelligence,” that automates analytical model building. Using algorithms that iteratively learn from data, machine learning allows computers to find hidden insights without being explicitly programmed where to look.

Machine Vision

Imaging-based automatic inspection and analysis for industrial applications, together with the technology and methods used to provide it. Applications are varied and include automatic inspection, process control, and robot guidance. Machine vision integrates existing technologies in new ways and applies them to solve real-world problems.

Natural Language Processing

A subfield of computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data. Often abbreviated to “NLP.”
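A tiny preprocessing step of the kind NLP pipelines build on, in pure Python: tokenize a sentence and count word frequencies (the sample text is our own).

```python
# Tokenize text into lowercase words and count their frequencies.
import re
from collections import Counter

text = "Computers process natural language. Natural language is messy."
tokens = re.findall(r"[a-z']+", text.lower())   # a crude word tokenizer
print(Counter(tokens).most_common(3))
# [('natural', 2), ('language', 2), ('computers', 1)]
```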

Predictive Analytics

[Image: Data analysis delivers actionable insights for planning and management.]

A variety of statistical techniques from data mining, predictive modeling, and machine learning that analyze current and historical data points to make high-precision predictions about future events. Artificial intelligence techniques are often deployed in such models.
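A minimal predictive sketch with NumPy: fit a linear trend to historical values and project it one period ahead (the sales figures are invented; real predictive models are far richer).

```python
# Fit a straight-line trend to past sales and extrapolate one month out.
import numpy as np

months = np.array([1, 2, 3, 4, 5, 6])
sales = np.array([100, 108, 119, 127, 140, 151])     # historical data

slope, intercept = np.polyfit(months, sales, deg=1)  # fit y = slope*x + b
print(round(slope * 7 + intercept, 1))               # forecast for month 7
```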

Python Programming Language

A high-level, general-purpose programming language created by Guido van Rossum and first released in 1991. Python is developed under an OSI-approved open source license, making it freely usable and distributable, even for commercial use. It comes with a large number of pre-built libraries, many of which support artificial intelligence and machine learning applications and analysis.

Qualification Problem

In artificial intelligence applied to knowledge-based systems, the qualification problem is concerned with the impossibility of listing all of the preconditions required for a real-world action to have its intended effect.

R Programming Language

A programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical and data analysis software.

Smart Objects

An object that enhances interaction not only with people but also with other Smart Objects. Interaction is not limited to objects in the physical world; it extends to virtual objects in computing environments.

Supervised Learning

The machine learning task of learning a function that maps an input to an output based on sample input-output pairs. It infers a function from labeled training data consisting of a set of training examples. In supervised learning, each example is a pair consisting of an input object and a desired output value. A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples.
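A compact sketch, assuming scikit-learn: infer a function from labeled input-output pairs, then use it to map new, unseen inputs to predicted outputs.

```python
# Learn a mapping from labeled examples, then predict for new inputs.
from sklearn.linear_model import LogisticRegression

X = [[1], [2], [3], [10], [11], [12]]   # input objects
y = [0, 0, 0, 1, 1, 1]                  # desired output values (labels)

clf = LogisticRegression().fit(X, y)    # the inferred function
print(clf.predict([[2.5], [10.5]]))     # [0 1] for unseen inputs
```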

