GLOSSARY

30 Percent Rule: One of the foundational principles of this book. Grounded in research on developing proficiency in a foreign language, this rule states that, instead of needing to gain mastery over every digital skill, you only need to develop 30 percent fluency in a handful of technical topics to cultivate a digital mindset. Those topics are the focus of this book.

algorithm: A set of instructions for how to do a series of steps to accomplish a specific goal. This set of instructions takes what it receives (the input) and transforms it into something else entirely: the output. Computational algorithms—those that instruct computers—can work together to help perform any number of complex tasks. There are five core features of algorithms (a minimal code sketch illustrating them follows the list):

  • An algorithm is an unambiguous description that makes clear what has to be implemented. In a computational algorithm, a step such as “Choose a large number” is vague: What is large? Is it 1 million, 1 trillion, or 100? Does the number have to be different each time, or can the same number be used on every run? Better is to describe the step as “Choose a number larger than 10 but smaller than 20.”
  • An algorithm expects a defined set of inputs. For example, it might require two numbers where both numbers are greater than zero. Or it might require a word, or a list of zero or more numbers.
  • An algorithm produces a defined set of outputs. It might output the larger of the two numbers, an all-uppercase version of a word, or pictures from a set with the color blue in them.
  • An algorithm is guaranteed to terminate and produce a result, always stopping after a finite time. If an algorithm could potentially run forever, it wouldn’t be very useful because you might never get an answer.
  • Most algorithms are guaranteed to produce the correct result. It’s rarely useful if an algorithm returns the largest number 99 percent of the time, but 1 percent of the time the algorithm fails and returns the smallest number instead.
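
As a minimal sketch of these five features (the function and sample numbers are illustrative, not from the book), a short Python program that finds the largest number in a list shows unambiguous steps, a defined input, a defined output, guaranteed termination, and a correct result:

```python
def largest_number(numbers):
    """Return the largest value in a non-empty list of numbers."""
    # Defined input: a list containing at least one number.
    if not numbers:
        raise ValueError("input must contain at least one number")

    largest = numbers[0]
    for value in numbers[1:]:
        # Unambiguous step: keep whichever of the two numbers is bigger.
        if value > largest:
            largest = value

    # The loop visits each element exactly once, so the algorithm always
    # terminates, and the defined output is the single largest number.
    return largest


print(largest_number([12, 7, 19, 3]))  # prints 19
```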

artificial intelligence (AI): Any machine that exhibits qualities associated with human intelligence.

back-end system: Structures such as servers, mainframes, and other systems that offer data services. Users don’t see back-end systems because they support applications behind the scenes.

behavioral visibility: The amount of insight someone can gain into your behaviors and patterns from the data you produce, particularly through online activity.

blockchain: An array of distributed ledger technologies that promise to be a foundational technology, like the internet. Blockchain technologies facilitate highly secure, fast, private, peer-to-peer transactions without the use of a third party.

central tendency: Where values of a data set most commonly land.

code (source code): The written instructions that make up an algorithm and tell a computer how to execute an operation.

coding: How we actually get a computer to follow the instructions of algorithms. By using certain programming languages, each optimized for certain kinds of work, we can put together a script of instructions for the computer to execute. Coding is an essential skill for data scientists, engineers, and others who work in technical, data-heavy programming spaces, but it can also be used to create websites, design artwork, and more.

computer vision: Converting digital images and video into binary data readable by a computer, which can then perform analyses based on the data.

confidence interval: A plausible range of values for a given summary statistic (such as the mean) in the overall population, based on the features of the sample data set. The most commonly used confidence interval is 95 percent because this number represents a comfortable balance between precision and reliability.
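
In symbols, a common way to compute a 95 percent confidence interval for a mean (the normal-approximation formula, which this entry does not spell out) is

  \bar{x} \pm 1.96 \cdot \frac{s}{\sqrt{n}}

where \bar{x} is the sample mean, s the sample standard deviation, and n the sample size; the 1.96 multiplier corresponds to the 95 percent level.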

conversational user interface: A means for interacting with AI on human terms, whether through writing or talking (for example, Siri).

cryptocurrency (for example, Bitcoin): Any type of digital currency run on a blockchain. Not affiliated with any bank or national currency, cryptocurrencies facilitate direct peer-to-peer transactions, threatening to disrupt more traditional financial systems.

data cleaning: Converting data into formats that can be digested by AI models.

descriptive statistics: The process of analyzing data with measurements that summarize overarching characteristics in terms of a single value.
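
As a minimal sketch (the data values are invented for illustration), Python's built-in statistics module computes several of these single-value summaries, including the measures of central tendency and dispersion defined elsewhere in this glossary:

```python
import statistics

data = [4, 8, 8, 5, 3, 7, 9, 8, 6]  # hypothetical data set

print(statistics.mean(data))    # central tendency: arithmetic average
print(statistics.median(data))  # central tendency: midpoint value
print(statistics.mode(data))    # central tendency: most common value
print(max(data) - min(data))    # dispersion: range (highest minus lowest)
print(statistics.stdev(data))   # dispersion: sample standard deviation
```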

digital: Refers to the interplay of data and technology. Data refer to any information that can be used for reference, analysis, or computation. Data can be numbers, or images and text that are turned into numbers that can be processed, stored, and transformed through technology (computing).

digital exhaust: The pieces of metadata created by an individual’s online activity that constitute their digital footprint and together can form a picture of their behaviors and habits.

digital footprint: The collection of data (for example, digital exhaust) that together can represent an individual’s behaviors and habits. The higher one’s behavioral visibility, the more digital exhaust that can be collected, and thus the clearer the digital footprint that can be analyzed.

digital mindset: A set of attitudes and behaviors that enable people and organizations to see new possibilities and chart a path for the future using data, algorithms, AI, and machine learning.

digital presence: The extent to which a person is active and prominent in a technology-mediated remote environment, making contributions that are understood and recognized by team members through active and clear communication across digital tools.

digital transformation: Organizations redesigning their underlying processes and competencies to become more adaptive using data and digital technology, from artificial intelligence and machine learning to the internet of things.

dispersion: Descriptive statistics that analyze how data are spread out.

facticity: The extent to which something (e.g., data) can be considered absolutely true and accurate.

front-end system: All of the technologies and data sources of a technology stack that you, your employees, and your customers will interact with (e.g., user interface on an app).

hypothesis testing: A process for comparing two summary statistics in a data set. The test starts with what’s called a “null hypothesis,” most often a conservative position that assumes the status quo (e.g., “there is no difference between the two parameters being compared”), and then proposes an alternative hypothesis (e.g., “there is in fact a difference”). If there is enough statistical evidence to support the alternative hypothesis, the null hypothesis is rejected in favor of the alternative. However, if there is not enough statistical evidence, the null hypothesis remains.
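
As a sketch of the procedure in code (assuming the SciPy library is available; the two samples are invented for illustration), a two-sample t-test compares the means of two groups and reports a p-value that is checked against a significance level such as 0.05:

```python
from scipy import stats

# Hypothetical samples from the two groups being compared.
group_a = [21, 24, 19, 23, 25, 22, 20]
group_b = [26, 27, 24, 28, 25, 29, 27]

# Null hypothesis: the two group means are equal.
# Alternative hypothesis: the two group means differ.
result = stats.ttest_ind(group_a, group_b)

if result.pvalue < 0.05:  # significance level of 0.05
    print(f"p = {result.pvalue:.4f}: reject the null hypothesis")
else:
    print(f"p = {result.pvalue:.4f}: the null hypothesis remains")
```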

inferential statistics: The process of using data from a sample to test hypotheses and make estimates of an overall population (in other words, making statements about the probability of summary statistics from a data set being reflective of the overall population).

machine learning: The ability of computer algorithms to automatically identify patterns through data analysis, and continuously refine their statistical techniques of prediction and inference by generating new rules without step-by-step instructions from human programmers. The more data that these algorithms process, the more they learn.

mean: A measurement of central tendency wherein each value of the data is added together and then divided by the number of values.
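
In symbols (a standard formula, stated here for reference):

  \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i

where x_1 through x_n are the n values in the data set.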

median: A measurement of central tendency represented by the value that falls at the midpoint in the range of data.

metaknowledge: Knowledge about “who knows what” and “who knows who.”

middleware: The software linkages between the front end and the back end, enabling applications and databases to communicate with one another.

mode: A measurement of central tendency represented by the most common value in the range of data.

mutual knowledge problem: The lack of shared contextual understanding in a remote context.

natural language processing: Converting human language into vectors, or strings of numbers, so that it is readable by computers.
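
As a highly simplified sketch of that conversion (a toy bag-of-words encoding with an invented vocabulary, not a production NLP technique), each sentence can be turned into a vector of word counts:

```python
# Toy bag-of-words encoding: each sentence becomes a vector of word counts.
vocabulary = ["data", "digital", "mindset", "technology"]

def to_vector(sentence):
    words = sentence.lower().split()
    return [words.count(term) for term in vocabulary]

print(to_vector("A digital mindset starts with data"))
# [1, 1, 1, 0] -- a string of numbers a computer can process
```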

neural networks: A machine learning model, trained by using examples, that automatically classifies features from data using an approach that loosely emulates the neural networks of the human brain.

predictive statistics: Using statistical models to predict outcomes.

Privacy by Design (PbD): A framework for product design developed by Dr. Ann Cavoukian that consists of seven key principles to proactively prioritize privacy rather than treat it as a regulatory afterthought.

programming language: A specified set of commands, terms, and instructions that coders use to create algorithms.

p-value: The probability of observing results at least as extreme as those in the data set under the assumption that the null hypothesis is true. A smaller p-value means there is stronger evidence in favor of the alternative hypothesis.

range: A measurement of dispersion represented by the difference between the highest and lowest values in a data set.

regression model: A method for analyzing the relationship between two or more variables or factors.
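
As a minimal sketch (using statistics.linear_regression, available in Python 3.10 and later; the spend and sales figures are invented), a simple linear regression estimates how one variable changes with another and can then be used for prediction:

```python
import statistics

# Hypothetical data: advertising spend (x) and resulting sales (y).
spend = [10, 20, 30, 40, 50]
sales = [25, 41, 62, 79, 103]

# Fit the line y = slope * x + intercept by ordinary least squares.
fit = statistics.linear_regression(spend, sales)
print(fit.slope, fit.intercept)

# Use the fitted relationship to predict sales for a new spend level.
print(fit.slope * 60 + fit.intercept)
```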

reinforcement learning: An advanced phase of machine learning in which the algorithm automatically refines its rules of categorization, based on feedback about its results, as it processes increasing amounts of data.

SaaS: Software as a Service, referring particularly to a business model in which software is licensed on a subscription basis and hosted centrally by the licensing company.

script: A document made up of multiple lines of code, typically designed to construct an algorithm.

significance level: The designated threshold that marks the maximum limit that a p-value can be before it is too high to reject the null hypothesis in favor of the alternative. A common significance level is 0.05: if the p-value is below 0.05, then the null hypothesis is rejected in favor of the alternative. If the p-value is above 0.05, then there is not enough evidence to reject the null hypothesis, and it remains.

smart contracts: An application of blockchain, these contracts can automatically and immediately execute their terms without the need for a third-party mediator. Smart contracts are already changing how royalty payments, leases, wills, deeds, and more are processed.

standard deviation: A measure of how spread out the data are from the mean.

supervised learning: A primary phase of machine learning in which a programmer trains an algorithm to classify data into binary categories.
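
As a toy sketch of the idea (the data and the threshold rule are invented for illustration, not a method described in the book), a supervised learner is given labeled examples and derives a simple rule that it can then apply to new, unlabeled data:

```python
# Labeled training examples: (message length, label) pairs.
training = [(12, "short"), (18, "short"), (25, "short"),
            (61, "long"), (74, "long"), (90, "long")]

# "Training": learn a threshold halfway between the two groups' average lengths.
short_lengths = [x for x, label in training if label == "short"]
long_lengths = [x for x, label in training if label == "long"]
threshold = (sum(short_lengths) / len(short_lengths) +
             sum(long_lengths) / len(long_lengths)) / 2

# Apply the learned rule to classify new, unlabeled data.
def classify(length):
    return "long" if length > threshold else "short"

print(classify(40))  # "short"
print(classify(80))  # "long"
```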

technical debt: The need for organizations to budget time and resources to continually update existing or old infrastructure due to the rate at which technologies, processes, and capabilities change in the digital ecosystem.

technology stack: All the hardware and software systems needed to develop and run a single application.

transitional state: Fixed periods of time in which an organization moves from a familiar set of structures, processes, and accompanying cultural norms into a completely new repertoire.

type I error: In hypothesis testing, the rejection of the null hypothesis when it is actually true.

type II error: In hypothesis testing, the failure to reject the null hypothesis when the alternative is true.

unsupervised learning: A more advanced phase of machine learning in which the algorithm automatically sorts data into categories on its own, without labeled examples from a programmer.

variance: A measurement of dispersion represented by the average of the squared distances of each value from the mean.
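
In symbols (the standard sample-variance formula, stated here for reference; the standard deviation defined above is its square root):

  s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad s = \sqrt{s^2}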

vectors: Strings of numbers readable by computers.
