Chapter 9

The Value of Information and the Internet of Things^a

Ira S. Moskowitz⁎; Stephen Russell†; Niranjan Suri†

⁎ Information Management and Decision Architectures Branch, Code 5580, Naval Research Laboratory, Washington, DC, United States

† Battlefield Information Processing Branch, Computational Information Sciences Directorate, Army Research Laboratory, Adelphi, MD, United States

Abstract

We investigate a theory for the Value of Information (VoI) with respect to the Internet of Things (IoT) and IoT's intrinsic artificial intelligence. In an environment of ubiquitous computing and information, information's value takes on a new dimension. Moreover, when the system in which such a volume of information exists is itself intelligent, the ability to elicit value, with consideration of context, will be more complicated. Classical economic theory describes the relationship between value and volume which, though moderated by demand, is highly correlated. In an environment where information is plentiful, such as the IoT, the intrinsic intelligence in the system will be a dominant moderator of demand (e.g., self-adapting, self-operating, and self-protecting; controlling access). We examine Howard's (1966) VoI theory from this perspective and illustrate mathematically that Howard's focus on maximizing value obfuscates another important dimension, the guarantee of minimal value.

Keywords

Internet of Things; VoI theory; Artificial intelligence; Howard's model; Shannon's theory; Self-adapting

Acknowledgments

The authors acknowledge the assistance of Swmbo Heilizer. The authors also thank William F. Lawless for his careful reading of the draft version of this chapter and his helpful suggestions.

9.1 Introduction

Shannon (1948) laid the groundwork for information theory in his seminal work. However, Shannon's theory is a quantitative theory, not a qualitative theory. Shannon's theory tells you how much “stuff” you are sending through a channel, but it does not care if it is a cookie recipe or the plans for a time machine. The quality of “stuff” is irrelevant to Shannon's theory. This focus on sending messages, exclusive of understanding or context, is in contrast to Value of Information (VoI) theory, which concerns what, and not necessarily how much, “stuff” we are considering. That is, Shannon's theory is purely quantitative, whereas any theory of information value must include a qualitative aspect that is equal in relevance to any quantitative measure.

This qualitative characteristic finds its way into many information-centric areas, particularly when humans or artificial intelligence (AI) are involved in a decision-making process. For example, in Russell, Moskowitz, and Raglin (2017), the authors, not surprisingly, state “We note that a purely quantitative approach to information is far from satisfactory.” They then back this statement with discussions on Paul Revere, the Small Message Criteria (Moskowitz & Kang, 1994), and steganography. This is also discussed in Allwein (2004), where that research merged the work of Barwise and Seligman (1997) and Shannon's theory using channel theory tools from the logic discipline. However, these types of approaches in the literature do not offer immediate help with pragmatic concerns that exist in the Internet of Things (IoT), where information is plentiful and can also be excessive.

The nature of the IoT is one of pervasive data, continuously gathered and acted on by fully or semiautonomous devices and systems. This nature creates an interesting paradox in the context of VoI. If the IoT ushers in unimaginable volumes of information and one considers value as an economic construct, should not the “value” of information decrease? Perhaps in the average sense, for example, all information’s overall value may decrease, but certain information would still retain a value higher than most. This notion calls into question how applicable existing VoI theory would be in the context of IoT information and related decision making. Moreover, the implications of an intelligent IoT system of systems, which the IoT is, in implementation and operation, introduce another complicating factor toward a generalized theory of VoI. The intelligence in the IoT itself necessarily makes VoI determinations in its autonomous operation on behalf of human decision makers. In this manner, the IoT itself is imbued with its own AI, that manifests as self-star (self-⁎) behaviors. Self-⁎ behaviors are (Babaoglu et al., 2005) autonomic behaviors (such as self-management, self-awareness, self-protecting, etc.) that provide a device or system with an understanding of its contribution (or value) to global, greater, or external objectives/goals. The concept of the IoT's AI brings additional constraints to understanding VoI, given such a pervasive information system. Like the limitations of Shannon's information theory (Shannon, 1956), these considerations also create a fundamental issue for a solely quantitative theory of information's applicability to IoT decision making.

We attempt to address this issue by examining a VoI theory in the context of information provided by the IoT. Our thinking is grounded in the work of Ponssard (1975) and especially Howard (1966). These works discuss how VoI is part of decision analysis. We attempt to make an optimal decision, based upon expected utility/value. Howard (1966) discusses how a company decides how much to bid on a contract based upon the a priori information it has available. In this situation, the company attempts to maximize its expected profit. We note though that we disagree with how Howard obtained his “clairvoyant” results in the situation when additional information is available to the decision maker. AI plays a major role in any consideration of the VoI because techniques, such as machine learning, can distill additional information from the IoT, which can be used by a decision maker.

9.2 The Internet of Things and Artificial Intelligence

IoT is touted as the next wave in the era of computing (Gubbi, Buyya, Marusic, & Palaniswami, 2013) and has quickly been relabeled the Internet of Everything (Roy & Chowdhury, 2017). While the definition of the IoT may take many forms, there is little debate about the enormous amount of information it will make available (Barnaghi, Sheth, & Henson, 2013; Papadokostaki et al., 2017; Taherkordi, Eliassen, & Horn, 2017) for decision-related activities. Quoting from Moskowitz, Russell, and Jalaian (2018):

The Internet of Things (IoT) is the realization of interconnected and ubiquitous computing, pervasive sensing, and autonomous systems that can affect the physical world. … The “things” that exist in the IoT can be generally thought of as physical or computational objects that label, sense, communicate, process, or actuate thereby bridging the physical and virtual worlds (Oriwoh & Conrad, 2015; Pande & Padwalkar, 2014). While there is no universally accepted definition of the IoT, the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) defines the IoT as “a global infrastructure for the information society, enabling advanced services by interconnecting (physical and virtual) things …”

In Moskowitz et al. (2018), beyond providing a definition of the IoT, the authors showed how side channels in the IoT architecture can cause information to be covertly/steganographically transmitted from one place in the IoT to another. They argue that the IoT will make so much information available that new threats will emerge from hiding in (information’s) plain sight. We posit that the amount of available information in the IoT will change the supply and demand dynamic, resulting in a need for a new understanding of information's value. This relationship will likely follow an econometric view of value, where scarcity increases perceived and/or real value (Hansen & Serin, 1997; Rymaszewska, Helo, & Gunasekaran, 2017; Worchel, 1992). What makes the IoT such an interesting arena for VoI research is that even where the number of bits is the same everywhere in the IoT, the value of those bits can differ depending upon where and when you are in the IoT. For example, if my smart refrigerator sends a message that I only have one egg left (extrapolating from Borgonovo, 2017), that information is only valuable to my cook, and it depends upon what my cook is planning on preparing before going to the market again. Since I do not cook, that information is of no value to me. However, if my alarm system sends a message to my smart phone that there is someone in my house when no one is supposed to be home, that information may be of some value to my cook, but it is extremely valuable information to me.

The IoT changes a user's normal perspective on how valuable information is obtained. We have many, many sources potentially sending information to a decision maker. This number of sources and amount of information can be both good and bad. It can be good by enabling us to reduce the uncertainty of some random variables. That is, one may be able to replace a continuous random variable with a large region of support, ideally with a Dirac delta distribution, where we would precisely know the information. That would be the ideal case and is discussed in the later sections of Howard (1966). However, in the following section we will illustrate some mathematical differences in what Howard did, and discuss our findings with regard to perfect information (clairvoyance), which are also different.

The IoT can also have negative effects, particularly when it comes to varied sources of potentially valuable decision-relevant information. Since the IoT is a huge conglomeration of processing and sensing devices, it is possible, and perhaps even likely, that contradictory information is obtained. Furthermore, the IoT will also be artificially intelligent itself (Elvy, 2017; Etzion, 2015). Machine learning algorithms are currently employed in the IoT at the local device and global usage levels (Ren & Gu, 2015). Many of the machine-learning approaches are implemented to provide the IoT with decision-making autonomy. In the next dimension of system intelligence, the IoT has already begun incorporating technologies to add increasing/improving autonomic or self-star (self-⁎) behaviors. Self-⁎ behaviors are those characteristics that form self-awareness and include self-organization, self-adaptation, and self-protection. The dependence on AI in the IoT, in this context, is apparent. However, the implications of AI-enabled self-⁎ behaviors for information value are less clear. Nonetheless, there is ample documentation in the literature about how AI can and will be employed as a gatekeeper for information (Camerer, 2017; Conitzer, Sinnott-Armstrong, Borg, Deng, & Kramer, 2017; Naseem & Ahmed, 2017).

The overwhelming number of devices and the data that they provide have already created a need for machine learning (Witten, Frank, Hall, & Pal, 2016). The IoT easily interconnects devices and the information between them and with other objects and humans, facilitating the ability to transfer data to them without human-to-computer or human-to-human interaction. Reasoning capabilities stemming from machine learning and also from exploiting other (potentially centralized) resources bring beneficial effects in terms of system efficiency and dependability and adaptive physical and behavioral human-system interactions and collaborations for users (Vermesan et al., 2017). Moreover, the machine learning and intelligence in the IoT provide systems with user, ambient, and social awareness, and enable a wide range of innovative applications (Guo et al., 2013). Beyond the volumes of information the IoT provides, the collective intelligence it manifests represents a new form of information-driven value (de Castro Neto & Santo, 2012).

The elicitation of value through the use of machine learning is not a new phenomenon (Dean, 2014). However, the IoT is transformative because it combines embedded machine learning, and thus collective intelligence, with an exponential ubiquity of devices, vast amounts and variety of data, and an ability to provide virtual interfaces to physical objects that can act on the real world. In this manner, advances in machine learning and AI will complement the technological capability of the IoT and significantly impact many facets of the traditional value chain (Kaplan, 1984). Contrast modern organizations’ exploitation of the intelligent IoT with the historical perspective of 1997 research (Plant & Murrell, 1997) that casts AI as:

The ultimate enabler of [organizational] agility through technology is the artificial intelligent component. AI demands greater organizational internal understanding of technology and thus is only applicable to mature organizations that have internally streamlined processes and a high degree of connectivity.

This view of the future from 1997 speaks to the sophistication of most modern organizations and the impact that more timely and accurate information sharing has on increasing interest in the VoI. Moreover, in narrow domains such as the supply chain, specific AI techniques that target the predictability of information insights necessary for profit generation have proven quantitative measures of value and its relationship to the insights delivered (Lumsden & Mirzabeiki, 2008).

It is interesting to consider how AI itself will have to make VoI determinations relative to tasked goals and objectives. This is because the AI will be responsible for ensuring the automation of a variety of tasks and execution of services. Particularly in a collectively intelligent IoT setting, AI necessarily must make decisions that adapt local behaviors to accommodate global missions and dynamics. From this perspective, the decisions that both humans and AI make utilizing IoT information and processes must do so by focusing on the information itself rather than on technology, as the real carrier of value (Glazer, 1993). Glazer's research both (1) reiterates our earlier point that information itself is difficult and contextual to define in a value context, and (2) offers a transactional basis for VoI that involves decision making in a consumer-supplier process:

A completely satisfactory definition of what constitutes information is problematic. The idea of information is context-dependent and multidimensional. Analogous to the level at which modern physics describes the entire universe in terms of the equivalence between matter and energy is the level at which communications theory provides a procedure for specifying anything (including matter and energy) in terms of its formal information content. The formal or quantitative definition and measure of information is that which reduces uncertainty or changes an individual's degree of belief about the world. However, except for the utility of this construct in purely engineering contexts, it has not provided the foundation for a practical information-measurement system in most general applications.

Glazer's work uses a case study to illustrate the relationship between information value and transactions and thus does little toward a theoretic approach to quantifying VoI. Further, while Glazer illustrates processes that may be automated by AI, his treatment does little to support the detailed decision-making process. Within a complex intelligent system such as the IoT, local AI as a decision maker will have to make microdecisions that enable scalability, given the decentralized nature of the IoT network (Ge, Yang, & Han, 2017). This raises the question: how can value be determined quantitatively in an application-agnostic or generalizable sense? It is through this merged lens of IoT and AI that we examine a theory of VoI. To provide grounding and leverage the literature that examines value from a transactional basis, we start with the work of Howard.

9.3 Reworking Howard's Initial Example

Howard's work (Howard, 1966) takes a business approach to defining information value. In this section we borrow freely from Howard. We do not quote phrases, for the sake of readability. We make no claim to this work and simply rework Howard's example; the only novel thing in this section is our choice of notation and exposition.

We begin with Howard's very practical problem of how much a company should bid to win a contract. If the bid is too high, it loses the contract. If the bid is too low, it gets the contract, but loses money on the deal. Therefore the company attempts to place the bid that will get it the contract while maximizing its profit. The information that the company uses to decide its bid is therefore of extreme importance and this range of information is considered to make up the sample space in question.

We let the random variable $C$ be the cost of performing on the contract. Unfortunately, this cost is a probabilistic guess. We let the random variable $L$ represent the lowest bid of the competitors. The company's bid is given by the random variable $B$, and the company's profit is the random variable $V$.

If b > l, the company loses the contract, and its profit is 0. If b < l, the company wins the contract and performs the work at a cost of c, so the profit is v = b − c. As in Eq. (9.3) of Howard (1966), the company also gets the contract in case of a tie (b = l). In terms of the random variables:

$V = \begin{cases} B - C, & \text{if } B \leq L \\ 0, & \text{if } B > L \end{cases}$

(9.1)

In Fig. 9.1 we see the plot of Eq. (9.1) when c = 3, l = 8. There is no upper bound on what b may be, but v is always 0 for large enough b. Let us consider the density functions following Howard (1966) but with a modified¹ notation of Ross (2002). We have:

$\begin{array}{c} f (v | b) = \iint_{R^{2}} f (v | b, c, l) \cdot f (c, l | b) d c d l \end{array}$

(9.2)

Eq. (9.2) only makes sense when 0 ≤ b. The company never places a negative bid, so any event involving b < 0 has zero probability. Thus the conditional probability is not defined in that range.

Fig. 9.1 Profit $V = v$ , for c = 3, l = 8, $B = b$ .
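The piecewise rule of Eq. (9.1) can be sketched in a few lines of Python. This is only an illustration at the values used in Fig. 9.1 (c = 3, l = 8); the function name `profit` is ours and does not appear in Howard (1966).

```python
def profit(b: float, c: float, l: float) -> float:
    """Profit v from Eq. (9.1): b - c if the bid wins (b <= l), otherwise 0."""
    return b - c if b <= l else 0.0


# Setting of Fig. 9.1: cost c = 3, lowest competing bid l = 8.
for b in (2, 3, 5, 8, 8.01, 12):
    print(f"b = {b:>5}: v = {profit(b, 3, 8)}")
```

A bid below the cost still wins but yields a negative profit, a tie at b = l wins, and any bid above l yields zero.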

We are interested in the expected value of profit conditioned on the bid. That is, we wish to determine $E (V | b)$ :

$\begin{aligned} E(V | b) &= \int_{-\infty}^{\infty} v \cdot f(v | b) \, dv \\ &= \iiint_{R^{3}} v \cdot f(v | b, c, l) \cdot f(c, l | b) \, dc \, dl \, dv \\ &= \iint_{R^{2}} f(c, l | b) \left( \int_{-\infty}^{\infty} v \cdot f(v | b, c, l) \, dv \right) dc \, dl \\ &= \iint_{R^{2}} E(V | b, c, l) \cdot f(c, l | b) \, dc \, dl \end{aligned}$

(9.3)

Now, Howard (1966, Eqs. 9.6 and 9.7) makes two assumptions to simplify the problem.

Assumption 9.1

The joint distribution of cost and lowest bid $C$ , $L$ is independent of our company's bid $B$ . That is:

$\begin{array}{l} f (c, l | b) = f (c, l) \end{array}$

Assumption 9.2

The company's cost $C$ is independent of the lowest bid $L$ . That is:

$\begin{array}{l} f (c, l) = f (c) f (l) \end{array}$

We realize that one could certainly question whether these assumptions hold in all cases. Using Assumptions 9.1 and 9.2 we now have that:

$\begin{array}{c} E (V | b) = \iint_{R^{2}} E (V | b, c, l) \cdot f (c) f (l) d c d l \end{array}$

(9.4)

From Eq. (9.1) we see that once we set the values of $B$ , $C$ , $L$ at b, c, l, respectively, the density function of $V$ becomes deterministic. That is:

Theorem 9.1

$f(v | b, c, l) = \begin{cases} δ(v - (b - c)), & \text{if } b \leq l \\ δ(v), & \text{if } b > l \end{cases}$

and therefore:

$E(V | b, c, l) = \begin{cases} b - c, & \text{if } b \leq l \\ 0, & \text{if } b > l \end{cases}$

The following theorem follows from Eq. (9.4).

Theorem 9.2

Using Assumptions 9.1 and 9.2, we have:

$\begin{aligned} E(V | b) &= \iint_{R^{2}} E(V | b, c, l) \cdot f(c) f(l) \, dc \, dl \quad \text{(as above)} \\ &= \int_{-\infty}^{\infty} (b - c) \left( \int_{b}^{\infty} f(l) \, dl \right) f(c) \, dc \end{aligned}$

(9.5)

$= P(L > b) \cdot \int_{-\infty}^{\infty} (b - c) f(c) \, dc$

(9.6)

$= [b - E(C)] \cdot P(L > b)$

(9.7)

The above corresponds to Eq. (9.10) (Howard, 1966). After our above assumptions, to obtain $E (V | b)$ , we only need the distribution of $L$ and $E (C)$ . Howard (1966) models $C$ as a uniform distribution on [0, 1], which implies $E (C) = \frac{1}{2}$ .

Next we relax what Howard did, and model the distribution of $C$ such that $E (C) = \frac{1}{2}$ . We also follow Howard and model $L$ as a uniform distribution on [0, 2].

We say that the base Howard example is $L = U [0, 2]$ and $E (C) = \frac{1}{2}$ .

The above gives us $P (L > b) = \frac{1}{2} (2 - b)$ , b ≤ 2 (0 for b > 2). Of course, we do not consider b < 0 as discussed earlier. And so, we arrive at:

$E(V | b) = \frac{1}{2} (2 - b) \left( b - \frac{1}{2} \right), \quad 0 \leq b \leq 2$

(9.8)
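As a sanity check on Eq. (9.8), the expected profit can be estimated by simulation and compared with the closed form. The sketch below assumes Howard's choices $C = U [0, 1]$ and $L = U [0, 2]$; the function names, the sample size, and the seed are our own illustrative choices.

```python
import random


def simulate_expected_profit(b: float, n: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo estimate of E(V | b) with C ~ U[0, 1] and L ~ U[0, 2]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        c = rng.uniform(0.0, 1.0)   # cost of performing the contract
        l = rng.uniform(0.0, 2.0)   # lowest competing bid
        if b <= l:                  # the company wins (ties included)
            total += b - c
    return total / n


def closed_form(b: float, mean_cost: float = 0.5, upper: float = 2.0) -> float:
    """[b - E(C)] * P(L > b) for L ~ U[0, upper], valid for 0 <= b <= upper."""
    return (b - mean_cost) * (upper - b) / upper


for b in (0.5, 1.0, 1.25, 1.75):
    print(b, round(simulate_expected_profit(b), 4), closed_form(b))
```

The simulated and closed-form values should agree to within sampling error, which is on the order of 0.001 for this sample size.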

We see that $E (V | b) = - \frac{1}{2} [b^{2} - \frac{5}{2} b + 1]$ is a simple quadratic and that $\frac{d}{d b} E (V | b) = - b + \frac{5}{4}$ , so $E (V | b)$ attains a maximum of 9/32 when b = 5/4 (Fig. 9.2).

We define:

$\begin{array}{l} {⌈ 〈 V 〉 ⌉}_{b} ≜ max_{b} E (V | b) \end{array}$

From this definition, we see that when $E (C) = 0.5$ and $L = U [0, 2]$ :

${⌈ 〈 V 〉 ⌉}_{b} = 9 / 32$
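The maximum can be confirmed with exact rational arithmetic. The following sketch evaluates $[b - E (C)] \cdot P (L > b)$ for the base Howard example on a grid of bids with step 1/100; the grid resolution and the function name are assumptions made for illustration only.

```python
from fractions import Fraction


def expected_profit(b, mean_cost=Fraction(1, 2), upper=Fraction(2)):
    """[b - E(C)] * P(L > b) with L ~ U[0, upper]; zero outside [0, upper]."""
    if b < 0 or b > upper:
        return Fraction(0)
    return (b - mean_cost) * (upper - b) / upper


# Candidate bids 0, 0.01, ..., 2.00 represented exactly.
grid = [Fraction(k, 100) for k in range(0, 201)]
best_b = max(grid, key=expected_profit)
print(best_b, expected_profit(best_b))  # 5/4 9/32
```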

We are in agreement with everything that Howard has done to this point. What we do not agree with is how he used the concept of clairvoyance for additional information that may be learned. We note that the concept of clairvoyance is also discussed in (Borgonovo, 2017, Chapter 11). We return to this later in the chapter.

9.4 Value Discussion

We see from the above that the expected value of a random variable is very important to a decision maker; the information that is used has value. This notion of value is important in the IoT because it can/will be the source of information, moderated by AI that provides it, modifies it, or protects it. From this perspective, the IoT may provide all of the information, too much information, or a limited amount of the information. We see in the above example, that we do not need the entire cost, given Howard's assumptions, only the mean of the cost. Therefore, it need not require many bits of valuable information. Extending Howard's notion at this point, what is the information we have and what is its value?

1. Eq. (9.1): Modeling equation
2. Eq. (9.2): Standard probability theory
3. Assumption 9.1: Independence of the company's bid
4. Assumption 9.2: Cost and lowest bid independence
5. Behavior of $C$
6. Behavior of $L$

Let us just concentrate on the last two items for now. What we have actually used to this point is only the mean of $C$ , and for simplicity, we set:

$\begin{array}{l} μ ≜ E (C) \end{array}$

The distribution of $L$ is given by its density function f_L(l). Modifying this information changes the quantity we care about, that is:

What is the “value” of the information in items 5 and 6 of the previous list in how it affects ${⌈ 〈 V 〉 ⌉}_{b}$ ? Does the shape of the graph change, does the maximum behavior change, etc.?

We return to Eq. (9.4) to see the impact of changes in the information for items 5 and 6. First, let us change the distribution of $L$ so that it is uniformly distributed on [0, L], L > 0, instead of [0, 2].

We see that $P (L > b) = \frac{1}{L} (L - b)$ , b ≤ L (0 for b > L). We see that, in general, for an arbitrary positive μ we have:

$E(V | b) = \frac{1}{L} (L - b) (b - μ), \quad 0 \leq b \leq L$

(9.9)

$= - \frac{1}{L} \left( b^{2} - [L + μ] b + L μ \right)$

(9.10)

Simple calculus shows that the value b_o that maximizes $E (V | b)$ is either the critical point $b_{c} = \frac{L + μ}{2}$ , if b_c ≤ L, or the boundary point L if μ > L. Thus,

${⌈ 〈 V 〉 ⌉}_{b} = \begin{cases} \frac{(L - μ)^{2}}{4 L}, \text{ with } b_{o} = \frac{L + μ}{2}, & \text{if } 0 \leq μ < L \\ 0, \text{ with } b_{o} = L, & \text{if } μ \geq L \end{cases}$

(9.11)

We see that the only interesting case is when 0 < μ < L, which makes logical sense. We call this case the nontrivial region and denote the function defined on that region as $《 V 》$ (Fig. 9.3).

Fig. 9.3 Surface plot of nontrivial values, for L ∈ [0, 2], μ ∈ [0, 2], of ${⌈ 〈 V 〉 ⌉}_{b}$ , with point (L = 2, μ = 0.5, ${⌈ 〈 V 〉 ⌉}_{b} = 9 / 32$ ) highlighted.
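Eq. (9.11) translates directly into a small function. The sketch below uses exact fractions and recovers the highlighted point of Fig. 9.3; the name `max_expected_profit` and the argument name `upper` (standing for the bound L of the uniform distribution) are ours.

```python
from fractions import Fraction


def max_expected_profit(upper, mu):
    """Return (maximum expected profit, optimal bid b_o) following Eq. (9.11).

    upper: right endpoint L of the uniform distribution of the lowest bid.
    mu:    mean cost E(C).
    """
    if 0 <= mu < upper:
        # Nontrivial region: interior critical point b_o = (L + mu) / 2.
        return (upper - mu) ** 2 / (4 * upper), (upper + mu) / 2
    # Trivial region: the best the company can do is bid L and expect 0.
    return Fraction(0), upper


print(max_expected_profit(Fraction(2), Fraction(1, 2)))  # (9/32, 5/4) as Fractions
print(max_expected_profit(Fraction(1), Fraction(3, 2)))  # trivial region
```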

Note that we also have:

$\frac{\partial {⌈ 〈 V 〉 ⌉}_{b}}{\partial L} = \begin{cases} \frac{1}{4} \left( 1 - \left( \frac{μ}{L} \right)^{2} \right) > 0, & \text{if } 0 \leq μ < L \\ 0, & \text{if } μ \geq L \end{cases}$

(9.12)

and

$\frac{\partial 《 V 》}{\partial μ} = \frac{1}{2} \left( \frac{μ}{L} - 1 \right) < 0$

(9.13)

In the nontrivial region, increasing L increases ${⌈ 〈 V 〉 ⌉}_{b}$ , and increasing μ decreases ${⌈ 〈 V 〉 ⌉}_{b}$ .

Let us pause and think about VoI. Is there any additional value in learning more about $C$ other than its mean? No! This realization is an important understanding.

Also, if we are at a point in the nontrivial region, what is more important to learn with respect to $《 V 》$ , a change in L or a change in $E (C)$ ? That is, if we have to prioritize the information that is sent to a decision maker and we can only send one “fact” at a time, which one would we send first, information about a change in L or $E (C)$ ? Consider the total differential:

$d 《 V 》 = \frac{\partial 《 V 》}{\partial L} \, d L + \frac{\partial 《 V 》}{\partial μ} \, d μ$

(9.14)

$= \frac{1}{4} \left( 1 - \left( \frac{μ}{L} \right)^{2} \right) d L - \frac{1}{2} \left( 1 - \frac{μ}{L} \right) d μ$

(9.15)

Then, using 1 − x² = (1 − x)(1 + x), we see that:

$\left| \frac{\partial 《 V 》}{\partial L} \right| < \left| \frac{\partial 《 V 》}{\partial μ} \right| < 2 \left| \frac{\partial 《 V 》}{\partial L} \right|$

(9.16)

In the infinitesimal sense, the value of $E (C)$ is more important than the value of L, but not by much. Therefore, if we have to prioritize information sent to a decision maker, it should be $E (C)$ , then L.
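The ordering in Eq. (9.16) can be checked numerically. The sketch below approximates both partial derivatives of $《 V 》 = (L - μ)^{2} / (4 L)$ by central finite differences at the base point L = 2, μ = 0.5; the step size `h` and the function names are our own choices, not part of the chapter.

```python
def v_max(upper: float, mu: float) -> float:
    """Maximum expected profit (L - mu)^2 / (4 L) in the nontrivial region."""
    return (upper - mu) ** 2 / (4.0 * upper)


def sensitivities(upper: float, mu: float, h: float = 1e-6):
    """Central finite-difference estimates of dV/dL and dV/dmu."""
    d_upper = (v_max(upper + h, mu) - v_max(upper - h, mu)) / (2 * h)
    d_mu = (v_max(upper, mu + h) - v_max(upper, mu - h)) / (2 * h)
    return d_upper, d_mu


d_upper, d_mu = sensitivities(2.0, 0.5)
print(d_upper, d_mu)
# Ordering of Eq. (9.16): |dV/dL| < |dV/dmu| < 2 |dV/dL|
print(abs(d_upper) < abs(d_mu) < 2 * abs(d_upper))
```

At this point the analytic values from Eqs. (9.12) and (9.13) are 15/64 and −3/8, so a unit change in μ moves the optimum value more than a unit change in L, but by less than a factor of two.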

Of course, all of the above is based upon the fact that we know the optimal $b_{o} = \frac{L + μ}{2}$ , which we learned from our above assumptions and calculations (Fig. 9.4).

Fig. 9.4 Surface plot of b_o in the nontrivial region for L ∈ [0, 2], μ ∈ [0, 2].

9.4.1 Generalization

Let us summarize the above as a generality.

1. We are given distributions on $L$ and $C$ .
2. $V = \begin{cases} B - C, & \text{if } B \leq L \\ 0, & \text{if } B > L \end{cases}$
3. $L$ and $C$ are independent of the company's bid $B$ .
4. The company's cost $C$ is independent of the lowest bid $L$ .

Given this summary:

$E(V | b) = [b - E(C)] \cdot P(L > b)$

and now, in general,

${⌈ 〈 V 〉 ⌉}_{b} ≜ \max_{b} E(V | b)$

Assuming that $\frac{d}{d b} E(C) = \frac{d}{d b} f(l) = 0$ (which is not a far stretch from the statistical independence we have assumed for the underlying random variables), we have $\frac{d}{d b} E(V | b) = P(L > b) - [b - E(C)] \cdot f_{L}(b)$, where the term $f_{L}(b)$ is the density function f(l) of $L$ evaluated at l = b. The optimal b_o in the nontrivial region solves the integral equation:

$b = E(C) + \frac{P(L > b)}{f_{L}(b)} = E(C) + \frac{1}{f_{L}(b)} \int_{b}^{\infty} f_{L}(l) \, dl$

and in the nontrivial region

${⌈ 〈 V 〉 ⌉}_{b} = \frac{[P(L > b_{o})]^{2}}{f_{L}(b_{o})}$
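For distributions of $L$ where the equation for b_o has no convenient closed form, it can be solved numerically. The sketch below applies bisection to $P (L > b) - [b - E (C)] f_{L} (b) = 0$. It assumes the caller supplies a bracket on which this expression changes sign exactly once. The uniform case reproduces the base Howard example; the exponential case is a hypothetical distribution of our own choosing and is not taken from Howard (1966).

```python
import math


def optimal_bid(survival, density, mean_cost, lo, hi, tol=1e-10):
    """Bisection on g(b) = P(L > b) - (b - E(C)) * f_L(b) over [lo, hi]."""
    def g(b):
        return survival(b) - (b - mean_cost) * density(b)

    if not (g(lo) > 0 > g(hi)):
        raise ValueError("bracket must satisfy g(lo) > 0 > g(hi)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def max_value(survival, density, b_opt):
    """Maximum expected profit [P(L > b_o)]^2 / f_L(b_o)."""
    return survival(b_opt) ** 2 / density(b_opt)


# Base example: L ~ U[0, 2], E(C) = 1/2.
def uniform_survival(b):
    return min(1.0, max(0.0, (2.0 - b) / 2.0))

def uniform_density(b):
    return 0.5 if 0.0 <= b <= 2.0 else 0.0

b_uniform = optimal_bid(uniform_survival, uniform_density, 0.5, lo=0.5, hi=2.0)
print(b_uniform, max_value(uniform_survival, uniform_density, b_uniform))

# Hypothetical example (our assumption): L exponential with rate 1, E(C) = 1/2.
def exp_survival(b):
    return math.exp(-b)

def exp_density(b):
    return math.exp(-b)

b_exp = optimal_bid(exp_survival, exp_density, 0.5, lo=0.5, hi=10.0)
print(b_exp, max_value(exp_survival, exp_density, b_exp))
```

For the exponential case the ratio $P (L > b) / f_{L} (b)$ is constant and equal to 1, so the equation reduces to b = E(C) + 1, which gives a simple check on the solver.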

9.5 Clairvoyance About $C$

Let us go back to Eq. (9.4), but now let us assume that the company knows the cost $C$ . In this case the company will never bid less than the cost or it will lose money! Note that our results in this section differ from Howard's results on clairvoyance.

Assumption 9.3

The company has knowledge of the cost.

We must modify Theorem 9.1 so that:

$E(V | b, c, l) = \begin{cases} b - c, & \text{if } c \leq b \leq l \\ 0, & \text{otherwise} \end{cases}$

(9.17)

We have that:

$\begin{aligned} E(V | b) &= \iint_{R^{2}} E(V | b, c, l) \cdot f(c) f(l) \, dc \, dl \quad \text{(as above)} \\ &= \int_{-\infty}^{b} (b - c) \left( \int_{b}^{\infty} f(l) \, dl \right) f(c) \, dc \end{aligned}$

(9.18)

$= P(L > b) \cdot \int_{-\infty}^{b} (b - c) f(c) \, dc$

(9.19)

$= \left[ b \cdot P(C \leq b) - \int_{-\infty}^{b} c f(c) \, dc \right] \cdot P(L > b)$

(9.20)

We will go through an example similar to what we did before. Previously, we followed Howard and modeled $C$ so that $E (C) = 1 / 2$ and $L = U [0, 2]$ . Note that, as before, the distribution of $C$ did not matter, only its mean. We see from the above that this is no longer true. Let us try some examples.

Example 9.1

[ $L = U [0, 2]$ and $P (C = 1 / 2) = 1$ ] So we have that f(c) = δ(c − 1/2), and Eq. (9.20) becomes (Fig. 9.5):

Fig. 9.5 $E (V | b) \geq 0$ when the company knows the cost, $L = U [0, 2]$ and $P (C = 1 / 2) = 1$ .

$E(V | b) = \begin{cases} [b - 1/2] \cdot P(L > b), & \text{if } 1/2 < b \leq 2 \\ 0, & \text{otherwise} \end{cases}$

(9.21)

Eq. (9.21) simplifies to:

$E(V | b) = \begin{cases} (b - 1/2) \left( \frac{2 - b}{2} \right), & \text{if } 1/2 < b \leq 2 \\ 0, & \text{otherwise} \end{cases}$

(9.22)

In Example 9.1, $E (V | b)$ has a maximum value of 9/32, when b = 5/4.

Example 9.2

[ $L = U [0, 2]$ and $C = U [0, 1]$ ] Eq. (9.20) now becomes:

$E(V | b) = \begin{cases} \left[ b \cdot \left( \frac{b - 0}{1 - 0} \right) - \int_{0}^{b} c \cdot \frac{1}{1} \, dc \right] \cdot \left( \frac{2 - b}{2} \right), & \text{if } 0 \leq b \leq 1 \\ \left[ b \cdot P(C \leq 1) - \int_{0}^{1} c \, dc \right] \cdot \left( \frac{2 - b}{2} \right), & \text{if } 1 < b \leq 2 \\ 0, & \text{otherwise} \end{cases}$

(9.23)

Eq. (9.23) simplifies to:

$E(V | b) = \begin{cases} \frac{b^{2}}{4} (2 - b), & b \in [0, 1] \\ [b - E(C)] \cdot \left( \frac{2 - b}{2} \right) = \frac{1}{2} (2 - b) \left( b - \frac{1}{2} \right), & b \in (1, 2] \\ 0, & \text{otherwise} \end{cases}$

(9.24)

We note with interest that $E (V | b)$ is a (once) differentiable function on [0, 2] (Fig. 9.6).

In Example 9.2, $E (V | b)$ has a maximum value of 9/32 when b = 5/4, which is the same as Howard's base example.
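The piecewise form in Eq. (9.24) is easy to probe numerically. The sketch below evaluates it with exact fractions on a grid of step 1/100 (our choice), checking that the two branches meet at b = 1, that the expected profit is never negative, and where the maximum falls. The function name is illustrative.

```python
from fractions import Fraction


def expected_profit_known_cost(b):
    """Eq. (9.24): E(V | b) for L ~ U[0, 2] and C ~ U[0, 1] with the cost known."""
    if 0 <= b <= 1:
        return b * b * (2 - b) / 4
    if 1 < b <= 2:
        return (2 - b) * (b - Fraction(1, 2)) / 2
    return Fraction(0)


grid = [Fraction(k, 100) for k in range(0, 201)]
values = [expected_profit_known_cost(b) for b in grid]
best_b = max(grid, key=expected_profit_known_cost)

print(expected_profit_known_cost(Fraction(1)))      # value where the branches meet
print(best_b, expected_profit_known_cost(best_b))   # 5/4 9/32
print(min(values) >= 0)                             # never a negative expectation
```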

We see that when the company knows the cost $C$ , the distribution and not just the mean, it affects the behavior of $E (V | b)$ . We also see that knowledge of $C$ guarantees that $E (V | b) \geq 0$ . That is, the company never loses money.

9.6 Clairvoyance About $L$

Now we are in the situation where the company knows the competitor's lowest bid, which is represented by $L$ . As before, we assume that if the company's bid b ties with the competition's lowest bid l, the company wins the contract. If we know l we bid l; this bid is placed to win the contract and maximize profit. Note, if one finds this result disturbing, we can always make the bid b a tiny amount less than l. Nonetheless, we therefore see that $B$ and $L$ must be the same. The company will always win the bidding, but it may lose money depending on the value of c. Therefore,

$\begin{array}{l} E (V | b) = E (V | l) \end{array}$

(9.25)

Modifying Eq. (9.1), now differently than Eq. (9.17), we have:

$E(V | b, c, l) = \begin{cases} l - c, & l \in \text{support of } L \\ 0, & \text{otherwise} \end{cases}$

(9.26)

We have that ( $C$ and $L$ still independent):

$E (V | l) = \int_{\mathbb{R}} E (V | c, l) \cdot f (c) \, d c$

(9.27)

$= \int_{\mathbb{R}} (l - c) \cdot f (c) \, d c$

(9.28)

$= l \int_{\mathbb{R}} f (c) \, d c - \int_{\mathbb{R}} c \cdot f (c) \, d c$

(9.29)

$\begin{array}{l} = l - E (C) \end{array}$

(9.30)

when l ∈ support of $L$ .

Example 9.3

[ $L = U [0, 2]$ and $E (C) = 1 / 2$ ] Below we show a plot of $E (V | l) = l - 0.5$ against l for $L = U [0, 2]$ and $E (C) = 1 / 2$ (Fig. 9.7).

For Example 9.3, ${\lceil \langle V \rangle \rceil}_{b} = 1.5$ , achieved when b = 2.

Note that $E (V | b)$ is a linear function of b = l and that it can be negative, zero (once), or positive depending on the support of $L$ . Furthermore, the maximum of $E (V | b)$ is achieved when b is the largest value of l in the support of $L$ . Heuristically, another way of saying this is that the maximum is achieved for the largest value of l such that P(L ∈ (l − dx, l))≠0.

Unlike with clairvoyant knowledge of $C$ , with knowledge of $L$ the expected value $E (V | b)$ may be negative, but the profit may be much larger. Knowledge of $C$ gives the company nonnegative profit, whereas knowledge of $L$ gives it a larger potential profit. This result is in line with the results Howard obtained.
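The linear behavior in Example 9.3 can be tabulated with a few lines of code. This is a minimal sketch under the example's assumptions ( $L = U [0, 2]$ and $E (C) = 1 / 2$ ); the function name, its parameters, and the sample bids are hypothetical choices made for illustration.

```python
def expected_value_known_lowest_bid(l, mean_cost=0.5, support=(0.0, 2.0)):
    """Eq. (9.30): E(V | l) = l - E(C) when l lies in the support of L, else 0."""
    low, high = support
    return l - mean_cost if low <= l <= high else 0.0


# Illustrative sample of competitor lowest bids inside the support [0, 2].
candidates = [0.0, 0.25, 0.5, 1.0, 2.0]
values = {l: expected_value_known_lowest_bid(l) for l in candidates}
best_l = max(candidates, key=expected_value_known_lowest_bid)

for l, v in values.items():
    print(f"l = {l:.2f}  ->  E(V | l) = {v:+.2f}")
print("largest expected value at l =", best_l)
```

The table shows the three regimes described above: a negative expected value for l below 0.5, zero exactly at l = 0.5, and a positive value that grows up to the right end of the support.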

9.7 Clairvoyance About $C$ and $L$

After we combine both pieces of information, the bid b will never be less than c, and it will always match l, meaning that we must modify Eq. (9.1) again, now different from Eq. (9.25) because $L$ is more restricted, resulting in:

$E (V | b, c, l) = \begin{cases} l - c, & \text{if } c \leq l \text{ and } l \in \text{support of } L \\ 0, & \text{otherwise} \end{cases}$

(9.31)

We then have (with $C$ and $L$ still assumed independent):

$E (V | l) = \int_{\mathbb{R}} E (V | c, l) \cdot f (c) \, d c$

(9.32)

$= \int_{- \infty}^{l} (l - c) \cdot f (c) \, d c$

(9.33)

$= l \cdot P (C < l) - \int_{- \infty}^{l} c \cdot f (c) \, d c$

(9.34)

when l ∈ support of $L$ .

Example 9.4

[ $L = U [0, 2]$ and $C = U [0, 1]$ ]

$E (V | b) = \begin{cases} l \cdot \int_{0}^{l} d c - \int_{0}^{l} c \, d c, & \text{if } 0 \leq l < 1 \\ l \cdot \int_{0}^{1} d c - \int_{0}^{1} c \, d c, & \text{if } 1 \leq l \leq 2 \\ 0, & \text{otherwise} \end{cases}$

(9.35)

The results:

$E (V | b) = \begin{cases} \frac{l^{2}}{2}, & \text{if } 0 \leq l < 1 \\ l - 0.5, & \text{if } 1 \leq l \leq 2 \\ 0, & \text{otherwise} \end{cases}$

(9.36)

Below we plot $E (V | b) = E (V | l)$ against b for $L = U [0, 2]$ and $C = U [0, 1]$ (Fig. 9.8).

For Example 9.4, ${\lceil \langle V \rangle \rceil}_{b} = 1.5$ , achieved when b = 2. Note that the behavior of Examples 9.3 and 9.4 is identical for b ≥ 1. The difference is that if the company knows $C$ , it will never place a bid that loses money.
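The closed form in Example 9.4 can be cross-checked by integrating Eq. (9.33) numerically. The sketch below uses a plain midpoint rule; the function names and the number of subintervals are our own illustrative choices, and the uniform densities are those assumed in the example.

```python
def uniform_pdf(c, a=0.0, b=1.0):
    """Density of C ~ U[a, b]; the example uses a = 0, b = 1."""
    return 1.0 / (b - a) if a <= c <= b else 0.0


def expected_value_both_known(l, n=2000):
    """Midpoint-rule approximation of Eq. (9.33) with L ~ U[0, 2], C ~ U[0, 1]."""
    if not 0.0 <= l <= 2.0:  # l outside the support of L
        return 0.0
    upper = min(l, 1.0)  # the density of C vanishes above 1
    if upper <= 0.0:
        return 0.0
    h = upper / n
    total = 0.0
    for k in range(n):
        c = (k + 0.5) * h  # midpoint of the k-th subinterval
        total += (l - c) * uniform_pdf(c) * h
    return total


def closed_form(l):
    """Eq. (9.36): the piecewise result for Example 9.4."""
    if 0.0 <= l < 1.0:
        return l * l / 2.0
    if 1.0 <= l <= 2.0:
        return l - 0.5
    return 0.0


sample_bids = [0.0, 0.25, 0.5, 1.0, 1.5, 2.0]
max_gap = max(abs(expected_value_both_known(l) - closed_form(l)) for l in sample_bids)
print("largest gap between numeric and closed form:", max_gap)
```

Because the integrand is linear in c, the midpoint rule agrees with the closed form up to floating-point rounding. The values are never negative, and for l ≥ 1 they coincide with l − 0.5 from Example 9.3.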

9.8 Discussion

What does Howard's model teach us from a business perspective? How companies capture value remains largely the same, a function of competitive position and competitive advantage. Companies that control the flow of information in the value creation process enjoy competitive positions that are likelier to afford better opportunities to capture value from other participants in their ecosystem. In other words, they know where to play. Companies that differentiate the way in which they control the flow of information from other companies with similar positions enjoy a competitive advantage. In other words, they know how to win.

This leads to the conclusion that information creates value only when it is used to modify future action in beneficial ways. Ideally, this modified action gives rise to new information, allowing the learning process to continue. Information, then, creates value not in a linear value chain of process steps but, rather, in a never-ending value loop. Whether information is viewed discretely or from a continuous variable perspective the question remains: what is the probability that new value can be derived?

From a systems standpoint, as well as a system-augmented human decision-making perspective, machine learning and AI are implied in the never-ending value loop. This notion is consistent with and supported by the decision science literature, which generally views decision making as an ongoing process (Simon, 1960). If value derived from plentiful IoT information is created in a nonlinear loop (Baker, Song, & Jones, 2017), then information is inseparable from the decision. Further, given an abundance of supply-side information, VoI would decrease proportionally to its decision relevance. This is logical, because if everyone has perfect information in a bidding situation, VoI would correlate highly with its perceived potential for modification (Sánchez-Fernández & Iniesta-Bonillo, 2007). The decision-transactional basis of Howard's theoretical VoI, and thereby of our own extension, may introduce contextual bias in this sense, so we provide some discussion here on the nature of sensed information and VoI from the sensor network literature.

Bisdikian, Kaplan, and Srivastava (2013) have conducted a significant amount of research on quality of information (QoI), VoI, and the relationship between them. Their findings are relevant here because their work helps define the differences between quantitative and qualitative characteristics of information. Moreover, they provide these definitions from the perspective of sensor networks, making the application to IoT direct, if on a somewhat smaller scale. The work of Bisdikian et al. departs from the information-theoretic perspective taken in our work and instead provides a descriptive characterization of QoI and VoI. Their definition casts VoI as a function of QoI, where QoI comprises use-independent facts about information (e.g., percentage of error, age, resolution) and VoI comprises use-dependent qualitative judgments (e.g., trustworthiness, completeness, readability). Figs. 9.9 and 9.10 show the semantic taxonomy from Bisdikian et al., where it is evident that our treatment of VoI is consistent with the qualitative judgment.

Fig. 9.9 Quality of Information attribute taxonomy. (Adapted from Bisdikian, C., Kaplan, L. M., & Srivastava, M. B. (2013). On the quality and value of information in sensor networks. ACM Transactions on Sensor Networks (TOSN), 9(4), 48.)

Fig. 9.10 Value of Information attribute taxonomy. (Adapted from Bisdikian, C., Kaplan, L. M., & Srivastava, M. B. (2013). On the quality and value of information in sensor networks. ACM Transactions on Sensor Networks (TOSN), 9(4), 48.)

The semantic descriptions of QoI and VoI suggest that AI's use of the IoT would concentrate primarily on QoI because VoI characteristics are essentially human characteristics that are difficult to capture computationally in open-domain problems. However, QoI characteristics may provide a basis for deriving a probabilistic VoI to be used in transactional (nonclairvoyant) estimates. Further, it may be feasible to learn over QoI values to bound a purely computational VoI. Future research, building on our work, may warrant additional consideration about how machine-learned QoI could translate to a functional VoI.

9.9 Conclusion

The IoT will provide a rich environment, supplying VoI for nearly every aspect of humans’ activities and environments. The IoT will gain ever-increasing amounts of AI that will only provide greater degrees of autonomic capabilities and self-star behaviors. This AI-enriched IoT environment will change the fundamental notions of information value for decision making by producing huge quantities of information that are managed by AI functionality. Like Shannon's information theories, our understanding of VoI theory will implicitly go beyond just a quantitative concept to include qualitative notions. However, there is surprisingly little literature that examines VoI in the context of the IoT. In this chapter, we have extended Howard's (1966) VoI theory to examine a generalization of that notion toward a guarantee of a minimal value.

We presented a rework of Howard's theoretical problem and solution, identifying some limitations in his treatment of a random variable relative to VoI. Howard's idea of clairvoyance, or insight into future information (and its value), treats the value of the random variable deterministically rather than probabilistically. By giving the random variable a probabilistic context, such as would be the case for the information provided by AI-enabled IoT, the theoretical handling of clairvoyance changes. We see, as did Howard, that knowledge about $L$ is more important than knowledge about $C$ when it comes to maximizing $E (V | b)$ . But we have shown that knowledge of $C$ in a bid guarantees that a bidder will never have a negative expected profit. Therefore, the VoI depends on what one is trying to do, or the contextual objective. This qualitative consideration must be kept in mind for future research on VoI. Existing work from the sensor network domain, specifically on QoI and VoI, may provide quantitative measures that can form a probabilistic derivative VoI.

We explained the relevance of our approach in this chapter's section on IoT and AI. We have taken the opportunity to adjust Howard's seminal theory to provide an extended foundation for VoI theory in the IoT. One must keep in mind that AI techniques, such as machine learning and artificial reasoning, when employed in the IoT for self-star system behaviors, will require additional consideration for managing information provided to a human or machine decision maker. While we continued with Howard's “market” context in this chapter for its explainability and theoretic continuity, our future work will examine the implications of the theoretical VoI guarantee described herein in an IoT-specific experimental simulation or empirical study that incorporates semantic notions of QoI.
