Chapter 16. Information Flow

 

BOTTOM: Masters, I am to discourse wonders: but ask me not what; for if I tell you, I am no true Athenian. I will tell you every thing, right as it fell out.

 
 --A Midsummer Night's Dream, IV, ii, 30–33.

Although access controls can constrain the rights of a user, they cannot constrain the flow of information about a system. In particular, when a system has a security policy regulating information flow, the system must ensure that the information flows do not violate the constraints of the policy. Both compile-time mechanisms and runtime mechanisms support the checking of information flows. Several systems implementing these mechanisms demonstrate their effectiveness.

Basics and Background

Information flow policies define the way information moves throughout a system. Typically, these policies are designed to preserve confidentiality of data or integrity of data. In the former, the policy's goal is to prevent information from flowing to a user not authorized to receive it. In the latter, information may flow only to processes that are no more trustworthy than the data.

Any confidentiality and integrity policy embodies an information flow policy.

Let x be a variable in a program. The notation x refers to the information flow class of x; context distinguishes a variable from its class.

Entropy-Based Analysis

We now define precisely the notion of information flow. Intuitively, information flows from an object x to an object y if the application of a sequence of commands c causes the information initially in x to affect the information in y. We use the notion of entropy, or uncertainty (see Chapter 32, “Entropy and Uncertainty”), to formalize this concept.

Let c be a sequence of commands taking a system from state s to another state t. Let x and y be objects in the system. We assume that x exists when the system is in state s and has the value xs. We require that y exist in state t and have the value yt. If y exists in state s, it has value ys.

  • Definition 16–1. The command sequence c causes a flow of information from x to y if H(xs | yt) < H(xs | ys). If y does not exist in s, then H(xs | ys) = H(xs).

This definition states that information flows from the variable x to the variable y if the value of y after the commands allows one to deduce information about the value of x before the commands were run.

This definition views information flow in terms of the information that the value of y allows one to deduce about the value in x. For example, the statement

y := x;

reveals the value of x in the initial state, so H(xs | yt) = 0 (because given the value yt, there is no uncertainty in the value of xs). The statement

y := x / z;

reveals some information about x, but not as much as the first statement.
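Definition 16–1 can be checked empirically. The sketch below (Python; the helper name is illustrative) assumes x is uniformly distributed over eight values and that y does not exist in state s, so H(xs | ys) = H(xs) = 3 bits; any conditional entropy below 3 bits indicates a flow.

```python
# Empirical check of Definition 16-1: compute H(x_s | y_t) from a
# uniform joint sample of (initial x, final y) pairs.
from collections import defaultdict
from math import log2

def cond_entropy(pairs):
    """H(X | Y) for a uniform joint sample of (x, y) pairs."""
    by_y = defaultdict(list)
    for x, y in pairs:
        by_y[y].append(x)
    n = len(pairs)
    h = 0.0
    for xs in by_y.values():
        p_y = len(xs) / n
        counts = defaultdict(int)
        for x in xs:
            counts[x] += 1
        h += p_y * -sum((c / len(xs)) * log2(c / len(xs))
                        for c in counts.values())
    return h

xs_vals = range(8)
copy = [(x, x) for x in xs_vals]          # y := x
parity = [(x, x % 2) for x in xs_vals]    # y := x mod 2 (partial flow)

print(cond_entropy(copy))    # 0.0: y reveals x completely
print(cond_entropy(parity))  # 2.0: one of three bits revealed
```

Both results lie below H(xs) = 3 bits, so information flows in both cases, but the full copy flows strictly more.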

The final result of the sequence c must reveal information about the initial value of x for information to flow. The sequence

tmp := x;
y := tmp;

has information flowing from x to y because the (unknown) value of x at the beginning of the sequence is revealed when the value of y is determined at the end of the sequence. However, no information flow occurs from tmp to x, because the initial value of tmp cannot be determined at the end of the sequence.

  • Definition 16–2. An implicit flow of information occurs when information flows from x to y without an explicit assignment of the form y := f(x), where f(x) is an arithmetic expression with the variable x.

The flow of information occurs, not because of an assignment of the value of x, but because of a flow of control based on the value of x. This demonstrates that analyzing programs for assignments to detect information flows is not enough. To detect all flows of information, implicit flows must be examined.
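A minimal Python sketch of such an implicit flow (the function name is illustrative): no statement assigns y from an expression in x, yet after the conditional y equals x.

```python
# Implicit flow: x is copied into y purely through control flow.
def leak_bit(x):
    y = 0
    if x == 1:   # the branch taken depends on x ...
        y = 1    # ... so this assignment implicitly carries x into y
    return y

print([leak_bit(x) for x in (0, 1)])  # [0, 1]: y equals x
```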

Information Flow Models and Mechanisms

An information flow policy is a security policy that describes the authorized paths along which that information can flow. Part 3, “Policy,” discussed several models of information flow, including the Bell-LaPadula Model, nonlattice and nontransitive models of information flow, and nondeducibility and noninterference. Each model associates a label, representing a security class, with information and with entities containing that information. Each model has rules about the conditions under which information can move throughout the system.

In this chapter, we use the notation x ≤ y to mean that information can flow from an element of class x to an element of class y. Equivalently, this says that information with a label placing it in class x can flow into class y.

Earlier chapters usually assumed that the models of information flow policies were lattices. We first consider nonlattice information flow policies and how their structures affect the analysis of information flow. We then turn to compiler-based information flow mechanisms and runtime mechanisms. We conclude with a look at flow controls in practice.

Nonlattice Information Flow Policies

Denning [267] identifies two requirements for information flow policies. Both are intuitive. Information should be able to flow freely among members of a single class, providing reflexivity. If members of one class can read information from a second class, they can save the information in objects belonging to the first class. Then, if members of a third class can read information from the first class, they can read the contents of those objects and, effectively, read information from the second class. This produces transitivity. The Bell-LaPadula Model exhibits both characteristics. For example, if Cathy dom Betty and Betty dom Anne, then Cathy dom Anne.

However, in some circumstances, transitivity is undesirable.

If information flow throughout a system is not transitive, then Denning's lattice model of information flow cannot represent the system. But such systems exist, as pointed out above. Moreover, even some transitive systems cannot be modeled by lattices.

We generalize the notion of a confidentiality policy. An information flow policy I is a triple I = (SCI, ≤I, joinI), where SCI is a set of security classes, ≤I is an ordering relation on the elements of SCI, and joinI combines two elements of SCI.

We now present a model of information flow that does not require transitivity and apply it to two cases in which the information flow relations do not form a lattice. In the first case, the relations are transitive; in the second, they are not.

Confinement Flow Model

Foley [362] presented a model of confinement flow. Assume that an object can change security classes; for example, a variable may take on the security classification of its data. Associate with each object x a security class x.

  • Definition 16–3. [362] The confinement flow model is a 4-tuple (I, O, confine, →) in which I = (SCI, ≤I, joinI) is a lattice-based information flow policy; O is a set of entities; →: O × O is a relation with (a, b) ∊ → if and only if information can flow from a to b; and, for each a ∊ O, confine(a) is a pair (aL, aU) ∊ SCI × SCI, with aL ≤I aU, and the interpretation that for a ∊ O, if x ≤I aU, information can flow from x to a, and if aL ≤I x, information can flow from a to x.

This means that aL is the lowest classification of information allowed to flow out of a, and aU is the highest classification of information allowed to flow into a.

The security requirement for an information flow model requires that if information can flow from a to b, then b dominates a under the ordering relation of the lattice. For the confinement flow model, this becomes

(∀ a, b ∊ O)[ a → b ⇒ aL ≤I bU ]

This model exhibits weak tranquility. It also binds intervals of security classes, rather than a single security class (as in the Bell-LaPadula Model). The lattice of security classes induces a second lattice on these intervals (see Exercise 2).
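The security requirement can be sketched directly. The fragment below (Python; the three-level chain Low < Mid < High and the entity names are assumptions, not from the text) checks (∀ a, b ∊ O)[ a → b ⇒ aL ≤I bU ] over an explicit list of flows.

```python
# Sketch of the confinement flow model's security requirement over a
# hypothetical three-level chain Low < Mid < High.
LEVELS = {"Low": 0, "Mid": 1, "High": 2}
leq = lambda a, b: LEVELS[a] <= LEVELS[b]

def secure(confine, flows):
    # (forall a, b)[ a -> b  implies  a_L <= b_U ]
    return all(leq(confine[a][0], confine[b][1]) for a, b in flows)

# confine maps each entity to its (lower, upper) interval of classes
confine = {"x": ("Low", "High"), "y": ("Mid", "Mid"), "z": ("High", "High")}
print(secure(confine, [("x", "z"), ("y", "z")]))  # True
print(secure(confine, [("z", "y")]))              # False: z_L = High > y_U = Mid
```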

Transitive Nonlattice Information Flow Policies

Consider a company in which line managers report income to two different superiors—a business manager and an auditor. The auditor and the business manager are independent. Thus, information flows from the workers to the line managers, and from the line managers to the business manager and the auditor. This model is reflexive (because information can flow freely among entities in the same class) and transitive (because information can flow from the workers to the business manager and auditor). However, there is no way to combine the auditor and the business manager, because there is no “superior” in this system. Hence, the information flow relations do not form a lattice. Figure 16-1 captures this situation.

  • Definition 16–4. A quasi-ordered set Q = (SQ, ≤Q) is a set SQ and a relation ≤Q defined on SQ such that the relation is both reflexive and transitive.


Figure 16-1. An example of a nonlattice information flow policy. Because the business manager and the auditor are independent, they have no least upper bound. Hence, the structure is not a lattice.

The company described above forms a quasi-ordered set. Handling the information flow now becomes a matter of defining a lattice that includes the quasi-ordered set. For all x ∊ SQ, let f(x) = { y | y ∊ SQ ∧ y ≤Q x }. Define the set SQP = { f(x) | x ∊ SQ } and the relation ≤QP = { (x, y) | x, y ∊ SQP ∧ x ⊆ y }. Then SQP is a partially ordered set under ≤QP. f preserves ordering, so x ≤Q y if and only if f(x) ≤QP f(y).

Denning [267] describes how to turn a partially ordered set into a lattice. Add the sets SQ and Ø to the set SQP. Define ub(x, y) = { z | z ∊ SQP ∧ x ⊆ z ∧ y ⊆ z } (here, ub stands for "upper bound"; it contains all sets containing all elements of both x and y). Then define lub(x, y) = ∩ ub(x, y). Define the lower bound lb(x, y) and the greatest lower bound glb(x, y) similarly. The structure (SQP ∪ { SQ, Ø }, ≤QP) is now a lattice.
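As a concrete sketch, the construction can be run on the company example of Figure 16-1 (Python; the class names are illustrative). f(x) collects the classes whose information may flow into x, and lub is the intersection of common upper bounds.

```python
# Sketch of Denning's construction on the quasi-ordered company example.
S = {"worker", "line_mgr", "biz_mgr", "auditor"}
# reflexive-transitive flow relation <=_Q as a set of (lower, upper) pairs
leq = {(x, x) for x in S} | {
    ("worker", "line_mgr"), ("worker", "biz_mgr"), ("worker", "auditor"),
    ("line_mgr", "biz_mgr"), ("line_mgr", "auditor"),
}

def f(x):
    """All classes whose information may flow into x."""
    return frozenset(y for y in S if (y, x) in leq)

SQP = {f(x) for x in S}                 # partially ordered by subset
full, empty = frozenset(S), frozenset()
lattice = SQP | {full, empty}           # add top (S_Q) and bottom (empty set)

def lub(a, b):                          # intersection of all upper bounds
    out = full
    for z in lattice:
        if a <= z and b <= z:
            out &= z
    return out

# the incomparable business manager and auditor now have a lub: the top
print(sorted(lub(f("biz_mgr"), f("auditor"))))
```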

At this point, the information flow policy simply emulates that of the containing lattice.

Nontransitive Information Flow Policies

Foley [362] has considered the problem of modeling nontransitive systems. He defines a procedure for building lattices from such systems. His procedure adds entities and relations to the model, but the procedure keeps the nontransitive relationships of the original entities and relations intact.

  • Definition 16–5. Let R = (SCR, ≤R, joinR) represent a reflexive information flow policy. A dual mapping (lR(x), hR(x)) maps R to an ordered set P = (SP , ≤P):

  • lR: SCR → SP with lR(x) = { x }

  • hR: SCR → SP with hR(x) = { y | y ∊ SCR ∧ y ≤R x }

The relation ≤P is the subset relation, and the elements of SP are subsets of SCR. The dual mapping is called order preserving if and only if

(∀ a, b ∊ SCR) [ a ≤R b ⇔ lR(a) ≤P hR(b) ]

The set SP formed by the dual mapping of a reflexive information flow policy is a (possibly improper) subset of the power set of SCR. It is a partially ordered set. Denning's procedure, discussed above, can transform this into a lattice. Hence, without loss of generality, we can assume that the set P = (SP , ≤P) is a lattice.

An order-preserving dual mapping preserves the ordering relation under the transformation. It also preserves nonorderings and hence nontransitivity. We now have:

  • Theorem 16–1. A dual mapping from a reflexive information flow policy R to an ordered set P is order-preserving.

  • Proof Let R = (SCR, ≤R, joinR) be an information flow policy and let P = (SP, ≤P) be an ordered set. Let (lR(x), hR(x)) be the dual mapping from R to P. Let a, b ∊ SCR.

    • (⇒) Let a ≤R b. By Definition 16–5, a ∊ lR(a) and a ∊ hR(b). Thus, lR(a) ⊆ hR(b), or lR(a) ≤P hR(b), as claimed.

    • (⇐) Let lR(a) ≤P hR(b). By Definition 16–5, lR(a) ⊆ hR(b). Because lR(a) = { a }, this means that a ∊ hR(b). Thus, a ∊ SCR and a ≤R b, as claimed.

    We can now interpret the information flow policy requirements. Let

  • confine(x) = [ xL, xU ]

and consider class y. Then information can flow from x to an element of y if and only if xL ≤R y, or lR(xL) ⊆ hR(y). Information can flow from an element of y to x if and only if y ≤R xU, or lR(y) ⊆ hR(xU).

Nonlattice policies can be embedded into lattices. Hence, analysis of information flows may proceed without loss of generality under the assumption that the information flow model is a lattice.

Compiler-Based Mechanisms

Compiler-based mechanisms check that information flows throughout a program are authorized. The mechanisms determine if the information flows in a program could violate a given information flow policy. This determination is not precise, in that secure paths of information flow may be marked as violating the policy; but it is secure, in that no unauthorized path along which information may flow will be undetected.

  • Definition 16–6. A set of statements is certified with respect to an information flow policy if the information flow within that set of statements does not violate the policy.

Declarations

For our discussion, we assume that the allowed flows are supplied to the checking mechanisms through some external means, such as from a file. The specifications of allowed flows involve security classes of language constructs. The program involves variables, so some language construct must relate variables to security classes. One way is to assign each variable to exactly one security class. We opt for a more liberal approach, in which the language constructs specify the set of classes from which information may flow into the variable. For example,

x: integer class { A, B }

states that x is an integer variable and that data from security classes A and B may flow into x. Note that the classes are statically, not dynamically, assigned. Viewing the security classes as a lattice, this means that x's class must be at least the least upper bound of classes A and B—that is, lub{A, B} ≤ x.

Two distinguished classes, Low and High, represent the greatest lower bound and least upper bound, respectively, of the lattice. All constants are of class Low.

Information can be passed into or out of a procedure through parameters. We classify parameters as input parameters (through which data is passed into the procedure), output parameters (through which data is passed out of the procedure), and input/output parameters (through which data is passed into and out of the procedure).

(* input parameters are named is; output parameters, os; *)
(* and input/output parameters, ios, with s a subscript *)
proc something(i1, ..., ik; var o1, ..., om, io1, ..., ion);
var l1, ..., lj;                  (* local variables *)
begin
         S;                       (* body of procedure *)
end;

The class of an input parameter is simply the class of the actual argument:

  • is: type class { is }

Because information can flow from any input parameter to any output parameter, the declaration must capture this:

  • os: type class { i1, ..., ik, io1, ..., ion }

(We implicitly assume that any output-only parameter is initialized in the procedure.) The input/output parameters are like output parameters, except that the initial value (as input) affects the allowed security classes:

  • ios: type class { i1, ..., ik, io1, ..., ion }

The declarations presented so far deal only with basic types, such as integers, characters, floating point numbers, and so forth. Nonscalar types, such as arrays, records (structures), and variant records (unions) also contain information. The rules for information flow classes for these data types are built on the scalar types.

Consider the array

a: array 1 .. 100 of int;

First, look at information flows out of an element a[i] of the array. In this case, information flows from a[i] and from i, the latter by virtue of the index indicating which element of the array to use. Information flows into a[i] affect only the value in a[i], and so do not affect the information in i. Thus, for information flows from a[i], the class involved is lub{ a[i], i }; for information flows into a[i], the class involved is a[i].
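In a lattice where integer labels stand in for classes and lub is max (an assumption for illustration), the array rules can be sketched in Python:

```python
# Sketch of the array rules, assuming integer security classes with
# lub = max. Reading a[i] involves both the element's class and the
# index's class; writing a[i] involves only the element's class.
def read_class(elem_cls, idx_cls):
    return max(elem_cls, idx_cls)   # lub{ a[i], i }

def write_class(elem_cls):
    return elem_cls                 # the index is unaffected by the write

print(read_class(1, 2), write_class(1))  # 2 1
```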

Program Statements

A program consists of several types of statements. Typically, they are

  1. Assignment statements

  2. Compound statements

  3. Conditional statements

  4. Iterative statements

  5. Goto statements

  6. Procedure calls

  7. Function calls

  8. Input/output statements.

We consider each of these types of statements separately, with two exceptions. Function calls can be modeled as procedure calls by treating the return value of the function as an output parameter of the procedure. Input/output statements can be modeled as assignment statements in which the value is assigned to (or assigned from) a file. Hence, we do not consider function calls and input/output statements separately.

Assignment Statements

An assignment statement has the form

y := f(x1, ..., xn)

where y and x1, ..., xn are variables and f is some function of those variables. Information flows from each of the xi's to y. Hence, the requirement for the information flow to be secure is

  • lub{x1, ..., xn} ≤ y

Compound Statements

A compound statement has the form

begin
     S1;
     ...
     Sn;
end;

where each of the Si's is a statement. If the information flow in each of the statements is secure, then the information flow in the compound statement is secure. Hence, the requirements for the information flow to be secure are

  • S1 secure

  • ...

  • Sn secure

Conditional Statements

A conditional statement has the form

if f(x1, ..., xn) then
     S1;
else
     S2;
end;

where x1, …, xn are variables and f is some (boolean) function of those variables. Either S1 or S2 may be executed, depending on the value of f, so both must be secure. As discussed earlier, the selection of either S1 or S2 imparts information about the values of the variables x1, ..., xn, so information must be able to flow from those variables to any targets of assignments in S1 and S2. This is possible if and only if the lowest class of the targets dominates the highest class of the variables x1, ..., xn. Thus, the requirements for the information flow to be secure are

  • S1 secure

  • S2 secure

  • lub{x1, ..., xn} ≤ glb{ y | y is the target of an assignment in S1 and S2 }

As a degenerate case, if statement S2 is empty, it is trivially secure and has no assignments.
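These checks are mechanical. A toy certifier for the conditional rule (Python; integer classes with lub = max and glb = min are an assumption) might look like:

```python
# Toy certifier for the conditional rule. cond_vars holds the classes
# of x1..xn; targets holds the classes of variables assigned in S1 or
# S2; branches_secure reports whether S1 and S2 certified recursively.
def certify_if(cond_vars, targets, branches_secure=True):
    if not targets:            # degenerate case: no assignments at all
        return branches_secure
    return branches_secure and max(cond_vars) <= min(targets)

print(certify_if([1, 2], [2, 3]))  # True:  lub = 2 <= glb = 2
print(certify_if([1, 3], [2, 3]))  # False: lub = 3 >  glb = 2
```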

Iterative Statements

An iterative statement has the form

while f(x1, ..., xn) do
      S;

where x1, ..., xn are variables and f is some (boolean) function of those variables. Aside from the repetition, this is a conditional statement, so the requirements for information flow to be secure for a conditional statement apply here.

To handle the repetition, first note that the number of repetitions causes information to flow only through assignments to variables in S. The number of repetitions is controlled by the values in the variables x1, ..., xn, so information flows from those variables to the targets of assignments in S—but this is detected by the requirements for information flow of conditional statements.

However, if the program never leaves the iterative statement, statements after the loop will never be executed. In this case, information has flowed from the variables x1, ..., xn by the absence of execution. Hence, secure information flow also requires that the loop terminate.

Thus, the requirements for the information flow to be secure are

  • Iterative statement terminates

  • S secure

  • lub{x1, ..., xn} ≤ glb{ y | y is the target of an assignment in S }

Goto Statements

A goto statement contains no assignments, so no explicit flows of information occur. Implicit flows may occur; analysis detects these flows.

  • Definition 16–7. A basic block is a sequence of statements in a program that has one entry point and one exit point.

Control within a basic block flows from the first line to the last. Analyzing the flow of control within a program is therefore equivalent to analyzing the flow of control among the program's basic blocks. Figure 16-3 shows the flow of control among the basic blocks of the body of the procedure transmatrix.


Figure 16-3. The control flow graph of the procedure transmatrix. The basic blocks are labeled b1 through b7. The conditions under which branches are taken are shown over the edges corresponding to the branches.

When a basic block has two exit paths, the block reveals information implicitly by the path along which control flows. When these paths converge later in the program, the (implicit) information flow derived from the exit path from the basic block becomes either explicit (through an assignment) or irrelevant. Hence, the class of the expression that causes a particular execution path to be selected affects the required classes of the blocks along the path up to the block at which the divergent paths converge.

  • Definition 16–8. An immediate forward dominator of a basic block b (written IFD(b)) is the first block that lies on all paths of execution that pass through b.

Computing the information flow requirement for the set of blocks along the path is now simply applying the logic for the conditional statement. Each block along the path is taken because of the value of an expression. Information flows from the variables of the expression into the set of variables assigned in the blocks. Let Bi be the set of blocks along an execution path from bi to IFD(bi), but excluding these endpoints. (See Exercise 6.) Let xi1, ..., xin be the set of variables in the expression that selects the execution path containing the blocks in Bi. The requirements for the program's information flows to be secure are

  • All statements in each basic block secure

  • lub{xi1, ..., xin} ≤ glb{ y | y is the target of an assignment in Bi }
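The immediate forward dominator can be computed on small graphs by brute force. The sketch below (Python; the diamond-shaped graph is hypothetical) enumerates all paths from a block to the exit and returns the first block that lies on every one of them.

```python
# Brute-force IFD(b): enumerate all acyclic paths from b to the exit
# block and take the first block common to every path. Fine for tiny
# control flow graphs; real compilers use dominator-tree algorithms.
def paths(g, src, dst, seen=()):
    if src == dst:
        return [[dst]]
    out = []
    for nxt in g.get(src, []):
        if nxt not in seen:
            out += [[src] + p for p in paths(g, nxt, dst, seen + (src,))]
    return out

def ifd(g, b, exit_block):
    all_paths = paths(g, b, exit_block)
    for node in all_paths[0][1:]:          # in execution order, past b
        if all(node in p for p in all_paths):
            return node

# diamond: b1 branches to b2 or b3, and both rejoin at b4
g = {"b1": ["b2", "b3"], "b2": ["b4"], "b3": ["b4"], "b4": []}
print(ifd(g, "b1", "b4"))  # b4
```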

Procedure Calls

A procedure call has the form

proc procname(i1, ..., im : int; var o1, ..., on : int);
begin
     S;
end;

where each of the ij's is an input parameter and each of the oj's is an input/output parameter. The information flow in the body S must be secure. As discussed earlier, information flow relationships may also exist between the input parameters and the output parameters. If so, these relationships are necessary for S to be secure. The actual parameters (those variables supplied in the call to the procedure) must also satisfy these relationships for the call to be secure. Let x1, ..., xm and y1, ..., yn be the actual input and input/output parameters, respectively. The requirements for the information flow to be secure are

  • S secure

  • For j = 1, ..., m and k = 1, ..., n, if ij → ok then xj ≤ yk

  • For j = 1, ..., n and k = 1, ..., n, if oj → ok then yj ≤ yk

Exceptions and Infinite Loops

Exceptions can cause information to flow.

If exceptions are handled explicitly, the compiler can detect problems such as this. Denning again supplies such a solution.

Denning also notes that infinite loops can cause information to flow in unexpected ways.

Concurrency

Of the many concurrency control mechanisms that are available, we choose to study information flow using semaphores [298]. Their operation is simple, and they can be used to express many higher-level constructs [148, 805]. The specific semaphore constructs are

wait(x): if x = 0 then block until x > 0; x := x - 1;
signal(x): x := x + 1;

where x is a semaphore. As usual, the wait and the signal are indivisible; once either one has started, no other instruction will execute until the wait or signal finishes.

Reitman and his colleagues [34, 836] point out that concurrent mechanisms add information flows when values common to multiple processes cause specific actions. For example, in the block

begin
     wait(sem);
     x := x + 1;
end;

the program blocks at the wait if sem is 0, and executes the next statement when sem is nonzero. The earlier certification requirement for compound statements is not sufficient because of the implied flow between sem and x. The certification requirements must take flows among local and shared variables (semaphores) into account.

Let the block be

begin
     S1;
     ...
     Sn;
end;

Assume that each of the statements S1, ..., Sn is certified. Semaphores in the signal do not affect information flow in the program in which the signal occurs, because the signal statement does not block. But following a wait statement, which may block, information implicitly flows from the semaphore in the wait to the targets of successive assignments.

Let statement Si be a wait statement, and let shared(Si) be the set of shared variables that are read (so information flows from them). Let g(Si) be the greatest lower bound of the targets of assignments following Si. A requirement that the block be secure is that lub{ shared(Si) } ≤ g(Si). Thus, the requirements for certification of a compound statement with concurrent constructs are

  • S1 secure

  • ...

  • Sn secure

  • For i = 1, ..., n [ lub{ shared(Si) } ≤ g(Si) ]
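A sketch of this check in Python (integer classes with lub = max and glb = min; the statement encoding is an assumption): after each wait, every shared variable read must be able to flow into all later assignment targets.

```python
# Toy check of the concurrency rule for a compound statement.
# stmts: list of ("wait", [classes of shared variables read]) or
#        ("assign", class of the assignment target).
def certify_concurrent(stmts):
    for i, (kind, val) in enumerate(stmts):
        if kind == "wait":
            later = [t for k, t in stmts[i + 1:] if k == "assign"]
            # lub of shared classes must be <= glb of later targets
            if later and max(val) > min(later):
                return False
    return True

ok = [("wait", [1]), ("assign", 2), ("assign", 1)]
bad = [("wait", [3]), ("assign", 2)]
print(certify_concurrent(ok), certify_concurrent(bad))  # True False
```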

Loops are handled similarly. The only difference is in the last requirement, because after completion of one iteration of the loop, control may return to the beginning of the loop. Hence, a semaphore may affect assignments that precede the wait statement in which the semaphore is used. This simplifies the last condition in the compound statement requirement considerably. Information must be able to flow from all shared variables named in the loop to the targets of all assignments. Let shared(Si) be the set of shared variables read, and let t1, ..., tm be the targets of assignments in the loop. Then the certification conditions for the iterative statement

while f(x1, ..., xn) do
     S;

are

  • Iterative statement terminates

  • S secure

  • lub{x1, ..., xn} ≤ glb{ t1, ..., tm }

  • lub{ shared(S1), ..., shared(Sn) } ≤ glb{ t1, ..., tm }

Finally, concurrent statements have no information flow among them per se. Any such flows occur because of semaphores and involve compound statements (discussed above). The certification conditions for the concurrent statement

cobegin
     S1;
     ...
     Sn;
coend;

are

  • S1 secure

  • ...

  • Sn secure

Soundness

Denning and Denning [274], Andrews and Reitman [34], and others build their argument for security on the intuition that combining secure information flows produces a secure information flow, for some security policy. However, they never formally prove this intuition. Volpano, Irvine, and Smith [1023] express the semantics of the information flow analysis described above as a set of types, and equate certification that a certain flow can occur to the correct use of types. In this context, checking for valid information flows is equivalent to checking that variable and expression types conform to the semantics imposed by the security policy.

Let x and y be two variables in the program. Let x's label dominate y's label. A set of information flow rules is sound if the value in x cannot affect the value in y during the execution of the program. (The astute reader will note that this is a form of noninterference; see Chapter 8.) Volpano, Irvine, and Smith use language-based techniques to prove that, given a type system equivalent to the certification rules discussed above, all programs without type errors have the noninterference property described above. Hence, the information flow certification rules of Denning and of Andrews and Reitman are sound.

Execution-Based Mechanisms

The goal of an execution-based mechanism is to prevent an information flow that violates policy. Checking the flow requirements of explicit flows achieves this result for statements involving explicit flows. Before the assignment

y := f(x1, ..., xn)

is executed, the execution-based mechanism verifies that

  • lub(x1, ..., xn) ≤ y

If the condition is true, the assignment proceeds. If not, it fails. A naïve approach, then, is to check information flow conditions whenever an explicit flow occurs.

Implicit flows complicate checking.

Fenton explored this problem using a special abstract machine.

Fenton's Data Mark Machine

Fenton [345] created an abstract machine called the Data Mark Machine to study handling of implicit flows at execution time. Each variable in this machine had an associated security class, or tag. Fenton also included a tag for the program counter (PC).

The inclusion of the PC allowed Fenton to treat implicit flows as explicit flows, because branches are merely assignments to the PC. He defined the semantics of the Data Mark Machine. In the following discussion, skip means that the instruction is not executed, push(x, x) means to push the variable x and its security class x onto the program stack, and pop(x, x) means to pop the top value and security class off the program stack and assign them to x and x, respectively.

Fenton defined five instructions. The relationships between execution of the instructions and the classes of the variables are as follows.

  1. The increment instruction

    x := x + 1
    

    is equivalent to

    if PC ≤ x then x := x + 1; else skip
    
  2. The conditional instruction

    if x = 0 then goto n else x := x – 1
    

    is equivalent to

    if x = 0 then { push(PC, PC); PC := lub(PC, x); PC := n; }
    else          { if PC ≤ x then { x := x – 1; } else skip }
    

    This branches, and pushes the PC and its security class onto the program stack. (As is customary, the PC is incremented so that when it is popped, the instruction following the if statement is executed.) This captures the PC containing information from x (specifically, that x is 0) while following the goto.

  3. The return

    return
    

    is equivalent to

    pop(PC, PC);
    

    This returns control to the statement following the last if statement. Because the flow of control would have arrived at this statement, the PC no longer contains information about x, and the old class can be restored.

  4. The branch instruction

    if' x = 0 then goto n else x := x – 1
    

    is equivalent to

    if x = 0 then { if x ≤ PC then { PC := n; } else skip }
    else          { if PC ≤ x then { x := x – 1; } else skip }
    

    This branches without saving the PC on the stack. If the branch occurs, the PC is in a higher security class than the conditional variable x, so adding information from x to the PC does not change the PC's security class.

  5. The halt instruction

    halt
    

    is equivalent to

    if program stack empty then halt execution
    

    The program stack being empty ensures that the user cannot obtain information by looking at the program stack after the program has halted (for example, to determine which if statement was last taken).
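The PC-class discipline can be sketched as a tiny interpreter (Python; the class name, method names, and the encoding of Low as 0 and High as 1 are all illustrative). An assignment is silently skipped unless the PC's class is dominated by the target's class, and reconverging restores the saved class, as in Fenton's return instruction.

```python
# Minimal sketch of Fenton-style run-time checking of implicit flows.
class DataMark:
    def __init__(self, classes):
        self.cls = dict(classes)   # variable -> security class
        self.pc_stack = []
        self.pc = 0                # security class of the PC (Low)

    def assign(self, var, value, env):
        if self.pc <= self.cls[var]:
            env[var] = value       # PC's class may flow into var
        # else: skip silently -- reporting the error would itself leak

    def branch_on(self, var):      # entering a conditional on var
        self.pc_stack.append(self.pc)
        self.pc = max(self.pc, self.cls[var])

    def ret(self):                 # paths reconverge: restore PC class
        self.pc = self.pc_stack.pop()

m = DataMark({"x": 1, "y": 0})     # x is High, y is Low
env = {"x": 1, "y": 0}
m.branch_on("x")                   # PC now carries High information
if env["x"] == 1:
    m.assign("y", 1, env)          # skipped: PC (High) > y's class (Low)
m.ret()
print(env["y"])  # 0: the implicit flow from x to y was blocked
```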

Fenton's machine handles errors by ignoring them. Suppose that, in the program above, y < x. Then at the fifth step, the certification check fails (because PC = x). So, the assignment is skipped, and at the end y = 0 regardless of the value of x. But if the machine reports errors, the error message informing the user of the failure of the certification check means that the program has attempted to execute step 6. It could do so only if it had taken the branch in step 2, meaning that z = 0. If z = 0, then the else branch of statement 1 could not have been taken, meaning that x = 0 initially.

To prevent this type of deduction, Fenton's machine continues executing in the face of errors, but ignores the statement that would cause the violation. This satisfies the requirements. Aborting the program, or creating an exception visible to the user, would also cause information to flow against policy.

The problem with reporting of errors is that a user with lower clearance than the information causing the error can deduce the information from knowing that there has been an error. If the error is logged in such a way that the entries in the log, and the action of logging, are visible only to those who have adequate clearance, then no violation of policy occurs. But if the clearance of the user is sufficiently high, then the user can see the error without a violation of policy. Thus, the error can be logged for the system administrator (or other appropriate user), even if it cannot be displayed to the user who is running the program. Similar comments apply to any exception action, such as abnormal termination.

Variable Classes

The classes of the variables in the examples above are fixed. Fenton's machine alters the class of the PC as the program runs. This suggests a notion of dynamic classes, wherein a variable can change its class. For explicit assignments, the change is straightforward. When the assignment

y := f(x1, ..., xn)

occurs, y's class is changed to lub(x1, …, xn). Again, implicit flows complicate matters.
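The explicit-assignment case can be sketched directly; the helper names below are illustrative, with classes again modeled as integers so that lub is simply the maximum.

```python
LOW, HIGH = 0, 1  # totally ordered lattice for illustration

def lub(*classes):
    return max(classes)

cls = {"x1": LOW, "x2": HIGH, "y": LOW}  # variable -> current class

def assign(target, operands):
    # Dynamic classes: y := f(x1, ..., xn) changes y's class to the
    # least upper bound of the operands' classes. Implicit flows through
    # conditionals are NOT captured by this rule.
    cls[target] = lub(*(cls[v] for v in operands))

assign("y", ["x1", "x2"])
assert cls["y"] == HIGH  # y raised to lub(Low, High) = High
```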

Fenton's Data Mark Machine would detect the violation (see Exercise 7).

Denning [266] suggests an alternative approach. She raises the class of the targets of assignments in the conditionals and verifies the information flow requirements, even when the branch is not taken. Her method would raise z to x in the third statement (even when the conditional is false). The certification check at the fourth statement then would fail, because lub(Low, z) = x and x ≤ y is false.

Denning ([269], p. 285) credits Lampson with another mechanism. Lampson suggested changing classes only when explicit flows occur. But all flows force certification checks. For example, when x = 0, the third statement sets z to Low and then verifies x ≤ z (which is true if and only if x = Low).
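Lampson's mechanism can be sketched for that single statement. This is a hypothetical fragment: the function name is invented, and the guard's class stands in for the PC class carried into the conditional.

```python
LOW, HIGH = 0, 1
cls = {"x": HIGH, "z": LOW}

def assign_const(target, pc_cls):
    # Explicit flow of a constant: the target's class drops to Low.
    cls[target] = LOW
    # Every flow still forces a certification check, so the implicit
    # flow from the conditional's guard must satisfy guard <= target.
    return pc_cls <= cls[target]

# Inside "if x = 0 then z := 1", the guard carries x's class.
ok = assign_const("z", pc_cls=cls["x"])
assert not ok  # fails when x = High, succeeds only when x = Low
```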

Example Information Flow Controls

Like the program-based information flow mechanisms discussed above, both special-purpose and general-purpose computer systems have information flow controls at the system level. File access controls, integrity controls, and other types of access controls are mechanisms that attempt to inhibit the flow of information within a system, or between systems.

The first example is a special-purpose computer that checks I/O operations between a host and a secondary storage unit. It can be easily adapted to other purposes. A mail guard for electronic mail moving between a classified network and an unclassified one follows. The goal of both mechanisms is to prevent the illicit flow of information from one system unit to another.

Security Pipeline Interface

Hoffman and Davis [477] propose adding a processor, called a security pipeline interface (SPI), between a host and a destination. Data that the host writes to the destination first goes through the SPI, which can analyze the data, alter it, or delete it. But the SPI does not have access to the host's internal memory; it can only operate on the data being output. Furthermore, the host has no control over the SPI. Hoffman and Davis note that SPIs could be linked into a series of SPIs, or be run in parallel.

They suggest that the SPI could check for corrupted programs. A host requests a file from the main disk. An SPI lies on the path between the disk and the host (see Figure 16-4). Associated with each file is a cryptographic checksum that is stored on a second disk connected to the first SPI. When the file reaches the first SPI, it computes the cryptographic checksum of the file and compares it with the checksum stored on the second disk. If the two match, it assumes that the file is uncorrupted. If not, the SPI requests a clean copy from the second disk, records the corruption in a log, and notifies the user, who can update the main disk.


Figure 16-4. Use of an SPI to check for corrupted files.
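The SPI's integrity check can be sketched as follows. The key, function names, and logging are assumptions for illustration; the essential point is that the checksum key and the stored checksums live with the SPI, outside the host's control.

```python
import hmac
import hashlib

KEY = b"spi-secret-key"  # held by the SPI, never by the host

def checksum(data: bytes) -> bytes:
    # Keyed cryptographic checksum of a file's contents.
    return hmac.new(KEY, data, hashlib.sha256).digest()

def spi_fetch(file_data: bytes, stored_sum: bytes, clean_copy: bytes) -> bytes:
    """Return the requested file, substituting the clean copy if corrupted."""
    if hmac.compare_digest(checksum(file_data), stored_sum):
        return file_data
    # Corruption detected: log it, notify the user, serve the clean copy.
    print("SPI: corrupted file detected; serving clean copy from second disk")
    return clean_copy

good = b"program binary"
stored = checksum(good)
assert spi_fetch(good, stored, good) == good          # checksum matches
assert spi_fetch(b"tampered", stored, good) == good   # clean copy substituted
```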

The information flow being restricted here is an integrity flow, rather than the confidentiality flow of the other examples. The inhibition is not to prevent the corrupt data from being seen, but to prevent the system from trusting it. This emphasizes that, although information flow is usually seen as a mechanism for maintaining confidentiality, its application in maintaining integrity is equally important.

Secure Network Server Mail Guard

Consider two networks, one of which has data classified SECRET[4] and the other of which is a public network. The authorities controlling the SECRET network need to allow electronic mail to go to the unclassified network. They do not want SECRET information to transit the unclassified network, of course. The Secure Network Server Mail Guard (SNSMG) [937] is a computer that sits between the two networks. It analyzes messages and, when needed, sanitizes or blocks them.

The SNSMG accepts messages from either network to be forwarded to the other. It then applies several filters to the message; the specific filters may depend on the source address, destination address, sender, recipient, and/or contents of the message. Examples of the functions of such filters are as follows.

  • Check that the sender of a message from the SECRET network is authorized to send messages to the unclassified network.

  • Scan any attachments to messages coming from the unclassified network to locate, and eliminate, any computer viruses.

  • Require all messages moving from the SECRET to the unclassified network to have a clearance label, and if the label is anything other than UNCLASS (unclassified), encipher the message before forwarding it to the unclassified network.

The SNSMG is a computer that runs two different message transfer agents (MTAs), one for the SECRET network and one for the unclassified network (see Figure 16-5). It uses an assured pipeline [785] to move messages from the MTA to the filter, and vice versa. In this pipeline, messages output from the SECRET network's MTA have type a, and messages output from the filters have a different type, type b. The unclassified network's MTA will accept as input only messages of type b. If a message somehow goes from the SECRET network's MTA to the unclassified network's MTA, the unclassified network's MTA will reject the message as being of the wrong type.


Figure 16-5. Secure Network Server Mail Guard. The SNSMG is processing a message from the SECRET network. The filters are part of a highly trusted system and perform checking and sanitizing of messages.
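The type discipline of the assured pipeline can be sketched with two message types. The type names a and b follow the text; the filter logic and class names are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TypeA:  # output of the SECRET network's MTA
    body: str

@dataclass(frozen=True)
class TypeB:  # output of the filters
    body: str

def filter_message(msg: TypeA) -> TypeB:
    # Checking and sanitizing happen here, before the type changes.
    return TypeB(msg.body.replace("SECRET", "[REDACTED]"))

def unclass_mta_accepts(msg) -> bool:
    # The unclassified network's MTA accepts only messages of type b.
    return isinstance(msg, TypeB)

raw = TypeA("SECRET plans attached")
assert not unclass_mta_accepts(raw)              # bypassing the filter fails
assert unclass_mta_accepts(filter_message(raw))  # filtered message accepted
```

A message that somehow reaches the unclassified MTA without passing through the filter is rejected as having the wrong type, which is the enforcement property the pipeline relies on.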

The SNSMG is an information flow enforcement mechanism. It ensures that information cannot flow from a higher security level to a lower one. It can perform other functions, such as restricting the flow of untrusted information from the unclassified network to the trusted, SECRET network. In this sense, the information flow is an integrity issue, not a confidentiality issue.

Summary

Two aspects of information flow are the amount of information flowing and the way in which it flows. Given the value of one variable, entropy measures the amount of information that one can deduce about a second variable. The flow can be explicit, as in the assignment of the value of one variable to another, or implicit, as in the antecedent of a conditional statement depending on the conditional expression.
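Definition 16-1 can be checked numerically for a simple case. In this sketch, x is a uniform bit and the command is y := x, so the final value of y determines the initial value of x exactly and H(xs | yt) = 0 < H(xs) = 1 bit.

```python
from collections import defaultdict
from math import log2

def cond_entropy(joint):
    # H(X | Y) computed from a joint distribution {(x, y): probability}.
    py = defaultdict(float)
    for (x, y), p in joint.items():
        py[y] += p
    return -sum(p * log2(p / py[y]) for (x, y), p in joint.items() if p > 0)

# x uniform over {0, 1}; the command y := x makes yt equal to xs.
joint = {(0, 0): 0.5, (1, 1): 0.5}
H_x = 1.0  # H(xs) for a uniform bit

assert cond_entropy(joint) == 0.0      # H(xs | yt) = 0
assert cond_entropy(joint) < H_x       # Definition 16-1: information flows
```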

Traditionally, models of information flow policies form lattices. Should the models not form lattices, they can be embedded in lattice structures. Hence, analysis of information flow assumes a lattice model.

A compiler-based mechanism assesses the flow of information in a program with respect to a given information flow policy. The mechanism either certifies that the program meets the policy or shows that it fails to meet the policy. It has been shown that if a set of statements meet the information flow policy, their combination (using higher-level language programming constructs) meets the information flow policy.

Execution-based mechanisms check flows at runtime. Unlike compiler-based mechanisms, execution-based mechanisms either allow the flow to occur (if the flow satisfies the information flow policy) or block it (if the flow violates the policy). Classifications of information may be static or dynamic.

Two example information flow control mechanisms, the Security Pipeline Interface and the Secure Network Server Mail Guard, provide information flow controls at the system level rather than at the program and program statement levels.

Research Issues

The declassification problem permeates information flow questions. The goal is to sanitize data so that it can move from a more confidential position to a less confidential one without revealing information that should be kept confidential. In the integrity sense, the goal is to accredit data as being more trustworthy than its current level. These problems arise in governmental and commercial systems. Augmenting existing models to handle this problem is complex, as suggested in Chapters 5 and 6.

Automated analysis of programs for information flows introduces problems of specification and proof. The primary problem is correct specification of the desired flows. Other problems include the user interface to such a tool (especially if the analysts are programmers and not experts in information flow or program proving methodologies); what assumptions are implicitly made; and how well the model captures the system being analyzed. In some cases, models introduce flows with no counterparts in the existing system. Detecting these flows is critical to a correct and meaningful analysis.

The cascade problem involves aggregation of authorized information flows to produce an unauthorized flow. It arises in networks of systems. The problem of removing such cascades is NP-complete. Efforts to approximate the solution must take into account the environment in which the problem arises.

Further Reading

The Decentralized Label Model [740] allows one to specify information flow policies on a per-entity basis. Formal models sometimes lead to reports of flows not present in the system; Eckmann [320] discusses these reports, as well as approaches to eliminating them. Guttmann draws lessons from the failure of an information flow analysis technique [430].

The cascade problem is identified in the Trusted Network Interpretation [286]. Numerous studies of this problem describe analyses and approaches [354, 709, 491]; the problem of correcting it with minimum cost is NP-complete [490].

Gendler-Fishman and Gudes [387] examine a compile-time flow control mechanism for object-oriented databases. McHugh and Good describe a flow analysis tool [678] for the language Gypsy. Greenwald et al. [424], Kocher [585], Sands [879], and Shore [919] discuss guards and other mechanisms for control of information flow.

A multithreaded environment adds to the complexity of constraints on information flow [935]. Some architectural characteristics can be used to enforce these constraints [514].

Exercises

1:

Revisit the example for x := y + z in Section 16.1.1. Assume that x does not exist in state s. Confirm that information flows from y and z to x by computing H(ys | xt), H(ys), H(zs | xt), and H(zs) and showing that H(ys | xt) < H(ys) and H(zs | xt) < H(zs).

2:

Let L = (SL, ≤L) be a lattice. Prove that the structure IL = (SIL, ≤IL), defined as follows, is a lattice.

  1. SIL = { [a, b] | a, b ∈ SL ∧ a ≤L b }

  2. ≤IL = { ([a1, b1], [a2, b2]) | a1 ≤L a2 ∧ b1 ≤L b2 }

  3. lubIL([a1, b1], [a2, b2]) = [lubL(a1, a2), lubL(b1, b2)]

  4. glbIL([a1, b1], [a2, b2]) = [glbL(a1, a2), glbL(b1, b2)]

3:

Prove or disprove that the set P formed by the dual mapping of a reflexive information flow policy (as discussed in Definition 16–5) is a lattice.

4:

Extend the semantics of the information flow security mechanism in Section 16.3.1 for records (structures).

5:

Why can we omit the requirement lub{ i, b[i] } ≤ a[i] from the requirements for secure information flow in the example for iterative statements (see Section 16.3.2.4)?

6:

In the flow certification requirement for the goto statement in Section 16.3.2.5, the set of blocks along an execution path from bi to IFD(bi) excludes these endpoints. Why are they excluded?

7:

Prove that Fenton's Data Mark Machine described in Section 16.4.1 would detect the violation of policy in the execution time certification of the copy procedure.

8:

Discuss how the Security Pipeline Interface in Section 16.5.1 can prevent information flows that violate a confidentiality model. (Hint: Think of scanning messages for confidential data and sanitizing or blocking that data.)



[1] From Denning [269], p. 306.

[2] From Denning [269], Figure 5.7, p. 290.

[3] From Denning [269], Figure 5.5, p. 285.

[4] For this example, assume that the network has only one category, which we omit.
