6.4 Static Scoping

Static scoping means that the declaration to which a reference is bound can be determined before run-time (i.e., statically) by examining the text of the program. Static scoping was introduced in ALGOL 60 and has been widely adopted by most programming languages. The most common instance of static scoping is lexical scoping, in which the scope of variable declaration is based on the program’s lexical layout. Lexical scoping and static scoping are not synonymous (Table 6.2). Examining the lexical layout of a program is one way to determine the scope of a declaration before run-time, but other strategies are also possible. In lexically scoped languages, the scope of a variable reference is the code constituting its static ancestors.

Table 6.2 Static Scoping Vis-à-Vis Dynamic Scoping

Static scoping

A reference is bound to a declaration before run-time, e.g., based on the spatial relationship of nested program blocks to each other, i.e., lexical scoping.

Dynamic scoping

A reference is bound to a declaration during run-time, e.g., based on the calling sequences of procedures on run-time call stack.

6.4.1 Lexical Scoping

To determine the declaration associated with a reference in a lexically scoped language, we must know that language’s scope rules. The scope rules of a language are semantic rules.

Scope Rule for λ-calculus: In Left parenthesis, lambda, left parenthesis, left angle bracket, identifier, right angle bracket, right parenthesis, left angle bracket, expression, right angle bracket., the occurrence of <identifier> is a declaration that binds all occurrences of that variable in <expression> unless some intervening declaration of the same variable occurs. (Friedman, Wand, and Haynes 2001, p. 29).

In discussing lexical scoping, to understand what intervening means in this rule, it is helpful to introduce the notion of a block. A block is a syntactic unit or group of cohesive lines of code for which the beginning and ending of the group are clearly demarcated—typically by lexemes such as curly braces (as in C). An example is if, left parenthesis, x greater than 1, right parenthesis, left curly brace, forward slash, asterisk, this is a block, asterisk, forward slash, right curly brace.. In Scheme, let expressions and functions are blocks. Lines 3–4 in the example in Section 6.3 define a block. A programming language whose programs are structured as series of blocks is a block-structured language. Blocks can be nested, meaning that they can contain other blocks. For instance, consider the following Scheme expression:

A set of seven code lines in a Scheme expression.
Description

This entire expression (lines 1–6) is a block, which contains a nested block (lines 2–6), which itself contains another block (lines 3–6), and so on. Lines 5–6 are the innermost block and lines 1–6 constitute the outermost block; lines 3–6 make up an intervening block. The spatial nesting of the blocks of a program is depicted in a lexical graph:

A flow diagram of a lexical graph consisting of the following blocks: lambda superscript, left parenthesis, x, right parenthesis, plus, lambda superscript, left parenthesis, a b, right parenthesis, plus, lambda superscript, left parenthesis, c a, right parenthesis, left parenthesis, plus a b x, right parenthesis.

Scheme, Python, Java, and C are block-structured languages; Prolog and Forth are not. Typically block-structured languages are primarily lexically scoped, as is the case for Scheme, Python, Java, and C.

A simple procedure can be used to determine the declaration to which a reference is bound. Start with the innermost block of the expression containing the reference and search within it for its declaration. If it is not found there, search the next block enclosing the one just searched. If the declaration is not found there, continue searching in this innermost-to-outermost fashion until a declaration is found. After searching the outermost block, if a declaration is not found, the variable reference is free (as opposed to bound).

Due to the scope rules of Scheme and the lexical layout of the program (i.e., the nesting of the expressions) that it relies upon, applying this procedure reveals that the reference to x in line 6 of the example Scheme expression previously is bound to the declaration of x on line 1. Neither the scope rule nor the procedure yields the scope of a declaration. The scope of a declaration is the region of the program within which references refer to the declaration. In this example, the scope of the declaration of x is lines 2–6.

The scope of the declaration of a on line 3, by contrast, is lines 4–5 rather than lines 4–6, because the inner declaration of a on line 5 shadows the outer declaration of a on line 3. The inner declaration of a on line 5 creates a scope hole on line 6, so that the scope of the declaration of a on line 3 is lines 4–5 and not lines 4–6. Thus, a declaration may shadow another and create a scope hole. For this reason, we now make a distinction between the visibility and scope of a declaration—though the two concepts are often used interchangeably. The visibility of a declaration in a program constitutes the regions of that program where references are bound to that declaration—this is the definition of scope given and used previously. Scope refers to the entire block of the program where the declaration is applicable. Thus, the scope of a declaration includes scope holes since the bindings still exist, but are hidden. The visibility of a declaration is a subset of the scope of that declaration and, therefore, is bounded by the scope. The visibility of a declaration is not always the entire body of a lambda expression owing to the possibility of holes. As an example, the following figure graphically depicts the declarations to which the references to a, b, and x are bound. Nesting of blocks progresses from left to right. On line 2, the declaration of a on line 3 is not in scope:

A flow diagram of a lexical graph consisting of the following blocks: lambda, left parenthesis, x, right parenthesis, plus, lambda, left parenthesis, a b, right parenthesis, plus, lambda, left parenthesis, c a, right parenthesis, left parenthesis, plus a b x, right parenthesis. Plus a b x leads to lambda c a, lambda a b, and lambda x.

Figure 6.1 depicts the run-time stack at the time the expression (+ a b x) is evaluated.

An illustration of a run-time call stack. The stack consists of the following blocks from bottom to top: lambda, left parenthesis, x, right parenthesis, plus, lambda, left parenthesis, a b, right parenthesis, plus, lambda, left parenthesis, c a, right parenthesis, left parenthesis, plus a b x, right parenthesis. Plus a b x leads to lambda c a, lambda a b, and lambda x. The top side of the stack is labeled, Top of the stack.

Figure 6.1 Run-time call stack at the time the expression (+ a b x) is evaluated. The arrows indicate to which declarations the references to a, b, and x are bound.

Design Guideline 6: Factor out Constant Parameters in Table 5.7 indicates that we should nest a letrec within a lambda only when the body of the letrec must know about arguments to the outer function. For instance, as recursion progresses in the reverse1 function, the list to be reversed changes (i.e., it gets smaller). In turn, in Section 5.9.3 we defined the reverse1 function (i.e., the lambda)in the body block of the letrec expression. For purposes of illustrating a scope hole, we will do the opposite here; that is, we will nest the letrec within the lambda. (We are not implying that this is an improvement over the other definition.)

A set of 11 code lines for illustrating a scope hole.
Description

Based on our knowledge of shadowing and scope holes, we know there is no need to use two different parameter names (e.g., l and lst) because the inner l shadows the outer l and creates a scope hole in the body of the inner lambda expression (which is the desired behavior). Thus, the definition of reverse1 can be written as follows, where all occurrences of the identifier lst in the prior definition are replaced with l:

A set of 10 code lines for defining reverse 1.
Description

A reference can either be local or nonlocal. A local reference is bound to a declaration in the set of declarations (e.g., the formal parameter list) associated with the innermost block in which that reference is contained. Sometimes that block is called the local block. Note that not all blocks have a set of declarations associated with them; an example is if, left parenthesis, a equals equals b, right parenthesis, left curly brace, c equals a plus b, semicolon d equals c plus 1, semicolon, right curly brace. in Java. The reference to a on line 6 in the expression given at the beginning of this section is a local reference with respect to the lambda block on lines 5–6, while the references to b and x on line 6 are nonlocal references with respect to that block. All of the nested blocks enclosing the innermost block containing the reference are sometimes referred to as ancestor blocks of that block. In a lexically scoped language, we search both the local and ancestor blocks to find the declaration to which a reference is bound.

Since we implement interpreters for languages in this text, we must cultivate the habit of thinking in a language-implementation fashion. Thinking in an implementation-oriented manner helps us understand how bindings can be hidden. We must determine the declaration to which a reference is bound so that we can determine the value bound to the identifier at that reference so that we can evaluate the expression containing that reference. This leads to the concept of an environment, which is a core element of any interpreter. Recall from Chapter 5 that a referencing environment is a set or mapping of name–value pairs that associates variable names (or symbols) with their current bindings at any point in a program in a programming language implementation. To summarize:

A set of two rules for summarizing scope and referencing environment.
Description

The set of declarations associated with the innermost block in which a reference is contained differs from the referencing environment, which is typically much larger because it contains bindings for nonlocal references, at the program point where that reference is made. For instance, the referencing environment at line 6 in the expression given at the beginning of this section is Left curly brace, left parenthesis, a, comma, 4, right parenthesis, comma, left parenthesis, b, comma, 2, right parenthesis, comma, left parenthesis, c, comma, 3, right parenthesis, comma, left parenthesis, x, comma, 5, right parenthesis, right curly brace. while the declarations associated with the innermost block containing line 6 is Left parenthesis, left parenthesis, c 3, right parenthesis, left parenthesis, a 4, right parenthesis, right parenthesis..

There are two perspectives from which we can study scope (i.e., the determination of the declaration to which a reference is bound): the programmer and the interpreter. The programmer, or a human, follows the innermost-to-outermost search process described previously. (Programmers typically do not think through the referencing environment.) Internally, that process is operationalized by the interpreter as a search of the environment. In turn, (static or dynamic) scoping (and the scope rules of a language) involves how and when the referencing environment is searched in the interpreter.

In a statically scoped language, that determination can be made before run-time (often by a human). In contrast, in a statically scoped, interpreted language, the interpreter makes that determination at run-time because that is the only time during which the interpreter is in operation. Thus, an interpreter progressively constructs a referencing environment for a computer program during execution.

While the specific structure of an environment is an implementation issue extraneous to the discussion at hand (though covered in Chapter 9), some cursory remarks are necessary. For now, we simply recognize that we want to represent and structure the environment in a manner that renders searching it efficient with respect to the scope rules of a language. Therefore, if the human process involves an innermost-to-outermost search, we would like to structure the environment so that bindings of the declarations of the innermost block are encountered before those in any ancestor block. One way to represent and structure an environment in this way is as a list of lists, where each list contains a list of name–value pairs representing bindings, and where the lists containing the bindings are ordered such that the bindings from the innermost block appear in the car position (the head) of the list and the declarations from the ancestor blocks constitute the cdr (the tail) of the list organized in innermost-to-outermost order. Using this structure, the referencing environment at line 6 is represented as Left parenthesis, left parenthesis, left parenthesis, c 3, right parenthesis, left parenthesis a 4, right parenthesis, right parenthesis, left parenthesis, left parenthesis, a 1, right parenthesis, left parenthesis, b 2, right parenthesis, right parenthesis, left parenthesis, left parenthesis x 5, right parenthesis, right parenthesis, right parenthesis.. These are the scoping semantics with which most of us are familiar. Representation options for the structure of an environment (e.g., flat list, nested list, tree) as well as how an environment is progressively constructed are the topic of Section 9.8.

Conceptual Exercises for Section 6.4

In each of the following two exercises, draw an arrow from each variable reference in the given λ-calculus expression to the declaration to which it is bound.

Exercise 6.4.1

A set of six code lines with a lambda-calculus expression for drawing an arrow from each variable reference.
Description

Exercise 6.4.2

A set of five code lines with a lambda-calculus expression for drawing an arrow from each variable reference.
Description

Exercise 6.4.3 In programming languages that do not require the programmer to declare variables (e.g., Python), there is often no distinction between the declaration of a variable and the first reference to it without the use of qualifier. (Sometimes this concept is called manifest typing or implicit typing.) For instance, in the following Python program, is line 3 a reference to the declaration of an x on line 1 or a (new) declaration itself?

A set of four code lines in Python with two declarations.
Description

(See Appendix A for an introduction to the Python programming language.) The following program suffers from a similar ambiguity. Is line 4 a reference bound to the declaration on line 2 or does it introduce a new declaration that shadows the declaration on line 2?

A set of four code lines in Python with two declarations.
Description
Continuation of the code with two declarations in Python with three code lines.
Description

Investigate the semantics of the keywords global and nonlocal in Python. How do they address the problem of discerning whether a line of code is a declaration or a reference? What are the semantics of global x? What are the semantics of nonlocal x?

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
13.59.243.70