Chapter 9. Semantics: The Meaning of Language

You do not really understand something unless you can explain it to your grandmother.

—Albert Einstein
U.S. (German-born) physicist (1879–1955)

A language specification is not complete without a description of its intended meaning. You may think that the meaning of a certain mogram is clear, but when you try to express this meaning in some way, you will often find that this is not the case. Therefore, we need to address semantics, or the meaning of software languages.

9.1 Semantics Defined

Semantics is simply another word for meaning, so to define semantics, we have to define meaning. For that, we have to turn to philosophy.

9.1.1 Understanding Is Personal

In 1923 Ogden and Richards, two leading linguists and philosophers of their time, wrote a book called The Meaning of Meaning [Ogden and Richards 1923]. Their ideas on semantics and semiotics are still considered valuable. They state that there is a triangular relationship between ideas (concepts), the real world, and linguistic symbols like words, which is often referred to as the meaning triangle. An example can be found in Figure 9-1, which depicts the triangular relationship using the example of a larch tree.

Figure 9-1. Ogden and Richards’s meaning triangle

image

The meaning triangle explains that every person links a word (linguistic symbol) to a real-world item through a mental concept. Small children are very good at this, and as every parent can tell you, they keep testing their triangular relationships. Often, they point at a real-world item—for example, a car—and speak the word car, hoping that an adult present confirms their understanding. Likewise, a word can be linked to a real-world item via a concept. Imagine that you have already heard the word tricycle and have some understanding that it is a three-wheeled vehicle. Then, at the moment that you see one in real life or in a photo, you make the connection between the word and the real-world item.

An important insight of Ogden and Richards is that ideas, or concepts, exist only in a person’s mind. In fact, the authors state that ultimately, every person understands the world in his or her unique manner, one that no other person is able to copy, because the understanding of a new concept is fed by, and based on, all other concepts that a person has previously acquired. For instance, the concept tricycle can be explained as a three-wheeled bicycle only when you know what a bicycle is. A person who has no understanding of bicycles will still not grasp this concept. Thus, one concept builds on another. As no person has exactly the same knowledge as another person, each person’s understanding of the world will be unique. In fact, Ogden and Richards state that communication, which is understanding another’s linguistic utterances, is fundamentally crippled.

What we can learn from this for language specification is that understanding is personal; that is, each person creates with his or her own mind a mental concept that for this person represents the meaning of the utterance. In other words, the semantics of all mograms is subjective.

This conclusion is truly fundamental, so it is good to question it; there is no need to take the reasoning of Ogden and Richards for granted.

More, and influential, people have expressed more or less the same notion. Let’s start with some other philosophers, such as René Descartes (1596–1650), George Berkeley (1685–1753), and Immanuel Kant (1724–1804). In his funny but solid introduction to philosophy, Mark Rowlands uses simple words to explain that these three philosophers claim that “the only reality of which we can meaningfully speak is mental reality: the reality of experiences, ideas, thoughts, and other mental things” [Rowlands 2003].

Philosophy is not the only field in which these ideas of semantics have been used. Mathematician Luitzen Egbertus Jan Brouwer (1881–1966), who founded intuitionistic mathematics and modern topology and after whom the Brouwer fixed-point theorem is named, wrote that every mathematical proof is in fact a mental construction in the mind of a mathematician [van Dalen 2001]. The only purpose of the written form of the proof is to facilitate communication with other mathematicians. Physicist Albert Einstein (1879–1955) created his theory on the basis of a now-famous thought experiment, visualizing traveling alongside a beam of light. Even modern psychological theories, such as neurolinguistic programming (NLP), are based on the idea that an individual’s thoughts are a fundamental ingredient of one’s perception of the world. By changing one’s thoughts, a person can improve his or her attitudes and actions.

9.1.2 The Nature of a Semantics Description

Even if you are not convinced about the subjective nature of semantics, I hope you can find truth in the following. (Simply remove the word subjective if you want.)

Definition 9-1 (Semantics Description) A description of the semantics of a language L is a means to communicate a subjective understanding of the linguistic utterances of L to another person or persons.

A semantics description is included in a language specification so that you can communicate your understanding of the language to other persons. If you specify a language, you are probably its creator and want to get it used; therefore, it seems a good idea to tell other people about the meaning you intended the language to have.

An important consequence of this view is that computers are never part of the audience of a semantics description, because they do not have the capability of constructing their own mental reality. Semantics descriptions of software languages are intended for human consumption, even when they describe the actions of a computer when executing a mogram.

It is important to explain the intended meaning of a language as well as possible to other persons. Similar to a paper or a presentation, a semantics description should be adapted to its audience.

9.2 Semantics of Software Languages

The semantics of a software language describe what happens in a computer when a mogram of that language is executed. When executing a mogram, a computer is processing data. This means that a semantics description should explain (1) the data being processed, (2) the processes handling the data, and, last but certainly not least, (3) the relationship of these two with the mograms of the language. Together, we call parts 1 and 2 the runtime system and part 3 the semantic mapping. Note that some software languages address only data (for instance, entity-relationship diagrams), whereas others address only processes (for instance, Petri nets).

Applying the meaning triangle to the software domain gives us the following notions. The software domain’s real world is the real runtime system—the actual computer running the actual mogram. Its linguistic symbol is a mogram. The concept corner is represented by the description of the runtime system, which is the way you think about your computer and what it does (Figure 9-2). Note that the semantic mapping is the relationship between linguistic symbol and concept, or between mogram and description of the runtime system.

Figure 9-2. The meaning triangle applied to software

image

9.2.1 Ways to Describe Semantics

Because descriptions in natural language are always more or less ambiguous, a more formal way to describe semantics is preferred. There are at least four ways in which we can describe the semantics of a software language, other than in natural language. These are:

  1. Denotational, by constructing mathematical objects (called denotations, or meanings) that represent the meaning of the mogram.

  2. Pragmatic, that is by providing a tool that executes the mogram. This tool is often called a reference implementation.

  3. Translational, by translating the mogram into another language that is well understood.

  4. Operational, by describing how a valid mogram is interpreted as sequences of computational steps, often given in the form of a state transition system, which shows how the runtime system progresses from state to state.

Denotational

The most complex way to describe semantics is denotational. Figure 9-3 shows a simple (incomplete) example of a denotational semantics (see [Schmidt 1986, p. 68]). To understand denotational semantics, you have to understand the mathematical constructs that are being used, such as actors (a sort of object, as in object-oriented programming) and domains, a special sort of set. Thus, the intended audience for such a semantics description is limited to mathematicians and others familiar with this sort of thing.

Figure 9-3. An example of a denotational semantics

image

Pragmatic

On the other end of the scale is the pragmatic approach to semantics, which is the simplest way to describe semantics. The only way to get to know what a mogram means is by watching the reference implementation execute it. However, it is not always clear what a mogram does just by watching it execute. For instance, suppose that a simple traditional program checks whether the input, which must be a natural number, is larger than 3. You would have to execute this mogram many times before realizing that the outcome changes from false to true exactly when the input goes from 3 to 4.

But what if you have available the source code of a reference implementation? In that case, you would not only look at the output of the reference implementation but also investigate the code that produces the output. In fact, you are crossing over to the translational approach, because in this we assume that the reference implementation is coded in a language that you already know and understand. By investigating the source code of the reference implementation, you are able to understand the meaning of the mogram because it is given in another, already known language.
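The larger-than-3 example can be sketched in a few lines of Python. The function reference_implementation below is a hypothetical stand-in for a tool whose source you cannot read; the name probe and the range of inputs tried are assumptions made only for this illustration. The sketch shows that the meaning has to be inferred from a series of observations.

```python
def reference_implementation(n: int) -> bool:
    """Hypothetical opaque tool: we pretend that only its output is visible."""
    if n < 0:
        raise ValueError("input must be a natural number")
    return n > 3


def probe(inputs):
    """Execute the tool once per input and record what was observed."""
    return {n: reference_implementation(n) for n in inputs}


# Pragmatic semantics: run the mogram many times and look at the results.
observations = probe(range(7))

# The meaning is inferred, not read: find the last input that yields False.
last_false = max(n for n, outcome in observations.items() if not outcome)

print(observations)
print("largest input observed to give False:", last_false)
```

Seven executions are needed here before the boundary becomes visible, and even then nothing guarantees that the outcome stays true for inputs that were never tried. Reading the body of reference_implementation gives the answer at once, which is the crossover to the translational approach described above.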

Translational

A slightly less simple way to describe semantics is translational. Here the problem is to find the right target language for the translation. Do all people in the audience know the target language? Does it provide equivalent constructs for all concepts in the language? These questions are important. The answer to the second question can be especially problematic. How do you explain all the different words for snow in the Eskimo language of the Inuit when the target language of the translation is an African language, such as Swahili? Speakers of Swahili might not be familiar with any kind of snow.

Using a translational semantics, the two parts that form the runtime system—the data and the processes—are implicitly known by the audience because they are more or less the same in both languages. The semantics description is given through a relationship between two mograms. In terms of the meaning triangle, this means that, although the real-world and concept corner of the meaning triangle remain the same, the corner labeled linguistic symbol changes from a mogram in language A to a mogram in language B. This is illustrated in Figure 9-4, where the smaller triangle is the meaning triangle that is constructed for the English word Larch and the larger triangle is the existing one for the French word Mélèze. Because the semantic mapping already exists for the target language, the combination of the translation plus the existing semantic mapping defines the semantic mapping of the source language. Translational semantics can be specified using a model transformation from one language to another. The next chapter takes a closer look at translational semantics.

Figure 9-4. The meaning triangle and a translational semantics

image
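As a minimal sketch of the translational approach, assume a tiny source language of prefix expressions written as nested tuples, with Python expressions as the well-understood target language. Both the source language and the names OPERATORS, translate, and meaning are invented for this illustration and do not come from the chapter. The meaning of a source expression is obtained by combining the translation with the semantic mapping that Python already has.

```python
# Hypothetical source language: ("add", 1, ("mul", 2, 3)) and similar tuples.
# Target language: ordinary Python expressions, whose meaning is already known.
OPERATORS = {"add": "+", "sub": "-", "mul": "*"}


def translate(expr) -> str:
    """Translate a source-language expression into Python source text."""
    if isinstance(expr, int):
        return str(expr)
    op, left, right = expr
    return f"({translate(left)} {OPERATORS[op]} {translate(right)})"


def meaning(expr) -> int:
    """Translation plus the target's existing semantics gives the meaning."""
    return eval(translate(expr), {"__builtins__": {}})


source = ("add", 1, ("mul", 2, 3))
print(translate(source))
print(meaning(source))
```

Note that the sketch never describes data or processes itself; it relies entirely on the reader already knowing how Python evaluates arithmetic.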

Operational

At first glance, operational semantics looks very much like a translational approach. However, when you define an operational semantics, you cannot take the description of the runtime system for granted. You must describe it alongside the semantic mapping.

An example of the difference between translational and operational is a description of the meaning of the English phrase to cut down a tree (see Figure 9-5). You can either translate this phrase into another language, such as French (abattre un arbre), which is the translational approach, or make a comic strip showing all the successive phases of cutting down a tree, which is the operational approach. A video would also do nicely. Each individual frame in the video or picture in the comic strip would represent a state in the system, hence the fact that the word snapshot is often used. The series of states and the transitions from one state to another are called the state transition system.

Figure 9-5. Operational semantics as a series of snapshots

image

9.2.2 The “Best” Semantics

In my opinion, the best way to describe semantics to an audience of computer scientists is either translational or operational. If you can find a good target language, the translational approach is probably the best and is certainly the quickest. For instance, explaining the semantics of C# to a Java programmer is best done by translating the C# concepts to Java concepts, because the two languages are so alike. However, when you lack a good target language, the operational approach is better.

9.3 Operational Semantics Using Graphs

An operational semantics is a description of how a virtual, or abstract, machine reacts when it executes a mogram. This is how Edsger Dijkstra formulated his view on semantics in [Dijkstra 1961, p. 8]: “As the aim of a programming language is to describe processes, I regard the definition of its semantics as the design, the description of a machine that has as reaction to an arbitrary process description in this language the actual execution of this process.” Accordingly, I am going to describe an abstract machine for a very simplistic object-oriented programming language (Section 9.3.5). The formalism I use there is graphs and graph transformations because, according to the definitions in Chapter 5, a mogram is a labeled graph.

In the example object-oriented programming language, you can define only classes and operations, and the only thing an operation can do is create a number of objects and return the last. In other words, its abstract syntax model includes a metaclass called Class, one called Operation, and one called CreateExp, as depicted in Figure 9-6. The language is not a very useful one but is sufficient as an example. A more elaborate example can be found in Kastenberg et al. [2005 and 2006]. But before getting into the details of the example, you need to understand more about operational semantics.

Figure 9-6. ASM of example object-oriented language

image

9.3.1 Modeling Semantics

The fun thing about operational semantics, which also makes it complex to understand, is that we can model the abstract machine in the same way that we model our applications and our languages. All we have to do is create a model describing the data and processes that take part in the execution of the abstract machine. We could even implement the abstract machine using this model and thus provide a pragmatic semantics as well.

For instance, in our example object-oriented programming language, a single mogram might define a class House and a class Garage. Our intuition tells us that there can be several objects of class House and several objects of class Garage. These objects represent the meaning of the class definitions. But because our semantics description must hold for the complete language, the way to specify the values in our abstract machine is not by stating that there can be houses and garages but by specifying the fact that there can be objects. We relate these objects to the mogram by stating that each object must have a class, which is defined in the mogram. Thus, the model of the abstract machine contains a metaclass called Object, which is related to the metaclass Class in the abstract syntax model.

This example is depicted in Figure 9-7. The upper-left quadrant of the figure holds the metaclass from the abstract syntax model. The lower-left quadrant holds the elements in a certain mogram, instances of the abstract syntax model. The upper-right quadrant holds the model of the abstract machine, the semantic domain model.[1] Symmetry would then suggest that the lower-right quadrant holds the instances of the semantic domain model. It does, and these elements are part of the actual runtime system. There is a technical instance-of relationship between the model of the abstract machine and the elements in the runtime system. However, there is also a second instance-of relationship. The elements in the lower-right quadrant are also semantical instances of the House and Garage classes in our mogram. (For the enthusiastic reader: Compare this to the ontological and linguistic instance-of relationships described in Atkinson and Kühne [2003].)

[1] In Kastenberg et al. [2005], this is called the value graph.

Figure 9-7. Relationship between ASM, mogram, and semantic domain model

image
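The relationship between the metaclass Class and the metaclass Object can be sketched with two small Python classes. The field names and the list called runtime are assumptions for this illustration; only the names Class, Object, House, and Garage come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Class:
    """Abstract syntax metaclass: a class defined in a mogram."""
    name: str


@dataclass(eq=False)
class Object:
    """Semantic domain metaclass: a runtime value linked to its class."""
    cls: Class


# Lower-left quadrant: the elements of one particular mogram.
house, garage = Class("House"), Class("Garage")

# Lower-right quadrant: elements of the runtime system.
runtime = [Object(house), Object(house), Object(garage)]

for obj in runtime:
    # Technical instance-of: every element is an instance of Object.
    # Semantic instance-of: through its link, each element is a House or a Garage.
    print(type(obj).__name__, "->", obj.cls.name)
```

The model mentions neither houses nor garages, so the same two metaclasses serve every mogram of the language.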

9.3.2 Consequences of the Von Neumann Architecture

According to Section 9.2, an operational semantics must provide two things: (1) a description of the runtime system and (2) a mapping of this description to the language’s mograms. Furthermore, the runtime system breaks down into two parts: the data and the processes.

Unfortunately, these distinctions fail when it comes to an operational semantics. According to the long-standing Von Neumann architecture, both the program and the data on which the program acts are present in the computer and represented in a similar fashion. In our terminology, this means that the executing mogram is part of the runtime system. Furthermore, the links between the data on which the mogram acts and the mogram itself are part of the runtime system too. Thus, the semantic mapping is part of the description of the runtime system.

The runtime environment of our abstract machine holds more than the elements in the lower-right quadrant of Figure 9-7. Because of the Von Neumann architecture, the elements in the lower-left quadrant are included as well. Therefore, the description of the states in the runtime system not only consists of the semantic domain model but also includes the abstract syntax model.[2] Figure 9-8 shows the consequences of the Von Neumann architecture on the description of the abstract machine.

[2] In Kastenberg et al. [2005], this description is called the execution graph.

Figure 9-8. Runtime environment of abstract machine

image

Even the data and processes within the runtime system cannot be clearly separated. In every execution, some data concerns the processes, such as the process ID of the running process, a program counter, and so on. And surely, the processes are driven by the executing mogram; thus, the connections between data, processes, and executing mogram are very close indeed.

In the discussion here, I first specify the states in an execution. To again use the metaphor of the comic strip, a description of the states tells you what can be seen in a single snapshot. Next, I explain how the execution proceeds from one state to another, or from one snapshot to another.

9.3.3 States in the Abstract Machine

First, a description of the states of the abstract machine must be provided. Each state consists of data on which the mogram acts, data on the running processes, and the executing mogram.

The Semantic Data Model

The data on which the mogram acts will, in the following, be called the values. The mograms can help us to describe the values. Every mogram specifies data: for instance, in the form of tables in an entity-relationship diagram, classes in an object-oriented program, or elements in an XML schema. Looking at the mograms in a language, we can identify the data that the abstract machine must be able to handle.

However, the specification of the data in a mogram cannot be equal to a specification of the values. The specification of the data in a mogram is the thing that we need to give meaning. The specification of the values is the meaning of (part of) the mogram. We cannot specify the meaning of a mogram by saying that it means itself. Instead, we must specify the values independently and relate the mogram to this specification. Furthermore, we must build this relationship not on a per-mogram basis but for the complete language.

For example, take again our object-oriented programming language with a single mogram that defines a class House and a class Garage. The specification of the data in the mogram tells us that we have house and garage objects. Still, the semantics of the mogram is not House and Garage classes but something different: a model that specifies the house and garage objects, just as the mogram does. The big difference is that the semantic domain model is mogram independent and hence, the Object class is included in the semantic domain model. By the way, the part of the semantic domain model that concerns the values is often called the semantic data model.[3]

[3] In Kastenberg et al. [2005], this is called the value graph.

The Semantic Process Model

The second part of each state in the abstract machine is the information on the running processes. In each state, it must be clear which operation is executing, on what data it is executing, and what the next statement to be executed is. One could call this type of information the data on the processes.

Similar to the way that the data elements can be modeled in a semantic data model, the process information can be modeled in a semantic process model.[4] This model includes metaclasses that represent, for instance, the execution of an operation. In our example, there are only two types of processes: (1) the creation of an object and (2) the execution of an operation. The first is represented by the CreateExec class; the second, by the OperExec class.

[4] In Kastenberg et al. [2005], this is called the frame graph.

As in the semantic data model, the link to the abstract syntax is provided. The CreateExec class is linked to the metaclass CreateExp to represent that this process executes a certain expression. The OperExec is linked to the metaclass Operation to indicate the operation that is executing. Figure 9-9 shows how the semantic process model relates to the abstract syntax model and the mogram. The relationships are like those in the semantic data model.

Figure 9-9. Relationship between ASM, mogram, and semantic process model

image
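A semantic process model along these lines might look as follows in Python. The class names CreateExp, Operation, CreateExec, and OperExec are taken from the text; the fields done and position and the method next_statement are assumptions, because the figure's exact attributes are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CreateExp:
    """Abstract syntax: the expression 'create <class name>'."""
    class_name: str


@dataclass(frozen=True)
class Operation:
    """Abstract syntax: a named sequence of create expressions."""
    name: str
    body: Tuple[CreateExp, ...]


@dataclass
class CreateExec:
    """Process model: one execution of a create expression."""
    exp: CreateExp        # link to the abstract syntax
    done: bool = False    # assumed flag: has the object been created yet?


@dataclass
class OperExec:
    """Process model: one execution of an operation."""
    operation: Operation  # link to the abstract syntax
    position: int = 0     # assumed field: index of the next statement

    def next_statement(self) -> Optional[CreateExp]:
        body = self.operation.body
        return body[self.position] if self.position < len(body) else None


h_op = Operation("h_op", (CreateExp("House"), CreateExp("Garage")))
oper_exec = OperExec(h_op)
create_exec = CreateExec(oper_exec.next_statement())
print(oper_exec.operation.name, "is executing; next statement: create",
      create_exec.exp.class_name)
```

Each process element carries a link back to the element of the mogram that it executes, mirroring the link from Object to Class in the semantic data model.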

9.3.4 The Rules of Execution: Transitions

Next to a description of the states, a semantics description must provide information on how the abstract machine goes from one state to another. We call each of these steps a transition. A transition can be described by giving the situation before the transition, the start state, and the situation after the transition, the end state.

The start state specifies when the transition may occur. In other words, a transition is triggered by a specific constellation of the three constituting parts in the start state: the mogram, the values, and the processes. The end state specifies what exactly happens when this transition fires.

Both the semantic data model and the semantic process model are models as defined in Chapter 5. This means that both are graphs and that their instances—the parts in the abstract machine’s runtime environment—are graphs as well. Therefore, we can use graph transformations to specify the transitions. (See Background on Graph Transformation.)
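The idea of a transition as a pair of start state and end state can be sketched without a graph transformation tool. In the following Python fragment, a plain dictionary stands in for the state graph, and the names Rule, matches, rewrite, and step are assumptions for this illustration. The single rule shown is a simplified stand-in, not one of the rules from the figures.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Assumed state shape: the three constituting parts of a state.
State = Dict[str, list]


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[State], bool]   # describes the start state
    rewrite: Callable[[State], State]  # produces the end state


def step(state: State, rules: List[Rule]) -> Optional[Tuple[str, State]]:
    """Fire the first rule whose start state matches, if there is one."""
    for rule in rules:
        if rule.matches(state):
            return rule.name, rule.rewrite(state)
    return None


def has_pending_create(state: State) -> bool:
    return any(p.startswith("create ") for p in state["processes"])


def do_create(state: State) -> State:
    pending = next(p for p in state["processes"] if p.startswith("create "))
    remaining = list(state["processes"])
    remaining.remove(pending)
    return {
        "mogram": state["mogram"],  # the mogram part is never changed
        "values": state["values"] + [pending[len("create "):]],
        "processes": remaining,
    }


create_rule = Rule("create", has_pending_create, do_create)
start = {"mogram": ["class House"], "values": [], "processes": ["create House"]}
name, end = step(start, [create_rule])
print(name, end)
```

When no rule matches, step returns None, which corresponds to a state from which the abstract machine cannot proceed.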

9.3.5 Example: Simple Semantics

Figure 9-10 shows the semantic domain model for our example language. The model uses the classes from the abstract syntax model; therefore, these are shown as well. As explained earlier, the semantic domain contains instances of class Object, and the information on the running processes is maintained by the classes OperExec and CreateExec.

Figure 9-10. Semantic domain model of example object-oriented language

image

The execution rules are given by the transformation rules in Figure 9-11. For the sake of simplicity, not all rules are included. Rule 1 specifies the creation of a new object. When it is not yet associated with an Object instance, the CreateExec instance can be executed. Rule 2 specifies how the execution goes from one create expression to the next. This rule can be applied when the current create expression has been executed, that is, when the associated CreateExec is linked to an object. When the rule is executed, the OperExec instance moves to the next expression in its operation, and a new CreateExec instance is created. The old, fulfilled CreateExec instance is removed. Note that none of the rules may change instances of ASM metaclasses. These instances are fixed by the host graph, which is an abstract form of a mogram.

Figure 9-11. Example transformation rules that specify the dynamic semantics

image

In Figure 9-12 you find the execution of the following mogram, which is represented in the figure in situation 1. To simplify the picture, the order of the two create expressions in h_op is given by an edge between the two from the first to the next.

Figure 9-12. Execution of example mogram

image

class House {
    h_op {
        create House;
        create Garage;
    }
}
class Garage {
    g_op {
        create Garage;
    }
}


The rules that are applied to reach situations 2 and 3, respectively, are not given, but the first simply creates the first object; the second creates an OperExec instance. To go from situation 3 to 4, rule 1 is applied; from situation 4 to 5, rule 2; from situation 5 to 6, rule 1; from situation 6 to 7, rule 2.
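The run described above can be approximated in Python, with ordinary functions standing in for the graph transformation rules. Several things in this sketch are assumptions: the first object is taken to be a House, the OperExec is taken to execute h_op, and rule2 is allowed to fire after the last create expression, in which case it only removes the fulfilled CreateExec. The last assumption is made so that the sketch applies the rules in the same order as the text; the actual rules are those in the figure.

```python
from dataclasses import dataclass
from typing import List, Optional

# Abstract syntax of the example mogram; no rule ever modifies it.
MOGRAM = {
    "House": {"h_op": ["House", "Garage"]},
    "Garage": {"g_op": ["Garage"]},
}


@dataclass(eq=False)
class Object:
    cls: str


@dataclass(eq=False)
class CreateExec:
    class_name: str                  # stands for the link to a CreateExp
    result: Optional[Object] = None  # set once the object exists


@dataclass(eq=False)
class OperExec:
    body: List[str]                  # create expressions of the operation
    position: int
    current: Optional[CreateExec]


@dataclass
class State:
    objects: List[Object]
    oper_exec: OperExec


def rule1(state: State) -> bool:
    """Create an object for a CreateExec that has no object yet."""
    ce = state.oper_exec.current
    if ce is None or ce.result is not None:
        return False
    ce.result = Object(ce.class_name)
    state.objects.append(ce.result)
    return True


def rule2(state: State) -> bool:
    """Move on once the current CreateExec is linked to an object."""
    oe = state.oper_exec
    if oe.current is None or oe.current.result is None:
        return False
    oe.position += 1
    if oe.position < len(oe.body):
        oe.current = CreateExec(oe.body[oe.position])
    else:
        oe.current = None  # assumption: nothing is left to execute
    return True


def run(state: State) -> List[str]:
    """Apply rules until none matches; return the names of the rules fired."""
    trace = []
    while True:
        for name, rule in (("rule 1", rule1), ("rule 2", rule2)):
            if rule(state):
                trace.append(name)
                break
        else:
            return trace


body = MOGRAM["House"]["h_op"]
# Corresponds roughly to situation 3: one object and one OperExec exist.
state = State([Object("House")], OperExec(body, 0, CreateExec(body[0])))
trace = run(state)
print(trace)
print([obj.cls for obj in state.objects])
```

The trace lists the rules in the order in which they fired, and the final list of objects shows that executing h_op added one House and one Garage to the object that was already present.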

This manner of expressing what happens during the execution of a mogram is very detailed. You often need many transformation rules to specify the complete semantics of a language. On the bright side, this means that you get a very good understanding of the runtime behavior. The devil is in the details: In this semantics, all details have been considered and checked, so you can be pretty sure that they are correct.

9.4 Summary

Semantics of a language is a very personal thing. In this chapter I have defined semantics as follows.

• A description of the semantics of a language L is a means to communicate a subjective understanding of the linguistic utterances of L to another person or persons.

There are several ways to describe semantics:

• Denotational, that is, by constructing mathematical objects which represent the meaning of the mogram.

• Operational, that is, by describing how a valid mogram is interpreted as sequences of computational steps, which is often given in the form of a state transition system.

• Translational, that is, by translating the mogram into another language that is well understood.

• Pragmatic, that is, by providing a tool that executes the mogram. This tool is often called a reference implementation.

Semantics can best be described to an audience of computer scientists by either translational or operational means. If you can find a good target language, the translational approach is probably the best. If you lack a good target language, the operational approach is more appropriate.

An operational semantics describes how a virtual, or abstract, machine reacts when it executes a mogram. This description can be given by a combination of a metamodel specifying the semantic domain and a set of transformation rules that can be applied to the abstract form of a mogram.
