© Hillel Wayne 2018
Hillel WaynePractical TLA+https://doi.org/10.1007/978-1-4842-3829-5_2

2. PlusCal

Hillel Wayne1 
(1)
Chicago, Illinois, USA
 

Introduction

In this chapter we’ll be introducing PlusCal. PlusCal is a language that compiles down to TLA+. Lamport developed it in 2009 to make TLA+ more accessible to programmers. Most of the things we’ll want to do will be significantly easier in PlusCal than in TLA+. This chapter will cover all of PlusCal with the exception of multiprocess algorithms, which is Chapter 5; and fair processes, which is Chapter 6.

Specifications

Layout of a Spec

To start, let’s take one of the examples of the bank transfer from the last chapter.
---- (1) MODULE wire (2) ----
EXTENDS Integers * (3)
(*--algorithm wire * (4)
    variables (5)
        people = {"alice", "bob"},
        acc = [alice |-> 5, bob |-> 5];
begin * (6)
    skip;
end algorithm;*) (4)
==== * (1)
All the specs we write will have this form.
  1. (1)

    All TLA+ specs must start with at least four  on each side of the MODULE and four = at the end. This is for backwards compatibility reasons. Everything outside these two boundaries is ignored, and people often put metadata there.

     
  2. (2)

    The module name must be the same as the filename.

     
  3. (3)

    EXTENDS is the import keyword. We’re importing the Integers module.

     
  4. (4)

    * starts a line comment in TLA+, (* ... *) is a block comment. PlusCal specs are placed in comments (so the parser ignores it), are started with --algorithm <name>, and closed with end algorithm. The name of the algorithm does not have to match the filename.

     
  5. (5)

    Inside the algorithm , we initialize variables with variables. Variables are separated by commas or semicolons.

     
  6. (6)

    This is where we write the algorithm itself.

     

Note

In the tutorial we had Labels too. They’re only necessary for concurrent algorithms, so for now we leave them out and will cover all of the uses for them in Chapter 5.

Expressions

Everything in an expression is either a value, like {TRUE}, or an operator, like +. In the next chapter we will be writing our own operators, but for this one we will only use the ones provided by the standard library.

Since we’re going to be working with a lot of expressions, we will need some way of evaluating them without running an entire spec. To do that, we can use the expression evaluator. Go to a model (such as Model_1 in your spec) and go to the Model Checking Results. If you put an expression into “Evaluate Constant Expression” and run the model, it will output the result in the Value box as shown in Figure 2-1.
../images/462201_1_En_2_Chapter/462201_1_En_2_Fig1_HTML.jpg
Figure 2-1

Evaluate Constant Expression

Tip

Checking an expression will also evaluate your spec. If you don’t want to run it, you can switch it to “No Behavior Spec” under Model Overview > What is the behavior Spec?

From now on we will use the following format to mean “Evaluate Constant Expression” and the result:
>> Expression
Result
For example, we would write the above screenshot as:
>> 1 + 2
3

Values

There are four kinds of basic values in TLA+: strings, integers, Booleans, and model values. Floats are not supported, Boolean values are written TRUE and FALSE, and model values will be introduced in Chapter 4. Strings must be written in double quotes, and cannot be written with single quotes. The standard operations are:

Operator

Meaning

Example

x = y

Equals

>> 1 = 2FALSE

x /= yx # y

Not Equals

>> 1 /= 2TRUE

x / y

And

>> TRUE / FALSEFALSE

x / y

Or

>> TRUE / FALSEFALSE

x := y

Assignment

N/A [PlusCal only]

~x

Not

>> ~TRUEFALSE

= VS :=

If = is equality and := is assignment, how come we write variables x = 1 and not variables x := 1? In raw TLA+, there is no assignment, only equality. If you want to initialize x to 1, you write x = 1. If x is initialized and you want to compare x to 1, you write x = 1. If x is initialized and you want to assign it to 1, you write x' = 1. In TLA+, these are all actually equality checks! While this might seem unintuitive, it’s all part of the underlying way TLA+ treats time.

Here’s a rule of thumb: if it’s the first time you’re using the variable, = is initialization. Every other time, = is equality and := is assignment. If you write variables x = 2, y = x, z = (x = y) you will get x = 2, y = 2, z = TRUE. By the time we reach z we’ve already initialized x and y, so (x = y) is an equality check. If this is a little confusing, that’s understandable, and you’ll build an intuition with some experience.

If we EXTENDS Integers , we also get arithmetic. +, -, %, and * all behave as you expect them to. Integer division is div, while decimal division is unsupported. You also have the range operator .., where a..b is the set {a, a+1, a+2, ..., b}.
>> 1..3
{1, 2, 3}

There are four kinds of constructed types in TLA+: sets, tuples/sequences, structures, and functions. Functions are covered in the next chapter. We can cover the basics of the rest here.

Sets are unordered collections of elements. They are specified with curly braces. For example, we can say set = {"a", "b", "c"}. All elements in the set must have the same type, but there are no restrictions beyond that. You can have sets of sets of sets of functions if you’d like. Sets have the following operators:

Operator

Meaning

Example

x in set

Is element of set

>> 1 in 1..2TRUE

x otin set

~(x in set)

Is not element of set

>> 1 otin 1..2FALSE

set1 subseteq set2

Is subset of set

>> {1, 2} subseteq {1, 2, 3}TRUE

set1 union set2

Set Union

>> (1..2) union (2..3){1, 2, 3}

set1 intersect set2

Set Intersection

>> (1..2) intersect (2..3){2}

set1 set2

Set Difference

>> (1..2) (2..3){1}

Cardinality(set)

Number of elements in set (requires EXTENDS FiniteSets)

>> Cardinality({"a", "b"}) 2

Note

When reading specs, you might come across union written as cup and intersect as cap. This is a visual match of the mathematical symbols, ∪ and ∩. We’re using union/intersect because it’s more explicit.

Sets also have two special set transformations. We filter sets with {x in set: conditional} and map sets with {expression: x in set}.
>> {x in 1..2: x < 2}
{1}
>> {x * 2: x in 1..2}
{2, 4}

Note

It’s very rare to need to write something like {x in set1 : y in set2}, but it (again, rarely) does happen. In this edge case, it’s treated as a filter on set1.

Tuples or Sequences (both words are common) are ordered collections of elements, with the index starting at 1. They are specified with << and >>, and they do not need to be the same element type. If you write tup = <<"a", {1, 2}>>, then tup[1] = "a" and tup[2] = {1, 2}. Again, this is 1-indexed. If we EXTENDS Sequences, we get some additional sequence operators:

Operator

Meaning

Example

Head(sequence)

Head

>> Head(<<1, 2>>)1

Tail(seq)

Tail

>> Tail(<<1, 2, 3>>) <<2, 3>>

Append(seq, x)

Append

>> Append(<<1, 2>>, 3)

<<1, 2, 3>>

seq1 o seq2

Combine

>> <<1>> o <<2, 3>> <<1, 2, 3>>

Len(seq)

Length of sequence

>> <<1, 1, 1, 1>>

4

While the terms are interchangeable, by convention we use tuple when we don’t expect to use sequence operators on it or change its length, and we use sequence if we do.

Structures (or structs) map strings to values. You write them as [ key1 |-> val1, key2 |-> val2, etc ]. The values do not have to be the same type. You get the value with struct.key.
>> [a |-> 1, b |-> <<1, {}>>].b
<<1, {}>>

Note

While the syntax for the two is different, structures and sequences are the same data type! We’ll learn more about this when we cover functions in the next chapter.

Sets, sequences, and structures can be assigned to variables. The following is a valid PlusCal algorithm:
(*--algorithm example
variables x = <<1, [a |-> {}]>>;
begin
  x[2].a := x[2].a union {2};
end algorithm; *)

Here we modified a set inside a struct inside a tuple/sequence.

That covers the basics of data types. Now let’s quickly run through some basic syntax of the PlusCal algorithm body. Most of this should be familiar from programming languages, so we won’t go into too much detail.

PlusCal Algorithm Body

Assignment

Assign an existing variable to a different value. Done with :=.

assert

An assertion. assert TRUE does nothing. assert FALSE raises an error. Adding assertions is one common way we test invariants: the assertion checks that in that step a given expression holds, so if it fails our spec broke the invariant. In order to use assertions, you need to add EXTENDS TLC.

skip

A no-op. We can use this to fill represent parts of the spec that we haven’t filled out yet or conditionals that don’t update anything.

if

if condition1 then
  body
elsif condition2 then
  body
else
  body
end if;

While if is the only conditional in PlusCal, it is not the only branching statement. Two others, either and with, will be introduced later in this chapter.

while

While loops are the only form of loops in PlusCal:
while condition do
  body
end while;

Note that we use do here, while for the if statement we use then.

Macros

To clean up specs a little, we can add macros before the begin.
macro name(arg1, arg2) begin
  * assignments
end macro;
begin
  name(x, y);
end algorithm;
You can place assignments, assertions, and if statements in macros, but not while loops. You also cannot assign to any variable more than once. You can refer to outside values in the macro , and you can assign to outside variables. For example, the following is a spec error:
EXTENDS TLC
(*--algorithm example
variables x = TRUE;
macro set_false() begin
  x := FALSE;
end macro;
begin
  set_false();
  assert x;
end algorithm; *)

While set_false doesn’t take x as a parameter, it’s still able to change the variable.

Example

Let’s design a sorting machine! An abstract one so that we can ignore the hardware details. Imagine we have a machine that sorts material into “recyclable” and “trash.” It has finite space for both recycling and trash. Items with a specified size and type come in, one at a time, and it sorts them according to the following rules:
  • If the item is labeled as “recycling” and it is under the remaining capacity for the recycling bin, the item goes into recycling.

  • If the item is labeled as “trash” OR the item is labeled as “recycling” and there is not enough recycling capacity AND there is sufficient capacity in the trash bin, the item goes into trash.

  • Otherwise, it’s dropped on the floor and somebody else gets to sweep it up.

Let’s start by thinking about our representations. The capacities of the two bins can be represented by numbers. The items can be represented by a structure with a key for size and a key for type: an item might look like [type |-> "trash", size |-> 2]. We can represent the numbers in each bin with sets. Finally, we can represent the order that items come in as a sequence, such as <<item1, item2, item3>>.

Next, we should figure out what our invariants are. We don’t have the tools yet to inspect properties on sets, so we can start by just checking that we don’t go over the capacity limit for either bin, and that each bin has the appropriate amount of items in it.

Here’s one way of writing the spec:
EXTENDS Sequences, Integers, TLC, FiniteSets
(*--algorithm recycler
variables
    capacity = [trash |-> 10, recycle |-> 10],
    bins = [trash |-> {}, recycle |-> {}],
    count = [trash |-> 0, recycle |-> 0],
    items = <<
        [type |-> "recycle", size |-> 5],
        [type |-> "trash", size |-> 5],
        [type |-> "recycle", size |-> 4],
        [type |-> "recycle", size |-> 3]
    >>,
    curr = ""; * helper: current item
begin
    while items /= <<>> do
        curr := Head(items);
        items := Tail(items);
        if curr.type = "recycle" / curr.size < capacity.recycle then
            bins.recycle := bins.recycle union {curr};
            capacity.recycle := capacity.recycle - curr.size;
            count.recycle := count.recycle + 1;
        elsif curr.size < capacity.trash then
            bins.trash := bins.trash union {curr};
            capacity.trash := capacity.trash - curr.size;
            count.trash := count.trash + 1;
        end if;
     end while;
     assert capacity.trash >= 0 / capacity.recycle >= 0;
     assert Cardinality(bins.trash) = count.trash;
     assert Cardinality(bins.recycle) = count.recycle;
end algorithm; *)
Confirm this works (remember to compile the PlusCal to TLA+), has 7 states, and has no errors. I don’t like the duplication in those two if statements, so let’s add a macro.
macro add_item(type) begin
  bins[type] := bins[type] union {curr};
  capacity[type] := capacity[type] - curr.size;
  count[type] := count[type] + 1;
end macro;
begin
    while items /= <<>> do
        curr := Head(items);
        items := Tail(items);
        if curr.type = "recycle" / curr.size < capacity.recycle then
            add_item("recycle");
        elsif curr.size < capacity.trash then
            add_item("trash");
        end if;
        * rest is same

We replaced the bodies of the two conditions branches with calls to add_item. Confirm again that this works.

Complex Behaviors

We now know how to write a very simple spec, but what we have is barely more interesting than a deterministic unit test. If we want to make this useful, we need a way to check not just one setup, but an entire space of setups and runtime occurrences. There are three basic ways to do this.

Multiple Starting States

We initialize variables with =. But we can also initialize them with in. If we write x in set, all that means is that x is any possible element in the set. For example, if we had
(*--algorithm in
variables x in 1..3;
begin
    assert x <= 2;
end algorithm; *)

TLC would first try running the whole algorithm with x = 1, then x = 2, then finally x = 3, which fails. If we added a second variable y that also used in, TLC would check every single possible combination of x and y.

Tip

TLA+ defines a shorthand BOOLEAN for the set {TRUE, FALSE}. This can be useful if you have a flag variable, such as variable is_ready in BOOLEAN.

We can use this to choose some arbitrary number. What about arbitrary sets, structures, and tuples? We have some special operators to generalize them.

First of all, for a given set, SUBSET set is the power set, or the set of all subsets. We reverse this with UNION set, which combines a set-of-sets back into one. UNION {set1, set2, ... setn} is equivalent to writing set1 union set2 union ... union setn.
>> SUBSET {"a", "b"}
{{}, {"a"}, {"b"}, {"a", "b"}}
>> UNION {{"a"}, {"b"}, {"b", "c"}
{"a", "b", "c"} 
Given two sets, set1 X set2 is the set of all tuples where the first element is in set1 and the second element is in set2.
>> {"a", "b", "c"} X (1..2)
{<<"a", 1>>, <<"a", 2>>, <<"b", 1>>, <<"b", 2>>, <<"c", 1>>, <<"c", 2>>}
Note that X is not associative. A X B X C is a set of triplets, while (A X B) X C is a pair where the first element is also a pair, and A X (B X C) is a pair where the second element is also a pair.
>> <<1, 2, 3>> in (1..3) X (1..3) X (1..3)
TRUE
>> <<1, 2, 3>> in (1..3) X ((1..3) X (1..3))
FALSE
Finally, to generate a set of structures, we use a different syntax. Instead of writing [key |-> val], we write [key: set]. Then if x in [key: set], x is a structure where the value of key is some element of set.
>> [a: {"a", "b"}]
{[a |-> "a"], [a |-> "b"]}
>> [a: {"a", "b"}, b: (1..2)]
{ [a |-> "a", b |-> 1], [a |-> "a", b |-> 2], [a |-> "b", b |-> 1], [a |-> "b", b |-> 2] }

Tip

Sometimes you want a structure where one key is always a specific value, but another key is some value in a set. In that case you can wrap the value in a one-element set, as in [key1: set, key2: {value}].

As with everything else, all of these can be freely mixed and matched. We can write variable x in [key: (set1 X set2)] to mean “x is a structure where the value of key is some pair of elements, the first being in set1, the second being in set2.” We can use this to detail complex data structures in our specifications. In particular, we can use this to detail complex starting states that break our spec.

Example

We’ll rewrite our recycler example to have arbitrary capacities and arbitrary items.
variables
    capacity in [trash: 1..10, recycle: 1..10],
    bins = [trash |-> {}, recycle |-> {}],
    count = [trash |-> 0, recycle |-> 0],
    item = [type: {"trash", "recycle"}, size: 1..6],
    items in item X item X item X item,
    curr = ""; * helper: current item
    * rest is same
To make it cleaner, we added a helper item. items can be defined in terms of item and will be a four-element sequence of them. When you rerun this, you should notice two things:
  1. 1.

    Checking the model takes longer. Before, we had one possible starting state. Now we have 10 × 10 × (2 × 6)4 = 2,073,600 starting states. TLC will be clever about checking them, but optimizing your models is an important skill you’ll develop.

     
  2. 2.

    Our spec failed. The TLC error will look something like this:

     
TLC threw an unexpected exception.
      This was probably caused by an error in the spec or model.
      See the User Output or TLC Console for clues to what happened.
      The exception was a tlc2.tool.EvalException
      ...
      The first argument of Assert evaluated to FALSE; the second argument was:

This means that the spec failed because one of our asserts failed. The exact values of the failure you get will probably be different run to run, but it will be the same core problem. Sets are unique, and {x} union {x} = {x}, not {x, x}. If we handle two items with the exact same type and size, we end up storing it once but increasing count twice. Then the size of the set and the value of count don’t match up.

Ultimately, the problem is in the set: our count is correct and the set is wrong. We want duplicates, so we should preferably store the items in sequences instead of sets. set union {curr} looks ugly, anyway. Plus, we can get rid of the FiniteSets dependency, since we’d be using Append instead of Cardinality.

Our final spec looks like this:
EXTENDS Sequences, Integers, TLC
(*--algorithm recycler
variables
    capacity in [trash: 1..10, recycle: 1..10],
    bins = [trash |-> <<>>, recycle |-> <<>>],
    count = [trash |-> 0, recycle |-> 0],
    item = [type: {"trash", "recycle"}, size: 1..6],
    items in item X item X item X item,
    curr = ""; * helper: current item
macro add_item(type) begin
  bins[type] := Append(bins[type], curr);
  capacity[type] := capacity[type] - curr.size;
  count[type] := count[type] + 1;
end macro;
begin
    while items /= <<>> do
        curr := Head(items);
        items := Tail(items);
        if curr.type = "recycle" / curr.size < capacity.recycle then
            add_item("recycle");
        elsif curr.size < capacity.trash then
            add_item("trash");
        end if;
     end while;
     assert capacity.trash >= 0 / capacity.recycle >= 0;
     assert Len(bins.trash) = count.trash;
     assert Len(bins.recycle) = count.recycle;
end algorithm; *)

It should pass with 9,323,626 states .

Nondeterministic Behavior

Not all behavior is deterministic. A request may succeed or fail, a query might return a random result, there might be one of several choices to make. For single process algorithms, we have two PlusCal constructs to simulate nondeterminisim.

Either

We write an either expression like this:
either
  * branch 1
or
  * branch 2
  * ...
or
  * branch n
end either;

When you model-check your spec, TLC will check all branches simultaneously. We can use this to represent one of several possibilities happening. There is no way to make one possibility more likely than the other. We generally assume that if some possible choice invalidates our spec, no matter how unlikely, it’s something we’ll want to fix.

We can place any assignment or PlusCal expression inside of an either branch. If all of the branches are “macro-valid,” we may place an either inside of a macro.

With

There are two ways we can write a with expression:
with var = value do
  * body
end with;
* or
with var in set do
  * body
end with;
In the former case, this just creates a temporary variable. This follows the “if it’s the first time we see a variable, use =” rule. We could have used this to replace curr in our last example, such as
with curr = Head(items) do
  if curr.type = "recycle" * ...

The second case, however, is nondeterministic. TLC will check what happens for all possible assignments of var to elements of set. If the set is empty, the spec halts until the set is not empty. For single-process apps, this is considered a spec failure.

with statements follow macro rules: no double-assignments and no while loops. You can place with statements inside macros.

Warning

with gives you values, not references. If x and y are variables, you could not reassign to them by writing with t in {x, y} do t := 1. You could, though, write with t in {x, y} do x := t.

Example

For this example, we’ll have an idealized model of sending messages when the receiver doesn’t automatically accept them. Maybe the receiver is a friend who’s going in and out of cell coverage. We can approximate this with a two-turn cycle:
  1. 1.

    On the sender’s turn, they put a message in transit.

     
  2. 2.

    On the receiver’s turn, they either receive a message in transit or do nothing (they’re outside cell coverage).

     
While we have a definite order on how the messages are sent and an order in which they are received, they aren’t ordered while in transit. The receiver can get the messages in transit in any order. This means we have two sequences for to_send and received, but a set for in_transit.
EXTENDS Sequences, TLC
(*--algorithm telephone
variables
  to_send = <<1, 2, 3>>,
  received = <<>>,
  in_transit = {};
begin
  while Len(received) /= 3 do
    * send
    if to_send /= <<>> then
      in_transit := in_transit union {Head(to_send)};
      to_send := Tail(to_send);
    end if;
    * receive
    either
      with msg in in_transit do
        received := Append(received, msg);
        in_transit := in_transit {msg}
      end with;
    or
      skip;
    end either;
  end while;
end algorithm; *)
This runs normally with no errors. If you add an assert received = <<1, 2, 3>>; after the while loop, you should get an error. There’s several failing behaviors now, and TLC reports the first one it finds, which might not always be the same one. But all of them should look something like this:
  1. 1.

    The sender places message 1 in transit.

     
  2. 2.

    The receiver skips.

     
  3. 3.

    The sender places message 2 in transit.

     
  4. 4.

    The receiver pulls message 1.

     
  5. 5.

    The sender places message 3 in transit.

     
  6. 6.

    The receiver pulls message 3.

     
  7. 7.

    The receiver pulls message 2.

     

This is called a concurrency bug. TLA+ is especially well-suited to identifying and debugging concurrency bugs, which is good, because a lot of the nastiest and most subtle bugs are concurrency bugs.

One (admittedly heavy-handed) fix would be to only let you send if the last message was confirmed received. While the implementation details may be complex, we can represent that at a high level by just adding a flag for can_send:
variables
  to_send = <<1, 2, 3>>,
  received = <<>>,
  in_transit = {},
  can_send = TRUE;
begin
  while Len(received) /= 3 do
    * send
    if can_send / to_send /= <<>> then
      in_transit := in_transit union {Head(to_send)};
      can_send := FALSE;
      to_send := Tail(to_send);
    end if;
    * receive
    either
      with msg in in_transit do
        received := Append(received, msg);
        in_transit := in_transit {msg};
        can_send := TRUE;
      end with;
    or
      skip;
    end either;

With this fix, the spec is valid (18 states). But there’s a subtle and very dangerous problem here: if you can only send if the other person receives , what if the message is never received? Our only invariant is that, at the very end, the messages have arrived in order. One way to satisfy this is if the messages never arrive at all! This is called a liveness bug and we will study them further in Chapter 6.

We should also consider the case where the message is successfully received but the mechanism that reports it was received fails. We can represent this by using another either, this time to check whether we reset can_send.
      with msg in in_transit do
        received := Append(received, msg);
        in_transit := in_transit {msg};
        either
          can_send := TRUE;
        or
          skip;
        end either;
      end with;

If you run this, you should see it fail, with the error “Deadlock reached.” This means TLC reached a state where it can’t make any more progress. In this case, the sender places message 1 in transit, and the receiver receives message 1 but does not reset can_send. The sender can’t do anything else because can_send is false, and the receiver can’t do anything because in_transit is empty. Deadlock.

Deadlocks are a particularly common problem in concurrent code, and we will discuss them more in Chapter 5.

Summary

In this chapter we covered the basics of using PlusCal to write specs. We learned the syntax for sets, tuples/sequences, and structures, as well as how to check multiple starting states and simulate nondeterministic behavior.

In the next chapter, we will learn how to use TLA+ proper to create complex data and invariants. We will also introduce the last data type: functions.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.217.150.123