© Hillel Wayne 2018
Hillel WaynePractical TLA+https://doi.org/10.1007/978-1-4842-3829-5_9

9. State Machines

Hillel Wayne1 
(1)
Chicago, Illinois, USA
 

Specifications are more expressive than code. This comes at a cost: it’s not always clear how to go from a specification to reality. A technique for handling this is to write a very abstract spec and expand it into a more detailed, lower-level “implementation” that is closer to the program you’ll be writing. One very common pattern for doing this is to use a state machine. In this chapter, we will learn how to represent them and use them in writing our specifications.

State Machines

A state machine is a system with a finite set of internal “states” along with a set of transitions between the states. Outside events can trigger these transitions. For example, a door may be locked or unlocked , and it may be open or closed. The actions we can take are to lock and unlock the door, and to open and close a door. We can further stipulate that we can only open an unlocked door. We can write the state machine as a giant either-or chain:
(*--algorithm door
variables
  open = FALSE,
  locked = FALSE;
begin
  Event:
    either * unlock
      await locked;
      locked := FALSE;
    or * lock
      await ~locked;
      locked := TRUE;
    or * open
      await ~locked;
      await ~open;
      open := TRUE;
    or * close
      await open;
      open := FALSE;
    end either;
  goto Event;
end algorithm; *)

This passes with eight states. Only four of those states are distinct, though, which matches our expectations. Note the behavior is nondeterministic : when the door is unlocked and closed, for example, we can either lock it or open it. That’s two separate transitions that are possible. Most state machines (at least most interesting state machines) are concurrent.

What would be a failure mode? Let’s consider the case where, if the door is closed, we need a key to lock or unlock it. If the door is open, though, we can lock and unlock it without the key, such as by turning the lock on the other side. We can represent the new state machine this way:
(*--algorithm door
variables
  open = FALSE,
  locked = FALSE,
  key in BOOLEAN;
begin
  Event:
    either * unlock
      await locked / (open / key);
      locked := FALSE;
    or * lock
      await ~locked / (open / key);
      locked := TRUE;
    or * open
      await ~locked / ~open;
      open := TRUE;
    or * close
      await open;
      open := FALSE;
    end either;
  goto Event;
end algorithm; *)

Now the spec can deadlock: if we don’t have the key, then we can open the door, lock it, and then close the door again. One way around this is to make the door only closable if it’s unlocked, like how most deadbolts work. If you change the “close” guard to await open / ~locked, the spec passes again (14 states). Only 7 are distinct: we have twice as many initial states due to having a key, but if we don’t have a key then reaching the closed locked door state is impossible.

As state machines get more complex, we can simplify them by breaking them into separate processes. We put some of the events in each process. Here’s what it would look like if we had separate processes for opened and closed doors:
(*--algorithm door
variables
  open = FALSE,
  locked = FALSE,
  key in BOOLEAN;
process open_door = "Open Door"
begin
  OpenDoor:
    await open;
    either * lock/unlock
      locked := ~locked;
    or * close
      await ~locked;
      open := FALSE;
    end either;
    goto OpenDoor;
end process;
process closed_door = "Closed Door"
begin
  ClosedDoor:
    await ~open;
    either * lock/unlock
      await key;
      locked := ~locked;
    or
      await ~locked;
      open := TRUE;
    end either;
    goto ClosedDoor;
end process;
end algorithm; *)

This passes with 12 states . This actually ends up being about 1.5x more lines, but this style is often more concise and clearer for larger state machines. It also makes the concurrency of the state machine more evident, showing us the framework we’d be using for complex examples. Finally, it lets us include non-atomic transitions via labels, which becomes important when we try to write distributed state machines.

We used await to shape the event flow in both the single-process and the multiprocess state machine. We can roughly describe all PlusCal “state machines” as looking like this: branches and processes controlled by await and either statements.

Scaffolding Implementations

Most real-life problems aren’t explicitly state machines. But many look like state machines and can be abstractly specified as state machines. Then we can implement our spec off that state machine, filling in the details on how the transitions are actually implemented in code and making sure it preserves the same invariants. As an example of this, we will spec a simple data pattern. Some clients can read from and write to a database. We first specify this as a state machine without a database, and then progress to a more-detailed one that more accurately models how, exactly, the clients “read” and “write.”

REFINEMENT

TLA+ formalizes the concept of “scaffolding” with refinement . You refine a spec by writing a second, more detailed spec and showing that every valid behavior of the detailed spec is also a valid behavior of the general spec. But this requires some tooling we don’t have access to in PlusCal, and so it is outside the scope of this book. For now, we’ll have to use a more informal version.

In our first implementation, we don’t even have a database, and clients are directly reading and writing the data. We will represent the set of possible values we can write to the database with the constant Data and assume that the database stores at most one such value.
EXTENDS Integers, Sequences, TLC
CONSTANTS Data, NULL, Clients
(*--algorithm database
variables
  db_value in Data;
process clients in Clients
variables result = NULL;
begin
  Client:
    either * read
      result := db_value;
      assert result = db_value;
    or * write
      with d in Data do
        db_value := d;
      end with;
    end either;
    goto Client;
end process;
end algorithm; *)

All of this should be fairly simple. The only used internal state so far is db_value, and the only transitions we have are reading db_value (trivially correct) and modifying db_value . The only thing we check (aside from deadlock) is assert result = db_value, which is our primary invariant.

Set
Clients <- [ model value ] {c1}
NULL <- [ model value ]
Data <- [ model value ] {d1, d2}

This passes with 20 states. You may also want to check that it passes with two clients (110 states). We will return to the two-client case once we’ve added some implementation details to the spec.

This is the most abstract version of our spec. In reality, we can’t just have the client directly writing to the database; we’d have to have them communicate with whatever controls the database . The key here, though, is that any details we add to the state machine won’t change this overall spec structure. The more complex state machines we make will be more elaborate versions of this simple state machine and will still preserve the same high-level invariants.

We’ll implement how the client actually communicates with the database. Instead of directly reading and writing, it will send a request query . Then it will wait for a response before continuing. The database will take the query, perform a read/write based on it, and then give a response. We start by adding this only for the write to see what machinery we’ll need to add to support it.
variables
  query = [c in Clients |-> NULL];
  db_value in Data;
macro request(data) begin
  query[self] := [type |-> "request", data |-> data]
end macro;
macro wait_for_response() begin
  await query[self].type = "response";
end macro;
process clients in Clients
variables result = NULL;
begin
  Request:
    while TRUE do
      either * read
        result := db_value;
        assert result = db_value;
      or * write
        with d in Data do
          request(d);
        end with;
        Wait:
          wait_for_response();
      end either;
    end while;
end process;
The above is a more detailed state machine that’s closer to an actual implementation. We also added a new property, query, to our state machine. Our new write is now two steps: one to make the query and one to await the response. This is, though, an incomplete step. First, I can tell our request macro does not let us send reads. Second, without anything to actually respond to the client, our spec will deadlock. We need to add something that takes request queries and updates the database based on it.
define
  Exists(val) == val /= NULL
  RequestingClients == {c in Clients: Exists(query[c]) / query[c].type = "request"}
end define;
* our macros
* ...
process database = "Database"
begin
  DB:
    with client in RequestingClients, q = query[client] do
      db_value := q.data;
      query[client] := [type |-> "response"];
    end with;
  goto DB;
end process;
Our clients can now write to the database, and our state machine passes again (50 states). Let’s complete the transition with our read operation. To do this, we’ll need a way to differentiate between read requests and write requests in our query.
macro request(data) begin
  query[self] := [type |-> "request"] @@ data;
end macro;
Instead of a single value, clients now pass in a structure containing the data to request. For reads, the data is just a tag saying we want a read. For writes, the data is a tag saying we want to write, as well as the exact data we want to write to the database.
process clients in Clients
variables result = NULL;
begin
  Request:
    while TRUE do
      either * read
        request([request |-> "read"]);
        Confirm:
          wait_for_response();
          result := query[self].result;
          assert result = db_value;
      or * write
        with d in Data do
          request([request |-> "write", data |-> d]);
        end with;
        Wait:
          wait_for_response();
      end either;
    end while;
end process;
We also need to change the database:
process database = "Database"
begin
  DB:
    with client in RequestingClients, q = query[client] do
      if q.request = "write" then
        db_value := q.data;
      elsif q.request = "read" then
        skip;
      else
        assert FALSE; * What did we even pass in
      end if;
      query[client] := [type |-> "response", result |-> db_value];
    end with;
  goto DB;
end process;

This passes with 56 states.

In adding detail to our high-level state machine, we also added new behaviors. Making a request and getting a response is no longer an atomic operation. We want to make sure the implemented version preserves the same invariants: in this case, the assert result = db_value. The abstract state machine and the detailed state machine match for a single client. But what about two clients? Try rerunning both versions with Client <- [ model value ] {c1, c2}.

You should see they don’t agree: the abstract version passes (110 states) while the detailed version fails. Adding that communication layer means that c1 can request a read and get a response, but c2 can write to the database before c1 reads its response.

As always, there are two things we can do: we can change the implementation, or we can rethink what our requirements really are. Here it is worth asking what the invariant of the abstract machine actually means. Is it that the client always knows what’s in the database, or that the database is always honest with the client? Both of these interpretations are compatible with the original invariant.

The former case is more difficult, and we’d have to backtrack and start again. In the latter case, we can achieve it by further elaborating on our invariant.

Ghost Variables

One way to formalize “the database is honest with the client” is to say that whatever the client receives was correct data at the time of the request. Our implementation only tracks what the data is currently. We can alter it to also store history. This, though, would be irrelevant to our actual, physical system: the history only matters for checking the invariant, not for the actual implementation of the interactions we want.

The trick here is that our specification is doing two things. First, it shows how our state machine is intended to work. Second, it represents the wider context our state machine exists in. Even if the implementation doesn’t track prior values, that’s still part of our wider context. What we can do is add more detail to that context and see if that gets us to a correct system.

Contextual data that we track to verify invariants is called auxiliary, or ghost, data. We can also have ghost operators, ghost processes, etc. What’s important is that the ghost data is only used for checking invariants. While our spec can affect our ghost data, our ghosts cannot change the behavior of the spec. It may define which states are considered invariant breaking, but it cannot prevent the model checker from reaching those states.

As always, this makes more sense when you see it. Let’s add a ghost variable to our spec that tracks what the value of the database was at the time it responded to a request:
variables
  query = [c in Clients |-> NULL],
  ghost_db_history = [c in Clients |-> NULL];
  * db_value is no longer global
* ...
process database = "Database"
  variable db_value in Data;
begin
  DB:
    with client in RequestingClients, q = query[client] do
      if q.request = "write" then
        db_value := q.data;
      elsif q.request = "read" then
        skip;
      else
        assert FALSE;
      end if;
      ghost_db_history[client] := db_value;
      query[client] := [type |-> "response", result |-> db_value];
    end with;
We capture the additional data in ghost_db_history. The database process is allowed to write to the variable, but it does not (and cannot) read it. The client will assert on ghost_db_history, not db_value, which means we no longer need db_value to be global. While this doesn’t change the behavior, it tightens up the spec a little.
  Request:
    either * read
      request([request |-> "read"]);
      Confirm:
        wait_for_response();
        result := query[self].result;
        assert result = ghost_db_history[self];

The client isn’t reading ghost_db_history either. It only appears in an assertion and so is only used to check an invariant. With this, our two-client model passes (6,098 states). If we implemented this, the database would not be tracking history, since ghost_db_history isn’t part of the implementation. But it would still conform to the spec we have.

We now have a relationship between our initial, abstract state machine and our final spec. If we read the invariant as “the response is the value of the database at the time the request is processed,” then our final spec implements the initial state machine. If we read the invariant as “the client always reads the current value of the database,” then our final spec does not implement the initial state machine.

Summary

We learned how to write a state machine pattern, how to use it as a means of designing the implementations of specs, and the value of ghost values for checking properties of a spec. We designed a client-database system and showed that it behaves correctly.

TLA+ helped us create an implementation that matched a higher-level specification. In the next chapter, we will go one step higher and use TLA+ to turn a set of informal business requirements into a specification, formalizing the requirements in the process.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.219.236.70