12.4 Example: Encoding S-Expressions

Display is a debugging routine for displaying structured data, but it’s not far short of being able to encode or marshal arbitrary Go objects as messages in a portable notation suitable for inter-process communication.

As we saw in Section 4.5, Go’s standard library supports a variety of formats, including JSON, XML, and ASN.1. Another notation that is still widely used is S-expressions, the syntax of Lisp. Unlike the other notations, S-expressions are not supported by the Go standard library, not least because they have no universally accepted definition, despite several attempts at standardization and the existence of many implementations.

In this section, we’ll define a package that encodes arbitrary Go objects using an S-expression notation that supports the following constructs:

42             integer
"hello"        string (with Go-style quotation)
foo            symbol (an unquoted name)
(1 2 3)        list   (zero or more items enclosed in parentheses)

Booleans are traditionally encoded using the symbol t for true, and the empty list () or the symbol nil for false, but for simplicity, our implementation ignores them. It also ignores channels and functions, since their state is opaque to reflection. And it ignores real and complex floating-point numbers and interfaces. Adding support for them is Exercise 12.3.

We’ll encode the types of Go using S-expressions as follows. Integers and strings are encoded in the obvious way. Nil values are encoded as the symbol nil. Arrays and slices are encoded using list notation.

Structs are encoded as a list of field bindings, each field binding being a two-element list whose first element (a symbol) is the field name and whose second element is the field value. Maps too are encoded as a list of pairs, with each pair being the key and value of one map entry. Traditionally, S-expressions represent lists of key/value pairs using a single cons cell (key . value) for each pair, rather than a two-element list, but to simplify the decoding we’ll ignore dotted list notation.

Encoding is done by a single recursive function, encode, shown below. Its structure is essentially the same as that of Display in the previous section:

gopl.io/ch12/sexpr
func encode(buf *bytes.Buffer, v reflect.Value) error {
    switch v.Kind() {
    case reflect.Invalid:
        buf.WriteString("nil")

    case reflect.Int, reflect.Int8, reflect.Int16,
        reflect.Int32, reflect.Int64:
        fmt.Fprintf(buf, "%d", v.Int())

    case reflect.Uint, reflect.Uint8, reflect.Uint16,
        reflect.Uint32, reflect.Uint64, reflect.Uintptr:
        fmt.Fprintf(buf, "%d", v.Uint())

    case reflect.String:
        fmt.Fprintf(buf, "%q", v.String())

    case reflect.Ptr:
        return encode(buf, v.Elem())

    case reflect.Array, reflect.Slice: // (value ...)
        buf.WriteByte('(')
        for i := 0; i < v.Len(); i++ {
            if i > 0 {
                buf.WriteByte(' ')
            }
            if err := encode(buf, v.Index(i)); err != nil {
                return err
            }
        }
        buf.WriteByte(')')

    case reflect.Struct: // ((name value) ...)
        buf.WriteByte('(')
        for i := 0; i < v.NumField(); i++ {
            if i > 0 {
                buf.WriteByte(' ')
            }
            fmt.Fprintf(buf, "(%s ", v.Type().Field(i).Name)
            if err := encode(buf, v.Field(i)); err != nil {
                return err
            }
            buf.WriteByte(')')
        }
        buf.WriteByte(')')

    case reflect.Map: // ((key value) ...)
        buf.WriteByte('(')
        for i, key := range v.MapKeys() {
            if i > 0 {
                buf.WriteByte(' ')
            }
            buf.WriteByte('(')
            if err := encode(buf, key); err != nil {
                return err
            }
            buf.WriteByte(' ')
            if err := encode(buf, v.MapIndex(key)); err != nil {
                return err
            }
            buf.WriteByte(')')
        }
        buf.WriteByte(')')

    default: // float, complex, bool, chan, func, interface
        return fmt.Errorf("unsupported type: %s", v.Type())
    }
    return nil
}

The Marshal function wraps the encoder in an API similar to those of the other encoding/... packages:

// Marshal encodes a Go value in S-expression form.
func Marshal(v interface{}) ([]byte, error) {
    var buf bytes.Buffer
    if err := encode(&buf, reflect.ValueOf(v)); err != nil {
        return nil, err
    }
    return buf.Bytes(), nil
}

Here’s the output of Marshal applied to the strangelove variable from Section 12.3:

((Title "Dr. Strangelove") (Subtitle "How I Learned to Stop Worrying and Lo
ve the Bomb") (Year 1964) (Actor (("Grp. Capt. Lionel Mandrake" "Peter Sell
ers") ("Pres. Merkin Muffley" "Peter Sellers") ("Gen. Buck Turgidson" "Geor
ge C. Scott") ("Brig. Gen. Jack D. Ripper" "Sterling Hayden") ("Maj. T.J. 
"King" Kong" "Slim Pickens") ("Dr. Strangelove" "Peter Sellers"))) (Oscars
("Best Actor (Nomin.)" "Best Adapted Screenplay (Nomin.)" "Best Director (N
omin.)" "Best Picture (Nomin.)")) (Sequel nil))

The whole output appears on one long line with minimal spaces, making it hard to read. Here’s the same output manually formatted according to S-expression conventions. Writing a pretty-printer for S-expressions is left as a (challenging) exercise; the download from gopl.io includes a simple version.

((Title "Dr. Strangelove")
 (Subtitle "How I Learned to Stop Worrying and Love the Bomb")
 (Year 1964)
 (Actor (("Grp. Capt. Lionel Mandrake" "Peter Sellers")
         ("Pres. Merkin Muffley" "Peter Sellers")
         ("Gen. Buck Turgidson" "George C. Scott")
         ("Brig. Gen. Jack D. Ripper" "Sterling Hayden")
         ("Maj. T.J. "King" Kong" "Slim Pickens")
         ("Dr. Strangelove" "Peter Sellers")))
 (Oscars ("Best Actor (Nomin.)"
          "Best Adapted Screenplay (Nomin.)"
          "Best Director (Nomin.)"
          "Best Picture (Nomin.)"))
 (Sequel nil))

Like the fmt.Print, json.Marshal, and Display functions, sexpr.Marshal will loop forever if called with cyclic data.

In Section 12.6, we’ll sketch out the implementation of the corresponding S-expression decoding function, but before we get there, we’ll first need to understand how reflection can be used to update program variables.

Exercise 12.3: Implement the missing cases of the encode function. Encode booleans as t and nil, floating-point numbers using Go’s notation, and complex numbers like 1+2i as #C(1.0 2.0). Interfaces can be encoded as a pair of a type name and a value, for instance ("[]int" (1 2 3)), but beware that this notation is ambiguous: the reflect.Type.String method may return the same string for different types.

Exercise 12.4: Modify encode to pretty-print the S-expression in the style shown above.

Exercise 12.5: Adapt encode to emit JSON instead of S-expressions. Test your encoder using the standard decoder, json.Unmarshal.

Exercise 12.6: Adapt encode so that, as an optimization, it does not encode a field whose value is the zero value of its type.

Exercise 12.7: Create a streaming API for the S-expression encoder, following the style of json.Encoder (§4.5).

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
52.14.224.197