Chapter 8. User-Defined Types

Because any Rosetta item whose value is a set can be used to create a type, creating new types is the same process as creating new sets. Rosetta provides three basic mechanisms for forming sets and types: (i) listing the elements explicitly, (ii) filtering or composing existing sets, or (iii) defining functions for constructing the elements of the set. Listing the elements of a set, referred to as extension, is the simplest mechanism for defining new types. The new type is formed by simply using the set former to list the elements of the type. Filtering existing sets, referred to as comprehension, involves using the sel operation or one of its derivatives to extract a set of items from an existing set. Finally, defining functions to construct type elements, referred to as constructive specification, involves defining functions that generate all elements of a type.

Types are specified using the expression language to define sets making Rosetta’s type system dependent. Specifically, the same language used to define expressions that use declared items is used to define their types. Most programming languages use a distinct language subset for defining types that does not include anything requiring evaluation. Although this restriction makes type checking far simpler, Rosetta uses a dependent type system due to its expressive power.

An excellent example of Rosetta’s type system’s power is using comprehension to define item properties. Defining a set by comprehension is defining a property that each member of the new set must have. Using Rosetta’s set comprehension capabilities, we can define sets that assert a property over all members of a type. This is a simple and powerful way of defining new properties for items that is difficult or impossible to achieve without dependent types.

Because Rosetta types are essentially set values, they are first-class in the language. Types can be passed as actual parameters to facets and functions, and type items can be variable or constant. They can be created, compared, observed, and transformed like any other Rosetta value using the same expression language.

Defining New Types

When a new Rosetta type is defined, it must either be a subtype of an existing type, or a new base type that has no supertype (Table 8.1). The standard notation for Rosetta declarations applies equally to the definition of new types. Recall that the notation for a item definition is:

 names ::T [[ is e ]];

Table 8.1. Forms for defining types classified by interpretability and subtype

 

Constant Type

Variable Type

Base Type

T :: type is V;

T :: type;

Subtype

T :: subtype (S) is V;

T :: subtype (S);

where names is a comma-separated list of type names, T is the set of values the new type’s value is drawn from, and e is a Rosetta expression whose value determines the values associated with the new type. Using this notation, the Rosetta type tri can be defined using extension as a subtype of the integer type:

  tri ::set (integer) is {–1,0,1};

In this declaration, the label is tri, the type set (integer), and the value {–1,0,1}. The set type former creates the powerset of its argument. Thus, tri must be a subset of the integer set. The is clause defines the value of tri as the set {–1,0,1}, an element of the powerset of integers and a legal value for this type. Thus, we have defined a new item, tri, that can be used as a type to define both variables and constants:

  high ::tri is 1;
  x ::tri;

To distinguish types from sets, Rosetta provides the keyword subtype to indicate (i) that a new type is being defined and (ii) the supertype of the new subtype. The subtype synonym is semantically the same as set, but indicates the intent of a declaration. The previous definition of tri would appear as follows in a definition:

   tri ::subtype (integer) is {–1,0,1};

The most common mechanism for defining new types is using the subtype qualifier to to specify the type expression. The notation:

   x::subtype(integer) is sel (x::integer | x>=0);

declares that x is a set of items from integer. It is semantically equivalent to the previous notation. Types defined in this manner are referred to as interpreted subtypes because their supertype is known and the value of the new type is also known. In general, new types are defined using the following notation:

  name ::subtype(T) [[ is e ]];

where name is the name of the new type and T is the associated supertype. The optional expression, e, defines the value of the new type. It is mandatory that the range of e is a subset of T in the same manner as any other item declaration. In declarations of this sort, name is considered subtype of T while T is the supertype of name. Treated as sets, the relationship name = <T must hold for all values of name.

Like traditional item declarations, the is clause is optional, allowing the definition of a subtype whose specific value is not known. For example:

  time::subtype(integer);

defines a new item called time that is a subtype of integer, whose specific value is not known. The difference between this definition and previous definitions is that the value of time is not known. It is known that the values associated with time are found in integer, but those values are otherwise unconstrained. This supports incomplete definitions of types. Specifically, some properties of the type are known, but its specific contents are not. Types declared in this fashion are called uninterpreted subtypes because their supertypes are known, but their values are not.

At times it is useful to define a new type that is not a subtype of any existing type. Rosetta does not allow type definitions that do not refer to some kind of supertype in their declaration. To achieve the desired result, the type top is used as the supertype. The definition:

  tri ::subtype(top) is {–1,0,1};

defines a new type called tri whose values are –1, 0, and 1 and whose supertype is top. In this definition, an item of type tri will not inherit operations from number or element types.

Given the alternative definition:

   tri ::subtype(integer) is {–1,0,1};

operations over integer values are inherited by the new type. The previous definition breaks the inheritance chain forcing the user to define operations over the new type. Types defined in this manner are referred to as interpreted types, as their values are known but their supertypes are known to only be top.

In both cases existing values are used to populate a new type. When the literal –1 is seen in a specification, some mechanism must be used to determine its associated type. The most common approach is to simply use a type assertion such as –1::tri or –-1::integer to specify the desired type. In some cases, the appropriate type can be inferred. However, it cannot be inferred in the general case.

At times it is desirable to define new types, where both value and supertype are unknown. This definition style defers all properties of the new type to definition using terms within the facet. Such types are defined using top as the supertype, but omitting the is clause and the explicitly specified value. The definition:

   tri ::subtype(top);

defines a type whose supertype and value are not known. Such types are referred to as uninterpreted types or sorts in definitions. They are particularly useful at high levels of abstraction where some, but not all, properties are known.

The use of subtype(top) is quite common in Rosetta specifications and always indicates the definition of a new type. For readability, the keyword type is introduced as a synonym for this construct. The definitions:

    tri ::type is {–1,0,1};
    tri ::type;

are identical to the previous definitions, but are easier to read and interpret.

Types need not be defined starting only from elemental values, but can be defined from any type. For example, consider the definition of a type whose values are subsets of a set T:

  s::subtype(set(T));

The notation set (T) defines the type containing all possible subsets created from elements of T. The notation x::set(T) therefore states that x is a single subset of T. Using the subtype form defines s to be an arbitrary collection of subsets of T. The subtype form simply says that s may include many sets. This is fundamentally different than the declaration:

   s::set(set(T));

In this case, s is a single set of subsets from T, rather than a set of subsets from T. The notation set (T) again defines a type that contains all subsets of T. In this case, the outer set former evaluates to all possible subsets of all possible subsets from T. These types can be further restricted, as in:

   set4 ::subtype (set(T)) is
      sel (x::subtype (set(T)) | forall (t::x | #t=4));

where set4 is the set of subsets of T that contain exactly four elements. The notation z::set4 declares z to be a singleton element of type set4, or simply a subset of T containing four elements.

One reminder about the distinction between the notations x::T and x:: subtype(T) as declarations: The first says that the value of x is a single element of type T; the second says that the value of x is a set of values selected from T. If the first definition is used as a type, then only single element types are allowed.

The second explicitly allows sets. In contrast, if these statements are used as terms in a specification, they are identical.

Defining Types By Extension

There are two mechanisms for defining new types by extension. Using set formers, values for a type can be listed explicitly. However, those values must already exist in the Rosetta value space. Using enumerations, sets of values are defined where values must be fresh and cannot appear in Rosetta prior to the declaration.

Using Set Formation Operators

The simplest mechanism for defining a type is to simply list its elements or combine existing items to form a set. In Rosetta, the items in a set can be listed explicitly by using the set former. For example, the definition of type bit has the following form:

  bit ::subtype(natural) is {0,1};

Because {0,1} is an element of subtype (natural), this represents a legal type definition. Further, the new bit type inherits operations from the natural numbers. Other set formation operations, such as intersection and difference, can be used to form types in the same manner as for the set former. They are used far less often, but still remain useful. For example, the type containing only positive integers, posint, can be defined as:

   posint ::subtype (natural) is sel (x::natural | x>0);

using the selection function to filter out all values greater than 0.

As noted earlier, there is some danger in defining new types whose values are shared with other existing types. When processing specifications, Rosetta tools must determine the type of –1, 0, and 1. If the new bit type is defined without reference to integer, the only type that can be inferred is top. To assist tools, type ascription can be used to indicate the type explicitly:

    1::tri;

identifies 1 as the value from tri, while:

   1::integer;

identifies 1 as the value from integer. With two types associated with 1 and no common supertype other than top, this annotation is required in virtually all situations. Using an enumeration to define new types like this will avoid this problem.

Enumerated Types

Enumerations are types whose values can be specified by explicitly listing them. Unlike definitions using set enumeration, the values of an enumerated type cannot be defined previously and are subtypes of element only. Defining an enumerated type is achieved by declaring the type and defining a collection of values associated with it using the enumeration former:

   colors:: type is enumeration[red,yellow,blue];

The notation yellow defines a new item whose lexical representation is yellow and whose value is also yellow. Thus, yellow becomes a new value in the current value space. Enumerations in Rosetta differ from enumerations in traditional programming languages, as the new type definition does not imply an ordering on its elements. Ordering must be introduced separately if needed. Any variable whose type is an enumerated type is an element and thus a scalar.

Because enumeration values are fresh and cannot exist in the Rosetta type space prior to the enumeration declaration, the problem with ambiguous types for values is avoided. Ascription can be used in the same manner as described previously, but it need not be, as each enumeration value is associated with exactly one type. It is possible to define subtypes consisting of enumeration values. This reintroduces the need for ascription.

The semantics for enumerated types is defined by elaborating the type declaration to a constructed type. Described later in this chapter, constructed types define a type by specifying a collection of constructors for all elements of that type. An enumeration is simply a shorthand for defining constructed types with constant constructors.

Example 8.1. Defining and Using a Three-Valued Logic Using Enumerations

Using enumerated types, we can define a three-valued logic that adds an unknown value to the normal logical high and low values. The declaration of this type using enumerated types is:

   tri ::type is enumeration [high,low,unknown];

We can now define standard operations over this type using high, low, and unknown:

    tri_and(x,y::tri )::tri is
     case x of
       {high} -> y
       | {low} -> low
       | {unknown} -> if y=low then low else unknown end if
     end case;

     tri_or (x,y::tri)::tri is
     case x of
       {high} -> high
       | {low} -> y
       | {unknown} -> if y=high then high else unknown end if
     end case;

tri_and and try_or define the logic of conjunction and disjunction for this three-valued logic. Other functions can be defined similarly. Using these definitions, we can define facet models for and and or gates, respectively:

   facet tri_and_gate(x,y::input tri ; z::output tri)::state_based is
    begin
        update: z' = tri_and(x,y)
    end facet tri_and_gate;

    facet tri_or_gate(x,y::input tri; z::output tri)::state_based is
    begin
        update: z' = tri_or(x,y);
    end facet tri_and_gate;

Defining Types By Comprehension

Defining types by comprehension involves filtering or transforming existing types to create new subtypes. Using the sel function or filter functions defines new types by filtering values from old types. Using the ran or image functions defines new types by transforming old types.

Using the Selection Function

An excellent example of defining types by comprehension is the definition of the built-in type natural. Ideally, natural should include the integer values that are greater than or equal to zero. Using this fact, the standard definition for natural numbers is:

  natural :: subtype(integer) is sel (x::integer | x >= 0);

In this type definition, the comprehension operator sel chooses all elements of the type integer that satisfy the relationship x >= 0. Because natural is a subtype of integer, all operations defined on integer are also defined on natural.

Similarly, the sel operator can be used to form a type called byte consisting of 8-bit bitvectors. Here the selection operator will be used again, with the length of the bitvector being checked:

  byte :: subtype (bitvector) is sel (b::bitvector | #b=8);

In this definition, the selection operator chooses elements from the bitvector type that are of length 8.

Using the Range Function

While the selection operator chooses elements from a type, the range operation transforms each element of a type. An example of using ran to define a new type is a potential definition of the even integers. Instead of using the comprehension operator, here the ran operation is used as an image function:

   even :: subtype(integer) is ran(x::integer | 2*x);

Here the even numbers are generated by multiplying each integer value by 2.

As demonstrated here, the selection and range operations are excellent choices for defining new types by comprehension. Other higher-order functions, such as forall and exists, are far less useful, as they evaluate to single boolean values.

Sets as Types — A Caution

As noted earlier, any Rosetta set can be used as the value of a type. This has numerous advantages, not the least of which is clarity. However, it quite easy to misuse this capability to develop specifications that are exceptionally difficult to analyze. Using operations such as set union, intersection, and difference to form new types can introduce specification complexities that make analysis virtually impossible. We have already seen the need to use type annotations to disambiguate literals shared between types. New types formed from arbitrary sets must be used carefully, with full understanding of their purpose and how they impact analysis.

Defining Constructed Types

The constructed type definition syntax provides a mechanism for defining constructor, observer, and recognizer functions in a single definitional notation. Constructor functions create values associated with the new type. These values behave as any other values and are treated as normal forms that are not evaluated. Together, the collection of constructor functions can create every value associated with the type. The observer functions define properties associated with values of the new type. Observer functions specify and calculate properties for constructed type values. Finally, recognizer functions are special predicates that indicate the constructor used to create a value. Summarizing: (i) constructor functions create new values, (ii) observer functions define properties, and (iii) recognizer functions indicate how the value was created.

As an example of constructed type definition, consider the following definition for a binary tree of integers:

    intTree :: type is data
      nil::empty |
      node(L::intTree,v:integer,R::intTree )::nonempty;
    end data;

The item intTree is defined to be a new, fresh type using the type indicator. This declaration simply says that the new type’s supertype is subtype(top). The is clause is used to give the type a value and the data keyword is used to indicate that the value of this type will be given a value.

The constructed type definition provides two constructors for intTree: (i) the constructor function nil and (ii) the constructor function node. The nil function creates an empty tree and is recognized by the boolean function empty. The recognizer function is defined so that if its argument is the nil constructor function, it will return true, otherwise it returns false. The node constructor creates a nonempty tree from a value and a left and right subtree. Note that the type is recursive in that intTree is referenced in the definition of node. The nonempty recognizer is true whenever its argument was created by a call to the node function. In addition, the observer functions L, v, and R observe parameters of the node constructor.

A tree with one node whose value is 0 is defined using the node constructor:

   node(nil,0,nil);

The recognizers empty and nonempty indicate the constructor used to generate a tree. Specifically, empty is true if its argument is nil and nonempty is true if its argument is an instantiation of the node constructor. The recognizers nonempty and empty evaluate to:

    nonempty(node(nil,0,nil)) == true
    empty(node(nil,0,nil)) == false

A balanced tree with 0 as the root, and 1 and 2 as the left and right nodes, respectively, is constructed as follows:

   node(node(nil,1,nil),0,node(nil,2,nil ));

Here, the node constructor is used to create nonempty left and right subtrees.

Parameter names are used to generate observer functions that return actual parameters from constructor functions. These functions return the actual parameter instantiation of their associated formal parameter. For example, when evaluated on the node constructor, the observers L, R, and v evaluate to:

   L(node(nil,0,nil )) == nil
   R(node(nil,0,nil )) == nil
   v(node(nil,0,nil )) == 0

In this sense, the constructed value behaves much like a record— so much so that Rosetta uses constructed types in lieu of traditional record structures.

The primary purposes for constructors, observers, and recognizers should be clear from these examples. Constructors create values, observers make observations over values, and recognizers partition the value set. In if and other conditional statements, the recognizers are used to guard the use of observers. For example:

   if nonempty(t) then L(t) else nil end if

determines if t is empty and accesses the left subtree if it is not. If t were empty, then L would be undefined.

Constructed type definitions have the form:

   T :: type is data
      c(parameters)::r
      [[ | c(parameters) ::r ]]*
   end data;

where T names the new type and each subsequent constructor definition specifies a constructor name, a collection of typed parameters, and a recognizer name. The format of each constructor declaration is identical to a function definition. The parameters identify the names and types of observers, and the recognizer names a predicate that is true when its argument is constructed with the associated constructor.

Like any other function, constructor functions can be curried. If this is the case, then the results of applying observer functions associated with uninstantiated parameters are not defined. However, constructor functions are not evaluated in the traditional fashion. When a constructor function is fully instantiated, it becomes a value in the Rosetta value space. Thus, it cannot be further evaluated.

The ranges of constructor functions are disjoint, implying that each element of the constructed type is created by exactly one constructor. This implies that each value associated with a constructed type can have only one form. Furthermore, each constructor name must be unique. This allows Rosetta tools to automatically determine what type a constructed value is associated with.

Example 8.2. Defining Records using constructed types

In Rosetta, no special syntax for defining records is defined, as record structures follow directly from constructed types. A record type is a constructed type with a single constructor function that associates values with parameters used as field names. A typical record type will be defined with the following constructive technique:

    record::type is data
      recordFormer(f0::T0 | f1::T1 |... fn::Tn)::recordp;
    end data;

where recordFormer is the single constructor, f0 through fn are the names of the various fields, and T0 through Tn are the types associated with those fields. The recognizer recordp is also defined, but is largely unused, as there is only one constructor. Any constructed type definition providing only a single constructor is referred to as a record. To define a specific record type that represents Cartesian coordinates, the following notation is used:

    cartesian::type is data
      cartFormer(x::real, y::real, z::real)::cartp;
    end data;

To define an item of this type, the standard Rosetta declaration syntax is used:

  c ::cartesian;

Values can be associated with record items using the canonical is form:

   origin ::cartesian is cartFormer(0,0,0);

Accessing individual fields of the record is achieved by applying one of the observer functions associated with a field name. To access field y in the record c, the following notation is used:

   y(c)

Defining a coordinate in Cartesian space using this definition is achieved by:

   cartFormer(1,0,0);

Accessing elements of the constructed coordinate is achieved using observer functions:

   x(cartFormer(1,0,0))==1;
   y(cartFormer(1,0,0))==0;
   z(cartFormer(1,0,0))==0;

The following notation creates a Cartesian coordinate whose x and y values are known, but whose z value is not specified:

   cartFormer(1,0);

cartFormer(1,0) produces a function of one parameter that returns the completed Cartesian value. Should the function z(cartFormer(1,0)) be specified, it should not pass type checking because cartFormer(1,0) is a function, not a Cartesian value.

Example 8.3. Semantics of Enumerations

Similar to records, it is quite easy to define enumerations using constructed types. The enumeration declaration:

   colors ::type is [red,yellow,blue]

is semantically equivalent to:

    colors ::type is data
               red()::redp
               | yellow()::yellowp
               | blue()::bluep;
    end data;

The constructed type defines three nullary constructors that are now values in the value space. Further, three recognizers are provided that specify what constructor produces a value. In this case, the recognizers are not particularly useful because the equals relationship performs the same function. Specifically:

   redp(x) == x=red()

In a sense, an enumeration is the opposite of a record. A record is defined using one constructor with parameters to indicate record elements. An enumeration is defined using many constructors with no parameters to define different elements.

Functions as Type Definition Tools

Using functions to define types allows the introduction of parameters in type definitions. Because types are first-class items in Rosetta, any function returning a set can be used to define a Rosetta parameterized type. Consider the following function definition:

   word(n::natural)::subtype(bitvector) is
     sel (b::bitvector | #b = n);

The function signature defines a mapping from natural to a subtype of the built-in bitvector type. The subtype is defined by the sel operation to be those bitvectors whose lengths are equal to the parameter n. Thus, word evaluates to the set of bitvectors of length equal to its parameter. We can now use word as a type definition construct. The notation:

   reg::word(8);

defines reg to be a bitvector of length 8. The notation:

   bv8::subtype(bitvector) is word(8);

defines bv8 to be the type containing all bitvectors of length 8.

When using functions as type definition tools it is important to understand that they, themselves, are not types. They generate types, but they are not usable as types prior to evaluation. The word type is an excellent example. Altough word(8) is a type, word is simply a function making the declaration w::word meaningless.

Example 8.4. Defining Generic Trees

The intTree example demonstrates a capability for defining a tree containing integer values that can easily be modified to define a tree containing other types. Simply replace the integer type with the item type to be contained in the tree. Using functions and constructed types together, a general tree type definition function can be developed that allows definition of trees containing arbitrary types.

The modified tree definition has the following form:

  tree(T::type) ::type is data
    nil ::empty |
    node(L::tree(T),v:T,R::tree(T ))::nonempty;
  end data;

This definition uses the standard function definition format, specifying a function signature and an expression. In this case, tree is defined to take a single, arbitrary type and return another type. The value of the returned type is obtained by evaluating the function’s associated expression. In this case, the data keyword indicates a constructed type definition. If the tree definition is instantiated with a type, the result is a tree data type that contains the instantiated type. For example:

  intTree::type is tree(integer);

is semantically equal to the intTree provided earlier in this section. If the parameter T is replaced by integer in the expression, the same type definition results. The advantage here is that the new tree function can easily and quickly be used to define new tree types with new contents. The tree function is not a type, but simply a function whose result value is a type value. Unfortunately, it is not possible to define a new item t::tree. The tree function must be fully instantiated.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.117.192.151