© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
C. Milanesi, Beginning Rust, https://doi.org/10.1007/978-1-4842-7208-4_8

8. Using Heterogeneous Data Structures

Carlo Milanesi, Bergamo, Italy

In this chapter you will learn how to define and use other composite types:
  • Tuples

  • Structs

  • Tuple-structs

They are useful for grouping objects of different types.

At the end of the chapter, you’ll see some code style conventions.

The Tuples

Arrays and vectors can contain several items, yet such items must all be of the same type. If you wish to store in a single object several subobjects of different types, you can do it in this way:
let data = (10000000, 183.19, 'Q');
let copy_of_data = data;
print!("{}, {}, {}",
    data.0, copy_of_data.1, data.2);

This will print: 10000000, 183.19, Q.

The “data” variable is a composite object, as it is composed of three objects. Even arrays are composite objects, but they are constrained to be composed of objects of the same type, while the “data” variable is composed of objects of different types: an integer number, a floating-point number, and a character.

Therefore, our object is not an array but a tuple.

The declaration of tuples looks like that of arrays. The only difference is that round parentheses are used instead of square brackets.

Each item of a tuple is called a field.

The type of a tuple can also be made explicit:
let data: (i32, f64, char) = (10000000, 183.19, 'Q');

The type has the same format as the expression giving the value, where, for each field, the expression giving the value is replaced by its type.

As shown in the second statement of the first example, an entire tuple may be used to initialize another tuple of the same type.

You can access the fields of a tuple by their position, using the dot notation. When accessing the seventh item of the “arr” array, you must write “arr[6]”; to access the seventh field of the “data” tuple, you must write “data.6”.

Tuples can also be mutable:
let mut data = (10000000, 183.19, 'Q');
data.0 = -5;
data.2 = 'x';
print!("{}, {}, {}", data.0, data.1, data.2);

This will print: -5, 183.19, x.

Like arrays, tuples can have any number of fields, including zero. The type of a tuple is written as the sequence of the types of its fields, enclosed in parentheses; if there are no fields, only the parentheses remain. Therefore, the type of a tuple with no fields is written “()”.

We already saw this type and this value in Chapter 6; now it is clear why they are called empty tuples.
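As a minimal sketch of the empty tuple, the following program writes both the type and the value as “()”. The do_nothing function is a hypothetical helper introduced only for this illustration; it relies on the fact that a function declared without a return type returns the empty tuple.

```rust
// Hypothetical helper: no return type is declared, so it returns ().
fn do_nothing() {}

fn main() {
    // Both the type and the value of the empty tuple are written "()".
    let empty: () = ();
    let returned: () = do_nothing();
    // Every empty tuple is equal to every other empty tuple.
    assert_eq!(empty, returned);
    print!("{:?}", empty);
}
```

This should print: ().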

A difference between tuples and arrays is that tuples cannot be accessed by a variable index. Consider this code:
let array = [12, 13, 14];
let tuple = (12, 13, 14);
let i = 0;
print!("{}", array[i]);
print!("{}", tuple.i);

The compiler generates, at the last line, the error: no field `i` on type `({integer}, {integer}, {integer})`. There is no way to get the value of a field of a tuple using an index determined at runtime.
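For comparison, here is a sketch of a version that compiles: the array is still accessed with an index held in a variable, while the tuple field is selected by a literal position. The two helper functions, array_item and first_tuple_field, are hypothetical names introduced only for this illustration.

```rust
// Hypothetical helper: an array item can be selected by a runtime index.
fn array_item(array: [i32; 3], i: usize) -> i32 {
    array[i]
}

// Hypothetical helper: a tuple field must be selected by a literal position.
fn first_tuple_field(tuple: (i32, i32, i32)) -> i32 {
    tuple.0
}

fn main() {
    let array = [12, 13, 14];
    let tuple = (12, 13, 14);
    let i = 0;
    print!("{} ", array_item(array, i));
    print!("{}", first_tuple_field(tuple));
}
```

This should print: 12 12.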

The Structs

Tuples are useful as long as they contain no more than a handful of items, but when they have many fields, it is too easy to mix them up, and the code that uses them is hard to understand:
let data = (10, 'x', 12, 183.19, 'Q', false, -9);
print!("{}", data.2 + data.6);

Is it clear that this code will print 3?

In addition, the type of any tuple is defined only by the sequence of the types of its fields, and if there are many fields, the types are too long to specify and hard to understand:
let data1 = (10, 'x', 12, 183.19, 'Q', false, -9);
let mut data2: (u16, char, i16, f64, bool, char, i16);
data2 = data1;

This code is illegal. Can you spot the error?
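For readers who want to check their answer, here is a sketch of a version that should compile. The only change to the declared type is that the fifth and sixth field types are swapped, so that they match the character and the Boolean in data1. The copied_data function is a hypothetical wrapper introduced only for this illustration.

```rust
// Hypothetical wrapper returning a copy of the tuple, with a matching type.
fn copied_data() -> (u16, char, i16, f64, char, bool, i16) {
    let data1 = (10, 'x', 12, 183.19, 'Q', false, -9);
    // The fifth field is a char and the sixth is a bool, as in data1.
    let data2: (u16, char, i16, f64, char, bool, i16);
    data2 = data1;
    data2
}

fn main() {
    let data = copied_data();
    print!("{}", data.2 + data.6);
}
```

This should print: 3.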

In addition, if a field is added at the beginning of a tuple, all the indexes to objects of such a type must be incremented in source code. For example, data.2 must become data.3. For instance, the first example of this section would become:
let data = ("first", 10, 'x', 12, 183.19, 'Q', false, -9);
print!("{}", data.3 + data.7);

Therefore, it is very useful to have a specific statement that declares the type of a structure, gives it a name, and labels all the fields of that structure. Here is an example:
struct SomeData {
    integer: i32,
    fractional: f32,
    character: char,
    five_bytes: [u8; 5],
}
let data = SomeData {
    integer: 10_000_000,
    fractional: 183.19,
    character: 'Q',
    five_bytes: [9, 0, 250, 60, 200],
};
print!("{}, {}, {}, {}",
    data.five_bytes[3], data.integer,
    data.fractional, data.character);

This will print: 60, 10000000, 183.19, Q.

The first statement occupies six lines: it starts with the struct keyword and proceeds with a block. Its effect is to declare the SomeData type. Any object of that type is a sequence of four fields. For each field, its name and its type are declared, separated by a colon. The list of field declarations is comma separated, with an optional trailing comma. Let’s call such a kind of data type a struct.

The second statement also occupies six lines. It declares the variable “data” and initializes it with an object of the type just declared. Notice that the initialization syntax looks like the type declaration syntax, where the struct keyword is removed, and each field type is replaced by an expression whose value is to be assigned to that field. Let’s call such an object, that is, any object whose type is a struct, a struct-object.

The third statement accesses the fields of the just-defined struct-object, using the so-called dot notation. This notation consists of an expression representing the struct-object, followed by a dot, followed by the name of the field to access.

This code is similar to the following C language program:
#include <stdio.h>
int main() {
    struct SomeData {
        int integer;
        float fractional;
        char character;
        unsigned char five_bytes[5];
    };
    struct SomeData data = {
        10000000,
        183.19,
        'Q',
        {9, 0, 250, 60, 200},
    };
    printf("%d, %d, %g, %c",
        data.five_bytes[3], data.integer,
        data.fractional, data.character);
    return 0;
}

Let’s see where this C code differs from the preceding Rust code.

While in C the fields are separated by semicolons, in Rust they are separated by commas.

In Rust, the type is written after the name of the field, as in the Pascal language.

In C you can declare several fields of the same type, by specifying the type just once, that is, in this way: int a, b;. In Rust, instead, you must specify the type once for every field, in this way: a: i32, b: i32,.

In C the initialization of “data” is done simply by listing the values, similarly to Rust tuples. In Rust, instead, for each field you must specify also the name of that field.

Both in C and in Rust, the dot notation is used.
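One consequence of naming every field in a Rust initialization is that the values do not have to be listed in declaration order. The following sketch, which uses a reduced SomeData type and a hypothetical make_data helper introduced only for this illustration, lists the fields in reverse order.

```rust
struct SomeData {
    integer: i32,
    fractional: f32,
}

// Hypothetical helper: the fields are initialized in reverse order,
// which is allowed because each value is labeled with its field name.
fn make_data() -> SomeData {
    SomeData {
        fractional: 183.19,
        integer: 10_000_000,
    }
}

fn main() {
    let data = make_data();
    print!("{}, {}", data.integer, data.fractional);
}
```

This should print: 10000000, 183.19.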

If you declare a variable as mutable, you can also change the values of its fields, using the same dot notation:
struct SomeData {
    integer: i32,
    fractional: f32,
}
let mut data = SomeData {
    integer: 10,
    fractional: 183.19,
};
data.fractional = 8.2;
print!("{}, {}", data.fractional, data.integer);

This will print: 8.2, 10.

Like tuples, structs may also be empty: you can declare a struct containing no fields, and then a variable of that type:
struct NoData {}
let _no_data = NoData {};

The Tuple-Structs

We already saw that if you want to define a structure containing objects of different types, you have two possibilities:
  • Create a tuple, whose type has no name and was not previously declared, and whose fields have no names.

  • Create a struct, whose type has a name and must have been previously declared, and whose fields have names.

So, there are several differences between these two kinds of structures. Yet sometimes something halfway is needed: a kind of structure whose type has a name and must be previously declared, like a struct, but whose fields have no names, like those of a tuple. Because they are a hybrid between tuples and structs, they are named tuple-structs:
struct SomeData (
    i32,
    f32,
    char,
    [u8; 5],
);
let data = SomeData (
    10_000_000,
    183.19,
    'Q',
    [9, 0, 250, 60, 200],
);
print!("{}, {}, {}, {}",
    data.2, data.0, data.1, data.3[2]);

This will print: Q, 10000000, 183.19, 250.

As shown in the example, the tuple-struct is defined before instantiating it, using the struct keyword as for structs, but enclosing its fields in parentheses instead of braces, and without specifying the names of the fields, as for tuples. The initialization starts with the name of the type, as for structs, but it goes on as for tuples.

Its fields are accessed like those of tuples, because they have no name.

Like tuples and structs, tuple-structs may also be empty: current compilers accept a declaration such as struct NoData(); although such a type is rarely useful.

Actually, tuple-structs are not used often.
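One situation where a named type with unnamed fields can still be handy is wrapping a single value, so that the compiler keeps apart quantities that share the same underlying type. The following is a minimal sketch of that idea; the Meters and Seconds types and the double_length function are hypothetical names chosen only for this illustration.

```rust
// Hypothetical tuple-structs: same field type, but two distinct types.
struct Meters(f64);
#[allow(dead_code)]
struct Seconds(f64);

// Accepts only a Meters value; passing a Seconds value
// would be rejected at compile time.
fn double_length(length: Meters) -> Meters {
    Meters(length.0 * 2.0)
}

fn main() {
    let length = Meters(1.5);
    let _duration = Seconds(4.0);
    let doubled = double_length(length);
    print!("{}", doubled.0);
    // double_length(_duration); // would not compile: mismatched types
}
```

This should print: 3.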

Lexical Conventions

Now that we have seen a good number of Rust constructs (but not yet all of them), it is a good time to think about some lexical conventions adopted by almost every Rust programmer, and therefore strongly recommended to everyone. Such conventions are so entrenched that even the compiler emits a warning if they are violated.

Here is a program showing them:
const MAXIMUM_POWER: u16 = 600;
#[allow(dead_code)]
enum VehicleKind {
    Motorcycle,
    Car,
    Truck,
}
#[allow(dead_code)]
struct VehicleData {
    kind: VehicleKind,
    registration_year: u16,
    registration_month: u8,
    power: u16,
}
let vehicle = VehicleData {
    kind: VehicleKind::Car,
    registration_year: 2003,
    registration_month: 11,
    power: 120,
};
if vehicle.power > MAXIMUM_POWER {
    println!("Too powerful");
}

The conventions shown in this example are:
  • Names of constants (for example: MAXIMUM_POWER) contain only uppercase characters, with words separated by underscore. This convention is usually named screaming snake case, or upper snake case, or simply upper case.

  • Type names defined by application code or by the standard library (for example: VehicleKind and VehicleData) and enum variant names (for example: Car) consist of words stuck together, where every word has an uppercase initial letter, followed by lowercase letters. This convention is usually named upper camel case or PascalCase.

  • Any other name (for example, keywords like let, primitive types like u8, and field identifiers like registration_year) uses only lowercase letters, with words separated by underscores. This convention is usually named snake case.
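
To see the compiler warnings mentioned above, consider this sketch, which deliberately breaks each of the three conventions. Every name in it is a hypothetical counterexample. Each allow attribute names the warning it silences; removing an attribute should make the compiler report the item that follows it.

```rust
// Silences the warning about a constant not in upper snake case.
#[allow(non_upper_case_globals)]
const maximum_power: u16 = 600; // conventional name: MAXIMUM_POWER

// Silences the warning about a type not in upper camel case.
#[allow(non_camel_case_types)]
struct vehicle_data { // conventional name: VehicleData
    power: u16,
}

// Silences the warning about a function not in snake case.
#[allow(non_snake_case)]
fn IsTooPowerful(power: u16) -> bool { // conventional name: is_too_powerful
    power > maximum_power
}

fn main() {
    let vehicle = vehicle_data { power: 120 };
    if IsTooPowerful(vehicle.power) {
        println!("Too powerful");
    } else {
        println!("Power is acceptable");
    }
}
```

This should print: Power is acceptable.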
