5. Using Data Sequences

Carlo Milanesi, Bergamo, Italy

In this chapter, you will learn:

How to define sequences of objects of the same type, having fixed length (arrays) or variable length (vectors)
How to specify the initial contents of arrays or vectors, by listing the items or by specifying one item and its repeat count
How to read or write the value of single items of arrays or vectors
How to add items to a vector or to remove items from a vector
How to create arrays with several dimensions
How to create empty arrays or vectors
How to print or copy whole arrays or vectors
How to specify whether the compiler should deny, allow with a warning, or silently allow possible programming errors
What is a panic, and why it may be generated when accessing an array or a vector

Arrays

So far, we have seen how to store in a variable a string, a number, or a Boolean. But if you want to store several strings into a single variable, you can write this:

let x = ["English", "This", "sentence", "a", "in", "is"];

print!("{} {} {} {} {} {}",

    x[1], x[5], x[3], x[2], x[4], x[0]);

That will print: This is a sentence in English.

The first statement declares the x variable as an immutable object made of six objects, listed in the statement itself. Each of these objects is a string. Such a compound object is called an array.

The second statement contains six expressions, each of them making a read access to a different element of x. Such accesses specify the item to be accessed by means of a positional index, or subscript, put between brackets. Notice that indexes always start from zero, and so, as the array has six elements, the index of the last item is 5. This behavior is similar to that of C language arrays.

To determine how many elements there are in an array, you can do this:

let a = [true, false];

let b = [1, 2, 3, 4, 5];

print!("{}, {}.", a.len(), b.len());

It will print: 2, 5.

The a variable is an array of two Booleans, while the b variable is an array of five integer numbers.

The third statement invokes on the objects a and b the len function of the standard library, to get the number of the objects contained in the array. Both the syntax and semantics of these invocations are similar to those of the expression "abc".len(), used to get the length in bytes of a string.

Notice that in the examples, every array contains elements of the same type; only strings, only Booleans, or only integer numbers.

If you try to write

let x = ["This", 4];

let x = [4, 5.];

you get a mismatched types compilation error, indicating the array cannot contain objects of different types.

It is possible to create arrays of items of any type, as long as all the items of each array are of the same type.

This is so because there is no single array type. The concept of array is that of a generic type, which is parameterized by the type of its items and also by the number of its items. So, in the first example of the chapter, the variable x is of type “array of 6 strings”; in the second example, a is of type “array of 2 Booleans,” and b is of type “array of 5 integer numbers.”
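The two parameters of an array type can also be written out explicitly. The following is a minimal sketch, assuming the type annotation syntax [T; N], which this chapter has not introduced; the function name array_lengths and the choice of i32 for the integers are made up for the illustration.

```rust
// Sketch: an array type names both the item type and the number of items.
// `array_lengths` is a hypothetical helper, not one of the chapter's examples.
fn array_lengths() -> (usize, usize, usize) {
    let x: [&str; 6] = ["English", "This", "sentence", "a", "in", "is"];
    let a: [bool; 2] = [true, false];
    let b: [i32; 5] = [1, 2, 3, 4, 5];
    (x.len(), a.len(), b.len())
}

fn main() {
    let (x_len, a_len, b_len) = array_lengths();
    print!("{}, {}, {}.", x_len, a_len, b_len);
}
```

Changing any of the three lengths in the annotations, without changing the listed items, should make the compiler report mismatched types.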

In the following program, every line but the first one will generate a compilation error:

let mut x = ["a"]; // array of strings

x[0] = 3;

x[-1] = "b";

x[0.] = "b";

x[false] = "b";

x["0"] = "b";

The second statement is wrong because it tries to assign an integer number to an item of an array of strings. So the compiler emits a mismatched types error message.

The other statements are wrong because the compiler expects any index used to access an array to be a non-negative integer number, and the expressions used in such statements are not.

The following one is a different case:

let x = ["a"]; // array of strings

let _y = x[1];

It has a valid syntax, but nevertheless it is wrong, because the second statement tries to access the second element of an array having only one element.

The compiler error message is: this operation will panic at runtime. It means that this statement would generate a runtime error when executed. The compiler message also contains the explanation: index out of bounds: the length is 1 but the index is 1.

Rust Attributes

The Rust compiler is somewhat flexible about how strictly it analyzes the code. For some possible error conditions, like not using a variable after its last assignment, the compiler allows the developer to choose among three possible behaviors:

Deny: The compiler emits an error message, and it does not generate the executable code.
Warn: The compiler emits a warning message, but it does generate the executable code.
Allow: The compiler does not emit any message, and it does generate the executable code.

To specify such behaviors, the developer must write specific annotations into the code, named attributes.

Here is an example:

#[deny(unused_variables)]

let x = 1;

#[warn(unused_variables)]

let y = 2;

#[allow(unused_variables)]

let z = 3;

When compiling this code, the following error is emitted:

error: unused variable: `x`

Then the following warning is emitted:

warning: unused variable: `y`

And then no executable code is generated, because there is at least one error.

The last statement does not generate any output.

This code contains three attributes. Any attribute usually is on a line by itself. It begins with a “#” character followed by an expression in brackets. It applies to the statement immediately following it.

In the cases shown, the expressions are the word deny or warn or allow, followed by an identifier in parentheses. The identifier must be a known compiler option. In this case it is unused_variables; that means that this attribute specifies how the compiler must handle the case in which the value of the variable declared in the following statement is not used after the last assignment.

If a compiler option is not specified by an attribute, it has a default value anyway. The default value of the unused_variables option is warn. It is for this reason that if you don’t use the value of a variable after its last assignment, the generated warning is followed by the following line:

note: `#[warn(unused_variables)]` on by default

Instead of specifying an attribute just before any declaration, it is possible to declare it before a function definition, in the following way:

#[allow(unused_variables)]

fn main() {

    let x = 1;

    let y = 2;

}

This whole program does not generate any warnings, because the attribute in the first line applies to the whole main function, and therefore to both the x and y variables.
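As a hedged aside, the examples in this chapter that declare a variable named _y rely on a related convention: the unused_variables option does not report variables whose name begins with an underscore. The following sketch compares the two approaches; the function names with_attribute and with_underscore are hypothetical.

```rust
// Sketch: two ways to keep an unused variable without getting a warning.

// The attribute applies to the whole function, so `x` is not reported.
#[allow(unused_variables)]
fn with_attribute() -> i32 {
    let x = 1; // never used, but the attribute silences the message
    let y = 2;
    y
}

// No attribute here: the leading underscore in `_x` is enough.
fn with_underscore() -> i32 {
    let _x = 1; // never used; names starting with `_` are not reported
    let y = 2;
    y
}

fn main() {
    print!("{} {}", with_attribute(), with_underscore());
}
```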

Another compiler option regards runtime errors.

Panicking

Accessing an array with an index that is too large notoriously has undefined behavior in the C language. The Rust language, instead, has the goal of avoiding any undefined behavior.

To avoid such undefined behavior, the compiler inserts into the generated executable code some instructions that check whether every access to an array uses a proper index, that is, an index less than the length of the array. If an out-of-bounds index is actually used, these checks abort the current thread, which, by default, causes the immediate termination of the thread itself. If the process has just one thread, such thread termination causes the termination of the whole process.

In Rust jargon, such an abort operation is named panic, and the action of aborting the current thread is named panicking.

In general, the compiler cannot detect out-of-bounds indexing errors, because array indices may have values known only at runtime. However, consider the following code, already shown before:

let x = ["a"];

let _y = x[1];

print!("End");

Here the compiler knows both the size of the array and the value of the index, because they are constants, and so it can foresee that the generated code will panic at runtime.

Therefore, when you compile this code, the compiler emits an error message containing the following sentences.

First:

error: this operation will panic at runtime

Then:

index out of bounds: the length is 1 but the index is 1

And then:

note: `#[deny(unconditional_panic)]` on by default

The first sentence states what the compiler has foreseen, as a compilation error.

The second sentence explains why this operation will panic.

The third sentence explains that, by default, such a condition generates a compile-time error. The identifier unconditional_panic means a condition in which, if this statement is executed, it will surely generate a panic at runtime; therefore it is probably wrong, or in any case it should be replaced by a more explicit error message.

You can obtain an executable program if you write, instead:

let x = ["a"];

#[warn(unconditional_panic)]

let _y = x[1];

print!("End");

It will generate the compilation message:

warning: this operation will panic at runtime

If you write the following code instead:

let x = ["a"];

#[allow(unconditional_panic)]

let _y = x[1];

print!("End");

you will get no compilation error or warning.

In both cases, when you run the program you will see on the terminal a message starting with this:

thread 'main' panicked at

'index out of bounds: the len is 1 but the index is 1'

After that, the process will terminate without printing the End word.

Notice that such abnormal termination is not caused by the operating system, like a segmentation violation, but it is caused by instructions inserted by the Rust compiler itself into the executable program.
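When the index is not known at compile time, one way to avoid the panic is to compare the index with the length before using it. This is only a sketch under assumptions: the helper name item_or_default, its fallback string, and the use of a function with parameters are all made up for the illustration.

```rust
// Sketch: checking the index explicitly, so that the indexing
// expression is evaluated only when it cannot panic.
// `item_or_default` is a hypothetical helper, not a standard library function.
fn item_or_default(items: [&'static str; 3], index: usize) -> &'static str {
    if index < items.len() {
        items[index] // safe: the index has just been checked
    } else {
        "(no such item)" // explicit fallback instead of a panic
    }
}

fn main() {
    let x = ["a", "b", "c"];
    print!("{} {}", item_or_default(x, 1), item_or_default(x, 3));
}
```

The bounds check inserted by the compiler is still there; the explicit test only guarantees that it never fails.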

Mutable Arrays

Now let’s get back to what we can do with arrays.

The modification of the items of an array is possible only on mutable arrays:

let mut x = ["This", "is", "a", "sentence"];

x[2] = "a nice";

print!("{} {} {} {}.", x[0], x[1], x[2], x[3]);

This will print: This is a nice sentence.

The first statement contains the mut keyword, and the second statement assigns the new value to the third item of the array. This operation is allowed by the compiler, because the following three conditions hold:

The x variable is mutable.
The type of the new value assigned is the same as that of the other items of x. Indeed, they are all strings.
The index is a non-negative integer number.

In addition, at runtime, the operation is performed without panicking, because the index is less than the array size, that is, the condition 2 < 4 holds.

However, it is not allowed to add items to an array or to remove items from it. Therefore, the length of an array is a constant defined at compile time.

A mutable variable of an array type, as any mutable variable, can be the target of an assignment from another array:

let mut x = ["a", "b", "c"];

print!("{}{}{}. ", x[0], x[1], x[2]);

x = ["X", "Y", "Z"];

print!("{}{}{}. ", x[0], x[1], x[2]);

let y = ["1", "2", "3"];

x = y;

print!("{}{}{}.", x[0], x[1], x[2]);

This will print: abc. XYZ. 123.

In the first line, an array is created and assigned to the x variable. In the third line, another array is created and assigned to the same x variable, therefore replacing all the three existing strings. In the fifth line, another array is created and assigned to the new y variable. In the sixth line, such array is assigned to the x variable, therefore replacing the three existing values.

If x weren’t mutable, two compilation errors would be generated: one at the third line and one at the sixth.

The following code generates “mismatched types” compilation errors both at the second and at the third lines:

let mut x = ["a", "b", "c"];

x = ["X", "Y"];

x = [15, 16, 17];

Indeed, because of the first line, x is of type “array of three elements of type string.” The statement in the second line tries to assign to x a value of type “array of two elements of type string”; in this case, the number of elements is different, even if the type of each element is the same.

The statement in the third line tries to assign to x a value of type “array of three elements of type integer number”; in this case, the number of the elements is the same, but the type of each element is different.

In both cases, the type of the whole value that is being assigned is different from the type of the target variable.
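For contrast, here is a sketch of an assignment that the compiler accepts, because both sides have exactly the same type. The annotations use the [T; N] syntax, which is an assumption with respect to this chapter, and the function name reassign is hypothetical.

```rust
// Sketch: assigning a whole array works when item type and length both match.
// `reassign` is a hypothetical helper that returns the first item of `x`
// before and after the assignment.
fn reassign() -> (&'static str, &'static str) {
    let mut x: [&'static str; 3] = ["a", "b", "c"];
    let y: [&'static str; 3] = ["1", "2", "3"];
    let before = x[0];
    x = y; // same type on both sides: array of three strings
    (before, x[0])
}

fn main() {
    let (before, after) = reassign();
    print!("{} {}", before, after);
}
```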

Arrays of Explicitly Specified Size

We already saw how to create an array by listing the items initially contained.

If you want to handle many items, instead of writing many expressions, you can write

let mut x = [4.; 5000];

x[2000] = 3.14;

print!("{}, {}", x[1000], x[2000]);

That will print: 4, 3.14.

The first statement declares the x variable as a mutable array of 5000 floating-point numbers, all initially equal to 4. Notice the usage of a semicolon instead of a comma inside the square brackets.

The second statement assigns the value 3.14 to the item at position 2000 of such array.

Finally, the program prints the value at position 1000, which never changed, and the value at position 2000, which has just been changed. Notice that the valid indexes for this array go from 0 to 4999.
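The same statements can be wrapped in a function to check the length and the boundary positions. This is a sketch; the function name filled_array and the f64 annotation in its return type are assumptions added for the illustration.

```rust
// Sketch: an array built from one item and a repeat count.
// `filled_array` is a hypothetical helper.
fn filled_array() -> [f64; 5000] {
    let mut x = [4.; 5000]; // 5000 items, all equal to 4
    x[2000] = 3.14; // change a single item
    x
}

fn main() {
    let x = filled_array();
    // Prints the length, an unchanged item, and the changed item.
    print!("{}, {}, {}", x.len(), x[1000], x[2000]);
}
```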

To scan the items of an array, the for-statement is very useful:

let mut fib = [1; 12];

for i in 2..fib.len() {

    fib[i] = fib[i - 2] + fib[i - 1];

}

for i in 0..fib.len() {

    print!("{}, ", fib[i]);

}

This will print: 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144,.

This program computes the first 12 numbers of the famous Fibonacci sequence and then it prints them. The sequence is defined in this way: the first two numbers are both 1, and every other number is the sum of the two preceding numbers.

Examine the program.

The first statement creates a variable named fib and initializes it using an array of 12 numbers with value 1.

The statement in the following three lines is a for loop, in which the i variable is initialized to 2, and it is incremented up to, but not including, the length of the array. The body of the loop assigns to each item of the array, starting from the third one, the sum of the two preceding items. Given that, when each item is written, the preceding items have already received their correct value, such assignment always uses correct values.

Finally, there is another for loop, to print all the numbers; this time its index starts from 0.
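The computation can be separated from the printing, which makes the result easy to check. The following is a sketch, assuming a hypothetical function named fibonacci, an item type of u64, and the iter function of arrays for the printing loop, none of which appear in the original program.

```rust
// Sketch: the Fibonacci computation from the text, returned as an array.
// `fibonacci` is a hypothetical helper.
fn fibonacci() -> [u64; 12] {
    let mut fib = [1; 12]; // the first two items keep the value 1
    for i in 2..fib.len() {
        // Each item from the third one on is the sum of the two before it.
        fib[i] = fib[i - 2] + fib[i - 1];
    }
    fib
}

fn main() {
    let fib = fibonacci();
    // Scans the items directly, without using an index.
    for value in fib.iter() {
        print!("{}, ", value);
    }
}
```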

Multidimensional Arrays

You can easily write arrays having several dimensions:

let mut x = [[[23; 4]; 8]; 15];

x[14][7][3] = 56;

print!("{}, {}", x[0][0][0], x[14][7][3]);

This will print: 23, 56.

The first statement declares an array of 15 items, each of them being an array of 8 items, each of them being an array of 4 items, each of them initialized with the integer number 23.

The second statement accesses the 15th and last item of the outer array, thereby getting an array; then it accesses the 8th and last item of that array, again getting an array; then it accesses the 4th and last item of that array, thereby getting an item of type integer number; finally, it assigns the integer number 56 to that item.

The third statement prints the content of the very first item and the content of the very last item of the array.

Because multidimensional arrays are no more than arrays of arrays, it is easy to get the sizes of a given array:

let x = [[[0; 4]; 8]; 15];

print!("{}, {}, {}.",

x.len(), x[0].len(), x[0][0].len());

This will print: 15, 8, 4.
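Because each level reports its own length, nested for loops can visit every item of a multidimensional array. The following is a sketch; the function name fill_grid and the formula used to fill the items are made up for the illustration.

```rust
// Sketch: visiting every item of a three-dimensional array.
// `fill_grid` is a hypothetical helper; each item receives a value
// derived from its three indexes, so that it can be recognized later.
fn fill_grid() -> [[[usize; 4]; 8]; 15] {
    let mut x = [[[0; 4]; 8]; 15];
    for i in 0..x.len() {
        for j in 0..x[i].len() {
            for k in 0..x[i][j].len() {
                x[i][j][k] = i * 100 + j * 10 + k;
            }
        }
    }
    x
}

fn main() {
    let x = fill_grid();
    print!("{}, {}", x[0][0][0], x[14][7][3]);
}
```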

A big limitation of arrays is the fact that their size must be defined at compilation time:

let length = 6;

let arr = [0; length];

The compilation of this code generates the error attempt to use a non-constant value in a constant. Indeed, the expression length is a variable, and therefore conceptually it is not a compile-time constant, even if it is immutable and even if it has just been initialized by a constant. The size of an array cannot be an expression containing variables.

This is because, in general, immutable variables can have a value that is computed only at runtime, even if in this case it can be easily computed at compilation time. Instead, the size of an array must always be known by the compiler at compilation time.
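A name can still be used as an array size if it is declared as a compile-time constant rather than as a variable. This is a sketch that assumes the const declaration syntax, which this chapter has not introduced; the names LENGTH and zeros are hypothetical.

```rust
// Sketch: a `const` item is known at compilation time,
// so it is accepted where an array size is expected.
const LENGTH: usize = 6;

// `zeros` is a hypothetical helper returning an array of LENGTH zeros.
fn zeros() -> [i32; LENGTH] {
    [0; LENGTH]
}

fn main() {
    let arr = zeros();
    print!("{}", arr.len());
}
```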

Vectors

To create sequences of objects whose size is defined at runtime, the Rust standard library provides the Vec type, shorthand for vector:

let x = vec!["This", "is"];

print!("{} {}. Length: {}.", x[0], x[1], x.len());

This will print: This is. Length: 2.

The first statement looks like the creation of an immutable array, with the only difference being the vec! prefix.

Such a prefix is an invocation of the vec macro of the standard library.

The effect of such a macro is indeed to create a vector, initially containing the two strings specified between square brackets. Indeed, when len() is invoked on such an object, 2 is returned.

Vectors allow us to do everything that is allowed for arrays, but they also allow us to change their size after they have been initialized:

let mut x = vec!["This", "is"]; print!("{}", x.len());

x.push("a"); print!(" {}", x.len());

x.push("sentence"); print!(" {}", x.len());

x[0] = "That";

for i in 0..x.len() { print!(" {}", x[i]); }

This will print: 2 3 4 That is a sentence.

The first line creates a mutable vector, initially containing two strings, and prints its length, meaning the number of strings contained in the vector.

The second line invokes the push function on the just created vector, and prints its new length. Such a function adds its argument at the end of the vector. To be legal, it must have exactly one argument, and such argument must be of the same type as the items of the vector. The word push is the word commonly used for the operation of adding an item to a stack data structure.

The third line adds another string at the end of the vector, so making the vector contain four items, and it prints the new length of the vector.

The fourth line replaces the value of the first item of the vector. This operation and the two preceding ones are allowed only because the x variable is mutable.

The fifth line scans the four items of the vector, and prints them all on the terminal.
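The steps just described can be gathered in a function that returns the finished vector. This is a sketch; the function name build_sentence and the return type annotation Vec<&'static str> are assumptions added for the illustration.

```rust
// Sketch: growing a vector with `push` and then replacing one item.
// `build_sentence` is a hypothetical helper.
fn build_sentence() -> Vec<&'static str> {
    let mut x = vec!["This", "is"]; // length 2
    x.push("a"); // length 3
    x.push("sentence"); // length 4
    x[0] = "That"; // replaces the first item; the length does not change
    x
}

fn main() {
    let words = build_sentence();
    print!("{}:", words.len());
    for i in 0..words.len() {
        print!(" {}", words[i]);
    }
}
```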

Let’s see another example:

let length = 5000;

let mut y = vec![4.; length];

y[6] = 3.14;

y.push(4.89);

print!("{}, {}, {}", y[6], y[4999], y[5000]);

This will print: 3.14, 4, 4.89.

The second line declares a variable having as value a vector containing a sequence of 5000 items, each one of them having value 4. The length of the sequence is specified by the variable length. This wouldn’t be allowed with an array.

The third line changes the value of the seventh item of the vector.

The fourth line adds a new item at the end of the vector. Because the vector had 5000 items, the new item will get position 5000, that is, the last position of any vector having 5001 items.

Finally, three items are printed: the changed item, the item that was the last before the addition, and the just added item.
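To stress that the length really is a runtime value, the same statements can receive it as a function argument. This is a sketch under assumptions: the function name make_vector is hypothetical, and the function expects a length of at least 7, because it writes the item at position 6.

```rust
// Sketch: a vector whose initial length is decided by the caller at runtime.
// `make_vector` is a hypothetical helper.
fn make_vector(length: usize) -> Vec<f64> {
    let mut y = vec![4.; length]; // `length` items, all equal to 4
    y[6] = 3.14; // panics if `length` is less than 7
    y.push(4.89); // the new item gets position `length`
    y
}

fn main() {
    let y = make_vector(5000);
    print!("{}, {}, {}", y[6], y[4999], y[5000]);
}
```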

So we saw that, unlike arrays, vectors can be created with a length defined at runtime, and their length can be changed at runtime, after initialization. Vectors, like arrays, have generic types, but while the type of every array is defined by two parameters, a type and a length, the type of every vector is defined by a single parameter, the type of its elements. The length of every vector is variable at runtime, so it does not belong to the type, as in Rust all types are defined only at compile time.

Therefore, this program:

let mut _x = vec!["a", "b", "c"];

_x = vec!["X", "Y"];

is valid, as a vector of strings is assigned to a vector of strings, even if they have different lengths. However, this one:

let mut _x = vec!["a", "b", "c"];

_x = vec![15, 16, 17];

is illegal, as a vector of numbers cannot be assigned to a vector of strings.
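The single type parameter of a vector can also be written explicitly. The following sketch assumes the annotation syntax Vec<T>, which this chapter has not introduced; the function name replace_vector is hypothetical.

```rust
// Sketch: the type of a vector names only the item type, not the length,
// so vectors of different lengths can be assigned to the same variable.
// `replace_vector` is a hypothetical helper returning the two lengths.
fn replace_vector() -> (usize, usize) {
    let mut x: Vec<&str> = vec!["a", "b", "c"];
    let before = x.len();
    x = vec!["X", "Y"]; // still a vector of strings, with another length
    (before, x.len())
}

fn main() {
    let (before, after) = replace_vector();
    print!("{} {}", before, after);
}
```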

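As vectors can do all that arrays can do, what’s the need to use arrays? The answer is that arrays are usually more efficient, because their length is fixed at compile time and so no dynamic memory allocation or resizing is needed at runtime. So, if at compile time you know how many items to put in your collection, you usually get a faster program by using an array instead of a vector.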

Those who know the C++ language may have noticed that Rust arrays correspond to C++ std::array objects, while Rust vectors correspond to C++ std::vector objects.