© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
C. MilanesiBeginning Rusthttps://doi.org/10.1007/978-1-4842-7208-4_15

15. Ranges and Slices

Carlo Milanesi1  
(1)
Bergamo, Italy
 
In this chapter, you will learn:
  • How to use closed ranges and open-ended ranges, both right-exclusive and right-inclusive

  • How to define and use slices of arrays, vectors, or other slices, using ranges

The Ranges

We already saw a way to write a for loop:
for i in 0..12 { println!("{}", i); }
But there is another possible way to write it:
let dozen = 0..12;
for i in dozen { println!("{}", i); }

This shows that the 0..12 clause is not a part of the for statement syntax, but it is an expression, whose value can be assigned to a variable; that value can be used in for statements. The type of such a value is named range.

Here is some more code using a range:
let range: std::ops::Range<usize> = 3..8;
println!("{:?}, {}, {}, {}",
    range, range.start, range.end, range.len());
for i in range { print!("{}, ", i); }
This will print:
3..8, 3, 8, 5
3, 4, 5, 6, 7,

In the first line, we see that any range is a concretization of the Range<T> generic type, where T must be an integer type able to represent both extremes of the range.

The second statement prints some values about the variable range. First, the variable itself is printed for debugging, obtaining 3..8; then the values of the two fields’ start and end of range are printed, obtaining, respectively, 3 and 8. This shows that the type Range contains these two fields. Indeed, it does not contain anything else.

Then the len function is invoked, which simply evaluates the expression end - start, so it evaluates 8 - 3 and prints 5.

At last, that range is used in a for loop to scan the values from start included to end excluded. Notice that the iterated values are as many as the value returned by the invocation of len.

The parametric type T of the type Range<T> can be inferred by the two arguments:
let r1 = 3u8..12u8;
let r2 = 3u8..12;
let r3 = 3..12u8;
let r4 = 3..12;
let r5 = -3..12;
let r6 = 3..12 as i64;
print!(
    "{} {} {} {} {} {}",
    std::mem::size_of_val(&r1),
    std::mem::size_of_val(&r2),
    std::mem::size_of_val(&r3),
    std::mem::size_of_val(&r4),
    std::mem::size_of_val(&r5),
    std::mem::size_of_val(&r6));

This will print: 2 2 2 8 8 16.

The r1 variable has both extremes declared as u8, so they have that type, which occupies one byte, and the whole range occupies two bytes.

The r2 and r3 variables have one extreme declared as u8 and the other left unspecified. Therefore it is forced to be u8 too.

The r4 and r5 variables have both extremes of type unspecified, and there is no further constraint on such variables, so they get the default i32 type, which is used for the T parameters of the ranges.

The r6 variable has the second extreme of type i64 and the first extreme unconstrained, so T must be i64.

Notice that all the following statements are illegal:
let r1 = 3u8..12i8;
let r2: std::ops::Range<u32> = -3..12;
let r3: std::ops::Range<i32> = 3i16..12;

In the first statement, the two extremes have different types. (It is a mismatched types error.)

In the second statement, -3 is not a value of type u32. (It is a cannot apply unary operator `-` to type `u32` error.)

In the third statement, 3i16 is not a value of type i32. (It is a mismatched types error.)

Also the following statement is not allowed with default compiler settings, because it generate integer overflow:
let _r = 3u8..1200;

The range will use the first extreme to infer the type as Range<u8>, but when the second extreme is found to be larger than 255, the error literal out of range for `u8` is printed.

The following statements are allowed, even if they are probably nonsensical:
let _r1 = false .. true;
let _r2 = "hello" .. "world";
let _r3 = 4.2 .. 7.9;

Indeed, such absurd ranges cannot be used in a for loop.

Motivation for Slices

Let’s assume you need to create a function that gets as argument an eight-number array, and that returns the smallest number in this array. For such a purpose, you could write this program:
fn min(arr: [i32; 8]) -> i32 {
    let mut minimum = arr[0];
    for i in 1..arr.len() {
        if arr[i] < minimum { minimum = arr[i]; }
    }
    minimum
}
print!("{}", min([23, 17, 12, 16, 15, 28, 17, 30]));
This program will correctly print: 12. However, such min function has some drawbacks:
  1. 1.

    It gets as argument a copy of the whole array, requiring a significant time to transfer it, and occupying a significant stack space and cache space.

     
  2. 2.

    It cannot receive the request to process just a portion of the array.

     
  3. 3.

    It can receive only eight-number arrays. If we would pass an array of seven or nine items, we would get a compilation error.

     
  4. 4.

    It cannot receive a vector as argument.

     
To overcome the first drawback, you can pass the array by reference, instead of by value, using the following code:
fn min(arr: &[i32; 8]) -> i32 {
    let mut minimum = arr[0];
    for i in 1..arr.len() {
        if arr[i] < minimum { minimum = arr[i]; }
    }
    minimum
}
print!("{}", min(&[23, 17, 12, 16, 15, 28, 17, 30]));

Two “&” characters have been added, one where the argument is declared, in the first line; and one where the function is invoked, in the last line. As we have already seen, it is not necessary to change the body of the function, as the arr reference is implicitly dereferenced.

To overcome the second drawback, you could add an argument to specify from which item to start processing, and another argument to specify how many arguments to process:
fn min(arr: &[i32; 8], start: usize, count: usize) -> i32 {
    // Let's assume 'start' is between 0 and 7,
    // and 'count' is between 1 and 8 - start.
    let mut minimum = arr[start];
    for i in start + 1..start + count {
        if arr[i] < minimum { minimum = arr[i]; }
    }
    minimum
}
print!("{}", min(&[23, 17, 12, 16, 15, 28, 17, 30], 3, 2));

This will print: 15. Indeed, it is specified to process two items starting from the one in position 3 (counting from 0). Therefore, only the two items having values 16 and 15 are processed.

However, two drawbacks remain.

Consider that our function needs to know only from which memory address to start processing, how many items to process, and which is the type of the items of the sequence. It is not required to know whether such sequence is a part of a larger sequence, and even less where such larger sequence starts and ends.

In addition, consider that any vector keeps its data in a heap-allocated array, so such a function could process it, once it knew where the items are to process.

The Slices

Considering all this, and to overcome all the cited drawbacks, the concept of slice has been introduced in the language. Its syntax is that of references:
fn min(arr: &[i32]) -> i32 {
    // Let's assume 'arr' is not empty.
    let mut minimum = arr[0];
    for i in 1..arr.len() {
        if arr[i] < minimum { minimum = arr[i]; }
    }
    minimum
}
print!("{}", min(&[23, 17, 12, 16, 15, 28, 17, 30]));

This program will also print 12, but it has the following single difference from the second program in the previous section: in the type of the argument, the “; 8” clause has disappeared. Now the type of the arr argument looks like a reference to an array, but without specifying the size of the array.

This kind of type is a reference to a slice, or slice reference. Its generic form is “&[T],” where T represents any type that can be contained in an array. Here, slice means subsequence of items inside a sequence of items. Such a larger sequence can be an array or a vector buffer. For this purpose, the implementation of a slice reference is a pair of values: the address of the first item of the sequence, and the number of items.

Notice that usually we have variables whose type is slice reference, but the use of a slice type, without considering a reference to it, is quite rare. A slice would have type “[T],” but such type cannot be passed as an argument to a function, because it has a size not defined at compilation time, and a requirement of the arguments of functions is that they have a compile-time defined size. Therefore, we can pass to a function only references to slices, not slices. Such objects are pairs of a pointer and a length, so their memory occupation is exactly twice that of normal references.

Using a slice reference is quite similar to using an array. The main implementation difference is that an invocation of the len function on an array can be optimized away, replacing it by the constant length defined by the type of the array, while an invocation of the len function on a reference to a slice is implemented as an access to the second field of such object.

Actually, in previous chapters we already saw something very similar to slices and to slice references: string buffers, and static strings.

We can build this table of similarities among the types str, &str, String, [u8], &[u8], and Vec<u8>.

Buffer

Reference to a Buffer

Handler of a Dynamically Allocated Buffer

Undefined-length sequence of bytes

(address of beginning, length in bytes)

(address of beginning, length in bytes, number of bytes used)

String buffer: str

Static string: &str

Dynamic string: String

Slice of bytes: [u8]

Reference to slice of bytes: &[u8]

Vector of bytes: Vec<u8>

There are three columns, one for each concept:
  • The first column is for buffers, that is, sequences of bytes having a length not necessarily defined at compile time.

  • The second column is for references to such buffers. They keep two fields: a memory address, and the current length in bytes of the referred buffer.

  • The third column is for objects that can handle (i.e., to allocate or to deallocate) buffers. They keep three fields: the two needed to refer to the allocated buffer, and the number of items currently used in the buffer.

The types in the first column are those having undefined length. They are the string buffers, whose type is str, and the slices of unsigned 8-bit numbers, whose type is [u8].

The types in the second column are the references to the types of the first column. They are the static strings , whose type is &str, and the references to slices of unsigned 8-bit numbers, whose type is &[u8].

The types in the third column are those of the objects that can dynamically allocate buffers. They are the dynamic strings, whose type is String, and the vectors of unsigned 8-bit numbers, whose type is Vec<u8>.

Going back to the last example program, notice that the first argument in the invocation of the min function is the same of the previous program. A reference to an array is still passed as argument. In fact, such an array reference is implicitly converted into a slice reference, using the array address as slice address and the array length as slice length.

Therefore, the last statement of the program passes to the function a structure of two fields: the first is the memory address of the array item containing the number 23, and the second is the number 8.

Using slices, flexibility is much increased. Indeed, it’s now possible to write:
fn min(arr: &[i32]) -> i32 {
    // Let's assume 'arr' is not empty.
    let mut minimum = arr[0];
    for i in 1..arr.len() {
        if arr[i] < minimum { minimum = arr[i]; }
    }
    minimum
}
print!("{} ", min(&[23, 17]));
print!("{}", min(&vec![55, 22, 33, 44]));

This will print: 17 22.

The first invocation passes only two arguments, and 17 is the lesser of them. So, the min function is not limited anymore to process 8-item arrays, but it can process arrays and slices having any nonnegative length.

The second invocation of min shows how our function can process data also belonging to a vector, with no need to copy them. The value passed to the function is a simple reference to a vector, but, because the function argument is of type “reference to slice,” the argument becomes a reference to a slice representing the whole contents of the vector.

So, we have overcome all the drawbacks of the first solution except the second one, which is the need to separately specify which part of the sequence we want to process. Though, even this drawback can be overcome, as explained in the following section.

Slicing

Let’s say we have an array or a vector, for example, the array [23, 17, 12, 16, 15, 2], and a function that gets a slice as argument, for example, the min function seen in the preceding code, and we want to use such a function to process just a portion of our array or vector. Let’s say we want to find the minimum value among the third, fourth, and fifth items of that array.

What we need is a way to forge a slice as a portion of an array or vector, and so pass that slice to the min function, not necessarily the entire array or vector.

The syntax comes quite natural. To get the item of index 2 of an array arr or of a vector v, you’d write, respectively, arr[2] or v[2]. Well, to get all the items of index between 2 (included) and 5 (excluded) of such containers, you’d write, respectively, arr[2..5] or v[2..5].

Here is another use of our ranges!

So the program would be:
fn min(arr: &[i32]) -> i32 {
    // Let's assume 'arr' is not empty.
    let mut minimum = arr[0];
    for i in 1..arr.len() {
        if arr[i] < minimum { minimum = arr[i]; }
    }
    minimum
}
let arr = [23, 17, 12, 16, 15, 2];
let range = 2..5;
let slice_ref = &arr[range];
print!("{}", min(slice_ref));
This will print 12, which is the minimum among 12, 16, and 15. The last four lines can be merged into this one:
print!("{}", min(&[23, 17, 12, 16, 15, 2][2..5]));

The operation of taking a slice from an array or a vector is named slicing.

Notice that, like in the for loop, the upper extreme of the range is excluded from the generated slice. Indeed, specifying the range 2..5, the items included in the range are those in positions 2, 3, and 4, counting from zero.

The slicing operator can be applied to both arrays and vectors, but also to other slices:
let arr = [55, 22, 33, 44, 66, 7, 8];
let v = vec![55, 22, 33, 44, 66, 7, 8];
let sr1 = &arr[2..5];
let sr2 = &v[2..5];
print!("{:?} {:?} {:?} {:?}", sr1, sr2, &sr1[1..2], &sr1[1]);

This will print: [33, 44, 66] [33, 44, 66] [44] 44.

The sr1 variable is a reference to a slice, which refers to the third, fourth, and fifth items of the arr array.

The sr2 variable is a similar reference to a slice, but it refers to items in the v vector.

After having printed the items referred to by these slice references, a slice of the first slice is taken. It refers to the second item of that slice, which is the fourth item of the underlying array.

At last, the second item of sr1 is taken, by simple indexing.

Out-of-Range Slicing

In addition to normal slicing, one could do something even weirder:
let arr = [55, 22, 33, 44, 66];
let _r1 = 4..4; let _a1 = &arr[_r1];
let _r2 = 4..3; //let _a2 = &arr[_r2];
let _r3 = -3i32..2; //let _a3 = &arr[_r3];
let _r4 = 3..8; //let _a4 = &arr[_r4];

In this program, except for the first line, every line declares a range and then tries to use it to slice the array declared in the first line.

All the ranges are valid, but not all the slicing operations are valid, so some statements have been commented out.

The second line is perfectly valid. It gets a slice starting from position 4 and ending before position 4. So it is an empty slice, but empty slices are allowed.

The third line uses a backward slice that ends before starting. That is allowed by the compiler, but it causes a panic at runtime, like an out-of-range array access. The runtime error message, printed on the console, is slice index that starts at 4 but ends at 3. Try removing the comment symbol //, compile, and run to see this error, and then rewrite the comment symbol.

The fourth line uses a range whose extremes are of type i32. That causes a compilation error, because slicing, like sequence indexing, requires the usize type. The error message will be the type `[{integer}]` cannot be indexed by `std::ops::Range<i32>`. It means that our slice of integers, like any slice, cannot be indexed by a range whose extremes are of type i32.

The fifth line uses a range exceeding the size of the sequence. It can be compiled, but it causes a panic with the message range end index 8 out of range for slice of length 5.

Notice that all this has been shown when slicing an array, but it also holds for slicing vectors and for slicing slices.

Mutable Slicing

What does it mean to change the contents of a slice? A slice is a portion of another sequence, so changing its contents means changing the value of one or more items in the underlying sequence:
let mut arr = [11, 22, 33, 44];
{
    let sl_ref = &mut arr[1..3];
    print!("{:?}", sl_ref);
    sl_ref[1] = 0;
    print!(" {:?}", sl_ref);
}
print!(" {:?}", arr);

This will print: [22, 33] [22, 0] [11, 22, 0, 44].

The sl_ref variable is an immutable reference to a mutable slice. Therefore, the reference cannot be changed, but its referred slice can be changed, and that means you can change the value of its items. And so it is done two lines below, by assigning zero to the second item of the slice, which is the third item of the underlying array.

To be able to get a reference to a mutable slice, the underlying sequence must be mutable. This requires the mut clause in the first line.

What does it mean to change a slice reference? A slice reference is a kind of reference, so changing it means to cause it to refer to another portion of sequence, that is, to another portion of the same sequence or to a portion of another sequence:
let arr = [11, 22, 33, 44];
{
    let mut sl_ref = &arr[1..3];
    print!("{:?}", sl_ref);
    sl_ref = &arr[0..1];
    print!(" {:?}", sl_ref);
}
print!(" {:?}", arr);

This will print: [22, 33] [11] [11, 22, 33, 44].

In this program the arr variable is an immutable array, and indeed it will not be changed. The sl_ref variable is a mutable reference to an immutable slice. It is initialized to refer to the second and third items of the array, and then it is changed to refer to the first item of the array.

Open-Ended Ranges and Slicing

Sometimes it is desired to get all the items from the beginning of a sequence to a given position n, or all the items from a given position n to the end of a sequence.

It could be done in this way:
let arr = [11, 22, 33, 44];
let n = 2;
let sr1 = &arr[0..n];
let sr2 = &arr[n..arr.len()];
print!("{:?} {:?}", sr1, sr2);

This will print: [11, 22] [33, 44].

But it is simpler to write it equivalently in this way:
let arr = [11, 22, 33, 44];
let n = 2;
let sr1 = &arr[..n];
let sr2 = &arr[n..];
print!("{:?} {:?}", sr1, sr2);

In the third line, the range has no lower limit; and in the fourth line, the range has no upper limit.

Actually, these ranges have distinct types:
let r1: std::ops::RangeFrom<i32> = 3..;
let r2: std::ops::RangeTo<i32> = ..12;
print!("{:?} {:?} {} {}", r1, r2,
    std::mem::size_of_val(&r1),
    std::mem::size_of_val(&r2));

This will print: 3.. ..12 4 4. The r1 variable is of type RangeFrom, meaning that it has a lower limit but not an upper limit. The r2 variable is of type RangeTo, meaning that it has an upper limit but not a lower limit. Both occupy only four bytes, because they need to store only one i32 object.

RangeFrom values may be used also to specify for loops.
for i in 3.. {
    if i * i > 40 { break; }
    print!("{} ", i);
}

This will print: 3 4 5 6 . The loop starts by assigning to i the value 3, and keeps incrementing it indefinitely, or until some other statement breaks the loop.

There is one last generic type of ranges:
let range: std::ops::RangeFull = ..;
let a1 = [11, 22, 33, 44];
let a2 = &a1[range];
print!("{} {:?} {:?}", std::mem::size_of_val(&range), a1, a2);

This will print: 0 [11, 22, 33, 44] [11, 22, 33, 44].

Any RangeFull stores no information, and so its size is zero. It is used to specify a range exactly as large as the underlying sequence.

Inclusive Ranges

All the types of range we saw in this chapter exclude the upper extreme for their values, and so they are named right-exclusive ranges.

Actually they are the most useful kinds of ranges. Though, sometimes a range is needed that includes the upper extreme. We have seen in Chapter 4 that in addition to the right-exclusive range operator (..), there is a right-inclusive range operator (..=). This can be used to create a right-inclusive range, which can be used for slicing too:
let arr = [11, 12, 13, 14, 15];
let r1: std::ops::RangeInclusive<usize> = 2..=3;
print!("{:?} ", &arr[r1]);
let r2: std::ops::RangeToInclusive<usize> = ..=2;
print!("{:?}", &arr[r2]);

This will print: [13, 14] [11, 12, 13].

The second line declares and initializes a variable of type RangeInclusive<usize>, which contains the extremes 2 and 3; but it means to also represent the upper extreme, so when it is used to slice the array, it extracts the values 13 and 14.

The fourth line declares and initializes a variable of type RangeToInclusive<usize>, which contains only the upper extreme 2; but it means to represent also such extreme, and when it is used to slice the array, it extracts the values 11, 12, and 13.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.223.136.249