© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
C. Milanesi, Beginning Rust, https://doi.org/10.1007/978-1-4842-7208-4_12

12. Data Implementation

Carlo Milanesi1  
(1)
Bergamo, Italy
 
In this chapter, you will learn:
  • How to know how many bytes of stack are taken by objects of various types

  • How to shorten the path to access functions declared in external modules

  • How bits are stored in primitive type objects

  • How to know where an object is stored in memory

  • Why padding can increase the size taken by some objects

  • How vectors are implemented

Discovering the Size of Objects

Given a source file, the Rust compiler is free to generate any machine code, as long as it behaves in the way specified by the Rust language for that source code.

Therefore, for example, given a variable, how many memory bits it uses, and where it is located in memory are not defined. The compiler could even remove that variable from memory because it is never used, or because it is kept in a processor register.

However, it is instructive to see a possible typical implementation of the arrangement of the data used by a Rust program.

For such a purpose, some Rust features are available:
print!("{} ", std::mem::size_of::<i32>());
print!("{} ", std::mem::size_of_val(&12));

This will print: 4 4.

In the first statement, the compiler enters the standard library module std (shorthand for standard), then it enters its submodule mem (shorthand for memory), and then it takes its generic function size_of.

The compiler concretizes such a generic function using the type i32, and then it generates the invocation of the resulting concrete function, without passing any arguments. Such a function returns the number of bytes (octets of bits) occupied by any object of the specified type. For more than four decades, such octets have been unambiguously named bytes. In the early years of computer technology, there were computers for which the name byte referred to shorter sequences of bits; anyway, such very old computer architectures are not supported by Rust.

Typically, such a function invocation will be inlined, so the generated code is just a constant number. Indeed, a 32-bit number occupies 4 bytes.

Notice that you may invoke this function even if the program contains no object of the specified type.

In the second statement, the compiler enters the same library module, but it accesses the generic function size_of_val (meaning size of value). In this case, the type parameter needed to concretize the generic function is inferred from the argument, so it was not required to specify it explicitly. Instead, in the first statement there were no arguments, so the type parameter was required.

When the concretized function size_of_val is invoked, an immutable reference to an object is passed to it. The function returns the size in bytes of that object.

The use Directive

If the path to reach a library function must be specified several times, it is convenient to import all or part of such path into the current scope, using the use directive.

The previous example can be rewritten in this way:
use std::mem;
print!("{} ", mem::size_of::<i32>());
print!("{} ", mem::size_of_val(&12));
or in this way:
use std::mem::size_of;
use std::mem::size_of_val;
print!("{} ", size_of::<i32>());
print!("{} ", size_of_val(&12));

The Rust use keyword is similar to the C++ using keyword.

There is an even more compact form of it:
use std::mem::*;
print!("{} ", size_of::<i32>());
print!("{} ", size_of_val(&12));

The asterisk is a wildcard that causes all the names at that level to be imported.

The Sizes of the Primitive Types

Now let's examine the sizes of objects having a primitive type:
use std::mem::*;
print!("{} {} {} {} {} {} {} {} {} {} {} {} {} {}",
    size_of::<i8>(),
    size_of::<u8>(),
    size_of::<i16>(),
    size_of::<u16>(),
    size_of::<i32>(),
    size_of::<u32>(),
    size_of::<i64>(),
    size_of::<u64>(),
    size_of::<i128>(),
    size_of::<u128>(),
    size_of::<f32>(),
    size_of::<f64>(),
    size_of::<bool>(),
    size_of::<char>());

On any supported computer, this will print: 1 1 2 2 4 4 8 8 16 16 4 8 1 4.

Some other data types have sizes that depend on the target platform of the compiler:
use std::mem::*;
print!("{} {} {} {}",
    size_of::<isize>(),
    size_of::<usize>(),
    size_of::<&i8>(),
    size_of::<&u32>());

In a 64-bit system, this will print: 8 8 8 8, while in a 32-bit system, it will print: 4 4 4 4.

The last two printed values are sizes of references. Independently of the referenced object, a reference (a.k.a. pointer) has the size of a memory address.

The Representation of Primitive Types

Rust discourages accessing the internal representation of objects, so doing it is not easy; but there is a trick:
fn as_bytes<T>(o: &T) -> &[u8] {
    unsafe {
        std::slice::from_raw_parts(
            o as *const _ as *const u8,
            std::mem::size_of::<T>())
    }
}
println!("{:?}", as_bytes(&1i8));
println!("{:?}", as_bytes(&2i16));
println!("{:?}", as_bytes(&3i32));
println!("{:?}", as_bytes(&(4i64 + 5 * 256 + 6 * 256 * 256)));
println!("{:?}", as_bytes(&'A'));
println!("{:?}", as_bytes(&true));
println!("{:?}", as_bytes(&&1i8));
In an x86_64 system, this could print:
[1]
[2, 0]
[3, 0, 0, 0]
[4, 5, 6, 0, 0, 0, 0, 0]
[65, 0, 0, 0]
[1]
[129, 165, 54, 102, 23, 86, 0, 0]

The generic function as_bytes uses some Rust constructs we haven't seen yet; they will not be explained here, because understanding them is not required to understand what the function does. It simply takes a reference to an argument of any type, and returns an object representing the sequence of bytes contained in that argument. By printing such an object, you can see the representation of any object as the sequence of bytes it stores in memory.

First, an i8 having value 1 is stored in a single byte. And that is just the same in any supported hardware architecture.

Then, an i16 having value 2 is stored as a pair of bytes, of which the first one is 2 and the second one is 0. This happens both on 32-bit and 64-bit processors, but only in the so-called little-endian hardware architectures, which are those that store multibyte numbers placing the least significant byte at the lowest address. Instead, a big-endian hardware architecture would have printed [0, 2].

Similar behavior appears in the following printed lines.

Notice that a char is stored as a 32-bit number containing the Unicode value of such character, and a bool is stored as a single byte, which is 1 for true and 0 for false.

Finally, the last statement prints the address of an i8 number. Such an address occupies eight bytes for a 64-bit processor, and differs from run to run.

Location of Bytes in Memory

You can also discover the (virtual) memory location of any object, which is its address:
let b1 = true;
let b2 = true;
let b3 = false;
print!("{} {} {}",
    &b1 as *const bool as usize,
    &b2 as *const bool as usize,
    &b3 as *const bool as usize);

In a 64-bit system, this will print three huge numbers, resembling 140729279188109 140729279188110 140729279188111. But in a 32-bit system, it will print three numbers smaller than 2^32, that is, smaller than about four billion.

Also, the constructs to get such numbers will not be explained here.

However, there is a simpler way to print the address of any object, in hexadecimal notation:
let b1 = true;
let b2 = true;
let b3 = false;
print!("{:p} {:p} {:p}", &b1, &b2, &b3);

The p format specifier is shorthand for pointer.

It will print something like:
0x7ffe16b20c8d 0x7ffe16b20c8e 0x7ffe16b20c8f
Here is a representation of the location of the three objects in the preceding code.

Absolute Address    Binary Value    Variable Name    Type
140729279188109     0000_0001       b1               bool
140729279188110     0000_0001       b2               bool
140729279188111     0000_0000       b3               bool

Each one of the three objects occupies just one byte. The first printed number is the address of the b1 variable; the second one is the address of the b2 variable; and the third one is the address of the b3 variable. As it appears, the three numbers are consecutive, and that means that the three objects are allocated in contiguous virtual memory locations.

The first object contains the true Boolean value, which is represented by the byte 1, consisting of seven bits with value zero and one bit with value one. The second object also contains the true value. Instead, the third object contains the false value, represented by eight bits with value zero.

When allocating single bytes, the Rust compiler usually lays them out sequentially and contiguously, but when allocating larger objects, their position in memory is not easily predictable.

Almost all modern processors require that elementary data be located at particular memory addresses, so Rust places its objects where they are easily accessible by the processor.

The typical alignment rule is this: “Every object of a primitive type must have an address that is a multiple of its own size.”

So, while the objects occupying just one byte can be placed anywhere, the objects occupying two bytes can be placed only at even addresses; the objects occupying four bytes can be placed only at addresses that are divisible by four; and the objects occupying eight bytes can be placed only at addresses that are a multiple of eight.

In addition, larger objects often have addresses that are a multiple of sixteen.

Therefore, such alignment requirements can create unused spaces, the so-called padding.

Sizes of Composite Data Types

The effect of padding appears when there is a sequence of composite objects:
enum E1 { E1a, E1b }
enum E2 { E2a, E2b(f64) }
use std::mem::*;
print!("{} {} {} {} {} {}",
    size_of_val(&[0i16; 80]),
    size_of_val(&(0i16, 0i64)),
    size_of_val(&[(0i16, 0i64); 100]),
    size_of_val(&E1::E1a),
    size_of_val(&E2::E2a),
    size_of_val(&vec![(0i16, 0i64); 100]));

This will print: 160 16 1600 1 16 24.

This means that:
  • An array of 80 16-bit numbers occupies 160 bytes, which is 80 * 2, so there is no waste.

  • A tuple of a 16-bit number and a 64-bit number occupies 16 bytes, as if both numbers occupied 8 bytes, so a padding of 6 bytes has been added.

  • An array of 100 16-byte tuples occupies 1600 bytes, so there is no further padding between array items; the size of each item is simply multiplied by the length of the array.

  • An enum having all its variants without data fields occupies just one byte; as long as there are no more than 256 variants, a single byte can contain a value that identifies a variant; such hidden value is usually named tag.

  • An enum whose largest variant contains an 8-byte number occupies 16 bytes, even if the current value has no data; this is because any value of a variant must occupy the same size, that is, the size of the largest possible value of that enum. In this case, the second variant has a 1-byte tag and an 8-byte numeric field; the address of this field must be a multiple of 8, so a 7-byte padding is added between the tag and the field.

  • A vector of 100 16-byte tuples looks like it occupies just 24 bytes, but of course something is missing from this measure.

Let’s see just the case of the vector.

The data placed in the stack must have a size known at compile time, so arrays can be fully allocated in the stack; while for vectors only a fixed-size header can be placed in the stack, and the remaining data must be allocated in the heap.

Vector Allocation

We saw that vectors must be implemented as a two-object structure: a stack-allocated fixed-size header, and a heap-allocated variable-length buffer.

There are theoretically several possible ways to implement a vector data structure.

One way would be to keep in the header only a pointer to the buffer.

That has the disadvantage that every time the length of the vector is needed, an extra indirection level would be required. The length of a vector is needed quite often, implicitly or explicitly, so it is better to keep that information in the header.

A naive way to implement the buffer would be to size it just large enough to keep the desired data. For example, if a vector of 9 i32 items is requested, a buffer of 9 * 4 bytes would be allocated in the heap.

As long as the vector does not grow, this is good. But if another item is pushed into the vector, the buffer must be reallocated, and heap allocations and deallocations are costly. In addition, the old buffer contents must be copied to the new buffer.

If a 1,000-item vector is constructed, an item at a time, by creating an empty vector and calling the push function 1,000 times, there will be 1,000 heap allocations, 999 heap deallocations, and 1000 * 999 / 2 == 499_500 copies of items.

To improve such awful performance, a larger buffer may be allocated, so that a reallocation is performed only when the buffer is no longer large enough.

Therefore, there is the need to track both the number of places in the allocated buffer and the number of used places in that buffer.

The number of places in the allocated buffer is usually called capacity, and that is also the name of the function used to access it. Here is some code that displays the size of the buffer of a vector:
let mut v = vec![0; 0];
println!("{} {}", v.len(), v.capacity());
v.push(11);
println!("{} {}", v.len(), v.capacity());
v.push(22);
println!("{} {}", v.len(), v.capacity());
v.push(33);
println!("{} {}", v.len(), v.capacity());
v.push(44);
println!("{} {}", v.len(), v.capacity());
v.push(55);
println!("{} {}", v.len(), v.capacity());
This will possibly print:
0 0
1 4
2 4
3 4
4 4
5 8

When an empty vector is created, it contains zero items, and it hasn’t even allocated a heap buffer, so its capacity is also zero.

When the first item is added, the vector object allocates in the heap a buffer capable of containing four 32-bit numbers (i.e., a 16-byte buffer), so its capacity is 4, but it effectively contains just one item, so its length is 1.

When three other items are added to the vector, there is no need to allocate memory, as the preallocated buffer is large enough to contain them.

But when the fifth item is added to the vector, a larger buffer must be allocated. The new buffer has a capacity of 8 items.

Notice that this growing strategy of the buffer is an implementation detail, and it may change in future implementations of the standard library.

The v object stores three subobjects in the stack: a pointer to the heap-allocated buffer, which is a memory address; the capacity of such buffer as a number of items, which is a usize number; and the length of the vector as a number of items, which is a usize number that is less than or equal to the capacity.

For this reason, the header of any vector occupies 3 * 8 == 24 bytes in any 64-bit system, and 3 * 4 == 12 bytes in any 32-bit system.

Let’s see what happens if we add one thousand items to a vector of 32-bit numbers:
let mut v = vec![0; 0];
let mut prev_capacity = usize::MAX;
for i in 0..1_000 {
    let cap = v.capacity();
    if cap != prev_capacity {
        println!("{} {} {}", i, v.len(), cap);
        prev_capacity = cap;
    }
    v.push(1);
}
This could print:
0 0 0
1 1 4
5 5 8
9 9 16
17 17 32
33 33 64
65 65 128
129 129 256
257 257 512
513 513 1024

It will also print the same for vectors whose items are of most other types.

The variable cap stores the current capacity of the vector; the variable prev_capacity stores the previous capacity of the vector, and it is initialized to a huge value.

At each iteration, before adding an item to the vector, it is checked to see if the capacity has changed. Each time the capacity changes, both the number of inserted items and the current capacity are printed.

It appears that the capacity is always a power of two, and all the powers of two are passed, just skipping the value 2. In this way, there are just 9 allocations, 8 deallocations, and 4 + 8 + 16 + 32 + 64 + 128 + 256 + 512 == 1020 copies. So, the actual standard library algorithm is much more efficient than the naive version described at the beginning of this section.

In general, filling a vector by pushing one item at a time causes a number of allocations and deallocations that is logarithmic in the total number of items; for a rather large number of items, that is much smaller than the number of items.

So, adding one item at a time is quite efficient. However, if you can estimate the number of items that the vector will contain, you can improve efficiency further by preallocating the required buffer, as this code does:
let mut v = Vec::with_capacity(800);
let mut prev_capacity = usize::MAX;
for i in 0..1_000 {
    let cap = v.capacity();
    if cap != prev_capacity {
        println!("{} {} {}", i, v.len(), cap);
        prev_capacity = cap;
    }
    v.push(1);
}
It will print:
0 0 800
801 801 1600

This program is identical to the previous one, except for the first line. The function Vec::with_capacity creates a vector with a preallocated buffer, large enough to contain the specified number of items. Here a capacity of 800 items is specified, so no allocation happens after the initialization until that capacity is exceeded. When the 801st item is pushed, the capacity is bumped to 1600.
