By the time you get this note, we’ll no longer be alive, we’ll have all gone up in smoke, there’ll be no way to reply.
They Might Be Giants
In Chapter 1, you wrote programs that always behave the same way.
In this chapter, you will learn how to use arguments from the user to change the behavior of the program at runtime.
The challenge program you’ll write in this chapter is a clone of echo
, which will print its arguments on the command line, optionally terminated with a newline.
In this chapter, you’ll learn:
How to process command-line arguments with the clap
crate
About STDOUT
and STDERR
About Rust types like strings, vectors, slices, and the unit type
How to use expressions like match
, if
, and return
How to use Option
to represent an optional value
How to handle errors using the Result
variants of Ok
and Err
How to exit a program and signal success or failure
The difference between stack and heap memory
This challenge program will be called echor
for echo
plus r
for Rust.
(I can’t decide if I pronounce this like eh-core or eh-koh-ar.)
I’m hoping you have created a directory to hold all the projects you’ll write in this book.
Change into that directory and use Cargo to start your crate, which is a Rust binary program or library:
$ cargo new echor
     Created binary (application) `echor` package
You should see a familiar directory structure:
$ tree echor/
echor/
├── Cargo.toml
└── src
    └── main.rs

1 directory, 2 files
Go into the new echor directory and use Cargo to run the program:
$ cd echor
$ cargo run
Hello, world!
You’ve already seen this program in Chapter 1, but I’d like to point out a couple more things about the code:
$ cat src/main.rs
fn main() {
    println!("Hello, world!");
}
As you saw in Chapter 1, Rust will start the program by executing the main
function in src/main.rs.
Any arguments to the function are contained in the parentheses after the function name, so the empty parentheses here indicate the function takes no arguments.
All functions return a value, and the return type may be indicated with an arrow and the type such as -> u32
to say the function returns an unsigned 32-bit integer.
The lack of any return type for main
implies that the function returns what Rust calls the unit type.
Note that the println!
macro will automatically append a newline to the output, which is a feature I’ll need to control when the user requests no terminating newline.
The unit type is like an empty value and is signified with a set of empty parentheses ().
The documentation says this “is used when there is no other meaningful value that could be returned.” It’s not quite like a null pointer or undefined value in other languages, a concept first introduced by Tony Hoare (no relation to Rust creator Graydon Hoare), who called the null reference his “billion-dollar mistake.” Since Rust does not (normally) allow you to dereference a null pointer, it must logically be worth at least a billion dollars.
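As a minimal sketch of the difference between a declared return type and the unit type, consider the following. The function names double and greet are hypothetical and exist only for this illustration:

```rust
// Returns an unsigned 32-bit integer, as declared after the arrow.
fn double(n: u32) -> u32 {
    n * 2
}

// No return type is declared, so this function returns the unit type `()`.
fn greet() {
    println!("Hello, world!");
}

fn main() {
    let doubled: u32 = double(21);
    // The result of `greet` can be bound, but its only possible value is `()`.
    let nothing: () = greet();
    println!("{} {:?}", doubled, nothing);
}
```

The type annotation on nothing is there only to make the unit type visible; in practice you would simply call greet() and ignore the result.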
The echo
program is blissfully simple.
I’ll review how it works so you can consider all the features your version will need.
To start, echo
will print its arguments to STDOUT
:
$ echo Hello Hello
I’m using the bash
shell which assumes that any number of spaces delimit the arguments, so arguments that have spaces must be enclosed in quotes.
In the following command, I’m providing four words as a single argument:
$ echo "Rust has assumed control" Rust has assumed control
Without the quotes, I’m providing four separate arguments.
Note that I use a varying number of spaces when I provide the arguments, but echo
prints them using a single space between each argument:
$ echo Rust  has assumed   control
Rust has assumed control
If I want the spaces to be preserved, I must enclose them in quotes:
$ echo "Rust  has assumed   control"
Rust  has assumed   control
It’s extremely common for command-line programs to respond to the flags -h
or --help
to print a message about how to use the program, the so-called usage because that is usually the first word of the output.
If I try that with echo
, I just get back the text of the flag:
$ echo --help --help
To understand more about the program, execute man echo
to read the manual page.
You’ll see that I’m using the BSD version of the program from 2003:
ECHO(1)                   BSD General Commands Manual                  ECHO(1)

NAME
     echo -- write arguments to the standard output

SYNOPSIS
     echo [-n] [string ...]

DESCRIPTION
     The echo utility writes any specified operands, separated by single blank
     (' ') characters and followed by a newline ('\n') character, to the stan-
     dard output.

     The following option is available:

     -n    Do not print the trailing newline character.  This may also be
           achieved by appending '\c' to the end of the string, as is done by
           iBCS2 compatible systems.  Note that this option as well as the
           effect of '\c' are implementation-defined in IEEE Std 1003.1-2001
           (''POSIX.1'') as amended by Cor. 1-2002.  Applications aiming for
           maximum portability are strongly encouraged to use printf(1) to
           suppress the newline character.

     Some shells may provide a builtin echo command which is similar or iden-
     tical to this utility.  Most notably, the builtin echo in sh(1) does not
     accept the -n option.  Consult the builtin(1) manual page.

EXIT STATUS
     The echo utility exits 0 on success, and >0 if an error occurs.

SEE ALSO
     builtin(1), csh(1), printf(1), sh(1)

STANDARDS
     The echo utility conforms to IEEE Std 1003.1-2001 (''POSIX.1'') as
     amended by Cor. 1-2002.

BSD                             April 12, 2003                             BSD
By default, the text printed on the command line is terminated by a newline character.
As shown in the preceding manual page, echo
has a single option -n
that will omit the newline.
Depending on the version of echo
you have, this may not appear to affect the output.
For instance, the BSD version I’m using shows this:
$ echo -n Hello Hello $
While the GNU version on Linux shows this:
$ echo -n Hello Hello$
Regardless of which version of echo you have, I can use the bash redirect operator > to send STDOUT to a file:
$ echo Hello > hello $ echo -n Hello > hello-n
The diff
tool will display the differences between two files.
This output shows that the second file (hello-n) does not have a newline at the end:
$ diff hello hello-n
1c1
< Hello
---
> Hello
\ No newline at end of file
The first order of business is getting the command-line arguments to print.
In Rust you can use std::env::args
for this.
The std
tells me this is in the standard library, which is Rust code so universally useful that it is included with the language.
The env
part tells me this is for interacting with the environment, which is where the program will find the arguments.
If you look at the documentation for the function, you’ll see it returns something of the type Args
:
pub fn args() -> Args
If you follow the link for the Args
documentation, you’ll find it is a struct, which is a kind of data structure in Rust.
If you look along the left-hand side of the page, you’ll see things like trait implementations, other related structs, functions, and more.
I’ll explore these ideas later, but for now, just poke around the docs and try to absorb what you see.
Edit src/main.rs to print the arguments. I can call the function by using the full path followed by an empty set of parentheses:
fn main() {
    println!(std::env::args()); // This will not work
}
When I execute the program using cargo run
, I see the following error:
$ cargo run Compiling echor v0.1.0 (/Users/kyclark/work/sysprog-rust/playground/echor) error: format argument must be a string literal --> src/main.rs:2:14 | 2 | println!(std::env::args()); // This will not work | ^^^^^^^^^^^^^^^^ | help: you might be missing a string literal to format with | 2 | println!("{}", std::env::args()); // This will not work | ^^^^^
Here is my first spat with the compiler.
It’s saying that it cannot directly print the value that is returned from that function, but it’s also telling me exactly how to fix the problem.
It wants me to first provide a literal string that has a set of curly brackets {}
that will serve as a placeholder for the printed value, so I change my code accordingly:
fn main() {
    println!("{}", std::env::args()); // This will not work either
}
Run the program again and see that we’re not out of the woods yet. Note that I will omit the “Compiling” and other lines to focus on the important output:
$ cargo run error[E0277]: `Args` doesn't implement `std::fmt::Display` --> src/main.rs:2:20 | 2 | println!("{}", std::env::args()); // This will not work either | ^^^^^^^^^^^^^^^^ `Args` cannot be formatted with the default formatter | = help: the trait `std::fmt::Display` is not implemented for `Args` = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead = note: required by `std::fmt::Display::fmt` = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)
There’s a lot of information in that compiler message.
First off, there’s something about the trait std::fmt::Display
not being implemented for Args
.
A trait in Rust is a way to define the behavior of an object in an abstract way.
If an object implements the Display
trait, then it can be formatted for “user-facing output.”
Look again at the “Trait Implementations” section of the Args
documentation and notice that, indeed, Display
is not mentioned there.
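To make the idea of implementing a trait more concrete, here is a minimal sketch of a type that does implement Display. The Point struct is hypothetical and has nothing to do with Args; it only illustrates what the compiler is looking for when it sees {} in a format string:

```rust
use std::fmt;

// `Point` is a hypothetical type used only for this illustration.
struct Point {
    x: i32,
    y: i32,
}

// Implementing Display tells Rust how to produce user-facing output for Point.
impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

fn main() {
    let p = Point { x: 1, y: 2 };
    // This compiles because Point implements Display.
    println!("{}", p);
}
```

Without the impl block, the println! call would fail with the same kind of E0277 error shown for Args.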
The compiler suggests I should use {:?}
instead of {}
for the placeholder.
This is an instruction to print a Debug
version of the structure, which will “format the output in a programmer-facing, debugging context.”
Refer again to the Args
documentation to see that Debug
is listed under “Trait Implementations,” so this placeholder works:
fn main() {
    println!("{:?}", std::env::args()); // Success at last!
}
Now the program compiles and prints something vaguely useful:
$ cargo run Args { inner: ["target/debug/echor"] }
If you are unfamiliar with command-line arguments, it’s common for the first value to be the path of the program itself. It’s not an argument per se, but it is useful information. One thing to note is that this program was compiled into the path target/debug/echor. Unless you state otherwise, this is where Cargo will place the executable, also called a binary. Next, I’ll pass some arguments:
$ cargo run Hello world Args { inner: ["target/debug/echor", "Hello", "world"] }
Huzzah!
It would appear that I’m able to get the arguments to my program.
I passed two arguments, Hello and world, and they showed up as additional values after the binary name.
I know I’ll need to pass the -n
flag, so let me try that:
$ cargo run Hello world -n Args { inner: ["target/debug/echor", "Hello", "world", "-n"] }
It’s also common to place the flag before the values, so let me try that:
$ cargo run -n Hello world error: Found argument '-n' which wasn't expected, or isn't valid in this context USAGE: cargo run [OPTIONS] [--] [args]... For more information try --help
That doesn’t work because Cargo thinks the -n
argument is for itself, not the program I’m running.
To fix this, I need to separate Cargo’s options using two dashes:
$ cargo run -- -n Hello world Args { inner: ["target/debug/echor", "-n", "Hello", "world"] }
In the parlance of program parameters, the -n
is an optional argument because you can leave it out.
Typically program options start with one or two dashes.
It’s common to have short names with one dash and a single character like -h
for the help flag and long names with two dashes and a word like --help
.
Specifically, -n
and -h
are flags that have one meaning when present and the opposite when absent.
In this case, -n
says to omit the trailing newline; otherwise, print as normal.
All the other arguments to echo
are positional because their position relative to the name of the program (the first element in the arguments) determines their meaning.
Consider the command chmod
that takes two positional arguments, a mode like 755
first and a file or directory name second.
In the case of echo
, all the positional arguments are interpreted as the text to print, and they should be printed in the same order they are given.
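Before reaching for a library, it may help to see roughly what separating the flag from the positional arguments looks like by hand. This is only a sketch under simplifying assumptions: parse_args is a hypothetical helper, and it treats -n as the flag wherever it appears, which is looser than what a real echo does:

```rust
// Split raw arguments into the `-n` flag and the positional text.
// `parse_args` is a hypothetical helper, not part of the echor program.
fn parse_args<I: Iterator<Item = String>>(args: I) -> (bool, Vec<String>) {
    let mut omit_newline = false;
    let mut text = Vec::new();
    // Skip the first value, which is the path of the program itself.
    for arg in args.skip(1) {
        if arg == "-n" {
            omit_newline = true;
        } else {
            // Everything else is positional and keeps its original order.
            text.push(arg);
        }
    }
    (omit_newline, text)
}

fn main() {
    let (omit_newline, text) = parse_args(std::env::args());
    println!("omit_newline = {}, text = {:?}", omit_newline, text);
}
```

Even this small version has no usage message, no error for unknown flags, and no help output, which is the kind of work an argument-parsing crate takes over.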
This is not a bad start, but the arguments to the programs in this book are going to become much more complex.
I need to find a better way to parse the program’s arguments.
Although there are various methods and crates for parsing command-line arguments, I will exclusively use the clap
crate in this book.
To get started, I need to tell Cargo that I want to download this crate and use it in my project.
I can do this by adding a dependency to Cargo.toml.
Edit your file to add clap
version 2.33 to the [dependencies]
section:
$ cat Cargo.toml
[package]
name = "echor"
version = "0.1.0"
edition = "2018"

[dependencies]
clap = "2.33"
The crate name is not quoted. The equal sign indicates the version of the crate, which should be quoted.
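The version “2.33” is shorthand for a caret requirement: Cargo may select any compatible release that is at least 2.33.0 and below 3.0.0, so a newer 2.33.x patch release may be what actually gets built. I could use just “2” to indicate that I’m fine using the latest version in the “2.x” line, or “=2.33.0” to pin exactly that release. There are many other ways to indicate the version, and I recommend you read how to specify dependencies.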
The next time I try to build the program, Cargo will download the clap
source code (if needed) and all of its dependencies.
For instance, I can run cargo build
to just build the new binary and not run it:
$ cargo build Updating crates.io index Compiling libc v0.2.93 Compiling bitflags v1.2.1 Compiling unicode-width v0.1.8 Compiling vec_map v0.8.2 Compiling strsim v0.8.0 Compiling ansi_term v0.11.0 Compiling textwrap v0.11.0 Compiling atty v0.2.14 Compiling clap v2.33.3 Compiling echor v0.1.0 (/Users/kyclark/work/sysprog-rust/playground/echor) Finished dev [unoptimized + debuginfo] target(s) in 27.30s
You may be curious where these packages went.
Cargo places the downloaded source code into $HOME/.cargo
, and the build artifacts go into the target/debug/deps directory.
This brings up an interesting part of building Rust projects: Each program you build can use different versions of crates, and each program is built in a separate directory.
If you have ever suffered through using shared modules as is common with Perl and Python, you’ll appreciate that you don’t have to worry about conflicts where one program requires some old obscure version and another requires the latest bleeding-edge version in GitHub.
Python, of course, offers virtual environments to combat this problem, and other languages have similar solutions.
Still, I find Rust’s approach to be quite comforting.
A consequence of Rust placing the dependencies into the target directory is that it’s now quite large.
I’m already at close to 30MB for the project directory, with almost all of that living in the target directory.
If you run cargo help
, you will see that the clean
command will remove the target directory at the expense of having to recompile again in the future.
You might do this to reclaim disk space if you aren’t going to work on the project for a while.
To learn how to use clap
to parse the arguments, I need to read the documentation.
I like to use Docs.rs, “an open source documentation host for crates of the Rust Programming Language.”
I can follow examples from the documentation that show how to create a new App
struct.
Change your src/main.rs to the following:
use clap::App;

fn main() {
    let _matches = App::new("echor")
        .version("0.1.0")
        .author("Ken Youens-Clark <kyclark@gmail.com>")
        .about("Rust echo")
        .get_matches();
}
Import the clap::App
struct.
Create a new App
with the name echor.
Use semantic version numbers as described earlier.
Your name and email address so people know where to send the money.
This is a short description of the program.
Ask the App
to parse the arguments.
In the preceding code, the leading underscore in the variable name _matches
is functional. It tells the Rust compiler that I do not intend to use this variable right now. Without the underscore, the compiler would warn about an unused variable.
With this code in place, I can run the program with the -h
or --help
flags to get a usage document.
Note that I didn’t have to define this argument as clap
did this for me:
$ cargo run -- -h
echor 0.1.0
Ken Youens-Clark <kyclark@gmail.com>
Rust echo

USAGE:
    echor

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information
The app name and version
number appear here.
Here is the author
information.
This is the about
text.
In addition to the help flags, I see that clap
also automatically handles the flags -V
and --version
to print the program’s version:
$ cargo run -- --version echor 0.1.0
Next, I need to define the parameters, which I can do by adding Arg structs to the App:
use clap::{App, Arg};

fn main() {
    let matches = App::new("echor")
        .version("0.1.0")
        .author("Ken Youens-Clark <kyclark@gmail.com>")
        .about("Rust echo")
        .arg(
            Arg::with_name("text")
                .value_name("TEXT")
                .help("Input text")
                .required(true)
                .min_values(1),
        )
        .arg(
            Arg::with_name("omit_newline")
                .help("Do not print newline")
                .takes_value(false)
                .short("n"),
        )
        .get_matches();

    println!("{:#?}", matches);
}
Import both the App
and Arg
structs from the clap
crate.
Create a new Arg
with the name text
. This is a required positional argument that must appear at least once and can be repeated.
Create a new Arg
with the name omit_newline
. This is a flag that has only the short name -n
and takes no value.
Parse the arguments and return the matching elements.
Pretty-print the arguments.
Earlier I used {:?}
to format the debug view of the arguments. Here I’m using {:#?}
to include newlines and indentations to help me read the output. This is called pretty-printing because, well, it’s prettier.
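The difference between the two placeholders is easy to see on a small value. The following sketch uses a plain vector of string literals rather than the clap structure:

```rust
fn main() {
    let text = vec!["Hello", "world"];
    // Compact debug output on a single line.
    println!("{:?}", text);
    // Pretty-printed debug output, one element per line with indentation.
    println!("{:#?}", text);
}
```

The first call prints the whole vector on one line, while the second spreads the elements over several indented lines, which becomes far more valuable for deeply nested structures.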
If you request the usage again, you will see the new parameters:
$ cargo run -- --help
echor 0.1.0
Ken Youens-Clark <kyclark@gmail.com>
Rust echo

USAGE:
    echor [FLAGS] <TEXT>...

FLAGS:
    -h, --help       Prints help information
    -n               Do not print newline
    -V, --version    Prints version information

ARGS:
    <TEXT>...    Input text
The -n
flag to omit the newline is optional.
The required input text is one or more positional arguments.
Run the program with some arguments and inspect the structure of the arguments:
$ cargo run -- -n Hello world
ArgMatches {
    args: {
        "text": MatchedArg {
            occurs: 2,
            indices: [
                2,
                3,
            ],
            vals: [
                "Hello",
                "world",
            ],
        },
        "omit_newline": MatchedArg {
            occurs: 1,
            indices: [
                1,
            ],
            vals: [],
        },
    },
    subcommand: None,
    usage: Some(
        "USAGE: echor [FLAGS] <TEXT>...",
    ),
}
If you run the program with no arguments, you will get an error indicating that you failed to provide the required arguments:
$ cargo run error: The following required arguments were not provided: <TEXT>... USAGE: echor [FLAGS] <TEXT>... For more information try --help
This was an error, and so you can inspect the exit value to verify that it’s not 0:
$ echo $? 1
If you try to provide any argument that isn’t defined, it will trigger an error and a nonzero exit value:
$ cargo run -- -x error: Found argument '-x' which wasn't expected, or isn't valid in this context USAGE: echor [FLAGS] <TEXT>... For more information try --help
You might wonder how this magical stuff is happening. Why is the program stopping and reporting these errors? If you read the documentation for App::get_matches
, you’ll see that “upon a failed parse an error will be displayed to the user and the process will exit with the appropriate error code.”
There’s another subtle thing happening with the output that is not at all obvious.
The usage and error messages are all appearing on STDERR
(pronounced standard error), which is another channel of Unix output.
To see this in the bash
shell, I can redirect channel 1
(STDOUT
) to a file called out and channel 2
(STDERR
) to a file called err:
$ cargo run 1>out 2>err
You should see no results from that command because all the output was redirected to files.
The out file should be empty because there was nothing printed to STDOUT
, but the err file should contain the output from Cargo and the error messages from the program:
$ cat err Finished dev [unoptimized + debuginfo] target(s) in 0.01s Running `target/debug/echor` error: The following required arguments were not provided: <TEXT>... USAGE: echor [FLAGS] <TEXT>... For more information try --help
So you see that another hallmark of well-behaved systems programs is to print regular output to STDOUT
and error messages to STDERR
.
Sometimes errors are severe enough that you should halt the program, but sometimes they should just be noted in the course of running.
For instance, in Chapter 3 you will write a program that processes input files, some of which will intentionally not exist or will be unreadable.
I will show you how to print warnings to STDERR
about these files and skip to the next argument.
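As a rough sketch of that division of labor, the following function sends regular output to one writer and warnings to another. The name report and the rule that anything starting with "missing" is unreadable are hypothetical stand-ins for real file handling; the function takes generic writers so the two channels can be checked separately:

```rust
use std::io::{self, Write};

// Write each usable name to `out` and a warning for the others to `err`.
// `report` and the "missing" check are hypothetical stand-ins for real file handling.
fn report<O: Write, E: Write>(names: &[&str], out: &mut O, err: &mut E) -> io::Result<()> {
    for name in names {
        if name.starts_with("missing") {
            // Note the problem and move on to the next argument.
            writeln!(err, "warning: cannot read {}", name)?;
        } else {
            writeln!(out, "{}", name)?;
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let names = ["a.txt", "missing.txt", "b.txt"];
    // Regular output goes to STDOUT and warnings go to STDERR.
    report(&names, &mut io::stdout(), &mut io::stderr())
}
```

For quick programs, the println! and eprintln! macros write to the same two channels without any of this plumbing.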
My next step is to use the values provided by the user to create the program’s output.
It’s common to copy the values out of the matches
into variables.
To start, I want to extract the text
argument.
Because this Arg
was defined to accept one or more values, I can use either of these functions that return multiple values:
ArgMatches::values_of
: returns Option<Values>
ArgMatches::values_of_lossy
: returns Option<Vec<String>>
To decide which to use, I have to run down a few rabbit holes to understand the following concepts:
Option
: a value that is either None
or Some(T)
where T
is any type like a string or an integer. In the case of ArgMatches::values_of_lossy
, the type T
will be a vector of strings.
Values
: An iterator for getting multiple values out of an argument.
Vec
: A vector, which is a contiguous growable array type.
String
: A string of characters.
Both of the functions ArgMatches::values_of
and ArgMatches::values_of_lossy
will return an Option
of something.
Since I ultimately want to print the strings, I will use the ArgMatches::values_of_lossy
function to get an Option<Vec<String>>
.
The Option::unwrap
function will take the value out of Some(T)
to get at the payload T
.
Because the text
argument is required by clap
, I know it will be impossible to have None
; therefore, I can safely call Option::unwrap
to get the Vec<String>
value:
let text = matches.values_of_lossy("text").unwrap();
If you call Option::unwrap
on a None
, it will cause a panic that will crash your program. You should only call unwrap
if you are positive the value is the Some
variant.
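When you cannot be sure a value is present, there are safer alternatives to unwrap. The following sketch, built around a hypothetical first_word helper, shows a match that handles both variants and Option::unwrap_or to supply a fallback:

```rust
// Return the first value if there is one.
// `first_word` is a hypothetical helper used only for illustration.
fn first_word(words: &[String]) -> Option<&String> {
    words.first()
}

fn main() {
    let words = vec![String::from("Hello"), String::from("world")];
    // Handle both variants explicitly instead of calling unwrap.
    match first_word(&words) {
        Some(word) => println!("first = {}", word),
        None => println!("no words given"),
    }

    let empty: Vec<String> = Vec::new();
    // Supply a fallback value to use when the Option is None.
    let fallback = String::from("(nothing)");
    println!("{}", first_word(&empty).unwrap_or(&fallback));
}
```

Neither approach can panic, because the None case is always accounted for.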
The omit_newline
argument is a bit easier as it’s either present or not.
The type of this value will be a bool
or Boolean, which is either true
or false
:
let omit_newline = matches.is_present("omit_newline");
Finally, I want to print the values.
Because text
is a vector of strings, I can use Vec::join
to join all the strings on a single space into a new string to print.
Inside the echor
program, clap
will be creating the vector.
To demonstrate how Vec::join
works, I’ll show you how to create a vector using the vec!
macro:
let text = vec!["Hello", "world"];
The values in Rust vectors must all be of the same type. Dynamic languages often allow lists to mix types like strings and numbers, but Rust will complain about “mismatched types.” Here I want a list of literal strings which must be enclosed in double quotes. The str
type in Rust represents a valid UTF-8 string. I’ll have more to say about UTF in Chapter 4.
Vec::join
will insert the given string between all the elements of the vector to create a new string.
I can use println!
to print the new string to STDOUT
followed by a newline:
println!("{}", text.join(" "));
It’s common practice in Rust documentation to present facts using assert!
to say that something is true
or assert_eq!
to demonstrate that one thing is equivalent to another.
In the following code, I can assert that the result of text.join(" ")
is equal to the string "Hello world"
:
assert_eq!(text.join(" "), "Hello world");
When the -n
flag is present, the output should omit the newline.
I will instead use the print!
macro which does not add a newline, and I will choose to add either a newline or the empty string depending on the value of omit_newline
.
Depending on your background, you might try to write something like this:
let ending = "\n";
if omit_newline {
    ending = ""; // This will not work
}
print!("{}{}", text.join(" "), ending);
Assign a default value.
Change the value if the newline should be omitted.
Use print!
which will not add a newline to the output.
If I try to run this code, Rust tells me that I cannot reassign the value of ending
:
$ cargo run -- Hello world
error[E0384]: cannot assign twice to immutable variable `ending`
  --> src/main.rs:27:9
   |
25 |     let ending = "\n";
   |         ------
   |         |
   |         first assignment to `ending`
   |         help: make this binding mutable: `mut ending`
26 |     if omit_newline {
27 |         ending = ""; // This will not work
   |         ^^^^^^^^^^^ cannot assign twice to immutable variable
Something that really sets Rust apart from other languages is that variables are immutable by default, meaning they can’t be altered from their initial value.
I’m not allowed to reassign the value of the variable ending
; however, the compiler tells me to add mut
to make the variable mutable:
let mut ending = "\n";
if omit_newline {
    ending = "";
}
print!("{}{}", text.join(" "), ending);
There’s a much better way to write this.
In Rust, if
is an expression and not a statement.
An expression returns a value, but a statement does not.
Here’s a more Rustic way to write this:
let ending = if omit_newline { "" } else { "\n" };
An if
without an else
will return the unit type. The same is true for a function without a return type, so the main
function returns ()
.
Since I only use ending
in one place, I don’t need to assign it to a variable.
Here is how I would update the main
function:
fn main() {
    let matches = ...; // Same as before
    let text = matches.values_of_lossy("text").unwrap();
    let omit_newline = matches.is_present("omit_newline");
    print!("{}{}", text.join(" "), if omit_newline { "" } else { "\n" });
}
With these changes, the program appears to work correctly; however, I’m not willing to stake my reputation on this.
I need to, as the Russian saying goes, “Доверяй, но проверяй” (“Trust, but verify”).
This requires that I write some tests to run my program with various inputs and verify that it produces the same output as the original echo
program.
In Chapter 1, I showed how to create integration tests that run the program from the command line just as the user will do to ensure it works correctly.
In this chapter, I’ll also show you how to write unit tests that exercise individual functions, which might be considered a unit of programming.
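To give a sense of what a unit test looks like, here is a minimal sketch. It assumes the output logic has been pulled into a hypothetical helper called render, which is not part of the program shown so far; the tests module then checks that function directly without running the binary:

```rust
// Build the output string; `render` is a hypothetical helper for illustration.
fn render(text: &[String], omit_newline: bool) -> String {
    let ending = if omit_newline { "" } else { "\n" };
    format!("{}{}", text.join(" "), ending)
}

fn main() {
    let text = vec![String::from("Hello"), String::from("world")];
    print!("{}", render(&text, false));
}

// This module is compiled only when running `cargo test`.
#[cfg(test)]
mod tests {
    use super::render;

    #[test]
    fn adds_newline_by_default() {
        let text = vec![String::from("Hello"), String::from("there")];
        assert_eq!(render(&text, false), "Hello there\n");
    }

    #[test]
    fn omits_newline_when_asked() {
        let text = vec![String::from("Hello"), String::from("there")];
        assert_eq!(render(&text, true), "Hello there");
    }
}
```

Separating the logic from the printing is a design choice: a function that returns a String is much easier to test than one that writes directly to STDOUT.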
To get started, you should add the following dependencies to Cargo.toml.
Note that I’m adding predicates
to this project:
[dev-dependencies]
assert_cmd = "1"
predicates = "1"
I often write tests that ensure my programs fail when run incorrectly. For instance, this program ought to fail and print help documentation when provided no arguments. Create a tests directory, and then create tests/cli.rs with the following:
use assert_cmd::Command;
use predicates::prelude::*;

#[test]
fn dies_no_args() {
    let mut cmd = Command::cargo_bin("echor").unwrap();
    cmd.assert()
        .failure()
        .stderr(predicate::str::contains("USAGE"));
}
Import the predicates
crate.
Run the program with no arguments and assert that it fails and prints a usage statement to STDERR
.
I usually name these sorts of tests with the prefix dies so that I can run them all with cargo test dies
to ensure the program fails under various conditions.
I can also add a test to ensure the program exits successfully when provided an argument:
#[test]
fn runs() {
    let mut cmd = Command::cargo_bin("echor").unwrap();
    cmd.arg("hello").assert().success();
}
I can now run cargo test
to verify that I have a program that runs, validates user input, and prints usage.
Next, I would like to ensure that the program creates the same output as echo
.
To start, I need to capture the output from the original echo
for various inputs so that I can compare these to the output from my program.
In the 02_echor directory of my GitHub repository, you’ll find a bash
script called mk-outs.sh that I used to generate the output from echo
for various arguments.
You can see that, even with such a simple tool, there’s still a decent amount of cyclomatic complexity, which refers to the various ways all the parameters can be combined.
I need to check one or more text arguments both with and without the newline option:
$ cat mk-outs.sh
#!/usr/bin/env bash

OUTDIR="tests/expected"
[[ ! -d "$OUTDIR" ]] && mkdir -p "$OUTDIR"

echo "Hello there" > $OUTDIR/hello1.txt
echo "Hello"  "there" > $OUTDIR/hello2.txt
echo -n "Hello  there" > $OUTDIR/hello1.n.txt
echo -n "Hello" "there" > $OUTDIR/hello2.n.txt
The “shebang” line tells the operating system to use the environment to execute bash
for the following code.
Define a variable for the output directory.
Test if the output directory does not exist and create it if needed.
One argument with two words.
Two arguments separated by more than one space.
One argument with two spaces and no newline.
Two arguments with no newline.
If you are working on a Unix platform, you can copy this program to your project directory and run it like so:
$ bash mk-outs.sh
It’s also possible to execute the program directly, but you may need to execute chmod +x mk-outs.sh
if you get a permission denied error:
$ ./mk-outs.sh
If this worked, you should now have a tests/expected directory with the following contents:
$ tree tests tests ├── cli.rs └── expected ├── hello1.n.txt ├── hello1.txt ├── hello2.n.txt └── hello2.txt 1 directory, 5 files
If you are working on a Windows platform, then I recommend you copy this directory structure into your project. Now you should have some test files to use in comparing the output from your program.
The first output file was generated with the input Hello there as a single string, and the output was captured into the file tests/expected/hello1.txt.
For my next test, I will run echor
with this argument and compare the output to the contents of that file.
Be sure to add use std::fs
to tests/cli.rs to bring in the standard filesystem module, and then replace the runs
function with this:
#[test]
fn hello1() {
    let outfile = "tests/expected/hello1.txt";
    let expected = fs::read_to_string(outfile).unwrap();
    let mut cmd = Command::cargo_bin("echor").unwrap();
    cmd.arg("Hello there").assert().success().stdout(expected);
}
This is the output from echo
generated by mk-outs.sh
.
Use fs::read_to_string
to read the contents of the file. This returns a Result
that might contain a string if all goes well. Use the Result::unwrap
method with the assumption that this will work.
Create a Command
to run echor
in the current crate.
Run the program with the given argument and assert it finishes successfully and that STDOUT
is the expected value.
The fs::read_to_string
is a convenient way to read a file into memory, but it’s also an easy way to crash your program—and possibly your computer—if you happen to read a file that exceeds your available memory. You should only use this function with small files. As Ted Nelson says, “The good news about computers is that they do what you tell them to do. The bad news is that they do what you tell them to do.”
If you run cargo test
, you might see output like this:
running 2 tests
test hello1 ... ok
test dies_no_args ... ok
I’ve been using the Result::unwrap
method in such a way that assumes each fallible call will succeed.
For example, in the hello1
function, I assumed that the output file exists and can be opened and read into a string.
During my limited testing, this may be the case, but it’s dangerous to make such assumptions.
I’d rather be more cautious, so I’m going to create a type alias called TestResult
.
This will be a specific type of Result
that is either an Ok
which always contains the unit type or some value that implements the std::error::Error
trait:
type TestResult = Result<(), Box<dyn std::error::Error>>;
In the preceding code, Box indicates that the error will live inside a kind of pointer where the memory is dynamically allocated on the heap rather than the stack, and dyn “is used to highlight that calls to methods on the associated Trait are dynamically dispatched.”
That’s really a lot of information, and I don’t blame you if your eyes glazed over.
In short, I’m saying that the Ok part of TestResult will only ever hold the unit type, and the Err part can hold anything that implements the std::error::Error trait.
These concepts are more thoroughly explained in Programming Rust (O’Reilly, 2021).
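To make the boxed error a little more concrete, here is a small sketch of a function whose body can fail in two unrelated ways. The read_number helper is hypothetical and exists only for this illustration. Because the return type uses Box<dyn Error>, both kinds of failure fit in the same Err:

```rust
use std::error::Error;
use std::fs;

// Hypothetical helper: read a file and parse its contents as an integer.
// Two different error types can come out of this one function.
fn read_number(path: &str) -> Result<i32, Box<dyn Error>> {
    // May fail with std::io::Error (for example, the file is missing).
    let text = fs::read_to_string(path)?;
    // May fail with std::num::ParseIntError (the text is not a number).
    let number = text.trim().parse::<i32>()?;
    Ok(number)
}
```

A Result with a single concrete error type could not hold both an I/O error and a parse error without extra conversion code. Boxing the error as a trait object trades a heap allocation for that flexibility, which seems like a reasonable bargain in test code.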
This changes my test code in some subtle ways.
All the functions now indicate that they return a TestResult.
Previously I used Result::unwrap to unpack Ok values and panic in the event of an Err, causing the test to fail.
In the following code, I replace unwrap with the ? operator to either unpack an Ok value or propagate the Err value to the return type.
That is, this will cause the function to return the Err variant of Result to the caller, which will in turn cause the test to fail.
If all the code in a test function runs successfully, I return an Ok containing the unit type to indicate the test passes.
Note that while Rust does have return to return a value early from a function, the idiom is to omit the semicolon from the last expression to implicitly return that result.
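Here is a tiny sketch of both ideas outside of any test. The double and describe functions are hypothetical names I’m using only for illustration:

```rust
use std::num::ParseIntError;

// Hypothetical helper: `?` unpacks the Ok value or returns the Err
// to the caller immediately.
fn double(input: &str) -> Result<i32, ParseIntError> {
    let n = input.parse::<i32>()?;
    Ok(n * 2) // no semicolon, so this expression is the return value
}

// Hypothetical helper: an explicit early `return`, followed by an
// implicit return from the final expression.
fn describe(n: i32) -> &'static str {
    if n < 0 {
        return "negative";
    }
    "non-negative"
}
```

Calling double("21") produces Ok(42), while double("x") hands the parse error back to the caller rather than panicking the way unwrap would.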
Update your tests/cli.rs to this:
use assert_cmd::Command;
use predicates::prelude::*;
use std::fs;

type TestResult = Result<(), Box<dyn std::error::Error>>;

#[test]
fn dies_no_args() -> TestResult {
    let mut cmd = Command::cargo_bin("echor")?;
    cmd.assert()
        .failure()
        .stderr(predicate::str::contains("USAGE"));
    Ok(())
}

#[test]
fn hello1() -> TestResult {
    let expected = fs::read_to_string("tests/expected/hello1.txt")?;
    let mut cmd = Command::cargo_bin("echor")?;
    cmd.arg("Hello there").assert().success().stdout(expected);
    Ok(())
}
Use ? instead of Result::unwrap to unpack an Ok value or propagate an Err.
Omit the final semicolon to return this value.
The next test passes two arguments, Hello and there, and expects the program to print Hello there.
#[test]
fn hello2() -> TestResult {
    let expected = fs::read_to_string("tests/expected/hello2.txt")?;
    let mut cmd = Command::cargo_bin("echor")?;
    cmd.args(vec!["Hello", "there"])
        .assert()
        .success()
        .stdout(expected);
    Ok(())
}
Use the Command::args method to pass a vector of arguments rather than a single string value.
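The difference between one argument containing a space and two separate arguments is easy to miss. The standard library’s std::process::Command has arg and args methods with the same names, so I can sketch the distinction without running anything. The helper names below are hypothetical, and get_args assumes a reasonably recent version of Rust:

```rust
use std::process::Command;

// Hypothetical helper: build (but do not run) a command that receives
// a single argument containing a space.
fn one_argument() -> Command {
    let mut cmd = Command::new("echo");
    cmd.arg("Hello there");
    cmd
}

// Hypothetical helper: build (but do not run) a command that receives
// two separate arguments.
fn two_arguments() -> Command {
    let mut cmd = Command::new("echo");
    cmd.args(&["Hello", "there"]);
    cmd
}

// Hypothetical helper: count the arguments the program would receive.
fn arg_count(cmd: &Command) -> usize {
    cmd.get_args().count()
}
```

The first command carries one argument and the second carries two, which is the distinction the hello1 and hello2 tests are meant to exercise.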
I have a total of four files to check, so it behooves me to write a helper function.
I’ll call it run and will pass it the argument strings along with the expected output file.
Rather than use a vector for the arguments, I’m going to use a std::slice because I don’t need to grow the argument list after I’ve defined it:
fn run(args: &[&str], expected_file: &str) -> TestResult {
    let expected = fs::read_to_string(expected_file)?;
    Command::cargo_bin("echor")?
        .args(args)
        .assert()
        .success()
        .stdout(expected);
    Ok(())
}
The args will be a slice of &str values, and the expected_file will be a &str. The return value is a TestResult.
Try to read the contents of the expected_file into a string.
Attempt to run echor in the current crate with the given arguments and assert that STDOUT is the expected value.
If all the previous code worked, return Ok containing the unit type.
You will find that Rust has many types of “string” variables. The type str is appropriate here for literal strings in the source code. The & shows that I intend only to borrow the string for a little while. I’ll have more to say about strings, borrowing, and ownership later.
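As a rough sketch of why a slice is a flexible parameter type, the following function accepts anything that can be borrowed as a slice of &str values. The join_args helper is hypothetical and is not the code from src/main.rs:

```rust
// Hypothetical helper: join the arguments on single spaces, much as
// echo does. The function only borrows the strings and never needs to
// grow or shrink the list.
fn join_args(args: &[&str]) -> String {
    args.join(" ")
}
```

A borrowed array literal such as &["Hello", "there"], a borrowed vector, and a subrange of a vector such as &v[1..] can all be passed to join_args. An owned String can take part too if I borrow it with as_str.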
Below is how I can use the helper function to run all four tests.
Replace the earlier hello1 and hello2 definitions with these:
#[test]
fn hello1() -> TestResult {
    run(&["Hello there"], "tests/expected/hello1.txt")
}

#[test]
fn hello2() -> TestResult {
    run(&["Hello", "there"], "tests/expected/hello2.txt")
}

#[test]
fn hello1_no_newline() -> TestResult {
    run(&["Hello  there", "-n"], "tests/expected/hello1.n.txt")
}

#[test]
fn hello2_no_newline() -> TestResult {
    run(&["-n", "Hello", "there"], "tests/expected/hello2.n.txt")
}
A single string value as input.
Two strings as input.
A single string value as input with the -n flag to omit the newline. Note that there are two spaces between the words.
Two strings as input with the -n flag appearing first.
As you can see, I can write as many functions as I like in tests/cli.rs.
Only those marked with #[test] are run when testing.
If you run cargo test now, you should see five passing tests (in no particular order):
running 5 tests
test dies_no_args ... ok
test hello1 ... ok
test hello1_no_newline ... ok
test hello2_no_newline ... ok
test hello2 ... ok
Now you have written about 30 lines of Rust code in src/main.rs for the echor program and five tests in tests/cli.rs to verify that your program meets some measure of specification (the specs, as they say).
Consider what you’ve achieved:
Well-behaved systems programs should print basic output to STDOUT and errors to STDERR.
You’ve written a program that takes the options -h|--help to produce help, -V|--version to show the program’s version, and -n to omit a newline, along with one or more positional command-line arguments.
If the program is run with the wrong arguments or with the -h|--help flag, it will print usage documentation.
The program will echo back all the command-line arguments joined on spaces.
The trailing newline will be omitted if the -n flag is present.
You can run integration tests to confirm that your program replicates the output from echo for at least four test cases covering one or two inputs, both with and without the trailing newline.
You learned to use several Rust types including the unit type, strings, vectors, slices, Option, and Result, as well as how to create a type alias to a specific type of Result called a TestResult.
You used a Box to create a smart pointer to heap memory. This required digging a bit into the differences between the stack, where values have a fixed, known size and are accessed in LIFO order, and the heap, where the size of a value may change during the program and the value is accessed through a pointer.
You learned how to read the entire contents of a file into a string.
You learned how to execute an external command from within a Rust program, check the exit status, and verify the contents of both STDOUT and STDERR.
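As a small reminder of the stack and heap distinction from the list above, here is a sketch that puts a large array behind a Box. The boxed_buffer helper is a hypothetical name used only for illustration:

```rust
use std::mem::size_of;

// Hypothetical helper: allocate a 1,024-byte array on the heap and
// return the Box that points to it.
fn boxed_buffer() -> Box<[u8; 1024]> {
    Box::new([0u8; 1024])
}

// Hypothetical helper: report the size in bytes of the array itself
// and of the Box that points to such an array.
fn sizes() -> (usize, usize) {
    (size_of::<[u8; 1024]>(), size_of::<Box<[u8; 1024]>>())
}
```

The array itself occupies 1,024 bytes, but the Box that owns it is only as large as a pointer. The value kept on the stack stays small and fixed in size no matter how large the heap allocation is.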
All this, and you’ve done it while writing in a language that prevents many of the common mistakes that lead to buggy programs or security vulnerabilities. Feel free to give yourself a little high five or enjoy a slightly evil MWUHAHA chuckle as you consider how Rust will help you conquer the world. Now that I’ve shown how to organize and write tests and data, I’ll introduce the tests earlier in the next program so I can start using test-driven development, where I write the tests first and then write code to satisfy them.
1 “Trust, but verify.” Apparently this rhymes in Russian and so sounds cooler than when Reagan used it in the 1980s during nuclear disarmament talks with the USSR.