How it works...

walkdir consists of three important types:

  • WalkDir: A builder (see the Using the builder pattern section in Chapter 1, Learning the Basics) for your directory walker
  • IntoIter: The iterator created by the builder
  • DirEntry: Represents a single folder or file

If you just want to operate on a list of all entries under a root folder, such as in the first example in line [6], you can implicitly use WalkDir directly as an iterator over different instances of DirEntry:

for entry in WalkDir::new(".") {
    if let Ok(entry) = entry {
        println!("{}", entry.path().display());
    }
}

As you can see, the iterator doesn't directly give you a DirEntry, but a Result. This is because accessing a file or folder can fail. For instance, the OS could prohibit you from reading the contents of a folder, hiding the files in it. Or, if you tell the walker to follow symlinks by calling follow_links(true) on the WalkDir instance, a symlink could point back to a parent directory. This would otherwise result in an endless loop, so walkdir reports it as an error instead.

Our strategy for handling errors in this recipe is simple: we ignore them and carry on with the rest of the entries that didn't report any issues.
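The ignore-and-continue strategy is not specific to walkdir; it applies to any iterator over Result values. The following is a minimal sketch that uses only the standard library. The function names keep_ok and keep_ok_logged are made up for illustration and do not appear in the recipe. The second variant shows one possible alternative, reporting each error before moving on.

```rust
use std::fmt::Display;

/// Silently drops every `Err` and keeps the `Ok` values.
/// `Result` implements `IntoIterator`, so `flatten` skips the errors.
fn keep_ok<T, E>(results: impl IntoIterator<Item = Result<T, E>>) -> Vec<T> {
    results.into_iter().flatten().collect()
}

/// Same idea, but reports each error on stderr before carrying on.
fn keep_ok_logged<T, E: Display>(
    results: impl IntoIterator<Item = Result<T, E>>,
) -> Vec<T> {
    results
        .into_iter()
        .filter_map(|result| match result {
            Ok(value) => Some(value),
            Err(error) => {
                eprintln!("skipping entry: {}", error);
                None
            }
        })
        .collect()
}
```

Since the walker yields Result values, the same adapters should also work on it, for example by chaining .filter_map(|e| e.ok()) after into_iter().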

When you extract the actual entry, it can tell you a lot about itself. One of those things is its path. Keep in mind, though, that .path() [8] doesn't just return the path as a string. Instead, it returns a reference to a Path from the standard library, which you can use for further analysis. You could, for example, read a file path's extension by calling .extension() on it. Or you could get its parent directory by calling .parent(). Feel free to explore the possibilities in the Path documentation at https://doc.rust-lang.org/std/path/struct.Path.html. In our case, we are only going to display it as a simple string by calling .display() on it.
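As a small illustration of what a Path can tell you, here is a sketch that combines .extension(), .parent(), and .display(). The helper name describe and its output format are assumptions made for this example only.

```rust
use std::path::Path;

/// Summarizes a few pieces of information that `Path` exposes.
fn describe(path: &Path) -> String {
    // `extension` returns `Option<&OsStr>`; convert it lossily for display.
    let extension = path
        .extension()
        .map(|ext| ext.to_string_lossy().into_owned())
        .unwrap_or_else(|| "none".to_string());
    // `parent` returns `None` for a root or an empty path.
    let parent = path
        .parent()
        .map(|p| p.display().to_string())
        .unwrap_or_else(|| "none".to_string());
    format!(
        "{} (extension: {}, parent: {})",
        path.display(),
        extension,
        parent
    )
}
```

Calling describe(Path::new("notes/todo.txt")) produces "notes/todo.txt (extension: txt, parent: notes)".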

When we explicitly convert WalkDir into an iterator with into_iter(), we can access a special method that no other iterator has: filter_entry. It is an optimization over filter in that it gets called during the traversal. When its predicate returns false on a directory, the walker won't go into the directory at all! This way, you can gain a lot of performance when traversing big filesystems. In the recipe, we use it while looking for non-hidden files [15]. If you need to operate only on files and never on directories, you should use plain old filter instead, because a filter_entry predicate that rejects directories would also stop the walker from descending into them.
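To make the difference between pruning during the traversal and filtering afterwards more concrete, here is a hand-written walker built on std::fs::read_dir. It is only a sketch of the idea, not walkdir's actual implementation; the name walk_pruned is hypothetical, and unlike the recipe it propagates I/O errors with ? instead of ignoring them.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects paths under `root`, calling `keep` during the walk.
/// A directory rejected by `keep` is never opened, so nothing below it is visited.
fn walk_pruned(
    root: &Path,
    keep: &dyn Fn(&Path) -> bool,
    found: &mut Vec<PathBuf>,
) -> io::Result<()> {
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let path = entry.path();
        if !keep(&path) {
            // Pruned: skip the entry and, if it is a directory, its whole subtree.
            continue;
        }
        // `file_type` does not follow symlinks, so symlinked directories are not entered.
        let is_dir = entry.file_type()?.is_dir();
        found.push(path.clone());
        if is_dir {
            walk_pruned(&path, keep, found)?;
        }
    }
    Ok(())
}
```

A plain filter applied to the finished list would give the same result for a hidden-file predicate, but only after every hidden directory had been read in full.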

We define hidden files, by Unix convention, as all directories and files whose names start with a dot. For this reason, they are sometimes also called dotfiles.

In both cases, filtering requires a predicate. Predicates are usually put in their own function for simplicity and reusability.
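A standalone predicate for the dotfile convention might look like the following sketch. It works on a std::path::Path rather than on walkdir's DirEntry so that it depends only on the standard library; the signature is an assumption for illustration and differs from the recipe's own helper.

```rust
use std::path::Path;

/// Returns `true` if the last component of `path` starts with a dot.
/// Names that are not valid UTF-8 are treated as not hidden in this sketch.
fn is_hidden(path: &Path) -> bool {
    path.file_name()
        .and_then(|name| name.to_str())
        .map_or(false, |name| name.starts_with('.'))
}
```

Because it is an ordinary function, the same predicate can be reused wherever a closure over paths is expected, for instance paths.iter().filter(|p| !is_hidden(p)). Note that Path::file_name returns None for "." and "..", so those are not reported as hidden here.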

Note that walkdir doesn't just give us the filename as a normal string. Instead, it returns an OsStr. This is a special kind of string that Rust uses when talking directly to the operating system. The type exists because some operating systems allow filenames that are not valid UTF-8. When looking at such files in Rust, you have two choices: let Rust convert them into UTF-8 and replace all invalid sequences with the Unicode Replacement Character (�), or handle the error yourself. You can go the first route by calling to_string_lossy on an OsStr [20]. The second route is accessible by calling to_str and checking the returned Option, as we did in has_file_name, where we simply discard invalid names.
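The two routes can be compared side by side in a short sketch. The wrapper names lossy_name, strict_name, and name_equals are invented for this example; only to_string_lossy and to_str come from the standard library.

```rust
use std::borrow::Cow;
use std::ffi::OsStr;

/// Lossy route: always yields text; invalid sequences become U+FFFD.
fn lossy_name(name: &OsStr) -> Cow<'_, str> {
    name.to_string_lossy()
}

/// Strict route: yields `None` when the name is not valid UTF-8.
fn strict_name(name: &OsStr) -> Option<&str> {
    name.to_str()
}

/// Follows the "discard invalid names" strategy: an undecodable name never matches.
fn name_equals(name: &OsStr, expected: &str) -> bool {
    strict_name(name).map_or(false, |valid| valid == expected)
}
```

to_string_lossy returns a Cow, so a name that is already valid UTF-8 is borrowed rather than copied; a new String is only allocated when a replacement is needed.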

In this recipe, you can see a splendid example of when to choose a for_each method call (discussed in the Access collections as Iterators section of the Working with collections chapter) over a for loop: most of our iterator calls are chained together, and so a for_each call can naturally be chained into the iterator as well.
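The following sketch contrasts the two styles on a plain slice of names instead of a directory walker. Both functions are hypothetical and behave identically; the sink parameter stands in for whatever you would do with each item, such as printing it.

```rust
/// Calls `sink` for every name that is not a dotfile, converted to upper case.
/// The final step is chained onto the iterator with `for_each`.
fn visit_visible(names: &[&str], mut sink: impl FnMut(String)) {
    names
        .iter()
        .filter(|name| !name.starts_with('.'))
        .map(|name| name.to_uppercase())
        .for_each(|name| sink(name));
}

/// The same logic with a `for` loop wrapped around the chain.
fn visit_visible_loop(names: &[&str], mut sink: impl FnMut(String)) {
    let visible = names
        .iter()
        .filter(|name| !name.starts_with('.'))
        .map(|name| name.to_uppercase());
    for name in visible {
        sink(name);
    }
}
```

With a long chain, the for_each version reads top to bottom in the order the data flows, whereas the loop version needs either a temporary binding or a multi-line loop header.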
