One of the most important parts of the C++ standard library is the stream-based input/output (I/O) library, which enables developers to work with files, memory streams, or other types of I/O devices. The first part of this chapter provides solutions to some common stream operations, such as reading and writing data, localization settings, and manipulating the input and output of a stream. The second part of the chapter explores the C++17 filesystem library, which enables developers to perform operations with the filesystem and its objects, such as files and directories.
The recipes covered in this chapter are as follows:
We will start the chapter with a couple of recipes on how to serialize and deserialize data to/from files.
Some of the data that programs work with must be persisted to disk in various ways, including storing it in a database or in flat files, either as text or binary data. This recipe, and the next one, are focused on persisting and loading both raw data and objects to and from binary files. In this context, raw data means unstructured data, and, in this recipe, we will consider writing and reading the content of a buffer (that is, a contiguous sequence of memory) that can be an array, an std::vector, or an std::array.
For this recipe, you should be familiar with the standard stream I/O library, although some explanations, to the extent that is required to understand this recipe, are provided next. You should also be familiar with the differences between binary and text files.
In this recipe, we will use the ofstream and ifstream classes, which are available in the std namespace in the <fstream> header.
To write the content of a buffer (in our example, an std::vector) to a binary file, you should perform the following steps:
Open a file stream for writing in binary mode by creating an instance of the std::ofstream class:
std::ofstream ofile("sample.bin", std::ios::binary);
if(ofile.is_open())
{
// streamed file operations
}
std::vector<unsigned char> output {0,1,2,3,4,5,6,7,8,9};
ofile.write(reinterpret_cast<char*>(output.data()),
output.size());
Optionally, flush the stream by calling the flush() method. This causes the uncommitted changes in the stream to be synchronized with the external destination, which, in this case, is a disk file.
Close the stream by calling close(). This, in turn, flushes the stream, making the preceding step unnecessary in most contexts:
ofile.close();
To read the entire content of a binary file to a buffer, you should perform the following steps:
Open a file stream for reading in binary mode by creating an instance of the std::ifstream class:
std::ifstream ifile("sample.bin", std::ios::binary);
if(ifile.is_open())
{
// streamed file operations
}
ifile.seekg(0, std::ios_base::end);
auto length = ifile.tellg();
ifile.seekg(0, std::ios_base::beg);
std::vector<unsigned char> input;
input.resize(static_cast<size_t>(length));
ifile.read(reinterpret_cast<char*>(input.data()), length);
auto success = !ifile.fail() && length == ifile.gcount();
ifile.close();
The standard stream-based I/O library provides various classes that implement high-level input, output, or both input and output file stream, string stream, and character array operations; manipulators that control how these streams behave; and several predefined stream objects (cin/wcin, cout/wcout, cerr/wcerr, and clog/wclog).
These streams are implemented as class templates, and, for files, the library provides several classes:
basic_filebuf implements the I/O operations for a raw file and is similar in semantics to a C FILE stream.
basic_ifstream implements the high-level file stream input operations defined by the basic_istream stream interface, internally using a basic_filebuf object.
basic_ofstream implements the high-level file stream output operations defined by the basic_ostream stream interface, internally using a basic_filebuf object.
basic_fstream implements the high-level file stream input and output operations defined by the basic_iostream stream interface, internally using a basic_filebuf object.
These classes are represented in the following class diagram to better understand their relationship:
Figure 7.1: Stream class diagram
Notice that this diagram also features several classes designed to work with a string-based stream. These streams, however, will not be discussed here.
Several typedefs for the class templates mentioned earlier are also defined in the <fstream> header, in the std namespace. The ofstream and ifstream types used in the preceding examples are among these type synonyms:
typedef basic_ifstream<char> ifstream;
typedef basic_ifstream<wchar_t> wifstream;
typedef basic_ofstream<char> ofstream;
typedef basic_ofstream<wchar_t> wofstream;
typedef basic_fstream<char> fstream;
typedef basic_fstream<wchar_t> wfstream;
In the previous section, you saw how we can write and read raw data to and from a file stream. Now, we'll cover this process in more detail.
To write data to a file, we instantiated an object of the type std::ofstream. In the constructor, we passed the name of the file to be opened and the stream's open mode, for which we specified std::ios::binary to indicate binary mode. Opening the file like this discards the previous file content. If you want to append content to an existing file, you should also use the flag std::ios::app (that is, std::ios::app | std::ios::binary). This constructor internally calls open() on its underlying raw file object, that is, a basic_filebuf object. If this operation fails, the failbit is set. To check whether the stream has been successfully associated with a file device, we used is_open() (this internally calls the method with the same name from the underlying basic_filebuf). Writing data to the file stream is done using the write() method, which takes a pointer to the string of characters to write and the number of characters to write. Since this method operates with strings of characters, a reinterpret_cast is necessary if data is of another type, such as unsigned char in our example. If the write operation fails, the badbit of the stream is set; an std::ios_base::failure exception is thrown only if exceptions have been enabled for that bit with the stream's exceptions() method. However, data is not written directly to the file device but stored in the basic_filebuf object. To write it to the file, the buffer needs to be flushed, which is done by calling flush(). This is done automatically when closing the file stream, as shown in the preceding example.
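The two ways of detecting a failed write can be sketched as follows. This is only an illustration: the helper names write_checked() and write_throwing() are hypothetical and not part of the recipe. The first helper inspects the stream state after writing and flushing; the second opts in to exceptions with exceptions() before opening the file.

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <vector>

// Hypothetical helper: reports success by inspecting the stream state.
bool write_checked(char const* filename,
                   std::vector<unsigned char> const& data)
{
   std::ofstream ofile(filename, std::ios::binary);
   if (!ofile.is_open()) return false;

   ofile.write(reinterpret_cast<char const*>(data.data()),
               static_cast<std::streamsize>(data.size()));
   ofile.flush();        // push buffered data to the file
   return ofile.good();  // false if failbit or badbit was set
}

// Hypothetical helper: asks the stream to throw on failbit/badbit.
bool write_throwing(char const* filename,
                    std::vector<unsigned char> const& data)
{
   std::ofstream ofile;
   ofile.exceptions(std::ios::failbit | std::ios::badbit);
   try
   {
      ofile.open(filename, std::ios::binary);  // throws if it fails
      ofile.write(reinterpret_cast<char const*>(data.data()),
                  static_cast<std::streamsize>(data.size()));
      ofile.close();                           // flushes the buffer
      return true;
   }
   catch (std::ios_base::failure const&)
   {
      return false;
   }
}
```

Without the call to exceptions(), the catch block in the second helper would never be reached, because streams do not throw by default.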
To read data from a file, we instantiated an object of type std::ifstream. In the constructor, we passed the same arguments that we used for opening the file to write: the name of the file and the open mode, that is, std::ios::binary. The constructor internally calls open() on the underlying std::basic_filebuf object. To check whether the stream has been successfully associated with a file device, we use is_open() (this internally calls the method with the same name from the underlying basic_filebuf). In this example, we read the entire content of the file to a memory buffer, in particular, an std::vector. Before we can read the data, we must know the size of the file in order to allocate a buffer that is large enough to hold that data. To do this, we used seekg() to move the input position indicator to the end of the file.
Then, we called tellg() to return the current position, which, in this case, indicates the size of the file, in bytes, and then we moved the input position indicator to the beginning of the file to be able to start reading from the beginning. Calling seekg() to move the position indicator to the end can be avoided by opening the file with the position indicator moved directly to the end. This can be achieved by using the std::ios::ate opening flag in the constructor (or the open() method). After allocating enough memory for the content of the file, we copied the data from the file into memory using the read() method. This takes a pointer to the string of characters that receives the data read from the stream and the number of characters to be read. Since the stream operates on characters, a reinterpret_cast expression is necessary if the buffer contains other types of data, such as unsigned char in our example.
If fewer characters than requested could be read, this operation sets the eofbit and failbit of the stream; it throws an std::ios_base::failure exception only if exceptions have been enabled with the stream's exceptions() method. To determine the number of characters that have been successfully read from the stream, we can use the gcount() method. Upon completing the read operation, we close the file stream.
The operations shown in these examples are the minimum ones required to write and read data to and from file streams. It is important, though, that you check whether each operation succeeded and, if you have enabled exceptions on the stream, that you catch any exceptions that could occur.
The example code discussed so far in this recipe can be reorganized in the form of two general functions for writing and reading data to and from a file:
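bool write_data(char const * const filename,
                char const * const data,
                size_t const size)
{
   auto success = false;
   std::ofstream ofile(filename, std::ios::binary);
   if(ofile.is_open())
   {
      try
      {
         ofile.write(data, static_cast<std::streamsize>(size));
         ofile.close(); // flushes the buffer
         // report success only if neither failbit nor badbit is set
         success = !ofile.fail();
      }
      catch(std::ios_base::failure &)
      {
         // handle the error (reached only if stream exceptions
         // have been enabled)
      }
   }
   return success;
}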
size_t read_data(char const * const filename,
std::function<char*(size_t const)> allocator)
{
size_t readbytes = 0;
std::ifstream ifile(filename, std::ios::ate | std::ios::binary);
if(ifile.is_open())
{
auto length = static_cast<size_t>(ifile.tellg());
ifile.seekg(0, std::ios_base::beg);
auto buffer = allocator(length);
try
{
ifile.read(buffer, length);
readbytes = static_cast<size_t>(ifile.gcount());
}
catch (std::ios_base::failure &)
{
// handle the error
}
ifile.close();
}
return readbytes;
}
write_data() is a function that takes the name of a file, a pointer to an array of characters, and the length of this array as arguments and writes the characters to the specified file. read_data() is a function that takes the name of a file and a function that allocates a buffer, and reads the entire content of the file to the buffer that is returned by the allocating function. The following is an example of how these functions can be used:
std::vector<unsigned char> output {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
std::vector<unsigned char> input;
if(write_data("sample.bin",
reinterpret_cast<char*>(output.data()),
output.size()))
{
if(read_data("sample.bin",
[&input](size_t const length) {
input.resize(length);
return reinterpret_cast<char*>(input.data());}) > 0)
{
std::cout << (output == input ? "equal": "not equal")
<< '\n';
}
}
Alternatively, we could use a dynamically allocated buffer instead of the std::vector; the changes required for this are small in the overall example:
std::vector<unsigned char> output {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
unsigned char* input = nullptr;
size_t readb = 0;
if(write_data("sample.bin",
reinterpret_cast<char*>(output.data()),
output.size()))
{
if((readb = read_data(
"sample.bin",
[&input](size_t const length) {
input = new unsigned char[length];
return reinterpret_cast<char*>(input); })) > 0)
{
auto cmp = memcmp(output.data(), input, output.size());
std::cout << (cmp == 0 ? "equal": "not equal")
<< '\n';
}
}
delete [] input;
However, this alternative is only provided to show that read_data() can be used with different kinds of input buffers. It is recommended that you avoid the explicit dynamic allocation of memory whenever possible.
The way of reading data from a file to memory, as shown in this recipe, is only one of several. The following is a list of possible alternatives for reading data from a file stream:
Initializing an std::vector directly using std::istreambuf_iterator iterators (similarly, this can be used with std::string):
std::vector<unsigned char> input;
std::ifstream ifile("sample.bin", std::ios::binary);
if(ifile.is_open())
{
input = std::vector<unsigned char>(
std::istreambuf_iterator<char>(ifile),
std::istreambuf_iterator<char>());
ifile.close();
}
Assigning the content of an std::vector from std::istreambuf_iterator iterators:
std::vector<unsigned char> input;
std::ifstream ifile("sample.bin", std::ios::binary);
if(ifile.is_open())
{
ifile.seekg(0, std::ios_base::end);
auto length = ifile.tellg();
ifile.seekg(0, std::ios_base::beg);
input.reserve(static_cast<size_t>(length));
input.assign(
std::istreambuf_iterator<char>(ifile),
std::istreambuf_iterator<char>());
ifile.close();
}
Copying the content of the file stream to a vector using std::istreambuf_iterator iterators and an std::back_inserter adapter to write to the end of the vector:
std::vector<unsigned char> input;
std::ifstream ifile("sample.bin", std::ios::binary);
if(ifile.is_open())
{
ifile.seekg(0, std::ios_base::end);
auto length = ifile.tellg();
ifile.seekg(0, std::ios_base::beg);
input.reserve(static_cast<size_t>(length));
std::copy(std::istreambuf_iterator<char>(ifile),
std::istreambuf_iterator<char>(),
std::back_inserter(input));
ifile.close();
}
Compared to these alternatives, however, the method described in the How to do it... section is typically the fastest one, because it transfers the data with a single read() call instead of one character at a time, even though the alternatives may look more appealing from an object-oriented perspective. It is beyond the scope of this recipe to compare the performance of these alternatives, but you can try it as an exercise.
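As a starting point for such a comparison, here is a minimal sketch. The names read_block(), read_iterators(), and time_ms() are hypothetical helpers introduced for illustration, and the timings depend on the compiler, the standard library, and the file size, so no results are claimed here.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <ios>
#include <iterator>
#include <vector>

using bytes = std::vector<unsigned char>;

// Size query followed by a single read() call.
bytes read_block(char const* filename)
{
   bytes data;
   std::ifstream ifile(filename, std::ios::ate | std::ios::binary);
   if (ifile.is_open())
   {
      auto const length = static_cast<std::streamoff>(ifile.tellg());
      if (length > 0)
      {
         ifile.seekg(0, std::ios_base::beg);
         data.resize(static_cast<std::size_t>(length));
         ifile.read(reinterpret_cast<char*>(data.data()), length);
         // keep only the bytes that were actually read
         data.resize(static_cast<std::size_t>(ifile.gcount()));
      }
   }
   return data;
}

// Character-by-character transfer through stream buffer iterators.
bytes read_iterators(char const* filename)
{
   std::ifstream ifile(filename, std::ios::binary);
   return bytes(std::istreambuf_iterator<char>(ifile),
                std::istreambuf_iterator<char>());
}

// Returns the wall-clock time, in milliseconds, taken by a callable.
template <typename F>
double time_ms(F&& f)
{
   auto const start = std::chrono::steady_clock::now();
   f();
   auto const stop = std::chrono::steady_clock::now();
   return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A call such as time_ms([]{ read_block("sample.bin"); }) can then be repeated for each approach, ideally on a large file and over several runs.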
In the previous recipe, we learned how to write and read raw data (that is, unstructured data) to and from a file. Many times, however, we must persist and load objects instead. Writing and reading in the manner shown in the previous recipe works for POD types only. For anything else, we must explicitly decide what is actually written or read, since writing or reading pointers, virtual tables (vtables), and any sort of metadata is not only irrelevant but also semantically wrong. These operations are commonly referred to as serialization and deserialization. In this recipe, we will learn how to serialize and deserialize both POD and non-POD types to and from binary files.
For the examples in this recipe, we will use the foo and foopod classes, as follows:
class foo
{
int i;
char c;
std::string s;
public:
foo(int const i = 0, char const c = 0, std::string const & s = {}):
i(i), c(c), s(s)
{}
foo(foo const &) = default;
foo& operator=(foo const &) = default;
bool operator==(foo const & rhv) const
{
return i == rhv.i &&
c == rhv.c &&
s == rhv.s;
}
bool operator!=(foo const & rhv) const
{
return !(*this == rhv);
}
};
struct foopod
{
bool a;
char b;
int c[2];
};
bool operator==(foopod const & f1, foopod const & f2)
{
return f1.a == f2.a && f1.b == f2.b &&
f1.c[0] == f2.c[0] && f1.c[1] == f2.c[1];
}
It is recommended that you first read the previous recipe, Reading and writing raw data from/to binary files, before you continue. You should also know what POD (a type that is both trivial and has a standard layout) and non-POD types are and how operators can be overloaded. You can check the closing notes of the Using type traits to query properties of types recipe, in Chapter 6, General-Purpose Utilities, for further details on POD types.
To serialize/deserialize POD types that do not contain pointers, use ofstream::write() and ifstream::read(), as shown in the previous recipe:
Serialize objects to a binary file using the ofstream class and the write() method:
std::vector<foopod> output {
{true, '1', {1, 2}},
{true, '2', {3, 4}},
{false, '3', {4, 5}}
};
std::ofstream ofile("sample.bin", std::ios::binary);
if(ofile.is_open())
{
for(auto const & value : output)
{
ofile.write(reinterpret_cast<const char*>(&value),
sizeof(value));
}
ofile.close();
}
Deserialize objects from a binary file using the ifstream class and the read() method:
std::vector<foopod> input;
std::ifstream ifile("sample.bin", std::ios::binary);
if(ifile.is_open())
{
while(true)
{
foopod value;
ifile.read(reinterpret_cast<char*>(&value),
sizeof(value));
if(ifile.fail() || ifile.eof()) break;
input.push_back(value);
}
ifile.close();
}
To serialize non-POD types (or POD types that contain pointers), you must explicitly write the value of the data members to a file, and to deserialize, you must explicitly read from the file to the data members in the same order. To demonstrate this, we will consider the foo class that we defined earlier:
Add a member function called write() to serialize objects of this class. The method takes a reference to an ofstream and returns a bool indicating whether the operation was successful or not:
bool write(std::ofstream& ofile) const
{
ofile.write(reinterpret_cast<const char*>(&i), sizeof(i));
ofile.write(&c, sizeof(c));
auto size = static_cast<int>(s.size());
ofile.write(reinterpret_cast<char*>(&size), sizeof(size));
ofile.write(s.data(), s.size());
return !ofile.fail();
}
Add a member function called read() to deserialize the objects of this class. This method takes a reference to an ifstream and returns a bool indicating whether the operation was successful or not:
bool read(std::ifstream& ifile)
{
ifile.read(reinterpret_cast<char*>(&i), sizeof(i));
ifile.read(&c, sizeof(c));
auto size {0};
ifile.read(reinterpret_cast<char*>(&size), sizeof(size));
s.resize(size);
// non-const data() (C++17) is valid even when the string is empty,
// unlike front()
ifile.read(s.data(), size);
return !ifile.fail();
}
An alternative to the write() and read() member functions demonstrated earlier is to overload operator<< and operator>>. To do this, you should perform the following steps:
Add friend declarations for the non-member operator<< and operator>> to the class to be serialized/deserialized (in this case, the foo class):
friend std::ofstream& operator<<(std::ofstream& ofile,
foo const& f);
friend std::ifstream& operator>>(std::ifstream& ifile,
foo& f);
Overload operator<< for your class:
std::ofstream& operator<<(std::ofstream& ofile, foo const& f)
{
ofile.write(reinterpret_cast<const char*>(&f.i),
sizeof(f.i));
ofile.write(&f.c, sizeof(f.c));
auto size = static_cast<int>(f.s.size());
ofile.write(reinterpret_cast<char*>(&size), sizeof(size));
ofile.write(f.s.data(), f.s.size());
return ofile;
}
Overload operator>> for your class:
std::ifstream& operator>>(std::ifstream& ifile, foo& f)
{
ifile.read(reinterpret_cast<char*>(&f.i), sizeof(f.i));
ifile.read(&f.c, sizeof(f.c));
auto size {0};
ifile.read(reinterpret_cast<char*>(&size), sizeof(size));
f.s.resize(size);
// non-const data() (C++17) is valid even when the string is empty
ifile.read(f.s.data(), size);
return ifile;
}
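The following sketch shows how such overloaded operators might be used for a round trip through a file. To keep it self-contained, it uses a reduced, hypothetical class called record instead of foo, and a hypothetical round_trip() helper; it also validates the stored length before resizing the string, which is an addition not present in the recipe's code.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <utility>

class record  // hypothetical, reduced stand-in for foo
{
   int i = 0;
   std::string s;
public:
   record() = default;
   record(int i, std::string s) : i(i), s(std::move(s)) {}
   bool operator==(record const& o) const { return i == o.i && s == o.s; }

   friend std::ofstream& operator<<(std::ofstream& ofile, record const& r);
   friend std::ifstream& operator>>(std::ifstream& ifile, record& r);
};

std::ofstream& operator<<(std::ofstream& ofile, record const& r)
{
   ofile.write(reinterpret_cast<char const*>(&r.i), sizeof(r.i));
   auto const size = static_cast<int>(r.s.size());
   ofile.write(reinterpret_cast<char const*>(&size), sizeof(size));
   ofile.write(r.s.data(), size);
   return ofile;
}

std::ifstream& operator>>(std::ifstream& ifile, record& r)
{
   record tmp;
   int size = 0;
   ifile.read(reinterpret_cast<char*>(&tmp.i), sizeof(tmp.i));
   ifile.read(reinterpret_cast<char*>(&size), sizeof(size));
   if (ifile && size >= 0)
   {
      tmp.s.resize(static_cast<std::size_t>(size));
      ifile.read(&tmp.s[0], size);
   }
   else
   {
      ifile.setstate(std::ios::failbit);  // reject a corrupt length
   }
   if (ifile) r = std::move(tmp);  // modify the target only on success
   return ifile;
}

// Writes one object to a file and reads it back.
bool round_trip(char const* filename, record const& in, record& out)
{
   {
      std::ofstream ofile(filename, std::ios::binary);
      if (!ofile.is_open()) return false;
      ofile << in;
      if (!ofile) return false;
   }  // the file is closed here
   std::ifstream ifile(filename, std::ios::binary);
   return ifile.is_open() && static_cast<bool>(ifile >> out);
}
```

Because the operators return the stream, the result of ifile >> out can be tested directly to find out whether the object was read completely.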
Regardless of whether we serialize the entire object (for POD types) or only parts of it, we use the same stream classes that we discussed in the previous recipe: ofstream for output file streams and ifstream for input file streams. Details about writing and reading data using these standard classes have been discussed in that recipe and will not be reiterated here.
When you serialize and deserialize objects to and from files, you should avoid writing the values of the pointers to a file. Additionally, you must not read pointer values from the file since these represent memory addresses and are meaningless across processes and even in the same process some moments later. Instead, you should write data referred by a pointer and read data into objects referred by a pointer.
This is a general principle, and, in practice, you may encounter situations where a source may have multiple pointers to the same object; in this case, you might want to write only one copy and also handle the reading in a corresponding manner.
If the objects you want to serialize are of the POD type, you can do it just like we did when we discussed raw data. In the example in this recipe, we serialized a sequence of objects of the foopod type. When we deserialize, we read from the file stream in a loop until the end of the file is reached or a failure occurs. The way we read, in this case, may look counterintuitive, but doing it differently may lead to the duplication of the last read value:
If reading is done using a loop with an exit condition that checks the end-of-file bit, that is, while(!ifile.eof()), the last value will be added to the input sequence twice. The reason for this is that upon reading the last value, the end of the file has not yet been encountered (as that is a mark beyond the last byte of the file). The end-of-file mark is only reached at the next read attempt, which, therefore, sets the eofbit of the stream. However, the input variable still has the last value since it hasn't been overwritten with anything, and this is added to the input vector for a second time.
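An equivalent way to avoid the duplicated last value is to use the result of read() itself as the loop condition. The sketch below assumes two hypothetical helper templates, write_all() and read_all(); the static_assert is an addition that rejects types for which raw byte I/O is not meaningful.

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <type_traits>
#include <vector>

// Hypothetical helper: writes each element as raw bytes.
template <typename T>
bool write_all(char const* filename, std::vector<T> const& values)
{
   static_assert(std::is_trivially_copyable<T>::value,
                 "raw byte I/O requires a trivially copyable type");
   std::ofstream ofile(filename, std::ios::binary);
   if (!ofile.is_open()) return false;
   for (auto const& v : values)
      ofile.write(reinterpret_cast<char const*>(&v), sizeof(v));
   ofile.flush();
   return ofile.good();
}

// Hypothetical helper: reads elements until a read does not succeed.
template <typename T>
std::vector<T> read_all(char const* filename)
{
   static_assert(std::is_trivially_copyable<T>::value,
                 "raw byte I/O requires a trivially copyable type");
   std::vector<T> values;
   std::ifstream ifile(filename, std::ios::binary);
   T value;
   // read() returns the stream, which converts to false as soon as a
   // read fails, so an incomplete or missing element is never stored
   while (ifile.read(reinterpret_cast<char*>(&value), sizeof(value)))
      values.push_back(value);
   return values;
}
```

With this form, the value is pushed into the vector only after the read that produced it has been verified, so there is no window in which a stale value can be added.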
If the objects you want to serialize and deserialize are of non-POD types, writing/reading these objects as raw data is not possible. For instance, such an object may contain a pointer to a virtual table. Writing this pointer to a file does not cause problems, even though it does not have any value; however, reading it back from a file, and, therefore, overwriting the virtual table pointer of an object, will have catastrophic effects on the object and the program.
When serializing/deserializing non-POD types, there are various alternatives, and some of them have been discussed in the previous section. All of them either provide explicit methods for writing and reading or overload the standard << and >> operators. The second approach has an advantage in that it enables the use of your class in generic code, where objects are written and read to and from stream files using these operators.
When you plan to serialize and deserialize your objects, consider versioning your data from the very beginning to avoid problems if the structure of your data changes over time. How versioning should be done is beyond the scope of this recipe.
How writing or reading to and from streams is performed may depend on the language and regional settings. Examples include writing and parsing numbers, time values, or monetary values, or comparing (collating) strings. The C++ I/O library provides a general-purpose mechanism for handling internationalization features through locales and facets. In this recipe, you will learn how to use locales to control the behavior of input/output streams.
All of the examples in this recipe use the std::cout predefined console stream object. However, the same applies to all I/O stream objects. Also, in these recipe examples, we will use the following objects and lambda function:
auto now = std::chrono::system_clock::now();
auto stime = std::chrono::system_clock::to_time_t(now);
auto ltime = std::localtime(&stime);
std::vector<std::string> names
{"John", "adele", "Øivind", "François", "Robert", "Åke"};
auto sort_and_print = [](std::vector<std::string> v,
                         std::locale const & loc)
{
   std::sort(v.begin(), v.end(), loc);
   for (auto const & s : v) std::cout << s << ' ';
   std::cout << '\n';
};
The locale names used in this recipe (en_US.utf8, de_DE.utf8, and so on) are the ones that are used on UNIX systems. The following table lists their equivalents for Windows systems:
UNIX |
Windows |
To control the localization settings of a stream, you must do the following:
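Use the std::locale class to represent the localization settings. There are various ways in which to construct locale objects, including the following:
Default construct it to use the global locale (by default, the C locale at the program startup)
From a locale name, such as C, POSIX, en_US.utf8, and so on, if supported by the operating system
From another locale, except for a specified facet
From another locale, except for the facets in a specified category, which are taken from a second locale
// default construct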
auto loc_def = std::locale {};
// from a name
auto loc_us = std::locale {"en_US.utf8"};
// from another locale except for a facet
auto loc1 = std::locale {loc_def,
new std::collate<wchar_t>};
// from another locale, except the facets in a category
auto loc2 = std::locale {loc_def, loc_us,
std::locale::collate};
To obtain the default C locale, use the std::locale::classic() static method:
auto loc = std::locale::classic();
To replace the global locale, which is copied every time a locale is default-constructed, use the std::locale::global() static method:
std::locale::global(std::locale("en_US.utf8"));
Use the imbue() method to change the current locale of an I/O stream:
std::cout.imbue(std::locale("en_US.utf8"));
The following list shows examples of using various locales:
auto loc = std::locale("de_DE.utf8");
std::cout.imbue(loc);
std::cout << 1000.50 << '\n';
// 1.000,5
std::cout << std::showbase << std::put_money(1050)
<< '\n';
// 10,50 €
std::cout << std::put_time(ltime, "%c") << '\n';
// So 04 Dez 2016 17:54:06 JST
sort_and_print(names, loc);
// adele Åke François John Øivind Robert
Use the user's preferred locale, as configured in the system, by constructing an std::locale object from an empty string:
auto loc = std::locale("");
std::cout.imbue(loc);
std::cout << 1000.50 << '\n';
// 1,000.5
std::cout << std::showbase << std::put_money(1050)
<< '\n';
// $10.50
std::cout << std::put_time(ltime, "%c") << '\n';
// Sun 04 Dec 2016 05:54:06 PM JST
sort_and_print(names, loc);
// adele Åke François John Øivind Robert
std::locale::global(std::locale("sv_SE.utf8")); // set global
auto loc = std::locale{}; // use global
std::cout.imbue(loc);
std::cout << 1000.50 << '\n';
// 1 000,5
std::cout << std::showbase << std::put_money(1050)
<< '\n';
// 10,50 kr
std::cout << std::put_time(ltime, "%c") << '\n';
// sön 4 dec 2016 18:02:29
sort_and_print(names, loc);
// adele François John Robert Åke Øivind
Use the default C locale:
auto loc = std::locale::classic();
std::cout.imbue(loc);
std::cout << 1000.50 << '\n';
// 1000.5
std::cout << std::showbase << std::put_money(1050)
<< '\n';
// 1050
std::cout << std::put_time(ltime, "%c") << '\n';
// Sun Dec 4 17:55:14 2016
sort_and_print(names, loc);
// François John Robert adele Åke Øivind
A locale object does not actually store localized settings. A locale is a heterogeneous container of facets. A facet is an object that defines the localization and internationalization settings. The standard defines a list of facets that each locale must contain. In addition to this, a locale can contain any other user-defined facets. The following is a list of all standard-defined facets:
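collate, collate_byname, ctype, ctype_byname, codecvt, codecvt_byname, messages, messages_byname, money_get, money_put, moneypunct, moneypunct_byname, num_get, num_put, numpunct, numpunct_byname, time_get, time_get_byname, time_put, time_put_byname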
It is beyond the scope of this recipe to go through this list and discuss all of these facets. However, we could mention that std::money_get is a facet that encapsulates the rules for parsing monetary values from character streams, while std::money_put is a facet that encapsulates the rules for formatting monetary values as strings. In a similar manner, std::time_get encapsulates rules for date and time parsing, while std::time_put encapsulates rules for date and time formatting. These will form the subject of the next couple of recipes.
A locale is an immutable object containing immutable facet objects. Locales are typically implemented as a reference-counted array of reference-counted pointers to facets. The array is indexed by std::locale::id, and all facets must be derived from the base class std::locale::facet and must have a public static member of the std::locale::id type, called id.
It is only possible to create a locale object using one of the overloaded constructors or with the combine() method, which, as the name implies, combines the current locale with a new compile-time identifiable facet and returns a new locale object. On the other hand, it is possible to determine whether a locale contains a particular facet using the std::has_facet() function template, or to obtain a reference to a facet implemented by a particular locale using the std::use_facet() function template.
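To make these requirements concrete, the following sketch queries a standard facet and defines a minimal user-defined one. The names decimal_point_of(), unit_facet, and with_unit() are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <locale>
#include <string>

// Hypothetical helper: the decimal separator a locale uses for numbers.
char decimal_point_of(std::locale const& loc)
{
   return std::use_facet<std::numpunct<char>>(loc).decimal_point();
}

// Hypothetical user-defined facet: derives from std::locale::facet and
// has a public static member named id of type std::locale::id.
struct unit_facet : std::locale::facet
{
   static std::locale::id id;
   std::string unit() const { return "km"; }
};
std::locale::id unit_facet::id;

// Builds a new locale that is a copy of base plus the custom facet.
// The locale takes ownership of the facet object.
std::locale with_unit(std::locale const& base)
{
   return std::locale(base, new unit_facet);
}
```

Given auto loc = with_unit(std::locale::classic()), the call std::has_facet<unit_facet>(loc) returns true, while the same query on the classic locale itself returns false, since locales are immutable and the original is left unchanged.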
In the preceding examples, we sorted a vector of strings and passed a locale object as the third argument to the std::sort() general algorithm. This third argument is supposed to be a comparison function object. Passing a locale object works because std::locale has an operator() that lexicographically compares two strings using its collate facet. This is actually the only localization functionality that is directly provided by std::locale; however, what this does is invoke the collate facet's compare() method, which performs the string comparison based on the facet's rules.
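The relationship between the locale's operator() and the collate facet can be sketched as follows. The helper names sorted_with() and sorted_with_facet() are hypothetical; the second one spells out, by hand, roughly what the first one does implicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <string>
#include <vector>

// Sorts using the locale object itself as the comparison function.
std::vector<std::string> sorted_with(std::vector<std::string> v,
                                     std::locale const& loc)
{
   std::sort(v.begin(), v.end(), loc);
   return v;
}

// Sorts by calling the collate facet's compare() explicitly.
std::vector<std::string> sorted_with_facet(std::vector<std::string> v,
                                           std::locale const& loc)
{
   auto const& coll = std::use_facet<std::collate<char>>(loc);
   std::sort(v.begin(), v.end(),
      [&coll](std::string const& a, std::string const& b)
      {
         // compare() returns a negative value when a sorts before b
         return coll.compare(a.data(), a.data() + a.size(),
                             b.data(), b.data() + b.size()) < 0;
      });
   return v;
}
```

With the classic locale, both functions order ASCII uppercase letters before lowercase ones, which is why adele is printed after Robert in the C locale example shown earlier.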
Every program has a global locale created when the program starts. The content of this global locale is copied into every default-constructed locale. The global locale can be replaced using the static method std::locale::global(). By default, the global locale is the C locale, which is a locale equivalent to ANSI C's locale with the same name. This locale was created to handle simple English texts, and it is the default one in C++ that provides compatibility with C. A reference to this locale can be obtained with the static method std::locale::classic().
By default, all streams use the classic locale to write or parse text. However, it is possible to change the locale used by a stream using the stream's imbue() method. This is a member of the std::ios_base class that is the base for all I/O streams. A companion member is the getloc() method, which returns a copy of the current stream's locale.
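As an illustration of imbue() and getloc() that does not depend on which named locales are installed on the system, the sketch below builds a locale from the stream's current one plus a custom numeric punctuation facet. The names grouped_digits and format_grouped() are hypothetical.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: groups digits in threes, separated by commas.
struct grouped_digits : std::numpunct<char>
{
protected:
   char do_thousands_sep() const override { return ','; }
   std::string do_grouping() const override { return "\3"; }
};

// Formats an integer with a string stream imbued with the custom locale.
std::string format_grouped(long value)
{
   std::ostringstream os;
   // copy the stream's current locale, replacing only the numpunct facet
   os.imbue(std::locale(os.getloc(), new grouped_digits));
   os << value;
   return os.str();
}
```

Only the stream that was imbued is affected; other streams, and the global locale, keep their settings.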
In the preceding examples, we changed the locale for the std::cout stream object. In practice, you may want to set the same locale for all stream objects associated with the standard C streams: cin, cout, cerr, and clog (or wcin, wcout, wcerr, and wclog).
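One possible way of doing that is a small helper that imbues every predefined stream object. The function name imbue_standard_streams() is a hypothetical choice for this sketch.

```cpp
#include <cassert>
#include <iostream>
#include <locale>

// Hypothetical helper: applies one locale to all predefined streams.
void imbue_standard_streams(std::locale const& loc)
{
   // narrow-character streams
   std::cin.imbue(loc);
   std::cout.imbue(loc);
   std::cerr.imbue(loc);
   std::clog.imbue(loc);
   // wide-character streams
   std::wcin.imbue(loc);
   std::wcout.imbue(loc);
   std::wcerr.imbue(loc);
   std::wclog.imbue(loc);
}
```

Note that this does not change the global locale; streams created afterward still start with a copy of whatever std::locale::global() last set.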
Apart from the stream-based I/O library, the standard library provides a series of helper functions, called manipulators, that control the input and output streams using operator<< and operator>>. In this recipe, we will look at some of these manipulators and demonstrate their use through some examples that format the output to the console. We will continue covering more manipulators in the upcoming recipes.
The I/O manipulators are available in the std namespace in the headers <ios>, <istream>, <ostream>, and <iomanip>. In this recipe, we will only discuss some of the manipulators from <ios> and <iomanip>.
The following manipulators can be used to control the output or input of a stream:
boolalpha and noboolalpha enable and disable the textual representation of Booleans:
std::cout << std::boolalpha << true << '\n'; // true
std::cout << false << '\n'; // false
std::cout << std::noboolalpha << false << '\n'; // 0
left, right, and internal affect the alignment of the fill characters; left and right affect all text, but internal affects only the integer, floating-point, and monetary output:
std::cout << std::right << std::setw(10) << "right\n";
std::cout << std::setw(10) << "text\n";
std::cout << std::left << std::setw(10) << "left\n";
fixed, scientific, hexfloat, and defaultfloat change the formatting used for floating-point types (for both the input and output streams). The latter two have only been available since C++11:
std::cout << std::fixed << 0.25 << '\n';
// 0.250000
std::cout << std::scientific << 0.25 << '\n';
// 2.500000e-01
std::cout << std::hexfloat << 0.25 << '\n';
// 0x1p-2
std::cout << std::defaultfloat << 0.25 << '\n';
// 0.25
dec, hex, and oct control the base that is used for the integer types (in both the input and output streams):
std::cout << std::oct << 42 << '\n'; // 52
std::cout << std::hex << 42 << '\n'; // 2a
std::cout << std::dec << 42 << '\n'; // 42
setw changes the width of the next input or output field. The default width is 0.
setfill changes the fill character for the output stream; this is the character that is used to fill the next fields until the specified width is reached. The default fill character is the space character:
std::cout << std::right
<< std::setfill('.') << std::setw(10)
<< "right" << '\n';
// .....right
setprecision changes the decimal precision (how many digits are generated) for the floating-point types in both the input and output streams. The default precision is 6:
std::cout << std::fixed << std::setprecision(2) << 12.345
<< '\n';
// 12.35
All of the I/O manipulators listed earlier, with the exception of setw, which only applies to the next input or output field, change the state of the stream persistently: all subsequent writing or reading operations use the last specified format until another manipulator changes it.
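The difference between the one-shot width and the persistent settings can be observed with a string stream. The two function names below are hypothetical and only serve this sketch.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// setw applies only to the next field; the fill character persists.
std::string width_is_one_shot()
{
   std::ostringstream os;
   os << std::setfill('.') << std::setw(4) << 7 << 8;
   return os.str();  // "...78": 8 is written with the default width
}

// The numeric base stays in effect until it is changed again.
std::string base_is_sticky()
{
   std::ostringstream os;
   os << std::hex << 255 << ' ' << 16;  // both values written in hex
   os << std::dec << ' ' << 16;         // back to decimal
   return os.str();  // "ff 10 16"
}
```

Because of this persistence, code that changes the formatting of a shared stream such as std::cout usually restores the previous settings when it is done.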
Some of these manipulators are called without arguments. Examples include boolalpha/noboolalpha or dec/hex/oct. These manipulators are functions that take a single argument, that is, a reference to a stream, and return a reference to the same stream:
std::ios_base& hex(std::ios_base& str);
Expressions, such as std::cout << std::hex, are possible because both basic_ostream::operator<< and basic_istream::operator>> have special overloads that take a pointer to these functions.
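The same mechanism works for user-defined functions with a matching signature. The sketch below defines two hypothetical manipulators, comma and hex_upper, along with small helpers that exercise them on a string stream.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical manipulator with the same shape as std::endl.
std::ostream& comma(std::ostream& os)
{
   return os << ", ";
}

// Hypothetical manipulator with the same shape as std::hex.
std::ios_base& hex_upper(std::ios_base& str)
{
   str.setf(std::ios_base::hex, std::ios_base::basefield);
   str.setf(std::ios_base::uppercase);
   return str;
}

std::string join_pair(int a, int b)
{
   std::ostringstream os;
   os << a << comma << b;  // operator<< calls comma(os)
   return os.str();
}

std::string to_hex_upper(int value)
{
   std::ostringstream os;
   os << hex_upper << value;  // operator<< calls hex_upper(os)
   return os.str();
}
```

In both cases, the function name decays to a function pointer, and the corresponding operator<< overload simply invokes it with the stream.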
Other manipulators, including some that are not mentioned here, are invoked with arguments. These manipulators are functions that take one or more arguments and return an object of an unspecified type:
template<class CharT>
/*unspecified*/ setfill(CharT c);
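A common way to implement a manipulator of this kind is to return a small object and overload operator<< for it. The following sketch assumes a hypothetical manipulator called repeat(); it is not how any particular standard library implements setfill.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical carrier type: stores the arguments of the manipulator.
struct repeat_t
{
   char ch;
   int count;
};

// The manipulator itself only packages its arguments.
repeat_t repeat(char ch, int count)
{
   return {ch, count};
}

// The actual work happens when the object is inserted into a stream.
std::ostream& operator<<(std::ostream& os, repeat_t r)
{
   for (int i = 0; i < r.count; ++i) os.put(r.ch);
   return os;
}

// Writes a title followed by a line of dashes of the same length.
std::string underline(std::string const& title)
{
   std::ostringstream os;
   os << title << '\n' << repeat('-', static_cast<int>(title.size()));
   return os.str();
}
```

The type returned by the manipulator is an implementation detail, which is why the standard leaves the return types of setfill, setw, and similar functions unspecified.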
To better demonstrate the use of these manipulators, we will consider two examples that format output to the console.
In the first example, we will list the table of contents of a book with the following requirements:
For this example, we will use the following classes and helper function:
struct Chapter
{
  int Number;
  std::string Title;
  int Page;
};
struct BookPart
{
  std::string Title;
  std::vector<Chapter> Chapters;
};
struct Book
{
  std::string Title;
  std::vector<BookPart> Parts;
};
std::string to_roman(unsigned int value)
{
  struct roman_t { unsigned int value; char const* numeral; };
  const static roman_t rarr[13] =
  {
    {1000, "M"}, {900, "CM"}, {500, "D"}, {400, "CD"},
    { 100, "C"}, { 90, "XC"}, { 50, "L"}, { 40, "XL"},
    {  10, "X"}, {  9, "IX"}, {  5, "V"}, {  4, "IV"},
    {   1, "I"}
  };
  std::string result;
  for (auto const & number : rarr)
  {
    while (value >= number.value)
    {
      result += number.numeral;
      value -= number.value;
    }
  }
  return result;
}
The print_toc() function, as shown in the following code snippet, takes a Book as its argument and prints its content to the console according to the specified requirements. For this purpose, we use the following:
std::left and std::right specify the text alignment
std::setw specifies the width of each output field
std::setfill specifies the fill character (a blank space for the chapter number and a dot for the chapter title)
The implementation of the print_toc() function is listed here:
void print_toc(Book const & book)
{
  std::cout << book.Title << '\n';
  for(auto const & part : book.Parts)
  {
    std::cout << std::left << std::setw(15) << std::setfill(' ')
              << part.Title << '\n';
    std::cout << std::left << std::setw(15) << std::setfill('-')
              << '-' << '\n';
    for(auto const & chapter : part.Chapters)
    {
      std::cout << std::right << std::setw(4) << std::setfill(' ')
                << to_roman(chapter.Number) << ' ';
      std::cout << std::left << std::setw(35) << std::setfill('.')
                << chapter.Title;
      std::cout << std::right << std::setw(3) << std::setfill('.')
                << chapter.Page << '\n';
    }
  }
}
The following example uses this function with a Book object describing the table of contents from the book The Fellowship of the Ring:
auto book = Book
{
"THE FELLOWSHIP OF THE RING"s,
{
{
"BOOK ONE"s,
{
{1, "A Long-expected Party"s, 21},
{2, "The Shadow of the Past"s, 42},
{3, "Three Is Company"s, 65},
{4, "A Short Cut to Mushrooms"s, 86},
{5, "A Conspiracy Unmasked"s, 98},
{6, "The Old Forest"s, 109},
{7, "In the House of Tom Bombadil"s, 123},
{8, "Fog on the Barrow-downs"s, 135},
{9, "At the Sign of The Prancing Pony"s, 149},
{10, "Strider"s, 163},
{11, "A Knife in the Dark"s, 176},
{12, "Flight to the Ford"s, 197},
},
},
{
"BOOK TWO"s,
{
{1, "Many Meetings"s, 219},
{2, "The Council of Elrond"s, 239},
{3, "The Ring Goes South"s, 272},
{4, "A Journey in the Dark"s, 295},
{5, "The Bridge of Khazad-dum"s, 321},
{6, "Lothlorien"s, 333},
{7, "The Mirror of Galadriel"s, 353},
{8, "Farewell to Lorien"s, 367},
{9, "The Great River"s, 380},
{10, "The Breaking of the Fellowship"s, 390},
},
},
}
};
print_toc(book);
In this case, the output is as follows:
THE FELLOWSHIP OF THE RING
BOOK ONE
---------------
   I A Long-expected Party...............21
  II The Shadow of the Past..............42
 III Three Is Company....................65
  IV A Short Cut to Mushrooms............86
   V A Conspiracy Unmasked...............98
  VI The Old Forest.....................109
 VII In the House of Tom Bombadil.......123
VIII Fog on the Barrow-downs............135
  IX At the Sign of The Prancing Pony...149
   X Strider............................163
  XI A Knife in the Dark................176
 XII Flight to the Ford.................197
BOOK TWO
---------------
   I Many Meetings......................219
  II The Council of Elrond..............239
 III The Ring Goes South................272
  IV A Journey in the Dark..............295
   V The Bridge of Khazad-dum...........321
  VI Lothlorien.........................333
 VII The Mirror of Galadriel............353
VIII Farewell to Lorien.................367
  IX The Great River....................380
   X The Breaking of the Fellowship.....390
For the second example, our goal is to output a table that lists the largest companies in the world by revenue. The table will have columns for the company name, the industry, the revenue (in USD billions), the increase/decrease in revenue growth, the revenue growth, the number of employees, and the country of origin. For this example, we will use the following class:
struct Company
{
std::string Name;
std::string Industry;
double Revenue;
bool RevenueIncrease;
double Growth;
int Employees;
std::string Country;
};
The print_companies() function in the following code snippet uses several additional manipulators to the ones shown in the previous example:
std::boolalpha displays Boolean values as true and false instead of 1 and 0.
std::fixed indicates a fixed floating-point representation, and then std::defaultfloat reverts to the default floating-point representation.
std::setprecision specifies the number of decimal digits to be displayed in the output. Together with std::fixed, this is used to indicate a fixed representation with a decimal digit for the Growth field.
The implementation of the print_companies() function is listed here:
void print_companies(std::vector<Company> const & companies)
{
  for(auto const & company : companies)
  {
    std::cout << std::left << std::setw(26) << std::setfill(' ')
              << company.Name;
    std::cout << std::left << std::setw(18) << std::setfill(' ')
              << company.Industry;
    std::cout << std::left << std::setw(5) << std::setfill(' ')
              << company.Revenue;
    std::cout << std::left << std::setw(5) << std::setfill(' ')
              << std::boolalpha << company.RevenueIncrease
              << std::noboolalpha;
    std::cout << std::right << std::setw(5) << std::setfill(' ')
              << std::fixed << std::setprecision(1) << company.Growth
              << std::defaultfloat << std::setprecision(6) << ' ';
    std::cout << std::right << std::setw(8) << std::setfill(' ')
              << company.Employees << ' ';
    std::cout << std::left << std::setw(2) << std::setfill(' ')
              << company.Country
              << '\n';
  }
}
The following is an example of calling this method. The source of the data shown here is Wikipedia (https://en.wikipedia.org/wiki/List_of_largest_companies_by_revenue, as of 2016):
std::vector<Company> companies
{
{"Walmart"s, "Retail"s, 482, false, 0.71,
2300000, "US"s},
{"State Grid"s, "Electric utility"s, 330, false, 2.91,
927839, "China"s},
{"Saudi Aramco"s, "Oil and gas"s, 311, true, 40.11,
65266, "SA"s},
{"China National Petroleum"s, "Oil and gas"s, 299,
false, 30.21, 1589508, "China"s},
{"Sinopec Group"s, "Oil and gas"s, 294, false, 34.11,
810538, "China"s},
};
print_companies(companies);
In this case, the output has a table-based format, as follows:
Walmart                   Retail            482  false  0.7  2300000 US
State Grid                Electric utility  330  false  2.9   927839 China
Saudi Aramco              Oil and gas       311  true  40.1    65266 SA
China National Petroleum  Oil and gas       299  false 30.2  1589508 China
Sinopec Group             Oil and gas       294  false 34.1   810538 China
As an exercise, you can try adding a table heading or even a grid line to precede these lines for a better tabulation of the data.
In the previous recipe, we looked at some of the manipulators that can be used to control input and output streams. The manipulators that we discussed were related to numeric values and text values. In this recipe, we will look at how to use standard manipulators to write and read monetary values.
You should now be familiar with locales and how to set them for a stream. This topic was discussed in the Using localized settings for streams recipe. It is recommended that you read that recipe before continuing.
The manipulators discussed in this recipe are available in the std namespace, in the <iomanip> header.
To write a monetary value to an output stream, you should do the following:
std::cout.imbue(std::locale("en_GB.utf8"));
Use a long double or a std::basic_string value for the amount:
long double mon = 12345.67;
std::string smon = "12345.67";
Use the std::put_money manipulator with a single argument, the monetary value, to display the value using the currency symbol (if any is available). When the value is a string, only the leading digits are used, so the fractional part after the decimal point is ignored:
std::cout << std::showbase << std::put_money(mon)
          << '\n'; // £123.46
std::cout << std::showbase << std::put_money(smon)
          << '\n'; // £123.45
Use std::put_money with two arguments, the monetary value and a Boolean flag set to true, to indicate the use of an international currency string:
std::cout << std::showbase << std::put_money(mon, true)
          << '\n'; // GBP 123.46
std::cout << std::showbase << std::put_money(smon, true)
          << '\n'; // GBP 123.45
To read a monetary value from an input stream, you should do the following:
std::istringstream stext("$123.45 123.45 USD");
stext.imbue(std::locale("en_US.utf8"));
Use a long double or std::basic_string value to read the amount from the input stream:
long double v1;
std::string v2;
Use std::get_money() with a single argument, the variable where the monetary value is to be written, if a currency symbol might be used in the input stream:
stext >> std::get_money(v1) >> std::get_money(v2);
// v1 = 12345, v2 = "12345"
Use std::get_money() with two arguments, the variable where the monetary value is to be written and a Boolean flag set to true, to indicate the presence of an international currency string:
stext >> std::get_money(v1, true) >> std::get_money(v2, true);
// v1 = 0, v2 = "12345"
The put_money() and get_money() manipulators are very similar. They are both function templates that take an argument representing either the monetary value to be written to the output stream or a variable to hold the monetary value read from an input stream, and a second, optional parameter, to indicate whether an international currency string is used. The default alternative is the currency symbol, if one is available. put_money() uses the std::money_put facet of the stream's locale to output a monetary value, and get_money() uses the std::money_get facet to parse a monetary value. Both manipulator function templates return an object of an unspecified type. These functions do not throw exceptions:
template <class MoneyT>
/*unspecified*/ put_money(const MoneyT& mon, bool intl = false);
template <class MoneyT>
/*unspecified*/ get_money(MoneyT& mon, bool intl = false);
Both of these manipulator functions require the monetary value to be either a long double or a std::basic_string.
However, it is important to note that monetary values are stored as integral numbers of the smallest denomination of the currency defined by the locale in use. Considering US dollars as that currency, $100.00 is stored as 10000.0, and 1 cent, that is, $0.01, is stored as 1.0.
When writing a monetary value to an output stream, it is important to use the std::showbase manipulator if you want to display the currency symbol or the international currency string. This is normally used to indicate the prefix of a numeric base (such as 0x for hexadecimal); however, for monetary values, it is used to indicate whether the currency symbol/string should be displayed or not. The following snippet provides an example:
// print 123.46
std::cout << std::put_money(12345.67) << '\n';
// print £123.46
std::cout << std::showbase << std::put_money(12345.67) << '\n';
In the preceding snippet, the first line will just print the numerical value representing a currency amount, 123.46, while the second line will print the same numerical value but preceded by the currency symbol.
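Named locales such as en_GB.utf8 may not be installed on every system, so the following sketch uses a small custom std::moneypunct facet instead. It is only an illustration under that assumption: the names demo_punct and format_money are hypothetical, and the facet simply describes a currency with a $ symbol and two fractional digits:

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet describing a dollar-like currency with two
// fractional digits; all other settings come from the base class.
struct demo_punct : std::moneypunct<char>
{
  string_type do_curr_symbol() const override { return "$"; }
  int do_frac_digits() const override { return 2; }
  char_type do_decimal_point() const override { return '.'; }
};

// Formats an amount expressed in the smallest denomination (cents here).
// The currency symbol is written only when std::showbase is set.
std::string format_money(long double units, bool show_symbol)
{
  std::ostringstream os;
  os.imbue(std::locale(std::locale::classic(), new demo_punct));
  if (show_symbol) os << std::showbase;
  os << std::put_money(units);
  return os.str();
}
```

With this facet, format_money(12345.67L, false) produces 123.46, and format_money(10000.0L, true) produces $100.00, which shows both the storage in the smallest denomination and the role of std::showbase.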
Similar to the monetary I/O manipulators that we discussed in the previous recipe, the C++11 standard provides manipulators that control the writing and reading of time values to and from streams, where time values are represented in the form of an std::tm object that holds a calendar date and time. In this recipe, you will learn how to use these time manipulators.
Time values used by the time I/O manipulators are expressed in std::tm values. You should be familiar with this structure from the <ctime> header.
You should also be familiar with locales and how to set them for a stream. This topic was discussed in the Using localized settings for streams recipe. It is recommended that you read that recipe before continuing.
The manipulators discussed in this recipe are available in the std namespace, in the <iomanip> header.
To write a time value to an output stream, you should perform the following steps:
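// Option 1: obtain the current time from std::chrono::system_clock
auto now = std::chrono::system_clock::now();
auto stime = std::chrono::system_clock::to_time_t(now);
auto ltime = std::localtime(&stime);
// Option 2: obtain the current time with std::time()
auto ttime = std::time(nullptr);
auto ltime = std::localtime(&ttime);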
Use std::put_time() to supply a pointer to the std::tm object, representing the calendar date and time, and a pointer to a null-terminated character string, representing the format. The C++11 standard provides a long list of formats that can be used; this list can be consulted at http://en.cppreference.com/w/cpp/io/manip/put_time.
To write a time value according to the settings of a specific locale, first set the locale for the stream by calling imbue() and then use the std::put_time() manipulator:
std::cout.imbue(std::locale("en_GB.utf8"));
std::cout << std::put_time(ltime, "%c") << '\n';
// Sun 04 Dec 2016 05:26:47 JST
The following list shows some examples of supported time formats:
ISO 8601 date format, "%F" or "%Y-%m-%d":
std::cout << std::put_time(ltime, "%F") << '\n';
// 2016-12-04
ISO 8601 time format, "%T":
std::cout << std::put_time(ltime, "%T") << '\n';
// 05:26:47
ISO 8601 combined date and time with UTC offset, "%FT%T%z":
std::cout << std::put_time(ltime, "%FT%T%z") << '\n';
// 2016-12-04T05:26:47+0900
ISO 8601 week format, "%Y-W%V":
std::cout << std::put_time(ltime, "%Y-W%V") << '\n';
// 2016-W48
ISO 8601 date with week number format, "%Y-W%V-%u":
std::cout << std::put_time(ltime, "%Y-W%V-%u") << '\n';
// 2016-W48-7
ISO 8601 ordinal date format, "%Y-%j":
std::cout << std::put_time(ltime, "%Y-%j") << '\n';
// 2016-339
To read a time value from an input stream, you should perform the following steps:
Declare an object of the std::tm type to hold the time value read from the stream:
auto time = std::tm {};
Use std::get_time() to supply a pointer to the std::tm object, which will hold the time value, and a pointer to a null-terminated character string, which represents the format. The list of possible formats can be consulted at http://en.cppreference.com/w/cpp/io/manip/get_time. The following example parses an ISO 8601 combined date and time value:
std::istringstream stext("2016-12-04T05:26:47+0900");
stext >> std::get_time(&time, "%Y-%m-%dT%H:%M:%S");
if (!stext.fail()) { /* do something */ }
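To read a time value according to the settings of a specific locale, first set the locale for the stream by calling imbue() and then use the std::get_time() manipulator: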
std::istringstream stext("Sun 04 Dec 2016 05:35:30 JST");
stext.imbue(std::locale("en_GB.utf8"));
stext >> std::get_time(&time, "%c");
if (stext.fail()) { /* do something else */ }
The two manipulators for time values, put_time() and get_time(), are very similar: they are both function templates with two arguments. The first argument is a pointer to an std::tm object representing the calendar date and time that holds the value to be written to the stream or the value that is read from the stream. The second argument is a pointer to a null-terminated character string representing the format of the time text. put_time() uses the std::time_put facet to output a date and time value, and get_time() uses the std::time_get facet to parse a date and time value. Both manipulator function templates return an object of an unspecified type. These functions do not throw exceptions:
template<class CharT>
/*unspecified*/ put_time(const std::tm* tmb, const CharT* fmt);
template<class CharT>
/*unspecified*/ get_time(std::tm* tmb, const CharT* fmt);
The string that results from using put_time() to write a date and time value to an output stream is the same as the one that results from a call to std::strftime() or std::wcsftime().
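This equivalence can be checked with a small sketch such as the following. The helper names make_sample_time, format_with_put_time, and format_with_strftime are hypothetical; the sample uses a fixed calendar time (the one from the earlier examples) so that the result does not depend on the clock, and only locale-independent conversion specifiers:

```cpp
#include <ctime>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Builds a fixed calendar time: Sunday, 2016-12-04 05:26:47.
std::tm make_sample_time()
{
  std::tm t{};
  t.tm_year = 2016 - 1900; // years since 1900
  t.tm_mon  = 11;          // months since January
  t.tm_mday = 4;
  t.tm_hour = 5;
  t.tm_min  = 26;
  t.tm_sec  = 47;
  t.tm_wday = 0;           // days since Sunday
  t.tm_yday = 338;         // days since January 1
  return t;
}

// Formats the time with the put_time() manipulator in the classic locale.
std::string format_with_put_time(std::tm const & t, char const * fmt)
{
  std::ostringstream os;
  os.imbue(std::locale::classic());
  os << std::put_time(&t, fmt);
  return os.str();
}

// Formats the same time with std::strftime() for comparison.
std::string format_with_strftime(std::tm const & t, char const * fmt)
{
  char buffer[64];
  auto const count = std::strftime(buffer, sizeof(buffer), fmt, &t);
  return std::string(buffer, count);
}
```

For the sample time, both helpers produce 2016-12-04 05:26:47 when called with the format "%F %T".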
The standard defines a long list of available conversion specifiers that compose the format string. These specifiers are prefixed with a %, and, in some cases, the % is followed by an E or an O modifier. Some of them are also equivalent; for instance, %F is equivalent to %Y-%m-%d (this is the ISO 8601 date format), and %T is equivalent to %H:%M:%S (this is the ISO 8601 time format). The examples in this recipe mention only a few of the conversion specifiers, referring to ISO 8601 date and time formats. For the complete list of conversion specifiers, refer to the C++ standard or follow the links that were mentioned earlier.
It is important to note that not all of the conversion specifiers supported by put_time() are also supported by get_time(). Examples include the z (offset from UTC in the ISO 8601 format) and Z (time zone name or abbreviation) specifiers, which can only be used with put_time(). This is demonstrated in the following snippet:
std::istringstream stext("2016-12-04T05:26:47+0900");
auto time = std::tm {};
stext >> std::get_time(&time, "%Y-%m-%dT%H:%M:%S%z"); // fails
stext >> std::get_time(&time, "%Y-%m-%dT%H:%M:%S"); // OK
The text represented by some conversion specifiers is locale-dependent. All specifiers modified with E or O are locale-dependent. To set a particular locale for the stream, use the imbue() method, as demonstrated in the examples in the How to do it... section.
An important addition to the C++17 standard is the filesystem library that enables us to work with paths, files, and directories in hierarchical filesystems (such as Windows or POSIX filesystems). This standard library has been developed based on the boost.filesystem library. In the next few recipes, we will explore those features of the library that enable us to perform operations with files and directories, such as creating, moving, or deleting them, but also querying properties and searching. It is important, however, to first look at how this library handles paths.
For this recipe, we will consider most of the examples using Windows paths. In the accompanying code, all examples have both Windows and POSIX alternatives.
The filesystem library is available in the std::filesystem namespace, in the <filesystem> header. To simplify the code, we will use the following namespace alias in all of the examples:
namespace fs = std::filesystem;
A path to a filesystem component (file, directory, hard link, or soft link) is represented by the path class.
The following is a list of the most common operations on paths:
Create a path using the constructor, the assignment operator, or the assign() method:
// Windows
auto path = fs::path{"C:\\Users\\Marius\\Documents"};
// POSIX
auto path = fs::path{ "/home/marius/docs" };
Append elements to a path using the member operator /=, the non-member operator /, or the append() method:
path /= "Book";
path = path / "Modern" / "Cpp";
path.append("Programming");
// Windows: C:\Users\Marius\Documents\Book\Modern\Cpp\Programming
// POSIX: /home/marius/docs/Book/Modern/Cpp/Programming
Concatenate elements to a path using the member operator += or the concat() method (unlike appending, concatenation does not insert a directory separator):
auto path = fs::path{ "C:\\Users\\Marius\\Documents" };
path += "\\Book";
path.concat("\\Modern");
// path = C:\Users\Marius\Documents\Book\Modern
Decompose a path into its parts using member functions such as root_name(), root_directory(), filename(), stem(), extension(), and so on (all of them are shown in the following example):
auto path =
  fs::path{"C:\\Users\\Marius\\Documents\\sample.file.txt"};
std::cout
  << "root: " << path.root_name() << '\n'
  << "root dir: " << path.root_directory() << '\n'
  << "root path: " << path.root_path() << '\n'
  << "rel path: " << path.relative_path() << '\n'
  << "parent path: " << path.parent_path() << '\n'
  << "filename: " << path.filename() << '\n'
  << "stem: " << path.stem() << '\n'
  << "extension: " << path.extension() << '\n';
Query whether a path has a particular part using has_root_name(), has_root_directory(), has_filename(), has_stem(), and has_extension() (all of them are shown in the following example):
auto path =
  fs::path{"C:\\Users\\Marius\\Documents\\sample.file.txt"};
std::cout
  << "has root: " << path.has_root_name() << '\n'
  << "has root dir: " << path.has_root_directory() << '\n'
  << "has root path: " << path.has_root_path() << '\n'
  << "has rel path: " << path.has_relative_path() << '\n'
  << "has parent path: " << path.has_parent_path() << '\n'
  << "has filename: " << path.has_filename() << '\n'
  << "has stem: " << path.has_stem() << '\n'
  << "has extension: " << path.has_extension() << '\n';
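Check whether a path is absolute or relative with is_absolute():
auto path1 = fs::path{ "C:\\Users\\Marius\\Documents" }; // assumed here: any absolute path
auto path2 = fs::path{ "marius\\temp" };
std::cout
  << "absolute: " << path1.is_absolute() << '\n'
  << "absolute: " << path2.is_absolute() << '\n';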
Modify individual parts of a path: the filename with replace_filename() and remove_filename(), and the extension with replace_extension():
auto path =
  fs::path{"C:\\Users\\Marius\\Documents\\sample.file.txt"};
path.replace_filename("output");
path.replace_extension(".log");
// path = C:\Users\Marius\Documents\output.log
path.remove_filename();
// path = C:\Users\Marius\Documents\ (the trailing separator is kept)
Convert the directory separators to the system-preferred separator with make_preferred():
// Windows
auto path = fs::path{"Users/Marius/Documents"};
path.make_preferred();
// path = Users\Marius\Documents
// POSIX
auto path = fs::path{ "\\home\\marius\\docs" };
path.make_preferred();
// path is unchanged: on POSIX, \ is not a directory separator and / is already the preferred one
The std::filesystem::path class models paths to filesystem components. However, it only handles the syntax and does not validate the existence of a component (such as a file or a directory) represented by the path.
The library defines a portable, generic syntax for paths that can accommodate various filesystems, such as POSIX or Windows, including the Microsoft Windows Universal Naming Convention (UNC) format. POSIX and Windows filesystems differ in several key aspects:
POSIX systems have a single tree, no root name, a single root directory called /, and a single current directory. Additionally, they use / as the directory separator. Paths are represented as null-terminated strings of char encoded as UTF-8.
Windows systems have multiple trees, each with a root name (such as C:), a root directory (such as \), and a current directory (such as C:\Windows\System32). Paths are represented as null-terminated strings of wide characters encoded as UTF-16.
A pathname, as defined in the filesystem library, has the following syntax:
An optional root name (such as C: or //localhost)
An optional root directory
A relative path made of zero or more filenames separated by directory separators
There are two special filenames that are recognized: the single dot (.), which represents the current directory, and the double dot (..), which represents the parent directory. The directory separator can be repeated, in which case it is treated as a single separator (in other words, /home////docs is the same as /home/docs). A path that has no redundant current directory name (.), no redundant parent directory name (..), and no redundant directory separators is said to be in a normal form.
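The path class can produce the normal form of a path without touching the filesystem, through its lexically_normal() member function. The following is a minimal sketch for POSIX-style paths; the helper name normalize is hypothetical:

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Returns the normal form of a path given as text. The conversion is purely
// lexical: the path does not have to exist on disk.
std::string normalize(std::string const & text)
{
  return fs::path{text}.lexically_normal().generic_string();
}
```

On a POSIX system, normalize("/home/./marius/../marius//docs") returns /home/marius/docs, and normalize("/home////docs") returns /home/docs.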
The path operations presented in the previous section are the most common operations with paths. However, their implementation defines additional querying and modifying methods, iterators, non-member comparison operators, and more. The following sample iterates through the parts of a path and prints them to the console:
auto path =
  fs::path{ "C:\\Users\\Marius\\Documents\\sample.file.txt" };
for (auto const & part : path)
{
  std::cout << part << '\n';
}
The following listing represents its result:
C:
Users
Marius
Documents
sample.file.txt
In this example, sample.file.txt is the filename. This is basically the part from the last directory separator to the end of the path. This is what the member function filename() would return for the given path. The extension for this file is .txt, which is the string returned by the extension() member function. To retrieve the filename without an extension, another member function called stem() is available. Here, the string returned by this method is sample.file. For all of these methods, but also all of the other decomposition methods, there is a corresponding querying method with the same name and the prefix has_, such as has_filename(), has_stem(), and has_extension(). All of these methods return a bool value to indicate whether the path has the corresponding part.
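The same decomposition can be tried with a POSIX-style path, as in the following sketch. The path_parts structure and the decompose function are hypothetical helpers that only collect the results of the member functions discussed above:

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper type holding some of the parts of a path as text.
struct path_parts
{
  std::string parent;
  std::string filename;
  std::string stem;
  std::string extension;
};

// Splits a path into parts using the decomposition member functions.
path_parts decompose(fs::path const & p)
{
  return {
    p.parent_path().generic_string(),
    p.filename().generic_string(),
    p.stem().generic_string(),
    p.extension().generic_string()
  };
}
```

For /home/marius/docs/sample.file.txt, the parts are /home/marius/docs, sample.file.txt, sample.file, and .txt. A filename such as .bashrc, which starts with a dot and contains no other dot, has no extension, so has_extension() returns false for it.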
Operations with files, such as copying, moving, and deleting, or with directories, such as creating, renaming, and deleting, are all supported by the filesystem library. Files and directories are identified using a path (which can be absolute, canonical, or relative), a topic that was covered in the previous recipe. In this recipe, we will look at what the standard functions for the previously mentioned operations are and how they work.
Before going forward, you should read the Working with filesystem paths recipe. The introductory notes from that recipe also apply here. However, all of the examples in this recipe are platform-independent.
For all of the following examples, we will use the following variables, and assume the current path is C:\Users\Marius\Documents on Windows and /home/marius/docs for a POSIX system:
auto err = std::error_code{};
auto basepath = fs::current_path();
auto path = basepath / "temp";
auto filepath = path / "sample.txt";
We will also assume the presence of a file called sample.txt in the temp subdirectory of the current path (such as C:\Users\Marius\Documents\temp\sample.txt or /home/marius/docs/temp/sample.txt).
Use the following library functions to perform operations with directories:
To create a new directory, use create_directory(). This function does nothing if the directory already exists; however, it does not create directories recursively:
auto success = fs::create_directory(path, err);
To create new directories recursively, use create_directories():
auto temp = path / "tmp1" / "tmp2" / "tmp3";
auto success = fs::create_directories(temp, err);
To move an existing directory, use rename():
auto temp = path / "tmp1" / "tmp2" / "tmp3";
auto newtemp = path / "tmp1" / "tmp3";
fs::rename(temp, newtemp, err);
if (err) std::cout << err.message() << '\n';
To rename an existing directory, also use rename():
auto temp = path / "tmp1" / "tmp3";
auto newtemp = path / "tmp1" / "tmp4";
fs::rename(temp, newtemp, err);
if (err) std::cout << err.message() << '\n';
To copy an existing directory, use copy(). To recursively copy the entire content of a directory, use the copy_options::recursive flag:
fs::copy(path, basepath / "temp2",
         fs::copy_options::recursive, err);
if (err) std::cout << err.message() << '\n';
To create a symbolic link to a directory, use create_directory_symlink():
auto linkdir = basepath / "templink";
fs::create_directory_symlink(path, linkdir, err);
if (err) std::cout << err.message() << '\n';
To remove an empty directory, use remove():
auto temp = path / "tmp1" / "tmp4";
auto success = fs::remove(temp, err);
To remove a directory together with its entire content, recursively, use remove_all():
auto success = fs::remove_all(path, err) !=
               static_cast<std::uintmax_t>(-1);
Use the following library functions to perform operations with files:
To copy a file, use copy() or copy_file(). The next section explains the difference between the two:
auto success = fs::copy_file(filepath, path / "sample.bak", err);
if (!success) std::cout << err.message() << '\n';
fs::copy(filepath, path / "sample.cpy", err);
if (err) std::cout << err.message() << '\n';
To rename a file, use rename():
auto newpath = path / "sample.log";
fs::rename(filepath, newpath, err);
if (err) std::cout << err.message() << '\n';
To move a file, also use rename():
auto newpath = path / "sample.log";
fs::rename(newpath, path / "tmp1" / "sample.log", err);
if (err) std::cout << err.message() << '\n';
To create a symbolic link to a file, use create_symlink():
auto linkpath = path / "sample.txt.link";
fs::create_symlink(filepath, linkpath, err);
if (err) std::cout << err.message() << '\n';
To delete a file, use remove():
auto success = fs::remove(path / "sample.cpy", err);
if (!success) std::cout << err.message() << '\n';
All of the functions mentioned in this recipe, and other similar functions that are not discussed here, have multiple overloads that can be grouped into two categories:
Overloads that take, as the last argument, a reference to an std::error_code: these overloads report operating system errors through the error_code object instead of throwing a filesystem exception (many of them are declared with the noexcept specification). They set the value of the error_code object to the operating system error code if an operating system error has occurred. If no such error has occurred, then the clear() method on the error_code object is called to reset any possible previously set code.
Overloads that do not take the last argument of the std::error_code type: these overloads throw exceptions if errors occur. If an operating system error occurs, they throw an std::filesystem::filesystem_error exception. On the other hand, if memory allocation fails, these functions throw an std::bad_alloc exception.
All the examples in the previous section used the overload that does not throw exceptions but, instead, sets a code when an error occurs. Some functions return a bool to indicate a success or a failure. You can check whether the error_code object holds the code of an error by either checking whether the value of the error code, returned by the method value(), is different from zero, or by using the conversion operator bool, which returns true for the same case and false otherwise. To retrieve the explanatory string for the error code, use the message() method.
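The two categories of overloads can be compared side by side in a sketch such as the following. The helper names try_rename and try_rename_throwing are hypothetical; both attempt the same rename() operation and differ only in how a failure is reported:

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Uses the overload with an error_code: no filesystem exception is thrown,
// and the outcome is read from the error_code object.
std::string try_rename(fs::path const & from, fs::path const & to)
{
  std::error_code err;
  fs::rename(from, to, err);
  if (err) return "failed: " + err.message();
  return "ok";
}

// Uses the overload without an error_code: an operating system error is
// reported by throwing std::filesystem::filesystem_error.
bool try_rename_throwing(fs::path const & from, fs::path const & to)
{
  try
  {
    fs::rename(from, to);
    return true;
  }
  catch (fs::filesystem_error const &)
  {
    return false;
  }
}
```

If the source path does not exist, try_rename() returns a string that starts with failed: followed by the system's message, while try_rename_throwing() returns false after catching the exception.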
Some filesystem library functions are common for both files and directories. This is the case for rename(), remove(), and copy(). The working details of each of these functions can be complex, especially in the case of copy(), and are beyond the scope of this recipe. You should refer to the reference documentation if you need to perform anything other than the simple operations covered here.
When it comes to copying files, there are two functions that can be used: copy() and copy_file(). These have equivalent overloads with identical signatures and, apparently, work the same way. However, there is an important difference (other than the fact that copy() also works for directories): copy_file() follows symbolic links. To avoid doing that and, instead, copy the actual symbolic link, you must use either copy_symlink() or copy() with the copy_options::copy_symlinks flag. Both the copy() and copy_file() functions have an overload that takes an argument of the std::filesystem::copy_options type, which defines how the operation should be performed. copy_options is a scoped enum with a definition similar to the following (the standard leaves the numeric values unspecified, so they may differ between implementations):
enum class copy_options
{
none = 0,
skip_existing = 1,
overwrite_existing = 2,
update_existing = 4,
recursive = 8,
copy_symlinks = 16,
skip_symlinks = 32,
directories_only = 64,
create_symlinks = 128,
create_hard_links = 256
};
The following table defines how each of these flags affects a copy operation, either with copy() or copy_file(). The table is taken from the 27.10.10.4 paragraph of the C++17 standard:
Option group controlling copy_file() effects for existing target files
none | (Default) Error; file already exists
skip_existing | Do not overwrite existing file; do not report an error
overwrite_existing | Overwrite the existing file
update_existing | Overwrite the existing file if it is older than the replacement file
Option group controlling copy() effects for subdirectories
none | (Default) Do not copy subdirectories
recursive | Recursively copy subdirectories and their contents
Option group controlling copy() effects for symbolic links
none | (Default) Follow symbolic links
copy_symlinks | Copy symbolic links as symbolic links rather than copying the files that they point to
skip_symlinks | Ignore symbolic links
Option group controlling copy() effects for choosing the form of copying
none | (Default) Copy contents
directories_only | Copy the directory structure only, do not copy non-directory files
create_symlinks | Make symbolic links instead of copies of files; the source path will be an absolute path unless the destination path is in the current directory
create_hard_links | Make hard links instead of copies of files
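The effect of the first option group can be observed with a sketch such as the following. The helpers write_text, read_text, and copy_with are hypothetical and exist only for this illustration; copy_with() forwards the given options to copy_file() and returns its result:

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Creates (or truncates) a file and writes a single line of text to it.
void write_text(fs::path const & p, std::string const & text)
{
  std::ofstream out(p, std::ios::trunc);
  out << text;
}

// Reads the first line of a file.
std::string read_text(fs::path const & p)
{
  std::ifstream in(p);
  std::string text;
  std::getline(in, text);
  return text;
}

// Copies a file with the given options using the non-throwing overload.
// Returns true only if the file was actually copied.
bool copy_with(fs::path const & from, fs::path const & to,
               fs::copy_options options)
{
  std::error_code err;
  return fs::copy_file(from, to, options, err);
}
```

When the target file already exists, copy_with() returns false for both copy_options::none (an error is reported) and copy_options::skip_existing (no error is reported), and the target keeps its content. With copy_options::overwrite_existing, it returns true and the target receives the content of the source.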
Another aspect that should be mentioned is related to symbolic links: create_directory_symlink() creates a symbolic link to a directory, whereas create_symlink() creates symbolic links to either files or directories. On POSIX systems, the two are identical when it comes to directories. On other systems (such as Windows), symbolic links to directories are created differently than symbolic links to files. Therefore, it is recommended that you use create_directory_symlink() for directories in order to write code that works correctly on all systems.
When you perform operations with files and directories, such as the ones described in this recipe, and you use the overloads that may throw exceptions, ensure that you wrap the calls in a try-catch block. Regardless of the type of overload used, you should check the success of the operation and take appropriate action in the case of a failure.
Operations such as copying, renaming, moving, or deleting files are directly provided by the filesystem library. However, when it comes to removing content from a file, you must perform explicit actions.
Regardless of whether you need to do this for text or binary files, you must implement the following pattern:
Create a temporary file.
Copy only the content that you want to keep from the original file to the temporary file.
Delete the original file.
Rename or move the temporary file to the name and location of the original file.
In this recipe, we will learn how to implement this pattern for a text file.
For the purpose of this recipe, we will consider removing empty lines, or lines that start with a semicolon (;), from a text file. For this example, we will have an initial file, called sample.dat, that contains the names of Shakespeare's plays but also empty lines and lines that start with a semicolon. The following is a partial listing of this file (from the beginning):
;Shakespeare's plays, listed by genre
;TRAGEDIES
Troilus and Cressida
Coriolanus
Titus Andronicus
Romeo and Juliet
Timon of Athens
Julius Caesar
The code samples listed in the next section use the following variables:
auto path = fs::current_path();
auto filepath = path / "sample.dat";
auto temppath = path / "sample.tmp";
auto err = std::error_code{};
We will learn how to put this pattern into code in the following section.
Perform the following operations to remove content from a file:
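Open the original file for reading:
std::ifstream in(filepath);
if (!in.is_open())
{
   std::cout << "File could not be opened!" << '\n';
   return;
}
Open a temporary file for writing; if the file already exists, truncate its content:
std::ofstream out(temppath, std::ios::trunc);
if (!out.is_open())
{
   std::cout << "Temporary file could not be created!"
             << '\n';
   return;
}
Read the input file line by line and copy the selected content to the output file:
auto line = std::string{};
while (std::getline(in, line))
{
   if (!line.empty() && line.at(0) != ';')
   {
      out << line << '\n';
   }
}
Close both the input and output files:
in.close();
out.close();
Delete the original file:
auto success = fs::remove(filepath, err);
if(!success || err)
{
   std::cout << err.message() << '\n';
   return;
}
Rename the temporary file to the name of the original file:
fs::rename(temppath, filepath, err);
if (err)
{
   std::cout << err.message() << '\n';
}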
The pattern described here is the same for binary files too; however, for our convenience, we are only discussing an example with text files. The temporary file in this example is in the same directory as the original file. Alternatively, this can be located in a separate directory, such as a user temporary directory. To get a path to a temporary directory, you can use std::filesystem::temp_directory_path()
. On Windows systems, this function returns the same directory as GetTempPath()
. On POSIX systems, it returns the path specified in one of the environment variables TMPDIR
, TMP
, TEMP
, or TEMPDIR
; or, if none of them are available, it returns the path /tmp
.
How content from the original file is copied to the temporary file varies from one case to another, depending on what needs to be copied. In the preceding example, we have copied entire lines, unless they are empty or start with a semicolon. For this purpose, we read the content of the original file, line by line, using std::getline()
until there are no more lines to read. After all the necessary content has been copied, the files should be closed, so they can be moved or deleted.
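Since the same pattern applies to binary files, here is a minimal sketch of what a binary variant might look like. The function remove_bytes and its filtering rule (dropping every byte equal to a given value) are hypothetical and chosen only to keep the example short.

```cpp
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: rewrites a binary file without the bytes equal to unwanted.
bool remove_bytes(fs::path const & filepath, char const unwanted)
{
   auto temppath = filepath;
   temppath += ".tmp"; // temporary file next to the original
   {
      std::ifstream in(filepath, std::ios::binary);
      if (!in.is_open()) return false;

      std::ofstream out(temppath, std::ios::binary | std::ios::trunc);
      if (!out.is_open()) return false;

      // copy every byte except the unwanted ones
      std::remove_copy(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>(),
                       std::ostreambuf_iterator<char>(out),
                       unwanted);
   } // both streams are closed when they go out of scope

   auto err = std::error_code{};
   fs::remove(filepath, err);
   if (err) return false;

   fs::rename(temppath, filepath, err);
   return !err;
}
```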
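To complete the operation, there are three options:
Delete the original file and rename the temporary file to the name of the original one, if they are in the same directory, or move the temporary file to the location of the original file, if they are in different directories. This is the approach taken in this recipe: we used the remove() function to delete the original file and rename() to rename the temporary file to the original filename.
Copy the content of the temporary file to the original file (you can use either the copy() or copy_file() functions for this) and then delete the temporary file (use remove() for this).
Rename the original file (for instance, by changing its extension or name) and then use the original filename to rename or move the temporary file.
If you take the first approach mentioned here, then you must make sure that the temporary file that later replaces the original file has the same file permissions as the original file; otherwise, depending on the context of your solution, it can lead to problems.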
The filesystem
library provides functions and types that enable developers to check whether a filesystem object, such as a file or directory, exists, and to query its properties, such as its type (regular file, directory, symbolic link, and more), last write time, permissions, and more. In this recipe, we will look at what these types and functions are and how they can be used.
For the following code samples, we will use the namespace alias fs
for the std::filesystem
namespace. The filesystem
library is available in the header with the same name, <filesystem>
. Also, we will use the variables shown here, path
for the path of a file and err
for receiving potential operating system error codes from the filesystem APIs:
auto path = fs::current_path() / "main.cpp";
auto err = std::error_code{};
Also, the to_time_t function shown here will be referred to in this recipe:
template <typename TP>
std::time_t to_time_t(TP tp)
{
using namespace std::chrono;
auto sctp = time_point_cast<system_clock::duration>(
tp - TP::clock::now() + system_clock::now());
return system_clock::to_time_t(sctp);
}
Before continuing with this recipe, you should read the Working with filesystem paths recipe.
Use the following library functions to retrieve information about filesystem objects:
To check whether a path refers to an existing filesystem object, use exists():
auto exists = fs::exists(path, err);
std::cout << "file exists: " << std::boolalpha
          << exists << '\n';
To check whether two different paths refer to the same filesystem object, use equivalent():
auto same = fs::equivalent(path,
   fs::current_path() / "." / "main.cpp");
std::cout << "equivalent: " << same << '\n';
To retrieve the size of a file in bytes, use file_size():
auto size = fs::file_size(path, err);
std::cout << "file size: " << size << '\n';
To retrieve the number of hard links to a filesystem object, use hard_link_count():
auto links = fs::hard_link_count(path, err);
if(links != static_cast<std::uintmax_t>(-1))
   std::cout << "hard links: " << links << '\n';
else
   std::cout << "hard links: error" << '\n';
To retrieve or set the last modification time of a filesystem object, use last_write_time():
auto lwt = fs::last_write_time(path, err);
auto time = to_time_t(lwt);
auto localtime = std::localtime(&time);
std::cout << "last write time: "
          << std::put_time(localtime, "%c") << '\n';
To retrieve file attributes, such as the type and permissions (as if returned by the POSIX stat function), use the status() function. This function follows symbolic links. To retrieve the file attributes of a symbolic link without following it, use symlink_status():
auto print_perm = [](fs::perms p)
{
   std::cout
      << ((p & fs::perms::owner_read) != fs::perms::none ?
         "r" : "-")
      << ((p & fs::perms::owner_write) != fs::perms::none ?
         "w" : "-")
      << ((p & fs::perms::owner_exec) != fs::perms::none ?
         "x" : "-")
      << ((p & fs::perms::group_read) != fs::perms::none ?
         "r" : "-")
      << ((p & fs::perms::group_write) != fs::perms::none ?
         "w" : "-")
      << ((p & fs::perms::group_exec) != fs::perms::none ?
         "x" : "-")
      << ((p & fs::perms::others_read) != fs::perms::none ?
         "r" : "-")
      << ((p & fs::perms::others_write) != fs::perms::none ?
         "w" : "-")
      << ((p & fs::perms::others_exec) != fs::perms::none ?
         "x" : "-")
      << '\n';
};
auto status = fs::status(path, err);
std::cout << "type: " << static_cast<int>(status.type()) << '\n';
std::cout << "permissions: ";
print_perm(status.permissions());
To check whether a path refers to a particular type of filesystem object, such as a file, directory, or symbolic link, use the functions is_regular_file(), is_directory(), is_symlink(), and so on:
std::cout << "regular file? " <<
   fs::is_regular_file(path, err) << '\n';
std::cout << "directory? " <<
   fs::is_directory(path, err) << '\n';
std::cout << "char file? " <<
   fs::is_character_file(path, err) << '\n';
std::cout << "symlink? " <<
   fs::is_symlink(path, err) << '\n';
These functions, used to retrieve information about the filesystem files and directories, are, in general, simple and straightforward. However, some considerations are necessary:
To check whether a filesystem object exists, you can use exists(), either by passing the path or an std::filesystem::file_status object that was previously retrieved using the status() function.
The equivalent() function determines whether two paths refer to the same filesystem object, that is, whether they have the same status, as retrieved by the function status(). If neither path exists, or if both exist but neither is a file, directory, or symbolic link, then the function reports an error. Hard links to the same file object are equivalent. A symbolic link and its target are also equivalent.
The file_size() function can only be used to determine the size of regular files and symbolic links that target a regular file. For any other types of file objects, such as directories, this function fails. This function returns the size of the file in bytes; if an error occurs, the non-throwing overload returns -1 converted to std::uintmax_t. If you want to determine whether a file is empty, you can use the is_empty() function. This works for both regular files and directories.
The last_write_time() function has two sets of overloads: one that is used to retrieve the last modification time of the filesystem object, and one that is used to set the last modification time. Time is indicated by a std::filesystem::file_time_type object, which is basically a type alias for std::chrono::time_point. The following example changes the last write time for a file to 30 minutes earlier than its previous value:
using namespace std::chrono_literals;
auto lwt = fs::last_write_time(path, err);
fs::last_write_time(path, lwt - 30min);
The status() function determines the type and permissions of a filesystem object. However, if the file is a symbolic link, the information returned is about the target of the symbolic link. To retrieve information about the symbolic link itself, the symlink_status() function must be used. Permissions are defined as a scoped enumeration, std::filesystem::perms. The permissions() function can be used to modify the permissions of a file or a directory. How the given permissions are applied is controlled by a separate scoped enumeration, std::filesystem::perm_options, with enumerators such as replace (the default), add, to indicate that permissions should be added, and remove, to indicate that permissions should be removed. The following example adds all permissions to the owner and user group of a file:
fs::permissions(
   path,
   fs::perms::owner_all | fs::perms::group_all,
   fs::perm_options::add,
   err);
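There is also a series of utility functions for checking the type of a filesystem object, such as is_regular_file(), is_symlink(), or is_directory(). The following examples that check whether a path refers to a regular file are equivalent:
// using the status of the filesystem object
auto s = fs::status(path, err);
auto isfile1 = s.type() == std::filesystem::file_type::regular;
// using the utility function
auto isfile2 = fs::is_regular_file(path, err);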
All of the functions discussed in this recipe have an overload that throws exceptions if an error occurs, and an overload that does not throw but returns an error code via a function parameter. Most of the examples in this recipe used the latter approach. More information about these sets of overloads can be found in the Creating, copying, and deleting files and directories recipe.
So far in this chapter, we have looked at many of the functionalities provided by the filesystem
library, such as working with paths, performing operations with files and directories (creating, moving, renaming, deleting, and so on), and querying or modifying properties. Another useful functionality when working with the filesystem is to iterate through the content of a directory. The filesystem
library provides two directory iterators, one called directory_iterator
, which iterates the content of a directory, and one called recursive_directory_iterator
, which recursively iterates the content of a directory and its subdirectories. In this recipe, we will learn how to use them.
For this recipe, we will consider a directory with the following structure:
test/
├──data/
│ ├──input.dat
│ └──output.dat
├──file_1.txt
├──file_2.txt
└──file_3.log
In this recipe, we will work with filesystem paths and check the properties of a filesystem object. Therefore, it is recommended that you first read the Working with filesystem paths and Checking the properties of an existing file or directory recipes.
Use the following patterns to enumerate the content of a directory:
To iterate only the content of a directory, without recursively visiting its subdirectories, use directory_iterator:
void visit_directory(fs::path const & dir)
{
   if (fs::exists(dir) && fs::is_directory(dir))
   {
      for (auto const & entry : fs::directory_iterator(dir))
      {
         auto filename = entry.path().filename();
         if (fs::is_directory(entry.status()))
            std::cout << "[+]" << filename << '\n';
         // status() follows links, so symlink_status() is needed here
         else if (fs::is_symlink(entry.symlink_status()))
            std::cout << "[>]" << filename << '\n';
         else if (fs::is_regular_file(entry.status()))
            std::cout << "   " << filename << '\n';
         else
            std::cout << "[?]" << filename << '\n';
      }
   }
}
To iterate all the content of a directory, including its subdirectories, use recursive_directory_iterator when the order of processing the entries does not matter:
void visit_directory_rec(fs::path const & dir)
{
   if (fs::exists(dir) && fs::is_directory(dir))
   {
      for (auto const & entry :
           fs::recursive_directory_iterator(dir))
      {
         auto filename = entry.path().filename();
         if (fs::is_directory(entry.status()))
            std::cout << "[+]" << filename << '\n';
         // status() follows links, so symlink_status() is needed here
         else if (fs::is_symlink(entry.symlink_status()))
            std::cout << "[>]" << filename << '\n';
         else if (fs::is_regular_file(entry.status()))
            std::cout << "   " << filename << '\n';
         else
            std::cout << "[?]" << filename << '\n';
      }
   }
}
To iterate all the content of a directory, including its subdirectories, in a structured manner, such as traversing a tree, use a function similar to the one in the first example, which uses directory_iterator to iterate the content of a directory. However, instead, call it recursively for each subdirectory:
void visit_directory(
   fs::path const & dir,
   bool const recursive = false,
   unsigned int const level = 0)
{
   if (fs::exists(dir) && fs::is_directory(dir))
   {
      auto lead = std::string(level*3, ' ');
      for (auto const & entry : fs::directory_iterator(dir))
      {
         auto filename = entry.path().filename();
         if (fs::is_directory(entry.status()))
         {
            std::cout << lead << "[+]" << filename << '\n';
            if(recursive)
               visit_directory(entry, recursive, level+1);
         }
         // status() follows links, so symlink_status() is needed here
         else if (fs::is_symlink(entry.symlink_status()))
            std::cout << lead << "[>]" << filename << '\n';
         else if (fs::is_regular_file(entry.status()))
            std::cout << lead << "   " << filename << '\n';
         else
            std::cout << lead << "[?]" << filename << '\n';
      }
   }
}
Both directory_iterator
and recursive_directory_iterator
are input iterators that iterate over the entries of a directory. The difference is that the first one does not visit the subdirectories recursively, while the second one, as its name implies, does. They both share a similar behavior:
The order of iteration is unspecified.
Each directory entry is visited only once.
The special paths dot (.) and dot-dot (..) are skipped.
A default-constructed iterator is the end iterator, and two end iterators are always equal.
When advanced past the last directory entry, the iterator becomes equal to the end iterator.
The standard does not specify what happens if a directory entry is added to, or deleted from, the iterated directory after the iterator has been created.
The standard defines the non-member functions begin() and end() for both directory_iterator and recursive_directory_iterator, which enables us to use these iterators in range-based for loops, as shown in the examples earlier.
Both iterators have overloaded constructors. Some overloads of the recursive_directory_iterator constructor take an argument of the std::filesystem::directory_options type, which specifies additional options for the iteration:
none: This is the default that does not specify anything.
follow_directory_symlink: This specifies that the iteration should follow symbolic links instead of serving the link itself.
skip_permission_denied: This specifies that you should ignore and skip the directories that could trigger an access denied error.
The elements that both directory iterators point to are of the directory_entry type. The path() member function returns the path of the filesystem object represented by this object. The status of the filesystem object can be retrieved with the member functions status() and, for symbolic links, symlink_status().
The preceding examples follow a common pattern:
Verify that the path to iterate actually exists.
Use a range-based for loop to iterate all the entries of a directory.
Use one of the two directory iterators available in the filesystem library, depending on the way the iteration is supposed to be done.
Process each entry according to the requirements.
In our examples, we simply printed the names of the directory entries to the console. It is important to note, as we specified earlier, that the content of the directory is iterated in an unspecified order. If you want to process the content in a structured manner, such as showing subdirectories and their entries indented (for this particular case) or in a tree (in other types of applications), then using recursive_directory_iterator is not appropriate. Instead, you should use directory_iterator in a function that is called recursively from the iteration, for each subdirectory, as shown in the last example from the previous section.
Considering the directory structure presented at the beginning of this recipe (relative to the current path), we get the following output when using the recursive iterator:
visit_directory_rec(fs::current_path() / "test");
[+]data
   input.dat
   output.dat
   file_1.txt
   file_2.txt
   file_3.log
On the other hand, when using the recursive function from the third example, as shown in the following listing, the output is displayed ordered on sublevels, as intended:
visit_directory(fs::current_path() / "test", true);
[+]data
      input.dat
      output.dat
   file_1.txt
   file_2.txt
   file_3.log
Remember that the visit_directory_rec()
function is a non-recursive function that uses the recursive_directory_iterator
iterator, while the visit_directory()
function is a recursive function that uses the directory_iterator
. This example should help you to understand the difference between the two iterators.
In the previous recipe, Checking the properties of an existing file or directory, we discussed, among other things, the file_size()
function that returns the size of a file in bytes. However, this function fails if the specified path is a directory. To determine the size of a directory, we need to iterate recursively through the content of a directory, retrieve the size of the regular files or symbolic links, and add them together. However, we must make sure that we check the value returned by file_size()
, that is, -1
cast to an std::uintmax_t
, in the case of an error. This value, indicating a failure, should not be added to the total size of a directory.
Consider the following function to exemplify this case:
std::uintmax_t dir_size(fs::path const & path)
{
   // -1, as an uintmax_t, indicates that path is not a directory
   if (!fs::exists(path) || !fs::is_directory(path))
      return static_cast<std::uintmax_t>(-1);

   std::uintmax_t size = 0;
   for (auto const & entry : fs::recursive_directory_iterator(path))
   {
      if (fs::is_regular_file(entry.status()) ||
          fs::is_symlink(entry.symlink_status()))
      {
         auto err = std::error_code{};
         auto filesize = fs::file_size(entry, err);
         // skip the entries whose size could not be determined
         if (!err && filesize != static_cast<std::uintmax_t>(-1))
            size += filesize;
      }
   }
   return size;
}
The preceding dir_size()
function returns the size of all the files in a directory (recursively), or -1
, as an uintmax_t
, in the case of an error.
In the previous recipe, we learned how we can use directory_iterator
and recursive_directory_iterator
to enumerate the content of a directory. Displaying the content of a directory, as we did in the previous recipe, is only one of the scenarios in which this is needed. The other major scenario is when searching for particular entries in a directory, such as files with a particular name, extension, and so on. In this recipe, we will demonstrate how we can use the directory iterators and the iterating patterns shown earlier to find files that match a given criterion.
You should read the previous recipe, Enumerating the content of a directory, for details about directory iterators. In this recipe, we will also use the same test directory structure that was presented in the previous recipe.
To find files that match particular criteria, use the following pattern:
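Use recursive_directory_iterator to iterate through all the entries of a directory and recursively through its subdirectories.
Consider regular files (and any other types of files you may need to process).
Use a function object (such as a lambda expression) to filter only the files that match your criteria.
Add the selected entries to a container (such as a vector).
This pattern is exemplified in the find_files() function shown here: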
std::vector<fs::path> find_files(
fs::path const & dir,
std::function<bool(fs::path const&)> filter)
{
auto result = std::vector<fs::path>{};
if (fs::exists(dir))
{
for (auto const & entry :
fs::recursive_directory_iterator(
dir,
fs::directory_options::follow_directory_symlink))
{
if (fs::is_regular_file(entry) &&
filter(entry))
{
result.push_back(entry);
}
}
}
return result;
}
When we want to find files in a directory, the structure of the directory and the order in which its entries, including subdirectories, are visited are probably not important. Therefore, we can use the recursive_directory_iterator
to iterate through the entries.
The function find_files()
takes two arguments: a path and a function wrapper that is used to select the entries that should be returned. The return type is a vector of filesystem::path; alternatively, it could also be a vector of filesystem::directory_entry. By default, a recursive directory iterator does not follow symbolic links, returning the link itself and not the target. The iterator in this example changes this behavior by using a constructor overload that has an argument of the type filesystem::directory_options and passing follow_directory_symlink.
In the preceding example, we only consider the regular files and ignore the other types of filesystem objects. The predicate is applied to the directory entry, and, if it returns true
, the entry is added to the result.
The following example uses the find_files()
function to find all of the files in the test directory that start with the prefix file_
:
auto results = find_files(
   fs::current_path() / "test",
   [](fs::path const & p) {
      // check only the filename, so that directory names are ignored
      auto filename = p.filename().wstring();
      return filename.find(L"file_") == 0;
   });
for (auto const & path : results)
{
   std::cout << path << '\n';
}
The output of executing this program, with paths relative to the current path, is as follows:
test\file_1.txt
test\file_2.txt
test\file_3.log
A second example shows how to find files that have a particular extension, in this case, the extension .dat
:
auto results = find_files(
fs::current_path() / "test",
[](fs::path const & p) {
return p.extension() == L".dat";});
for (auto const & path : results)
{
std::cout << path << '\n';
}
The output, again relative to the current path, is shown here:
test\data\input.dat
test\data\output.dat
These two examples are very similar. The only thing that is different is the code in the lambda function, which checks the path received as an argument.