Chapter 5.3. I.30: Encapsulate rule violations

Hiding the unsightly things in life

I have a room in my house that nobody may enter. It is too small to be a bedroom and too large to be a closet, but not quite large enough to be a useful office. I have no idea what the architect was thinking, but I have found a use for this room.

I like things to be neat and tidy. I like a place for everything and everything in its place. It means I can find things when I need them, rather than spending time trying to search for things that have not been correctly put away.

Unfortunately, the universe does not always support me in my aims. I carefully categorize things so that similar items are stored together, but I may have a surplus of, for example, staplers or hole punchers or staple removers because I am old and still live in a world of pens and staples and A4 ring binders.

The surplus goes into The Room.

The trouble with this approach is that there is no sensible way of storing surplus staplers. You simply put them all together in one place and hope for the best. I am not going to go into detail about why I might have surplus staplers and why I retain rather than discard them. It suffices to say that I have a small room, hidden from outside observers, filled with clutter. Some might say “tightly packed.”

The point is that sometimes, despite your best intentions, things do not go according to plan.

Engineering is practiced in real life. It is messy, especially when it encounters living things with big bundles of conflicting desires and choices. The finest guidelines and the best of intentions can crumble about you when faced with an unavoidable intrusion from a lower level of abstraction. This guideline exhorts you to minimize the visibility of such horrors.

Recall the code example in Chapter 2.1 where the problem of parsing an options file demonstrated how things could get out of hand through scope creep. The solution was careful separation of concerns and abstraction behind well-designed interfaces. As an afterthought, the solution was able to parse from any source that could be presented as a stream. However, what if that facility had been an upfront requirement?

The example presented with this item in the Core Guidelines is similar in nature. Here, the program is told where to get its input from and must capture an appropriate stream. Typically, the program would be run from the command line and take up to two command-line parameters. There are three possible input sources: standard input, the command line, and an external file. Therefore, the first parameter would be an identifier for the input source, perhaps 0, 1, or 2, while the second parameter would be a command or a filename. This would lead to command lines like these:

sample.exe 0             Read commands from standard input
sample.exe 1 help        Run the help command
sample.exe 2 setup.txt   Run the commands in setup.txt

We reproduce the example code from the guideline, slightly modified, here:

enum class input_source { std_in, command_line, file };
bool owned;
std::istream* input;
switch (source) {
case input_source::std_in:
  owned = false;
  input = &std::cin;
  break;
case input_source::command_line:
  owned = true;
  input = new std::istringstream{argv[2]};
  break;
case input_source::file:
  owned = true;
  input = new std::ifstream{argv[2]};
  break;
}
std::istream& in = *input;

From the top, we have an enumeration called input_source describing our three options, a bool called owned, and a std::istream pointer called input. Assume that source is initialized from the first command-line parameter: the case labels compare it with the input_source enumerators. The pointer is set to point to either an existing std::istream object, std::cin, or a new std::istringstream or std::ifstream object. This presents us with a problem: sometimes we are going to have to destroy our stream, and sometimes we are not. We cannot statically identify which case applies, so a flag is needed to signal dynamically what should happen. If the flag is set to true, then the stream must be destroyed.

This code does not respect the Core Guidelines. For example, ES.20: “Always initialize an object” is quite unambiguous, precisely because violating it so often leads to undesirable results. Here, owned and input are not initialized. Their values are not ready yet, and the author is reluctant to assign a value to them only to immediately change them before they are used.

How do we encapsulate this rule violation?

Keeping up appearances

We have already looked at the Immediately Invoked Lambda Expression (IILE) pattern, so let’s try that:

auto input = [&]() -> std::pair<bool, std::istream*> {
  auto source = input_source_from_ptr(argv[1]);
  switch (source) {
  case input_source::std_in:
    return {false, &std::cin};
  case input_source::command_line:
    return {true, new std::istringstream{argv[2]}};
  case input_source::file:
    return {true, new std::ifstream{argv[2]}};
  }
}();

This is an improvement, provided you are comfortable with the idiom.
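The lambda relies on input_source_from_ptr, which the text does not define. What follows is a minimal sketch of such a helper, assuming the first parameter is the text 0, 1, or 2 as in the sample command lines, and assuming (purely for illustration) that anything else is rejected with an exception:

```cpp
#include <stdexcept>
#include <string>

enum class input_source { std_in, command_line, file };

// Hypothetical helper: the text uses input_source_from_ptr without defining it.
// Assumes the argument is the text "0", "1", or "2", matching the sample
// command lines; anything else (including a null pointer) is rejected.
input_source input_source_from_ptr(const char* arg) {
  if (arg == nullptr) {
    throw std::invalid_argument{"missing input source"};
  }
  const std::string text{arg};
  if (text == "0") return input_source::std_in;
  if (text == "1") return input_source::command_line;
  if (text == "2") return input_source::file;
  throw std::invalid_argument{"unknown input source: " + text};
}
```

Because this version only ever returns one of the three enumerators, the switch in the lambda always reaches one of its return statements.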

However, there is another guideline being ignored here, and that is I.11: “Never transfer ownership by a raw pointer (T*) or reference (T&).” The problem is quite clear: the author must remember whether the function owns the stream. They should certainly not try to destroy the standard input. The raw pointer doesn’t provide enough information about ownership, so the owned flag has to provide additional context. Then the author must ensure that they write

if (input.first) delete input.second;

and that it is executed in every code path. This sounds like a job for Resource Acquisition Is Initialization (RAII),[1] which we can model using a class.

[1] RAII is discussed in Chapter 5.6.

We’ll call the class command_stream. It needs to optionally own a std::istream, so we can start with this:

class command_stream {
private:
  bool owned;         // Possibly owns the std::istream
  std::istream* inp;  // Here is the std::istream
};

There’s not much in there. The destructor is simple:

class command_stream {
public:
  ~command_stream() {
    if (owned) delete inp;
  }
private:
  bool owned;         // Possibly owns the std::istream
  std::istream* inp;  // Here is the std::istream
};

The constructor should take a parameter to indicate which input stream it should be forwarding to, and an optional filename or command. Fortunately, we already have an enumeration we can use, so our class now looks like this:

class command_stream {
public:
  command_stream(input_source source, std::string token) {
    switch (source) {
    case input_source::std_in:
      owned = false;
      inp = &std::cin;
      return;
    case input_source::command_line:
      owned = true;
      inp = new std::istringstream{ token };
      return;
    case input_source::file:
      owned = true;
      inp = new std::ifstream{ token };
      return;
    }
  }
  ~command_stream() {
    if (owned) delete inp;
  }
private:
  bool owned;         // Possibly owns the std::istream
  std::istream* inp;  // Here is the std::istream
};

However, we seem to have coupled the enumeration to the command_stream class. Do we really need to do that? It’s always worth spending a little time decoupling early.

Of course, we don’t need to import the input_source enumeration into the class. We can simply create three constructors instead. The simplest case is that the std::istream is std::cin, in which case nothing needs to be created and there are no ownership issues. We can make that the default constructor:

class command_stream {
public:
  command_stream()
    : owned(false)
    , inp(&std::cin) {}
  ~command_stream() {
    if (owned) delete inp;
  }
private:
  bool owned;         // Possibly owns the std::istream
  std::istream* inp;  // Here is the std::istream
};

In fact, we can do better than that and use default member initialization:

class command_stream {
public:
  command_stream() = default;
  ~command_stream() {
    if (owned) delete inp;
  }
private:
  bool owned = false;             // Possibly owns the std::istream
  std::istream* inp = &std::cin;  // Here is the std::istream
};

The other two construction methods both need to take a std::string and nothing else, so we need to differentiate between them. There are several ways to do this, but we’re going to use a tag.

A tag is a struct with no members. It’s a way of enabling the overload of a function, since overloading takes place by parameter type. Let’s define a tag called from_command_line to differentiate the two remaining constructors:

class command_stream {
public:
  struct from_command_line {};
  command_stream() = default;
  command_stream(std::string filename)
    : owned(true)
    , inp(new std::ifstream(filename))
  {}
  command_stream(std::string command_list, from_command_line)
    : owned(true)
    , inp(new std::istringstream(command_list))
  {}
  ~command_stream() {
    if (owned) delete inp;
  }
private:
  bool owned = false;             // Possibly owns the std::istream
  std::istream* inp = &std::cin;  // Here is the std::istream
};
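To see overload selection by tag in isolation, here is a small sketch built around a hypothetical class, text_source, which is not part of the chapter's example. Both constructors take a std::string; the empty from_literal tag is the only thing that tells them apart.

```cpp
#include <string>
#include <utility>

// Hypothetical class for illustration only; not part of command_stream.
class text_source {
public:
  struct from_literal {};  // Tag: no members, exists only to select an overload

  // Without a tag, the string is treated as a filename.
  text_source(std::string filename)
    : kind_("file")
    , text_(std::move(filename))
  {}

  // With the tag, the string is treated as the text itself.
  text_source(std::string literal, from_literal)
    : kind_("literal")
    , text_(std::move(literal))
  {}

  const std::string& kind() const { return kind_; }
  const std::string& text() const { return text_; }

private:
  std::string kind_;  // Records which constructor was chosen
  std::string text_;  // The string that was passed in
};
```

The tag object is never inspected at run time. It only steers overload resolution, so a call such as text_source{"help", text_source::from_literal{}} selects the second constructor.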

Finally, we need it to behave like a std::istream, which means supplying a conversion operator:

class command_stream {
public:
  struct from_command_line {};
  command_stream() = default;
  command_stream(std::string filename)
    : owned(true)
    , inp(new std::ifstream(filename))
  {}
  command_stream(std::string command_list, from_command_line)
    : owned(true)
    , inp(new std::istringstream(command_list))
  {}
  ~command_stream() {
    if (owned) delete inp;
  }
  operator std::istream&() { return *inp; }
private:
  bool owned = false;             // Possibly owns the std::istream
  std::istream* inp = &std::cin;  // Here is the std::istream
};

There we have it. Everything is tidily hidden away behind the public interface. The default values of owned and inp are false and the address of std::cin. These remain unchanged when an instance is initialized with the default constructor. Other constructors are used for initializing from a string stream or a file stream. We can now rewrite our code fragment thus:

auto input = [&]() -> command_stream {
  auto source = input_source_from_ptr(argv[1]);
  switch (source) {
  case input_source::std_in:
    return {};
  case input_source::command_line:
    return {{argv[2]}, command_stream::from_command_line{}};
  case input_source::file:
    return {argv[2]};
  }
}();

This is rather clearer. The conversion operator allows us to treat this object as a std::istream&: we can bind it to a reference of that type, or pass it to any function that takes one, and then use the overloaded extraction operator (operator >>) for comfortable and familiar syntax. When the name goes out of scope, the command_stream object will be destroyed, and the ownership flag will ensure appropriate destruction of the std::istream object. This is a somewhat complex piece of implementation, but it is abstracted away behind a simple interface.
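Here is a sketch of how the finished class might be consumed. It repeats command_stream so that it compiles on its own, and it makes two assumptions the text does not: copying is disabled (an addition here, since a copied owner would delete the stream twice), and read_commands is a hypothetical helper. The stream operators and std::getline are written for std::istream, so the sketch reaches the stream through a std::istream& parameter.

```cpp
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

class command_stream {
public:
  struct from_command_line {};
  command_stream() = default;
  command_stream(std::string filename)
    : owned(true)
    , inp(new std::ifstream(filename))
  {}
  command_stream(std::string command_list, from_command_line)
    : owned(true)
    , inp(new std::istringstream(command_list))
  {}
  // Addition for this sketch: forbid copying so that two objects can never
  // both believe they own, and later delete, the same stream.
  command_stream(const command_stream&) = delete;
  command_stream& operator=(const command_stream&) = delete;
  ~command_stream() {
    if (owned) delete inp;
  }
  operator std::istream&() { return *inp; }
private:
  bool owned = false;             // Possibly owns the std::istream
  std::istream* inp = &std::cin;  // Here is the std::istream
};

// Hypothetical helper: collects one command per line from any std::istream.
std::vector<std::string> read_commands(std::istream& in) {
  std::vector<std::string> commands;
  std::string line;
  while (std::getline(in, line)) {
    commands.push_back(line);
  }
  return commands;
}
```

Given command_stream input{"help\nquit", command_stream::from_command_line{}}, a call such as read_commands(input) works because the conversion operator is applied when the argument is bound to the std::istream& parameter. With copying deleted, returning a command_stream from the lambda still compiles under C++17 or later, where the returned object is constructed in place.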

Summary

We hope you will have noticed that abstraction is a running theme throughout this book: abstraction localizes and minimizes complexity. In Chapter 2.1 we looked at how abstraction is more than mere encapsulation and data hiding: it allows us to create levels of complexity and expose only the bare minimum required to interact with an object.

In Chapter 2.2 we looked at how arguments increase in number, and observed that this could be down to a missing abstraction. The complexity of comprehending many parameters is hidden behind the abstraction that describes the role those parameters play.

In Chapter 4.1 we carried out a similar transformation and collected together a messy bunch of data into a single abstraction, returning that rather than taking multiple out parameters.

The Core Guidelines are explicitly motivated by abstraction in some cases. In C.8: “Use class rather than struct if any member is non-public,” the reason for so doing is to make it clear that something is being abstracted. Since class members are private by default, a class signals to you that there are things that you do not need to see.

In Core Guideline ES.1: “Prefer the standard library to other libraries and to ‘handcrafted code,’” it is observed that library code tends to be of a higher level of abstraction. As I remarked earlier, I have participated in code reviews where the engineer simply rewrote std::rotate or std::mismatch without realizing it, and did a worse job.

In Core Guideline ES.2: “Prefer suitable abstractions to direct use of language features,” the use of abstraction is explicit in the title. This guideline demonstrates that application concepts are further from the bare language than they are from appropriate combinations of library features.

Even the term “abstract base class” describes an entity that hides complexity by only exposing a public API, insulating the client from the implementation details. Abstraction is fundamental to developing software in C++.

When a guideline violation seems unavoidable, facilities exist to minimize and hide the violation. One of the driving principles of C++ is zero-cost abstraction, so make use of that to maximize the clarity and comprehensibility of your code.
