CHAPTER 1

The Art of Building Modern Software

The history of software development is short. It has been less than one hundred years since people wrote and managed to execute the first computer program. Although brief, this history is reminiscent of the history of any other intellectual invention. The most interesting parallel I’ve heard so far compares the history of computer science with the way people have tried to understand the real world. The comparison also explains why a good application programming interface (API) is needed. Let’s walk down that road now.

Rationalism, Empiricism, and Cluelessness

The initial impulse for this kind of understanding came from Galileo Galilei’s law of falling objects, which postulates that two objects, regardless of their weight, always fall with the same acceleration. This goes completely against natural expectations: anyone who has dropped a brick and a piece of paper knows that they are unlikely to reach the ground at the same time. The brilliance of Galileo and other modern scientists was that they attributed this difference to the interplay of various natural laws, of which the law of free fall was just one. How did Galileo discover his law? He did a mental experiment. He imagined two solid balls of the same size and weight being dropped. Indeed, they would reach the ground at the same time. Then he imagined the same experiment, but with one solid ball and one ball cut down the middle into two parts, with both parts closely attached to each other. The result of this experiment is exactly the same as the first one: both objects fall synchronously. Now what happens if we slowly start to separate the two pieces of the ball? We can even keep them connected with a thin wire so that they still form one body. Regardless of whether the two halves of the ball are a centimeter, a meter, or even further apart, this object would continue to fall with the same speed as the full solid ball. And last but not least, the same result would be obtained even if we removed the wire! This result is completely against natural experience.

The renaissance of modern science seemed to create two major, yet extreme, philosophical approaches. Rationalism treated reason as the primary source of information and postulated that it is possible to understand and describe the real world using pure thought alone. The philosophers supporting this idea include the progenitors of modern science René Descartes (1596–1650) and Gottfried Wilhelm Leibniz (1646–1716), as well as Benedict Spinoza (1632–1677), the creator of pantheism. Experience says that a piece of paper falls more slowly than a stone. Galileo’s purely mental experiment, by contrast, explains that weight does not affect the acceleration of falling objects.

UNCONSCIOUS MATH AND PHYSICS

You’ll find, while reading the book, that I am obsessed with physics parables. Yes, I am, because after reading Petr Vopěnka’s book² about the importance of unconsciousness in the success of modern mathematics and physics, I just cannot get his philosophical explanations out of my mind. From time to time I reuse some of his observations, though only in a very condensed form, as his book has more than 800 pages and carefully builds a proper understanding of all the terms. That is not the case for this book. A deep explanation of all his concepts is beyond its capacity and purpose. As such, please excuse occasional simplifications.

People say that Galileo discovered his famous law by throwing stones from the leaning tower of Pisa. Maybe he really tried that, but it was the mental experiment that could explain the behavior beyond doubt. It was the first time that plain thought alone could prove observation and experience wrong. Although in reality lighter objects fall more slowly than heavier ones, which is what we know from our experience, we now know that other laws interfering with gravity cause the differences in falling speed. This was the experiment that showed the power of pure reason and that gave Leibniz and Descartes their impetus to favor reason over experience. It was the catalyst for the whole philosophical movement of rationalism. This approach believes that the subject of research is and has to be reasonable: if it is discoverable by reason, it needs to have a reasonable origin.

On the other side of the English Channel there was empiricism. At nearly the same time, great British minds such as David Hume (1711–1776), John Locke (1632–1704), and George Berkeley (1685–1753) insisted that the primary source of understanding is experience. Without seeing, hearing, or feeling the world, the mind has no chance to “think it up.” To understand means to experience or, in more scientific terms, to do experiments. Even here we can trace the roots back to Galileo, the first scientist who promoted scientific experiments as a means of verifying that an idea or a hypothesis is valid. From the empiricist point of view, the world does not need to be reasonable. It might not be fully known, it might not even exist at all, and in fact it doesn’t matter. It isn’t necessary to understand it all if the perception of the senses makes sense.

From today’s point of view those two extreme ways of perceiving the world are not in fact that far apart. Current science, at least, understands the value of an experiment for verifying its theories, and Descartes understood the need for experiment in science as well. So for us it shouldn’t be a big problem to merge the two opposing views into one. And yes, these days it’s quite easy to do so. For most of our lives we don’t care much about the philosophical aspects of our surroundings; we care more about the results. Life is supposed to be entertaining, not boring, and reasonable. However, the things we use daily “just” need to work, and we usually don’t care how they work. For example, we’re completely clueless about cars and mobile phones. We feel it’s reasonable to use them; we just have no clue how they do what they do. We live in total cluelessness.

RATIONAL APPROACH TO CLUELESSNESS

Writing APIs and books for an international audience is hard. Personal preferences and cultural differences influence the way we approach the problems we face. Rationalists prefer to talk about theory (the internal connections behind real objects), and only later do they create real examples mapping the theory to the real world. Empiricists, on the other hand, would like to gain as much practical experience as possible, and only later, if ever, make judgments about the relations between objects of the world.

This book explains API design from the viewpoint of selective cluelessness. It sees APIs as a perfect tool to help us maximize cluelessness while still getting reliable results. It is essential to get a correct feeling for what cluelessness really is. However, we’ll build our understanding of that term from a rationalist’s point of view: we start with theory and not examples. This might not be the preferred approach for everyone; however, I cannot satisfy both camps at once. Anyway, do not despair. As soon as the theory is over, and we have a common vocabulary for the science of API design, there will be more than enough practical applications.

Cluelessness is a way of life for a majority of us. It is the result of the merging of rationalism and empiricism that applies these days. It is everywhere around us. It is present even in the way we program and do software engineering.

Evolution of Software So Far

In the 1940s and early ’50s, programming was hard. People had to learn machine code to speak the computer’s language, know the sizes and number of registers, and in the worst cases even pick up a screwdriver and connect the wires that physically carried the signal between individual computing units. The ratio between the work needed to think up an algorithm and the drudgery of turning it into an executable program was harshly tilted toward the boring, mechanical jobs.

FORTRAN was like a heaven-sent simplification. Just like an empiricist, it allowed programmers to perceive the world of computing mathematical formulas with just limited senses. Programmers no longer needed to understand assembly language or worry about the technical internals of computers. They could completely forget about these details and concentrate much more on the important thing: converting a mathematical formula into the algorithmic steps that compute it. FORTRAN simplified the software development process while only minimally limiting the things people could compute: a huge win for empiricism.

However, programming still was not an easy discipline and the appetite for simplicity continued to grow. New languages such as COBOL came up with visions such as “approachable by novice programmers” and “language readable by management,” and simplified certain tasks associated with programming even more. Today nobody is seriously considering writing a new system in COBOL. However, in those days COBOL significantly reduced the amount of knowledge needed to access and manipulate a database compared to what was needed with plain assembler or even FORTRAN. Empiricism was on the rise.

However, not everyone liked this. There are and always have been people who believe in reason and who think that the world and the things in it should be reasonable. Those people can be found all around us, even among programmers. In the ’50s, rationalists such as John McCarthy invented the LISP language based on the mathematical model of lambda calculus, which gave it a strong theoretical foundation. As mathematics is almost always a rationalistic discipline, this foundation rested on pure reasoning. It’s rumored that during the design of LISP there was a period when “mathematical neatness became more important than anything else.” What a sign of rationalism! The language need not be useful, it might not even be implementable, but it has to be pure and clean and reasonable.

Some say there have been two schools of computer science: European and American. Although the Americans are usually more pragmatic (of course, as pragmatism was invented there), the Europeans often search for the great vision. You can observe this in computer engineering design as well. There are various examples where great European minds prefer rationalism over functionality. Edsger W. Dijkstra, the inventor of message-based computing and the semaphore synchronization pattern, wrote in Selected Writings on Computing: A Personal Perspective that “programming emerged as a tough engineering discipline with a strong mathematical flavor.”³ If that were so, we would all follow the way of rationalism. However, when I look around and see how people program all those accounting programs, hospital patient databases, and so on, I have a feeling that there is just as little math in that as in cooking. Indeed, good algorithms may require a mathematical background. However, as another influential European mind, Niklaus Wirth (the inventor of Pascal, Oberon, and other systems), noted: “Simple, elegant solutions are more effective, but they are harder to find and require more time.” Of course, he was right. In an era when time to market is one of the most valued measures of success, there is no time to invest in what could be a never-ending search for pure and clean solutions.

It looks like it’s time to confirm that the rationalist approach has no space in today’s software engineering world, especially because we are running out of programmers who worship at the church of rationalism. Or, as Figure 1-1 illustrates, we are running out of programmers almost completely. This is not new, as another quote from Dijkstra illustrates: “Good programming is probably beyond the intellectual capabilities of today’s average programmer.” True. However, the hunger for new programs is increasing. What can be done about that?

Figure 1-1. Will code HTML for food

Following the analogy of understanding our world, where rationalism and empiricism are of almost no importance to regular human beings, the obvious advice is to get clueless. The situation with programming is similar. The world, or at least our society, doesn’t need every human to be a philosopher to work. It’s organized in a way that leaves room for the less educated (that is, the more clueless of us), and still things seem to work. Similarly, software engineering doesn’t need every programmer to be a highly educated scientist. If we want to deliver as much software as we do now or even more, we need a system where programmers can be clueless and still produce reliable systems.

Indeed, the aforementioned cluelessness is not complete unawareness of programming. Obviously, just randomly typing characters on the keyboard is unlikely to produce a compilable program. Knowing how to code is indeed a prerequisite for a programmer (just like the ability to watch, absorb, and discuss TV ads is a necessary skill among certain human societies). The point of software engineering cluelessness is to enable the programmers to know less and still achieve good results. It cannot be generally said which bits of knowledge are necessary and which are not. However, the goal is to find such coding practices that allow developers to know less than everything—that is, to select the pieces of knowledge they need. I’ll call this selective cluelessness.

Gigantic Building Blocks

An average system built in the first decade of the 21st century is like a massive pile of dirt with no—or just a little—elegance behind it. The primary motivation is always to get things done with as little effort as possible. As such, engineering teams tend to reuse existing software frameworks even when they are more heavyweight than needed.

PUTTING A WEB PAGE ON THE WEB

Recently I needed to put a dynamic web page on my server. I had two choices: either open a socket on some port, read streams from incoming connections, and write something in a reply; or assemble the system from existing technologies. I tried both.

The “write from scratch” approach worked fine. I read the specification for HTTP, parsed the incoming header, and wrote out the output. That was a relatively small piece of code and worked well after a little bit of debugging. However, then I needed additional features: the ability to secure the page, handle POST requests, and so on. I could have read the RFC documents and implemented these as well. However, this turned out to be more work than I wanted to spend on it.

That is why I tried the “assembling” approach. I took the Tomcat web server, wrote one servlet, tuned a few properties in the config files, and voilà! Everything worked—with just one drawback. Instead of 50KB of code, I suddenly had over 1MB!
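To give a feeling for how small the “write from scratch” version can start out, here is a minimal sketch in Java. It is not the code described above: the class name TinyHttpServer, the methods respond and serve, and port 8080 are all assumptions made for illustration. It answers only GET requests and ignores the headers, which is roughly the point where the extra work for POST and security would begin.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a hand-written HTTP responder; all names are illustrative.
class TinyHttpServer {

    // Builds a complete HTTP/1.1 response for a request line such as "GET / HTTP/1.1".
    static String respond(String requestLine) {
        String[] parts = requestLine == null ? new String[0] : requestLine.trim().split(" ");
        if (parts.length != 3) {
            return response("400 Bad Request", "Malformed request line");
        }
        if (!parts[0].equals("GET")) {
            // POST, authentication, and so on are the features left unimplemented here.
            return response("501 Not Implemented", "Only GET is supported");
        }
        return response("200 OK", "Hello, world");
    }

    // Wraps a message in a tiny HTML page and prepends the status line and headers.
    private static String response(String status, String message) {
        String body = "<html><body>" + message + "</body></html>\n";
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "HTTP/1.1 " + status + "\r\n"
                + "Content-Type: text/html; charset=utf-8\r\n"
                + "Content-Length: " + length + "\r\n"
                + "Connection: close\r\n"
                + "\r\n"
                + body;
    }

    // Accepts connections one at a time and answers each with respond().
    static void serve(int port) throws IOException {
        try (ServerSocket server = new ServerSocket(port)) {
            while (true) {
                try (Socket client = server.accept()) {
                    BufferedReader in = new BufferedReader(
                            new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8));
                    String requestLine = in.readLine(); // headers are ignored in this sketch
                    OutputStream out = client.getOutputStream();
                    out.write(respond(requestLine).getBytes(StandardCharsets.UTF_8));
                    out.flush();
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            serve(8080); // assumed port; runs until the process is stopped
        } else {
            System.out.print(respond("GET / HTTP/1.1")); // print one sample response
        }
    }
}
```

Run without arguments, the class just prints one sample response; run with any argument, it starts the assumed server loop on port 8080. Everything a real web server adds on top of this is what the “assembling” approach buys in exchange for its size.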

Systems these days are composed of huge building blocks. Nobody writes the whole stack of technologies alone. Instead, you are more likely to install a reliable and cheap operating system. Then on top of that, place a commonly used web server and add a database server. With this multimegabyte installation, you’re ready to solve the basic problems—such as generating an HTML page. At that point this is an easy task. However, nobody can claim that the whole system is simple. In fact it’s quite complex, and no single person on the planet could understand it all. This is a perfect example of cluelessness: the programmer who creates the HTML page gets the job done while having just a minimal understanding of the whole system.

This whole approach brings to mind the image of coding with a bulldozer: if a database is needed, let’s use raw power and put a database server into the system. Or if we need a more reliable runtime, let’s install Java and an application server. Regardless of how heavyweight these frameworks are, you can always find a big enough bulldozer to move them on top of the pile of dirt (that is, the application). If the application starts to consume too much memory, from the rationalist point of view you might consider optimizing it. However, when you have a bulldozer, all the problems look the same and you just buy another few gigabytes of memory. If the system continues to be sluggish, you might invest in clustering or some form of virtualization, and the pile of bits of the application just grows and grows and almost never gets smaller.

The question is whether using the bulldozer coding approach is that bad. In fact, it’s likely more productive than trying to search for Wirth’s “simple, elegant solutions” that are more effective, but too time-consuming to find. If you look around the Web, you can see pretty good applications and servers (likely) written using some sort of bulldozer-like approach. Amazon, Yahoo!, and so on are gigantic sites, which work well and without major problems. That seems to indicate that the bulldozer style is a viable way of designing and coding modern software systems. The incredible computing power we all have tilts the balance heavily toward brute-force computing. It really works!

Beauty, Truth, and Elegance

I’m pretty sure many of you found the previous glorification of cluelessness irritating. How can a pile of dirt massaged with a heavy crawler tractor replace elegance? How can such applications be correct when they’re so ugly? This cannot work! Well, it can; we just need to look more closely at the preoccupation many of us have.

The roots of our sciences are still beneath us and they influence the way we think all the time. They were laid down by the ancient Greeks many centuries ago and still shape the way we treat the relationship between truth and beauty. From the Greek philosophers’ point of view, the most valuable scientific knowledge was clear (its meaning not hidden by any associated misinterpretations) and sharp; that is, not fuzzy. Unsurprisingly, the most highly regarded science was geometry. The main reason was that it wasn’t a science about the real world, but rather about the geometric one: the world where a line between two points is straight, where a sphere is absolutely round, and so on. This level of perfection was unmatchable by any other science, especially those concerned with the study of the real world. A spherical stone might look like a geometric sphere when seen from a distance. However, the closer you get, the more obvious it is that its spherical shape was just an illusion of perception. So the science of rocks is less sharp and cannot be as clear as pure geometry. It needs to take into account mistakes made in the formation of its objects.

A significant difference between the Greeks’ geometric world and the real world was its stability. Although the objects in the real world are ever changing—for example, today’s stones can be smashed to pieces or carved out into sculptures tomorrow—objects of the geometric world are absolutely stable. For example, a circle around a square is always going to be the same, and a right angle will always be 90 degrees. As a result, thoughts and reasoning about geometry—about its objects and their relations—will have eternal validity. A truth in geometry, unlike a truth about the real world, will last forever. That is why the Greeks saw geometry as a science about absolute truths.

Objects in the geometric world vary by their complexity. For example, to define a circle you need its center and radius. To define an ellipse, you need two radii instead of one. That means the circle is easier to define than the ellipse. Similarly, defining a square is simpler than defining a rectangle. Given geometry’s preference for clarity, it follows that the circle and the square are more “pure” or more beautiful than the ellipse and rectangle. As soon as Greek philosophers realized this, the geometric world became not just a place of truth, but also a place of beauty. Since then truth and beauty have been seen as mates, not only in geometry, but in art and elsewhere. Greek sculptures are so geometric, created with so much attention paid to various ratios between their parts (for example, a head is supposed to be 1/8 of the whole body, there is the golden ratio, and so on) that they engaged the geometric truth, beauty, and elegance.

The classic Greek style and Greek ideas of proportion, beauty, and harmony became ubiquitous in both art and science during the Renaissance. Indeed, the arts of that time, as the name suggests, built upon the Greek aesthetic heritage. However, the influence extended significantly beyond the field of arts. It entered philosophy as well as the new science about the real world just being born: physics. Galileo and others brought geometry into the real world. They took the ideal and perfect geometric world and put it behind the real one. They started to look through the real world as through a glass pane, seeing not only the real world, but also the geometric one behind it. For example, they started to see objects of the real world as mass points, their movements as trajectories, their rotation as movements on a circle. This all drew geometry closer and closer to the real world. The geometric world became the world behind the real one. And together with geometry, truth and beauty entered the real world.

The enormous success of Renaissance physics, which culminated in Isaac Newton’s physical laws and built on its earlier marriage with geometry, made physics look like the most perfect science. Not only did it describe the real world, but it also did so with the same elegance as geometry. Planets move on elliptical trajectories; objects thrown from a tower fall following laws of the real world, which science knows and understands. Based on this knowledge, it is possible to predict the future. Physics knows the world, knows how it behaves, knows the truth behind its laws. As a result, the world is no longer a dark and obscure place, but instead it is fully beautiful. Once again, those who know Newton’s physics can confirm that truth and beauty are engaged again, this time not just in the geometric world, but in reality.

Newtonian physics is the final masterpiece of the Renaissance; it is the finished version of Renaissance physics. It is elegant, correct, and beautiful. It describes the world as built around Euclid’s geometric space, and in fact it is the best and most complete form of rationalism. Such a world is discoverable by the pure mind without using experience. However, since then physics has changed. Einstein helped us recognize that space is not like that of Euclid, but rather that it is curved. Quantum theory disqualifies geometry (for which size is no issue) from being the model of the real world. Things just get too complicated, and modern physics isn’t related to Greek geometry at all. Science can still be useful and predict various truths, but as a side effect of moving away from geometry, the world seems to be a less and less suitable place for beauty.

On the other hand, most adults (including software engineers) know just as much about physics as was discovered by Newton. Many of us have heard about the theory of relativity. However, not many of us can explain it. That is why the illusion of the world being well organized and beautiful is still with us, and it even seems to be supported by the sharpest of natural sciences—Newton’s physics. That is why we seem to be convinced and expect that other parts of the world will be beautiful as well. Indeed, every science tries to be as pure as geometry and physics. Only if it merges truth with beauty is it generally accepted as good science. Perhaps beauty is simply more naturally recognizable and memorable, as opposed to chaotic theories with no structure.

ENIGMA MOVIE

Recently I had a chance to see the movie Enigma, a romance set on top of an encryption story from World War II. The main mathematician was asked whether he liked math. His answer: “I like numbers, because in numbers the truth and beauty emerge. You find out, you are walking in the right direction when everything becomes nicer. And then the numbers get you closer to the secret matter of all things.” I can hardly imagine a better glorification of the Greeks’ ideal of truth, beauty, and elegance. How deeply rooted in most of us the feeling that these three belong together must be, when a sentence like this appears in a romantic movie for nonscientists.

Computer science and software engineering are no exception in their preference for beauty and truth. However, you should keep in mind that the primary goal of making software is to get a reliable release out the door. In the rush of the final stages of the release cycle, beauty is indeed one of the last things engineers think about. We prefer to fix or, more accurately, work around critical bugs and then release the product. In fact, simplicity and elegance are not the goal at all. There is no place for them, although the feeling that they are needed is still there. Now that we are aware of this, we can respond by offering cluelessness as the software development methodology for now and the future.

More Cluelessness!

We’ve seen that simplicity and elegance are not the goal of today’s successfully deployed software systems. Just like in philosophy, rationalism is too academic for everyday understanding of the real world. The most promising development style seems to be a pragmatic use of the “bulldozer” approach: reuse components that are already available, compose applications from big chunks of premade libraries, glue them together, and make sure it works, even without fully understanding how. Although many would reject this point of view, it is the de facto style, mostly unconsciously, behind today’s biggest software projects. The question to ask is whether we can make this clueless approach work even better, now that we have become fully conscious of it.

The thing to admire about the bulldozer style is that it can produce good results even when the participants, mostly programmers, don’t understand most of the system. This might be frightening at first. On the other hand, we do this all the time. You don’t need to understand the design of a car to drive it. You don’t need to understand chemistry to brush your teeth. In a similar way, you don’t need to understand Windows’ code to write a simple Win32 application. You can become pretty efficient in programming for Windows just by having a working knowledge of the APIs (in this case the Win32 API) and knowing where to find the appropriate documentation. This is true for almost all systems. To code for Linux, Java, and the Web, you can learn just the tip of the iceberg, and that is enough to get most of the coding done. The reason for this is the abstraction that wraps around every library or framework. This abstraction, the API, hides all the complexities. This brings us to the main theme of this book.

The more selectively clueless you are, the more reliable the system is. Throughout this book, we’ll explore various ways to help people use a particular revision of your library without understanding everything about it, in a way that will continue to work in subsequent releases.
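A small sketch may make the idea of an API hiding complexity more concrete. Everything in it is hypothetical: the Glossary interface, the two implementations standing in for two releases of an imagined library, and the ClientDemo class are invented for illustration only. The client code in translate knows nothing but the interface, so it keeps working when the implementation behind it is swapped.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical client: it is "selectively clueless" and knows only the Glossary API.
class ClientDemo {
    static String translate(Glossary glossary, String word) {
        return glossary.lookup(word).orElse("?");
    }

    public static void main(String[] args) {
        // The same client code runs against both imagined releases of the library.
        Glossary[] releases = { new HashGlossary(), new SortedGlossary() };
        for (Glossary glossary : releases) {
            glossary.define("ahoj", "hello");
            System.out.println(translate(glossary, "ahoj") + " " + translate(glossary, "missing"));
        }
    }
}

// The API: the only thing a client needs to understand.
interface Glossary {
    void define(String word, String meaning);

    Optional<String> lookup(String word);
}

// Imagined "release 1" of the library, backed by a hash table.
final class HashGlossary implements Glossary {
    private final Map<String, String> entries = new HashMap<>();

    public void define(String word, String meaning) {
        entries.put(word, meaning);
    }

    public Optional<String> lookup(String word) {
        return Optional.ofNullable(entries.get(word));
    }
}

// Imagined "release 2": different internals, same observable behavior through the API.
final class SortedGlossary implements Glossary {
    private final Map<String, String> entries = new TreeMap<>();

    public void define(String word, String meaning) {
        entries.put(word, meaning);
    }

    public Optional<String> lookup(String word) {
        return Optional.ofNullable(entries.get(word));
    }
}
```

The design choice worth noticing is that translate never names a concrete class. The less the client is allowed to know about the internals, the more freedom the library has to change them between releases without breaking anyone.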

THE ORIGIN OF CLUELESSNESS

My first encounter with the term “cluelessness” dates to October 2006 when I heard a presentation at the OOPSLA conference in Portland, Oregon. Martin Rinard, an invited speaker at that conference, gave a controversial, yet inspiring talk called “Minimizing Understanding in the Construction and Maintenance of Software Systems.”

He observed that the human brain is finite and can handle only a limited amount of data. If the goal is to build larger and larger applications, we need to learn how to do it while knowing only a limited part of them. His talk outlined three directions to explore:

  • Program verification
  • System engineering
  • Living with errors

In his presentation he then continued to explore how to make a program acceptable while having some errors in it. Indeed, it was a horrible vision for those who love truth, beauty, and elegance; however, an acceptable compromise for those who try to produce real software systems.

Although I liked Mr. Rinard’s introduction very much and I agree with his conclusions with respect to errors, this book will now concentrate on system engineering and, to an extent, program verification. However, learning how to reach as much cluelessness as possible, while still producing reliable systems, will still be a tempting and desirable goal.

The term cluelessness is not meant to be offensive. It is here to distinguish between different types of understanding. There can be shallow understanding, where we understand any subject just as much as necessary to use it, and deep understanding, where we understand the principle behind things. In everyday life we usually need only shallow understanding. If we need to use a television, we don’t need a deep understanding of how televisions actually work. If we need to figure out the right location, we don’t need to understand the elaborate system of satellites flying around the Earth. Their function as well as the importance of their mutual positions is beyond our horizon; for us it is enough to recognize the latitude and longitude on the display of our GPS. Of course, there are situations when someone needs to understand more. For example, people who repair GPS devices, cars, or TVs need to have a much wider horizon. Still, even those people need only shallow understanding. They don’t need to know about every little detail. It’s certainly possible to learn almost everything about TVs and cars, and if we needed to, we could do so. However, such deep understanding isn’t usually necessary, and that’s why most of us stick with shallow knowledge in our day-to-day actions.

In a similar sense, cluelessness in software development means that we can rely on shallow understanding alone, at least most of the time. The term “selective cluelessness” is here as a reminder that we actively choose what is and what is not important for us to know about deeply. That’s why selective cluelessness, usually called just cluelessness for the rest of the book, is a thoroughly positive term.

________________

2.    Petr Vopěnka, Úhelný kámen evropské vzdělanosti a moci (Prague: Práh, 1999).

3.    Edsger Dijkstra, Selected Writings on Computing: A Personal Perspective (New York: Springer-Verlag, 1982).
