A Brief History of Software Engineering

The first modern computer was an electromechanical, typewriter-sized device developed in Germany in the 1920s for enciphering messages. The device was later sold to the German Commerce Ministry, and in the 1930s the German military adopted it for enciphering all wireless communication. Today we know it as the Enigma.

Enigma used mechanical rotors to change the route of electrical current flow to a light board in response to a letter key being pressed, resulting in a different letter being output (the ciphered letter). Enigma was not a general-purpose computer: it could only do enciphering and deciphering (which today we call encryption and decryption). If the operator wanted to change the encryption algorithm, he had to physically alter the mechanical structure of the machine by changing the rotors, their order, their initial positions, and the wired plugs that connected the keyboard to the light board. The "program" was therefore coupled in the extreme to the problem it was designed to solve (encryption), and to the mechanical design of the computer.
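To make the rotor mechanism concrete, here is a minimal C++ sketch of a single stepping rotor, under heavy simplification: it omits the plugboard, the reflector, and the additional rotors of the real machine, and the wiring string is one published historical rotor wiring. The point is only that the moving rotor makes the substitution change with every key press.

    #include <iostream>
    #include <string>

    // A toy, single-rotor substitution: the rotor is a fixed permutation of
    // the alphabet, and it steps by one position before every key press, so
    // the same plaintext letter encrypts differently each time.
    int main() {
        const std::string rotor = "EKMFLGDQVZNTOWYHXUSPAIBRCJ"; // a published Enigma rotor wiring
        int offset = 0; // rotor position; on the real machine, part of the daily key

        for (char plain : std::string("HELLO")) {
            offset = (offset + 1) % 26;                      // the rotor steps first...
            std::cout << rotor[(plain - 'A' + offset) % 26]; // ...then current flows through it
        }
        std::cout << '\n';
    }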

The late 1940s and the 1950s saw the introduction of the first general-purpose electronic computers for defense purposes. These machines could run code that addressed any problem, not just a single predetermined task. The downside was that the code executed on these computers was in a machine-specific "language," with the program coupled to the hardware itself. Code developed for one machine could not run on another. In fact, at the time there was no distinction between the software and the hardware (indeed, the word "software" was coined only in 1958). Initially this was not a cause for concern, since there were only a handful of computers in the world anyway. As machines became more numerous, it did turn into a problem. In the early 1960s the emergence of assembly language decoupled the code from specific machines, enabling it to run on multiple computers. That code, however, was now coupled to the machine architecture: code written for an 8-bit machine could not run on a 16-bit machine, let alone tolerate differences in registers, available memory, or memory layout. As a result, the cost of owning and maintaining a program began to escalate. This coincided more or less with the widespread adoption of computers in the civilian and government sectors, whose more limited resources and budgets necessitated a better solution.

In the 1960s, higher-level languages such as COBOL and FORTRAN introduced the notion of a compiler: the developer would write in an abstraction of machine programming (the language), and the compiler would translate that into actual assembly code. Compilers for the first time decoupled the code from the hardware and its architecture. The problem with those first-generation languages was that they produced nonstructured programs, in which the code was internally coupled to its own layout through the use of jump or go-to statements. Minute changes to the code's structure often had devastating effects in multiple places in the program.
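The following toy fragment (rendered in C++ purely for brevity; the era's languages were COBOL and FORTRAN) illustrates that coupling: the program's loop exists only as a web of jumps, so the logic breaks if lines are inserted, removed, or reordered.

    #include <iostream>

    // Illustrative only: control flow expressed with goto, in the style of
    // early nonstructured programs. Every jump targets a label by position,
    // so rearranging lines can silently break logic elsewhere in the program.
    int main() {
        int i = 0;
    check:
        if (i >= 5) goto done;
        std::cout << i << '\n';
        ++i;
        goto check;   // the loop's "shape" exists only as a web of jumps
    done:
        return 0;
    }

A structured for loop expresses the same logic with its shape made explicit, which is exactly what the next generation of languages provided.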

The 1970s saw the emergence of structured programming via languages such as C and Pascal, which decoupled the code from its internal layout and structure using functions and structures. The 1970s was also the first time developers and researchers started to examine software as an engineered entity. To drive down the cost of ownership, companies had to start thinking about reuse—that is, what would make a piece of code reusable in other contexts. With languages like C, the basic unit of reuse is the function. But the problem with function-based reuse is that the function is coupled to the data it manipulates, and if the data is global, a change to benefit one function in one reuse context is likely to damage another function reusing that data elsewhere.
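A minimal sketch of the problem, using two hypothetical functions that share a global variable: tuning the global to benefit one reuse context silently breaks the other.

    #include <cstdio>

    // Illustrative only: two "reusable" functions that both depend on the
    // same global. Changing the global's meaning to suit one caller breaks
    // the other, so neither function can safely be reused in isolation.
    static double rate = 0.08; // shared global: originally a sales-tax rate

    double price_with_tax(double price)       { return price * (1.0 + rate); }
    double monthly_interest(double principal) { return principal * rate / 12.0; }

    int main() {
        rate = 0.12; // "tuned" for the interest calculation...
        std::printf("%.2f\n", monthly_interest(1000.0)); // 10.00
        std::printf("%.2f\n", price_with_tax(50.0));     // ...but now tax is wrong: 56.00
    }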

Object-Orientation

The solution to these problems emerged in the 1980s, with languages such as Smalltalk and later C++, in the form of object-orientation. With object-orientation, the functions and the data they manipulated were packaged together in an object: the functions (now called methods) encapsulated the logic, and the object encapsulated the data. Object-orientation enabled domain modeling in the form of a class hierarchy, and the mechanism of reuse was class-based, enabling both direct reuse and specialization via inheritance.

But object-orientation was not without its own acute problems. First, the generated application (or code artifact) was a single, monolithic application. Languages like C++ have nothing to say about the binary representation of the generated code, so developers had to deploy huge code bases every time they needed to make a change, however minute, and this had a detrimental effect on the development process and on application quality, time to market, and cost. While the basic unit of reuse was a class, it was a class in source format. Consequently, the application was coupled to the language used: you could not have a Smalltalk client consuming a C++ class or deriving from it. Language-based reuse implied uniformity of skill (all developers in the organization had to be skilled enough to use C++), which led to staffing problems. Language-based reuse also inhibited economy of scale, because an organization using multiple languages had to duplicate its investments in frameworks and common utilities. Finally, having to access the source files in order to reuse an object coupled developers to each other, complicated source control, and coupled teams together, since it made independent builds difficult.

Moreover, inheritance turned out to be a poor mechanism for reuse, often causing more harm than good, because the developer of the derived class needed to be intimately aware of the implementation of the base class, which introduced vertical coupling across the class hierarchy.
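The vertical coupling introduced by inheritance is easiest to see in code. In this hypothetical C++ sketch, the derived class is correct only as long as the base class preserves an internal implementation detail:

    #include <iostream>
    #include <vector>

    // Illustrative only: the derived class silently depends on how the base
    // class is implemented. CountedSet assumes addAll() calls add() once per
    // item; if the base author changes addAll() to insert in bulk, the
    // derived count becomes silently wrong, even though no interface changed.
    class Collection {
    public:
        virtual ~Collection() = default;
        virtual void add(int item) { items.push_back(item); }
        virtual void addAll(const std::vector<int>& batch) {
            for (int item : batch) add(item); // derived classes come to rely on this detail
        }
    protected:
        std::vector<int> items;
    };

    class CountedSet : public Collection {
    public:
        void add(int item) override { ++count; Collection::add(item); }
        int count = 0; // correct only while addAll() is implemented via add()
    };

    int main() {
        CountedSet s;
        s.addAll({1, 2, 3});
        std::cout << s.count << '\n'; // 3 today; 0 if the base switches to bulk insert
    }

The derived class compiles and passes its tests, yet it is hostage to an invisible contract with the base class implementation: exactly the vertical coupling described above.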

Object-orientation was also oblivious to real-life challenges, such as deployment and versioning issues. Serialization and persistence posed yet another set of problems: most applications did not start by plucking objects out of thin air; they had persistent state that needed to be hydrated into objects, yet there was no way of enforcing compatibility between the persisted state and the potentially newer object code.

Object-orientation likewise assumed the entire application was always in one big process. This prevented fault isolation between the client and the object: if the object blew up, it took the client (and every other object in the process) with it. A single process also implied a single uniform identity for the clients and the objects, without any security isolation, which made it impossible to authenticate and authorize clients, since they had the same identity as the object. A single process further impeded scalability, availability, responsiveness, throughput, and robustness. Developers could manually place objects in separate processes, yet if the objects were distributed across multiple processes or machines there was no way of using raw C++ for the invocations, since C++ required direct memory references and did not support distribution. Developers had to write host processes and use some remote-call technology (such as TCP sockets) to remote the calls, but such invocations looked nothing like native C++ calls and did not benefit from object-orientation.

Component-Orientation

The solution to the problems of object-orientation evolved over time, involving technologies such as the static library (.lib) and the dynamic library (.dll), and culminating in 1994 with the first component-oriented technology, called COM (Component Object Model). Component-orientation provided interchangeable, interoperable binary components. With this approach, instead of sharing source files, the client and the server agree on a binary type system (such as IDL) and on a way of representing the metadata inside the opaque binary components. The components are discovered and loaded at runtime, enabling scenarios such as dropping a control on a form and having that control be automatically loaded at runtime on the client's machine. The client programs only against an abstraction of the service: a contract called the interface. As long as the interface is immutable, the service is free to evolve at will. A proxy can implement the same interface and thus enable seamless remote calls by encapsulating the low-level mechanics of the remote call. The availability of a common binary type system enables cross-language interoperability, so a Visual Basic client can consume a C++ COM component. The basic unit of reuse is the interface, not the component, and polymorphic implementations are interchangeable. Versioning is controlled by assigning a unique identifier to every interface, COM object, and type library.
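A minimal C++ sketch of the interface-and-proxy idea, in the spirit of COM but without real COM, IDL, or registry plumbing (ICalculator and its implementations are hypothetical names):

    #include <iostream>

    // Illustrative only: the client programs against an abstract interface;
    // a proxy implements the same interface and could forward the call
    // across a process boundary without the client ever knowing.
    struct ICalculator {                  // the contract: all that client and server share
        virtual ~ICalculator() = default;
        virtual int Add(int a, int b) = 0;
    };

    struct Calculator : ICalculator {     // the real service, free to evolve behind the interface
        int Add(int a, int b) override { return a + b; }
    };

    struct CalculatorProxy : ICalculator { // same interface, so clients cannot tell the difference
        int Add(int a, int b) override {
            // a real proxy would marshal a and b, send them across the wire,
            // and unmarshal the result; here we just call the target directly
            return target.Add(a, b);
        }
        Calculator target;
    };

    void Client(ICalculator& calc) {      // coupled only to the interface
        std::cout << calc.Add(2, 3) << '\n';
    }

    int main() {
        Calculator local;
        CalculatorProxy remote;
        Client(local);   // direct call
        Client(remote);  // identical client code, remoted (in principle)
    }

Because the client is coupled only to the interface, the local object and the proxy are interchangeable, which is what makes seamless remote calls and polymorphic implementations possible.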

While COM was a fundamental breakthrough in modern software engineering, most developers found it unpalatable. COM was unnecessarily ugly because it was bolted on top of an operating system that was unaware of it, and the languages used for writing COM components (such as C++ and Visual Basic) were at best object-oriented but not component-oriented. This greatly complicated the programming model, requiring frameworks such as ATL to partially bridge the two worlds. Recognizing these issues, Microsoft released .NET 1.0 in 2002. .NET is (in the abstract) nothing more than cleaned-up COM, MFC, C++, and Windows, all working seamlessly together under a single new component-oriented runtime. .NET supports all the advantages of COM and mandates and standardizes many of its ingredients, such as type metadata sharing, dynamic component loading, serialization, and versioning.

While .NET is at least an order of magnitude easier to work with than COM, both COM and .NET suffer from a similar set of problems:

Technology and platform

The application and the code are coupled to the technology and the platform. Both COM and .NET are available only on Windows. Both also expect the client and the service to be either COM or .NET and cannot interoperate natively with other technologies, be they Windows or not. While bridging technologies such as web services make interoperability possible, they force the developers to let go of almost all of the benefits of working with the native framework, and they introduce their own complexities and coupling with regard to the nature of the interoperability mechanism. This, in turn, breaks economy of scale.

Concurrency management

When a vendor ships a component, it cannot assume that its clients will not access it with multiple threads concurrently. In fact, the only safe assumption the vendor can make is that the component will be accessed by multiple threads. As a result, the components must be thread-safe and must be equipped with synchronization locks. However, if an application developer is building an application by aggregating multiple components from multiple vendors, the introduction of multiple locks renders the application deadlock-prone. Avoiding the deadlocks couples the application and the components.
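A minimal sketch of how two individually thread-safe components can render the aggregate deadlock-prone (the component names are hypothetical):

    #include <mutex>

    // Illustrative only: two thread-safe components from different vendors,
    // each guarding itself with its own lock. If A calls into B while holding
    // its lock, and B calls into A while holding its lock, then two threads
    // running A->B and B->A concurrently can deadlock.
    struct ComponentB;

    struct ComponentA {
        std::mutex lock;
        void Work(ComponentB& b);
        void Callback() { std::lock_guard<std::mutex> g(lock); /* ... */ }
    };

    struct ComponentB {
        std::mutex lock;
        void Work(ComponentA& a);
        void Callback() { std::lock_guard<std::mutex> g(lock); /* ... */ }
    };

    void ComponentA::Work(ComponentB& b) {
        std::lock_guard<std::mutex> g(lock); // thread 1 holds A's lock...
        b.Callback();                        // ...and now waits for B's lock
    }

    void ComponentB::Work(ComponentA& a) {
        std::lock_guard<std::mutex> g(lock); // thread 2 holds B's lock...
        a.Callback();                        // ...and now waits for A's lock
    }

    int main() {
        ComponentA a;
        ComponentB b;
        // Running a.Work(b) and b.Work(a) on two threads can deadlock.
        a.Work(b); // safe single-threaded; shown only to keep the sketch runnable
    }

Preventing the deadlock requires a global lock-acquisition order, knowledge neither vendor can have in isolation; only the aggregating application can impose it, which is precisely the coupling described above.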

Transactions

If multiple components are to participate in a single transaction, the application that hosts them must coordinate the transaction and flow the transaction from one component to the next, which is a serious programming endeavor. This also introduces coupling between the application and the components regarding the nature of the transaction coordination.
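A minimal sketch of manual transaction flowing, using a hypothetical Transaction type rather than any real COM+ or J2EE API; the point is that the hosting application owns the coordination, and every component signature is shaped by it:

    #include <iostream>

    // Illustrative only: the application must create the transaction,
    // explicitly flow it into every component, and decide the outcome.
    // Every component is thereby coupled to this coordination scheme.
    struct Transaction {
        bool doomed = false;
        void Complete() { std::cout << (doomed ? "rolled back\n" : "committed\n"); }
    };

    struct OrderComponent {
        void PlaceOrder(Transaction& tx) {
            // do work; vote on the outcome via the transaction we were handed
            if (/* failure */ false) tx.doomed = true;
        }
    };

    struct BillingComponent {
        void Charge(Transaction& tx) {
            if (/* failure */ false) tx.doomed = true;
        }
    };

    int main() {
        Transaction tx;            // the application owns the transaction...
        OrderComponent orders;
        BillingComponent billing;
        orders.PlaceOrder(tx);     // ...must remember to flow it everywhere...
        billing.Charge(tx);
        tx.Complete();             // ...and must coordinate the final outcome itself
    }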

Communication protocols

If components are deployed across process or machine boundaries, they are coupled to the details of the remote calls, the transport protocol used, and its implications for the programming model (e.g., in terms of reliability and security).

Communication patterns

The components may be invoked synchronously or asynchronously, and they may be connected or disconnected. A given component may not support all of these modes, and the application must be aware of its exact preference. With COM and .NET, developing asynchronous or even queued solutions was still the responsibility of the developer, and any such custom solution was not only difficult to implement but also introduced coupling between the solution and the components.

Versioning

Applications may be written against one version of a component and yet encounter another in production. Both COM and .NET bear the scars of DLL Hell (which occurs when a client at runtime tries to use a different, incompatible version of the component than the one against which it was compiled), so both provide a guarantee to the client that, at runtime, it will get exactly the same component versions it was compiled against. This conservative approach stifled innovation and the introduction of new components. Both COM and .NET provided for custom version-resolution policies, but using them risked DLL Hell-like symptoms. There was no built-in versioning tolerance, and dealing robustly with versioning issues coupled the application to the components it used.

Security

Components may need to authenticate and authorize their callers, but how does a component know which security authority it should use, or which user is a member of which role? Not only that, but a component may want to ensure that the communication from its clients is secure. That, of course, imposes certain restrictions on the clients and in turn couples them to the security needs of the component.

Off-the-shelf plumbing

In the abstract, interoperability, concurrency, transactions, protocols, versioning, and security are the glue—the plumbing—that holds any application together.

In a decent-sized application, the bulk of the development effort and debugging time is spent on addressing such plumbing issues, as opposed to focusing on business logic and features. To make things even worse, since the end customer (or the development manager) rarely cares about plumbing (as opposed to features), developers typically are not given adequate time to develop robust plumbing. Instead, most handcrafted plumbing solutions are proprietary (which hinders reuse, migration, and hiring) and of low quality, because most developers are not security or synchronization experts and are not given the time and resources to develop the plumbing properly.

The solution was to use ready-made plumbing that offered such services to components. The first attempt at providing decent off-the-shelf plumbing was MTS (Microsoft Transaction Server), released in 1996. MTS offered support for much more than transactions, including security, hosting, activation, instance management, and synchronization. MTS was followed by J2EE (1998), COM+ (2000), and .NET Enterprise Services (2002). All of these application platforms provided decent plumbing (albeit with varying degrees of ease of use), and applications that used them had a far better ratio of business logic to plumbing.

However, by and large these technologies were not adopted on a large scale, due to what I term the boundary problem. Few systems are an island; most have to interact and interoperate with other systems. If the other system does not use the same plumbing, you cannot interoperate smoothly. For example, there was no way of propagating a COM+ transaction to a J2EE component. As a result, when crossing the system boundary, a component (say, component A) had to dumb down its interaction to the (not so large) common denominator between the two platforms. But what about component B, next to component A? As far as B was concerned, the component it interacted with (A) did not speak its flavor of the plumbing, so B also had to be dumbed down. As a result, system boundaries tended to creep from the outside inward, preventing the ubiquitous use of off-the-shelf plumbing. Technologies like Enterprise Services and J2EE were useful, but only in isolation.
