Foreword

“Humankind is defined by language; but civilization is defined by writing.”

Computer science is defined by computer languages.

Will the information age be defined by the web of software languages?[2]

[2] The first two sentences are the first two sentences of the book The World’s Writing Systems, co-edited by P. T. Daniels and W. Bright. These three sentences attempt to summarize very concisely the logical progression from the Stone Age and natural languages toward the Information Age and software languages.

Since the invention of writing and then of printing, many books have been written on many different topics. So one should always ask, why another book? In particular, why did the author spend so much time and effort in writing this book? Should we read it? Is there something new and important here? The answer is yes. My recommendation is clear: Don’t miss this book!

I’ve spent a couple of decades trying to understand software, which is certainly the most complex and versatile artifact ever built by humans. Instead of reading the future, I’ve dedicated a lot of time and effort to reading the past, following Winston Churchill’s advice: “The farther back you can look, the farther forward you are likely to see.” I realized that getting a deep understanding of software required understanding the history of not only computer science but also information technology and, ultimately, going back to prehistory, when everything started.

Nobody knows for sure when articulated language first appeared, but it was in prehistoric times, during the Stone Age. There is general agreement that Homo sapiens is the only species with such an elaborate language. Other species use different means of communication, but these remain very limited. For instance, bees use a “domain-specific language” to indicate to other bees the direction of food. By contrast, human language distinguishes itself by being general-purpose and reflexive. This latter property is a language’s capability to describe facts about itself. This book, which talks about languages throughout, is a perfect example of the metalinguistic capacity of humans. By reading this book, you will learn how metalanguages can be used to define languages. This task is not only extremely enjoyable for the intellect but also has strong practical implications.

Writing is, without any doubt, one of the first and most important information technologies ever invented. You could not read this book without it. As a matter of fact, the invention of writing marks the shift from prehistory to history. Writing did not pop up all at once, though. Very rudimentary techniques, such as notched bones or pebbles, were used in prehistoric times for information recording and basic information processing. But with the neolithic revolution, this situation started to change.

The domestication of plants and animals led to agriculture and livestock farming. Social life organized around settlements enabled (1) the separation of activities, (2) the specialization of knowledge and skills, (3) the appearance of “domain experts,” and (4) the diversification of artifacts produced. While some men worked in fields, others specialized as shipmen, butchers, craftsmen, and so on. The notions of property and barter appeared, bringing new requirements for information recording and processing. In Mesopotamia, protowriting appeared in the form of clay tablets; in Egypt, in the form of tags. Domain-specific languages were used to record the number of sheep, grain units, or other goods. Other kinds of tablets were used to organize tasks.

As villages grew into cities and then into kingdoms, the complexity of social structures increased in an unprecedented manner. Protowriting turned into writing, and civilizations appeared in Egypt and Mesopotamia. Civilization is based on writing. Without writing, rulers could not establish laws, orders could not be sent safely to remote troops, taxes could not be computed, lands and properties could not be administered, and so on. Early protowriting systems were used locally and evolved in an ad hoc way. The development of writing, by contrast, was somehow controlled and managed by states, via the caste of scribes and the establishment of schools and then libraries to keep bodies of texts or administrative records. In other words, one of the characteristics of civilizations is that written languages are “managed.” As you will see, this book provides important insights about various roles in language management and language engineering, which is one step further. Note that many kinds of notations, either graphical or textual, have been devised. Almost all scientific and engineering disciplines use some domain-specific languages: chemical notations in chemistry, visual diagrams in electronics, architectural drawings, all kinds of mathematical notations, to name just a few.

All the languages and technologies mentioned so far were based on the use of symbols produced and interpreted by humans. By contrast, Charles Babbage invented the concept of an “information machine.” The whole idea was to replace “human computers” using ink and paper to perform repetitive calculations. For this purpose, the machine would have to read and write symbols in some automated fashion. In some sense, computer science is “automatic writing.”

In the previous century, computers became a reality. The first computers, such as the Colossus, ENIAC, or Mark I, were truly technological monsters. Operators had to continuously feed the machines. Humans were dominated by the machines. This period, which I call Paleo informatics, is similar to the Paleolithic, when cavemen were dominated by nature. Whereas the Paleolithic is characterized by small groups of individuals performing a single activity—namely, finding food—Paleo informatics was characterized by small groups of computer shamans focusing on programming. The term computer language was associated with computer science, just as if computers really cared about languages humans invented to deal with them. I guess this is one of the biggest misunderstandings in computer science.

The Paleo informatics age is over. The shift to neoinformatics, though unnoticed, is not recent. In the past decades, the relationships between humans and computers have dramatically changed in favor of humans. Just as the neolithic revolution marked the domestication of nature by humans, humans have now domesticated computers. Even children can use computing devices. This new relation between our society and computers leads to the (re)emergence of the term informatics. Simply put, informatics is computer science in context. Informatics provides the whole picture, including both humans and computers; computer science corresponds to the narrow view focusing on computers.

What is interesting is that with the shift from Paleo informatics to neoinformatics, the size of social structures in the software world has been increasing: Single programmers writing programs have been replaced by teams of engineers developing and evolving software. Just like the neolithic period, neoinformatics is characterized by (1) the separation of activities, (2) the specialization of knowledge and skills, (3) the appearance of “domain experts,” and (4) the diversification of (software) artifacts. Programming is no longer the unique activity in the software world. Neoinformatics is characterized by a multiplication of languages, including programming languages, but also specification languages (Z), requirement languages, modeling languages (UML), architecture description languages (Wright), formatting languages (LaTeX), business-process languages (BPEL), model-transformation languages (ATL), metalanguages (BNF), query languages (SQL), and so on. It would be easy to cover a whole page with such a list. Moreover, the concept of language can take many different incarnations and names: metamodels, grammars, ontologies, schemas, logics, models, calculi, and so forth.

Since all these languages are used in the context of software production, I fully agree with the author that they should be called software languages. I know that many people would have suggested the use of the term computer language, as it initially referred to programming languages but later became more vague. I strongly disagree with that solution. I even believe that this term should be banned! First, it should be banned because of the unbalanced and inappropriate emphasis on computers; second, it should be banned because it does not fit with the trend in informatics toward the “disappearing computer,” or the “invisible computer.”

So far, software languages have been developed in a rather ad hoc way. But this should change in the future. Many experts predict that ubiquitous computing and ultra-large-scale systems will be part of the future of informatics. This will mean building software-intensive systems of incredible complexity. This will be achievable only through the collaboration of many experts from many disciplines. What is more, these systems will certainly have to evolve over centuries. It is likely that their continuous evolution will involve complex consortiums gathering many organizations, governmental or not.

I predict that the need to manage software languages explicitly will become more obvious each day in the future. On the one hand, the involvement of more actors means more communication and hence the need for vehicular software languages (i.e., interchange languages). On the other hand, specialization can be achieved by domain-specific languages. They act as vernacular software languages (i.e., local languages). Ultra-large-scale systems will therefore imply controlling and managing complex networks of software languages. The heterogeneity of the Internet and of the information on it will lead to what could be called the web of (software) languages.

As a matter of fact, this vision is not new. If you closely observe what is going on in some fields, some human activities are based on a similar scenario. Consider, for instance, very complex artifacts, such as planes. Designing and building Airbuses requires the collaboration of many companies in many countries. Many experts in many different fields are supposed to design, realize, or test parts of the whole system. These experts may use different natural languages for internal documents and many different technical or scientific languages. For instance, the graphical formalism used to design the cockpit is certainly totally different from the language used to model the deformations of wings, which is itself totally different from the language used to describe electronic circuits. Building complex systems, such as planes, also implies complex language networks.

Software languages are already there and so must be considered from a scientific point of view. I call software linguistics the subset of linguistics (i.e., the science of languages) that will be devoted to the study of software languages. Software linguistics should be contrasted both with natural linguistics (its counterpart for natural languages) and with computational linguistics, that is, the application of computational models in the context of (natural) linguistics. Software linguistics is just the other way around: linguistics applied to software.

Languages crossed the ages—from the Stone Age to the information age—but their diversity and variety are ever increasing. By contrast to the large body of books that describe very specific language-oriented techniques, such as parsing or implementing visual editors, this book provides the first global view on software language engineering, while also providing all the necessary technical details for understanding how these techniques are interrelated. If you want to know more about software language engineering, you could not be luckier: You have exactly the right book in your hands! Just read it.

Jean-Marie Favre
Software Language Archaeologist and Software Anthropologist
LIG, ACONIT, University of Grenoble, France
