5
Between Knowledge Indexing and Existence Indexing

But they were wrong; it is always wrong to explain what happens in a country by the character of its inhabitants. For the inhabitant of a country has at least nine characters: a professional, a national, a civic, a class, a geographic, a sexual, a conscious, an unconscious, and possibly even a private character to boot. He unites them in himself, but they dissolve him, so that he is really nothing more than a small basin hollowed out by these many streamlets that trickle into it and drain out of it again, to join other such rills in filling some other basin. Which is why every inhabitant of the earth also has a tenth character that is nothing else than the passive fantasy of spaces yet unfilled. This permits a person all but one thing: to take seriously what his at least nine other characters do and what happens to them; in other words, it prevents precisely what should be his true fulfillment. This interior space – admittedly hard to describe – is of a different shade and shape in Italy from what it is in England, because everything that stands out in relief against it is of a different shade and shape; and yet it is in both places the same: an empty, invisible space, with reality standing inside it like a child’s toy town deserted by the imagination. (Musil 2011, pp. 42–43)

How many metadata fields does it take to characterize a human being? Is it possible to free oneself from them and to fill this space of freedom without being unduly influenced by the other elements, already filled in by the genetic and historical characters that necessarily accompany one’s existence? Musil’s description raises questions about the invisible spaces that still remain. Virtual or fantasized spaces, interior spaces, secret gardens: not much seems to escape the possibilities of indexing. The discovery of the territories of the unconscious and of our genes is leading us further and further, driven by the will to know more about ourselves as a species, but also to know more about others as societies.

While part of this movement appears to be motivated by science, other emerging motives seem to be linked to accounting, managerial, even racial and inhuman logics. The rationality at work in these methods, sometimes inspired by scientific methods, does not necessarily guarantee a form of sapience. On the one hand, these processes seek to know more by accumulating data; on the other, they seek to reduce the individual in order to better control him.

The creation and accumulation of documents is thus accompanied by a growing production of metadata, which is increasingly standardized or optimized. Metadata are sometimes disappointing from a documentary point of view, but they also make comparison and prediction possible. Google’s profiling of users for advertising purposes is sometimes surprising in the gender and age group it infers. Errors are frequent and obvious, hence the interest in obtaining more reliable metadata, under the pretext of security, notably by retrieving geolocation and personalization data such as telephone numbers.

The goal is to file, tidy up, find and make decisions. This requires not only methods but also ways of seeing, that is, ideologies. The same facts or even the same data can be interpreted in completely different ways. Even the analysis of the facts and the methods of collection can be radically different and ultimately lead to a different understanding of reality.

The risk of leaving aside a perhaps more relevant “minority report” is clearly present when the majority decision is chosen, which is precisely the problem faced by the hero of Philip K. Dick’s story:

But you must realize now that the original report, the majority report, was not a fake. Nobody falsified it. Ed Witwer didn’t create it. There’s no plot against you, and there never was. If you’re going to accept this minority report as genuine you’ll have to accept the majority one, also. (Dick 2013, “The Minority Report”, sec. VII)

It then becomes difficult to change a categorization once it has been enacted, along with the decisions that follow from it. The margin of freedom that Musil foresaw shrinks as the other elements of categorization become fixed.

The individual can then be represented by a bundle of indices that are recorded and that allow both distinction (the reduction to a single individual entity) and comparison, and thus attachment to a group.

5.1. An index question

The creation of indexes cannot be dissociated from these aspects, because it is always a question of relating one element to another, one unit to a larger whole, and of making comparisons and exclusions, of spotting what seems to go together and what appears to be an anomaly.

The index question, that is to say, as much the question of indexing as that of inquisition, is indeed that of history, or rather historiae in the sense of inquiry and investigation. This writing, however, is not only that of historians but also that of investigators of the present, who are increasingly becoming predictors of the future.

This index question shows a tension between two types of methods and logics. The first seeks the truth by relying on traces, clues that make it possible to reconstruct the facts. The second seeks out scattered elements in order to gather documentation that must provide proof of an error, that is to say, etymologically, a heresy. What is at stake here is the demonstration of guilt. In both cases, proof is needed. In this sense, the documentary and a fortiori hyperdocumentary regimes are regimes of the provision of proof. With current technologies, they tend to become probabilistic regimes, insofar as it is more a question of considering what may happen than what has already happened.

Umberto Eco gives a face to these two opposing logics: the one that seeks the truth of the facts, and the one that tracks down error in order to better demonstrate it and, above all, denounce it. In The Name of the Rose, the Franciscan monk William of Baskerville is a medieval detective concerned with the truth of the facts, who seeks to go beyond the most simplistic appearances; in this he turns out to be close to William of Ockham. The expression “Ockham’s razor” thus designates the principle of not inventing new words or concepts when existing ones already make it possible to express what one wishes to show. The onomastic choice of Baskerville obviously makes it possible to identify the methodological and genre-related kinship with Sir Arthur Conan Doyle and his hero Sherlock Holmes.

Thus William of Baskerville is described by the young narrator, his secretary Adso of Melk, the “Dr. Watson” of this medieval detective novel:

I did not then know what Brother William was seeking, and to tell the truth, I still do not know today, and I presume he himself did not know, moved as he was solely by the desire for truth, and by the suspicion–which I could see he always harbored–that the truth was not what was appearing to him at any given moment. (Eco 2004, p. 12)

William is wary of the obvious and of appearances, and is worried about the impending arrival of the terrible inquisitor Bernardo (or Bernard) Gui, a real historical figure, whereas William of Baskerville is fictional:

For years Bernard was the scourge of heretics in the Toulouse area, and he has written a Practica officii inquisitionis heretice pravitatis for the use of those who must persecute and destroy Waldensians, Beghards, Fraticelli, and Dolcinians. (Eco 2004, p. 124)

Heresy was fought at that time first because it was against the dominant dogma. It is above all a reaction that needs the official dogma to try to stand out:

Otherness was thus eliminated, erased, not only for lack of a strong enough political or social base, but just as much, and perhaps even more so, for lack of being able to articulate itself as different from the system of reference, for lack of having another code than the same doctrinal one, which it questioned, to identify its practice. (De Certeau 2002, p. 183, author’s translation)

De Certeau thus considers that the basic theology of medieval heresies was finally quite similar, whereas modern heresies managed to construct a different otherness alongside the official one, thus opening the way to the Reformation. The latter was, moreover, constituted with new documentary logics that were those of a renewed accessibility to documents and thus to the study of the Gutenberg Bible.

Whatever the era, what is at stake is the confrontation of dogmas and truths and the means to impose oneself.

Between the informational and intellectual needs that require access to information and sources in order to answer intellectual and spiritual questions, and the need to know more about what others think and do, there is a single link, one that religion very quickly grasped. If, etymologically, religion is what binds, it also makes documentary choices that exclude or integrate documents on the basis of processes for verifying veracity, which are generally more a manifestation of power. This was the case with the New Testament and the fixing of its canon at the Councils of Nicaea (325) and Rome (382), which excluded apocryphal texts and, more particularly, texts that could contradict the dogma being constructed.

Therefore, we understand that the logic of documentary research does not only consist in better understanding our world, but in favoring documents that reinforce dogmas and the powers that be, in order to better condemn and exclude those who produce documents that question them. The worst thing is not to make irreverent remarks, but to produce visible and readable traces of them. Heresy in its documentary forms was understood as a form of virus to be eradicated by a process that consisted of identifying it as such and destroying it to limit its spread.

The methods of collection, research and file creation, and the desire to classify and organize, appear to be very similar, except that the focus is on preservation and transmission in one case, and on destruction in the other. Consequently, knowledge indexing and existence indexing are in fact two sides of the same coin. Depending on the period of history, one of the two sides seems to be privileged, and the balance remains constantly uncertain. Often, it seems to tip in favor of the processes of indexing existences, in large part because advances in the organization of knowledge end up enabling great strides in the mechanisms for indexing individuals.

5.2. The two faces of indexing

One of the most striking tensions in this history was the reliance on bibliographical and documentary work to advance inquisitorial processes or record-keeping on individuals. One of the most telling examples is the universal bibliography produced by Conrad Gesner (1516–1565), which was previously described as a “drifting machine” and which managed to record an important bibliography of resources in Latin, Greek, Hebrew and Arabic. His choice was based on the need to produce a bibliography useful to those who wished to catalog their own library, since it allowed them to put their own call number on the book. Gesner’s compilation work was mainly aimed at informing potential readers about interesting acquisitions they might make, or about documents that might ultimately correspond to their need for information. Considering that the works he had catalogued all presented interesting informational elements, he informed the reader that he had not specifically sought to verify their full truthfulness and religious conformity. It was then that this work of indexing knowledge served as a source for a famous work of indexing existences, carried out by the Inquisition. A few years after the publication of Gesner’s work in 1545, the year in which the Council of Trent opened, the Catholic authorities decided to set up a new index, that of prohibited books. The Index librorum prohibitorum, in its first edition of 1559, was based on several sources, including Gesner’s work, because it was certainly very useful for quickly spotting potentially heretical works. At the same time, Gesner himself was also indexed. What interested the authorities was establishing proof of heresy, and for this there was nothing better than documentary evidence. The passage from blacklisting the book to denouncing its author was therefore rapid.

Indexicability then corresponds as much to the documentary construction carried out by the index as to the necessary construction that leads to the discovery of proof.

This type of list is found throughout history and proves to be as terrible for authors as for their works. For example, the Nazis had specific lists drawn up to expunge all works by Jews. This is the case of the Bernhard list and then the Otto list, which enumerated the works prohibited by the Nazi regime in France. The second list was drawn up in September 1940 by the German authorities with the collaboration of the French Publishers’ Union and publishing houses. Major seizures in bookstores followed. The list initially contained works by Jewish authors, then came to include Marxist authors after the breaking of the German-Soviet pact, as well as British and American authors.

Another example is that of John Edgar Hoover, and Clint Eastwood’s film about him offers an interesting illustration of this shift. In the film, the man who will become the head of the FBI demonstrates what he wants to do with the Bureau by showing his secretary the Library of Congress, where he used to work as a clerk to pay for his law studies. Demonstrating to her how quickly he can find information in the files of the federal library, he tells her that he wants to do the same for the FBI. This is how he developed the first national fingerprint file.

In the same vein of misappropriation, one can still think of the scientific publications themselves and the key role of bibliometrics and scientometrics. Bibliometrics is a concept forged by Paul Otlet in 1934:

‘Bibliometry’ will be the defined part of the Bibliology that deals with the measurement or quantity applied to books (bibliological arithmetic or mathematics). (Otlet 1934, p. 14, author’s translation)

The term was then taken up and associated with that of scientometry. Among the main actors in this field are Derek de Solla Price, Eugene Garfield and Vassili Nalimov, who developed a whole body of work around bibliographies (Le Deuff 2018). Scientometry was initially conceived as a discipline for studying science, its evolution and its trends through its productions (articles, books, patents, laboratory activities, etc.). It came to produce indicators to better understand what was happening within the sciences, and even attempted to establish laws accounting for their evolution, the emergence of disciplines, the success of certain currents, etc. However, certain indicators have aroused the interest of research management, to the point of producing a kind of “derivative machinery” out of an initial project that ultimately aimed to understand science and the production of knowledge. The famous h-index, which is only one scientometric indicator among others, has become a tool that allows management to relate a researcher’s output to the number of citations it generates, for research evaluation purposes. Even within the logic of knowledge indexing, there has been a drift and a reversal of logic which has ended up harming the discipline of scientometry itself, now mistakenly equated with a managerial aim. It is also a good example of the overuse and over-interpretation of indicators, whose intrinsic quality ends up suffering even at the scientometric level, since publication strategies have been induced by the over-valuation of this type of indicator. Among these aberrations, particularly those stemming from the logic of Shanghai-type rankings, institutions have undertaken reorganizations that affect a large number of employees, merging scientific entities in the hope of being included among the top 100 universities.

We are dealing here with an indexing of existences that resides in performance measures for researchers, particularly in certain disciplines. The result is an institutional disorganization that has harmful effects on those working in the field (students, administrative and technical staff, teacher-researchers) for the sole purpose of promoting often biased performance objectives.
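
To make concrete how much such an indicator compresses, here is a minimal sketch of the h-index as it is usually defined: the largest number h such that h of a researcher’s publications have each been cited at least h times. The function name and the two citation lists are hypothetical and purely illustrative; they do not come from any real bibliographic database.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the first `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Two invented publication records (illustrative numbers only).
steady_record = [5, 5, 5, 5, 5]        # five papers, moderately cited
single_hit_record = [1000, 2, 1, 0, 0]  # one landmark paper, little else

print(h_index(steady_record))      # 5
print(h_index(single_hit_record))  # 2
```

The two invented records show what the single figure hides: a landmark paper cited a thousand times weighs less here than five modestly cited ones, which is one reason why publication strategies can adapt to the indicator rather than the reverse.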

It is important to understand that the challenge is not only to file and classify knowledge, but also to file individuals and to be able to classify them as much by categorical logics as by “ranking” strategies. However, the opposition between the indexing of knowledge and the indexing of existences is more complex than it seems, especially since the search for scientific truth often involves a will to control and observe individuals, as Michel Foucault has demonstrated on several occasions in his work (e.g. Foucault 2015). Documentary regimes are clearly based on an increase in the power of knowledge institutions, which are also institutions of power. It remains to be seen who the leaders of these institutions are and what they really want. The confrontation between William of Baskerville and Bernard Gui is not about to end, except that the latter seems to have far superior means and weapons.

5.3. The need for an indexing ethic

The current situation is now complex, for while it is tempting to envisage questioning institutions and their excessive power, or even the abuses they can cause in the management of individual lives, it actually seems difficult to totally separate the indexing of knowledge from the indexing of existence in order to retain only the first aspect.

If ethical concern is now fashionable, it also implies documentary and data processing procedures that are not always easy to enforce, because they also call on technical and professional knowledge. All too often, the temptation is then to consider any production of metadata as bad. This is not the case. Conversely, awareness of the scope of metadata is not always evident. To take the example of libraries, the possibility of borrowing works rests on mechanisms that are also those used to keep files on people. It is now necessary to be identified, or even authenticated, while the system generates a list of loans. Whether with old borrower cards or with profiles from library management systems, sets of personal data and metadata are generated, whose scope raises questions about the access rights that the system could grant.

The queries run against databases, as well as readers’ requests, are just as much information about the informational needs of a person or group of people as the profiled queries of a search engine.

Must we then anonymize everything in order to protect individuals and users? This is not so certain, as documentary mediation requires knowledge of people and not only of the information available. This point is well known to youth librarians and teacher-documentalists, who often have to interpret or anticipate the needs of young audiences. While these young audiences often have information needs that they wish to keep secret, and this must be respected, those who mediate knowledge sometimes find that they need to be able to analyze the other needs that surround the need for information. For example, a student looking for a book on childbirth may in fact be alerting us to other issues, which may lead the librarian to send him or her to see the school nurse quickly.1

It is therefore necessary to advocate for indexes that are indicators as objects of analysis and interpretation rather than preachers. In the end, this also shifts the question away from objects to the methods of those who conceive them and the purposes they will serve.

Inevitably, risks remain for all documentary undertakings, whether they rely on human or machine processing. Once again, Umberto Eco, in Foucault’s Pendulum, gives us an excellent example of this with the documentation work carried out by the narrator of the story:

Still, I was accumulating experience and information, and I never threw anything away. I kept files on everything. I didn’t think to use a computer (they were coming on the market just then; Belbo was to be a pioneer). Instead, I had cross-referenced index cards. Nebulae, Laplace; Laplace, Kant; Kant, Konigsberg, the seven bridges of Konigsberg, theorems of topology... It was a little like that game where you have to go from sausage to Plato in five steps, by association of ideas. Let’s see: sausage, pig bristle, paintbrush, Mannerism, Idea, Plato. Easy. Even the sloppiest manuscript would bring twenty new cards for my hoard. I had a strict rule, which I think secret services follow, too: No piece of information is superior to any other. Power lies in having them all on file and then finding the connections. There are always connections; you have only to want to find them. (Eco 2001, p. 225)

The documentary approach consists in creating links, and the history of the optimized management of files resides in this approach. In documentary matters, links can be hypertextual, but they are often designed with a specific classification that allows for a more or less rigorous categorization according to a nomenclature, as Otlet and La Fontaine did, for example, using the UDC for the RBU. It is still possible to imagine other systems, in particular the Zettelkasten, a card system designed by the researcher Niklas Luhmann, which has been adapted in the Markdown note software Zettlr.

But what we note in Umberto Eco’s description is that the narrator’s system is one of potentialities, or even of circumstantial links, which brings quasi-quantum logics into the links between the cards. This is certainly a fictional ideal for telling stories, but it can only be called into question when it comes to reconstructing facts. Foucault’s Pendulum shows how easily conspiratorial unreason can be nourished by methods stemming from the organization of knowledge. We have already shown these similarities on several occasions (Le Deuff 2008, 2015).

More worrisome, finally, is the fact that this apparently rigorous method could also be that of police and intelligence services. What Umberto Eco advanced as a kind of provocation now seems to be an almost predominant logic, insofar as concomitances and similarities are used to try to prevent possible terrorist risks. In the end, however, it is sometimes difficult to distinguish the potential from the real, the fictional from the true. It is here that the “documentary” tension is situated: the triumph of an omnipresent documentality that somewhat neglects documentarity, understood as a documentary quality that rests on the quality of the information gathered and therefore requires standardized treatments, verifications, and methods for creating links between scattered elements that are not pure fantasy, not to say pure delusion. The risk of a relevant but minority report being set aside is therefore very real.

On this point, the hyperdocumentation described by Otlet also concerns the territories of the irrational, and we will return to this in more detail in Chapter 8, but it is also part of prerogatives that function as much through the mechanisms of reason as through those of the unreasonable.

From then on, the hyperdocumentation under construction casts each individual and citizen as potentially dangerous, with that dangerousness judged from the moment apparent links are detected. Worse still, the links ultimately draw on the past, and on the past actions of similarly categorized individuals, in order to envisage the future, which tends to reinforce social determinisms and racist prejudices.

Hyperdocumentation is undeniably linked to this long history of indexing, which – as we must never forget – predates the history of computer science.

5.4. A long history of indexing

The history of metadata and indexing is longer than is sometimes thought. We showed with the history of the digital humanities (Le Deuff 2018) the need to examine this history long before the development of computing and the communication devices that we currently use.

Between the instrument of designation, especially of the interesting passage to read that appears in the form of an index symbol (Sherman 2015), this small hand with the index finger that indicates to the reader where he should concentrate his attention, and the fact that we can click on a hypertext link symbolized by a hand and an index finger that suggests the place where we should click, a long story unfolds.

We have shown that this history is a source of tension, manipulation and the use of very similar methods for extremely different, even ethically opposed, purposes. Indexations appear to be present in the will to know and have remained for a long time in this tension by bringing to the individual an access to knowledge, but also to an individual recognition with related rights. However, the systems have allowed for a better knowledge of individuals and groups of individuals and have therefore facilitated control measures for the benefit of the authorities, particularly state authorities.

They are therefore essential to the constitution of human societies and to the construction of the individual, as Ronald Day rightly showed:

I argue that documentary indexing and indexicality play a major and increasing role in organizing personal and social identity and value and in reorganizing social and political life. This phenomenon has resulted in a rewriting of personal and social psychologies of the Western tradition of the past two hundred years, and it is altering notions of self and personhood, texts and textuality, and personal judgment and the role of critique in thought and politics. Today those foundations of Enlightenment thought, such as individual natural powers, freedom from surveillance, and the rights of speech, are routinely overrun and erased with the important aid of documentary systems in the service of state and corporate power and profit, in both democratic and nondemocratic states. (Day 2014, p. ix)

But Day points to a major historical risk: that of a travesty or even a betrayal of documentary logics, which were initially designed as instruments of knowledge and which, through successive improvements, have become instruments of control in their own right. While we have seen in the previous examples that switches between the two types of indexing were potentially frequent, Day seems to consider that the documentary mission is now essentially conceived and configured with a view to indexing individuals. This means that knowledge indexing tools, which have very often been a source of innovation and exploration, are now lagging behind new control and monitoring systems, as well as profiling systems for advertising and marketing strategies and attention-grabbing logics.

For example, a library catalog is significantly less effective at making suggestions and holding a reader’s attention than the recommendation systems used by Netflix or Amazon’s purchase incentive methods. The challenge is to couple the indexes of available resources with the index of user profiles in order to achieve, through algorithms, user satisfaction at almost every moment. In this regard, Drumond, Millerand and Coutant (Drumond et al. 2018) evoke the development of a form of “economy of enjoyment” that is far removed from the forms designed for access to knowledge.

However, documentary skills are ultimately found in the new economic system of the web, which unquestionably remains a documentary web, but in a sense that has gradually evolved. SEO strategies remain documentary strategies that aim to optimize content for better accessibility, but their purpose is commercialization rather than access to knowledge. E-reputation strategies are not devoid of documentary prerogatives either, except that they consist above all in turning individuals into a series of documents that are as easily accessible as possible.

The similarities that exist in documentation techniques with those of surveillance and marketing, but also those of propaganda, explain why the tension between indexing knowledge and indexing existences can be found within information professionals themselves.

5.4.1. Tension among those involved in documentation

The complexity of the thinking and actions of documentation professionals and certain famous actors in the field is a good example to understand why defenders of access to knowledge, of access to public reading, sometimes also prove to be actors of the “dark side” of indexing.

Paul Otlet himself, in his desire to index and organize everything, imagined applying classificatory logic to external domains in a 1906 article entitled “De quelques applications non bibliographiques de la classification décimale (Of some non-bibliographical applications of decimal classification)”:

But apart from the Tables of Decimal Bibliographic Classification, the very principle of decimal classification can receive many interesting applications. Its use for the establishment of the criminal record and for anthropometric reporting seems to require special attention. We are not unaware of the importance that the identification of delinquents has taken on: it allows their history to be reconstructed outside of their own allegations; it renders vain any attempt to conceal their names; it no longer makes the police dependent on a simple interrogation. We know that in France, Mr. Bertillon has attached his name to an identification system based on physical measurements and characters: for example, one measures height, arm span, length of the index finger; one notes cranial dimensions, iris color, etc. These objective bases escape the lie.

The method consists in assigning to each anthropometric individual (delinquents, conscripts, etc.) a classifying number based on the most characteristic recognitive elements of their physical person and inscribed, as it were, in their organs. One can then always find, under the same classifying number, whose training elements are invariable, all the documents (photographs, documents, reports, etc.) relating to the same individual. (Otlet 2006, p. 96, author’s translation)

We note that here, Otlet’s logic remains potentially totalitarian insofar as it consists in truly embracing all the faces of indexing, including that of existences. We note both the power of Paul Otlet’s thought and its limits. By choosing to remain within a decimal logic, he moves away from the work of Leibniz to remain in line with that of Dewey. As a result, while Otlet produces a true theory of information, he makes a pragmatic error, because the decimal system is more complex to implement than a binary system in machine calculation. We also note the belief in anthropometric forms whose racist abuses have been demonstrated, along with their scientific falsity. In Otlet’s case, one has the impression that these data are meant to facilitate the authentication of the individual.

But if Otlet’s proposal here is essentially theoretical and aims to improve police processing by working on the issue of individual identification, the same logic would develop through fingerprint files, which were gradually centralized to facilitate information sharing. Currently, the tension concerns DNA, with the police wishing to have access, potentially, to the DNA of all citizens in order to identify criminals more easily. This type of project is worrisome because it immediately casts the citizen as a potential criminal. However, our ability to resist such “files” seems less and less obvious, especially since private organizations now offer to analyze our DNA to inform us about our origins. It remains to be seen whether this type of information will be reused, both by the private organization that collected it and by other actors in whom these files arouse interest, such as health laboratories and state authorities. Not to mention the recurring risk of easier access to this data as a result of weakened cybersecurity. We know from experience that the “impermeability” of personal data files is never guaranteed in the long term.

Motivations differ: they can be ideological, commercial or something else. In every case, they show that documentary questions, and a fortiori hyperdocumentary questions, which involve cross-referencing personal data or pushing the potential of indexing as far as possible, are not to be taken lightly, and that not all the actors involved are endowed with good intentions.

Thus, some information-documentation actors took on political roles that were rather strange in view of their initial positions, and slid into forms of active collaboration. Among these names, known to documentation specialists, is Eric de Grolier. A member of the CGT (Confédération générale du travail2) and rather left-leaning in his defense of public reading and in his attempts, with his wife Georgette, to develop operations in this direction during the Popular Front, Eric de Grolier played an unsavory role during the Second World War. The Grolier couple is known for their work on classification, their reflections on documentary issues, and their willingness to develop professional training programs (Fayet-Scribe 2006). But we forget Grolier’s collaborationist career during the Vichy regime and the role of the UFOD (Union française des organismes de documentation) in this history. It is a career that he tried to minimize but that Arnaud Mercier managed to retrace (Mercier 2020).

Grolier obtained a strategic position and was then able to begin research on propaganda by studying mainly Russian, German and American works. He sought to develop a science of propaganda for the benefit of the new regime by drawing on the exemplary and centralized elements he found in the Nazi, fascist and Franco regimes. He also drew on the work of Roubakine (1922). This Russian researcher, a friend of Paul Otlet and a theorist of bibliopsychology who wished to better understand the psychological issues surrounding the reception of books, had taken refuge in Switzerland early on, during the pre-revolutionary period. Roubakine went on to produce an important body of work and was encouraged by Otlet and Ferrière to publish it in French. Roubakine feared that his work might be misused to derive, from the study of the reception of books, means and strategies for convincing and manipulating opinion more easily. In some ways, it was indeed a betrayal of this kind that Grolier committed by using such work to try to shape opinion. Grolier not only developed propaganda strategies, but also sought to rely on a documentary organization that is essential to the history of documentation in France: the UFOD, an institution to which we will return below with the role of its director Jean Gérard.

When the Liberation came, Eric de Grolier’s career was bound to suffer, as Arnaud Mercier explains:

Sylvie Fayet-Scribe told us in an interview on March 11, 2019, that Eric de Grolier had discussed with her his exclusion by Julien Cain, a surviving Jewish deportee, who was appointed director of the Bibliothèques de France et de la Lecture Publique in 1946, and who catalogued him as a ‘collaborator’, thus no longer to be seen or supported. (Mercier 2020, author’s translation)

The irony of the story is that Grolier was eventually “catalogued” in turn and, in a sense, placed on the index. Although he was not officially condemned, he was condemned unofficially. And he was not alone in this.

But it is impossible to mention the UFOD without going further, for its director, Jean Gérard, was sentenced to six months in prison for collaboration at the Liberation (Richards 1992). Jean Gérard was the architect of the 1937 World Congress of Documentation at the Maison de la Chimie. He was also a manager and a tireless worker, according to the portrait given by Danielle Fauque (2016), which shows his essential role in the international associations for chemistry, within the IOC (International Office of Chemistry), and for documentation. Jean Gérard was a complex character. While he was director of the UFOD, he was also the head of SOPRODOC (Society for Documentary Production), a company that reproduced scientific articles on microfilm. Gérard took advantage of the Vichy regime to try to impose a form of monopoly on access to scientific articles, protesting against the clandestine project developed by Joliot-Curie and Jean Wyart, which competed with his own. This was the documentary project that emerged from the recently created CNRS, founded in 1939 by Henri Laugier, which had a difficult start (Astruc and Ali 1997).

Gérard and Grolier imagined a centralized documentary strategy for access to scientific information. If Grolier seemed to accompany this strategy with moralizing values in the spirit of Pétainism, Gérard seemed to favor financial contributions.

It is impossible at this stage not to question the role of the actors of the UFOD at that time, among whom was the famous documentation theorist Suzanne Briet, co-founder with Gérard of the UFOD, an institution that was to receive the support of Julien Cain, who was its honorary president. Suzanne Briet also remains a complex character. She was at the same time feminist in her activities, but was sometimes surprisingly conservative, especially at the end of her life (“the role of women is to stay at home”), while managing to spare the services of the National Library despite a hostile and restrictive Vichy regime. She finally seems to have played the role of a “buffer” that allowed the service to continue to function even if problematic events took place there:

Although Suzanne Briet was not identified as part of the resistance network at the BN, she recounts an incident during the occupation when she arrived at work one day to find that twenty-two staff members had been arrested as communists, including her secretary and her principal librarian. She then went to the prefect of police and offered guarantees of their innocence. Although they were released, it was not possible for her to get them reintegrated into the staff until after the liberation. (Maack 2004, p. 732)

The responsibilities that Julien Cain entrusted to her seem to attest to Briet’s probity. Briet would remind us on several occasions of the importance of solidarity among librarians around the world. Somewhere in Briet’s mind and in the actions she undertook, she saw the library as a place where peace could be exercised. Even if she succeeded in doing this, and if the solidarity among librarians worked, libraries remained affected by war activities.

Paul Otlet’s Mundaneum also suffered from the arrival of the Nazis. The indexing of knowledge and the desire to catalog that we find in Otlet’s work, as well as in Briet’s catalog room at the Bibliothèque Nationale, were certainly based on a logic of solidarity. But this philosophy seems to have gradually changed as scientific and technical documentation and information centers became instruments to support research, as well as scientific leadership in the field of information. If the same drive for efficient computing services can be found everywhere in Western countries, it is above all in order to avoid falling behind in the race for innovation.

Knowledge has finally become a strategic, financial, commercial and national issue. The field of indexes has thus become a battleground that will be perfectly illustrated by that of the web index.

5.5. Between documentarity and monumentality

Where does indexing finally stop when it leaves the halls of libraries and documentation centers to concern documents in digital format?

Since the development of search engines, the size of their indexes has been increasing and their limits are constantly being pushed back. Hyperdocumentation has been well embodied in the principles of search engines like Google, which has managed to push the limits of the index in several dimensions by responding to:

  • – the question of size, with the need to have servers and robots efficient enough to increase the number of indexed sites and to ensure regular updates. In a way, this is the space-time logic of the index.
  • – the question of the diversity of documentary forms: the first robots originally indexed the HTML pages that constituted the essence of the web and its logic. But very quickly, the web hosted a diversity of formats that were not indexed by the robots. It is on this level that Google’s hyperindexing was also built up, taking care to index the non-HTML documents most present on web servers: PDF files and office files, including those from the Microsoft suite such as Excel spreadsheets, PowerPoint presentations and Word documents. These indexing possibilities have enriched the documents present in the index, but have also introduced unexpected indexing phenomena. In fact, many PDFs or office documents were put online in an insecure manner, and a simple query allowed, and sometimes still allows, access to documentation that was not supposed to be public. Queries that combine “confidential” with a file type filter such as “filetype:pdf” sometimes find unprotected documents. This extension of indexing potential has been a delight for pranksters, hackers and reverse engineers. It quickly became possible to take control of webcams remotely through targeted queries on the URL patterns automatically generated by webcams whose users had forgotten to set passwords ... not thinking for a moment that the management page of their webcam could be indexed by the search engine. The same now applies to a large number of activities that generate sharing links on social networks. In February 2020, a loophole made it possible to find invitation links to WhatsApp3 groups from Google queries. These links gave access to personal data of group members, including public figures such as journalists or politicians.

This hyperdocumentation, correlated with the power of hyperindexation, forces us to wonder whether we have not gradually moved from an invisible web, long claimed to represent 8/9 of the web, to a web that is too visible or totally transparent. We can no longer really measure its tangible limits, because generating links automatically and uncontrollably opens up almost endless potential for indexing.

It could be objected that, as opposed to hyperdocumentation, there exists a form of hypodocumentation, that is, places that are under-indexed, not in the sense of poorly indexed but in the sense of lying outside the indexing of traditional search engines. Media discourses have described these spaces as the environments of the Dark Web, a new metaphor for the inaccessible, the secret, the unsearchable and the disreputable.

Several scenarios exist here. The first is that of indexing that is deliberately ephemeral and changing, in order to avoid being spotted and finally “blacklisted”. One obviously thinks of all kinds of criminal activities. At the level of web indexing, webmasters have always been able to choose whether all or part of their site could be indexed, especially when the content is restricted to private or family communities. It was enough to include in the robots.txt file the directives telling the robot what it should do. Here, there has always been a form of doubt.

This doubt ultimately concerns the robot’s degree of obedience. If told not to follow certain pages or links, it will obey when its purpose is to ensure quality referencing and to offer content that could interest a broad public. However, when the objective is precisely to index what one would like to keep away from traditional engines, such as private documents or potentially strategic or illegal information, other tools are clearly tempted not to comply. In that case, on the contrary, the robot will be asked to index the forbidden.
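The asymmetry described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the robots.txt content, the example.org URLs and the robot name "ExampleBot" are invented, and the helper functions are hypothetical. Only `urllib.robotparser` comes from the Python standard library. The sketch shows that exclusion directives are a request that an obedient robot chooses to honor, while the same file, read by a disobedient one, lists exactly what the webmaster wished to keep out of the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /family/
Disallow: /private/album.html
"""

# Hypothetical URLs on a placeholder domain.
URLS = [
    "https://example.org/index.html",
    "https://example.org/family/holidays.html",
    "https://example.org/private/album.html",
]


def build_parser(robots_txt: str) -> RobotFileParser:
    """Load exclusion rules from text instead of fetching them over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def obedient_selection(parser: RobotFileParser, user_agent: str, urls: list[str]) -> list[str]:
    """Keep only the URLs that the rules allow this robot to fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def disallowed_paths(robots_txt: str) -> list[str]:
    """List every Disallow path: nothing technical stops a robot from reading it as a map."""
    paths = []
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    print(obedient_selection(parser, "ExampleBot", URLS))
    print(disallowed_paths(ROBOTS_TXT))
```

The file offers no enforcement of its own: protecting the listed pages still depends on authentication or on not publishing them at all.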

A few years ago, these indexing issues had a strong communicational effect on the American systems for fighting against illegal activities, particularly those based on the exploitation of humans (trade, prostitution, slavery). The famous DARPA had then launched a call for the MEMEX project, an obvious reference to the initial project of the scientist Vannevar Bush, except that we had clearly left the organization of knowledge and the treatment of scientific resources to focus on other potentialities. This episode marked once again this passage between the indexing of knowledge and the indexing of existences both by the chosen theme, but also by the type of information that was sought. In the end, it was a question of identifying the networks of people who participated in this exploitation and, to a lesser extent, people in a situation of exploitation.

5.6. Which indexation regime?

The often-legitimate criticisms of personal data and the production of registers stem from a distrust, sometimes a strong one, of government action. In this regard, several paradoxes are very often observed. Authorities, particularly political ones, are criticized for not understanding how the technologies work and, at the same time, for wanting to control citizens through massive registration strategies. Among the state records that are observed and denounced are notably:

  • – those produced by the police authorities (national police, local police, intelligence services). Several hundreds of files have been produced arbitrarily and often outside the law;
  • – those produced by social services in the broadest sense and which can be as much a control authority as a delivery authority;
  • – those that transit between private actors and the State for official purposes in the fight against various illegal activities;
  • – those that are captured, sucked up by the traces left on the spaces of the Web and from which it is possible by various treatments to infer new data;
  • – those generated after the cross-referencing of various files in a total file logic. Their coupling with biometric data and facial recognition strategies suggests a traced and controlled existence logic.

These discourses generally fear the establishment of a dictatorship made easy by such devices. Their current deployment in societies like China shows that their reality is no longer so dystopian. However, it is currently difficult to distinguish between what is potential and what is real.

There is also the question of how far we accept the presence of these monitoring technologies, as our permeability to such devices keeps growing. Worse still, we sometimes end up accepting devices that would have seemed completely aberrant ten years ago.

The acceptance of new standards constitutes a new rhetoric of informational powers, the progress of which has already been shown by several works in line with those of Michel Foucault. Moreover, it is probably a progressive extension of the informational and documentary logics that succeeded Otlet’s vision.

5.7. Should we stop indexing?

Is it necessary to give in to permanent paranoia, to try to enter into dissident or even Luddite logics? The democratic game finds its capacity to make choices called into question, all the more so as identity checks are spreading, especially in private spaces that seek to authenticate the people who exchange with each other and ultimately enter into contracts. This is the case on sites such as Airbnb, where it is now required to transmit identity documents to ensure the veracity of the transaction and to verify that there is no deception. A facial recognition system attests to your real identity by comparing your ID photo to the one taken by your webcam. Here, it is not always clear how this works exactly. But it is likely that behind the system of facial recognition by artificial intelligence and machine learning is a hardworking class of digital workers who check that the two photos match. What is worrying here is that the prerogative of establishing identity is no longer exercised by sworn state officials, but is delegated to external services whose employees are paid by the task. These systems of delegation also end up multiplying the risk of data theft.

It is likely that, in the long term, private or state services acting outside of any official legality will manage to accumulate this type of data.

How then to react and legislate in a system that is completely globalized, but for which the essence of documentation is not that of access to knowledge, but rather based on the constitution of personal files?

What should we do when our degree of acceptance ends up taking the form of resignation in the face of the difficulty of putting in place reliable long-term alternatives?

How can the potentialities of a regime of documentary surveillance be replaced by those of a regime of watchfulness?

One can only deplore the fact that discourses tend to want to evacuate the entire documentary process, notably by presenting metadata as the absolute evil, as can be seen in many discussions on social networks with reference to Edward Snowden.

Worse, it can lead to a total questioning of documentary logic by considering that metadata are power issues that can lead to death. Indeed, the May 2014 declaration “we kill people based on metadata” by General Michael Hayden, former director of the CIA and the NSA, is often used to condemn metadata definitively.

However, fighting against people’s indexing logics does not mean much since the constitution of files implies as much the recognition of rights as of control logics. So what can be said about undocumented people, who precisely need to find themselves somewhere indexed and recognized as individuals with rights, on pain of exclusion and expulsion at the border? Researcher Aurora Chang (2011) considered that she had become “hyperdocumented” because of her academic status, but that she was initially “undocumented” in her childhood as an illegal immigrant.

If metadata can be instruments for classifying, categorizing and thus excluding people, even to the point of genocide, they are also used to grant a whole range of rights. Removing them amounts to considering that there is no longer any lasting identity, no nationality, no recognition of anything. The reverse of the nightmare of hyperindexation coupled with control mechanisms is the agonizing freedom of having no means of attesting to one’s own existence or of having it recognized:

What would it be like to permanently lose access to all your data all at once? Beyond just simple informational identity theft or misplaced data records, envision a scenario of permanent personal data deletion. The prospect, when thought through, is truly frightening. What would you do if you somehow became permanently detached from all your personal data? What could you do if you somehow became permanently unrepresentable by all data systems? This is precisely what the informational person dreads most: the permanent and irreversible erasure of the entirety of their personal information and therefore their entire informational identity. No driver’s license, no passport, no bank account number, no credit report, no college transcripts, no employment contract, no medical insurance card, no health records, and, at the bottom of them all, no registered certificate of birth. The scenario is chilling: everyone around you well attached to their data while you are dataless, informationless, and as a result truly helpless. What would you make of yourself? What could others make of you? What would the bureaucracy be able to do when you petition it with your plight, given the fact that no bureaucracy can address a subject as other than their information? (Koopman 2019, p. 4)

It is therefore necessary to design all these mechanisms in a more balanced way, and not to reduce them to mere methods of “subjugation”. The question certainly deserves a wider exposure, but it does not merit intellectual reductions in its examination.

Undeniably, indexing issues are not just the territory of information professionals, just as algorithmic issues are not just the territory of computer scientists and mathematicians. Consequently, scientific issues clearly require reconciliations or collaborations that revisit the great and inefficient divides between disciplines, which is what the digital humanities are trying to do.

Flusser had put the problem in its proper context:

From the moment when scientific interest broadened to include animate things (botany, zoology), and later man himself (psychology, sociology, economics), the epistemological problem posed by mathematics, mentioned above, took on a disturbing dimension. Of course, we can quantify animate things, but each time we do so, an essential aspect escapes us through the intervals. The aspect which, precisely, differentiates the animate thing from the inanimate. The aspect of living and of life. This is why the science of the 19th century had to make a choice. Either to continue to quantify, and thus resign itself to losing the essential in the phenomenon of life. Or to develop a new, non-quantifiable theory of knowledge, and thereby resign itself to losing the dimension of accuracy. Nineteenth-century science chose both methods without excluding either of them. It was divided into “hard” (quantified) and “soft” (non-quantifiable) sciences. We suffer, until today, from this division. (Flusser 2019, p. 73, author’s translation)

It is in this renewal between the qualitative and quantitative that reflections on ethics can take place, particularly those that raise the question of data and algorithms, but especially the question of design. Indexing and design are in fact much closer than one might think. One can only deplore the fact that design studies neglect this absolutely essential approach, since the design of devices and interfaces is currently a science of designation and attention in the service of post-modern propaganda, inasmuch as it involves advertising propaganda on the one hand, and fragmented and dispersed propaganda on the other, at the political level.

The informational and communicational design is that of a designation that “instructs” in that it gives instructions, not only to the desired functionalities of the interface, but also to the users whose uses one wishes to induce, in a logic that is that of affordances.

Consequently, it is not only a question of studying the phenomena of data indexing and capture and the constitution of profiles, but also of studying the algorithms associated with this collection, as well as the devices thus constituted.

Richard Saul Wurman had announced this tension in 1999 in the preface of the book Information Design:

What a wonderful moment to have the catastrophe or the perception of the catastrophe as it effects a bankrupt educational system, come at the same moment when technology and entertainment, or what I call technotainment, and information, or what I call Information Architecture, have a raison d’être, a purpose, a reason for being, an exciting need to develop their wings, to stretch their arms, to shake their fingers and to think up things that they can only think up now because of the availability of an information technology and a network for distribution and accessibility. This is a cornucopia of the future. A future where learning the design of learning, the design of understanding becomes a major business. (Wurman 1999, p. XIII)

This informational design issue finally forces us to evolve standards and practices so that constraint is discreetly replaced by immediate pleasure. The fear of being indexed and the progressive awareness of hyperindexing mechanisms are accompanied by daily practices that demonstrate that other, much more satisfactory mechanisms encourage people to “give” of themselves and especially of their ego. Hyperdocumentation can then rely more easily on these personal documentary practices, which have never been so widespread because documentation is no longer solely about seeking to know more and to increase one’s own knowledge, but about demonstrating more and more the reality of one’s own existence.

  1. This example is authentic and personal. That day, as a librarian, I deciphered the situation and the real need that went far beyond just the need for information.
  2. A French union, historically close to the communist party.
  3. Cox, J. (2020). Google is letting people find invites to some private WhatsApp groups [Online]. Available at: https://www.vice.com/en_us/article/k7enqn/google-is-letting-people-find-invites-to-some-private-whatsapp-groups.