Chapter 12

The Meta-lexicon Representing the ASW Universe of Discourse

 

12.1. Introduction

Remember that the meta-lexicon of conceptual terms, in a manner of speaking, constitutes the “heart” of the ASW metalinguistic resources*. This is what provides the vocabulary needed to define and elaborate the descriptive models used by the analyst – via the working interface of the ASW Studio (and, more particularly, the Description Workshop) – to analyze the audiovisual texts in an archive.

As said in the previous chapter, this meta-lexicon is made up of two perfectly complementary parts:

– the first part is given over to the conceptual vocabulary which covers the ASW discourse-object, i.e. the different types of objects of analysis* that make up the ASW universe of discourse* and are likely to be described and indexed by the analyst;

– the second part is reserved for the conceptual vocabulary which covers the analytical activities* made available to the analyst to describe said objects.

In this chapter, we shall present that part of the meta-lexicon of conceptual terms reserved for identifying and denoting the types of analytical objects of the ASW universe of discourse.

Section 12.2 is given over to a few explanations as regards the relations between “conceptual term” and “theme”.

In Section 12.3, we shall again discuss the question of defining the topical structure which, as explained in Chapter 5, constitutes a central part of the thematic structure that the analyst describes and makes explicit when analyzing the content of an audiovisual text or corpus.

In section 12.4, we shall come back to the idea of an audiovisual archive’s universe of discourse which, in practical terms, is processed using a library of descriptive models peculiar to the archive in question.

Sections 12.5 and 12.6 are dedicated to an in-depth discussion of the principles of the organization of the meta-lexicon of conceptual terms which identify and represent the analytical objects of the ASW universe of discourse.

Finally, section 12.7 again briefly describes the various stages in the creation of that meta-lexicon, and of the one presented in Chapter 14, which is devoted to the identification and representation of the analytical activities in the ASW universe of discourse.

12.2. “Conceptual term” and “theme” – a few explanations

Before presenting that part of the conceptual vocabulary which represents the ASW universe of discourse, let us further specify what we mean by “conceptual term”.

The conceptual term expresses a concept or, rather, a notion, a theme. A theme is a knowledge space which enables an actor (an agent) to recognize and classify situations, objects or events, interact with them and use them appropriately in accordance with his interests, needs or desires. In that sense, the great phenomenologist and sociologist Alfred Schütz defines the theme as a typical schema or a schema of typification (Typisierungsschema, in the original German [SCH 03]). For instance, a large number of historical villages in continental Europe have a typically concentric topography with a central square, often dominated by the church, the town hall, sometimes the school, meeting places and locations for economic exchanges, etc. This arrangement (both spatial and social) constitutes a typical structure which conditions our cultural understanding of a village, of a small rural community. In a manner of speaking it provides an implicit definition – a definition gleaned from experience which enables us to classify such-and-such an agglomeration in the term village or the term European historical village.

However, as we also know, this representation can become an obstacle to our activities if we find ourselves in an agglomeration with a different spatial (and social) organization, unclassifiable and therefore incomprehensible, “chaotic”, etc. in relation to the schema – to the knowledge space – which we use instinctively, routinely to recognize and classify agglomerations and interact with them.

This little example shows that a theme, understood as a knowledge space or indeed, in the phenomenological sense, as a typical schema, always has an indexical function. This means that it always depends on a historical, social and cultural context. It may be more or less familiar to a social actor, more or less controversial, specialized, formalized, etc. Again, we refer here to Schütz’s excellent explanations [SCH 03] on the subject of the thematic structure of the social world (explanations taken up again by Habermas [HAB 81] in his theory of communicative action). The explicitation of a theme (which can always be revised) is, in this sense, a question of cultural semiotics*, or semiotic anthropology, after C. Geertz [GEE 86].

Understood thus, a theme is very similar to a model of description*. Indeed, the (English) expression “village”, used to denote our intuitive understanding of spatial agglomerations of the type [Village], is rather an abbreviation for the more appropriate linguistic expression “Historical village in continental Europe”. The abridged expression “village” is indeed useful, but dangerous: it implies a sort of pretension to universality of our implicit and culturally indexed definition of the term “village”, and thus, like so many other linguistic expressions we use on a daily basis, constitutes the potential forum for an attitude which could be classified as culture-centrist.

The conceptual term [Village], in its implicit acceptation as “historical village in Europe”, is organized – so to speak – by a set of interactions between different conceptual terms which denote the historical and geographical context relevant for our understanding of the object “village”, of its architecture, its topographical structure, its socio-demographic size, etc. Depending on individual preferences or dominant stereotypical visions, some of these elements may become more important than others; the schema itself may be adapted and may integrate new elements so as to take account of the evolution of historical villages which, for instance, are located near large metropolises.

What we can take away from this little example, again referring to Schütz and to Greimas’ work in lexical semantics [GRE 66], is that a theme in the sense of a space of knowledge (and recognition) should be apprehended in reference to a thematic configuration expressed by a selection and grouping of conceptual terms, rather than in reference to a single conceptual term taken individually (on this topic, see our remarks in Chapter 6, section 6.3). In other words, a single conceptual term only acquires a meaning in relation to other conceptual terms, with which it expresses a theme, a notion.

Thus, the metalanguage of description should not be reduced merely to the taxonomically-organized vocabulary of conceptual terms. On the contrary, as has already been shown with a whole series of examples, it relies upon the fundamental concept of the configuration (a concept which is also central in Greimas’ semiotic theory* [GRE 79]). That is, it relies on the selection of a set of conceptual terms and their positioning using specific relations such as logical relations, attributive relations, locating relations, rhetorical relations, and so on.

However, in order to be able to select conceptual terms constituting a conceptual configuration, we must already have a well-defined vocabulary of such selectable terms.

12.3. The definitional structure of a topic

The vocabulary of conceptual terms in the ASW meta-lexicon whose root term is [Object of analysis] (see Figure 11.3) serves primarily (though not exclusively!) to represent the referential domains of knowledge thematized or thematizable in the corpora making up the archives which form the experimental workshops of the ASW-HSS project1: cultural diversity (the referential domain of the Culture Crossroads Archives2), literary heritage (the referential domain of the Literature from Here and Elsewhere portal3) and archaeology (the referential domain of the Arkeonauts’ Workshop portal4).

These referential domains are dealt with using models of thematic description* (and, more specifically, topical5 description), i.e. using configurations or structures of situations, practices, actors, works, environments and surroundings, etc. – conceptual structures or configurations which the analyst, if need be, adapts to his work of analysis and specifies with information from the audiovisual text or corpus being analyzed.

Yet the notion of the universe of discourse* of an audiovisual text (of an audiovisual corpus or archive) is not limited to the referential domains of knowledge. In addition, as we have already shown many times, it includes the objects of the text and the discourse, i.e. the specificities and constraints of the instruments or tools for mediatizing a piece of knowledge, expressing it, communicating it, appropriating it and also conserving and transmitting it. As has already been explained (see Chapter 1), the audiovisual text deals with a domain of knowledge from a certain point of view and in reference to a particular cultural framework. It thematizes and expresses certain aspects of it, ranking them and developing them “in its own way” and in response to a given context of mediatization of knowledge. Thus, the conceptual terms* which enable us to represent the objects “text” and “discourse” constitute indispensable elements of the vocabulary of conceptual terms whose root is the term [Object of analysis].

Let us recall our example developed in Chapter 3 as regards the description/indexation of audiovisual texts which speak about cultural constructs (technical, dress, intangible cultures, etc.) of civilizations on the American continent in a given historical era. The (simplest) definition of the thematic structure needed to process this type of content systematically stipulates:

– on the one hand, a definition of the purely referential part of the topical structure in question;

– and on the other hand, a definition of the parts comprising discourse production around the topic, the (audiovisual) expression of the topic or the explicitation of the analyst’s “view” of the topic as it is treated in a given text.

Each part of the topical structure is defined by a term or set of conceptual terms between which specific relations are established. Thus, the purely referential part of the topical structure Cultural construct of a civilization on American soil is made up of the following selection of conceptual terms:

– [Cultural construct];

– [Civilization];

– [America];

– [Period].

These conceptual terms are positioned in relation to one another according to the following relations:

– {refers to}: {[Civilization] refers to [Cultural construct]};

– {is geographically located in}: {{[Civilization] refers to [Cultural construct]} is geographically located in [Geographical region: <America>]};

– {is chronologically located in}: {{[Civilization] refers to [Cultural construct]} is chronologically located in [Period]}.
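
The way in which the second and third relations take the first relation as their subject can be made tangible with a short sketch. The Python fragment below is purely illustrative: the names `Term`, `Relation` and `render` are assumptions introduced here and do not correspond to any component of the ASW Studio. It simply rebuilds the three relations from the four selected terms and writes them out in the bracket notation used above.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Term:
    """A conceptual term, optionally instantiated with a value (hypothetical class)."""
    name: str
    value: Optional[str] = None


@dataclass(frozen=True)
class Relation:
    """A relation whose subject is a term or another relation (hypothetical class)."""
    label: str
    subject: Union[Term, "Relation"]
    target: Term


def render(node: Union[Term, Relation]) -> str:
    """Write a term or relation in the bracket notation used in the text."""
    if isinstance(node, Term):
        if node.value is None:
            return f"[{node.name}]"
        return f"[{node.name}: <{node.value}>]"
    return "{" + f"{render(node.subject)} {node.label} {render(node.target)}" + "}"


# The three relations of the referential part, built from the four terms.
refers = Relation("refers to", Term("Civilization"), Term("Cultural construct"))
located = Relation("is geographically located in", refers,
                   Term("Geographical region", "America"))
dated = Relation("is chronologically located in", refers, Term("Period"))

for relation in (refers, located, dated):
    print(render(relation))
```

Nesting one relation inside another, rather than listing three flat pairs of terms, is what allows the location in space and in time to qualify the pairing of civilization and cultural construct as a whole.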

The conceptual relations form another part of the ASW metalanguage of description*, where they effectively constitute the library of schemas and sequences (see Chapter 16) with which the conceptual terms are positioned in relation to one another.

The interactive working forms shown in Figure 3.13.6 integrate these structures, and together they make up the definitional configuration underlying the description/indexation of the topic Cultural construct of a civilization on American soil. The analyst of an audiovisual text selects one or more conceptual terms on his working interface and gives an account of them, indexing them freely, describing them, annotating them, etc. while still respecting certain rules of use. These include, in particular, the rule which stipulates that certain conceptual terms presuppose other conceptual terms (for instance, in our case, the conceptual term [Civilization] is presupposed by the term [Cultural construct] and the instantiated conceptual term [Geographical region: <America>] is presupposed by the term [Period]; for more detailed explanations, see Chapter 3).
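
The presupposition rule just mentioned lends itself to a simple check. The following Python sketch is an illustration only: the table `PRESUPPOSES` and the function `missing_presuppositions` are hypothetical names, and the two entries merely restate the two examples given in the text, not the full rule set of the working forms.

```python
# Presupposition rules as stated in the text: each key presupposes its value.
# The dictionary layout and the function name are illustrative assumptions.
PRESUPPOSES = {
    "Cultural construct": "Civilization",
    "Period": "Geographical region: <America>",
}


def missing_presuppositions(selected):
    """Return the terms that a selection requires but does not contain."""
    chosen = set(selected)
    return sorted(
        required
        for term, required in PRESUPPOSES.items()
        if term in chosen and required not in chosen
    )


# Selecting [Period] without the geographical region leaves a gap.
print(missing_presuppositions(["Civilization", "Cultural construct", "Period"]))
# A complete selection leaves nothing missing.
print(missing_presuppositions(
    ["Civilization", "Cultural construct", "Geographical region: <America>", "Period"]
))
```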

12.4. The ASW universe of discourse

The structures or thematic configurations form the main collections of the CCA, LHE or ArkWork libraries of models for describing audiovisual content. Figure 12.1 shows an extract from the collections making up the LHE library of models of description*. As we can see, the LHE referential domain of knowledge is approached in the form of a hierarchical system of topoi called subjects. At the root of the LHE library of models of description, we find four main categories of subjects:

1. The subject “French literature” – a class of subjects which deal with various aspects of French literature (history, authors, œuvres, literary schools, etc.);

2. The subject “World literature” – a class of subjects which deal with national literatures, literature by language, works and authors, and so on;

3. The subject “Literary life” – a class of subjects which deal with literary practices, uses of literature, literary publishing, diffusion and criticism;

4. The subject “Literary research” – a class of subjects which deal, finally, with actual research devoted to literature, to the literary text, to literary history, to the reception of literature, etc.

These four classes of subjects together form the four main collections of interactive forms devoted to describing the audiovisual content of the texts making up the LHE archives. From these collections, the analyst will choose the appropriate form to describe and index his audiovisual text or corpus. Let us note in passing that neither the organization of the collections (and sub-collections) of forms for analyzing the subjects, nor the number of them, is set in stone – they can evolve in accordance with the analyst’s needs or, more generally, the goals and analytical policies of such-and-such an archive. However, changing a library of forms for analyzing particular subjects in a corpus of audiovisual texts is a painstaking and complicated task, which can have significant consequences for corpora of audiovisual texts that have already been analyzed and published.

Figure 12.1 shows that the “major” subject French literature is itself broken down into five more specialized subjects, including French literature by type. This subject is, in turn, made up of the following four subjects:

– the subject Literature by theme in the history of French literature – a form which facilitates the description of subjects relating for example to travel literature, geographic literature, fantasy literature, etc.;

– the subject Literature by diegetic type in the history of French literature – a form which enables us to touch upon subjects relating for example to novel literature, poetry, theater, etc.;

– the subject Literature by social context in the history of French literature – a form which enables us to touch upon subjects relating for example to popular literature, royal court literature, literature for young readers, etc.;

– the subject Regional literatures in the history of French literature – a form which enables us to describe subjects relating for example to literature from Brittany, Alsace, Picardy, Occitania, Corsica, etc.
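
The hierarchical organization of subjects described above can be pictured as a tree. The Python sketch below is a partial and hypothetical reconstruction: it contains only the subjects named in the text (the four other specialized subjects under French literature are not listed here and are therefore omitted), and the nested-dictionary layout and the function `paths` are assumptions made for illustration.

```python
# Partial reconstruction of the LHE subject hierarchy, limited to the subjects
# named in the text; the nested-dict layout is an illustrative assumption.
LHE_SUBJECTS = {
    "French literature": {
        "French literature by type": {
            "Literature by theme in the history of French literature": {},
            "Literature by diegetic type in the history of French literature": {},
            "Literature by social context in the history of French literature": {},
            "Regional literatures in the history of French literature": {},
        },
    },
    "World literature": {},
    "Literary life": {},
    "Literary research": {},
}


def paths(tree, prefix=()):
    """Yield the path from the root to every subject in the hierarchy."""
    for subject, children in tree.items():
        path = prefix + (subject,)
        yield path
        yield from paths(children, path)


for path in paths(LHE_SUBJECTS):
    print(" > ".join(path))
```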

Figure 12.1. Overall view of the library of models for describing audiovisual corpora forming the LHE archives

Breaking the LHE domain down in this way into a set of typical topoi, configurations or topical structures*, is a choice motivated by a “policy” which underlies the goal(s) of an archive: goals regarding empirical coverage of the domain, goals regarding the publication and diffusion of one’s audiovisual heritage, goals relating to the long-term preservation and transmission of one’s heritage.

In other words, the topoi which we chose in the context of our ASW-HSS research project, and which gave rise to the LHE archives, are of course in no way obligatory or exclusive. Other digital archives or libraries which deal with the literary domain may conceive it differently. Any attempt at analysis must still necessarily be based on structures or thematic configurations (and, more specifically, topical configurations) and therefore assume intellectual and “policy” choices. Once the choice has been made to use a topos or a set of topoi representative of a referential domain of knowledge, each topos must be explicitized and described generically in the form of a definitional configuration, i.e. in the form of a structure which defines the internal organization of the topos. We have seen a series of concrete examples of these configurations in Chapter 5, and will come back to them later on (see section 12.4).

The qualification of a topos in the form of an explicit structure or definitional configuration can only be carried out using a metalanguage – hence the crucial importance, for any metalanguage, of the ASW meta-lexicon in general and the vocabulary of conceptual terms whose root is [Object of analysis], in particular.

Together with the library of schemas and sequences (see Chapter 16) representing the relational part (i.e. the library of relations between the conceptual terms) and the thesaurus (see Chapter 15), the meta-lexicon of conceptual terms is one of the most essential components of any metalanguage of description of textual corpora – audiovisual or otherwise.

Let us now come back to the question of the referential domain of knowledge thematized (or simply thematizable) in an audiovisual corpus. All the topoi identified and qualified in the form of definitional configurations together make up the particular vision which an archive (or library) has of its domain of knowledge.

It is easy to see, here, that there remains a small degree of ambiguity in the use of the term referential domain of knowledge:

– intuitively and “pre-analytically”, this term means the given reality about which an archive or library speaks;

– explicitly and analytically, however, the term means the representation, the “vision” that an archive or library has of a given reality in the form of a set of topical structures.

Here, we are only interested in the second accepted meaning of the term “referential domain of knowledge”. However, even if two archives which deal with the same referential domain (in the first sense of the term, the intuitive and preanalytical sense) have two different visions of it, which manifest themselves in the form of two different systems of topical structures, each of the two archives may rely on the same metalanguage of description – and thus on the same meta-lexicon of conceptual terms – to create its topical structures and its library of descriptive models. In other words, an archive devoted to literary knowledge, but which does not use the same models of description as the LHE archives, may nonetheless use the ASW metalanguage to elaborate its vision of the referential domain in question in the form of a topical structure or a system of topical structures.

We can clearly see here the advantage of such a metalanguage that, among other things, not only enables us to take account of a certain degree of relativity, a certain range of visions of the same “given reality”, but can also serve as a common resource for competing design and modeling. Finally, this metalanguage also enables us (to a certain point) to ensure the translatability and interoperability of the metadata relating to the content (the subjects) of the audiovisual texts, even if they relate to working forms which belong to rival libraries of forms.

In addition, even if a referential domain of knowledge (in the second sense, see above) is peculiar to a specific audiovisual archive (or library), a specific topical structure need not necessarily be so. Thus, if two archives which deal with the “same” referential domain of knowledge (in the first sense) have a different vision of it (i.e. in the form of a divergent library of models for describing the content), a given topical structure can be used exactly as it is, or with some local modifications, in both libraries. Even more generally, a topical structure such as that shown in Figure 3.2, which defines the (very general) fact that (any) civilization refers to (any) cultural construct, may be pertinent for a whole variety of archives and libraries, even if they share very few interests and domains of knowledge (in the first sense).

However, we must not lose sight of the fact that the ASW meta-lexicon of conceptual terms of the objects of analysis ultimately represents a certain (theoretical) view of the lifeworld of the social actors and of its mediatization in the form of a discourse and a text that the analyst can use to inform his domain of expertise while adapting it to the appropriate specificities. Thus, like all ontologies, the vocabulary of conceptual terms which belong to the ASW meta-lexicon representing the analytical objects in the ASW universe of discourse, is “limited” threefold:

– the vocabulary expresses a certain view of the analytical objects which come, notably (as we shall see later on) from the social world and its mediatization;

– the vocabulary expresses that view at a certain level of generality or “granularity”;

– the vision expressed by the vocabulary is intrinsically partial.

However, the modifications to the vocabulary of conceptual terms of the analytical objects in the ASW universe of discourse* will refer to these three types of limitation and will thus become controllable. An important issue is being able to reconcile the “rigidity” of the organization of the meta-lexicon of generic conceptual terms with its compulsory adaptation to the specific expectations and requirements of the users (the concept designers and analysts) and to the world’s historical evolution.

12.5. The general organization of the vocabulary relating to analytical objects in the ASW universe of discourse

Let us now consider Figure 12.2, which shows the canonic base and the higher categories within the vocabulary of conceptual terms of the analytical objects in the ASW universe of discourse. These conceptual terms, whose root term is [Object of analysis], are identified, defined and classified to cater for the needs of analysis of varied audiovisual corpora, including those which document the domains of history and literature, archaeology and cultural diversity.

As we have already pointed out, the empirical scope of a conceptual term taken in isolation from this vocabulary goes far beyond the empirical scope of the three aforementioned domains, meaning it can be used to define models for describing audiovisual resources which have nothing to do with those domains. A distinction must therefore be drawn between the following two levels:

– the level of the model for describing* an audiovisual text or corpus;

– and the level of the conceptual terms* which make up the model of description.

Taken in isolation, a conceptual term is obviously not specific to a chosen domain of analysis; considered in relation to one or more other terms with which it forms a conceptual configuration* [STO 87; STO 93], it becomes specific and peculiar to a domain of analysis (such as that of the audiovisual corpus which documents, for example the major schools of thought in French literature or archaeological digs around the world).

Figure 12.2. Canonic base of the vocabulary of conceptual terms representing analytical objects in the ASW universe of discourse

The canonic base of the vocabulary of conceptual terms representing analytical objects in the ASW universe of discourse is made up of three conceptual terms (see Figure 12.2) which have an organizational value rather than a truly descriptive one. Because they are so very general, they are not intended to form concepts or notions to be specified and indexed during a specific task of analysis of an audiovisual text or corpus (the one exception to this rule is a content analysis form which offers the analyst the option of freely selecting, from the entire ASW vocabulary of conceptual terms defining analytical objects, those he needs in order to define his topical structure). On the other hand, these three conceptual terms are essential for the actual taxonomic structure of the meta-lexicon.

While still remaining “faithful” to our theoretical and practical framework for describing audiovisual corpora, the choice of the conceptual terms from the canonic base of the vocabulary of conceptual terms defining analytical objects in the ASW universe of discourse is based on certain formal ontologies (known as top-level ontologies), which include, specifically, the DOLCE (Descriptive Ontology for Linguistic and Cognitive Engineering) ontology6. Although the vocabulary of conceptual terms representing analytical objects in the ASW universe of discourse does not contain a “carbon copy” of the basic categories of the DOLCE ontology, as Figure 12.2 shows, certain basic conceptual terms refer directly to the work of N. Guarino and his team.

Thus (see Figure 12.2), we have adopted the distinction between conceptual terms belonging to the branch [Object “Endurant”] and those belonging to the branch [Object “Perdurant”]. This distinction, which is amply described, discussed and formalized in the existing body of specialist literature (see e.g. [MAS 03]), is “echoed” at the descriptive level in the distinction drawn in structural semiotics* between:

– objects and entities, as well as groupings of them into more complex structures;

– and processes and practices or indeed situations and states (in the sense of state of action) which persist over time.

As we shall see further on, this seemingly so categorical distinction is highly abstract, and can sometimes become difficult to handle when attempting to categorize conceptual terms which have a true descriptive value and a concrete impact on the task of analysis.

The third basic conceptual term – [Object “Region”] – is also inspired by the DOLCE ontology. Under the umbrella of the conceptual term [Object “Region”], we in fact classify all the terms which express the concept of expanses and locations of physical space, imaginary spaces or abstract spaces, as well as periods and moments in time.

Thus, the general organization of the vocabulary of conceptual terms characterizing analytical objects in the ASW universe of discourse has three main branches. “Within” these branches, we find higher-level conceptual categories, i.e. categories which organize other, more specialized, conceptual categories, richer from an intensional point of view and less broad, more circumscribed from the extensional point of view. The branch [Object “Endurant”] thus has two subbranches (Figure 12.2):

– the sub-branch [Natural object] which refers to physical (material, biological, etc.) entities;

– the sub-branch [Object of value] which refers first to physical entities of a particular functional status (i.e. which play a particular role in the life of a human or anthropomorphic agent) and secondly to entities of meaning, i.e. to entities (with a non-specified support) which form part of the culture, the horizon of meaning (to use Schütz’s term [SCH 03]) of an agent.

Also, the distinction between [Natural object] and [Object of value] is in some ways reminiscent of the distinction between “Nature” and “Culture” in Greimas’ semantic theory [GRE 70] and in Lévi-Strauss’ structural anthropology [LÉV 58] – a crucially important distinction, formulated and defined according to the constraints and peculiarities of a particular social language (i.e. a language specific to a social actor) or set of languages [WIT 03]. It also echoes the distinction drawn between the intrinsic characteristics of an object and an object’s characteristics based on an observer (a subject) in the social ontology developed by Searle [SEA 95].

That said, the two conceptual terms [Natural object] and [Object of value] do not have any kind of descriptive value either – the heuristic interest they hold lies instead in their capacity to classify and organize lower-level conceptual categories, i.e. more specialized conceptual categories.

The second branch of the vocabulary of conceptual terms characterizing analytical objects in the ASW universe of discourse has the conceptual term [Object “Perdurant”] as a taxon. In general, the terms in this category serve to describe actions and social practices, situations or states. They also serve to describe causal processes for which there is no real identifiable intentional agent, but which can occur in the natural world as well as in the social and historical world. Figure 12.2 shows the general organization of this branch in the form of a taxonomy of conceptual terms constructed around the two basic categories [Stative object] and [Process object].

The taxon [Object “Region”] initiates the third branch shown in Figure 12.2. This brings together all the conceptual terms which refer to physical (natural or social) places or expanses, to moments and periods in time or indeed to abstract (imaginary, mathematical, etc.) regions. As Figure 12.2 shows, the taxonomy of conceptual terms in this branch develops around the following two basic categories: [Object “Spatial region”] and [Object “Temporal region”].
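
The three branches and their basic categories, as named in this section, can be summarized in a small taxonomy. The Python sketch below is illustrative and partial: it records only the terms mentioned in the text, it assumes that the three canonical terms sit directly under the root [Object of analysis], and the names `PARENT`, `lineage` and `is_a` are hypothetical.

```python
# Child -> parent links for the higher categories named in the text.
# Assumption: the three canonical terms are direct children of the root.
PARENT = {
    'Object "Endurant"': "Object of analysis",
    'Object "Perdurant"': "Object of analysis",
    'Object "Region"': "Object of analysis",
    "Natural object": 'Object "Endurant"',
    "Object of value": 'Object "Endurant"',
    "Stative object": 'Object "Perdurant"',
    "Process object": 'Object "Perdurant"',
    'Object "Spatial region"': 'Object "Region"',
    'Object "Temporal region"': 'Object "Region"',
}


def lineage(term):
    """Return the chain of terms from `term` up to the root of the taxonomy."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def is_a(term, category):
    """True if `category` is `term` itself or one of its ancestors."""
    return category in lineage(term)


print(" < ".join(lineage("Process object")))
print(is_a("Natural object", 'Object "Region"'))
```

Storing only the link from each term to its parent keeps the organizing categories separate from the more specialized terms classified beneath them, which could be added as further entries without touching the canonical base.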

In the next chapter (Chapter 13), we shall look at certain parts of the meta-lexicon characterizing analytical objects in greater detail.

12.6. Questions relating to the organization of the ASW vocabulary of conceptual terms representing analytical objects

Just like any piece of work intended to put a metalanguage of description in place, this task is dictated by its object (i.e. the corpora forming the experimentation workshops of the ASW-HSS project) and its objectives (to analyze the content of audiovisual texts from the aforementioned workshops and use these analyses in the service of a greater diffusion of the said content).

Thus, the vocabulary of conceptual terms representing analytical objects in the ASW universe of discourse* is, of course, incomplete: some of its parts appear to be solidly developed; others less so. This is the case, for instance, of the domains of knowledge relating to psychology, law and economics – domains which are not well represented in our working corpora, although they do of course form part of the ASW universe of discourse*, i.e. the universe of discourse which concerns the lifeworld of social actors.

Finally, the granularity (the level of descriptive precision) is calibrated in relation to our project’s declared object and objectives. However, there is nothing to say that it must be pertinent (in its raw state) for other projects. For instance, an analytical project which concerns only one “sub-domain” in archaeology (i.e. the archaeology of such-and-such an era, such-and-such an approach to archaeology, etc.) will probably need more specialized conceptual terms than those currently available.

Alternatively, let us take the example of the FMSH-ARA archives7, which contain audiovisual corpora on the major processes of the modern world (industrialization, economic development, migration, globalization etc.), socioeconomic situations (employment, quality of life, wealth and poverty in the world, etc.), conflict situations (revolts, wars etc.), situations of oppression and denial of others (marginalization, genocides, etc.), representations and ideologies (nationalism, communitarianism, racism, etc.), and so on. In order to give an account of such subjects*, the ASW vocabulary of conceptual terms had to be adapted, even very recently.8 Given the newness but also the importance of this workshop in the preservation and exploitation of scientific heritage, it will be the object of a dedicated later publication.

At this point, it becomes clear that the ASW meta-lexicon of conceptual terms must evolve in order to conform to the particular needs and requirements of specific projects of analysis. In order to discuss this issue, we must distinguish at least two aspects:

– the enrichment of the taxonomic parts specific to the meta-lexicon of conceptual terms (this process presupposes the distinction between shared, nonmodifiable taxonomical modules and modules added to them);

– the diversification of the bridges between the conceptual terms belonging to the ASW meta-lexicon and metalinguistic resources external to the ASW system – metalinguistic resources such as thesauruses, indexing languages, terminologies, ontologies or norms and standards (see our remarks on this topic in Chapter 11).

Let us return to the three terms forming the canonic base of the ASW conceptual vocabulary of analytical objects. These are not the result of a simple preliminary choice in the sense of opting for this-or-that higher-level ontology. As has already been mentioned, we chose them, in fine, in reference to the top-level DOLCE ontology. However, this choice was made only relatively late in our organization of the meta-lexicon. It was preceded by a classification of the conceptual terms chosen to produce models for analyzing audiovisual corpora and by various attempts to define a canonic base and higher-level categories of classification.

In summary, the organization of the said vocabulary as shown in Figure 12.2 is the result of a twofold approach:

– on the one hand, a “lexical” or “terminological” approach, entailing the semantic reconstruction of series of conceptual terms identified during the previous phases of analysis of audiovisual corpora, comparative research carried out on existing metalinguistic resources (such as thesauruses, terminologies and ontologies)9 or indeed on lexical resources provided by programs such as the WordNet project at Princeton University10;

– on the other hand, an “investigation” of concepts (or conceptual categories), which were sufficiently general, explicitly defined, “philosophically sound” and which were already being used as the bases for other attempts to develop a metalanguage of description.

The ASW vocabulary of conceptual terms is a descriptive ontology or, to use B. Bachimont’s expression [BAC 05]11, an ontology in the epistemological sense (as opposed to a formal or categorial ontology), based on the concrete analysis of a corpus of audiovisual texts documenting a domain of knowledge (on this subject, see our remarks in Chapter 1 and also [STO 89; STO 96; STO 98]).

The fundamental concepts and the canonic organization of this vocabulary, however, reflect our desire to make it fairly similar to existing formal (categorial) ontologies12 and ultimately to transform it into a formal ontology in the precise sense that it is not dependent on a circumscribed empirical domain of knowledge.

As Bruno Bachimont [BAC 05] rightly points out, a formal ontology must be set apart from a formalized ontology (in the mathematical or logical sense of the term). Thus, the ASW meta-lexicon constitutes an ontology (or rather, part of one) which is formal but not formalized, and can be used for concrete descriptive tasks (at a certain level of granularity) on audiovisual corpora documenting diverse empirical domains of knowledge which stem from the lifeworld (social, historical, cultural, natural). As previously explained, the corpora of analyzed audiovisual texts document three specific domains of expertise: cultural diversity, literary heritage and archaeology. However, the conceptual terms making up the vocabulary in question lend themselves more or less easily to the definition and elaboration of models for analyzing audiovisual corpora which are not directly linked to these three domains. Yet they lend themselves less well to the analysis of filmic objects which have poetic and aesthetic pretensions, as is notably the case with fictional audiovisual works.

There are no more than 1,100 conceptual terms in the set making up the meta-lexicon which serves to identify and denote the objects of analysis in the ASW universe of discourse.13 These are grouped into some 85 taxonomical domains (for further explanations, see the following chapter, section 13.3). Each taxonomical domain contains two functionally different categories of terms:

– a first category of conceptual terms which primarily serve to organize the taxonomic domain;

– and a second category of conceptual terms which essentially serve to identify the type of knowledge objects which are thematizable in the audiovisual discourse of an audiovisual text or corpus being analyzed.

To these first two categories of conceptual terms we have to add a third category, whose main function is to position and hierarchize the taxonomical domains themselves.

This accounts for the fact that of the 1,100 conceptual terms making up the current version (late 2011) of this meta-lexicon, barely half (i.e. between 500 and 550 conceptual terms) are truly pertinent in the construction of the descriptive models, i.e. of the models the analyst uses in order to process an audiovisual corpus. The remaining conceptual terms fulfill an organizational function, helping to structure the meta-lexicon or a specific taxonomic domain, rather than a genuinely descriptive function.
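The functional distinction between the three categories of conceptual terms can be pictured as a small data structure. What follows is only a minimal sketch under stated assumptions, not the way the ASW Studio actually stores its meta-lexicon: the class names, the role names, the domain name and the term labels are all hypothetical and chosen purely for illustration (Python 3.9 or later is assumed).

```python
from dataclasses import dataclass, field
from enum import Enum


class TermRole(Enum):
    """Functional role of a conceptual term (role names are illustrative)."""
    DOMAIN_ORGANIZER = "organizes a taxonomic domain"
    DESCRIPTIVE = "identifies a thematizable knowledge object"
    DOMAIN_POSITIONER = "positions and ranks the taxonomic domains themselves"


@dataclass
class ConceptualTerm:
    label: str
    role: TermRole


@dataclass
class TaxonomicDomain:
    name: str
    terms: list[ConceptualTerm] = field(default_factory=list)

    def descriptive_terms(self) -> list[ConceptualTerm]:
        """Return only the terms usable in descriptive models."""
        return [t for t in self.terms if t.role is TermRole.DESCRIPTIVE]


# Hypothetical domain: the name and labels below are invented examples.
domain = TaxonomicDomain(
    name="Archaeological site",
    terms=[
        ConceptualTerm("Type of site", TermRole.DOMAIN_ORGANIZER),
        ConceptualTerm("Excavation site", TermRole.DESCRIPTIVE),
        ConceptualTerm("Burial site", TermRole.DESCRIPTIVE),
    ],
)

descriptive_labels = [t.label for t in domain.descriptive_terms()]
print(descriptive_labels)
```

In such a representation, filtering on the role is what separates the terms that are "truly pertinent" for building descriptive models from those which merely structure the meta-lexicon.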

Today, the meta-lexicon of conceptual terms representing analytical objects in the ASW universe of discourse includes:

– some 30 conceptual terms from the branch [Object “Spatial region”] which can be used to describe places, expanses, natural topography, regions and territories, etc. For describing the audiovisual corpora which make up the archives of the ASW-HSS project14, we currently use no more than fifteen of these conceptual terms;

– some 25 conceptual terms from the branch [Object “Temporal region”] which can be used to describe periods, dates or events. For the moment, only about 18 conceptual terms from this group are really used in the models of description;

– some 75 conceptual terms from the branch [Object “Perdurant”] which enable us to describe activities, intentional or causal processes, social practices and states or situations (natural, social, etc.). We currently use around 40 conceptual terms from this group;

– some 400 conceptual terms from the branch [Object “Endurant”] – this branch is by far the most important in our meta-lexicon. The conceptual terms in this branch enable us to describe living or non-living entities, artifacts and products, individual and collective actors, all sorts of expressions, schools of thought, belief systems, etc. Again, currently, no more than half the available conceptual terms are actually used.
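The approximate figures quoted for the four branches can be cross-checked against the overall estimate of 500 to 550 descriptive terms. The short calculation below is only a sketch: the counts are the rounded values given in the text, and the figure of 200 terms in use for the “Endurant” branch is an assumption, taken as the upper bound of “no more than half”.

```python
# Approximate figures quoted in the text (late 2011): (available, in use).
branches = {
    "Spatial region": (30, 15),
    "Temporal region": (25, 18),
    "Perdurant": (75, 40),
    "Endurant": (400, 200),  # assumption: "no more than half" read as 200
}

total_available = sum(available for available, _ in branches.values())
total_in_use = sum(in_use for _, in_use in branches.values())

print(total_available)  # 530
print(total_in_use)     # 273
```

The four branches thus account for roughly 530 conceptual terms, which is consistent with the range of 500 to 550 descriptive terms, and on these figures at most about half of them are currently used in the models of description.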

12.7. The process of developing the ASW vocabulary of conceptual terms defining analytical objects

Let us return to our project to analyze the three main audiovisual corpora documenting specific cultural and scientific heritage (cultural diversity, literature and archaeology). The goal of this project was primarily to identify series of subjects – of “themes” – treated recurrently in the aforementioned corpora. Let us take a simple and fairly obvious example. In the filmed discourse of archaeologists15 talking about their research projects, among the most commonly recurring subjects we find the presentation and description of the archaeological sites they are working on, where they organize and conduct digs with a view to uncovering and safeguarding objects of archaeological interest, or indeed aimed at confirming or invalidating this-or-that hypothesis, this-or-that theory. In the corpus of filmed discourses about French literature16, the recurrent subjects are, e.g. the presentation, description and contextualization (historical, intertextual, etc.) of the work of such-and-such an author; discussion and presentation of a literary school of thought in a given historical period; discussion of a literary theme or motif; exploration of the relationships between the different arts and the historical evolution of the arts; discussion of different approaches for treating the literary object, and so on.

Of course, the identification of such series of subjects (or topoi) is not done “blindly”. It is always orientated by a methodological framework which guides the analyses, and by a certain demand (social, lato sensu). In our particular case, the identification and description of series of subjects or themes developed in the chosen audiovisual corpora was carried out using analytical scripts, whose internal structure and use have been extensively discussed in another publication by the author devoted to the analysis of audiovisual documents [STO 03].

In addition, the analysis of audiovisual corpora using such working scripts was carried out in collaboration with groups of interested parties (stakeholders) who expressed a particular expectation as regards the analysis of these corpora: teachers of French literature; trainers and teachers in intercultural communication; young researchers specializing in the collection and preservation of intangible cultural heritage; professional archaeologists in charge of preserving the tangible heritage of a French département; an international network of researchers concerned with enriching a video-library dedicated to documenting a geopolitical region, and so on.

An important activity with these groups of people was, of course, identifying and ranking the subjects (or themes) which were most pertinent, most important for a specific stakeholder. In summary, it was a question of carrying out an analysis of the need for information or knowledge in relation to or in conjunction with the stakeholders in question.

In a second stage, the scripts describing the themes identified were subjected to a comparative analysis. Comparing the scripts created by the analysts enabled us to identify – in reference to the theoretical framework briefly outlined in Chapter 1 – and describe the most commonly recurring trends (thematic, discursive, relating to visual mise en scène), and classify them into semantically-homogeneous groups, using them to define types of elements, i.e. conceptual terms expressing “knowledge spaces”, topoi relating to this-or-that type of object analyzed. As we well know, this is a delicate task which necessarily relies upon a sort of principle of constant cognitive revision due to the intrinsic limits and to the “subjectivity” inherent in any categorization and classification.

A third stage in the construction of the ASW meta-lexicon of conceptual terms consisted of grouping the conceptual terms. This task of grouping covers three points which are mutually complementary but clearly distinct:

– grouping the conceptual terms to reveal more and more general types of conceptual terms;

– grouping the conceptual terms making up a specific taxonomic domain (see below, section 13.3);

– identifying the conceptual terms which form the basis or indeed the canonic base for the vocabulary of conceptual terms representing analytical objects in the ASW universe of discourse*.

As regards specifically the conceptual terms which should form the canonic base of the conceptual vocabulary, one of the main concerns was to evaluate them in relation to, and compare them with, pre-existing conceptual categories. In our case, this refers particularly to approved categories which are formally defined in the so-called upper-level or top-level ontologies.

That said, in relation to the upper-level ontologies, we always contented ourselves with the role of a critical user of the conceptual categories defined in these ontologies, without either wishing to or being able to take part in the highly abstract and formal debates between specialists in the matter. The important thing for us was – and still is – that using the categories defined in the upper-level ontologies enables us to impose a certain structure on the terms or groups of conceptual terms previously identified and grouped on the basis of semantic criteria, and to ensure that the meta-lexicon in its entirety remains interoperable with the terminologies and other ontologies which adopt the categories defined by today’s main upper-level ontologies.

1 Official website of the project: http://www.asa-shs.fr/; research log: http://asashs.hypotheses.org/.

2 http://semiolive.ext.msh-paris.fr/arc/.

3 See: http://semiolive.ext.msh-paris.fr/alia/.

4 http://semiolive.ext.msh-paris.fr/ada/.

5 On the topic of the correlation between “thematic configuration” and “topical configuration” stricto sensu, see the explanations given in section 5.3 and Figure 5.1.

6 The DOLCE ontology was developed by the Laboratory for Applied Ontology at the CNR (Consiglio Nazionale delle Ricerche) in Trento, under the directorship of Nicola Guarino; for further information, see the laboratory’s website: http://www.loa-cnr.it/index.html.

7 An early prototype can be consulted on the Web portal of the ASW experimental workshops: http://semiolive.ext.msh-paris.fr/asa-shs/.

8 That is, at the end of October 2011.

9 For further information on this subject, see the “Documentation en ligne” (Online Documentation) section on the Website of the ASW, where there is a selection of documents on this subject available for consultation: http://www.asa-shs.fr/.

10 http://wordnet.princeton.edu/.

11 Also see the online presentation of B. Bachimont’s lecture: http://www.spim.jussieu.fr/doc/ontologies/Bachimont-SticSante-08122005.pdf.

12 In addition to DOLCE, we also referred to the following ontologies: BFO (Basic Formal Ontology; see http://www.ifomis.org/bfo/); SUMO (Suggested Upper Merged Ontology; see http://www.ontologyportal.org/); OCHRE (Online Cultural Heritage Research Environment) Core Ontology (http://ochre.lib.uchicago.edu/index_files/Page845.htm), which is more specialized in questions relating to cultural heritage; the Conceptual Reference Model (CRM) from CIDOC (Comité International pour la Documentation de l’ICOM; see http://www.cidoc-crm.org/scope.html) and GOLD (General Ontology for Linguistic Description) devoted to research in linguistics (http://linguistics-ontology.org/).

13 That is, in the current version (last updated at the end of November 2011).

14 Remember that this project comprises first the three archives, CCA, LHE and ArkWork (which constitute the main workshops of the experimental activities for this project) and then the FMSH-ARA archives (devoted to the scientific heritage of the Fondation Maison des Sciences de l’Homme in Paris) and AICH (dedicated to Andean intangible cultural heritage). For further information, see the experimental portal: http://semiolive.ext.msh-paris.fr/asa-shs/, and the research log of the ASW-HSS project: http://asashs.hypotheses.org/.

15 This is an audiovisual corpus which belongs to the ArkWork (Arkeonauts’ Workshop) archives: http://semiolive.ext.msh-paris.fr/ada/.

16 This is an audiovisual corpus which belongs to the LHE (Literature from Here and Elsewhere) archives: http://semiolive.ext.msh-paris.fr/alia/.
