Chapter 11

An Overview of the ASW Metalinguistic Resources

 

11.1. Introduction

The possibility of defining, developing and adapting a model of description* to the requirements of the most varied of analysts and communities of analysts is offered to us by an integrated set of metalinguistic resources which constitute the metalanguage of description of the ASW universe of discourse. Using this metalanguage, analysts can process their audiovisual objects and corpora in order to turn them into intellectual resources sui generis for specific audiences and use contexts.

In this chapter, we shall give an overall presentation of the ASW system of metalinguistic resources. Chapters 12–15 will be given over to a more detailed discussion of the various parts of this metalinguistic system.

Section 11.2 is given over to a summary presentation of the ASW system of metalinguistic resources. We shall explain the main relationships between these metalinguistic resources and the working interface (of the ASW Studio).

Section 11.3 gives a brief introduction to the ASW meta-lexicon of conceptual terms, made up of two complementary conceptual vocabularies, one dedicated to the representation of the objects of analysis* and the other to the activities of analysis*.

In section 11.4, we shall give a summary presentation of another type of metalinguistic resource – the ASW thesaurus. As we saw in the previous chapters, the ASW thesaurus plays a crucial role in the procedure of controlled description.

Sections 11.5 and 11.6 are given over to a presentation of the configurational building blocks* which make up the models of description*. As has also been explained in the previous chapters, we distinguish between two categories of such building blocks: the sequences of description* and the schemas of definition*.

In section 11.7, we shall touch on the question of metalinguistic resources beyond those making up the ASW metalinguistic system.

Finally, in section 11.8, we shall give an (again brief) presentation of the working environment, the ASW Modeling Workshop, which is used to define and manage, on the one hand, the generic metalanguage of the ASW universe of discourse* and, on the other, all the metalanguages of domain* peculiar to the universe of discourse of a particular audiovisual archive.

11.2. General overview of the ASW system of metalinguistic resources

Figure 11.1 shows the different parts which make up the ASW system of metalinguistic resources. To begin with, we distinguish the Library of models of description. A library of models of description comprises at least one, but generally numerous specific models of description* which represent the vision of the universe of discourse* of a given audiovisual archive. The vision of the universe of discourse of an archive may change depending on the interests, the objectives, or simply depending on the function of its textual object. Thus, a library of models may undergo changes. However, the possible consequences such changes might have for already-conducted analyses must be carefully evaluated.

A library of descriptive models is made up of models specialized in the analysis of a specific part or aspect of the textual object as it is apprehended from a semiotic perspective (see section 1.6). Thus, as explained in Part 1 of this book, we distinguish between models which serve to carry out:

– the description of an actual analysis (that is, the task of meta-description*);

– paratextual description* (i.e. description of the formal identity of the audiovisual text from a perspective entirely comparable with that given by the Dublin Core standard);

– audiovisual description* stricto sensu (i.e. description of the visual, acoustic and/or audiovisual plans of the audiovisual text);

– thematic description* (of the subjects developed in the audiovisual text);

– pragmatic description* (i.e. revealing the identity, the specificity of the audiovisual text to the culture and expectations of an audience and to the specific requirements of the context in which it is to be used);

– translation/adaptation per se, which we also classify under the label pragmatic description* [STO 08; STO 11a] and which aims at overcoming the language barrier in the reception and appropriation of the audiovisual object.

Figure 11.1. Overview of the ASW metalinguistic resources


These different types of models of description, as we know, make up the working interface of the Description Workshop in the ASW studio, developed by our research group, ESCoM, as part of the ASW-HSS research project (see [STO 11a]).

The bottom half of Figure 11.1 shows the different parts of the ASW system of metalinguistic resources, which enables a specific model of description or a library of such models to be constructed. We can distinguish two mutually complementary groups of resources:

1. the set of metalinguistic resources which belong to the ASW system;

2. and the set of resources which are external to it but which are placed in relation with it.

The set of metalinguistic resources which belong to the ASW system includes three more specific and functionally different categories of resources:

– a category of lexical resources made up, on the one hand, of a hierarchical meta-lexicon of generic conceptual terms, and on the other, a controlled vocabulary which is the thesaurus of the ASW system;

– a category of structural or configurational resources which select and position the generic terms and/or the terms from the thesaurus in relation to one another in accordance with the specificities of a given domain of expertise and the requirements of the analysis;

– a category of resources – called schemas of indexing – which enable a description to be realized, to be carried out.

Figure 11.1 also identifies a set of metalinguistic resources which are external to the ASW system. The first of these sets is the data generated by the analysts themselves, using the ASW models of description to process (describe, index, annotate, etc.) their audiovisual corpora. These data from the analysts constitute a database of semio-linguistic expressions (verbal but also iconic, acoustic, etc.) which can serve as a reference point for new analyses of the same audiovisual text.

A second category of metalinguistic resources external to the ASW system is made up of standards, ontologies, thesauruses and other terminologies with which we could build, or have already built, correspondences or bridges. These correspondences or bridges serve to render the results of specific analyses carried out using the ASW Studio interoperable (as far as possible) with those carried out in reference to other metalinguistic resources (ontologies, thesauruses, etc.).

Figure 11.2. General view of the relations between – on the one hand – the working interface and the model of description and – on the other hand – the model of description and the metalinguistic components of a model of description


Let us now take a closer look at Figure 11.2, which shows a general overview of the main relations which exist between the different elements of the ASW metalinguistic system. Particularly noteworthy are:

– the relationship between the analyst’s working interface (in this case, the Description Workshop in the ASW Studio) and the models underlying that interface, which define it (in our case, the models discussed in this book, namely those reserved for describing the subjects developed in an audiovisual text or corpus). A model of description* is viewed in the form of specialized regions and zones which make up the physical and formal organization of the interface (for an in-depth discussion of the analysis and the design of digital interfaces, see [STO 05]);

– the relationship between a model of description (here, description of the content of an audiovisual text or corpus) and its components which are, firstly, functionally specialized sequences* (description of the domain of knowledge, description of the discourse production, etc.) and secondly the schemas* of definition (i.e. the schemas defining the object of analysis and the schemas defining the procedure of analysis). The schemas of description configure and define the sequences of description of a model. A schema may be included in only one, several or in all sequences defining a library of models of description (this is the case, for instance, of the schemas defining a procedure of free or controlled description); and finally,

– the relationship between the parts making up a model of description, i.e. the schemas and sequences, and their component conceptual terms*, either generic or already referenced (in the thesaurus*), and the relationships enabling us to correctly position the relevant conceptual terms in the form of bona fide schemas of description or in the guise of parts of schemas of description. As regards the conceptual terms, we distinguish between two main categories: that which represents the ASW discourse-object*, i.e. the objects of the ASW universe of discourse which can be analyzed (see Chapters 12 and 13) and that which represents the activities of analyzing the object of the discourse (see Chapter 14).

11.3. The ASW meta-lexicon of conceptual terms

The “heart” of the metalinguistic resources of the ASW metalinguistic system is made up of a hierarchically-organized lexicon of concepts or rather of conceptual terms, i.e. linguistic expressions of concepts. Together, these conceptual terms make up the vocabulary of the ASW universe of discourse. In other words, the conceptual terms are the “words” or “expressions” entered and defined by the concept designer and used by the analyst in order to speak and communicate about a domain of knowledge documented by corpora of texts (audiovisual, etc.). The term “metalinguistic”, as used in this context, refers to a language constructed with a view to processing (describing, indexing, etc.) a given textual object – that is, in our case, with a view to analyzing audiovisual corpora.

Figure 11.3. The two parts making up the ASW meta-lexicon of conceptual terms – the conceptual vocabulary relating to the object of analysis and that relating to the procedure of analysis


Remember: every ASW model of description* can be reduced, in fine, to two types of conceptual configurations – the schemas of the objects of analysis, on the one hand, and the schemas of analytical or descriptive activities on the other hand. Consequently, the ASW meta-lexicon of conceptual terms is made up of two mutually complementary sets of conceptual terms (see Figure 11.3):

1. the first set of the meta-lexicon: the vocabulary of conceptual terms whose root term is [Object of analysis];

2. the second set of the meta-lexicon: the vocabulary of conceptual terms whose root term is [Procedure of analysis].

In the previous chapters, we distinguished between several large categories of objects for analyzing an audiovisual text or corpus:

– the objects which make up the domain of reference (i.e. the domain which the audiovisual text being analyzed “speaks about”);

– the objects which contextualize the former in time and space;

– truly discursive objects (which therefore enable us to understand how the text being analyzed speaks about its subject);

– visual and sound objects (enabling us to understand the expression, the audiovisual mise en scène of an object in the text being analyzed); and, finally,

– the so-called reflexive objects which serve to explicitize the content and the objectives of an analysis itself.

These various categories of objects of analysis play a particular role, remember, in the definition and development of models for describing the content of audiovisual corpora. However, the ASW Studio (that is, its Description Workshop) is equipped with yet more families of models which serve the paratextual description of an audiovisual text or corpus, its translation and adaptation for a given audience, etc. These models use categories of objects of analysis which we do not use for describing the content – for instance, categories of analytical objects which serve to identify an audiovisual text or corpus, to explicitize their cognitive or intellectual specificity, or to explicitize their uses.

All these categories of analytical objects (whether or not they are used in the construction of models aimed more specifically at analyzing the content of an audiovisual text or corpus) form the referential domain of the vocabulary belonging to the first part of the ASW meta-lexicon of conceptual terms, i.e. the vocabulary whose basic conceptual term is [Object of analysis] of the ASW discourse-object*.

Figure 11.3 shows that this vocabulary of conceptual terms can be broken down into three branches whose basic conceptual terms are: [Object “Endurant”], [Object “Perdurant”] and [Object “Region”]. Together, they form the canonic basis of the vocabulary of the ASW meta-lexicon relating to the domains of knowledge thematized (or thematizable) in the ASW universe of discourse. We shall offer explanations and a detailed presentation of these in Chapters 12 and 13.

The interface showing the ASW meta-lexicon of conceptual terms in Figure 11.3 is that of a tool for designing and developing metalinguistic resources, called OntoEditor (see below, section 11.8).

In the previous chapters, we have given an extensive discussion of a whole series of specific analytical or descriptive activities which make up the procedures of free and/or controlled description of an object or set of objects of analysis. These activities, along with all those which we need to carry out the other tasks of analysis or translation-adaptation, form the referential domain of the second part of the ASW meta-lexicon of conceptual terms, i.e. the vocabulary based on the conceptual term [Analytical procedure].

As Figure 11.3 shows, the vocabulary of conceptual terms denoting the analytical procedures can, in turn, be broken down into four more specialized branches whose root terms are as follows:

– [Procedure of structural analysis of the textual object];

– [Procedure of analysis of the textual object using the ASW thesaurus];

– [Procedure of analysis of the textual object using a reference external to the ASW environment];

– [Procedure of pragmatic analysis of the textual object].

Together, these four branches form the canonic basis of analytical activities in the ASW universe of discourse. We shall discuss them in greater detail in Chapter 14.
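As a purely illustrative aid, the two vocabularies and their branches can be pictured as a small tree. The Python sketch below is not the OntoEditor data model: the class `ConceptualTerm`, the helper `path_to` and the umbrella node labelled "ASW meta-lexicon" are hypothetical names introduced here, and only the branch labels come from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConceptualTerm:
    """A node in a hierarchical lexicon of conceptual terms (hypothetical model)."""
    label: str
    children: List["ConceptualTerm"] = field(default_factory=list)

    def path_to(self, label: str) -> Optional[List[str]]:
        """Return the labels from this term down to `label`, or None if absent."""
        if self.label == label:
            return [self.label]
        for child in self.children:
            sub_path = child.path_to(label)
            if sub_path is not None:
                return [self.label] + sub_path
        return None


# First vocabulary: the three branches named in the chapter.
object_of_analysis = ConceptualTerm("Object of analysis", [
    ConceptualTerm('Object "Endurant"'),
    ConceptualTerm('Object "Perdurant"'),
    ConceptualTerm('Object "Region"'),
])

# Second vocabulary: the four branches named in the chapter.
procedure_of_analysis = ConceptualTerm("Procedure of analysis", [
    ConceptualTerm("Procedure of structural analysis of the textual object"),
    ConceptualTerm("Procedure of analysis of the textual object using the ASW thesaurus"),
    ConceptualTerm("Procedure of analysis of the textual object using a reference "
                   "external to the ASW environment"),
    ConceptualTerm("Procedure of pragmatic analysis of the textual object"),
])

# Assumption: a single umbrella node gathers the two vocabularies.
meta_lexicon = ConceptualTerm("ASW meta-lexicon",
                              [object_of_analysis, procedure_of_analysis])

print(meta_lexicon.path_to('Object "Region"'))
```

A lookup such as `path_to` merely shows how the position of a term in the hierarchy could be recovered; the detailed content of each branch is the subject of Chapters 12 to 14.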

Thus, together, the two vocabularies [Object of analysis] and [Procedure of analysis] make up the meta-lexicon of conceptual terms of the ASW universe of discourse. By “ASW universe of discourse”, we mean a knowledge space which possesses its own structure, its own rules, its own “grammar”. This is not the only universe of discourse, of course – there may be any number of such spaces. An important goal is to render the ASW environment sufficiently general and open so that it can accommodate the universes of discourse of archives other than those examined here.

11.4. The ASW thesaurus

The ASW thesaurus is a controlled vocabulary of standardized terms and linguistic expressions which form the predefined values (instances or referents) of the generic conceptual terms in the ASW meta-lexicon. As we know, the ASW thesaurus is made up of a set of facets and (hierarchical) lists of standardized expressions (descriptors), each of which belongs to one or more facets. A facet, for its part, forms a semantic axis or a dimension of the content (i.e. of the meaning in the structuralist sense of the term) of a conceptual term.

A standardized expression of a facet thus represents a possible value, an instance or a referent of the conceptual term which has the same meaning as that facet. For instance, one of the facets of the conceptual term [Country] is All the countries of the world at the start of the 21st Century. This facet is made up of the list of all countries recognized by the international community (not only by the UN). Thus, in addition to the 194 countries officially recognized by the UN, the ASW microthesaurus (i.e. facet + hierarchical list of standardized expressions) All the countries of the world at the start of the 21st Century also contains territories such as the island of Taiwan, the Cook Islands, Abkhazia, Palestine, etc. (Figure 11.4).

Figure 11.4. The standardized expressions denoting the countries of the world and making up the facet “All the countries of the world (start of 2000)”


If the inclusion of countries not officially recognized by the UN poses a problem for a particular community of analysts, it can of course be replaced with a list of the 194 countries recognized by the UN, this time brought together under the facet The 194 countries officially recognized by the UN.

For a concrete description, we can use either of the two facets, or indeed a combination of the two (a combination of the two facets, in this particular case, means a micro-thesaurus which is formally identical to one organized on the basis of the facet All the countries of the world at the start of the 21st Century but in which those territories not officially recognized by the UN are marked as such).
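The combination of the two facets described above can be sketched as follows. This is only an illustration under stated assumptions: the two lists are short excerpts (the real facets hold full country lists), and the function name `combine_facets` is hypothetical rather than part of the ASW tools.

```python
from typing import Dict, List

# Excerpts only, limited to names that appear in this chapter.
all_countries: List[str] = ["Abkhazia", "Argentina", "Cook Islands", "Palestine", "Taiwan"]
un_recognized: List[str] = ["Argentina"]  # excerpt of the UN-recognized facet


def combine_facets(broad: List[str], narrow: List[str]) -> Dict[str, bool]:
    """Keep every expression of the broad facet and record whether the
    narrow facet also lists it (hypothetical helper)."""
    narrow_set = set(narrow)
    return {expression: expression in narrow_set for expression in broad}


combined = combine_facets(all_countries, un_recognized)
for expression, recognized in combined.items():
    marker = "" if recognized else " (not officially recognized by the UN)"
    print(expression + marker)
```

The combined micro-thesaurus keeps the same expressions, in the same order, as the broader facet; only the marking differs.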

Of course, a standardized expression representing a possible value of a conceptual term in the ASW meta-lexicon may belong to different facets. This means that a standardized expression may have different meanings depending on whether it is used in this-or-that micro-thesaurus and for this-or-that specific analysis.

Figure 11.5 illustrates that the standardized expression Argentina can be used, in the context of the ASW thesaurus, for three (slightly) different facets: the facet Contemporary countries of the Americas; the facet Countries with literary culture; and the facet All the countries of the world (start of 2000). Of course, we can conceive of a whole range of other facets in which the standardized expression Argentina can be used. The three facets listed above and in Figure 11.5 are pertinent for analyzing corpora from the experimentation workshops of the ASW-HSS project. That is why they have been created.
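The fact that one standardized expression can belong to several facets can be made concrete with a small inverted index. The sketch below assumes a plain dictionary representation of facets, which is our own simplification; the facet names and the expression Argentina come from the text, the lists are excerpts, and `facets_by_expression` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Excerpts only: each facet lists far more expressions in practice.
facets: Dict[str, List[str]] = {
    "Contemporary countries of the Americas": ["Argentina"],
    "Countries with literary culture": ["Argentina"],
    "All the countries of the world (start of 2000)": ["Argentina", "Taiwan"],
}


def facets_by_expression(facet_lists: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Invert the facet -> expressions mapping into expression -> facets."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for facet, expressions in facet_lists.items():
        for expression in expressions:
            index[expression].add(facet)
    return dict(index)


index = facets_by_expression(facets)
for facet in sorted(index["Argentina"]):
    print("Argentina belongs to the facet:", facet)
```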

Figure 11.5. Use of the standardized expression Argentina in three different facets belonging to the ASW thesaurus


Figure 11.6 offers a summary overview of the main parts of the thesaurus in the ASW universe of discourse. We distinguish between:

1. the shared thesaurus in the ASW domain of analysis. The shared thesaurus is made available to every user (analyst), every group of users (analysts) but can only be modified (enriched) by the committee (the “authority”) responsible for managing the ASW thesaurus;

2. the thesauruses specific to a group or community of users of the ASW resources. These thesauruses represent the specific points of view relating to the universe of discourse of an archive. For instance, the AICH audiovisual archives (Andean Intangible Cultural Heritage) have their own facets and expressions for analyzing the audiovisual texts which constitute their collection. In addition, facets and lists of standardized expressions from the shared thesaurus can be copied into that part which is specific to a community of analysts, so that either the facets or the lists of standardized expressions can be freely modified there;

Figure 11.6. General organization of the ASW thesaurus


3. the controlled vocabularies external to the ASW system, particularly from other thesauruses, languages or standards we wish to use “directly” (on their own or in conjunction with the terminological resources specific to the ASW system) to analyze an audiovisual text or corpus. We saw an example of this in Chapter 10 (section 10.3), based on using a shared ASW micro-thesaurus and a micro-thesaurus from UNESCO to describe audiovisual texts about the cultural constructs of a people or a geopolitical region.

11.5. The schemas of definition

A schema of definition is a configuration which positions two or more metalinguistic elements of the ASW system in relation to one another. In addition, a “schema”-type configuration is considered to be an elementary configuration (in contrast to a “sequence”-type configuration which, in itself, is a compilation of one or more schemas).

Figure 11.7. Library of generic schemas of definition characterizing the universe of discourse of the FMSH-ARA archives


Figure 11.7 shows a library of generic schemas of definition which we use to process audiovisual corpora belonging to the audiovisual archives of the FMSH in Paris. As we can see, in keeping with our theoretical approach, we distinguish between the class of schemas defining the objects of analysis and the class of schemas defining the procedures of description used to describe a particular object.

As we have already said, the generic schemas of definition form mini-structures, local structures which make up the thematic configurations* (topical, about discourse production, etc.) defining the models of description accessible through the ASW Studio’s working interface (section 11.2). Figures 11.8 and 11.9 show what these mini-structures or local structures look like.

Figure 11.8 focuses on the generic schema of the analytical object Description of a train of thought, a theory… This generic schema belongs to a family of schemas used to define subjects relative to the domain of scientific culture (lato sensu). It is only at the level of the library of sequences that the decision is made as to which schemas from this family are truly relevant to be selected in a particular sequence, specialized in analyzing a subject from scientific culture. In the simplest cases, only one schema is selected; in more complex cases, several schemas are selected.

Figure 11.8. Definition of the generic schema of the object of analysis Description of a system of thought, a theory…


What Figure 11.8 shows above all, though, is that the schema Description of a train of thought, a theory… contains a variant called Choice of the appropriate CT(s) – a variant which is defined by the selection of the three conceptual terms: [Theory], [System of thought to be specified] and [Concept to be specified]. Our schema (or, more precisely, the variant of our schema) is thus defined as a small structure of three conceptual terms (belonging to the ASW meta-lexicon) which are positioned in relation to one another in the form of a relation called “or non-exclusive” (i.e. “inclusive disjunction” – vel, in Latin).

Although we have had neither the time nor the means to implement the different logical relations at software level, note that the structure which defines a schema is composed of:

– the selection of one or more conceptual terms;

– and a logical relation defining the precise relationship between the selected conceptual terms.

In principle, and in the vast majority of cases we have come across to date, it is either a relationship of simple affirmation of the presence of a conceptual term in a schema or the relationship called “or non-exclusive” or indeed “inclusive disjunction” (vel, in Latin).

Note, in addition, that all the other relations dealt with in the specialized literature – causal relations, attributive relations, partitive relations, locating relations, rhetorical relations, etc. – only really come into play at the level of sequences when it is a question of selecting and positioning not conceptual terms but rather generic schemas of definition made up of one or more conceptual terms.
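Since the logical relations were not implemented at software level, the following Python sketch is only one possible reading of the structure of a schema (a selection of conceptual terms plus a logical relation). The names `Relation`, `Schema` and `accepts` are hypothetical, and the interpretation of "simple affirmation" as "all listed terms are present" is an assumption made here for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Iterable


class Relation(Enum):
    AFFIRMATION = "affirmation"             # assumed: every listed term is present
    INCLUSIVE_OR = "inclusive disjunction"  # one or more of the listed terms


@dataclass(frozen=True)
class Schema:
    """A selection of conceptual terms governed by one logical relation."""
    name: str
    terms: FrozenSet[str]
    relation: Relation

    def accepts(self, chosen: Iterable[str]) -> bool:
        """Check whether the analyst's choice of terms satisfies the relation."""
        chosen_set = set(chosen)
        if not chosen_set <= self.terms:
            return False  # a term outside the schema was chosen
        if self.relation is Relation.INCLUSIVE_OR:
            return len(chosen_set) >= 1
        return chosen_set == set(self.terms)


schema = Schema(
    name="Choice of the appropriate CT(s)",
    terms=frozenset({
        "Theory",
        "System of thought to be specified",
        "Concept to be specified",
    }),
    relation=Relation.INCLUSIVE_OR,
)

print(schema.accepts(["Theory"]))
print(schema.accepts(["Theory", "Concept to be specified"]))
print(schema.accepts([]))
```

Under the inclusive disjunction, any non-empty subset of the three conceptual terms is acceptable, whereas an empty choice or a term foreign to the schema is rejected.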

Figure 11.9. Definition of the generic schema of the procedure of Free description (standard version)


Figure 11.9 shows a generic schema belonging to the second class of schemas identified above, that of the schemas reserved for defining procedures of description. In our case, it is the procedure of free description, the standard version.

In contrast to the simplified version which can only use one descriptive activity ([Minimal designation – simplified form]), the schema defining the standard version of the procedure of free description can, as Figure 11.9 shows, call upon four analytical activities, all of which, of course, form part of the conceptual vocabulary of the ASW meta-lexicon reserved for the analysis of the audiovisual text. These activities are: [Minimal designation – simplified form], [Contextualized designation – simplified form], [Designation of the referent in the original language] and [Drafting of a summary presentation].

In the same way as the schema shown in Figure 11.8, this schema is defined by a selection of a set of relevant conceptual terms and by the relationship called “or non-exclusive” or “inclusive disjunction”. This means that the analyst using this schema can perform one or other or several of the activities identified by the schema in question to provide information about the object of his analysis. However, there is an additional condition here which the analyst has to respect. This condition stipulates that if the analyst uses this schema, the descriptive activity called [Minimal designation – simplified form] becomes both obligatory and presupposed by all the other activities: using this schema, the analyst must perform the activity [Minimal designation – simplified form] before any other activity.

This condition – which we have not been able to implement at the software level either – adds a specific pragmatic constraint to the use of this schema which is not contained in the logical relation governing the relationships between the four selected conceptual terms.
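To make the pragmatic constraint tangible, here is a minimal sketch of how such a check could be written. It is not part of the ASW software, in which this condition is not implemented; the function `check_free_description` and its messages are hypothetical, and only the four activity labels are taken from the text.

```python
from typing import List, Sequence

MINIMAL = "Minimal designation – simplified form"

# The four activities of the standard version of free description.
ACTIVITIES = {
    MINIMAL,
    "Contextualized designation – simplified form",
    "Designation of the referent in the original language",
    "Drafting of a summary presentation",
}


def check_free_description(performed: Sequence[str]) -> List[str]:
    """Return the list of violated conditions (empty when the order is valid).

    `performed` lists the activities in the order the analyst carried them out.
    """
    problems: List[str] = []
    unknown = [activity for activity in performed if activity not in ACTIVITIES]
    if unknown:
        problems.append("activities outside the schema: " + ", ".join(unknown))
    if not performed:
        problems.append("at least one activity is required")
    elif performed[0] != MINIMAL:
        problems.append("the minimal designation must be performed first")
    return problems


print(check_free_description([MINIMAL, "Drafting of a summary presentation"]))
print(check_free_description(["Drafting of a summary presentation", MINIMAL]))
```

Testing only the first performed activity covers both parts of the condition: the minimal designation is obligatory, and it precedes every other activity.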

11.6. The sequences of description

A sequence possesses a specific function, peculiar to that sequence, in a model of description* of an audiovisual object. Limiting ourselves to models of thematic description (i.e. description of the audiovisual content), we can distinguish (as already explained in Chapter 5) the following main functional types of descriptive sequences:

– sequences of identification and description of the domain thematized in an audiovisual object;

– sequences of temporal or spatial location of the thematized domain;

– sequences of description of discourse production around the object thematized (and possibly contextualized);

– sequences of description of the visual or audiovisual expression of the thematized object;

– sequences for the analyst’s comments either about the description of the thematized object or about the object itself (and/or of the conditions of its audiovisual expression and discourse production).

All the libraries of sequences* are defined in reference to these five functional types. Figure 11.10 shows the library of descriptive sequences using which the analyst carries out the description and indexation of the audiovisual corpora which make up the FMSH’s own audiovisual archive. We can distinguish the following five main families of sequences:

1) First family of sequences, including the sequences reserved for analyzing the domain of expertise peculiar to an archive’s universe of discourse. In terms of the FMSH-ARA archives, we find domains such as globalization, social movements, cultural diversity, etc. All these domains (and many more) are touched upon in the audiovisual production of the FMSH-ARA archives, hold an obvious interest for research in human and social sciences, and contribute to defining the specificity, the added value of these archives.

2) Second family of sequences, including the sequences reserved for pinpointing the domains of expertise (geographically, geopolitically, chronologically, historically, etc.). Note that, in contrast to the first, this family of sequences is not peculiar to the universe of discourse of the FMSH-ARA archives. This category can be found in practically every library of sequences serving to define the descriptive models of a particular archive’s universe of discourse.

Figure 11.10. Library of sequences defining the universe of discourse of the FMSH-ARA archives


3) Third family of sequences, including the sequences serving to analyze the discourse production around a domain of expertise such as social movements or globalization. Also, this family of sequences can, in principle, be found in every library of sequences defining the models of description of an archive’s audiovisual collection.

4) Fourth family of sequences, including all the sequences which serve to analyze the audiovisual and/or verbal expression of a domain of expertise thematized in an audiovisual text. Once again, these sequences are not peculiar to the FMSH-ARA’s library of descriptive models, but rather can be reused exactly as they are or following some local modifications, to define absolutely any library of descriptive models.

5) Fifth family of sequences, including the sequences serving to explicitize the analyst’s viewpoint either as regards the thematized domain or as regards his own analysis, his description of that domain. This family of sequences, again, can be reused to define any library of descriptive models.

Grosso modo, all libraries of sequences of analysis are constructed in accordance with the reference model shown in Figure 11.10.

In terms, more particularly, of the category of models of thematic description (description of the audiovisual content), it is particularly the first family of sequences – that which serves for describing the knowledge objects or domains of knowledge – which sets the libraries of sequences apart from one another. The other four families are not really specific to an archive, to the universe of discourse of a particular archive. Thus, despite the apparent complexity of the process of compiling a library of models of thematic description, it usually only concerns the first family of sequences serving to analyze the objects and domains of knowledge thematized in an audiovisual corpus or collection.

Figure 11.11. Functional organization of a sequence and relationships with the schemas making up a sequence


Let us take another look at the internal organization of a sequence and its relationships with the schemas of definition. The example in Figure 11.11 shows the sequence Analysis “Scientific Research in HSS”. This sequence is used in a syntagmatic structure which is made up, as we can see, of two main sub-sequences, the second of which presupposes the first. In other words, the first sub-sequence necessarily has to be filled in before the second. Looking more closely at the relationship between the two sub-sequences, we can see that the first sub-sequence delimits the relevant context (in our case, the context is given by the scientific disciplines dealt with in an audiovisual text or corpus), while the second subsequence is charged with detailing that context (in our case, explicitizing the specific aspects of a disciplinary or interdisciplinary piece of research: theme, domain, field work, etc.).

“First the relevant context, then different facets or aspects of the context” is a very frequently recurring syntagmatic device in the construction of more complex sequences that, like the one shown in Figure 11.11, are deployed in two or more subsequences. Yet of course, there is a whole range of such functional and syntagmatic devices such as enumerative deployment (certainly the simplest), causal deployment, chronological deployment, bona fide narrative deployment, etc. Here we run into the problem of the text’s syntagmatic coherence – an issue which we have dealt with more extensively in another book, devoted to the analysis and design of “new information products” [STO 99; STO 92].

As Figure 11.11 shows, each of the two sub-sequences can, in turn, be divided – in accordance with the same imperative of syntagmatic coherence – into even more specialized sequences. In our example, the first sub-sequence is itself made up of two more specialized sequences, the first of which must be filled in, while the second is optional. Simply put, the analyst is first invited to specify which scientific discipline(s) the audiovisual text he is describing deals with. Then, if applicable, and if the analyst so desires, he can specify the discipline(s) in greater detail. This option is particularly useful for properly explicitizing disciplines (such as sociology or anthropology) which have a great many sub-disciplines and approaches; we have no real way of knowing whether the existence of these is down to the decision of an individual researcher or a particular research group, or whether they can be considered institutional facts or widely shared scientific references. The specialized sequences which make up the second sub-sequence follow a pattern of enumerative deployment: there is nothing to stop other relevant specialized sub-sequences being added in order to explain the context of the scientific research, nor to stop the position of a particular specialized sequence in the selected syntagmatic order being altered.
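The syntagmatic organization just described (a second sub-sequence presupposing the first, with mandatory and optional specialized sequences) can be sketched as a small data structure. The class names `Unit` and `SubSequence` are hypothetical, the label "Sub-discipline" is our own placeholder for the optional specialized sequence, and treating every unit of the second sub-sequence as optional is an assumption; none of this reflects the actual ASW implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Unit:
    """One specialized sequence that the analyst may fill in."""
    name: str
    mandatory: bool


@dataclass
class SubSequence:
    name: str
    units: List[Unit]
    presupposes: Optional["SubSequence"] = None

    def is_complete(self, filled: Set[str]) -> bool:
        """True when every mandatory unit has been filled in."""
        return all(unit.name in filled for unit in self.units if unit.mandatory)

    def may_be_filled(self, filled: Set[str]) -> bool:
        """True when the presupposed sub-sequence, if any, is complete."""
        return self.presupposes is None or self.presupposes.is_complete(filled)


# First sub-sequence: delimits the relevant context.
context = SubSequence("Relevant context", [
    Unit("Discipline", mandatory=True),
    Unit("Sub-discipline", mandatory=False),  # hypothetical label
])

# Second sub-sequence: enumerative deployment of aspects of the research.
details = SubSequence("Aspects of the research", [
    Unit("Theme", mandatory=False),
    Unit("Domain", mandatory=False),
    Unit("Field work", mandatory=False),
], presupposes=context)

print(details.may_be_filled(set()))
print(details.may_be_filled({"Discipline"}))
```

Because the second sub-sequence is a plain list of units, adding a further specialized sequence or changing its position amounts to editing that list, which mirrors the flexibility of enumerative deployment.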

In any case, each unit of the sequence which serves “directly” for describing an audiovisual text, as Figure 11.11 also shows, is defined by at least two types of descriptive schema. The example shown in Figure 11.11 is the unit of the sequence Description of the theme “Discipline”. This is defined, as we can see, by the schema Selection of the CT (conceptual term) “Scientific discipline” and by the schema Controlled description of the Scientific discipline.

As a general rule, all the units of sequences which serve “directly” for describing an audiovisual text are founded in this pair of definitional schemas that we have just discussed:

– the first schema serves to define the topical structure (in our case – the simplest – it is merely a question of confirming the fact that the analysis does indeed relate to the conceptual term [Scientific discipline]; for more complex cases, see Chapters 5 and 8);

– the second schema serves to describe the topical structure defined beforehand (as we know, it is either the schema defining a procedure of controlled description* or the schema defining a procedure of free description*; in some cases, we can also find a slightly more elaborate schema here which integrates both procedures).
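The pairing of the two definitional schemas can be pictured as a small data structure. The following Python sketch is only an illustration: the class names, the field names and the consistency check are assumptions made for this example, and they do not reproduce the actual ASW xml format.

```python
from dataclasses import dataclass
from enum import Enum


class ProcedureKind(Enum):
    """Procedures of description named in the text (labels are illustrative)."""
    CONTROLLED = "controlled description"
    FREE = "free description"
    MIXED = "controlled and free description"


@dataclass(frozen=True)
class SelectionSchema:
    """First schema: defines the topical structure by selecting a conceptual term."""
    conceptual_term: str


@dataclass(frozen=True)
class DescriptionSchema:
    """Second schema: describes the topical structure defined beforehand."""
    conceptual_term: str
    procedure: ProcedureKind


@dataclass(frozen=True)
class SequenceUnit:
    """A unit of a sequence that serves directly for describing an audiovisual text."""
    title: str
    selection: SelectionSchema
    description: DescriptionSchema

    def is_consistent(self) -> bool:
        # Assumption for this sketch: both schemas bear on the same conceptual term.
        return self.selection.conceptual_term == self.description.conceptual_term


# The unit discussed above, expressed with the hypothetical classes.
discipline_unit = SequenceUnit(
    title='Description of the theme "Discipline"',
    selection=SelectionSchema("Scientific discipline"),
    description=DescriptionSchema("Scientific discipline", ProcedureKind.CONTROLLED),
)
```

In this reading, a unit is usable only once both schemas are present, which is why the sketch makes them mandatory fields rather than optional ones.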

11.7. Resources external to the ASW system

By “resources external to” the ASW metalinguistic system, we mean resources which are not actual components of that system, or which are not so in the same sense as are, for example, the meta-lexicon of conceptual terms, the ASW thesaurus or the schemas of indexing. However, “external to the ASW system” does not mean “having no relation at all” (either existing or potential) with that system.

Figure 11.12. The LOMFR standard integrated into the working interface of the ASW Studio

We distinguish between two classes of resources external to the ASW system: the class of resources constituted by the data produced by the analyst, and the class of resources constituted by the diversity of already-existing metadata. These two classes are either controlled by the ASW system (in the case of the data produced by the analyst) or, as far as possible, placed in conjunction with the elements of the ASW system, in order to increase the expressive capacity of the ASW system itself and to contribute to the interoperability of the systems of description/indexation used to generate metadata relating to a digital resource.

The standards, terminologies, ontologies, etc. are metalinguistic resources external to the ASW system, but which we may have to make use of in that they constitute the “languages” employed by various communities (institutions, etc.) to process (describe and/or index) audiovisual digital corpora documenting domains of knowledge similar to those we wish to investigate with the semiotic workshop of audiovisual description.

There are many, very varied types of external resources. One thinks, of course, of the monolingual or multilingual thesauruses and terminologies, of ontologies which are often conceptually very similar to the terminologies, of the standards used for archiving, diffusing and sharing digital resources, or indeed of the semantic and conceptual networks comparable to the ASW models of description*.

It is a veritable “multilingual landscape” which is taking shape before our eyes, transposing the image of the Tower of Babel from the level of natural languages to the level of the metalanguage. We cannot ask the analyst to “translate” his analysis (performed using the ASW models of description) into other metalanguages. These “translations” instead have to be integrated into the system itself.

There are at least two options for rendering the data produced by an analyst using the ASW metalanguage of description (at least partially) interoperable:

Figure 11.13. Creation of a bridge between the ASW standardized expression “Argentina” and the English-language Wikipedia article in order to harvest (amongst other data) information on geographical location

1. development of ASW models of description* which integrate references external to the ASW metalinguistic system, just as they are. Figure 11.12 gives an example in the form of an extract from an ASW model of description elaborated and created in accordance with the French norm LOMFR;

2. coordination of the conceptual terms, the standardized expressions forming part of the ASW thesaurus, the schemas, sequences or models of description with their metalinguistic equivalents in a thesaurus, an ontology, a standard, etc. external to the ASW metalinguistic system. Figures 11.13 and 11.14 offer two concrete examples: linking to Wikipedia from the standardized expression <Argentina> which belongs to the shared ASW thesaurus (Figure 11.13); and referencing of the standardized expression <Nenets> in the Ethnologue glossary of languages of the world (Figure 11.14) in order to harvest linguistic data about this Samoyed language and be able to communicate with all metalinguistic systems which use that glossary.
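The second option, coordination with external equivalents, amounts to attaching one or more “bridges” to an ASW element. The sketch below shows one possible way of modeling this in Python. The class names, the resource labels and the reference strings are hypothetical placeholders chosen for this example; they are not the identifiers actually used by the ASW system, Wikipedia or Ethnologue.

```python
from dataclasses import dataclass, field


@dataclass
class Bridge:
    """Link from an ASW element to its equivalent in an external resource."""
    external_resource: str   # label of the external resource (illustrative)
    external_reference: str  # placeholder for the external identifier or entry


@dataclass
class StandardizedExpression:
    """A standardized expression of the ASW thesaurus, with its bridges."""
    label: str
    bridges: list[Bridge] = field(default_factory=list)

    def bridge_to(self, resource: str, reference: str) -> None:
        # Record a new coordination with an external metalinguistic resource.
        self.bridges.append(Bridge(resource, reference))

    def references_in(self, resource: str) -> list[str]:
        # Return every reference this expression has in the given resource.
        return [
            b.external_reference
            for b in self.bridges
            if b.external_resource == resource
        ]


# The two examples from the text, with placeholder references.
argentina = StandardizedExpression("Argentina")
argentina.bridge_to("Wikipedia (English)", "Argentina")

nenets = StandardizedExpression("Nenets")
nenets.bridge_to("Ethnologue", "Nenets")
```

Keeping the bridges on the expression itself, rather than in the analyst's description, reflects the point made above: the “translation” is carried by the system, not by the analyst.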

Figure 11.14. Creation of a bridge between the ASW standardized expression “Nenets” (Samoyed languages) and its correspondent in Ethnologue, “Languages of the World”6

11.8. ASW Modeling Workshop

The ASW Modeling Workshop forms part of the ASW Studio (see Chapter 2 and [STO 11a]). It is used, firstly, for developing and managing all the metalinguistic resources for analyzing the ASW universe of discourse* we have briefly presented in this chapter. This (crucial) function is reserved for the administrator, the person or group of people responsible for the metalinguistic resources which are common and open to all audiovisual archives.

Yet the ASW Modeling Workshop also serves to specify, develop and manage libraries of descriptive models* for analyzing the universe of discourse* of a particular audiovisual archive – e.g. that of the CCA, LHE and ArkWork archives7 (that is, the three main experimentation workshops of the ASW-HSS project), each of which has its own library.

The main tool of the ASW Modeling Workshop is, at present, a tool called OntoEditor. “At present” means that this tool could in the future be replaced by other, more sophisticated tools, but without the conceptual organization of the metalinguistic system being affected.

Figure 11.15. General interface of the tool OntoEditor

Developed, as has already been mentioned, by ESCoM’s Francis Lemaitre at the FMSH in Paris in the context of various R&D projects, OntoEditor is an editor of xml files used for developing, managing and enriching the ASW metalinguistic resources and also the domain ontologies which are peculiar to the different audiovisual archives making up the ASW universe of discourse*.

In this section, we shall give a very brief presentation of the working interface of OntoEditor and the organization of the ASW metalinguistic resources in the form of a set of xml files.

Figure 11.15 shows an extract from the working interface of the OntoEditor tool. The left-hand side displays the xml file involved for working on this-or-that aspect of the metalinguistic system. The system is made up of a set of files, which we shall present later on.

The task of developing and/or managing a metalinguistic resource such as the two meta-lexicons, the thesaurus or the configurational building blocks to construct a library of descriptive models is carried out on three levels:

1) The first level (called Annotation; see Figure 11.15) is reserved for entering the denomination of a metalinguistic element (conceptual term; value of a conceptual term; title of a schema or sequence; title of the model of description; etc.) and its qualifier (i.e. its verbal definition and description, exemplification, etc.). A very useful distinction here is that which is drawn between different names (i.e. between different sociolinguistic registers of denomination) which can be attributed to a metalinguistic element depending on that element’s use context: internal and technical use of the element in question, public use (e.g. on a website), use in the form of an explicative locution or in the form of an abbreviation (an acronym, an identifying icon, etc.).

2) The second level (called Field (of definition); see Figure 11.15) is reserved – as its name suggests – for the operational definition of a metalinguistic element: definition of a schema of definition using a selection of conceptual terms; definition of a conceptual term using a schema of indexation stricto sensu (see Chapters 16); definition of a model of description using a selection of descriptive sequences; etc. We also find the operation of coordinating between a metalinguistic item belonging to the ASW system and a reference external to that system: coordination of a conceptual term from the meta-lexicon denoting the objects of analysis of the ASW universe of discourse with an expression (or list of expressions) from a thesaurus or an external ontology; coordination of an ASW schema or sequence of description with an element belonging to a standard or a norm.

3) Finally, the third level (called Meta-File; see Figure 11.15) is reserved for a series of activities to manage the metalinguistic resources: coordination of the files making up the domain ontology of a particular audiovisual archive; monitoring of the main properties of each metalinguistic element (including, notably, the unique identifier which defines a given element) and the history of the activities relating to a given metalinguistic element.
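To make the three levels more tangible, the following Python sketch builds an xml record for one metalinguistic element using the standard `xml.etree.ElementTree` module. It is a purely illustrative assumption: the tag names (`element`, `annotation`, `field`, `metaFile`, etc.), the `register` attribute and the sample identifier are invented for this example and do not describe the file format actually produced by OntoEditor.

```python
import xml.etree.ElementTree as ET


def build_element(identifier, names, definition_terms):
    """Build a hypothetical xml record carrying the three levels named in the text."""
    element = ET.Element("element")

    # Level 1, Annotation: one denomination per register (use context).
    annotation = ET.SubElement(element, "annotation")
    for register, name in names.items():
        ET.SubElement(annotation, "name", attrib={"register": register}).text = name

    # Level 2, Field (of definition): here, a selection of conceptual terms.
    field_level = ET.SubElement(element, "field")
    for term in definition_terms:
        ET.SubElement(field_level, "conceptualTerm").text = term

    # Level 3, Meta-File: management data (unique identifier, activity history).
    meta = ET.SubElement(element, "metaFile")
    ET.SubElement(meta, "identifier").text = identifier
    ET.SubElement(meta, "history")
    return element


# Hypothetical sample values for a schema of definition.
record = build_element(
    identifier="schema-0001",
    names={"technical": "SEL_CT_ScientificDiscipline",
           "public": "Scientific discipline"},
    definition_terms=["Scientific discipline"],
)
xml_text = ET.tostring(record, encoding="unicode")
```

The separation into three child elements mirrors the three working levels of the interface: naming, operational definition and management can each be edited without touching the other two.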

Let us take another brief look at the general organization of the system of xml files containing the ASW metalinguistic resources. Figure 11.16 shows that this file system contains, among others, four main folders, each with a specific function.

Figure 11.16. General organization of the ASW system of metalinguistic resources

In particular, we see the folder “_concepts”, where all the files containing the metalinguistic resources common to the ASW universe of discourse* (see Figure 11.17) are placed: the two meta-lexicons identifying the objects of analysis* of the ASW universe of discourse and the activities of analysis*; the library of schemas of indexation per se (see Chapters 16) as well as the types of data with which we are working (textual data; numerical data; physical location data, etc.).

Figure 11.17. Canonic organization of the ASW domain

Another folder which is part of the file system containing the ASW metalinguistic resources (see Figure 11.16) is entitled “_domains”. As Figure 11.18 shows, this folder brings together the domain-specific ontologies, which are developed using the metalinguistic resources common to the whole ASW universe of discourse.

Currently, it contains the ontologies of the CCA domain (ARC – devoted to cultural diversity and intercultural communication), the LHE domain (ALIA – devoted to literary heritage), the ArkWork domain (ADA – devoted to research in archaeology), the FMSH-ARA domain (AAR – devoted to the scientific heritage of the FMSH), the AICH domain (PCIA – devoted to the intangible cultural heritage of Andean peoples) and the ACH domain (PCA – devoted to Azerbaijani cultural heritage). This list is, of course, entirely open-ended.

Figure 11.18. The universes of discourse specific to each archive

Figure 11.19 shows the folder we use to manage the ontology of a specific domain (such as that of the LHE or ArkWork archives). This folder is canonically composed of three files:

1) the file containing the schemas of definition* of the objects of analysis and the activities of analysis peculiar to the universe of discourse* of the archive in question;

2) the file containing the sequences of description* deemed pertinent to process the thematization of the knowledge objects and, possibly, discourse production around them, audiovisual expression of them or indeed the analyst’s own position as regards the object of his analysis;

3) and the file containing the library of models of description* using which the universe of discourse of the archive in question is described, explicitized, adapted, exploited, etc.

Other files may optionally be added to the three listed above, but these three must be present in every folder used for managing the ontology of a specific domain (or a specific version of the domain ontology).
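The requirement that the three canonical files be present in every domain folder lends itself to a simple automated check. The Python sketch below illustrates the idea under stated assumptions: the three file names are hypothetical stand-ins (the actual names used in the ASW file system are not given here), and the function is an example of such a check, not a feature of OntoEditor.

```python
from pathlib import Path

# Hypothetical file names standing in for the three canonical files.
REQUIRED_FILES = (
    "schemas_of_definition.xml",
    "sequences_of_description.xml",
    "models_of_description.xml",
)


def missing_required_files(domain_folder: Path) -> list[str]:
    """Return the canonical files absent from a domain folder, in canonical order."""
    present = {p.name for p in domain_folder.iterdir() if p.is_file()}
    # Additional files are allowed; only the absence of a required file is reported.
    return [name for name in REQUIRED_FILES if name not in present]
```

A call such as `missing_required_files(Path("_domains") / "LHE")` (with a folder layout assumed for the example) would return an empty list for a complete domain folder, whatever extra files it contains.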

Figure 11.19. Canonic organization of a domain

Finally, the system of files containing the ASW metalinguistic resources also includes other specialized folders as well as a small collection of xml files. Let us simply point out here the file “_static vocabulary”, which contains the ASW thesaurus (see Figure 11.16), and the file “_listOfOntologies”, which identifies the thesauruses, terminologies, ontologies and other external standards or norms with which it is possible to create bridges of interoperability.

1 For further information, see the website of the ASW-HSS project: http://www.asa-shs.fr/, and the research log devoted to the project: http://asashs.hypotheses.org/.

2 OntoEditor is an xml editor developed by Francis Lemaitre of ESCoM, as part of the SAPHIR and ASW-HSS R&D projects, financed by the Agence Nationale de la Recherche (ANR). Today it constitutes the software part of the ASW Modeling Workshop.

3 See: http://semiolive.ext.msh-paris.fr/pcia/.

4 See the portal website: http://semiolive.ext.msh-paris.fr/fmsh-aar/.

5 See: http://semiolive.ext.msh-paris.fr/fmsh-aar/.

6 See: http://www.ethnologue.com/.

7 See: http://semiolive.ext.msh-paris.fr/asa-shs/.
