Introduction 1

This collective work deals with the analysis of digital audiovisual texts or corpora, which may, for example, form part of an audiovisual library or archive.

The development of methods, tools and conceptual frameworks (or models) for the concrete analysis of audiovisual texts or corpora is one of the most important issues for multimedia (audiovisual) digital libraries, archives, collections, etc. and also for any project or program to compile and disseminate knowledge heritage (e.g. cultural, scientific etc.).

Analyzing audiovisual recordings, shoots, sound recordings, film or complex multimodal documents etc. obviously constitutes an essential step for any classification of the (digital) collection of an archive or library.

Above all, however, it is the most important activity by which an actor (an individual, group of individuals, institution, etc.) obtains and exploits digital audiovisual data to transform them – depending on their own skills, expectations and requirements, but also within the limitations imposed by the tools, methods and models available – into genuine cognitive resources which they regard as “useful”, “pleasant”, “interesting” or simply relevant, i.e. which have a value for them.

Ten years ago, along with a small nucleus of permanent collaborators from the ESCoM (Semiotics Cognitive and New Media Team), the research center at the Fondation Maison des Sciences de l’Homme (FMSH – House of the Human Sciences Foundation) in Paris, we set up the ARA (Audiovisual Research Archives) program. One of the objectives of this program, which will be described in more detail in Chapter 1 of this book, is to compile and distribute scientific and cultural heritage, notably through scientific events and field-work carried out in the human and social sciences. Another objective of this program is to set up research and development projects aimed at:

a) collecting and producing audiovisual documentation (of field-work, for example);

b) compiling analysis corpora and effectively analyzing these corpora;

c) creating publishable corpora and publishing them;

d) defining and setting up (metalinguistic) models and essential procedures to successfully carry out the aforementioned three “tasks”.

In this book, and another collective work complementing this one (see [STO 12a]), we will present and discuss the results of our research and development relating to the analysis, description and indexing of audiovisual corpora. The question of analysis has been addressed from the start with regard to the following three issues:

1) a good understanding of the activity of analysis must take account of the internal structural organization of the audiovisual text and must have recourse to the semiotics of the audiovisual text or discourse;

2) a true analysis (going beyond, e.g. simply producing unstructured lists of keywords) of audiovisual corpora cannot be carried out without a metalanguage (an “ontology”), i.e. models of description representing the area of expertise covered by a corpus to be analyzed;

3) of course, no analysis can take place without an appropriate working environment.

Thanks to a series of French and European R&D projects1 and to the support of the FMSH, between 2001 and 2009, we were able to make tangible progress towards addressing the three issues mentioned. In particular, however, it was the ASW-HSS2 project, financed by the French National Research Agency (Agence Nationale de la Recherche – ANR), that gave us the time and means needed to develop:

– a metalanguage for analyzing audiovisual corpora documenting a wide variety of areas of knowledge/expertise. This metalanguage is a generic ontology (called “ASW3 ontology”) which has helped us to define, use and validate a whole series of domain ontologies4 and models of description adapted to thematically limited areas of knowledge/expertise. This book will present it through a wide variety of concrete examples. [STO 12b] gives a more theoretical and more detailed account of this metalanguage;5

– a working environment for segmenting and describing audiovisual corpora entirely based upon the ASW metalanguage of description. The name of this environment is ASW Studio; it is made up of several specialized workshops: the Segmentation Workshop, for (virtually) segmenting an audiovisual object; the Description Workshop, for describing an audiovisual object; the Publication Workshop, for publishing an audiovisual object; the Modeling Workshop, to model the metalinguistic resources needed to undertake an analysis/description of an audiovisual object. In this book we will present the two following workshops in particular: the Segmentation Workshop and the Description Workshop; the presentation of the Modeling Workshop will be the subject of [STO 12b]; as the Publication Workshop is still partially under development, it will be the object of a new publication in late 2012;

– an as-yet relatively simple metalanguage for defining models for publishing/republishing audiovisual corpora in the form, e.g. of themed folders, bilingual folders, theme-limited video-glossaries, themed Websites, etc. These models are indeed used for publishing/republishing audiovisual corpora but the metalanguage enabling us to define them has not yet been made explicit. Clarifying the organization of this metalanguage and incorporating it into the ASW generic ontology will, conditions beyond the authors’ control permitting, constitute the main object of the ESCoM’s research activities during the next few years.

This book is divided into two main parts. In Part 1, following an introductory chapter contextualizing our R&D activities since 2001, the different approaches to analyzing an audiovisual corpus using ASW Studio will be presented:

– strictly textual analysis, consisting of identifying the passages which are relevant to an analysis and of the (virtual) segmentation of an audiovisual object (Chapter 2);

– metadescription, which clarifies the content and objectives of the analysis itself as well as the authors of the analysis, the rights associated with using the results, etc. (Chapter 3);

– paratextual description, the aim of which is to formally identify the audiovisual object being analyzed (title, author, genre, summary of content, etc.) and the relative rights associated with its use (Chapter 3);

– audiovisual description, which relates to analyzing visual, acoustic and audiovisual shots (Chapter 4);

– thematic description, which deals with the content, the subjects dealt with and developed by the audiovisual text being analyzed (Chapter 5);

– pragmatic description, which clarifies the potential interest of the audiovisual text in question for a given audience/use and also looks at its possible translation-adaptation (Chapter 6);

– publication of an audiovisual corpus in the form of a Web portal which is the usual form of publishing the audiovisual corpora analyzed and indexed during the ASW-HSS project (Chapter 7).

Part 2 of this book is given over to a technical presentation and a detailed discussion:

– of the ASW digital environment (Chapter 8);

– of the ASW Studio dedicated to work on audiovisual corpora (Chapter 9);

– of the computerized development of the publishing model called “portal with specialized access to audiovisual corpora” – the standard model of publication of the experiments conducted during the ASW-HSS project (Chapter 10).

Let us reiterate that this collective work is accompanied by a second collective work [STO 12a] which deals with new practices in analyzing audiovisual corpora. That book contains in-depth presentations of highly specialized analyses which could not be conceived of without genuine scenarios of analysis, projects aimed at implementing “shared” audiovisual archives using the ASW approach (i.e. the ASW metalanguage and the ASW Studio) and finally, the exploitation of the results of analysis of audiovisual corpora in the context of social media, Web 2.0 and mobile communication. In [STO 12b], the reader will find a more detailed and systematic presentation of the ASW metalanguage and of all the elements which make it up.

To conclude this introduction, let us highlight once more that this book really is the product of a collective and interdisciplinary effort combining “fundamental” research with applied research, and computing with human sciences (particularly semiotics and linguistics). As mentioned above, the work has been carried out over 10 years by a small team of researchers and engineers who are also the authors of this book and of [STO 12a]. The author of this introduction expresses his gratitude and high esteem to each of them.

Throughout the last 10 years of research and development, the team has benefited from the support and the backing of many colleagues and friends in France and abroad. Thanks go in particular to the following individuals: Patrick Courounet, Steffen Lalande, Abdelkrim Beloued, Bruno Bachimont (INA Research Dept.); Jocelyne and Marc Nanard (CNRS-Lirmm); Marie-Laure Mugnier, Michel Chein, Alain Gutierrez (CNRS-Lirmm); David Genest (University of Angers-Leria); Danail Dochev, Radoslav Pavlov (Bulgarian Academy of Sciences); Stavros Christodoulakis, Nektarios Moumoutzis (Technical University of Crete, Chania).

In addition, special thanks go to Muriel Chemouny (FMSH-ESCoM) for having proofread each of the contributions which make up this book, and to Elisabeth de Pablo (FMSH-ESCoM) for formatting this manuscript.

Our special thanks also go to ISTE/WILEY for giving us the opportunity to present our research and development over the past decade to a non-French-speaking audience. Finally, we are especially grateful to Benjamin Engel for having produced such an excellent translation in such a short time.


1 Introduction written by Peter STOCKINGER.

1 For more information, see Chapter 1 of this book; see also the glossary of acronyms and project names at the end of this book.

2 See official Website of the ASW-HSS (Audiovisual Semiotic Workshop – Human and Social Sciences) project: http://www.asa-shs.fr/.

3 The acronym ASW means “Audiovisual Semiotic Workshop” and refers, of course, to the ASW-HSS project financed by the French National Research Agency (ANR).

4 As part of the ASW-HSS project, several experimental workshops have been defined, dealing with the formation, analysis and publication of audiovisual corpora within limited areas of knowledge/expertise: literary heritage, archeology, cultural diversity, etc.

5 The research diary or blog http://asashs.hypotheses.org/ is entirely dedicated to issues relating to the ASW metalanguage of description, its evolution, its reuse and its instrumentation within the ASW Studio framework.
