Chapter 2

The Segmentation Workshop for Audiovisual Resources 1

2.1. Introduction

As we saw in the previous chapter,1 the “audiovisual text is not in itself a cognitive resource for an audience. It becomes so only after it has undergone a qualitative transformation”.

In order to become a resource sui generis, it must in principle undergo a set of transformations, indexations, adaptations etc. to enable it to fulfill the expectations of a given audience, in view of specific uses and exploitations, etc.

A first “manipulation”, here, consists of identifying the most relevant “moments” or “passages” to make a “source” audiovisual text (i.e. which has its own identity) correspond to a specific audience, use or exploitation. From a technical point of view, it is a question of “segmenting” a document.

In this chapter, therefore, we shall deal with the first workshop in ASW Studio, called the “Segmentation Workshop”.

This workshop enables the analyst to virtually segment a video, in its entirety or in parts – with the aim of later describing and annotating it using the “Description Workshop”.

After a general presentation of the segmentation issue, followed by a description of the functions of the Segmentation Workshop provided by ASW Studio, we will offer a number of reflections on segmentation. Finally, we shall put this workshop in perspective.

2.2. Segmentation of audiovisual corpora – a general presentation

A preliminary stage necessary for any analysis (be it semiotic or otherwise) of an audiovisual document, is segmentation2 of the document being studied – namely, in this case, a text on an audiovisual support. This task begins by locating the content of the text and the objectives to be achieved by the semiotic analysis function.

It is therefore necessary, before all else, to have a precise idea of the objectives of the segmentation. The practice of analysis stems from an express request: acts of research, pedagogy, teaching etc. which requires a new document to be created – be it purely textual, visual or audiovisual – that uses the analytical data to ends resulting from predefined objectives. As specified in the introduction to [VAN 92]: “defining the context and the finished product is essential to put the analysis in perspective”.

This leads to a division between tasks and concrete actions of analysis.

First of all, a degree of reflection allows the following questions to be asked:

– What is being analyzed? (In our case: are we selecting a single video? a corpus of videos?);

– Why is it being analyzed? (What is the objective of the analysis?);

– For whom is it being analyzed (what will be the target audience for the analysis and the results in terms of the creation of new products?).

Here, we will spend no more time on issues related to the analysis itself: this is a matter for a separate debate. We shall consider that this stage has been debated beforehand, and that the questions about the themes, the choices of corpus, the final objectives linked to the context, uses and users have all been finalized.

Part 2 of this book will be dedicated to the work of analysis itself. It is here that the wealth of functions offered by the Segmentation and Description Workshops comes into play.

As we will discover throughout this book, the Segmentation Workshop and the Description Workshop, although distinct from a formal point of view, are “intimately” linked from an analytical point of view, particularly as regards the choice and description of the editorial brand.3

The virtual segmentation4 of our audiovisual object is therefore an action whose goals depend on the context, the uses and the objectives chosen.

In the context of the ASW-HSS project, all the audiovisual objects being analyzed are digital videos which belong to different film genres, e.g. an interview, a seminar, a conference, a round-table, a scientific documentary, a road movie, a scientific report, or simply digital rushes classified by theme. Each genre presupposes a more-or-less sophisticated construction – or montage – which is made up of a succession of sequences, inlays, etc. depending on the message to be conveyed. The techniques used in these genres may influence the segmentation. Hence it seems “easier” to extract the definition of a term or a concept from an interview with a scientist. Similarly, the type and number of inlays of still images or diagrams during a seminar will enable a link to be made between several contributions.

Based on the fact that these analyses may have several objectives (also see Chapters 1 and 2 of Digital Audiovisual Archives5), which will be discussed below, the segmentation will have to be done according to the type of products we are looking for. On the one hand, it must be organized according to a professional, educational, and/or learning axis, a multilingual adaptation, a valorization… On the other hand, it must be organized according to the technological objectives, particularly with a view to publishing the documents in the form of new information products intended for distribution over Internet and mobile networks. Hence, different segments will be potentially possible for the same video, depending on its genre.

Generally speaking, the action of segmenting a video follows a process which can be represented diagrammatically as follows:

Figure 2.1. Simplified diagram of the process of deconstructing /reconstructing the analyzed object

2.2.1. Example of segmentation of a scientific interview

For example,6 imagine that an academic interview with a celebrity or a professional in a certain field gives rise to:

– a so-called “basic” thematic analysis where the analyst carries out a global segmentation (corresponding to the object itself) in relation to the topic discussed with the interviewee in the video; this type of segmentation might be used by librarians or archivists;

– a so-called “general” thematic analysis where the analyst carries out a complete segmentation – according to the chapters suggested on the distributing site – in relation to a topic motivated by the “final product” which must communicate on a subject; this type of segmentation might be used by teachers;

– an in-depth analysis where the analyst will prefer a partial targeted segmentation which can be used in a video-glossary, an interactive encyclopedia…

A concrete example of segmentation,7 carried out on the basis of an interview with César Itier, a specialist in Quechua language and civilization at the Institut National des Langues et Civilisations Orientales (INALCO) in Paris, dealing with Quechua language and oral tradition,8 is available on the portal site Peuples et Cultures du Monde (PCM) [People and Cultures of the World].9 Two types of folders are currently available for consultation.

A first “hypermedia folder” offers a complete and detailed breakdown taking the initial title of the video: “Introduction à la langue et à la tradition oral quechua” (“Introduction to the Quechua language and oral tradition”). This work facilitated the creation of three other bilingual hypermedia folders which present abridged translations of this folder in three different languages (Spanish, English and Italian).

Figure 2.2. Interface for accessing the hypermedia and educational folders presenting different sorts of interview segmentations

The objective of this complete segmentation is to create different types of lecture support material explaining the discourse of a researcher and suggesting further resources. These supports may be used in a formal university teaching context (bachelor level), or in an informal context aimed at familiarity with a language and civilization. These files can be consulted at: http://www.culturalheritage.fr/940_fr/.

A second “educational folder”, entitled “Introduction to the Quechua language”, offers a partial segmentation of the initial video which favors the passages exclusively dealing with content referring to the Quechua language. Its goal is to create a reading portfolio to accompany a language class of bachelor/masters level. This file is available for consultation at: http://www.culturalheritage.fr/940_peda_formel1_fr/.

2.2.2. Example of the segmentation of a conference

The talks given at a conference, a round-table, etc. (and therefore the contribution of a researcher) – although they are chronological in construction and montage – may be supplemented with inlays of documents for illustration purposes. One might reasonably expect them to be able to:

– be analyzed as a whole and therefore segmented in a minimalist way in relation with the topic of the aforementioned event; this type of segmentation might be used for creating institutional or laboratory archives;

– be partially segmented according to the contributions of each speaker; this type of segmentation might be used for creating archives on a personality;

– give rise to one or more in-depth segmentations in one or more contributions, emphasizing particular points of argument about a topic relating to that of the event; this type of segmentation might be used for creating archives on a scientific domain, or a new audiovisual object using a selection of segments according to a topic.

Let us here cite a concrete example dealing with the segmentation of an international conference. This task was carried out as part of the ASW-HSS project, on the “XIIIe Rencontres sabéennes” [literally, the 13th Sabaean Conference]. The original videos can be viewed at: http://semioweb.msh-paris.fr/corpus/ada/2020/home.asp.

Each contribution was segmented within the original video of the conference, and analyzed as a separate audiovisual sequence. Then, an in-depth segmentation of each talk was carried out. This enabled the analyst (using the Publication Workshop of the ASW-HSS project) to create different scenarios of use in order to establish connections between the various points of view and different approaches to a theme relating to that of the conference.

Figure 2.3. Superimposed visualization of the segmentations which were carried out for the “XIIIe Rencontres sabéennes” conference

2.2.3. Example of the segmentation of an amateur video

We may imagine that an amateur video, such as a road-movie, whose creation and montage may not have been planned, might:

– give rise to an essentially basic visual segmentation, enabling the images to be re-appropriated for illustration purposes in the context of an educational folder;

– give rise to a sound-based segmentation enabling us to focus on a subject using a soundscape (natural ambient sound capture) allowing a sound illustration in the context of an educational games folder.

This was the type of segmentation used to work on the road-movie entitled “La ville de Hong-Kong. Une documentation audiovisuelle de la vie de tous les jours” (“The city of Hong-Kong. An audiovisual documentation of daily life”) which is available for consultation at: http://semioweb.msh-paris.fr/corpus/arc/1788/.

This road-movie presents many visual and sound aspects of daily life in Hong-Kong. The sequences are relatively short, and priority was given to real-life shots. Several segmentations of this video were carried out as part of the ASW-HSS project; one of these deals exclusively with the visual forms which come across through this document, particularly emphasizing a few key images which are representative of life in a neighborhood. The video was originally distributed in thematic chapters: catering, market, strolling through the streets, public transport and unusual sights of the city… This form of segmentation allows us to offer concrete illustrations which can accompany a scientific discourse, an educational folder or even a lecture, in domains ranging from human geography through spatial planning to contemporary urban anthropology.

Figure 2.4. Representation of the complete segmentation of the road-movie “The city of Hong-Kong”

2.2.4. Example of the segmentation of an audiovisual report

We may imagine that a report or documentary film which requires well thought-out and purposeful production and montage might:

– give rise to a visual and/or sound-based segmentation, for the segments to be retrieved as examples in a thematic folder;

– give rise to a segmentation targeted at textual, sound or visual effects (short relevant extracts) which might be combined to make up a radio- or AV-trailer or promote or distribute a scientific communication.

Let us now look at the complete segmentation10 of the documentary film “Iyambae: Ser Libre: la Guerra del Chaco (1932–1935)” (“Iyambae – Freedom – The Chaco War (1932–1935)”), presented on the PCM Website at: http://www.culturalheritage.fr/771_fr/.

Watching this documentary, we can easily see that the type of segmentation chosen highlights the sequencing processes linked with the montage and the objectives of the film: introduction of the topic, geographical presentation of the location of the conflict, testimony from former soldiers, historical reconstructions, consequences of the conflict. Yet it is precisely by choosing between these sequences that several types of folders or reading portfolios become possible. Concretely, in our work as part of the ASW-HSS project, we have favored the conception and publishing of a thematic folder dealing with the consequences of the Chaco war from the point of view of the indigenous people, by preselecting segments – on the basis of detailed analysis of the testimonies – and re-segmenting them so as to bring the relevant visual and sound characteristics of the testimonies to the fore.

Figure 2.5. Representation of the complete segmentation of the report “Iyambae” superimposed on that of a detailed segmentation of one element

2.2.5. Other possible segmentations

Finally, two other types of segmentations may also be envisaged and carried out:

– the first according to the distribution support of the object to be republished. Hence, distribution on tablets or mobile phones might lead one to seek out segments which are more precise, more relevant, and shorter, to exemplify an idea or concept;

– the second according to the pluri- or multilingual nature of the video, which will lead either to adapted forms of translation or to linguistic adaptations, in view of the desired objectives.

Let us take, as a concrete example – again carried out as part of the ASW-HSS project – a segmentation extracted from the report “Le club du Choro de Paris; musique populaire bresilienne” (“The Choro club in Paris: Brazilian popular music”), which can be viewed at: http://semioweb.msh-paris.fr/corpus/arc/1980/home.asp.

A partial segmentation of the initial video was carried out in order to select several key moments, enabling us to compose a podcast entitled: “The Choro, first discovery”. This podcast is essentially intended to be downloaded and watched on a mobile phone. A number of technical constraints have to be taken into consideration, particularly in terms of duration and accessibility of relevant soundtracks. The podcast should not be longer than 4 minutes, and the quality and interest of the sound are essential, in view of the fact that mobile podcasts are downloaded primarily to be heard rather than watched. The segmentation carried out will therefore focus on a short key textual extract, i.e. an explanatory remark – a definition taken from the interview with the professional – around which 3 minutes of soundbites representative of the type of music being discussed (here, choro, a genre of Brazilian popular music) will be inserted.
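The duration constraint described above can be checked with simple arithmetic. The following is a minimal sketch in Python, not a function of ASW Studio or Interview: the class and function names, and all the time-codes in the plan, are hypothetical and chosen for illustration only. Only the 4-minute ceiling and the "one short definition plus about 3 minutes of music" composition come from the text.

```python
from dataclasses import dataclass

MAX_PODCAST_SECONDS = 4 * 60  # upper bound stated in the text


@dataclass(frozen=True)
class Extract:
    """A selected passage of the source video (times in seconds)."""
    label: str
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def podcast_duration(extracts):
    """Total running time of the selected extracts, in seconds."""
    return sum(e.duration for e in extracts)


def fits_podcast(extracts, limit=MAX_PODCAST_SECONDS):
    """True if the selection respects the duration ceiling."""
    return podcast_duration(extracts) <= limit


# Hypothetical selection: one spoken definition and three music excerpts.
plan = [
    Extract("definition taken from the interview", 312.0, 352.0),  # 40 s
    Extract("music excerpt 1", 95.0, 155.0),                       # 60 s
    Extract("music excerpt 2", 610.0, 670.0),                      # 60 s
    Extract("music excerpt 3", 1020.0, 1080.0),                    # 60 s
]
```

With these assumed time-codes, the selection runs for 220 seconds and stays under the ceiling; adding a fourth 60-second music excerpt would exceed it, which is the kind of trade-off the analyst has to settle when segmenting for mobile distribution.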

Figure 2.6. Representation of the segmentation carried out with a view to putting a podcast online

As has already been said, we can clearly see that the work of segmentation is far from trivial; indeed it predetermines the acts of indexing and annotating which must be done in the Description Workshop. In addition, the motives which lead us to segment a video and the editorial choice will have to be clearly indicated and described in the “metadescription” part of the Workshop, as we shall see in the next chapter, section 3.2.2.1.

2.3. Appropriation of the segmentation workshop

The interface of the “Segmentation Workshop” opens on the screen as soon as the Interview software is launched. Any analysis of audiovisual corpora carried out using ASW Studio must necessarily start with this.

As with all software, there is a File menu which offers the possibility of loading and opening a video in this first window in order to “segment” it. As soon as the video is loaded into the workshop, the main functions for segmenting a video (in gray in Figure 2.8) appear.

Figure 2.7. The user interface for the Segmentation Workshop – initial view

Figure 2.8. Segmentation Workshop, video loaded, segmentation functions appear

As we have seen, “describing a video”, in the context of the ASW-HSS research project means either describing it as a linear whole (without breaking it up), or as its constituent parts (or sequences) or even describing it in a more-or-less partial way, focusing the analytical attention only on such-and-such a moment, such-and-such passage.

In the first case, the video is not split into segments; in the second and third cases, it is cut into as many segments as the analyst deems necessary for his/her work of analysis.
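The three cases can be told apart by looking at how much of the video the segments cover. The sketch below is an illustration only, under stated assumptions: segments are represented as (start, end) pairs in seconds, the function name and the tolerance parameter are hypothetical, and nothing here reflects how Interview stores its data.

```python
def classify_segmentation(segments, video_duration, tolerance=0.5):
    """Classify a segmentation as 'whole', 'complete' or 'partial'.

    segments: list of (start, end) pairs in seconds (assumed representation).
    tolerance: uncovered time, in seconds, still accepted as full coverage.
    """
    if not segments:
        return "whole"  # the video is described without being split

    covered = 0.0
    cursor = 0.0
    for start, end in sorted(segments):
        start = max(start, cursor)  # do not count overlapping time twice
        if end > start:
            covered += end - start
            cursor = end

    fully_covered = video_duration - covered <= tolerance
    if fully_covered and len(segments) == 1:
        return "whole"  # a single segment spanning the entire video
    return "complete" if fully_covered else "partial"
```

For a 600-second video, an empty list or the single pair (0, 600) counts as the first case, the pairs (0, 200) and (200, 600) as a complete segmentation, and the single pair (120, 180) as a partial one.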

The interface of the tool (Figure 2.9) reflects these various possibilities:

– analyzing the video as an unadulterated whole, which is the default setting; as we can see in Figure 2.9, once the video has been loaded a single segment is selected and its general parameters appear;

– analyzing the video segment by segment;

– analyzing only such-and-such a segment of the video.

Let us examine these possibilities on the interface in more detail, from a formal point of view:

Figure 2.9. The Segmentation Workshop, a glimpse of the functionality of the interface

As stated above, by default, the technical properties inherent to the video will be suggested when it is selected as a single segment.

As Figure 2.9 shows, the Segmentation Workshop offers different functions, most of which are presented using the metaphor of video-editing software.11

This applies particularly to the part for working on the video and its content (video window, play/stop buttons, cursor in the segmentation bar), which enables us to view and listen to the content. This sort of de-rushing is useful and necessary for creating textual, visual or sound bookmarks and for establishing, modifying or deleting segments during this initial phase of work; it is also essential when the analyst makes his/her description and wants to watch or listen to such-and-such a part anew.

This metaphor allows us to imagine, in an almost intuitive way, the main possibility offered by the segmentation bar: virtual segmentation. The cursor is used to move virtually, image by image, within the video and to point to a position x and then a position y, which correspond to the beginning and end of a selected segment. This function is coupled with the “Create segment” button.

The other properties displayed in the “Segment properties” pane enable us to:

– attribute a title to the segment (see below);

– exactly determine the duration of a segment by displaying its time-code;

– associate an icon with the segment – a visual representation of it – by choosing a still image to be displayed when it is published.
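The properties listed above suggest a simple data structure for a virtual segment. The following Python sketch is an assumption-laden illustration, not the internal format of Interview: the class name, the field names, the frame-based representation and the frame rate of 25 images per second are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

FPS = 25  # assumed frame rate; the source does not specify one


def to_timecode(frame: int, fps: int = FPS) -> str:
    """Format a frame count as HH:MM:SS:FF."""
    seconds, frames = divmod(frame, fps)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


@dataclass
class Segment:
    """A virtual segment: nothing is cut, only positions are recorded."""
    title: str
    start_frame: int                   # position x chosen with the cursor
    end_frame: int                     # position y chosen with the cursor
    icon_frame: Optional[int] = None   # still image shown on publication

    def __post_init__(self):
        if not 0 <= self.start_frame < self.end_frame:
            raise ValueError("a segment needs 0 <= start < end")
        if self.icon_frame is None:
            # Hypothetical default: use the first image of the segment.
            self.icon_frame = self.start_frame

    @property
    def duration_frames(self) -> int:
        return self.end_frame - self.start_frame

    def summary(self) -> str:
        """One-line overview: title, time-codes and duration."""
        return (f"{self.title}: {to_timecode(self.start_frame)} to "
                f"{to_timecode(self.end_frame)} "
                f"(duration {to_timecode(self.duration_frames)})")
```

Because the segment only stores positions within the source video, the same video can carry several such segments, or several alternative sets of them, without the file itself ever being altered.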

Assigning a title at this level stems from a calculated choice on the editor’s part – this implies looking at things from a publication point of view. The semantic function is important here, because it varies according to the context of use. We shall see that the Description Workshop offers the possibility of changing it and/or contextualizing it, say, by adding a subtitle.

Finally, before moving on to the Description Workshop, the “segmentation” panel shows us an overview, displaying: the title, duration and the icon representative of the virtual segments.

Without going into the technical details of cutting a video at this juncture, let us underline that it is carried out according to the needs or interests but also the knowledge and know-how of the analyst. Thus, we must distinguish between the many possibilities offered by the Segmentation Workshop in Interview and the quality (reliability, credibility, etc.) of the cutting, for which the analyst alone is responsible.

2.4. Some additional thoughts about segmentation

As has already been said, one might imagine many and diverse forms or types of cutting. The type of cutting of a video into one or a number of segments may simply rely on the beginning and end of a video shoot (i.e. the beginning and end of the action of the movie camera filming what is known as a pro-filmic situation). This is now largely automated and offered as a function in most commercial tools for video digitization and editing. The automatic cutting (and indexing) of digital videos is a field of technical research where the scientific as well as practical and economic stakes are extremely high.

Another type of cutting is based on filmic (semiological) analysis of a video, i.e. on analyzing the visual and sound shots, techniques of “visual mise-en-scène” and “sound mise-en-scène”, editing the visual shots into a coherent whole, synchronizing the visual and sound shots, etc.

Here, the type of cutting we (and, apparently, most users of video as a tool for expression, communication and sharing information and knowledge) are interested in, is based on what we call the video’s content, i.e. the fact that it seems or it is supposed to possess information (in the broadest sense of the word) aimed at an audience for whom it therefore constitutes a potential knowledge source or resource. Indeed, the task of cutting a video is considered in a similar way to the traditional task of identifying and “manually” extracting pieces of information (“sentences”, “sections”, etc.) within a book – pieces of information which interest the reader and which he classifies, comments upon and then keeps in the form of index cards (paper cards, as was the practice about 15 years ago, or digital cards, as is the case now). Yet, any justification (be it personal or professional) of the effort and investment (not only financial) represented by such a task of information extraction and subsequent processing relies on the (always fairly risky) assumption made by the reader (the analyst) of relevance, of the value (cognitive, emotional, practical, etc.) s/he attributes to a video (or to a part of the video).

Hence, many extremely varied types and forms of cutting may exist; the cutting of a video into thematically coherent segments (for the analyst) is the type we are most interested in here and which serves as a “guiding principle”, as we saw in section 2.2.

2.5. Perspectives relating to the segmentation workshop

As previously described, the analyst carries out a segmentation according to his/her (hereafter “his” for simplicity’s sake) intellectual abilities, which enable him to describe and select the textual, visual and sound elements in an audiovisual object which are relevant to his analysis, in line with an editorial choice motivated by the objectives of that analysis.

Hence, the perspectives relating to the development of this workshop are clearly defined as regards technicality and potential for success in segmentation.

Accordingly, in the context of a collaboration between ESCoM at the FMSH and the Research Department of the INA (Institut National de l’Audiovisuel), and through various French and European R&D projects,12 we considered the possibility of segmenting a still image and practicing several types of segmentation in parallel on one file or video, using an early prototype of a more sophisticated segmentation software package than Interview. The name of this prototype is Saphir Studio, developed by INA-Recherche.13

Currently, the ASW Segmentation Workshop uses Interview, first developed by INA-Recherche and then adapted by ESCoM to the ARA’s specific needs for publishing audiovisual corpora (see Chapter 7 for more detailed information). Today, this software, early versions of which date from 2004–2005, is beginning to show serious limitations.

The segmentation is represented in a single layer. The possibility of including two (or more) layers of segmentation using a multilayer segmentation bar would enable us to orient such-and-such a level of segmentation according to the objectives of one video. Similarly, the possibility of including segmentation within a selected still image, either as a still image inlaid within the video, or as the selection of a shot, would open the way to analyzing the visual elements making up the image, and creating thematic or educational folders dedicated to these images. Think, for example, of folders in Art History where a collection of recurring figures or color choices could be made explicit and contextualized according to a civilization, or folders on literature where typical and mythical characters could exemplify literary genres. Another possibility, which would involve a new sector of technical research, could be to take the acoustic dimension into account, enabling the soundtrack to be segmented and the sounds to be re-appropriated, with a view, e.g. to creating downloadable MP4 or MP3 podcasts which could illustrate a lecture or any other educational or cultural product.
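To make the idea of a multilayer segmentation bar more concrete, here is a minimal Python sketch of several parallel layers of segments over one video. It is an illustration under assumptions, not the design of Saphir Studio or of any existing tool: the class, its methods and the layer names are hypothetical, and times are expressed in seconds.

```python
from collections import defaultdict


class MultiLayerSegmentation:
    """Several independent layers of segments over a single video."""

    def __init__(self, video_duration: float):
        self.video_duration = video_duration
        self.layers = defaultdict(list)  # layer name -> [(start, end, title)]

    def add(self, layer: str, title: str, start: float, end: float) -> None:
        """Add a segment to a layer, keeping each layer in time order."""
        if not 0 <= start < end <= self.video_duration:
            raise ValueError("segment must lie within the video")
        self.layers[layer].append((start, end, title))
        self.layers[layer].sort()

    def at(self, t: float) -> dict:
        """Return {layer: title} for each layer with a segment covering t.

        If segments overlap within one layer, the later one wins.
        """
        return {
            layer: title
            for layer, segments in self.layers.items()
            for start, end, title in segments
            if start <= t < end
        }


# Hypothetical example: a thematic layer and a sound layer on one video.
video = MultiLayerSegmentation(video_duration=300.0)
video.add("thematic", "Catering", 0.0, 120.0)
video.add("thematic", "Market", 120.0, 300.0)
video.add("sound", "Street ambience", 100.0, 160.0)
```

Querying this structure at 130 seconds returns the "Market" segment on the thematic layer and "Street ambience" on the sound layer, which shows how one moment of the video can belong to different segments depending on the objective being pursued.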

These perspectives require that significant and profound changes be made in the computer part of the aforementioned software and, very probably, that it be replaced by more appropriate – and more sophisticated – “software solutions”. Thanks to the French research and development project SAPHIR, financed by the French ANR, these solutions were developed and tested as a “multilayer” segmentation prototype, enabling us to segment not only digital videos but also digital images, sound files or even .pdf files.

Although the software (i.e. “Interview”) currently used to make the ASW Segmentation Workshop work seems destined for imminent replacement, only time will tell what new “technological solution” will be adopted.


1 Chapter written by Elisabeth DE PABLO.

1 Chapter 1, “Context and Issues”.

2 The word “segmentation” is here considered in the sense of cutting the document into homogeneous subsets and does not necessarily correspond to the phrase “segmentation or sequence” as it is conceived in the technical vocabulary of production or in usual critical common parlance [AUM 93].

3 For information on this, see section 3.2.

4 Here, we speak of virtual segmentation. Obviously, this segmentation is carried out using a computer tool and is therefore “virtual”, rather than by way of a physical process of cutting, as can still be done with analog segmentation or montage.

5 Digital Audiovisual Archives, ISTE Ltd, London and John Wiley & sons, New York, 2012.

6 Note: the following lists are provided for exemplification purposes only – in no way do they constitute exhaustive lists of the possibilities, which must be determined according to the needs and constraints of the analysis.

7 This segmentation was carried out using an early version of the “Interview” software as part of the SAPHIR project.

8 The complete interview is available for consultation online on the ARA Website: http://www.archivesaudiovisuelles.fr/924/.

9 The site was built as part of the French project SAPHIR, and European projects LOGOS and DIVAS: http://www.culturalheritage.fr.

10 This segmentation was also carried out using the “Interview” software as part of the SAPHIR project.

11 In addition, help bubbles are available at any time by hovering over the buttons with the mouse.

12 Specifically, we refer here to the French research project SAPHIR (2006–2009, spearheaded by INA-Recherche and financially backed by the ANR), and the European research project LOGOS (2006–2009, led by Antenna Hungária and financed by the EC as part of its 6th CORDIS). For more information about these projects, see the ESCoM Website: http://www.semionet.fr.

13 This is a team of three engineers: Patrick Courounet, Steffen Lalande and Abdelkrim Beloued – with whom, as already explained in the introduction of this book, the ESCoM team have been collaborating for nearly 10 years.
