Chapter 8

The ASW Digital Environment 1

8.1. Introduction

As already stated in the introduction to this book, the working environment developed as part of the ASW-HSS project is the result of studies conducted from 2001 onwards as part of the Audiovisual Research Archives (ARA) program. Hence, it is an evolved version of an existing environment, and the culmination of nearly 10 years of research and development:

– 2002–2006 saw the simultaneous development of a database, a dynamic and multilingual Web portal connected to this database, and a software package for managing publications on the portal, in partnership with Microsoft France Education.1 In 2002, distributing video on the Web and putting a tool in place for the multilingual management of the content represented rather bold challenges: let us not forget that YouTube2 and Dailymotion3 were only created in 2005, that CMSs4 such as WordPress5 and Joomla!6 have only been in existence since 2003 and 2005 respectively, and SPIP7 only began offering multilingualism in 2004.

– 2006–2009 witnessed the development of a software package for indexing videos and producing specialized publications as part of the SAPHIR8 and LOGOS9 projects;

– since 2009, we have developed a whole new environment for the ASW-HSS10 project.

Thus, this environment exploits the results of the preceding years of research and responds to two mutually complementary types of contexts for using audiovisual documents:

– standard production/publishing as is done in the context of the Audiovisual Research Archives program, the result of which forms the basis of the ASW-HSS project;

– the ASW-HSS project, based on complex issues of semiotic description of audiovisual documents and specialized publications, constituting a reuse and a considerable valorization of a corpus as rich as that of the ARA.11

Hence, the development of the environment involved:

– developing a new model of data, intended for the sole purpose of publishing the data gathered during analysis, and a generalization of the use of a metalexicon of conceptual terms12 based on the results of our work on indexing from 2006 onwards;

– developing a database and Web services for the access to data;

– developing the mechanisms of access, control and data conversion;

– developing a new portal technology, Semiosphere, using both the functions developed for the ARA portal from 2002 onwards, and the specialized publishing functions explored since 2006.

This new environment brought many additions, which have not yet been fully exploited. Its power and capacity to evolve are related to two main factors.

First of all, by comparison to the environment of the ARA program presented in Chapter 1 of this book,13 the working processes are simplified by reducing the number of necessary tools and improving their complementarity (in addition, all of them can be used remotely with a simple Internet connection): a tool for editing the parameters of the environment, the ASW metalexicon of conceptual terms, the ASW thesaurus and the models of description; a tool for describing the videos (the work being simplified and better supported by the use of conceptual terms (or concepts) and models of description); and finally a very simple tool for publishing, since all the necessary information is already provided during the description.

Also, the fundamental added value of the ASW environment is that it is completely transposable to different contexts of use and to any organization other than the FMSH or the ESCoM, enabling any organization or institution to define the parameters of the environment as it sees fit:

1. The only constraint relates to the use of Microsoft technologies (for the historical reasons explained above) for the database and the Web servers. On the other hand, there is no restriction on geographical location and the access paths are fully adjustable.

2. One of the fundamental strengths of the environment is that everything is piloted by the ASW metalinguistic resources, i.e. the ASW domain ontologies (metalexicon of conceptual terms, the thesaurus and the models of description), which are easy to edit and translate using the ESCoM OntoEditor software which is now part of the ASW Conceptual Modeling Workshop. Therefore, we are able to choose the parameters of:14

- the media: formats (wmv, mpeg, jpeg, etc.), modes of distribution (progressive download,15 streaming,16 mobile diffusion, etc.) and resolutions (16/9, 4/3, low bandwidth, high speed bandwidth, etc.),

- the types of rights and of user activity,

- the standards (Dublin Core,17 LOMFR,18 DEWEY,19 etc.) for importing and exporting data from/to other environments,

- the types of state of a publication (in progress, forthcoming, online, etc.),

- the thesaurus of the ASW environment bringing the items together by category. One of the strengths of the ASW system is that each item may be classified under several categories (for example, Victor Hugo would appear in the lists of French writers and French politicians),

- the basic types enabling us to configure the forms when developing the models of description: text, number, date, category of thesaurus, etc.,

- the metalexicon of conceptual terms peculiar to the organization,

- the common models of description (generic models of description of an author, a place, etc.),

- the models of description peculiar to each domain of knowledge (in the case of the ASW-HSS project, these are the domains of expertise of the experimentation workshops CCA, LHE, ArkWork, AICH and AACH).

3. The possibility for an organization to “personalize” the ASW metalinguistic resources (metalexicon of conceptual terms, thesaurus, and models of description) provides adjustability when it comes to the content description.

4. Not only may the types of video distribution server be located anywhere, but they may also use any technology (downloading, streaming Windows Media, streaming Flash, etc.). In order to change the formats of distribution, the user need only update the media settings via the ESCoM OntoEditor in the Conceptual Modeling Workshop.

5. Each portal which is displayed uses the same technology, Semiosphere, and may be automatically updated without modifying the parameters which are peculiar to the portal. Moreover, Semiosphere offers a simple function for personalizing the portals:

- use of a single setup file,

- personalization of the design using page layout templates, stylesheets and images peculiar to each portal,

- tools for creating headings, editing content peculiar to a CMS, and tools enabling us to choose and define the parameters of the types of access to the content (thesaurus, thematic access, direct access, data accompanying a video, etc.).

The following sections of this chapter present and then describe the principles, interactions and realizations of the technologies developed for the ASW digital environment.

8.2. General presentation

8.2.1. Management of roles and rights

Every activity within the ASW Studio environment is systematically subject to a verification of the identity and rights of the user wishing to create, modify or delete information. Hence, when any tool is launched, the user must first enter his ID and the password that he was assigned by the administrator of the digital environment. There are various reasons for this, and not all of them are restrictive:

– the history function records the user’s intervention, ensuring he will appear as an author or co-author of the work (metalinguistic resources, analysis, publication, etc.);

– when his work is completed, the user may even specify what the nature of his contribution was (translation, analysis, proofreading, etc.);

– before anything is saved in the system, the user’s rights are verified depending on the type of action requested and the item to be saved;

– this system ensures that this environment may be used in any institutional context.

8.2.1.1. The roles

The roles are the types of users that interact with the environment. We can distinguish six roles (see Figure 8.1):

Figure 8.1. The roles in the ASW environment

1. Administrator: configures the portals, creates the users and defines their rights.

2. Knowledge Engineer: configures the environment by editing the metalexicon of conceptual terms, the types of media, the models of description, the thesaurus and the parameters of the Web portal.

3. Director: saves new audiovisual productions via the portal.

4. Analyst: describes the audiovisual documents.

5. Translator: translates the vocabulary of the environment and metadescriptions.

6. Author: manages the publications on the portal.

Note that the role of “Author” is necessarily associated with a specific Web portal, whereas the other roles apply to the whole environment.

8.2.1.2. The activities

The activities are the different types of interactions between a user and the ASW environment. They enable us to more precisely define the users. We can distinguish five activities:

1. All activities: represents the set of activities below. (Thus, it is possible to define a super-administrator of the environment by giving him the “Administrator” right of “All activities”).

2. Modeling: represents activities on the ASW domain ontologies.

3. Production: represents activities on the media.

4. Analysis: represents activities of analysis of the audiovisual corpora.

5. Publishing: represents the activities on a particular Web portal.

8.2.1.3. The rights

Within this working environment, we can distinguish five types of rights:

1. Administrator: no restriction. The administrator may also grant or rescind rights to/from other users.

2. Moderator: no restriction. The difference is that the moderator cannot grant rights to other users.

3. Author: may record new information. The author may also modify and remove information that he has created. The author may neither modify nor remove information which was created by another user.

4. Translator: may add linguistic data to existing information. The translator may neither create new information, nor remove existing information.

5. Reader: may only consult information, without modifying it.

Each user may obtain different authorizations when acquiring rights for the different activities, hence being able to play several roles at the same time (with different attributions for each of these roles).

For example, a bilingual French-Spanish user who is in charge of the site and produces videos will be granted:

– the “Author” right for the “Production” activity;

– the “Translator” right for the “Modeling” and “Analysis” activities;

– the “Administrator” right for the “Publishing” activity (only for its Web portal).
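The way rights and activities combine can be pictured with a short sketch. It is written in Python for brevity, not in the C# of the actual libraries, and it is only an illustration: the names Grant, find_right and is_allowed, as well as the action labels (read, create, modify, delete, translate, grant), are our own assumptions. We also assume that every right includes read access, and we ignore the fact that "Publishing" rights are attached to a particular portal.

```python
from dataclasses import dataclass

# Rights as named in the text, ordered from the least to the most restricted.
RIGHTS = ("Administrator", "Moderator", "Author", "Translator", "Reader")


@dataclass(frozen=True)
class Grant:
    """Hypothetical pairing of a right with an activity."""
    right: str
    activity: str  # "Modeling", "Production", "Analysis", "Publishing" or "All activities"


def find_right(grants, activity):
    """Return the least restricted right held for an activity, or None."""
    held = [g.right for g in grants if g.activity in (activity, "All activities")]
    return min(held, key=RIGHTS.index) if held else None


def is_allowed(grants, activity, action, is_owner=False):
    """Decide whether an action is permitted; is_owner tells whether the
    user created the information concerned."""
    right = find_right(grants, activity)
    if right is None:
        return False
    if right == "Administrator":
        return True                      # no restriction
    if right == "Moderator":
        return action != "grant"         # cannot grant rights to others
    if right == "Author":
        if action in ("read", "create"):
            return True
        return action in ("modify", "delete") and is_owner
    if right == "Translator":
        return action in ("read", "translate")
    return action == "read"              # Reader


# The bilingual user of the example above.
example_user = [
    Grant("Author", "Production"),
    Grant("Translator", "Modeling"),
    Grant("Translator", "Analysis"),
    Grant("Administrator", "Publishing"),
]
```

With these grants, the user may translate an ontology but not create one, and may modify a media record only if he created it.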

8.2.2. The technologies

The ASW digital environment relies on a set of tools and services, some using existing technologies, others using technologies developed by ESCoM (see Figure 8.2):

Figure 8.2. Overview of the ASW environment

– ESCoM Suite, a set of office automation applications which is used for prepublishing activities – development of ontologies, encoding of the media, semiotic descriptions – developed by ESCoM (see Chapter 9);

– Semiosphere, a Web application for managing users, media and publications developed by ESCoM. Semiosphere is also the technology used for the Web portals on which the publications are placed (see Chapters 9 and 10);

– SemioscapeLibrary, a set of libraries of classes and methods which are common to all these tools, developed by ESCoM (see section 8.3).

These tools communicate with a set of servers to process and distribute data:

– a streaming server, Windows Media Server, broadcasting videos in wmv format in streaming mode (mms protocol);

– a streaming server, Flash Red5, broadcasting videos in flv format in streaming mode (rtmp protocol);

– a Web server hosting the Web applications with Semiosphere technology;

– Semioscape, a server for storing and processing data developed by ESCoM.

The technologies developed by ESCoM for the ASW environment are described in the following sections.

8.2.3. The working process in the ASW environment

Here are the typical activities carried out in the context of a study on a portal using the ASW Studio, enabling us to better understand the interactions between the users and the system:

1. The administrator connects to Semiosphere, registers the users and attributes administrator’s rights for each activity;

2. The knowledge engineer connects to OntoEditor, specifies the domain ontology describing the objects of analysis of the portal’s domain of expertise using a library of already-existing metalinguistic resources, via the ASW Conceptual Modeling Workshop (hierarchy of conceptual terms, classes of models of description, classes of sequences and schemas making up a model of description, micro-thesaurus with facets, etc.):

a. The specified domain ontology is sent to Semioscape by Web service,

b. Semioscape saves it in the database;

3. The translator connects to OntoEditor to translate the specified ontology:

a. He selects the ontology from those saved in the database,

b. He carries out his translation,

c. The ontology is sent to Semioscape by Web service,

d. Semioscape updates the database;

4. The administrator displays and configures the portal on the Web server;

5. The administrators of each activity log in to Semiosphere and attribute rights to the users depending on their activity;

6. The director adds a digitized and edited media file. He launches ffCoder and encodes it in wmv and flv, in medium and high definition (in total, four files for the same media);

7. The director logs in to Semiosphere and saves the media using a form, specifying the different versions of the files (i.e. the 4 previously encoded files):

a. the encoded files are uploaded onto the streaming servers,

b. the media is saved in the database;

8. The analyst logs in to Interview. He carries out the analysis and evaluation of the media:

a. He selects the media from the database,

b. The media is downloaded onto his computer,

c. First, the analyst segments the media using the Segmentation Workshop, then, in the ASW Description Workshop, carries out a semiotic description of the whole media as well as of each segment, in the form of a metadescription,

d. The metadescription is sent to Semioscape by Web service,

e. Semioscape saves the metadescription in the database;

9. The translator logs in to Interview via the Description Workshop. He translates the analyst’s expert report:

a. He selects the metadescription from those in the database,

b. He translates it,

c. The metadescription is sent to Semioscape by Web service,

d. Semioscape updates the metadescription in the database;

10. The author logs in to Semiosphere via the ASW Publishing Workshop. He creates a publication which can be organized in hierarchical levels:

a. For each hierarchical level, the author may associate one or more metadescriptions among those saved in the database,

b. The author indicates the state of his publication: “in progress”, “forthcoming”, “to be translated (into a given language)” or “in waiting”,

c. The publication is saved in the database;

11. The translator logs in to Semiosphere via the ASW Publishing Workshop. He receives a notification stating that a publication needs to be translated:

a. He translates the publication,

b. The publication is updated in the database;

12. The moderator of the portal logs in to Semiosphere via the ASW Publishing Workshop. He receives a notification stating that a publication is waiting. Having viewed it, he decides to upload it:

a. He modifies the state of the publication, to “online”,

b. The publication is updated in the database,

c. The publication is instantly accessible from the portal and the semiotic descriptions of the associated metadescription(s) feed into the headings for the different accesses (by theme, by genre, by use, by author, etc.) of the portal,

d. The corresponding media is diffused in streaming mode from the streaming servers, from the Web server.

Thus, the ASW environment enables us to carry out all the necessary steps for the management of audiovisual archives, according to a consistent working process with a limited number of tools used, and where the roles and rights of all parties are perfectly defined and controlled.20
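Steps 10 to 12 amount to a small moderation rule on the state of a publication, which may be sketched as follows. This is a Python illustration under our own assumptions, not the code of Semiosphere: the function change_state and the dictionary representation of a publication are hypothetical, and we assume that only a user holding the "Moderator" or "Administrator" right may put a publication online.

```python
# States which, according to step 10, the author may indicate himself.
AUTHOR_STATES = {"in progress", "forthcoming", "to be translated", "in waiting"}


def change_state(publication, new_state, right):
    """Return a copy of the publication with its new state.

    Assumption: putting a publication online is reserved for the
    "Moderator" and "Administrator" rights."""
    if new_state == "online":
        if right not in ("Moderator", "Administrator"):
            raise PermissionError("only a moderator may put a publication online")
    elif new_state not in AUTHOR_STATES:
        raise ValueError(f"unknown state: {new_state}")
    return {**publication, "state": new_state}


draft = {"alias": "example-publication", "state": "in progress"}
waiting = change_state(draft, "in waiting", "Author")
online = change_state(waiting, "online", "Moderator")
```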

8.3. SemioscapeLibrary

8.3.1. The abstraction layers

SemioscapeLibrary was developed in C# (.NET Framework 4). As shown in Figure 8.3, it is composed of three abstraction layers:

1. The objects layer, called SemioscapeEntities, is the main layer. It contains the declaration of the classes being used by the set of other layers, as well as by the applications. These classes are described in the next section.

2. The layer of access to the data defines the methods for accessing the data, i.e. the read and write requests in the database as well as useful methods for handling objects. An application accesses the data in several steps, since applications cannot connect directly to the database server:

a. the client sends a request by a Web service,

b. the Web service contacts the library of methods for accessing the data,

c. in the case of a write request, the transaction is implemented on a DataSet which is hosted by a Web service (the read accesses are implemented directly on the database),

d. if the write action on the DataSet is successful, the database is updated from the DataSet (hence, the integrity of the database is preserved at all times, even in the case of simultaneous contradictory requests),

e. the Web service sends a response to the client;

3. The data processing layer carries out the processing which is necessary before any data is retrieved and before any request for data access is executed:

- control of the rights of the users making a request,

- possible conversion of the data (if the client uses a different procedure from the one which is defined in the objects layer),

- validation of the data.
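The write path described in steps a to e of the data access layer (write on an intermediate copy, then update the database only if the write succeeded) can be illustrated by the following sketch. It is written in Python and deliberately simplified: the class StagedStore is hypothetical, a dictionary stands in for the database, and a deep copy stands in for the DataSet, which in the real environment is a .NET object hosted by a Web service.

```python
import copy


class StagedStore:
    """Toy store in which writes are first applied to a staging copy."""

    def __init__(self):
        self.database = {}  # stands in for the real database

    def read(self, key):
        # Read accesses go directly to the database.
        return self.database.get(key)

    def write(self, key, value, validate):
        # 1. Apply the write to a staging copy (the role played by the DataSet).
        staging = copy.deepcopy(self.database)
        staging[key] = value
        # 2. If the staged result is not valid, the database is left untouched.
        if not validate(staging):
            return False
        # 3. Otherwise the database is updated from the staging copy.
        self.database = staging
        return True


store = StagedStore()
# Hypothetical validation rule: every stored title must be non-empty.
titles_are_filled = lambda data: all(title for title in data.values())
accepted = store.write("media-1", "An interview", titles_are_filled)
rejected = store.write("media-2", "", titles_are_filled)
```

In this sketch the second write fails validation, so the database still contains only the first record.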

It should be noted that SemioscapeLibrary also contains two other libraries (which are not shown in Figure 8.3):

4. A library of resources (Semioscape Resources) which brings together the common images and translations of the interfaces;

5. A library of controls used by the applications (Semioscape UserControls).

Figure 8.3. Overall view of SemioscapeLibrary

8.3.2. The objects layer

The SemioscapeEntities classes form part of a data model which is specific to the environment. Here, they are succinctly described and grouped according to the types of objects which are often manipulated by the users: users, media, ontologies, metadescriptions and publications. This library implements the namespace Escom.Semioscape.Entities.

8.3.2.1. The common classes

Table 8.1 shows the classes representing fairly generic objects which are used by the other classes: languages, links and rich descriptions.

Table 8.1. The common classes in SemioscapeEntities

| Class | Main properties | Description |
|---|---|---|
| Culture | String Code; String Language; String Localization; String Name | Culture, i.e. language-country association, with the language code following the ISO 639-1 standard21 (for example, the French language as spoken in Belgium is represented by the code “fr-be”). |
| Link | String URL; String Name | Hyperlink. |
| RichDescription | String TextContent; List<Link> Links | Rich description, i.e. text associated with hyperlinks. |
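To make the common classes of Table 8.1 more concrete, here is a minimal sketch of how they might look. It is a Python transposition for illustration only (the actual classes are written in C#); the property names are adapted from the table, while the helper from_code, which splits a code such as “fr-be” into its language and localization parts, is our own assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Culture:
    code: str
    language: str
    localization: str
    name: str = ""

    @classmethod
    def from_code(cls, code, name=""):
        """Hypothetical helper: split "fr-be" into "fr" and "be"."""
        normalized = code.lower()
        language, _, localization = normalized.partition("-")
        return cls(normalized, language, localization, name)


@dataclass
class Link:
    url: str
    name: str


@dataclass
class RichDescription:
    text_content: str
    links: List[Link] = field(default_factory=list)


belgian_french = Culture.from_code("fr-be", "French (Belgium)")
```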

8.3.2.2. The user classes

Table 8.2 shows the classes which are used for managing users: personal data, history and rights.

Table 8.2. The user classes in SemioscapeEntities

| Class | Main properties | Description |
|---|---|---|
| User | Guid UserId; String UserName | User identified by a unique ID and password (login). |
| UserAction | Guid UserId; Guid ActionId; DateTime ActionDate; String ActionDescription | Action of a user on an object. |
| UserHistory | List<UserAction> UserActions | History of the activities of the users on the same object. |
| UserRole | Guid UserId; String Role; String Activity | A user’s right, i.e. the user-role-activity association. |
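The history classes of Table 8.2 are what allows a user to appear as an author or co-author of a piece of work. The following Python sketch (again an illustration, not the C# code of the library) shows one way this could work; the methods record and contributors are hypothetical names of our own.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List
from uuid import UUID, uuid4


@dataclass
class UserAction:
    user_id: UUID
    action_id: UUID
    action_date: datetime
    action_description: str


@dataclass
class UserHistory:
    user_actions: List[UserAction] = field(default_factory=list)

    def record(self, user_id, description, when=None):
        """Append a new action of a user on the object."""
        action = UserAction(user_id, uuid4(), when or datetime.now(), description)
        self.user_actions.append(action)
        return action

    def contributors(self):
        """User IDs in order of first intervention, without duplicates."""
        seen = []
        for action in sorted(self.user_actions, key=lambda a: a.action_date):
            if action.user_id not in seen:
                seen.append(action.user_id)
        return seen
```

The description passed to record can carry the nature of the contribution (translation, analysis, proofreading, etc.).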

8.3.2.3. The media classes

Table 8.3 shows the classes which are used for manipulating the media and its formats.

Table 8.3. The media classes in SemioscapeEntities

| Class | Main properties | Description |
|---|---|---|
| MediaFile | String FileName; String PublicDirectory; Guid TypeId; Guid TypeOfDiffusionId; Guid TypeOfResolutionId | Media file. |
| Media | Guid Id; String Title; String UniqueName; DateTime DateOfRealization; String Duration; List<MediaFile> ListOfMediaFiles | Media. A media object may contain several files: hence we can define several formats, resolutions and modes of distribution for the same media. |
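The relation between a media object and its files may be sketched as follows, in Python and for illustration only. The property names are adapted from Table 8.3; the method find_file and the way the type identifiers are generated are our own assumptions (in the real environment these identifiers refer to entries of the ASW ontologies).

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List, Optional
from uuid import UUID, uuid4


@dataclass
class MediaFile:
    file_name: str
    public_directory: str
    type_id: UUID
    type_of_diffusion_id: UUID
    type_of_resolution_id: UUID


@dataclass
class Media:
    id: UUID
    title: str
    list_of_media_files: List[MediaFile] = field(default_factory=list)

    def find_file(self, type_id, resolution_id) -> Optional[MediaFile]:
        """Return the file matching a format and a resolution, if any."""
        for media_file in self.list_of_media_files:
            if (media_file.type_id == type_id
                    and media_file.type_of_resolution_id == resolution_id):
                return media_file
        return None


# Hypothetical identifiers for two formats, one mode of distribution
# and two resolutions.
FORMATS = {"wmv": uuid4(), "flv": uuid4()}
STREAMING = uuid4()
RESOLUTIONS = {"medium": uuid4(), "high": uuid4()}

interview = Media(uuid4(), "An interview")
for fmt, res in product(FORMATS, RESOLUTIONS):
    interview.list_of_media_files.append(
        MediaFile(f"interview_{res}.{fmt}", "/media", FORMATS[fmt],
                  STREAMING, RESOLUTIONS[res]))
```

Two formats in two resolutions give the four files mentioned in the working process of section 8.2.3.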

8.3.2.4. The ontology classes

Table 8.4 shows the classes used for manipulating the ASW metalinguistic resources: hierarchies, branches of a hierarchy, multilingual annotations and equivalents in other standards.

In order to fully comprehend the use of the properties of the hierarchies and of DefinitionsFormats, it should be borne in mind that the metalanguage of description is divided into three main classes of conceptual terminology (Figure 8.4).

1. The “domain ontology” category enables us to define the models of description which are peculiar to a domain of knowledge (domains such as cultural diversity, literary heritage or research on archeology). A model of description is composed of a selection of conceptual terms that are to be specified (to be indexed). Each of these models is defined by a functional organization, a hierarchical organization and a configurational organization:

a. The functional organization of a description model. A description model is typically divided into: (1) a part reserved for the identification and description per se of an object or domain of knowledge; (2) a part reserved for the (spatial, temporal or even contextual) localization of the object or domain of knowledge to be described; (3) a part reserved for the “discourse analysis” of the thematization, the object or domain of knowledge of the source video; and, finally, (4) a “metatextual” part composed essentially of the analyst’s comments, explanations, etc.,

b. The hierarchical organization of a description model. Each functional part of a descriptive form is composed of local sequences which guide the analyst in indexing it. For example, the functional part “temporal localization” (of a work of literature, of an archeological excavation in France, of a language spoken in a given region of France, etc.) may be constituted of a sequence “localization within a period of French History” and “chronological localization by century”,

c. The configurational organization of a description model. The sequences of a form are composed of a selection of generic schemas (or “modules”) containing either the notions or concepts which have to be entered, or the values (referents) to be respected when carrying out a concrete description of a video. For example, several forms of description of the LHE domain (dedicated to literary heritage) contain a sequence “Identifying a literary work”. This sequence is composed of a set of schemas (or modules): “Write the headings”; “Select the literary genre”; “Select the author”, etc. Each of these schemas contains the concepts or notions to be specified (the surname and forename of the author, for instance).

Figure 8.4. Intra-ontological relations

Table 8.4. The Ontology classes in SemioscapeEntities

| Class | Main properties | Description |
|---|---|---|
| Benchmarking | String Benchmark_Unique; String Code | Standard-code association. Enables us to define the code of a Field in another standard (e.g. the field called “Mathematics” in the ASW thesaurus corresponds to code 510 in the DEWEY classification). |
| Annotation | String Language; String Name; String PublicName; UserHistory History | Annotation of an element in a language. It is possible to specify a different annotation according to the context (for example, the property Name is used for displaying the item to the analysts, but it is the less technical property called PublicName which is displayed in Semiosphere). |
| Field | Guid Id; String UniqueName; List<Annotation> Annotations; Annotation CurrentAnnotation; List<Field> Fields; Bool IsDefinition; List<Guid> DefinitionsFormats; List<Benchmarking> Benchmarks; UserHistory History | Branch of a hierarchy (conceptual term), identified both by a unique ID and a unique name. Its appellations in different languages are stored in the Annotations, CurrentAnnotation automatically referring to the Annotation in the user’s language. The hierarchical structure is ensured by the Fields property. The DefinitionsFormats refer to Field IDs defined in another object: this mechanism facilitates the construction of semiotic models (see Figure 8.4). The IsDefinition property indicates that the DefinitionsFormats represent semiotic models with predefined values. |
| Ontology: Field | Guid TypeOfOntology; Guid OntologyVersionId; Guid ConceptsVersionId; Guid DefinitionOntologyId; UserHistory History | (Inherits from Field) Complete hierarchy which is able to define a thesaurus, a domain ontology, a metalexicon of conceptual terms, models of description, etc.22 The different properties enable us to define: the type of ontology (domain, concepts, thesaurus, configuration); the thesaurus which is used; the ontology of concepts which is used; the hierarchy containing the Fields which are identified in the DefinitionsFormats property. |

2. The “metalexicon of conceptual terms” category. This is a set of basic files representing the “world” or the domain of knowledge according to the model-maker’s own point of view. In particular it contains two central files which are extremely important:

a. the file with the conceptual vocabulary of the ASW-HSS domain. Any description of an audiovisual resource relies, in fine, on this conceptual vocabulary which designates the main notions or themes to be specified (to be “indexed”) so as to account for the content and its expression in a text;

b. the file with schemas of indexing enabling the analyst to specify each relevant conceptual term. In addition to a family of schemas of linguistic indexing, this file provides possibilities for non-linguistic indexing (in the form, for example, of visual icons, or of acoustic or visual extracts typical of the video under consideration), for textual indexing-qualification (in the form of short qualifying sentences enabling the analyst, if he so desires, to better situate a specific conceptual term), for numerical and geographical indexing, as well as for indexing by terms from the thesaurus, which constitute pre-established values for a given conceptual term (e.g. for the conceptual term “Country”, one type of pre-established values is the list of countries published under the authority of the UN).

3. The “Static ontology” category which is peculiar to the ASW computer system. It enables us to define the usual types of data (text, numbers, dates, etc.) or the types of data which have been introduced by the model-maker as well as, more particularly, the thesauruses which are associated with these types.

Hence, the metalexicons of conceptual terms are a very important tool in the sense that they enable us on the one hand to “contain” the proliferation of the conceptual vocabulary of the ASW-HSS domain and on the other hand to reasonably easily create “bridges of correspondence” (of “translatability”) between the approach which was developed in the ASW-HSS project and the whole wealth of glossaries, vocabularies, terminologies, etc. used elsewhere.
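The Field class of Table 8.4, on which all these hierarchies rest, can be illustrated by the following sketch. It is a simplified Python transposition, not the C# class itself: only some of the properties are kept, CurrentAnnotation is modelled as a method taking the user's language, the fallback to the first annotation when no annotation exists in that language is our own assumption, and the helper find is hypothetical. The "Mathematics" example and its DEWEY code 510 come from the table; the French annotation and the parent branch "Sciences" are invented for the illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from uuid import UUID, uuid4


@dataclass
class Annotation:
    language: str
    name: str
    public_name: str = ""


@dataclass
class Benchmarking:
    benchmark_unique: str  # name of the external standard
    code: str              # code of the Field in that standard


@dataclass
class Field:
    unique_name: str
    annotations: List[Annotation] = field(default_factory=list)
    fields: List["Field"] = field(default_factory=list)
    benchmarks: List[Benchmarking] = field(default_factory=list)
    id: UUID = field(default_factory=uuid4)

    def current_annotation(self, language) -> Optional[Annotation]:
        """Annotation in the user's language, else the first one (assumption)."""
        for annotation in self.annotations:
            if annotation.language == language:
                return annotation
        return self.annotations[0] if self.annotations else None

    def find(self, unique_name) -> Optional["Field"]:
        """Depth-first search of a branch by its unique name."""
        if self.unique_name == unique_name:
            return self
        for child in self.fields:
            found = child.find(unique_name)
            if found is not None:
                return found
        return None


mathematics = Field(
    "Mathematics",
    annotations=[Annotation("en", "Mathematics"), Annotation("fr", "Mathématiques")],
    benchmarks=[Benchmarking("DEWEY", "510")],
)
thesaurus = Field("Thesaurus", fields=[Field("Sciences", fields=[mathematics])])
```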

8.3.2.5. The object of analysis classes

A metadescription23 uses a set of classes representing the objects which may be described during a semiotic description: acoustic shot, visual shot, thematic level, rhetorical level, uses, actors, additional resources, translations, references and legal citations (Table 8.5).

Table 8.5. The objects of analysis classes in SemioscapeEntities

| Class | Main properties | Description |
|---|---|---|
| SlotObject | Guid Id; String Name; String Title; Guid TypeId; RichDescription Description | Metaclass. Each object is represented by a value (Title), an internal name (Name), a rich description and is necessarily typified by a Field-type element (therefore, it is defined in an ontology). |
| AudioObject: SlotObject | List<AudioPlan> AudioPlans | (Inherits from SlotObject) Acoustic object, which may be associated with acoustic techniques. |
| AudioPlan: SlotObject | | (Inherits from SlotObject) Acoustic technique. |
| Company | Guid Id; String Name; String Acronym | Collective entity (institution, company, university, etc.) |
| Concept: SlotObject | String Scheme | (Inherits from SlotObject) Concept of a model of description. The exact location of the concept in the model of description is recorded in the Scheme property, in the form: ID_of_the_sequence/ID_of_the_schema/ID_of_the_concept/ID_of_the_schema/ID_of_the_sign. A concept is necessarily associated with a static type (a text, a number, etc. or even a category of the thesaurus). The ID of the type is defined by the TypeId property, whereas its unique name is defined by the Name property. |
| Context: SlotObject | Guid TypeOfDestineeId | (Inherits from SlotObject) Use case. Associates a use case (TypeId) with a type of addressee/recipient (TypeOfDestineeId). |
| Copyright: SlotObject | | (Inherits from SlotObject) Copyright. |
| Member: SlotObject | String Forename; String Email; String Activity; Company Company | (Inherits from SlotObject) Natural person/entity. May represent a participant, a director, an author, a user, etc. |
| Notice | String HowToQuote; List<Copyright> Copyrights | (Inherits from SlotObject) Legal rights. |
| Pattern: SlotObject | List<Concept> Concepts | (Inherits from SlotObject) Model of description, i.e. topic identified during the discourse analysis. |
| Reference: SlotObject | | (Inherits from SlotObject) Bibliographical reference. |
| Resource: SlotObject | String URL; String Author | (Inherits from SlotObject) Resource (book, Website, magazine, etc.) |
| Rhetoric: SlotObject | | (Inherits from SlotObject) Rhetorical level. |
| Translation: SlotObject | String Language | (Inherits from SlotObject) Translation of the discourse (literal translation, adapted translation, subtitling, etc.) |
| VideoObject: SlotObject | List<VideoPlan> VideoPlans | (Inherits from SlotObject) Visual object, which may be associated with visual techniques. |
| VideoPlan: SlotObject | | (Inherits from SlotObject) Visual technique. |
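The Scheme property of the Concept class in Table 8.5 records a location made up of five identifiers separated by slashes. A reader wishing to exploit it might decompose it as in the following Python sketch; the names SchemeLocation and parse_scheme are hypothetical, as is the name second_schema_id given to the fourth component, whose exact role is not detailed here.

```python
from typing import NamedTuple


class SchemeLocation(NamedTuple):
    sequence_id: str
    schema_id: str
    concept_id: str
    second_schema_id: str
    sign_id: str


def parse_scheme(scheme: str) -> SchemeLocation:
    """Split "sequence/schema/concept/schema/sign" into its five parts."""
    parts = scheme.strip("/").split("/")
    if len(parts) != 5 or not all(parts):
        raise ValueError(f"expected five non-empty identifiers, got: {scheme!r}")
    return SchemeLocation(*parts)


location = parse_scheme("seq-01/sch-02/con-03/sch-04/sig-05")
```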

8.3.2.6. The metadescription classes

Table 8.6 shows the classes which are used to manipulate objects representing the structure of a semiotic description: metadescription, video and segments.

Table 8.6. The metadescription classes in SemioscapeEntities

Class: VideoShot
Main properties: Guid Id; Guid TypeId; String Title; String BeginTime; String Duration; String EndTime; Notice Notice; List<Guid> Subjects; List<Member> Members; List<Resource> Resources; List<Translation> Translations; List<VideoObject> VideoObjects; List<AudioObject> AudioObjects; List<Pattern> Patterns; List<Rhetoric> Rhetoric; List<Context> Contexts; UserHistory History
Description: Segment of a video, defined by a beginning, a duration and an end. May be described by all the types of objects previously described.

Class: Video
Main properties: Guid Id; Guid TypeId; Guid MediaId; String Title; List<VideoShot> shots; Notice Notice; List<Guid> Subjects; List<Member> Members; List<Resource> Resources; List<Translation> Translations; List<VideoObject> VideoObjects; List<AudioObject> AudioObjects; List<Pattern> Patterns; List<Rhetoric> Rhetoric; List<Context> Contexts; UserHistory History
Description: Video. The corresponding media is defined by the MediaId property. The segmentation is defined by the shots property. May be described by all the types of objects previously described.

Class: MetaDescription
Main properties: Guid Id; Guid TypeId; Guid DomainOntologyId; Video VideoDocument; Notice Notice; List<Reference> References; UserHistory History
Description: Metadescription. The domain ontology which was used for the semiotic description24 is defined by the DomainOntologyId property. The described video is defined by the VideoDocument property.
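
The containment described in Table 8.6 (a MetaDescription holds a Video, which holds its VideoShot segments) can be sketched as follows. This is only an illustrative Python model, not the C# code of SemioscapeEntities: it keeps a small subset of the listed properties, and the "HH:MM:SS" time strings and the sample titles are assumptions.

```python
from dataclasses import dataclass, field
from typing import List
from uuid import UUID, uuid4


@dataclass
class VideoShot:
    """Segment of a video (subset of the properties of Table 8.6)."""
    title: str
    begin_time: str  # times are strings in the table; the format is assumed
    duration: str
    end_time: str
    id: UUID = field(default_factory=uuid4)


@dataclass
class Video:
    """Described video; media_id designates the corresponding media."""
    title: str
    media_id: UUID
    shots: List[VideoShot] = field(default_factory=list)
    id: UUID = field(default_factory=uuid4)


@dataclass
class MetaDescription:
    """Ties the described video to the domain ontology that was used."""
    domain_ontology_id: UUID
    video_document: Video
    id: UUID = field(default_factory=uuid4)


def shot_titles(meta: MetaDescription) -> List[str]:
    """Return the titles of the segments of the described video, in order."""
    return [shot.title for shot in meta.video_document.shots]


# Hypothetical sample data, for illustration only.
video = Video(title="Interview", media_id=uuid4())
video.shots.append(VideoShot("Opening", "00:00:00", "00:01:30", "00:01:30"))
meta = MetaDescription(domain_ontology_id=uuid4(), video_document=video)
```

Navigating from the metadescription to its segments then only requires following the VideoDocument and shots properties, which is the access path the table implies.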

8.3.2.7. The publication classes

Table 8.7 shows the classes which are used for manipulating the publications and their hierarchy via the ASW Publishing Workshop.

8.3.3. The data access layer

The data access layer, called SemioscapeDataAccess, contains a set of classes which define the routines, variables and DataSets25 that applications need in order to manipulate the data in the database or to transfer data between applications. For applications which use these classes, the underlying mechanisms, technologies and even physical machines are entirely hidden and need not be known. All the methods in the library are static methods.

The classes are grouped according to the type of object they help to manipulate, following the same classification as in SemioscapeEntities, presented in the previous section (8.3.2): common objects, user, media, ontology, object of analysis, metadescription and publication.

Table 8.7. The publication classes in SemioscapeEntities

Class: Publication
Main properties: String Alias; Guid TypeId; String State; MetaDescription MetaDescription; DateTime DateOfSubmission; DateTime DateOfPublication
Description: Publication of a metadescription.

Class: Event
Main properties: Guid Id; Guid TypeId; String Alias; String State; List<Annotation> Annotations; Annotation CurrentAnnotation; List<Event> Events; List<Publication> Publications; UserHistory History
Description: Hierarchical level of publishing. Its appellations in different languages are stored in the Annotations, CurrentAnnotation automatically referring to the Annotation in the user’s language. The hierarchical structure is ensured by the Events property. Each level may be associated with publications thanks to the Publications property.
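
The behavior of the Event class in Table 8.7 (nested levels, one appellation per language, publications attached to a level) might be sketched as below. This is a simplified Python illustration under stated assumptions: annotations are reduced to a dictionary from language code to label, publications to plain aliases, and all sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass
class Event:
    """Hierarchical level of publishing (simplified)."""
    alias: str
    annotations: Dict[str, str] = field(default_factory=dict)  # language -> label
    events: List["Event"] = field(default_factory=list)        # child levels
    publications: List[str] = field(default_factory=list)      # publication aliases

    def current_annotation(self, user_language: str) -> Optional[str]:
        """Return the appellation in the user's language, if one exists."""
        return self.annotations.get(user_language)


def walk(event: Event, depth: int = 0) -> Iterator[Tuple[int, Event]]:
    """Traverse the hierarchy depth-first, yielding (depth, level) pairs."""
    yield depth, event
    for child in event.events:
        yield from walk(child, depth + 1)


# Hypothetical sample hierarchy, for illustration only.
root = Event("literature", {"en": "Literature", "fr": "Littérature"})
child = Event("interviews", {"en": "Interviews", "fr": "Entretiens"},
              publications=["pub-001"])
root.events.append(child)
```

In this sketch a missing language simply yields no appellation; how the actual CurrentAnnotation property falls back in that case is not stated in the table.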

8.3.4. The data processing layer

The data processing layer encompasses two libraries:

– SemioscapeControllers, which defines the methods for controlling and validating data and requests – essentially the controls of the right of a user to manipulate an object;

– SemioscapeConverters, which defines the methods for converting data, to or from other formats – these formats may be external formats as well as internal ones (in which case these are methods for upgrading from previous (obsolete) formats of the system). The main classes and methods of the SemioscapeConverters library are briefly described in Table 8.8:

Table 8.8. The SemioscapeConverters classes

Class: (all classes of the namespace)
Namespace: Escom.Semioscape.Converters.MediaConverters.FicheMediaData
Description: The classes of this namespace define a model for representing a video as a File. This type has notably been used for the following projects:
- export to the Cerimes26 catalog
- export to the copyright registration catalog of the BnF.27

Class: MediaConverter
Namespace: Escom.Semioscape.Converters.MediaConverters
Description: Methods for converting timecodes; methods for converting the ARA media into a collection of Files; methods for converting the ARA media into a collection of SaphirMediaMetadata.

Class: MediaSorter
Namespace: Escom.Semioscape.Converters.MediaConverters
Description: Methods for updating the databases of media files (renaming directories and files, relocating files, cleaning directories, etc.).

Class: SaphirMediaMetadata
Namespace: Escom.Semioscape.Converters.MediaConverters
Description: The classes of this namespace define a model for representing a video as SaphirMediaMetadata, respecting the formalism of the media base which was developed in the context of the SAPHIR project.

Class: MetaDescriptionSorter
Namespace: Escom.Semioscape.Converters.MetaDescriptionConverters
Description: Methods for upgrading the filebases of metadescriptions (renaming directories and files, relocating files, cleaning directories, etc.).

Class: (other classes of the namespace)
Namespace: Escom.Semioscape.Converters.MetaDescriptionConverters
Description: The other classes of this namespace define the methods for converting metadescriptions which use obsolete system data models.

Class: (other classes of the namespace)
Namespace: Escom.Semioscape.Converters.OntologyConverters
Description: The other classes of this namespace define the methods for converting ontologies which use obsolete system data models.
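
To give an idea of what a timecode conversion method such as those attributed to MediaConverter might do, here is a minimal Python sketch. The "HH:MM:SS" format and the function names are assumptions made for the illustration; the actual formats and signatures handled by the library are not documented in this chapter.

```python
def timecode_to_seconds(timecode: str) -> int:
    """Convert an 'HH:MM:SS' timecode (assumed format) into seconds."""
    hours, minutes, seconds = (int(part) for part in timecode.split(":"))
    if hours < 0 or not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"invalid timecode: {timecode!r}")
    return hours * 3600 + minutes * 60 + seconds


def seconds_to_timecode(total: int) -> str:
    """Convert a non-negative number of seconds into 'HH:MM:SS'."""
    if total < 0:
        raise ValueError("a timecode cannot be negative")
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def end_time(begin: str, duration: str) -> str:
    """Derive the end of a segment from its beginning and its duration."""
    return seconds_to_timecode(
        timecode_to_seconds(begin) + timecode_to_seconds(duration)
    )
```

Such a helper would, for instance, allow the EndTime of a VideoShot to be checked against its BeginTime and Duration.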

8.4. Semioscape

The Semioscape server is at the heart of the architecture of the ASW environment (see Figure 8.2). It stores and provides the mechanisms for accessing data from the ESCoM Suite and Semiosphere tools. It implements two types of services:

– a relational database developed using Microsoft SQL Server 2008;

– a server of Web services, developed in C#/ASP.Net (.Net Framework 4) and hosted on Microsoft Internet Information Services (IIS) 8.

8.4.1. The database

In addition to the tables created for the ASW environment, the Semioscape database contains tables and stored procedures which are generated automatically when ASP.Net28 authentication is activated. They all bear the prefix “aspnet”.

The tables and their relations are described in the following sections.29 Each section starts from a reference table (one which corresponds to a main object of SemioscapeEntities). The figures respect the conventions shown in Figure 8.5: the solid lines represent relational constraints effectively imposed on the base, whereas the dotted lines represent relations without effective constraints in the base.

Figure 8.5. Conventions for representing relations between the tables


8.4.1.1. The aspnet_Users table

The aspnet_Users table (Figure 8.6) represents a user, described very succinctly by an ID and password. More detailed information is stored in the aspnet_Membership table, as well as in the Company table for the collective entity to which the user belongs, and in the RichDescription table for its presentation.

Figure 8.6. The aspnet_Users table


The aspnet_Membership table does not only store the users. It may store any type of individual (it corresponds to the Member class of SemioscapeEntities). For this reason it may be referenced in the Video and VideoShot tables (in order to associate an author, a director, etc. with a video or video segment).

The aspnet_Applications table represents an application, i.e. a Semiosphere portal. To enable us to register users for all the portals, a parent application (named SEMIOSPHERE) was created: the users are linked with this application.

The UserRole table associates a user with a role and an activity. In the case of the activity of “Publishing”, it must be linked with a particular application (so as to limit the publication rights to one particular portal). The other types of activities are linked with SEMIOSPHERE, the parent application.

The UserAction table stores the history of the users’ actions: it associates a user and any type of object.
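
The rule stated above for the UserRole table can be illustrated by a small rights check. The Python sketch below rests on an assumed reading: a "Publishing" role is valid only for the application it is linked with, whereas roles for other activities, linked with SEMIOSPHERE, are taken to apply to every portal. The user, role, activity and portal names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable

PARENT_APPLICATION = "SEMIOSPHERE"


@dataclass(frozen=True)
class UserRole:
    """Association of a user with a role, an activity and an application."""
    user: str
    role: str
    activity: str
    application: str


def may_perform(roles: Iterable[UserRole], user: str,
                activity: str, application: str) -> bool:
    """Check whether a user may carry out an activity on a given portal."""
    for entry in roles:
        if entry.user != user or entry.activity != activity:
            continue
        if activity == "Publishing":
            # Publication rights are limited to one particular portal.
            if entry.application == application:
                return True
        elif entry.application == PARENT_APPLICATION:
            # Assumption: rights held on the parent application apply everywhere.
            return True
    return False


# Hypothetical sample data, for illustration only.
roles = [
    UserRole("alice", "editor", "Publishing", "PortalA"),
    UserRole("alice", "analyst", "Analysis", PARENT_APPLICATION),
]
```

A check of this kind corresponds to the sort of control of user rights that the SemioscapeControllers library is said to perform, although its actual logic is not described here.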

8.4.1.2. The Media table

The Media table (Figure 8.7) represents a piece of media. The languages of the media are stored in the Culture table, while the different files of the same media are stored in the MediaFile table.

Figure 8.7. The Media table


The type (a piece of media does not necessarily have to be a video: it may also be an image, a text, etc.; moreover, the same video may be available in several encodings), the resolution (the same media may, e.g., be available for both low- and high-speed Internet) and the mode of distribution (e.g. the same media may be available both in streaming and mobile distribution mode) of a file must be specified using fields of the metalexicon of conceptual terms, and thus records in the Field table.

Finally, a video description (Video table) references a media file.

It should be noted that the relations between the Media and MediaFile tables and the UserAction table are not represented in Figure 8.7.

8.4.1.3. The Field table

The Field table (Figure 8.8) represents a conceptual term of the metalinguistic resources. The hierarchical structure is ensured by the FieldId field, which references the parent branch in a record of the same table. Since an Ontology object inherits from the Field object, the Ontology table simply references a record of the Field table, which stores the inherited data.

Figure 8.8. The Field table


The references of a branch in other standards are stored in the Benchmarking table.

The multilingual textual data of a branch are stored in the Annotation table, which itself references several records of the RichDescription table, so as to store its definitions, short descriptions, long descriptions, information and examples (for the sake of clarity, only the examples have been represented in the relations between Annotation and RichDescription in Figure 8.8).

Finally, the links to/from a rich description are stored in the Link table, while the UserAction table stores the history of the Ontology, Field and Annotation objects.
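
The self-referencing FieldId column described in this section lends itself to a recursive query for retrieving the chain of parent branches of a term. The sketch below uses Python with an in-memory SQLite database purely as a stand-in for SQL Server; the integer identifiers, the Alias column and the sample terms are assumptions, and only the FieldId self-reference comes from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Field (
    Id      INTEGER PRIMARY KEY,
    Alias   TEXT NOT NULL,                 -- hypothetical column
    FieldId INTEGER REFERENCES Field(Id)   -- parent branch; NULL for a root
);
INSERT INTO Field (Id, Alias, FieldId) VALUES
    (1, 'root', NULL),
    (2, 'discourse', 1),
    (3, 'interview', 2);
""")


def ancestors(connection: sqlite3.Connection, field_id: int) -> list:
    """Return the aliases from the given term up to the root of its branch."""
    rows = connection.execute(
        """
        WITH RECURSIVE branch(Id, Alias, FieldId, Depth) AS (
            SELECT Id, Alias, FieldId, 0 FROM Field WHERE Id = ?
            UNION ALL
            SELECT f.Id, f.Alias, f.FieldId, b.Depth + 1
            FROM Field AS f JOIN branch AS b ON f.Id = b.FieldId
        )
        SELECT Alias FROM branch ORDER BY Depth
        """,
        (field_id,),
    ).fetchall()
    return [alias for (alias,) in rows]
```

Calling ancestors(conn, 3) on this sample data walks from the term to the root of its branch. SQL Server offers an equivalent construct through recursive common table expressions.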

8.4.1.4. The SlotObject table

As its name suggests, the SlotObject table (Figure 8.9) represents a SlotObject object, from which most of the objects of analysis inherit (see Section 8.3.2.5). Hence, most of the objects of analysis are stored entirely in the SlotObject table, since they do not implement new properties. Only the Concept, Context, Resource and Translation objects implement new properties, which are stored in tables with the same names. Their inherited properties are still stored in the SlotObject table, to which these tables refer through a SlotObjectId field. The Member object is the only exception to this rule: its data are stored entirely in the aspnet_Membership table (because this is necessary for ASP.Net authentication).

Figure 8.9. The SlotObject table


When an object of analysis is stored in the list of another object of analysis (a Pattern has a list of Concept, a VideoObject has a list of VideoPlan, etc.), the reference to a parent object is made by a ParentId field of the SlotObject table. In the opposite case, the ParentId field references a Video or a VideoShot with which the object of analysis is associated.

It is important to note that the type of object of analysis (PATTERN, CONCEPT, VIDEOOBJECT, etc.) is specified in the SlotObject table in the ObjectType field (not shown in Figure 8.9).

The type of a SlotObject must be specified in a field in the metalexicon of conceptual terms, and therefore, in a record in the Field table.

The description of a SlotObject is stored in the RichDescription table, whose links are always stored in the Link table.
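
The storage scheme of this section (one SlotObject table for the inherited data, an ObjectType discriminator, a ParentId link, and side tables such as Translation joined through SlotObjectId) can be sketched as follows. Python with an in-memory SQLite database stands in for SQL Server; the integer identifiers, the Label and Content columns, the type values VIDEOPLAN and TRANSLATION, and the sample rows are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SlotObject (
    Id         INTEGER PRIMARY KEY,
    ObjectType TEXT NOT NULL,   -- e.g. PATTERN, CONCEPT, VIDEOOBJECT
    ParentId   INTEGER,         -- parent object of analysis, or a Video/VideoShot
    Label      TEXT             -- hypothetical column
);
CREATE TABLE Translation (
    Id           INTEGER PRIMARY KEY,
    SlotObjectId INTEGER NOT NULL REFERENCES SlotObject(Id),
    Content      TEXT,          -- hypothetical column names
    Language     TEXT
);
-- 100 stands for the identifier of a hypothetical video segment.
INSERT INTO SlotObject VALUES (1, 'VIDEOOBJECT', 100, 'speaker');
INSERT INTO SlotObject VALUES (2, 'VIDEOPLAN', 1, 'close-up');
INSERT INTO SlotObject VALUES (3, 'TRANSLATION', 100, 'subtitle');
INSERT INTO Translation VALUES (1, 3, 'Bonjour', 'fr');
""")


def children(connection, parent_id, object_type):
    """List the labels of the objects of a given type stored under a parent."""
    rows = connection.execute(
        "SELECT Label FROM SlotObject "
        "WHERE ParentId = ? AND ObjectType = ? ORDER BY Id",
        (parent_id, object_type),
    ).fetchall()
    return [label for (label,) in rows]


def translations(connection):
    """Join the inherited data (SlotObject) with the specific data (Translation)."""
    return connection.execute(
        "SELECT s.Label, t.Content, t.Language "
        "FROM Translation AS t JOIN SlotObject AS s ON s.Id = t.SlotObjectId "
        "ORDER BY t.Id"
    ).fetchall()
```

The VideoPlan needs no table of its own, whereas the Translation is split between the shared table and its side table, which mirrors the distinction made in the text.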

8.4.1.5. The MetaDescription table

The MetaDescription table (Figure 8.10) represents a metadescription. It is referenced by Video items, which are themselves referenced by VideoShot items; the data for both are stored in the tables of the same names.

Figure 8.10. The MetaDescription table


Each of these three tables references a record in the Notice table through its NoticeId field. The notice itself holds copyrights of SlotObject type, which are thus stored in the SlotObject table (not shown in Figure 8.10).

Once again, their type must be specified using a field from the metalexicon of conceptual terms, and therefore a record in the Field table.

Their description is stored in the RichDescription table, and their history in the UserAction table according to the mechanisms we have previously described.

Moreover, the Video and VideoShot tables each reference the languages in the Culture table, the objects of analysis in the SlotObject table and the actors in the aspnet_Membership table.

8.4.1.6. The Event table

As its name suggests, the Event table (Figure 8.11) represents an Event object. The hierarchical structure is ensured by the ParentId field referencing the hierarchical level of parent publication in a record of the same table.

– The root level, on the other hand, references a record in the aspnet_Applications table: a publication is hence limited to a single portal.

– The multilingual textual data of an Event are stored in the Annotation table, as for the Field table.

– Each Event may associate several metadescriptions through the Publication table, which among other things associates a reference with a MetaDescription and with a state of publication (State field).

– The history of the Event and Publication objects is stored in the UserAction table.

Figure 8.11. The Event table


8.4.2. The Web services

The Semioscape Web server, called SemioscapeWebServices, offers Web services30 allowing ESCoM’s software packages to query the database (or the system) remotely, via a simple Internet connection. We distinguish three types of requests:

– requests to update the applications (sent by ESCoM Update);

– requests to read from the database;

– requests to write in the database.

The requests to the database do not cover all objects. Indeed, since the media and publications are manipulated from Semiosphere (which is already a Web application), Web services are not necessary for them. The same applies to the write requests on users, as users are created in Semiosphere.

The main Web services and their main features are described in the following sections.

8.4.2.1. Semioscape

The Semioscape Web services implement the methods which do not require requests to the database: updates of the applications and retrieval of the file paths for accessing the servers.

8.4.2.2. Database read requests

Read requests are requests to select, check the existence of, and search for data within the database. We may distinguish three read services:

– for the ASW domain ontology;

– for the metadescriptions;

– for the users.

8.4.2.3. Database write requests

Write requests are insertions, updates and deletions in the database. We may distinguish two write services:

– for the ASW domain ontology;

– for the metadescriptions.

8.5. Conclusion

All the technologies presented in this chapter have been implemented and validated. The Semioscape server, with its database and Web services, is under construction. The Semioscape library is also finished, even if it will likely evolve in its data and control access layers. These two technologies constitute the basis of the ASW Studio tools, which will be described in the following chapter.


1 Chapter written by Francis LEMAITRE.

1 Microsoft France Education: http://www.microsoft.com/france/education/. The results of the R&D projects carried out in the context of that partnership were published in the collected volume [STO 03].

2 YouTube: http://www.YouTube.com.

3 Dailymotion: http://www.dailymotion.com.

4 Content Management System: http://fr.wikipedia.org/wiki/Syst%C3%A8me_de_gestion_de_contenu.

5 WordPress: http://wordpress.org.

6 Joomla!: http://www.joomla.org.

7 SPIP: http://www.spip.net.

8 Support System for Hypermedia Publishing by Intentional specification and Rhetorical modeling, project of the RIAM program of the ANR, 2006–2010: http://www.semionet.fr/fr/recherche/projets_recherche/06_09_saphir/saphir.htm.

9 LOGOS, Knowledge on demand for Ubiquitous Learning, project of the 6th Framework Program of the European Commission, 2006–2009: http://www.ina-sup.com/recherche/logos. These works are published in the collective work [LEM 08].

10 A presentation of the environment is published in the collective work [LEM 10].

11 For more information, see the detailed documentation of the ARA program and of its audiovisual collection: http://www.archivesaudiovisuelles.fr/FR/about4.asp.

12 See Chapter 5 of this book and [STO 12b].

13 See section 1.5.

14 See section 9.3.

15 Progressive download is the protocol of video distribution whereby the video is downloaded to the client’s computer; the client has to wait for the download to finish in order to be able to play the end of the video: http://en.wikipedia.org/wiki/Progressive_download.

16 Streaming is a protocol of video distribution whereby the video is not downloaded to the client’s computer, but sent “piece by piece”. This method notably enables users to access any moment within a long video, without additional waiting time: http://en.wikipedia.org/wiki/Streaming.

17 The Dublin Core is a model for representing digital resources: http://dublincore.org.

18 The LOMFR is a model for representing educational resources: http://www.lom-fr.fr.

19 DEWEY classification is a decimal scientific classification system: http://en.wikipedia.org/wiki/Dewey_Decimal_Classification.

20 These points constitute a clear improvement on the ARA environment described in section 1.5 of this book.

21 ISO 639-1 is a standard regulating the representation of the names of languages: http://fr.wikipedia.org/wiki/Liste_des_codes_ISO_639-1.

22 See Chapter 5 of this book and [STO 12].

23 See Chapters 3, 4, 5 and 6 of this book.

24 See Chapter 3 of this book.

25 DataSets are faithful representations of the tables in the database.

26 Center for resources and information on multimedia for higher education: http://www.cerimes.fr. A selection of the Cerimes catalog is available on the ARA portal: http://www.archivesaudiovisuelles.fr/FR/_Cerimes.asp.

27 Bibliothèque nationale de France (The French National Library): http://www.bnf.fr/fr/professionnels/depot_legal.html. The copyright registration of the ARA’s collection should be completed by the end of 2011.

28 Configuration of the SQL Server for ASP.Net: http://msdn.microsoft.com/en-us/library/ms229862%28v=vs.80%29.aspx.

29 The detail of the columns of the tables has not been described here.

30 A Web service is an asynchronous communication protocol between two machines via the http protocol. It notably facilitates communication between heterogeneous systems thanks to the use of the standards SOAP and WSDL: http://fr.wikipedia.org/wiki/Service_Web.
