Adding text to your video

When a video may be watched by speakers of different languages, we often want to provide text for viewers who do not understand the spoken language. This is common practice for conference talks as well as for many movies and TV shows. The WebVTT standard (http://dev.w3.org/html5/webvtt/) was created to allow external text track resources to be attached to a video.

Getting ready

For simplicity, we will use the same video and poster image as in the other examples. We will create the remaining files ourselves. You can also pick another video, since the video itself is not all that relevant.

How to do it...

We start with the HTML, where we include the video element, add track elements to it, and load a simple example.js script. Perform the following steps:

  1. In the body element we include the following:
        <p>
          <video id="theVideo" width="640" height="360" poster="poster.png" controls preload loop>
            <source
            src="http://archive.org/download/WalterLantz-BoogieWoogieBugleBoy1941/WalterLantz-BoogieWoogieBugleBoy1941.ogv" type="video/ogg" />
            <track src="video.vtt" kind="subtitles" srclang="en" label="English" default />
            <track src="karaoke.vtt" kind="captions" srclang="en-GB" label="Other" />
            Video playback not supported <a href="http://archive.org/download/WalterLantz-BoogieWoogieBugleBoy1941/WalterLantz-BoogieWoogieBugleBoy1941.ogv">download</a> instead
          </video>
        </p>
        <p>
        Video is part of animation shorts on <a href="http://archive.org/details/more_animation">archive.org</a>. The video
        is titled: Walter Lantz - Boogie Woogie Bugle Boy
        </p>
        <script src="example.js"></script>
  2. The JavaScript only logs the text track objects attached to our video element. The idea here is to show that tracks can be accessed and manipulated by code. The script will contain the following:
    (function(){
      // Look up the video element by its id and read its list of text tracks
      var video = document.getElementById('theVideo'),
          textTracks = video.textTracks;

      // Log every TextTrack object to the console
      for(var i=0; i < textTracks.length; i++){
        console.log(textTracks[i]);
      }
    }());
  3. As for the .vtt files that we included for the tracks, we will create them manually. The file video.vtt will contain the following:
    WEBVTT
    
    1
    00:00:01.000 --> 00:00:13.000
    this is the video introduction
    
    2
    00:00:15.000 --> 00:00:40.000
    There is also some awesome info in
    multiple lines.
    Why you ask?
    Why not ...
    
    3
    00:00:42.000 --> 00:01:40.000
    We can use <b>HTML</b> as well
    <i> Why not?</i>
    
    4
    00:01:42.000 --> 00:02:40.000
    {
    "name": "Some JSON data",
    "other": "it should be good for meta data"
    }
    
    5
    00:02:41.000 --> 00:03:40.000 vertical:lr
    text can be vertical
    
    6
    00:03:42.000 --> 00:04:40.000 align:start size:50%
    text can have different size relative to frame
  4. As for karaoke.vtt it will contain the following code:
    WEBVTT
    
    1
    00:00:01.000 --> 00:00:10.000
    This is some karaoke style  <00:00:02.000>And more <00:00:03.000> even more  <00:00:07.000>

After running the example we should have subtitles at the given ranges.
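The script in step 2 only logs the tracks, but the same objects can also be changed from code. As a minimal sketch, assuming that text track objects expose the standard `language` and `mode` properties, the hypothetical helper `showOnlyTrack` below shows the first track for a given language and disables the rest. It accepts any array-like list, so it can be tried outside the browser with plain objects.

```javascript
// Hypothetical helper: show the first track for a given language, disable the others.
// `textTracks` can be video.textTracks or any array-like list of objects
// that have `language` and `mode` properties.
function showOnlyTrack(textTracks, language) {
  var shown = null;
  for (var i = 0; i < textTracks.length; i++) {
    var track = textTracks[i];
    if (shown === null && track.language === language) {
      track.mode = 'showing';
      shown = track;
    } else {
      track.mode = 'disabled';
    }
  }
  // Returns the track that is now showing, or null if none matched.
  return shown;
}

// Browser usage (assumes the page contains the video element from step 1):
// showOnlyTrack(document.querySelector('video').textTracks, 'en');
```

Returning the matched track (or null) lets the caller fall back to another language when nothing matches.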

Tip

If you write the WebVTT file by hand, it is easy to make a mistake. A good validator is available at http://quuz.org/webvtt/, with the source code at https://github.com/annevk/webvtt.

How it works...

Video has been available for quite some time now, but adding subtitles was not an option. The track element gives us a standard way to add timed information to our video. Tracks are not used only for subtitles; they can also carry other kinds of timed cues.

Note

In general usage, a cue is a thing said or done that serves as a signal to an actor or other performer to enter or to begin their speech or performance.

Cues can contain other data formats such as JSON, XML, or CSV. In our example we included a small JSON snippet. Such data can be useful in many ways because it is tied to a specific time range, although carrying data is not what a subtitles track is really meant for.
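As a rough sketch of how such data might be consumed, the hypothetical function `readCueData` below tries to parse the text of each cue as JSON and skips cues that are not JSON. It assumes that cue objects expose their payload through a `text` property, as cue objects in browsers do; the function name and the usage shown in the comment are illustrative only.

```javascript
// Hypothetical helper: collect the JSON payloads from a list of cues.
// Cues whose text is not valid JSON are skipped.
function readCueData(cues) {
  var result = [];
  for (var i = 0; i < cues.length; i++) {
    try {
      result.push(JSON.parse(cues[i].text));
    } catch (e) {
      // Not JSON (for example an ordinary subtitle line): ignore it.
    }
  }
  return result;
}

// Browser usage (assumption: `track` is a text track whose mode is
// 'hidden' or 'showing', so that its cues become active during playback):
// track.addEventListener('cuechange', function () {
//   console.log(readCueData(track.activeCues));
// });
```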

The track element can contain the following values for its kind attribute:

  • subtitles: It is the transcription or translation for a given language.
  • captions: It is very similar to subtitles, but it may also include sound effects or other audio information. This kind is intended mainly for cases where the audio is not available.
  • descriptions: It is a text description of the video meant for use where the visual part is not available. For example, it can provide a description for users who are blind or unable to follow the screen.
  • chapters: This track can contain chapter titles for given periods.
  • metadata: This track is useful for storing metadata that can later be used by a script.

Besides the kind attribute, there is also the mandatory src attribute, which gives the URL of the track source. The track element can also have a srclang attribute containing the language tag of the timed track.

Note

The language tag often has a two-letter key that identifies the specific language. For more details you can take a look at http://tools.ietf.org/html/bcp47.

There is also the default attribute; the track on which it is present is the one that will be shown by default.

We can also use the label attribute, which takes a free-text value specifying a unique, user-readable label for the track.

Note

One clever use of the track element can be found at http://www.samdutton.net/mapTrack/.

The WebVTT standard requires the file to start with the string "WEBVTT". After that come zero or more cue definitions.

Each cue has the following form:

[idstring]
[hh:]mm:ss.ttt --> [hh:]mm:ss.ttt [cue settings]
Text string

The idstring is optional, but it is a good idea to specify it if we need to access the cue from a script. The timestamps use a standard format in which the hours are optional. The second timestamp must be greater than the first one.
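The timing line can be checked with a few lines of code. The following is a minimal sketch, not a full WebVTT parser: the hypothetical functions `parseTimestamp` and `parseTimingLine` convert `[hh:]mm:ss.ttt` timestamps to seconds and reject a line whose end time is not greater than its start time.

```javascript
// Hypothetical helper: convert a "[hh:]mm:ss.ttt" timestamp to seconds.
// Returns null when the string does not match the expected format.
function parseTimestamp(value) {
  var match = /^(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})$/.exec(value);
  if (!match) {
    return null;
  }
  var hours = match[1] ? parseInt(match[1], 10) : 0;
  var minutes = parseInt(match[2], 10);
  var seconds = parseInt(match[3], 10);
  var millis = parseInt(match[4], 10);
  return hours * 3600 + minutes * 60 + seconds + millis / 1000;
}

// Hypothetical helper: parse a timing line such as
// "00:02:41.000 --> 00:03:40.000 vertical:lr".
// Returns null if the line is malformed or the end is not after the start.
function parseTimingLine(line) {
  var parts = line.split('-->');
  if (parts.length !== 2) {
    return null;
  }
  var start = parseTimestamp(parts[0].trim());
  // Anything after the end timestamp is treated as cue settings.
  var rest = parts[1].trim().split(/\s+/);
  var end = parseTimestamp(rest[0]);
  if (start === null || end === null || end <= start) {
    return null;
  }
  return { start: start, end: end, settings: rest.slice(1).join(' ') };
}
```

For example, `parseTimingLine('00:02:41.000 --> 00:03:40.000 vertical:lr')` yields a start of 161 seconds, an end of 220 seconds, and the settings string `vertical:lr`.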

The text string may contain simple formatting tags such as <b>, <i>, and <u>. There is also a <c> tag that can be used to add a CSS class to portions of the text, for example <c.className>styled text</c>, and a so-called voice label, for example <v someLabel>the awesome text</v>.

The cue settings are optional as well and are appended after the time range. Among other things, they let us choose whether the text is shown horizontally or vertically. The settings are case sensitive, so they must be written in lowercase as shown in the examples. The following settings can be applied:

  • vertical: It takes the value vertical:rl, where rl stands for writing right to left, or vertical:lr for left to right.
  • line: This setting specifies where the text will be shown vertically or, if we have already used vertical, it specifies the horizontal position. The value is either a percentage or a line number, where a positive number counts from the top and a negative number counts from the bottom. For example, line:0 and line:0% indicate the top, and line:-1 or line:100% indicate the bottom.
  • position: It is a setting that specifies where the text will be shown horizontally or, if the vertical setting is used, where the text is shown vertically. It takes a value between 0 and 100 percent. For example, position:100% means right.
  • size: It specifies the width of the text area as a percentage of the frame, or its height when the vertical setting is used. For example, size:100% means the text area spans the full width.
  • align: It sets the alignment of the text within the area defined by the size setting. It can have the values align:start, align:middle, and align:end (later revisions of the WebVTT specification use align:center instead of align:middle).
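Since the settings are plain `name:value` pairs separated by whitespace, they are easy to inspect from code. The sketch below is illustrative only: the hypothetical function `parseCueSettings` splits a settings string into an object and silently ignores names it does not recognise, which also demonstrates the case sensitivity mentioned above.

```javascript
// Hypothetical helper: turn "align:start size:50%" into
// { align: 'start', size: '50%' }.
// Unknown or malformed settings are ignored rather than reported.
function parseCueSettings(settings) {
  var known = ['vertical', 'line', 'position', 'size', 'align'];
  var result = {};
  var parts = settings.split(/\s+/);
  for (var i = 0; i < parts.length; i++) {
    var index = parts[i].indexOf(':');
    if (index <= 0) {
      continue;
    }
    var name = parts[i].slice(0, index);
    var value = parts[i].slice(index + 1);
    // Settings are case sensitive, so "Align" does not match "align".
    if (known.indexOf(name) !== -1 && value !== '') {
      result[name] = value;
    }
  }
  return result;
}
```

This sketch does not validate the values themselves (for example, that a size is a percentage between 0 and 100); a validator such as the one mentioned in the tip is better suited to that.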

In the text string we can also control when individual words appear, in a sort of karaoke style. For example, see the following:

This is some karaoke style  <00:00:02.000>And more <00:00:03.000>

It states that the text before the first timestamp is shown from the start of the cue, and that And more is the active text between 2 and 3 seconds.
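To make the timing of such a cue explicit, the text can be split on its inline timestamps. The following is a minimal sketch under the assumption that the cue text contains only plain text and timestamps; the function name `splitKaraoke` is hypothetical.

```javascript
// Hypothetical helper: split cue text on inline timestamps such as <00:00:02.000>.
// Returns a list of { time, text } objects, where a null time means
// "from the start of the cue".
function splitKaraoke(text) {
  var pattern = /<((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})>/g;
  var segments = [];
  var time = null;
  var lastIndex = 0;
  var match;
  while ((match = pattern.exec(text)) !== null) {
    // Text between the previous timestamp (or the start) and this one.
    var chunk = text.slice(lastIndex, match.index).trim();
    if (chunk !== '') {
      segments.push({ time: time, text: chunk });
    }
    time = match[1];
    lastIndex = pattern.lastIndex;
  }
  // Text after the last timestamp, if any.
  var tail = text.slice(lastIndex).trim();
  if (tail !== '') {
    segments.push({ time: time, text: tail });
  }
  return segments;
}
```

For the example line above this returns two segments: the opening text with a null time, and `And more` starting at `00:00:02.000`.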

One other thing to note about the text string is that it cannot contain the string -->, the ampersand &, or the less-than character <, since they are reserved. But no worries there: we can always use the escaped version, for example &amp; for the ampersand.

These restrictions do not apply if we use the file for a metadata track.
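For subtitle and caption text that is generated by a script, the escaping can be automated. The hypothetical helper `escapeCueText` below is a small sketch that replaces the reserved characters with their escaped forms; it also escapes the greater-than character, which guarantees that the output never contains the --> sequence.

```javascript
// Hypothetical helper: escape characters that are reserved in WebVTT cue text.
// The ampersand must be replaced first so that the escapes themselves
// are not escaped a second time.
function escapeCueText(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}
```

Note that this escapes everything, so it should be applied to raw text only and not to text that already contains intentional tags such as <b> or <i>.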

There's more...

We also have the option to style the text using CSS. As previously mentioned, VTT files can contain cues with <c.someClass> for more fine-grained styling, but in the general case we want to apply the style to the entire track. A style can be applied to all the cues as follows:

::cue  {
        color: black;
        text-transform: lowercase;
        font-family: "Comic Sans";
}

But you may alienate your users by rendering their subtitles in Comic Sans.

There are also the selectors ::cue:past{} and ::cue:future{} for past and future cue text, which can be useful for karaoke-like rendering. We can also use the ::cue(selector) pseudo-element to target a node matching some criteria.

Not all of these features are fully available in modern browsers. The most compliant at the time of writing is Chrome, so for the others it is a good idea to use a polyfill. One such library is http://captionatorjs.com/, which adds support to all the modern browsers. Besides adding support for WebVTT, it also supports formats like .sub, .srt, and YouTube's .sbv.

There is also one other format that was developed for video tracks: Timed Text Markup Language (TTML) 1.0 (http://www.w3.org/TR/ttaf1-dfxp/). At the time of writing it is supported only by IE, and no other browser has plans to support it. The standard is more complex and is based on XML, which makes it a lot more verbose.
