When displaying videos to a multilingual audience, we often want to provide text for people who speak other languages. This is common practice for many conference talks as well as plenty of movies and TV shows. The WebVTT (http://dev.w3.org/html5/webvtt/) standard was created to make it possible to attach external text track resources to a video.
For simplicity, we will use the same video and poster image that we used in the other examples. We will create the other files ourselves. You can also pick another video of your own, since the video itself is not all that relevant.
We start with the HTML, where we include the video element, add track elements to it, and load a simple example.js script. Perform the following steps:
<p>
  <video id="theVideo" width="640" height="360" poster="poster.png" controls preload loop>
    <source src="http://archive.org/download/WalterLantz-BoogieWoogieBugleBoy1941/WalterLantz-BoogieWoogieBugleBoy1941.ogv" type="video/ogg" />
    <track src="video.vtt" kind="subtitles" srclang="en" label="English" default />
    <track src="karaoke.vtt" kind="captions" srclang="gb" label="Other" />
    Video playback not supported
    <a href="http://archive.org/download/WalterLantz-BoogieWoogieBugleBoy1941/WalterLantz-BoogieWoogieBugleBoy1941.ogv">download</a> instead
  </video>
</p>
<p>
  Video is part of animation shorts on <a href="http://archive.org/details/more_animation">archive.org</a>. The video is titled: Walter Lantz - Boogie Woogie Bugle Boy
</p>
<script src="example.js"></script>
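The example.js file looks up the video element with the theVideo ID and logs each of its text tracks to the console:
(function () {
  // The video element must carry id="theVideo" for this lookup to succeed.
  var video = document.getElementById('theVideo'),
      textTracks = video.textTracks;
  // Log every text track attached to the video.
  for (var i = 0; i < textTracks.length; i++) {
    console.log(textTracks[i]);
  }
}());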
We create the .vtt files that we included for the tracks manually. The file video.vtt will contain the following:
WEBVTT

1
00:00:01.000 --> 00:00:13.000
this is the video introduction

2
00:00:15.000 --> 00:00:40.000
There is also some awesome info in multiple lines.
Why you ask?
Why not ...

3
00:00:42.000 --> 00:01:40.000
We can use <b>HTML</b> as well
<i> Why not?</i>

4
00:01:42.000 --> 00:02:40.000
{
"name": "Some JSON data",
"other": "it should be good for meta data"
}

5
00:02:41.000 --> 00:03:40.000 vertical:lr
text can be vertical

6
00:03:42.000 --> 00:04:40.000 align:start size:50%
text can have different size relative to frame
The file karaoke.vtt will contain the following:
WEBVTT

1
00:00:01.000 --> 00:00:10.000
This is some karaoke style <00:00:01.000>And more <00:00:03.000> even more <00:00:07.000>
After running the example we should have subtitles at the given ranges.
If you write a WebVTT file by hand, you will notice that it is easy to make a mistake. There is a good validator available at http://quuz.org/webvtt/, with the source code at https://github.com/annevk/webvtt.
Video has been available for quite some time now, but adding subtitles was not an option. The track element gives us a standard way to add timed information to our video. Tracks are not just used for subtitles, but can also be used for other kinds of timed cues.
Cues can contain other data formats like JSON, XML, or CSV. In our example we included a small JSON data snippet. Such data can be useful in many different ways since it is connected with a given portion of time, although a subtitles track is not really the intended place for it.
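As a minimal sketch of how such a payload could be consumed, the following snippet tries to read the text of a cue as JSON. The parseCuePayload name is a hypothetical helper, not part of any browser API. In a browser the text would typically come from the text property of a cue in the track's activeCues list, read inside a cuechange handler; here it is passed in as a plain string so that the logic can be tried on its own.

```javascript
// Hypothetical helper: try to read a cue payload as JSON.
// Returns the parsed value, or null when the text is not valid JSON.
function parseCuePayload(text) {
  try {
    return JSON.parse(text);
  } catch (err) {
    return null;
  }
}

// Payload of cue 4 from video.vtt, written out as a plain string.
var payload = '{\n"name": "Some JSON data",\n"other": "it should be good for meta data"\n}';
var data = parseCuePayload(payload);
console.log(data.name); // prints "Some JSON data"
```

Returning null for anything that is not JSON lets the same handler skip ordinary subtitle cues without throwing.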
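The track element can contain the following values for its kind attribute:
subtitles for translations of the dialogue; this is the default value.
captions for a transcription of the audio, including sound effects, for when the sound is not available.
descriptions for textual descriptions of the video, for when the visual part is not available.
chapters for chapter titles used for navigation.
metadata for data intended for scripts, which is not displayed to the user.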
Besides the kind attribute there is also the src attribute, which is mandatory and gives the URL of the track source. The track element can also have a srclang attribute containing the language tag of the timed track.
The language tag often starts with a two-letter code that identifies the specific language. For more details you can take a look at http://tools.ietf.org/html/bcp47.
There is also the default attribute; if it is present on a track, that track is the one shown by default.
We can also use the label attribute, which takes a free-text value specifying a unique label for the element.
One clever use of the track element can be found at http://www.samdutton.net/mapTrack/.
The WebVTT standard defines that the file needs to start with the string "WEBVTT". That string is followed by zero or more cue definitions.
Each cue has the following form:
[idstring]
[hh:]mm:ss.ttt --> [hh:]mm:ss.ttt [cue settings]
Text string
The idstring is optional, but it is a good idea to specify it if we need to access the cue from a script. The timestamps use a standard format in which the hours are optional. The second timestamp must be greater than the first one.
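The timing line described above can be checked with a few lines of script. The following is a minimal sketch rather than a complete WebVTT parser: parseTimestamp and parseTimingLine are hypothetical helper names, and the cue settings are returned as raw strings without being validated.

```javascript
// Hypothetical helper: convert "[hh:]mm:ss.ttt" to seconds, or NaN if malformed.
function parseTimestamp(stamp) {
  var match = /^(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})$/.exec(stamp);
  if (!match) {
    return NaN;
  }
  var hours = match[1] ? parseInt(match[1], 10) : 0; // the hours are optional
  var minutes = parseInt(match[2], 10);
  var seconds = parseInt(match[3], 10);
  var millis = parseInt(match[4], 10);
  return hours * 3600 + minutes * 60 + seconds + millis / 1000;
}

// Hypothetical helper: split a timing line into start, end, and raw settings.
// Returns null when the line is malformed or the range is not increasing.
function parseTimingLine(line) {
  var parts = line.split('-->');
  if (parts.length !== 2) {
    return null;
  }
  var rest = parts[1].trim().split(/\s+/);
  var start = parseTimestamp(parts[0].trim());
  var end = parseTimestamp(rest[0]);
  // The second timestamp must be greater than the first one.
  if (isNaN(start) || isNaN(end) || end <= start) {
    return null;
  }
  return { start: start, end: end, settings: rest.slice(1) };
}

var timing = parseTimingLine('00:02:41.000 --> 00:03:40.000 vertical:lr');
console.log(timing.start, timing.end, timing.settings); // 161 220 [ 'vertical:lr' ]
```

Working in seconds keeps the values directly comparable with the currentTime property of the video element.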
The text string is allowed to contain simple HTML formatting like the <b>, <i>, and <u> elements. There is an option to add a <c> element that can be used for adding a CSS class to portions of the text, for example <c.className>styled text </c>. There is also an option to add a so-called voice label: <v someLabel> the awesome text </v>.
The cue settings are optional as well and are appended after the time range. With these settings we can, for example, pick whether the text is shown horizontally or vertically. The settings are case sensitive, so they must be in lowercase as shown in the examples. The following settings can be applied:
vertical:rl, where rl stands for writing right to left, and vertical:lr for left to right.
line:0 and line:0% indicate top, and line:-1 or line:100% indicate bottom.
position:100% meaning right.
size:100% means the text area spans the full width of the frame, or the full height for vertical text.
align:start, align:middle, and align:end set the alignment of the text (later revisions of the specification rename middle to center).
In the text string we can also specify in more detail when given words appear, in a sort of karaoke style. For example, see the following:
This is some karaoke style <00:00:02.000>And more <00:00:03.000>
It states that the first part of the text is shown before the 2-second mark and that the active text And more is shown between 2 and 3 seconds.
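To make the karaoke timing concrete, the following sketch splits such a text string into segments, each labeled with the timestamp at which it becomes active. The splitKaraoke name is a hypothetical helper, and the sketch assumes that the text contains no other tags.

```javascript
// Hypothetical helper: split karaoke-style cue text into timed segments.
// Each segment becomes active at `time`; null means "from the cue start".
function splitKaraoke(text) {
  // A capture group in the separator keeps the timestamps in the result.
  var parts = text.split(/<((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})>/);
  var segments = [];
  var time = null;
  for (var i = 0; i < parts.length; i++) {
    if (i % 2 === 1) {
      // Odd indexes hold the captured timestamps.
      time = parts[i];
    } else if (parts[i].trim() !== '') {
      segments.push({ time: time, text: parts[i].trim() });
    }
  }
  return segments;
}

var segments = splitKaraoke('This is some karaoke style <00:00:02.000>And more <00:00:03.000>');
console.log(segments.length);  // prints 2
console.log(segments[1].time); // prints "00:00:02.000"
console.log(segments[1].text); // prints "And more"
```

The trailing <00:00:03.000> tag produces no segment of its own because no text follows it; it only marks the point at which And more stops being the active text.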
One other thing to note about the text string is that it cannot contain the string -->, the ampersand &, or the less-than character <, since they are reserved. But no worries there, we can always use the escaped version, for example &amp; for the ampersand.
These restrictions do not apply if we use the file for a metadata track.
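When cue text is generated by a script, the escaping can be automated. The following is a minimal sketch; escapeCueText is a hypothetical helper name. It also escapes the greater-than character, an extra precaution that keeps a literal --> from ever appearing in the output.

```javascript
// Hypothetical helper: escape characters that are reserved in cue text.
function escapeCueText(text) {
  return text
    .replace(/&/g, '&amp;') // ampersand first, so the entities added below stay intact
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;'); // also breaks up any literal "-->"
}

console.log(escapeCueText('Tom & Jerry: a --> b, 1 < 2'));
// prints "Tom &amp; Jerry: a --&gt; b, 1 &lt; 2"
```

The helper should only be applied to plain text, since it would also escape intentional formatting such as <b> or <c.className>.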
We also have the option to style the text using CSS. As previously mentioned, VTT files can contain cues with <c.someClass> for more fine-grained styling, but in the general case we want to apply the style to the entire track. A style can be applied to all the cues as follows:
::cue {
  color: black;
  text-transform: lowercase;
  font-family: "Comic Sans";
}
But you may alienate users by rendering their subtitles in Comic Sans.
There are also the ::cue:past{} and ::cue:future{} selectors for past and future cues, which can be useful for making a karaoke-like rendering. We can also use the ::cue(selector) pseudo-selector to target a node matching some criteria.
Not all of these features are fully available in modern browsers. At the time of writing the most compliant one is Chrome, so for the others it is a good idea to use a polyfill. One such library is http://captionatorjs.com/, which adds support to all the modern browsers. Besides WebVTT, it also supports formats like .sub, .srt, and YouTube's .sbv.
There is also one other format that was developed for video tracks: Timed Text Markup Language (TTML) 1.0 (http://www.w3.org/TR/ttaf1-dfxp/). At the time of writing it is only supported by IE, and other browsers have no plans to support it. The standard is more complex and is based on XML, which makes it a lot more verbose.