Chapter 10

The Sequence

If the frame in a film is the equivalent of a word in a book, and the shot the equivalent of a sentence, then the sequence is a paragraph—a subdivision of the whole film, which maintains the same structure as the whole film but in miniature. It is composed of sentences—shots—strung together in such a way as to give the whole sequence a self-standing integrity. A sequence can be viewed by itself and, though it will not carry the whole message of the entire documentary, it will be a complete thought of itself.

Like a single paragraph in a passage of text, a sequence runs from one break to the next. Like a paragraph in a text, its function is to carry the content, the meaning, of the film from the previous sequence and hand it on to the next. Its function may be straight narrative, it may be description, it may be exposition. In Modern English Usage, Fowler describes the paragraph as a unit of thought, not of length. The writer, he tells us, is saying: ‘Have you got that? If so I’ll go on to the next point.’ Much the same definition can be given to the sequence in a documentary.

What ties a sequence together is a series of unities. Not quite the dramatic unities that Aristotle prescribed, but an equivalent set that functions to the same ends: to bind together the series of images, sounds and the events that they represent into a single whole. The unities which a sequence obeys are those of time, of action, and of character.

A sequence has a beginning, a middle and an end. Like a scene in a play (and even in documentary work, a sequence is often called a scene), a sequence follows a single set of characters, performing a single action through a continuous stretch of time. The characters may be represented by a narrator and the action may even consist of the narrator putting forward an argument or an explanation accompanied by a series of mute images, yet it will still be perceived as a sequence as long as a single subject and a single visual treatment are maintained over a continuous passage of time.

A sequence’s shape is given by its storyline. The sequence’s storyline will begin with the start of its action, the starting point of its argument, and will end with its conclusion. If lifted out of context, it will be complete in itself. The audience should automatically recognize the beginning and the end of a sequence; the viewers’ conventional cultural assumptions and perhaps also their innate expectations will be satisfied. Yet at the same time, the beginning of a sequence will pick up the thread of the film from what came before it (unless, of course, it is the very first in the work) and the end of a sequence will hand it on to the next (unless, of course, it is the last sequence of all).

Picking up from what went before may be a matter of using a linking character, a linking location, a linking theme, a linking argument. The story is passed on from one sequence to the next like a relay baton. The handover depends on some of the subtler skills of story-telling: raising questions, suggesting consequences, hinting at complications, even faintly presaging doom. There will usually be some kind of connection, since the documentary form does not lend itself so easily to carrying ahead parallel storylines. Where a film’s attention is shared between a number of protagonists, it will usually be found that they each represent a part of the real subject of the film—the collective.

A sequence is composed of a series of shots—vision and sound—linked together subject to the dynamics of the scene. It is a carrier of the ‘illusion of presence’. The shots supply the viewers with what a real witness would see and hear if really present at the scene. Most film-makers try to make the viewers forget that they are not actually there. Thus the construction of a sequence depends on empathy with the audience, on satisfying the question: what would they want to look at and listen to if they were actually here?

A sequence can move the viewers around in the scene to look at and listen to all that is going on. A real observer both follows the action and looks constantly this way and that, listening here and there, to build up a complete picture of the pattern of events. The film sequence mimics this constant scanning of the scene by directing the viewer’s attention to whichever aspect the director finds appropriate at that particular stage of the continuing action. The sequence can also switch subjective viewpoint and present the action as perceived by any of the characters within the scene.

The choice of shots

The shots which make up a sequence may usefully be classified by function into three varieties: establishing shots, narrative shots and what can be called ‘look at this’ shots.

Convention suggests that a sequence should begin with an establishing shot. It is the way we usually confront a new scene in real life. We stand at the edge and look in on things—to see where we are and what is what. It is our eyes’ equivalent of the cinematic wide shot. And so too will a sequence often begin with an establishing wide shot, which sets out the geography of the location in which the action begins. The very first frames may not necessarily be wide, however. If the film-maker wishes to capture the attention of the audience for the new sequence, he or she may choose to start with some kind of striking and arresting close-up image. However, if the audience is to be kept au fait with what is going on, the close-up must eventually give way, or perhaps develop by tracking or even zooming—for this is one of the few cases where the viewer’s wish to see more of the scene will justify a zoom out—into a wider shot.

Film is not restricted to a single location per sequence. So every time a new location is introduced, there will be a need to establish the new geography early on with a wide shot. Unless, of course, the film-maker deliberately wishes to hold back from the audience the knowledge of where they are. Then a wide shot towards the end will come as a surprising revelation. To offer no wide shot at all will keep the audience unaware of where the sequence is located and the relationships between all the things shown in its course. Viewers are likely to feel ill at ease, anxious even, as they wonder where the camera has taken them.

The wide shot which places the viewer on the edge of the scene is not sufficiently intimate or involving to carry the narrative of the sequence strongly. Nor, on the television screen, can the wide shot carry enough detail to satisfy the viewer’s demand to see exactly what is going on. Narrative shots on television are from closer in, mid-shots and medium close-ups mostly, bringing the viewer right into the centre of the action and involving him or her in it directly. The narrative shots carry the sequence’s storyline, answering the question ‘what next?’ with ‘this’.

On the other hand, a viewer who follows the action only in closer shots will remain unaware of what is going on elsewhere in the scene. To complete the picture, the third element of the sequence that the director will bring in is what can be called ‘look at this’ shots.

A real observer of a scene will be constantly looking around him- or herself, darting glances in every direction as the action unfolds. A film sequence provides the viewer with a similar experience by interspersing the narrative shots with others, mostly close-up shots of the significant details of the scene. What the film-maker is saying to the viewer is: ‘look at this, and now look at this, and now look at this.’ Since the film-maker is making the choice in advance on behalf of the viewer, it becomes the film-maker’s responsibility to ensure that these close-ups are relevant and really add something to the viewer’s understanding of the action.

The audience will in any case automatically interpret anything shown in close-up as being a significant detail. Audiences are fully accustomed to the use of tell-tale close-ups in cinema films. Anything included in close-up in a sequence must therefore be truly significant in one way or another, or the audience will be misled. Selecting and shooting the close-ups of what he or she sees as the significant details of a scene provides one of the documentarist’s most powerful tools for expressing a personal vision and interpretation, and in consequence shaping the audience’s response. As already suggested, many film-makers would say that the soul of a documentary scene is in its details.

Continuity

When joining many shots together to make up a sequence, the director must be strongly aware of the unity of time. A single sequence follows a single passage of time. Therefore each shot must continue the previous shot’s place in time in an unbroken flow. Within the sequence, continuity of time, character and action must be maintained throughout.

It is clear that if the principal character in a documentary sequence is a woman first seen wearing a flowered skirt, she cannot in the next shot be dressed in a trouser suit. Equally clearly, a man shown hailing a taxi at the end of one shot with his arm raised in the air cannot be seen in the very next shot with his hands in his pockets. Because time throughout a sequence is perceived, and understood, by the audience to be continuous, continuity of action, of character, of costume, and of all other elements of the scene must be kept going for as long as the sequence lasts.

But as we have seen earlier, the time which the audience experiences as continuous is the film time of the sequence, not of the reality on which the sequence is based. A real event taking an hour to unfold may be compressed into only five minutes of screen time. Though each individual shot in the sequence covers a real stretch of time, which cannot be speeded up or slowed down, film time may make continuous what are discontinuous periods of real life. Equally, a number of consecutive shots may in reality represent a single event, either photographed with more than one camera at the same time, or with the action repeated as many times as necessary.

The compression of an event from an hour down to five minutes results from selection: choosing to shoot for the sequence only the highlights, only the most significant moments—the moments which carry the sequence’s narrative, the sequence’s story, in the most economical as well as the most persuasive way. Selecting which moments to shoot for the sequence is another of the documentary director’s important creative acts.

The audience perceives the sequence’s time as continuous. The sequence is actually made up of shots which may not be, and probably are not, continuous in reality. Thus it is often impossible to join one shot of a character or action directly to the next without the break showing. Making the join acceptable effectively means hiding from the viewer the fact that the shots were taken at different times. It is in constructing the sequence out of its separate shots that this filmic sleight of hand is performed. By taking discontinuous shots of the action and interposing between them the close-ups of significant details (which are needed for their own purposes: to add meaning and comment to the scene), the film-maker can avoid confronting the viewer with the evidence of lapsed time.

Cut-aways and cut-ins

For instance, should the sequence be concentrating on an action performed by a principal character’s hands, a girl rolling a cigarette perhaps, interspersing the action of the character’s hands with shots of her face screwed up into a grimace of concentration allows the action to be represented by only its beginning and its end, without the tedium of having to follow the entire process. Such condensation of time is almost universal in documentary making, since actions which are short enough and interesting enough to be followed through in their entirety are rare. The interpolated shots may record reactions to the event, either by the participants themselves or others—concentrating on faces avoids revealing the time jumps in the event itself. Or they may introduce other relevant and significant details. In another example, the sequence may be telling the story of a boy’s treatment at the dentist’s. By interpolating shots of the dental nurse preparing the instruments and mixing up filling materials, the long, painful process of drilling the boy’s tooth can be represented by only a few shots of its selected ‘highlights’.

Such shots are known as cut-aways, because they cut away from the principal action. It is important to remember that even though cut-aways are there to serve a structural purpose in the sequence, the audience will assume that they are being shown these details because of their inherent significance. Thus cut-aways can never be simply shots of convenience but must always be chosen with great care.

A common situation in which cut-aways are needed is the interview. If the film-maker does not wish to reveal evidence of editing by using jump-cuts or fast dissolves to join different sections of the interview together, cut-aways will be needed to disguise the discontinuities. Where the interviewee is performing an action at the same time as speaking, the cut-aways may cover details of the action. Where the film acknowledges the presence of an interviewer or presenter, the cut-aways may be shots of the interviewer’s reactions, the so-called ‘noddies’. In situations where neither of the above is possible, the film-maker may have difficulty in finding a suitable subject for a cut-away shot. One sometimes sees interviews, particularly in news or current-affairs programmes, where the director has in desperation chosen to use shots of the interviewee’s hands, other anatomical parts, or even items of clothing as cut-aways. An episode of the political documentary strand Panorama¹ comes to mind in which the viewers were treated to a series of shots of the (female) interviewee’s trouser flies, which were not even undone. A directorial lapse must have left these unintentionally framed shots as the only available cut-away material. Unless such images are in themselves truly revealing, they serve rather to mystify the viewers than to mask the editing. It is usually better to recast the sequence than to struggle on with such bizarre material.

Though sometimes confused with each other, cut-aways are different from the shots called cut-ins. Cut-aways cut away from the action. Cut-ins cut into the details. Naturally whether a particular shot qualifies as a cut-away or a cut-in may depend on what the principal action is perceived to be. A mid-shot of the girl rolling the cigarette (which will include both her face and her hands) can be followed by a cut-in, a close-up of only her hands performing the procedure. If the real subject of the sequence is what the girl is saying while rolling up, the close-up functions as a cut-away. If we are concentrating on the rolling-up process itself, it qualifies as a cut-in.

While cut-aways allow time to be telescoped without difficulty, cut-ins may need greater care if they are effectively to collapse the action. In the case of the roll-up girl, the need for continuity demands that her hands be in the same position in both mid-shot and close-up. However her hands will be almost certain to adopt a similar position many times in the course of carrying out their task. By joining the mid-shot to a point much later in the cut-in, though one where her hands again match positions, the same compression of time can be achieved as with the cut-away. Naturally there is a risk that the viewer may notice the sudden change from a cigarette barely begun to one which is almost complete. Such a transition would usually depend for its success on the movement of the hands distracting the viewer from immediately noticing the change.

Cut-ins are by definition photographed in closer framing than the preceding and following shots. Cut-aways can be of any size. But with both, the issue arises: from whose point of view are they to be presented? There are two possibilities. It may be taken for granted that all images are shot from the point of view of the audience watching the action. But film convention has made an alternative possible: such shots can also be from the point of view of a participant in the scene.

In the second example, the boy at the dentist’s, deciding from whose point of view to shoot the assistant affects the direction from which the camera observes her. The cut-aways may be taken from the same position as the shots of the boy himself. Or the camera may be put in the dental chair, thus representing the point of view of the boy himself. In the latter case, the audience will be given a hint on how to interpret the cut-away by the action of the boy’s eyes. If, in the previous shot, we see him anxiously looking around the surgery, a cut-away to his point of view will be expected. If he is sitting in the chair with his eyes closed, it will be clear to the viewers that he is not looking at anything and that any cut-away cannot represent what he is seeing.

Similar considerations affect the choice of point of view of a cut-in. Suppose the sequence to be of someone cooking. Suppose that the main shot, the master shot, is a mid-shot, and includes the table on which the character is preparing a dish. There are now two ways of shooting cut-ins, which will be close-ups, of the food on the table. They can be shot from the same general position as the master mid-shot—anticipating the viewer’s desire to focus in on the detail of the action. Alternatively cut-ins could be photographed from the cook’s position and therefore from the cook’s point of view. It is the preceding shot which will determine the choice. Following a mid-shot showing the presenter and the demonstration, particularly if the presenter is looking to camera, a cut-in from the audience’s point of view would be expected to follow. After a close-up of the presenter’s face looking down at the demonstration, a cut-in from the presenter’s point of view would be more natural. The first choice confirms the viewer’s perception: that of an uninvolved observer. The second offers a way of getting inside the screen character’s head.

Pace

It is possible to stretch a single developing or travelling shot into an entire sequence; it is often done in the cinema. But such a filmic tour de force is unusual in television, as the small size of the screen, and the distance from which it is usually viewed, make it hard to extract enough visual richness from a single shot to maintain interest over any length of time. The main exception is the single talking head, which may often occupy an entire sequence. Here maintaining the interest depends entirely on the appeal of the speaker. There are speakers who are capable of holding an audience’s attention for as long as any film-maker might need.

Mostly, a sequence will be made up of a number of shots joined together. How long the sequence plays on the screen and how many shots it is composed of depend on a number of things: the place of the sequence in the film, the length of its other, neighbouring sequences, the overall pace of that part of the film, whether the sequence is at the beginning of the film or at the end, and probably many other factors too.

The film-maker ought to have in mind, at the time of shooting the sequence, what part it will play in the structure of the work and therefore what its mood should be: happy, sad, contemplative, exciting, tense, relaxed. The action to be filmed will have its own pace which the film-maker may also wish to, or be forced to, reflect.

Because of a sequence’s unity of time, its pace will be carried from shot to shot by the pace, the speed of action, of each constituent shot. The shot’s pace may come from the action of the subject of the shot, in other words the shot may be static but the characters within it may move, or else the shot may develop, track or pan or zoom with a pace of its own. As the shots are joined together, each shot’s pace will seem to run seamlessly into that of the next. The film-maker will be conscious of those connections while shooting the scene.

Pace is also much affected by the content of the shot and by the viewer’s understanding of its subject. No matter how fast the speed of action in a close-up, the viewer is aware of its small dimensions and automatically translates it into a world scale, dividing the apparent pace down many times in the process. In any case, as previously noted, for purely screen-size reasons, close-ups cannot contain fast movement. Thus the closer the shot the slower paced it tends to be.

Stills are not cinematic shots at all. Unless camera moves are imposed on a still subject, the use of a still will bring the sequence to a sudden halt, like a train unexpectedly hitting the buffers. When a still frame appears in a film it brings time to a stop and the flow of the sequence is paused. It is very hard for a sequence to pick up movement again from a still frame. Thus the use of stills is generally restricted to the ends of sequences.

Of course the pace may change over the length of the sequence. Equally, one slow-paced shot will not inevitably be followed by another. The change of pace and the alternation of shots as well as the exact timing of their beginnings and ends will be dictated by the demands of the visual rhythm.

Rhythm

Rhythmic interest is found in an ordered deviation from a regular pulse. This applies to placing shots in a sequence just as much as to playing the guitar. Rigid adherence to the regular beat quickly becomes boring. Totally random deviations from the beat are incomprehensible. Rhythmic satisfaction comes from giving enough order to the deviations sometimes to satisfy and at other times to surprise the audience. But while a rock ‘n’ roller can hear the drum giving the regular pulse throughout the music, film and video have no such luxury. The sense of the regular pulse must be implied by the rhythmic flow of the shots themselves.

Sequences are often cut to music. Here the pace of the music will usually match the pace of the movements in the sequence. The cuts or other transitions in the sequence will be placed according both to the content of the shots and to the beat of the music. Sound plays as important a part in establishing visual pace and rhythm as do the shots. Sound can motivate a cut or explain it. Music can demand a cut or deny it. But a sequence cut throughout on the same predictable beat of music without reflecting the content of the shots quickly becomes boring. When assembling the sequence, the editor will carefully take account of both sound and vision.

Sequence sound

Sound and vision are not equivalent senses. Sound vanishes utterly if the film is stopped, while the vision remains as a frozen frame. Sound is sensed serially, whereas all the elements of the visual signal are sensed in parallel, grasped at the same time. A given length of sound is far less information-dense than the same length of a series of images. A second’s worth of sound, for instance, can convey relatively little, compared with a second’s exposure to a picture. Much more visual information will be retained. Yet sound’s ability to conjure up a whole world of associations and emotions, something that vision by itself cannot do, makes sound an essential partner in assembling sequences in television documentary work. Many viewers, not paying great notice to the TV in the corner of the living room, will have their attention immediately attracted by a change in the quality of sound. If you want to make sure that your entire audience is watching, a sudden and unexpected silence in the sound track will usually do the trick.

Just as visual continuity must be maintained throughout a sequence so must continuity of sound. In fact sound continuity is in some ways even more important than continuity of vision. When shots are joined together into a sequence, great efforts are made to smooth the transitions of sound.

Sounds in a sequence are of two kinds: those relating to the individual shots in the sequence—speech, effects arising from the action, noises off, sound atmospheres—and those belonging to the sequence as a whole—music, spoken commentary, sounds split off from their visual source. In the case of the last named, the sound may begin as the natural synchronous sound associated with an on-screen event, but then continue over other shots to which it is only related by implication. A sound may begin in one category and end in the other. Thus a talking head may go on speaking while the visuals move off, to show the viewer other things in some way related to the speaker’s words. This technique, first used as a shocking surrealist device, has become an accepted convention in documentary film-making.

Of the sounds relating to the individual shots, there will be foreground sounds (speech, action effects) and background sounds (noises off, atmospheres). Where the shots in a sequence feature the same location, the audience will expect the sound background to remain unchanged throughout the series of shots. Atmosphere and noises off will be expected to keep continuity.

This often makes recording location sound quite difficult. A background noise such as a low-flying aircraft may not interfere much with a screen character’s intelligibility, but may make it very hard to include just a sentence or two of the character’s dialogue, unless the chosen words begin before the aircraft noise is heard, and also continue beyond the plane’s final fade-out. Otherwise, when joining the dialogue at the chosen point, the background sound would suddenly cut in at full volume at the beginning of the shot and equally suddenly cut out at the end.

Hence most film-makers try to isolate the foreground sounds with highly directional microphones and record the background sounds separately. In assembling the sequence, the foreground sounds are first joined together along with the visual shots, and the background atmospheres and noises off are subsequently added across them all.

The unifying power of sound continuity enables sequences to be assembled from shots that may have been recorded at widely different times, and even in different places. It is not unknown for a film-maker to recognize the need for a cut-in close-up of some object late in the production process when all the rest of the sequence has already been shot. It may then be necessary to shoot the close-up at an entirely different location (of course excluding any background that might give away the secret). As long as background sound continuity is maintained through the new shot, the viewers will be entirely unaware of the dislocation of time and place. In one episode of Life Power,¹ a series of films about biotechnology, the director omitted to shoot a close-up of a bottle of tablets in one of the protagonist’s hands. He continued to try to take that shot on every later filming occasion, finally succeeding in capturing the needed image some two months after and some thousands of miles away from the original scene. In the assembled sequence, the speaker’s voice, of course, never faltered and the shot matched perfectly. Such are the dishonest ways of the working documentary director.

Sound also plays an important role in the way shots join together. When editing a sequence, most editors will overlap the outgoing sounds with the incoming ones, so that a very fast mix rather than an abrupt change of sound can be achieved. This provides a more aesthetically pleasing effect. But the timing of the sound change may not need exactly to match the change in vision. It takes the brain fractionally longer to analyse and comprehend sound than visual images. The eyes understand first, then the ears. Thus in switching from one shot to the next, the sound is often begun some fraction of a second before the new image appears, helping to motivate the change of picture. Editors call it ‘telegraphing the cut’, as if sending a message to the viewer that the cut is on its way.

Or the new sound can lead by a more noticeable time interval, priming the viewer to question the source of the sound, and to appreciate the dissolution of suspense when that source finally appears on the screen. Here the sound of the subject of the shot is used in a way less like a purely naturalistic effect and more like a phrase of music.

Sequence music

Music has always been an important ingredient of the documentary film sequence, just as it is in fiction film technique. Nearly all films are, in a strict sense, melodrama. The music will usually be coterminous with the sequence, beginning with it and ending with it. In fact it is not unknown for some directors to lay the music down first and select the shots to accompany the music afterwards, rather than vice versa.

Music performs a similar task to the general sound atmosphere of a sequence, but with even greater precision and greater impact on the viewer. Much sound is value-free, whereas music can have powerful emotional effects and is capable of helping to determine the response of the audience to what they are seeing.

The style of the accompanying music can be related to the participants in the scene itself or it can simulate the viewer’s response. There are so many kinds of music with such a huge variety of associations that the choices are legion. The music can simply generate atmosphere or it can comment; it can be in ironic contrast to, or support the emotion of, the scene. The chosen music can be baroque, classical or contemporary; it can be popular or dance, folk or techno; or it could be that most familiar kind of late-romantic-style film music once characterized by the critic Germaine Greer in the unforgettable oxymoron ‘Jewish Wagner’. Whichever is chosen, it will not only influence the response of the audience but also reveal something of the attitude of the film-maker to the material.

Dramatic music, pastoral music, dance music, all change the way the viewers respond to the sequence. In We Can Keep You Forever, the documentary about American GIs missing but believed still to be alive in Vietnam, the music maintained such an unstoppable and high-powered drive from beginning to end that it kept the mood only just on the sane side of hysteria, a perfect match for the paranoid fantasies of the men whose exploits the film was following.

Sometimes the music can be allowed to take over the narrative function of the film. In The Last Exodus, the documentary about the history of Eastern European Jewry and the emigration of Soviet Jews to Israel, the makers were unwilling to use real archive film of the Holocaust to carry the necessary narrative. Instead, the film turned to a painting, The Dance of Death by Nussbaum, himself an Auschwitz victim, and travelled across its details while the film’s composer contributed a profound musical account of the Shoah.

Sequence music can belong to the first category of film sound: that arising from the events on the screen themselves, or it may be of the second: music not arising from the action but overlaid during the construction of the sequence.

Ethnographic documentaries often make use of the traditional music of the people who are the subject of the film. This seems a natural choice, particularly where the music is at one time the sound of an on-screen event and at another becomes the generic accompanying music for the whole sequence. However, there is a paradox hidden here. The response to the music of those whose everyday art it is—Amazonians for instance—will likely be quite different from the response of the documentary’s audience. A comic song in Yanomami will not raise many smiles among English or French speakers. Location music will certainly not have the same exotic associations for its performers as for those eventually hearing it on film. Thus using music recorded at the location, while undeniably providing flavour and atmosphere, will not necessarily tell the viewer much about the feelings and responses of the participants in the scene. And the exoticism conjured up by the unfamiliar melodies may work against the sense of identification between viewer and subject for which the film-maker may be striving.

Some music is so well known and carries with it such powerful associations that it will always impose those associations onto the filmed material. No documentarist could use the Mendelssohn Wedding March, Land of Hope and Glory or Entry of the Gladiators in other than their stereotypical roles, without recognizing that the viewers will interpret the inclusion of such well-known tunes as a comment, or even joke, on the subject by the film-maker.

Sequence narration

Words of commentary accompanying a sequence are not in the same category as the other elements of the sound. Whereas screen dialogue, sound effects and music all form part of the sequence and help the viewer to interpret the images at an experiential level, a commentary imposes a purely intellectual layer onto the documentary. For this reason there are many documentarists who feel that if a film needs a commentary to explain it, then it has failed as a creative work. It is certainly true that some factual television productions are little more than lantern lectures or radio productions with pictures of dubious relevance added afterwards. Clearly for a documentary to work primarily as a film, the filmic elements must be the priority. Commentary or narration is added afterwards for the additional richness it can bring to the viewing experience. Nonetheless, narration or commentary is an accepted part of the familiar television documentary sequence and will need as much care in construction as the other elements of the film.

Commentary accompanying a sequence can be of two kinds: that which is contributed by an anonymous narrator, a never-seen voice, and that which is a continuation of the speech of an onscreen character—who may in turn be either the subject of the film or its presenter.

The continuation of an on-screen voice over material shot at another time and another place seems paradoxical, but is the expansion of a perfectly naturalistic technique. A real participant in a scene, while engaged in listening to a speaker, will at times concentrate on the speaker, and at other times look around the scene—to survey the environment, to determine the reaction of other observers, to check on actions taking place elsewhere in the scene. Presented on film, such a situation is the direct parallel of a sequence in which shots of the speaker are intercut with other material from the same location.

When other visual material comes from elsewhere while the speech continues, the viewer is no longer in a situation ever actually experienced in real life. As suggested earlier, this is a technique first devised for its surrealist effect. What we have here is an overt fiction, and the viewer’s understanding of what is happening on the screen is radically changed. But though it is a fiction it does have a real analogue. The extension of a character’s voice over subsequent discontinuous shots parallels our experience of memory. We can bring back to mind, like a replay, the sound of someone’s voice from the past, while at the same time being in a quite different location in the present.

The moment we switch from using a speaker’s voice at the same time and at the same location as when the speech is being made, even if the speaker is not in vision, to using the voice as an accompaniment to other shots from another place, we are telling the viewer something about the time-slice which the whole sequence inhabits. The important question for the maker of a documentary sequence to establish for the audience is: which is the past and which the present? Is the speaker remembering and do the images represent the speaker’s own memory? Or is the speaker the memory and the other shots the now? The distinction will determine the way in which the audience will instinctively judge the material presented to them. For we all assume, do we not, that memory is fallible, while the camera doesn’t lie?

The images to which a character’s voice-over is applied must be chosen with care; the viewer can easily be confused. Using a character’s voice-over while showing that same character speaking on screen at another time, quite apart from the difficulty the viewer will have in distinguishing the voice-over from the actual sound of the shot, risks suggesting to the audience that the sound synchronism has failed. It can take some time for a viewer to work out that the voice in the voice-over is not to be taken as the synchronous sound of the image.

Once we have left the character’s on-screen appearance behind, the voice may return as a voice-over at any time in other sequences—or at least for as long as the viewer can recognize the voice and remember who is supposed to be speaking. Every time the voice-over returns, the same original time perspective will be suggested. The film-maker will take care to ensure that the memory and the ‘now’ remain consistent; otherwise the audience may become disoriented.

All these considerations apply just as much to an on-screen presenter as to any other character in the film, perhaps even more so. For in a film which uses a presenter in vision, the presenter’s voice will mostly be used for the overall commentary as well. But judging the time perspective represented by the presenter’s appearances to camera may be more difficult. Many film-makers make a point of introducing the presenter at the beginning of the film in such a way as to suggest that this first appearance is to be understood as the present and the entire following documentary is a flashback in time. Indeed films of the reportage genre are sometimes set overtly in the past, the presenter saying explicitly or implicitly: ‘I did this, I saw that, I witnessed the other.’ The grammar of the spoken text usually makes use of the past tense, or at least the historic present.

The role played by an off-screen narrator, a commentary by a voice never identified as a character in the work, is rather different. Here the time slice of the words is that of a continuous present tense, no matter what tense the grammar of the text actually uses. The classic voice-over is understood by the viewer as an accompaniment to the journey of discovery that is the film. Thus the voice-over can express just as much surprise at the outcome of a series of events in the documentary as the audience feels. For in such a case the narrator is a surrogate for the viewer him or herself. The voice attaches itself to the images. It is the unseen presence behind the camera. Or rather, the character whose eye the camera is—the concrete embodiment of the illusion of presence that the viewer experiences. Often the narrator will be the voice of the film-maker him or herself; though that fact is not always revealed to the audience.

A consequence is that the selection of the voice for the narrator is crucial for the identification of the audience with the point of view of the film. Viewers are asked to identify themselves with the narrator. All the issues of sex, age, class and race are raised by this demand, particularly in the English language, but not uniquely—there are others, like Russian, in which culture, class, ethnicity and educational background are instantly made plain by the speaker’s accent and dialect. For this reason some filmmakers avoid the unidentified and unidentifiable narrator’s voice, preferring to present the narrator as a character in the film, at least at the very beginning, and therefore to a large extent made actual and objective. From the moment that the narrator appears in vision, he or she no longer represents the viewer but merely him or herself, giving an account of the events which the documentary brings to the screen.

For many years Horizon, the long-running BBC strand of science documentaries, was narrated by the unmistakable voice of Paul Vaughan, called by one critic ‘the first invisible star of television’. In an episode called Rail Crash¹ a man was discovered standing on an empty railway station platform. He turned towards the camera and spoke, with the voice of Paul Vaughan. It was an extraordinary moment, almost shocking. And it was recognized as such by Paul Vaughan himself, who felt it necessary to acknowledge the audience’s inevitable surprise with the words ‘Yes, this is Paul Vaughan.’ Even screen-hardened television reviewers were moved by the moment; one newspaper expressed amazement that Paul Vaughan turned out to be ‘beetle-browed and mildly trendy’.

Different varieties of documentary habitually make use of different styles of narration. Films dealing with current affairs, being second cousin to news reports, often adopt an urgent, slightly breathless style. Documentaries about children may match the subject with a gentler sound. Educational productions frequently use a voice with an authoritative timbre and style. Filmmakers will take care, however, to ensure that their choice of voice and speaking style is not made by default and by cliché. Not all documentaries about women need a female narrator, not all documentaries about the life of the poor need a working-class accent, and not all arts productions need to be narrated in the cut-glass tones of the academy.

A narrator who appears on screen must perforce adopt a relatively naturalistic style of presentation. Only relatively, however. Certain kinds of production have well-established conventions of presentation. News films, for instance, often make use of a kind of verbless language of headlines which viewers expect and understand. Once the narration is carried in voice-over, a certain formality is taken for granted by the viewer. The hesitations and deliberations of normal speech, even those artfully simulated by an on-screen presenter reading a prepared text, are not so appropriate when the speaker cannot be seen. By the same token, the film-maker is freed from writing in a totally naturalistic style. A commentary can be, and often is, composed in a far more mannered and stylized way than words intended to be spoken by a visible speaker. A British foreign correspondent has been known to cast his film commentaries in blank verse—though unnoticed by his bosses or the audience. In a previously mentioned episode of Welcome to my World, a shot tracking at great length past row upon row of grounded warplanes at an air force base, had the presenter Robert Powell seamlessly switching in commentary from prose to rhyming verse. The stanzas were written to match the visual image and paced to accord with an accompanying drumbeat. Viewers remained only subconsciously aware that the conventions of the narration had changed:

‘Where are the men of war today?

Not on the battlefield, where fate

Rules between armies of machines;

And silicon and armour-plate

Fight, without ever knowing fear—

Or hate.’

The effect was to supercharge the sequence with heightened emotion. The sequence—a single shot—was ended by cutting to a real conversation between Powell and an airman, which returned the film seamlessly to the world of prose.

The cut

Sequences consist of shots joined together. The joins, the junctions between shots, are the transitions. There are a number of different kinds, each affecting the surrounding material in a different way. By far the commonest in television documentary is the cut.

There is good reason for this. When introduced to a scene in reality, we do not stare fixedly in one direction, nor do we sweep our eyes across the scene in smooth panning movements. When watching an event, an action in reality, we constantly cast our gaze around—to observe reactions, to fill in the context, to be forewarned of other oncoming events. When we do this, our brain suppresses our consciousness of the movement of our eyes. At a cricket match one moment we are aware of looking at the batsman, the following moment we are glancing at the bowler, the next we are looking up at the sky to see if it is going to rain. The effect is as if we are switching abruptly from one view to the next. It is this sudden switching that the cut in a sequence of television images seeks to emulate. When well judged, the cut is no more noticeable than our switch of attention in a real scene. In fact, a skilful director and editor can ensure that a cut is not noticed at all.

Well judged means—as it did with camera movement in a shot—that the film does what a viewer’s eyes would automatically wish to do. If the shots in the sequence are joined together in the way in which a real viewer would use his or her eyes in a real scene, the fact that a succession of different images follow each other on the screen will be largely unnoticed by the viewer. Whether this is the effect wished for by the film-maker depends on the film-maker’s attitude to the material and desired audience reaction. However, even if the film-maker is striving to make the audience aware of the artificial nature of the work, for every cut in the film to be noticeable and shocking would be very wearying, and probably unacceptable, to the audience. A modicum of smooth and unnoticeable cutting is the norm in almost all documentary work.

Smooth cutting, like acceptable camera movement, depends on motivation. When a cut is well motivated it is less likely to be noticed by or disturb the viewer. Motivation of a cut depends, among other things, on what may be called ‘pointing actions’. Thus, if a woman speaking on the screen suddenly points a finger to her left and says: ‘look,’ a cut to what she is looking at will almost certainly go completely unnoticed by the viewer, who will accept the change of image as if the viewer him- or herself had made the decision to look away from the speaker.

Most pointing actions are not so explicit. Any gesture, look or movement referring to an event or a sight off-screen by an onscreen character can act as a pointing action. For example, if in real life you are speaking to another person, who then glances to the left as if he or she can see something relevant to the conversation, you are likely automatically to glance to your right to see what he or she is looking at. In a film sequence, if a speaker on the screen merely glances to the left, a cut to another shot showing what the speaker is looking at will so mirror the viewer’s own instinct that he or she is unlikely to be aware that the shot has changed.

Almost any movement can act as a pointing action. Just as in the real world any movement in a scene will attract our attention, so in a television wide shot, movement of any one of the objects in the shot will motivate a cut to a closer look. A wide shot of a scene in which a motor car begins to move will make a closer shot of the vehicle and its passengers automatically acceptable. A person fiddling with something in his or her hand will justify a cut to a close-up of that object. Equally, any movement in a closer shot which threatens to take the subject out of the frame will make the cut to a wider view instinctively acceptable. So, with a character sitting in a chair, the action of beginning to get up out of the chair will motivate a cut to the wider shot which will keep the character within the frame when standing up. In fact movement of any kind, even if not entirely motivating the cut, will usually sufficiently distract the viewer’s attention from the fact that the shot has changed. It is as if following a movement with our eyes makes us unconscious of the switch of image.

When cutting to another shot on a movement, the question arises of how much of that movement to include in the outgoing shot and how much in the incoming. In general, cuts seem to work best if motivated by the start of the movement, but reserving the major part of the movement for the shot after the cut. Thus the total movement is split between the two. The two shots should match in the speed of their action. The viewer has great sensitivity to the flow of the movement between shots. Repeating part of the movement in both outgoing and incoming shots, even by only a few frames, makes a noticeable jump. Editors call it ‘double action’. But a screen movement bridging two shots is not the same as a real movement. It is a representation.

Unexpectedly enough, shooting an action with multiple cameras and cutting live from one to another, as is done in studio video recordings—continuity cutting—can give the impression that part of the movement has been shown twice, even when logically it cannot have been. When putting together a sequence shot with multiple studio cameras, one often has to leave part of the movement out altogether to avoid this apparent double action. The pace of the sequence determines how much of the movement is represented. The faster the pace, the more of the action may be omitted. It is as if when switching attention in real life from one part of a scene to another, the time taken for our eyes to make the change is taken into account—as if we expect to miss part of the action while our eyes are readjusting.

Matching the shots

Actually, it takes a measurable time, even on film, for the viewer to take in the change from one image to another. The time it takes depends largely on the composition of the before and after frames. The main key to smooth transitions is that there should be as little movement of the eyes as possible when going from one shot to the next. Since the brain suppresses awareness of the movement of the eye from one view to another of a real scene, a cut which mimics that change of focus should also suppress awareness of movement of the eye. This means that the centre of interest in the image, the centre of the action usually, should be in the same place in the two frames, so as to minimize the movement the eye has to make. If the action in the outgoing shot takes place in the centre-left of the screen, the action in the incoming shot should also be placed in the centre-left. If the eye has to search the incoming shot to find where to look for the continuation of the action, part of the movement—as much as four or five frames—may well be missed. Where this is most important to allow for is not so much when cutting from a wide shot to a close-up, since in the close-up the action is likely to occupy most of the image, but when going in the reverse direction: from a closer shot to a much wider one. How often one finds oneself frantically scanning the screen to discover where the action is continuing.

Special considerations apply when the action of the sequence involves the characters on the screen interacting with each other. A screen conversation between two people will have one participant speaking left to right, the other, right to left. If shot in MCU, the ‘speaking room’ given to each character displaces their images slightly away from the centre of the screen. Thus the viewer’s eye travels a short distance each time the shot is changed, not unlike a real observer watching a conversation. In wider shots, matching two-shots for instance—that is, shots in which both participants appear, the two matching shots each favouring one of the speakers—the displacement from the centre will be greater. Such a cut in any case implies a major change in the viewer’s position, so should allow time for re-orientation.

The aim of putting shots together to cover a conversation or an event is as always to provide the illusion of presence. The shots in the sequence will mimic what a real spectator would do if present at the scene. From this follow a number of empirical restrictions on the way scenes are shot, which make sense if thought of in terms of the physical presence of a real witness. Where would a person place him- or herself if listening in to a conversation between two others? The closer beside the protagonists, the more involved would the witness feel. The conventional placement of the camera represents a position between, and almost in line with, both speakers. In practice, this is impossible to achieve, since the camera would block the view between the speakers. However, by placing the camera behind each speaker and zooming in (Fig 10.1), an equivalent impression can be given. Cutting between the two views is the equivalent of a spectator turning his or her head from side to side, while following the exchange. (This is one of those cases in which it would be necessary to shoot the same conversation twice, once from each position, unless two cameras are used simultaneously.)

Fig 10.1 Matching medium close-ups simulate a position in between the speakers.

However, it is clear that wider shots that include the back of head and shoulders of either of the partners would imply a physical jump by the viewer through space from one position to the next. There has to be good reason for such a leap of viewpoint to be made. The viewer is better given the illusion of presence by going from such a wider shot to the medium close-up first, as if he or she had stepped forward into another position.

Conversations involving more than two screen characters can be arranged in similar ways, cutting between individuals and groups of people. The question to be answered in deciding on the shots is as always: where would the viewer be standing if he or she were really here, and how can one minimize, or at least rationalize, the viewer’s implied movement from shot to shot? This is particularly the case when cutting from a wider to a closer view of the same subject. If the axis of the shot does not change (that is, if the change of size is achieved simply by using the zoom lens), the viewer can feel as if suddenly, and uncomfortably, jerked forward or backward, an effect often called ‘tromboning’. It is almost always better to move the camera to a different position when compiling different sizes of shot of the same subject.

Overall movement must, of course, maintain screen direction across a cut. If a woman is walking left in one shot, she must still be walking left in the next. A man lifting his hand must continue to lift it, not be letting it fall. If it is necessary to change the screen direction in the course of a sequence, the switch is not made between shots, but a shot is introduced within which the change is made.

In the opening sequence of Foundations, from the Living Islam series, a religious procession is shown winding its way around the streets of Cairo. The marchers begin by going from screen left to right down smaller side streets, change over to right to left for their entry into the old part of the city, and finally arrive at the main square going from left to right on the screen again. The first change of direction was achieved by a shot at a roundabout, following the marchers as they came forward and right, then panning with them as they turned screen left. The second turnaround was the result of the camera itself moving from one side of the marchers to the other, across the head of the procession.

The fade and the dissolve

‘Fade up from black’ says the traditional first line of a cinema film script. The image arises out of nowhere, preceded by nothing. It develops the incoming shot from an empty screen, a mirror of the opening lines of the book of Genesis in the Bible where ‘In the Beginning’ God creates heaven and earth and darkness is upon the face of the deep. The fade up from black is the archetypal transition technique for beginning a stretch of time or for opening an argument. The final fade down to black spells The End, even if the words ‘The End’ do not appear superimposed.

To fade out and then fade up again, pausing for a moment of black—of blank screen—in between, is by far the most divisive of separating techniques, marking not so much the end of a paragraph as the end of an entire chapter. It completely breaks the narrative chain of the film. It brings to a halt all story, all movement, all development. It says to the audience: ‘That was then. What follows is now.’

The fade-out and fade-up are therefore most useful when structuring films into quite separate sections, when different and clearly distinct approaches, stories, arguments, accounts of some phenomenon are being marshalled. In Imagina ’89, the hour-long documentary about the Monte Carlo Festival of Computer Graphics and Animation, the account of the event was divided into separate sections, concerning different aspects of the festival. The sections of the film were separated by a fade to black and fade up again. ‘Chapter headings’ in text were superimposed over the blank screen in between to indicate to the audience that the subject was changing.

But if one shortens the gap in the middle, bringing the fade-down ever closer to the fade-up until they overlap, a variety of mixes or dissolves is created. If the fade-out begins some time before the new shot mixes in, and if the outgoing image is not quite gone while the incoming one is still fading up, some linkage between the two is maintained. If the fade-out and the fade-up are simultaneous, what results is no longer a dividing device but another form of transition—the dissolve.

Dissolves are used in two ways in documentary work: where shots cannot practicably be joined together without a jump, or where a perceptible break in the flow of shots is actually needed. In a way, dissolves used for the first purpose—to avoid an unseemly cut—might be seen as an admission of failure. Yet they are sometimes unavoidable, and sometimes even to be sought after. To cut from a shot which is static to one which is already moving—a zoom, a pan—is unsatisfactory; apart from its inelegance, it jerks the viewer, like a standing pedestrian suddenly grabbed by a passing car. But a dissolve is, in itself, a kind of movement. So it can serve as a way to begin a passage composed of moving shots. It is also a way to connect moving shots to each other, adding the counterpoint of dissolves to the visual rhythm of the shots. It is often done to music. And it is often composed of moving shots, taken with a rostrum camera, of stills.

The dissolve is not just a useful technical device. For it does not simply join shots together, as does the cut. The effect on the viewer is quite different. The dissolve brings the element of time right into the image. It makes time visible. The cut emphasizes serial continuity, one shot after another. The dissolve’s two independent images coexist on the screen during the same moments, as if in a kind of hallucination, or mental dislocation. The dissolve tells the audience something about time and place. The shots on either side of the dissolve represent either widely different times, or locations far from each other—or sometimes, confusingly, both. The director’s task is to make it possible for the audience to distinguish the two.

Sometimes the answer is obvious. Dissolving from a scene in daylight to the same scene at night implies ‘time passes’. Dissolving from a shot of a speaker to another, identically framed, shot of the same speaker—sometimes done to avoid a jump-cut in an interview while honestly informing the audience that material has been excised—tells the viewer that the two shots are not continuous in time. But dissolving from a scene at one clearly established location to another scene understood by the viewer to be elsewhere, may say—paradoxically—‘meanwhile’; in other words, elsewhere but at the same time. The distinction between the two is independent of the characteristics of the dissolve. It is an intellectual distinction, engaging the viewer’s understanding of the meaning of the sequence. It is implied by what comes before and what after.

Dissolves are not merely junctions between shots but are effects in their own right. They have an impact, and an aesthetic value, of their own. They can be long, lasting over many seconds, or very brief indeed. The longer the dissolve, the more conscious will the audience be of the active presence of the film-maker. A documentary joined mainly by dissolves will largely forgo the illusion of presence and make a much more objective impression on the viewer.

On film, the dissolve is a relatively simple device. The outgoing shot is faded out at the same time as the incoming shot is faded in. But though simple, the technicalities are somewhat demanding. For a smooth effect, the fade-out and the fade-in are not made linear. That would make the end too sudden. Instead, plotting exposure against time reveals an exponential curve: the ends of the slope become ever shallower. If not placed with considerable care, an exponential fade-out added to an exponential fade-in can result in a peak of brightness in the middle of the dissolve. The aim of a good dissolve is to maintain the same level of exposure throughout.
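The requirement of constant exposure can be illustrated with a little arithmetic. The following is a minimal sketch in Python, not a description of any laboratory or printer process. It assumes that the two fades can be modelled as simple gains between 0 and 1 that add together, and it uses a smooth S-shaped curve (commonly called smoothstep) as a stand-in for the exponential curve described above. The function names and the `offset` parameter are hypothetical, introduced only for this illustration.

```python
def ease(t: float) -> float:
    """S-shaped gain rising from 0 to 1, shallow at both ends (assumed curve)."""
    t = min(1.0, max(0.0, t))  # clamp time to the span of the dissolve
    return t * t * (3.0 - 2.0 * t)


def total_exposure(t: float, offset: float = 0.0) -> float:
    """Combined gain of both shots at time t (0 = start, 1 = end of dissolve).

    offset > 0 starts the fade-in early; offset < 0 starts it late.
    """
    fade_out = 1.0 - ease(t)      # outgoing shot dims
    fade_in = ease(t + offset)    # incoming shot brightens
    return fade_out + fade_in


def exposure_profile(offset: float = 0.0, steps: int = 10) -> list[float]:
    """Sample the combined gain at evenly spaced moments through the dissolve."""
    return [round(total_exposure(i / steps, offset), 3) for i in range(steps + 1)]


if __name__ == "__main__":
    print("aligned:", exposure_profile(0.0))
    print("fade-in early:", exposure_profile(0.2))
    print("fade-in late:", exposure_profile(-0.2))
```

When the two curves are exact mirror images and perfectly aligned, the combined gain stays level throughout. Under these assumptions, starting the fade-in slightly early pushes the total above the normal level in the middle of the dissolve, which corresponds to the peak of brightness mentioned above, while starting it late produces a dip.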

Video dissolves are available in much greater variety. Devices exist which can produce many different kinds of dissolves: beginning at the edges and progressing towards the centre of the frame, beginning in a selected position and moving out from there, sweeping down from the top or up from the bottom of frame, variations which bring the dissolve close to the wipe as an effect.

Whatever technique is appropriate to the medium, a smooth dissolve will need to place the centre of interest in the incoming shot in the same screen position as the focus of the viewer’s attention in the outgoing shot. That is where the viewer’s eye will begin to examine the new frames. The eye will be far more easily satisfied if the action or other focal point of the image is there ready and waiting for attention, in the very place where it is already looking.

More thematically elaborate dissolves are often employed, in which the new image arises out of a chosen place in the old. In the first moments of The Last Exodus, to suggest the dreams of the emigrants waiting all night for their flight out of the Soviet Union, the shot of an old man’s face as he slept on a seat in Moscow’s Sheremetyevo airport dissolved to an image of the rising sun, burning through from behind his closed eyelids, the sun itself then dissolving to a shot of the old city of Jerusalem, with the golden cupola of the Dome of the Rock replacing the sun’s globe. The tone of the images changed with each dissolve: from the blue-black darkness and shadows of the Moscow airport night to the blood red of the rising sun to the bright daylight of Jerusalem the golden.

Image manipulation

The development of electronic video transitions has not only made many more kinds of dissolves possible, it has also rescued the wipe effect from the store cupboard of film history. In a wipe, a demarcation line, separating the incoming image from the outgoing image, moves across the frame. The movement can be horizontal, vertical, circular or yet more intricate.

The development of digital video has made entirely new styles of wipe transition possible. Since most documentary material shot on film is now transferred to video for editing, such techniques are available no matter what the original shooting medium. Converting the images into a series of numbers allows for elaborate manipulations picture-point by picture-point. Different shots can be contained in different portions of the screen and moved along arbitrary paths, to break up and reform at the programmer’s will.

The wipe turns the screen image into an object in itself. Its origin may well have been in imitation of the turning of a page, the moment when our perception of a book shifts from immersion in the meaning of the text to the physicality of the paper. In fiction films, wipes often retain a literary flavour. In the television documentary, a greater stylistic inspiration comes from graphic design.

Because the image is treated as an object, it can become any shape. When used as a transition device, that shape is often derived from the content of one of the images by deriving from it a ‘key’, an area of the screen detected as containing a particular colour or level of luminance. Thus black and white archive film could be inserted into the windows of a modern colour image of a building, the transition made by zooming into the windows until the archive material fills the screen. Or the image-object’s shape can be more fanciful. By ‘mapping’ the incoming image onto an object moving in the outgoing frame (or vice versa), the image can become almost any shape and undergo almost any degree of distortion. Some documentary makers delight in playing games with such images; though it must be said that on occasion, the games played tend to make a statement about the film-maker’s pleasure in using a new toy, rather than make any real contribution to the subject of the film. Recently seen was a shot of a police car rolling itself up into the shape of a large bee, buzzing around the head of the prostitute revealed waiting underneath, finally diving down and disappearing into her cleavage. It was fun, but told us little more than that a new machine had recently been delivered to the editing room.
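The idea of a luminance key can be shown in miniature. The sketch below, in Python, is an illustration under simplifying assumptions rather than an account of how any particular effects machine works: each frame is a small grid of greyscale values from 0 (black) to 255 (white), and the key is simply every picture-point in the outgoing frame that is at least as bright as a chosen threshold. The names `luminance_key`, `composite` and the sample frames are hypothetical.

```python
Frame = list[list[int]]  # rows of greyscale values, 0 (black) to 255 (white)


def luminance_key(frame: Frame, threshold: int) -> list[list[bool]]:
    """Mark every picture-point at least as bright as the threshold."""
    return [[pixel >= threshold for pixel in row] for row in frame]


def composite(outgoing: Frame, incoming: Frame, threshold: int) -> Frame:
    """Show the incoming frame wherever the outgoing frame is keyed."""
    key = luminance_key(outgoing, threshold)
    return [
        [inc if keyed else out for out, inc, keyed in zip(out_row, inc_row, key_row)]
        for out_row, inc_row, key_row in zip(outgoing, incoming, key)
    ]


# A dark building with two bright "window" points in the middle column.
building = [
    [40, 230, 40],
    [40, 240, 40],
    [40, 40, 40],
]
# Stand-in values for a frame of archive film.
archive = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]

result = composite(building, archive, threshold=200)
# The archive values appear only where the building frame was bright:
# [[40, 20, 40], [40, 50, 40], [40, 40, 40]]
```

Only the two bright window points are replaced by archive material; the rest of the building is untouched. Zooming into the windows until they fill the frame would then complete the transition.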

Notes

1 BBC 1985

1 BBC 1983

1 BBC 1974
