11
Pronunciation and the Analysis of Discourse

BEATRICE SZCZEPEK REED

Introduction

Spoken interaction relies entirely on the way in which utterances are physically delivered. While the pronunciation of vowels and consonants can tell us a lot about the identity of a speaker in terms of, for example, where they come from, their speech melody, rhythm, and tempo will help create specific discourse meanings uniquely fitted to a given conversational moment. Producing vowels and consonants involves what phoneticians call articulation, that is, the pronunciation of individual speech sounds. Sounds are conceived of as segments of words and are therefore often referred to as representing the segmental level of speech. Features such as rhythm, intonation and tempo, on the other hand, are frequently referred to as suprasegmentals, as they apply not to individual sounds, but to entire words, or even utterances: they occur above the level of the single segment. For the analysis of spoken interaction the suprasegmental level of talk is the most relevant, as speakers employ it to subtly manipulate the pragmatic meaning of their utterances. Therefore, this chapter is primarily concerned with the suprasegmental aspects of speech.

Another term that is frequently used for suprasegmentals is prosody, often defined as the musical aspects of speech: pitch, loudness, and time. In the following section the role of prosodic features for the accomplishment of conversational actions will be considered, and it will be discussed whether it is possible to assign specific discourse functions to individual features. Subsequently, issues surrounding the learning and teaching of pronunciation will be presented, and the argument will be made that in order to achieve interaction successfully and fluently in a second language, it is not necessary to speak with “native-like” prosody.

The role of prosody for discourse

Research on prosody in conversation has shown that the pitch, loudness, and timing of utterances play a vital role in shaping the social actions that speakers perform through language. However, the fact that speakers do not follow a pre-scripted plan but instead continuously create new interactional situations, with new contingencies and risks, means that the role of prosody is a complex one. Nevertheless, there are some contexts in which certain prosodic features seem to be used regularly and systematically. Below we consider conversational turn-taking, sequence organization, and individual actions, such as repair and reported speech.

The examples of naturally occurring talk presented in this chapter are transcribed according to an adapted version of the GAT conventions (Selting et al., 1998), which can be found in the Appendix. Briefly, punctuation marks are used to denote phrase-final pitch movements, such as commas for rise-to-mid and periods for fall-to-low, and capital letters are used to denote levels of stress. The rationale for using such a system, rather than IPA transcription, for example, is to allow the analyst to incorporate prosodic (rather than phonetic) information while still providing an accessible transcript to a broad readership.

Turn-taking

One of the most important conversational activities is turn-taking, that is, speakers’ moment-by-moment negotiation over who speaks next, and for how long. Here, prosody is used as an important cue for whether an utterance, or turn, is potentially complete, or whether its speaker intends to continue talking.

In the following example, Rich is telling his brother Fred about life without a girlfriend.¹ In theory, Fred could come in to speak after line 2 or line 4; however, the intonation at the end of those turns is level, as indicated by the dash symbol in the transcript. Fred only starts speaking when Rich has produced low falling intonation at the end of his turn at lines 5–6, indicated by a period.

1. SBC047 On the Lot

1  Rich:  it's LONEly coming home after pUtting in t- twelve hours
2         on the LOT - =
3         and wOrking All DAY and;
4         yOU know wOrking all EVEning - =
5         and then you don't have Any(.)bOdy to come hOme and
6         SHARE it with.
7  (0.32)
8  Fred:  YEAH;
9  (0.54)
10        .hh a- are y- are yOU WORKing twelve hours?

Another piece of evidence that Rich has finished talking after line 6 is the pause at line 7: he does not say any more after he has produced the low falling pitch movement. This example demonstrates a regular occurrence for British and American standard varieties of English, where potential next speakers often wait until a current speaker has produced a low falling intonation contour before they come in to speak next. Of course, intonation is not the only factor affecting turn-taking decisions. Firstly, there are other prosodic features that play a role. Speakers usually slow down slightly towards the end of their turn and tend to lengthen the final syllable; their speech also decreases in loudness; and in some cases the last syllable takes on creaky voice quality. Secondly, nonprosodic features play an important role. Ford and Thompson (1996) show that it is a combination of grammatical, pragmatic, and prosodic cues that allows conversational participants to judge whether a speaker is finished or not, that is, a speaker has typically finished a sentence and the overall point they are making in terms of content before others come in to speak.

While the above example is representative of standard varieties of British and American English, the prosodic cues for turn-taking vary considerably across accents and dialects. For example, in Tyneside English, spoken in the North East of England, the prosody for turn completion is either a rise or a fall in pitch on the last stressed syllable, combined with a slowing down towards the end of the turn, a sudden increase and decrease in loudness on the last stressed syllable, and lengthening of that syllable (Local, Kelly, and Wells 1986). Similarly, the prosody of turn completion in London Jamaican (Local, Wells, and Sebba 1985) and Ulster English (Wells and Peppé 1996) varies from standard varieties of English.

While turn transition after low falling pitch is a frequently occurring phenomenon, it would be wrong to assume that every time a speaker uses a low fall in pitch they automatically stop speaking and another participant comes in. While discourse participants orient to systematic uses of conversational resources, they nevertheless negotiate each social action individually. This is also true for turn-taking, whose systematics were described in a seminal paper by Sacks, Schegloff, and Jefferson (1974): at each potential turn completion point current speakers may choose to continue or not, and next speakers may choose to come in or not. The following example shows this clearly. At lines 5 and 10, Michael produces potential turn completion points, at which his intonation falls to low. Both are followed by pauses, showing that Michael is leaving the floor to be taken up by his co-participants. When this does not happen, he himself continues speaking.

2. SBC017 Wonderful Abstract Notions

1  Michael: but there's ONE techNOLogy that's uh:m;
2  (0.19)
3           gonna overtake THA:T and that's;
4  (0.17)
5           DNA research.
6  (0.12)
7           WHICH is LIKE(0.11) a TOtal SCAM at thIs point still
8           it’s they're just like (0.18) bomBARDing;
9  (0.75)
10           .h ORganisms with radiAtion to see what comes UP.
11  (0.31)
12           .hh you KNOW;
13           we have vEry little conTROL over it;
14           but once we ↑DO;
15  (0.58)
16           .hh we'll be able to prOgrA:m biOlogy as WELL.
17  (0.83)
18  Jim:     well THA:T’S pretty frIghtening cOncept.
19  Michael: it IS frIghtening but-
20  (0.3)
21           [uhm
22  Jim:     [we cAn't even control our FREEways.

It is only at line 18, after a considerable pause following another low falling turn completion point, that Jim comes in to speak. His utterance ends in low falling intonation, and is immediately responded to by Michael. However, at line 22 we see another local variation of the turn-taking system: Jim comes in to speak even though Michael’s previous turn (line 19) is neither grammatically nor prosodically complete.

The above example demonstrates that we cannot assume a straightforward form-function relationship between prosodic features and discourse actions. Speakers may routinely orient to certain patterns, but nevertheless negotiate individual sequences afresh. Furthermore, in the same way that we cannot assume that speakers always implement turn-taking after prosodic turn-taking cues, we also cannot assume that the prosodic cues for turn-taking are always the same. While there is a strong orientation to low falling intonation, many other patterns may appear at the end of turns depending on the immediate context (Szczepek Reed 2004). In the following example, Joanne describes her favorite holiday destination, Mexico, by listing the many things she likes about it.

3. SBC015 Deadly Diseases

1  Joanne:  BEAUtiful BEAUtiful blue hehe blue WAter,
2           and and .hh WARM Water -
3           and like CORal and TROPical F:ISH - =
4           and inCREDible r- like reSORT -
5  (1.26)
6           lIke uh::m;
7           <<p> hoTEL:S,
8           and REStaurants,>
9  Ken:    .hh Oh when wE were there LAST;
10           we- th- it was JUST after an eLECtion;

The intonation for each list item is either slightly rising (lines 1, 7, 8) or level (lines 2, 3, 4). Neither pitch movement is routinely employed as a turn completion cue; however, in this instance, another prosodic feature plays an important role. Towards the end of her list, Joanne’s voice becomes softer (lines 7–8), indicated by <<p>> for “piano”. As the turn fades out, Ken comes in to speak (line 9) after a slightly rising intonation contour. There is no further talk from Joanne, which suggests that she indeed had not planned to continue speaking. It is also relevant that her earlier pause of 1.26 seconds (line 5) and her use of the tokens like uhm (line 6) indicate local difficulties in the construction of the turn, while the rising pitch on the final two list items projects the potential for more items, rather than necessarily their upcoming delivery.

Prosody also plays a role when turn-taking becomes problematic. French and Local (1986) describe how it is primarily through prosody that participants show whether they consider themselves to be the rightful turn holder, in which case they increase their loudness in the face of an interruption, or whether they are illegitimately interrupting, in which case they increase both loudness and pitch register. Participants who are being interrupted typically raise their overall loudness until the interrupter drops out. See, for example, the following excerpt, in which Angela interrupts Doris at line 4.

4. SBC011 This Retirement Bit

1  Doris:  I’m not a very good PILL taker.=
2          I’m re-
3          i THINK i’m [reSENTing;
4  Angela:        [I’m not EIther but [i get-
5  Doris:                  [<<f> I’m resenting> this
6          MEDicine.
7          and I think it’s conTRIButing to my PROBlems.
8          i REALly DO.

In response to Angela’s interruption at line 4, Doris increases her loudness (lines 5–6), indicated in the transcript by <<f>> for “forte”. She does so only for a very short part of her utterance (I’m resenting), until Angela has stopped speaking, after which Doris returns to her default loudness.

In the following example, mother Patty and daughter Steph are discussing Steph’s SAT scores with Steph’s friend Erika.

5. SBC035 Hold my Breath

1  Steph:   i KNOW what the trIcks are.=
2           that's ALL you need to KNOW.

3  Erika:   TEACH them to [me.
4  Steph:              [<<f> the Only [way you can-
5  Patty:                        [<<f+h> but whAt you HAVE to
6           remEm[ber I:s that-
7  Steph:        [<<f+h> the Only way you can SCORE high> <<dim> is
8           if you READ a lot.>
9           [THAT’S ALL.
10  Patty:  [<<f+h> what you HAVE to re[MEMber is;>
11  Steph:                  [you CAN’T study;
12  Patty:  <<f> that the SAT> is nOt a whole mEAsure of who you
13          ARE.>

Patty interrupts Steph at line 5, at a point where Steph has clearly not finished speaking. Patty does so with high overall pitch register and high overall loudness. At line 7, Steph also raises her loudness and pitch register, but reduces both once her mother has dropped out. Patty interrupts again at line 10, with high loudness and overall pitch, but also returns, first, to her default pitch register, and then to her default loudness as Steph drops out (line 12).

In considering these examples we must bear in mind that increased loudness and high pitch register may accomplish many other things besides interruptions in conversation and that interruptions may not always display these features, depending on the type of interruption a speaker is engaged in. While it is the case that participants in conversation use prosodic features systematically, they also do so flexibly, as each instance emerges as part of its specific interactional context.

Sequence organization

Another primary action participants are involved in during spoken interaction is sequence organization (Schegloff 2007). This term refers to the way in which speakers organize larger conversational projects, such as narratives, complaints, or requests. Here prosody also plays an important role. For example, Couper-Kuhlen (2004) suggests that when speakers begin a new sequence in conversation, they usually do so by stepping up to a higher pitch. Similarly, Local (1992) shows that when speakers design an utterance that was interrupted as a “restart”, they do so with a change to higher pitch, whereas when they design it as a “continuation” of prior talk they do so at the same pitch level as the prior utterance. However, Szczepek Reed (2006, 2009, 2012a) has shown that it is not so much a specific prosodic pattern, or even feature, that is relevant for designing talk as continuing a previous sequence or starting a new one. What seems more relevant is whether participants repeat the prior speaker’s overall prosodic design, or not.

In the following excerpt, two short sequences are accomplished with prosodic repetition, or “matching” (Szczepek Reed 2006). At line 3, Alan initiates repair on Jess’s previous turn, that is, he indicates that he has a problem with it: Jess claims that a book she has been looking for is not held by the British Library, which Alan responds to with what? He does so with a high pitch register (line 3).

6. BSR REC6

1  Jess:    the british LIbrary doesn’t even hAve it though.
2  (0.4)
3  Alan:    <<h> WHAT - >
4  Jess:    <<h> YEAH - >
5     (0.4)
6           because like an amErican -
7  Alan:    OH yeah;
8  Jess:    BOOK;
9  Alan:    they USED to have every (.) english bOOk of course dOn’t
10          they;
11 Jess:    oh IS it;
12 (0.3)
13 Alan:    YEAH::;
14 Jess:    i had ↑QUITE a [lot of Other ones tOO;
15 Alan:              [i think it might be the LAW (.) they have
16          to give them all-
17 Jess:    <<h> oh REALly;>
18 Alan:    <<h> YEAH:: i thInk> [so:;
19 Jess:                  [SHI::T;

In response to Alan’s repair initiation, Jess provides the repair, that is, she confirms that what Alan had found hard to believe is indeed true (line 4). What is interesting is that while Jess’s pitch at line 1 is in her default range, her pitch at line 4 matches Alan’s: his repair initiation is produced with high pitch register and so is Jess’s repair. The pitch matching can be seen in Figure 11.1 (Alan’s turn is represented in the top tier, Jess’s turn in the bottom tier). Shortly afterwards at line 17, Jess issues a news receipt, oh really, of Alan’s previous informing at lines 15–16. The news receipt is again produced with a high overall pitch register and so is Alan’s response, yeah I think so (line 18). Once again, a response is designed as matching the prosodic design of the turn it is designed to respond to.

**Figure 11.1** Pitch matching in excerpt (6), lines 3–4.

Both sequences in the above excerpt are adjacency pairs (Schegloff 2007), that is, they are sequences in which one turn (a so-called First Pair Part) initiates and makes relevant a certain type of action by another speaker (a Second Pair Part). Typical adjacency pairs are question-answer, greeting-return greeting, or, as in this case, repair initiation-repair and news receipt-confirmation. The matching of the prosodic design plays an important role for the second turn to be heard and treated as a Second Pair Part. If second speakers do not match first speakers’ prosody, their responses may not be treated as appropriate. See, for example, the next excerpt, in which Julie greets Tricia, a 9-year-old child. Julie’s first greeting is delivered with a high pitch register and a musical interval. Tricia’s next turn is produced with a low pitch register.

7. BSR Farm (no recording)

1  Julie:   <<h + musical interval> HI TRIcia - >
2  Tricia:  <<l> hellO;>
3  Julie:   <<l> hellO;>
4  Mum:     <<h> HI TRISH -
5           yOU alRIGHT,>
6  Tricia:  <<l> NO;>

What is noticeable about this excerpt is that Julie issues another greeting at line 3. This shows that she does not treat Tricia’s low-pitched hello as a return greeting to her own earlier high-pitched greeting; instead, she treats it as a new first greeting, requiring a response. Her return greeting matches Tricia’s low pitch.

This example shows that with regard to the starting of new sequences in conversation, it is not necessarily the precise nature of the prosodic design that is relevant (i.e., high pitch), but instead the presence or absence of prosodic matching more generally. Turns that match the prosody of prior turns may be treated as responding. Turns that do not match the prosody of previous talk may be treated as starting something new – even if the speaker uses low pitch, as in excerpt (7).

Conversational actions

Besides turn-taking and sequence organization, which have to be achieved throughout a conversation, speakers also make use of prosody in their accomplishment of individual social actions. In most cases, prosody is not the only feature that implements actions, but there are some instances in which it plays a primary role. For example, Selting (1996) shows that the German repair initiation “bitte” (“pardon”) is used in two different prosodic variants, which are treated by recipients as implementing two different conversational actions. Both versions of “bitte” are produced with rising intonation, but in one case the pitch rises considerably higher than in the other and the overall loudness is increased. Selting shows that while “bitte” with a default pitch span and loudness is treated as initiating repair over mechanical issues, such as an acoustic issue or other understanding problem, “bitte” with a wide pitch span and increased loudness is treated as a cue for astonishment. In the second case, next speakers do not repeat what they said, as they do after “bitte” with default prosody. Instead, they display accountability, thus showing that they heard the loud and high-pitched “bitte” as initiating repair over the content of their previous turn, rather than over acoustic features.

Another activity for which prosody seems to be crucial is participants’ quoting of others. Couper-Kuhlen (1996) shows that there is a clear interactional distinction between simply repeating what another speaker said and mimicking it. This distinction is achieved primarily through prosody, particularly pitch register. Couper-Kuhlen describes two variations of male speakers’ repeating female speakers’ talk. In both cases, the men match the women’s pitch register. However, they do so either on a relative or on an absolute scale. If pitch register is matched on a relative scale that means that a male speaker repeats the word or phrase of an immediately prior female speaker in the same pitch register, but relative to his own voice range. Thus, if a female speaker is speaking in an upper-mid register, the male speaker will repeat the female speaker’s words in what is an upper-mid register for him. This is the default case, and is treated as unmarked repetition by participants themselves. If, on the other hand, the quoting male speaker matches the female pitch register on an absolute scale, this means he uses exactly the same pitch as the woman, thus speaking extremely high in his own voice range. This is treated by participants as mimicry and a form of implicit criticism.
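The difference between relative and absolute matching can be made concrete with a small calculation. The following Python sketch is an illustration only and is not part of Couper-Kuhlen's analysis: the two voice ranges, the 300 Hz example pitch, and the function names are hypothetical, and measuring a speaker's register as a position within their range on a semitone (logarithmic) scale is an assumption made for the sake of the example.

```python
import math


def hz_to_semitones(f_hz, ref_hz=1.0):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f_hz / ref_hz)


def relative_register(f_hz, range_min_hz, range_max_hz):
    """Position of f_hz within a speaker's range on a semitone scale.

    0.0 is the bottom of the range, 1.0 the top; values above 1.0 lie
    outside the speaker's habitual range.
    """
    low = hz_to_semitones(range_min_hz)
    high = hz_to_semitones(range_max_hz)
    return (hz_to_semitones(f_hz) - low) / (high - low)


def pitch_at_register(position, range_min_hz, range_max_hz):
    """Inverse of relative_register: the frequency at a given position."""
    low = hz_to_semitones(range_min_hz)
    high = hz_to_semitones(range_max_hz)
    return 2.0 ** ((low + position * (high - low)) / 12.0)


# Hypothetical voice ranges in Hz (assumed values, not measured data).
FEMALE_RANGE = (160.0, 400.0)
MALE_RANGE = (80.0, 200.0)

# Hypothetical pitch of the female speaker's phrase.
female_pitch_hz = 300.0
female_position = relative_register(female_pitch_hz, *FEMALE_RANGE)

# Relative matching: same position, but within the male speaker's own range.
relative_match_hz = pitch_at_register(female_position, *MALE_RANGE)

# Absolute matching: the same frequency as the female speaker.
absolute_match_hz = female_pitch_hz
absolute_position = relative_register(absolute_match_hz, *MALE_RANGE)

print(f"female position in own range: {female_position:.2f}")
print(f"relative match: {relative_match_hz:.0f} Hz")
print(f"absolute match sits at {absolute_position:.2f} of the male range")
```

Under these assumed ranges, the relative match stays in the upper-mid part of the male speaker's range, whereas the absolute match falls above the top of it, which corresponds to the "extremely high" delivery described above.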

Klewitz and Couper-Kuhlen (1999) consider quoting nonpresent speakers, and compare prosody to the use of quotation marks in written texts. They show that while a change to a different prosodic pattern may indicate the onset of reported speech, spoken discourse is much more flexible than written punctuation and does not require prosodic marking to continue for the whole stretch of reported speech. A change to a high pitch register, for example, may be enough to indicate that reporting has begun, even though it may not be sustained throughout the entire turn. Interestingly, speakers may also project upcoming reported speech by adopting the prosodic design before the actual reported speech sequence has begun, as in the following excerpt.

8. SBC006 Cuz

1  Alina:  (JOY) talked the whole time;=
2          <<falsetto+extra high+all> in a voice like THIS -
3  (0.44)
4          <<higher falsetto> HI:: ((alina)) -
5          i'm so HAPpy to see YOU::;>>
6          <<laughing> and we're going - >
7  (0.4)
8          .hh <<h> GO::D;
9  (0.34)
10          turn the VOLume <<laughing> dOwn;>

In this excerpt, Alina voices the speech of a nonpresent person referred to here as Joy. She uses an extreme prosodic format, involving falsetto voice quality in combination with extremely high pitch register and fast speech rate. However, she starts using these features already on her pre-quotation talk at line 2 (in a voice like this), thus indexing the voice before actually voicing it. Following the reported speech sequence, Alina returns to her default voice quality, pitch register, and speech rate (line 6).

Summary

In this section the role of prosody for conversation has been outlined, with a focus on the two main discursive activities that speakers are involved in almost continuously in interaction: turn-taking and sequence organization. While it is clear that prosodic features are important in speakers’ negotiation over these activities, it is not at all easy to establish a specific form-function relation for any given prosodic feature. For example, low falling intonation may at times be interpreted as a cue for turn completion; at other times, it may not be. Similarly, while slightly rising intonation can be a cue for a speaker’s intention to continue talking, at times co-participants may come in after a slightly rising contour without being treated as illegitimately interrupting. This points to the multilayered role that prosody plays in conversation: while it might be treated as a turn-taking cue in some instances, which might require falling pitch, in others its main role may be to contribute to an utterance as a list item, which may require rising pitch. Prosody also contributes to linguistic distinctions. For example, increases in pitch, loudness, and lengthening cause a syllable to be perceived as stressed, while intonation helps listeners separate syntactic phrases. Furthermore, speakers use prosody as a cue for displaying affect and stance (Reber 2012). On the other hand, most actions in conversation are not accomplished only through prosody, but through other interactional resources, such as grammar, word choice, gaze, and gesture. It may be that at those times when prosody is not employed as a turn-taking cue other resources are used in its place.

However, while individual pronunciation features cannot easily be assigned specific discourse functions, there are broader interactional activities, such as ending an activity or starting a new one, which are systematically accomplished prosodically. As the examples above show, speakers orient to a distinction between repeating and not repeating a prior prosodic design and treat it as a distinction between continuing an ongoing sequence and a new beginning.

For the analysis of discourse it is vital to maintain a flexible perspective on prosody that allows for an understanding of interaction as emerging and locally negotiated. For the teaching and learning of prosodic pronunciation features such a flexible perspective presents a potential problem, as it is much easier to learn and teach specific functions of prosody than to acquire pronunciation as a resource for locally accomplished actions. In the following section, these issues are considered in more detail.

Implications for learning and teaching pronunciation

Since it is impossible to speak without prosody – speech will almost always be produced at some pitch level, with some form of intonation and loudness, with some form of voice quality, etc. – the question arises as to which aspects of prosody differ across languages. One might argue that it is only those features that differ between a learner’s L1 and L2 that should be taught in the language classroom. However, the discursive perspective detailed above suggests that what counts in interaction is not necessarily the “correct” pronunciation of utterances according to “native” speaker standards, but the appropriate use of prosodic features in any given context of social interaction. Thus, a different argument might be that if a language learner is able to use prosody in a way that implements social actions appropriately, then the influence of their L1 phonology does not matter interactionally outside those action contexts.

Jenkins (2000) argues that the primary goal for English pronunciation teaching should be intelligibility and, for learners who use English mainly as a lingua franca, international intelligibility. That is, only those pronunciation features should be taught that contribute to internationally intelligible speech, a suggestion that has inspired much debate (Levis 2005; Dziubalska-Kołaczyk and Przedlacka 2005). Jenkins makes her argument against the background of academic discussions of English as a global, rather than a regional language, and the consequences this has for language learning and teaching. It is possible to develop this argument further, and take not only intelligibility but the successful accomplishment of actions in interaction as the criterion for teaching and learning pronunciation features. Regarding prosody, this argument is a particularly powerful one, given the flexible use of prosodic parameters by “native” speakers compared to segmental pronunciation features.

In the following we explore these issues by looking at speech rhythm, a prosodic feature whose form varies widely across languages. While features such as loudness, voice quality, and even pitch may have certain universal applications due to their close relation to physical sound production, time-related features such as syllable lengthening, stress, and rhythm have closer connections to the linguistic structures of each language.

Speech rhythm: stress timing and syllable timing

Rhythm is a feature of all languages, as all speech adheres to some form of regularity in its fluent organization of words and syllables. However, describing languages as rhythmic does not mean that speech in those languages is perfectly isochronous, that is, absolutely regular. Rhythm is very much a perceptual phenomenon, and listeners will hear regularity even if the placement of rhythmic beats deviates to some extent from perfect isochrony. Nevertheless, most speech shows some form of regularity, even if languages differ greatly in their rhythmic organization. Phoneticians typically identify languages as belonging to one of two “rhythm classes”: stress timing and syllable timing (Pike 1945; Abercrombie 1967), with most languages located somewhere along this spectrum of extremes (Dauer 1983; Miller 1984). Standard British English is classed as a highly stress-timed variety.

In short, stress timing refers to the perception of stressed syllables as being placed on rhythmic beats, as in:

a TOtal SCAM at thIs point (excerpt 2, line 7)

This may not seem particularly remarkable, but becomes more so in utterances where stressed syllables are separated by unequal numbers of unstressed syllables, as in:


nOt a whole mEAsure of who you ARE (excerpt 5, lines 12–13)

In order for the stressed syllables in this last utterance to be perceived as a rhythmic pattern, the unstressed syllables between them must be spoken in a similar time interval, even though there are only two unstressed syllables following the first “beat” (a whole) but four (-sure of who you) following the second. As a result, the unstressed syllables following the second beat must be produced more quickly and will therefore be much shorter in duration than those following the first beat. This explains why stress timing is determined by measuring how much syllable duration varies in a given language or variety: languages in which syllable duration varies a lot are typically classed as tending towards stress timing; languages in which syllable duration is more equal are classed as tending towards syllable timing, which involves a perception of each syllable as a rhythmic beat in itself. As a result, syllable-timed speech is sometimes described as having more of a “staccato” rhythm (Brown 1988).
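One way to quantify how much duration varies from one syllable to the next is a pairwise variability index. The Python sketch below computes the normalized Pairwise Variability Index (nPVI), the mean absolute difference between successive durations divided by their local mean and scaled by 100. It is offered as an illustration under stated assumptions rather than as the measure used in any particular study cited here: in the literature the index is often computed over vowel durations, whereas it is applied here to whole syllables for simplicity, and both duration lists are invented values chosen to mimic the two rhythmic tendencies.

```python
def npvi(durations):
    """Normalized Pairwise Variability Index of a sequence of durations.

    Each pair of successive durations contributes its absolute difference
    divided by the pair's mean; the average over all pairs is scaled by 100.
    Higher values indicate more variation between neighbouring units.
    """
    if len(durations) < 2:
        raise ValueError("at least two durations are required")
    total = 0.0
    for current, following in zip(durations, durations[1:]):
        total += abs(current - following) / ((current + following) / 2.0)
    return 100.0 * total / (len(durations) - 1)


# Hypothetical syllable durations in seconds (invented for illustration).
# Long stressed syllables alternate with compressed unstressed ones.
stress_timed_like = [0.22, 0.07, 0.09, 0.25, 0.05, 0.06, 0.05, 0.06, 0.24]
# All syllables have roughly the same duration.
syllable_timed_like = [0.15, 0.14, 0.16, 0.15, 0.15, 0.14, 0.16, 0.15, 0.15]

print(f"stress-timed-like nPVI:   {npvi(stress_timed_like):.1f}")
print(f"syllable-timed-like nPVI: {npvi(syllable_timed_like):.1f}")
```

With these invented values the first list yields a much higher index than the second, in line with the description of stress timing as involving greater variation in syllable duration.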

Speech rhythm in conversation

A common perception of English rhythm is that content words are stressed and function words are unstressed. However, as the examples above demonstrate, this is not an accurate description of natural talk. In real-life conversation participants frequently stress function words, depending on the pragmatic meaning they are conveying and the social action they are engaged in.

In British English conversation, speech rhythm has been found to play an important role for turn-taking. Auer, Couper-Kuhlen, and Müller (1999) show that in British English, next speakers integrate their talk rhythmically into the rhythm of a previous speaker. They do so by placing their first stressed syllable on what has been projected by the previous speaker’s turn as the next rhythmic beat. Auer et al. show that this type of rhythmic integration is the default case for British English conversation, whereas producing a turn too early, or too late, with respect to a previous rhythm is treated by participants as a cue for conversational trouble. In the following example from a radio phone-in programme, the radio host’s greeting is delivered with a clear rhythmic pattern (line 2). In his reply, the caller places his return greeting token hi precisely on the next rhythmic beat (line 4).

9. BE Scientist: Roger

1  Host:    joining us on the show;
2           ROger’s in CLACton hi ROger.
3  (0.16)
4  Caller:  HI.
5  Host:    now we’ve gOt uh just a MINute or two LEFT;

Figure 11.2 shows how the caller’s production of hi overlaps almost perfectly with the onset of the projected next beat. The vertical dotted lines indicate rhythmic beats, while the bold vertical lines in the text tier indicate the onset of vowels in stressed syllables. In speech rhythm research it is customary to measure rhythmic beats from the vowel onset, rather than the onset of the syllable, due to the wide variations of syllable onsets in English.² The first interval was measured (0.4792 s) and then automatically superimposed over the rest of the waveform. The host’s turn is represented in the top tier in the figure, the caller’s turn in the bottom tier.

Figure 11.2 Rhythmic integration (Szczepek Reed 2009: 1234).

Reproduced by permission of Elsevier.
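
The measurement logic described for Figure 11.2 can be illustrated with a short Python sketch. It is not the procedure used in the study, only a minimal approximation of the idea: a reference interval is projected forward as a series of equally spaced beats, and the vowel onset of the next speaker's first stressed syllable is compared with the nearest projected beat. Only the interval of 0.4792 s is taken from the text. The onset times, the function names, and the tolerance of 20% of the interval are hypothetical and chosen purely for illustration.

```python
def project_beats(first_onset, interval, count):
    """Return `count` beat times, `interval` seconds apart, starting at `first_onset`."""
    return [first_onset + i * interval for i in range(count)]


def nearest_beat_offset(onset, first_onset, interval):
    """Signed distance in seconds from a vowel onset to the nearest projected beat.

    Negative values mean the onset is early, positive values mean it is late.
    """
    beats_elapsed = round((onset - first_onset) / interval)
    return onset - (first_onset + beats_elapsed * interval)


def is_integrated(onset, first_onset, interval, tolerance=0.2):
    """True if the onset lies within `tolerance * interval` of a projected beat.

    The default tolerance (20% of the interval) is an illustrative assumption,
    not a threshold reported in the chapter.
    """
    return abs(nearest_beat_offset(onset, first_onset, interval)) <= tolerance * interval


INTERVAL = 0.4792        # first measured interval, as reported in the text
HOST_FIRST_BEAT = 0.0    # hypothetical time of the host's first stressed vowel onset
CALLER_ONSET = 1.93      # hypothetical vowel onset of the caller's "HI"

offset = nearest_beat_offset(CALLER_ONSET, HOST_FIRST_BEAT, INTERVAL)
print(f"projected beats: {project_beats(HOST_FIRST_BEAT, INTERVAL, 6)}")
print(f"offset from nearest beat: {offset:+.4f} s")
print("integrated" if is_integrated(CALLER_ONSET, HOST_FIRST_BEAT, INTERVAL) else "not integrated")
```

Measuring from vowel onsets, as the text recommends, means that the onset times passed to these functions would need to be taken from the start of the stressed vowel rather than from the start of the syllable.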

By integrating their next turn rhythmically into a prior speaker’s turn, participants also achieve interactional integration and conversational fluency without noticeable gaps or overlaps (McCarthy 2009). Thus, regarding the learning and teaching of English speech rhythm, a vital question to ask is whether learners of English are able to accomplish integrated turn beginnings or if the rhythm of their first language impacts on their pronunciation to the effect that turn-taking is impeded.

Speech rhythm in conversations between syllable-timed and stress-timed speakers of English

Speech rhythm seems to affect the pronunciation of learners of English considerably, particularly if the speaker’s first language has a tendency towards syllable timing (Adams 1979; Anderson-Hsieh, Johnson, and Koehler 1992; Anderson-Hsieh and Venkatagiri 1994; Bond and Fokes 1985; Brown 1988; Low 2006; Taylor 1981). The main influence of syllable-timed rhythm is on the pronunciation of unstressed syllables, such as weak forms. Learners of English whose first language has a tendency towards syllable timing may produce both stressed and unstressed syllables with relatively equal duration, thus making it difficult for listeners from a stress-timed background to identify which syllables are being stressed and which are not. From a conversational perspective, the main question is whether speakers of English with syllable-timed rhythm accomplish turn-taking successfully, given the important role speech rhythm plays in the organization of speaker change. In a study of interactions between speakers of British English (BE) and Singapore English (SE), Szczepek Reed (2010, 2012b) investigated the rhythm and timing of turn transitions. The first language of the Singapore English speakers was Mandarin and all SE speakers had learned English from the age of 6. Both Mandarin and Singapore English have been classified as syllable-timed languages (Benton et al. 2007; Chen et al. 2001; Deterding 2001; Low, Grabe, and Nolan 2000). The study found that in any given conversation between a third and half of all turn transitions from the BE speaker to the SE speakers were rhythmically integrated. Many additional transitions could be perceived as rhythmic, but did not show sufficient isochrony in the acoustic analysis.³ The majority of rhythmic turn transitions were either monosyllabic turns (such as “yeah” or “no”) or turns in which the first syllable was rhythmically integrated, and the speaker then continued with a more syllable-timed rhythm.

This suggests that in spite of considerable differences in speech rhythm, at the point where it matters most, SE speakers often accomplish interactionally what their “native-speaking” counterparts accomplish, i.e., smooth transitions from one speaker to the next. In order to do so it is not necessary for them to speak with stress-timed rhythm, but only to perceive stress timing in their British English speaking co-participants, and to orient to it wherever this becomes relevant in interaction, i.e., at the point of turn transition. Thus, SE speakers show interactional competence without adhering to BE pronunciation rules. From a conversational perspective it is not important how stress-timed or syllable-timed learners’ speech is, but how successfully they employ rhythm for the accomplishment of conversational actions, such as turn-taking.

Concluding observations

The discursive perspective on pronunciation has gained much ground in recent years and will continue to do so with the increase in research on the role of phonetics and prosody for interaction. Insights from discourse and conversation analysis that reveal talk to be a collaborative achievement, rather than an individualistic activity, have filtered into much of communicative language teaching practice, and concepts that used to be considered the exclusive responsibility of individual speakers are now addressed as interactional issues. This applies, for example, to the concept of fluency, which McCarthy (2009) suggests should be considered

… as an interactive achievement, perhaps more adequately captured by the metaphor of confluence. Achieving confluence, successfully interacting in talk that flows and being perceived as both able to create within one’s own utterances and across utterances the satisfactory perception of flow for all participants is an art, the evidence of which will not be found or fairly assessed in monologic contexts but in the robust evidence of dyadic and multi-party talk (2009: 23).

Similarly, pronunciation does not fall within the domain of the single speaker, as each utterance is designed for specific recipients in response to specific prior talk and in order to accomplish a social action fitted to its specific context. Furthermore, as Lindemann (2006, 2011) has shown, intelligibility is as much the responsibility of the listener as it is that of the speaker. Therefore the teaching and learning of pronunciation requires an understanding of its nature as fundamentally entwined with the collaborative activity that is talk-in-interaction.

REFERENCES

Abercrombie, D. 1967. Elements of General Phonetics. Edinburgh: Edinburgh University Press.
Adams, C. 1979. English Speech Rhythm and the Foreign Learner, The Hague: Mouton.
Anderson-Hsieh, J., Johnson, R., and Koehler, K. 1992. The relationship between native speaker judgements of nonnative pronunciation and deviance in segmentals, prosody, and syllable structure. Language Learning 42(4): 529–555.
Anderson-Hsieh, J. and Venkatagiri, H. 1994. Syllable duration and pausing in the speech of Chinese ESL speakers. TESOL Quarterly 28(4): 807–812.
Auer, P., Couper-Kuhlen, E., and Müller, F. 1999. Language in Time: The Rhythm and Tempo of Spoken Interaction, New York: Oxford University Press.
Benton, M., Dockendorf, L., Jin, W., Liu, Y., and Edmondson, J.A. 2007. The continuum of speech rhythm: computational testing of speech rhythm of large corpora from natural Chinese and English speech. In: Proceedings of the 16th International Congress of Phonetic Sciences. http://www.icphs2007.de/conference/Papers/1591/1591.pdf.
Bond, Z.S. and Fokes, J. 1985. Non-native patterns of English syllable timing. Journal of Phonetics 13: 407–420.
Brown, A. 1988. The staccato effect in the pronunciation of English in Malaysia and Singapore. In: New Englishes: The Case of Singapore, J. Foley (ed.), 115–147, Singapore: Singapore University Press.
Chen, Y., Robb, M.P., Gilbert, H.R., and Lerman, J.W. 2001. A study of sentence stress production in Mandarin speakers of American English. Journal of the Acoustical Society of America 109(4): 1681–1690.
Couper-Kuhlen, E. 1996. The prosody of repetition: on quoting and mimicry. In: Prosody in Conversation, E. Couper-Kuhlen and M. Selting (eds.), 366–405, Cambridge: Cambridge University Press.
Couper-Kuhlen, E. 2004. Prosody and sequence organization: the case of new beginnings. In: Sound Patterns in Interaction. Cross-linguistic Studies from Conversation, E. Couper-Kuhlen and C.E. Ford (eds.), 335–376, Amsterdam: Benjamins.
Dauer, R.M. 1983. Stress-timing and syllable-timing reanalysed. Journal of Phonetics 11: 51–62.
Deterding, D. 2001. The measurement of rhythm: a comparison of Singapore and British English. Journal of Phonetics 29: 217–230.
Du Bois, J.W. and Englebretson, R. 2004. Santa Barbara Corpus of Spoken American English, Part 3, Philadelphia: Linguistic Data Consortium.
Du Bois, J.W. and Englebretson, R. 2005. Santa Barbara Corpus of Spoken American English, Part 4, Philadelphia: Linguistic Data Consortium.
Du Bois, J.W., Chafe, W.L., Meyer, C., and Thompson, S.A. 2000. Santa Barbara Corpus of Spoken American English, Part 1, Philadelphia: Linguistic Data Consortium.
Du Bois, J.W., Chafe, W.L., Meyer, C., Thompson, S.A., and Martey, N. 2003. Santa Barbara Corpus of Spoken American English, Part 2, Philadelphia: Linguistic Data Consortium.
Dziubalska-Kołaczyk, K. and Przedlacka, J. (eds.) 2005. English Pronunciation Models: A Changing Scene, Bern: Peter Lang.
Ford, C.E. and Thompson, S.A. 1996. Interactional units in conversation: syntactic, intonational, and pragmatic resources for the projection of turn completion. In: Interaction and Grammar, E. Ochs, E.A. Schegloff, and S.A. Thompson (eds.), 135–184, Cambridge: Cambridge University Press.
French, P. and Local, J. 1986. Prosodic features and the management of interruptions. In: Intonation and Discourse, C. Johns-Lewis (ed.), 157–180, London: Croom Helm.
Jenkins, J. 2000. The Phonology of English as an International Language, Oxford: Oxford University Press.
Klewitz, G. and Couper-Kuhlen, E. 1999. Quote-unquote. The role of prosody in the contextualization of reported speech sequences. Pragmatics 9(4): 459–485.
Levis, J.M. (ed.) 2005. Reconceptualizing Pronunciation in TESOL: Intelligibility, Identity, and World Englishes, Special issue. TESOL Quarterly 39(3).
Lindemann, S. 2006. What the other half gives: the interlocutor's role in non-native speaker performance. In: Spoken English, TESOL and Applied Linguistics: Challenges for Theory and Practice, R. Hughes (ed.), 23–49, Basingstoke: Palgrave Macmillan.
Lindemann, S. 2011. Who’s “unintelligible”? The perceiver’s role. Issues in Applied Linguistics 18(2): 223–232.
Local, J. 1992. Continuing and restarting. In: The Contextualization of Language, P. Auer and A. di Luzio (eds.), 273–296, Amsterdam: John Benjamins.
Local, J., Kelly, J., and Wells, B. 1986. Towards a phonology of conversation: turn-taking in Tyneside English. Journal of Linguistics 22(2): 411–437.
Local, J., Wells, B., and Sebba, M. 1985. Phonology for conversation: phonetic aspects of turn delimitation in London Jamaican. Journal of Pragmatics 9: 309–330.
Low, E.L. 2006. A cross-varietal comparison of deaccenting and given information: implications for international intelligibility and pronunciation teaching. TESOL Quarterly 40(4): 739–761.
Low, E.L., Grabe, E., and Nolan, F. 2000. Quantitative characterizations of speech rhythm: syllable-timing in Singapore English. Language and Speech 43(4): 377–401.
MacWhinney, B. 2007. The TalkBank project. In: Creating and Digitizing Language Corpora: Synchronic Databases, J.C. Beal, K.P. Corrigan, and H.L. Moisl (eds.), vol.1, 163–180, Basingstoke: Palgrave-Macmillan.
McCarthy, M. 2009. Rethinking spoken fluency. Estudios de Lingüística Inglesa Aplicada 9: 11–29. http://institucional.us.es/revistas/elia/9/3.%20McCarthy.pdf.
Miller, M. 1984. On the perception of rhythm. Journal of Phonetics 12: 75–83.
Pike, K.L. 1945. The Intonation of American English, Ann Arbor, MI: University of Michigan Publications.
Reber, E. 2012. Affectivity in Interaction. Sound Objects in English, Amsterdam: John Benjamins.
Sacks, H., Schegloff, E.A., and Jefferson, G. 1974. A simplest systematics for the organization of turn-taking for conversation. Language 50: 696–735.
Schegloff, E.A. 2007. Sequence Organization in Talk-in-Interaction: A Primer in Conversation Analysis, Cambridge: Cambridge University Press.
Selting, M. 1996. Prosody as an activity-type distinctive cue in conversation: the case of so-called “astonished” questions in repair initiation. In: Prosody in Conversation, E. Couper-Kuhlen and M. Selting (eds.), 231–270, Cambridge: Cambridge University Press.
Selting, M., Auer, P., Barden, B., Bergmann, J.R., Couper-Kuhlen, E., Günthner, S., Meier, C., Quasthoff, U., Schlobinski, P., and Uhmann, S. 1998. Gesprächsanalytisches Transkriptionssystem (GAT). Linguistische Berichte 173: 91–122.
Szczepek Reed, B. 2004. Turn-final intonation in English. In: Sound Patterns in Interaction. Cross-linguistic Studies from Conversation, E. Couper-Kuhlen and C.E. Ford (eds.), 97–118, Amsterdam: Benjamins.
Szczepek Reed, B. 2006. Prosodic Orientation in English Conversation, Basingstoke: Palgrave.
Szczepek Reed, B. 2009. Prosodic orientation: a practice for sequence organization in broadcast telephone openings. Journal of Pragmatics 41(6): 1223–1247.
Szczepek Reed, B. 2010. Speech rhythm across turn transitions in cross-cultural talk-in-interaction. Journal of Pragmatics 42(4): 1037–1059.
Szczepek Reed, B. 2012a. Beyond the particular: prosody and the coordination of actions. Language and Speech 55(1): 12–33.
Szczepek Reed, B. 2012b. A conversation analytic perspective on teaching English pronunciation: the case of speech rhythm. International Journal of Applied Linguistics 22(1): 67–87.
Taylor, D.S. 1981. Non-native speakers and the rhythm of English. International Review of Applied Linguistics 19(3): 219–226.
Wells, B. and Peppe, S. 1996. Ending up in Ulster: prosody and turn-taking in English dialects. In: Prosody in Conversation, E. Couper-Kuhlen and M. Selting (eds.), 101–130, Cambridge: Cambridge University Press.

Appendix

Transcription Conventions (adapted from Selting et al. 1998)

Pauses and lengthening

(2.85)	measured pause
:::	lengthening

Accents

ACcent	primary pitch accent
Accent	secondary pitch accent

Phrase-final pitch movements

?	rise-to-high
,	rise-to-mid
-	level
;	fall-to-mid
.	fall-to-low

Pitch step-up/step down

↑	pitch step-up
↓	pitch step-down

Changes in pitch register and volume

<<l> >	low pitch register
<<h> >	high pitch register
<<f> >	forte
<<p> >	piano

Breathing

.h, .hh, .hhh	in-breath
h, hh, hhh	out-breath

Other conventions

[	overlapping talk
[

NOTES
