3
Multimodality and Language Learning

MARK DRESSMAN

Introduction

The term multimodality in language education theory and research commonly refers to the coordination of multiple different systems of signification to communicate a single, or at least a unified, message or meaning. For example, a meteorologist on a nightly newscast stands before a map, explaining the progress of an oncoming band of thundershowers across a region, and gesturing to suggest its direction and speed via an overlay of color‐coded radar images. In this case, spoken language combines with multiple visual images and arm movements to convey a very complex message about an imminent weather event's strength, duration, timing, and likely consequences. In a similar combination of the linguistic, the kinetic, and the visual, political protesters wear bright pink knitted caps with two points during a public march, carrying signs that say, “Viva la Vulva!” and “NASTY WOMAN,” to protest a politician's vulgar remark about grabbing a part of a woman's anatomy. Or, more subtly and less provocatively, a child creates a birthday card for her mother, drawing a picture of the two of them with the caption, “I ♥ u, Mummy.”

The coordination of input from multimodal sources exceeds the intentional combining of the spoken, the written, and the visual in the examples above. From birth or perhaps before, the input from which we construct reality is a multisensory and almost infinitely complex array of sights, sounds, smells, tastes, and feelings. Our knowledge of the world and of how to act in it is phenomenological; it depends on processes that bring into consciousness daily experiences that are both intended and unintended, and culturally both formalized and highly informal. Language, as a fundamental human experience and process, is both embedded in human experience and living and, at times, pulled from these as its own object; it functions both autonomically at times as a part of our whole being and, also, as a means of reflection upon itself and as our principal means of objectification and explication of the world.

Perhaps this dual function of expression and objectifying explication helps to explain why language is less trusted than other modes of signification, at least within proverbial spheres of wisdom. “A picture,” the adage goes, “is worth a thousand words”; and similarly, “Seeing is believing.” These sayings speak (ironically) to the perceived primacy of first, less mediated, experiences of the world as the basis of what is “real” and “true” over linguistically processed renderings. But, what of the relationship between linguistic and visual signs in combination, as in the meme shown in Figure 3.1 celebrating the excessive use of helicopters by a former Speaker of the Australian House of Representatives? In this case, it could be argued that “getting” the joke of the meme depends on the absurdity of a dog tethered to a helicopter, and so the visual part of the meme would seem to carry most of its punch, if not its meaning. However, without the linguistic part of the meme, it would be no more than an unusual image whose meaning and context were lost, save for perhaps the most insightful followers of Australian politics. Even with the linguistic part, as a non‐Australian I needed to google Bronwyn Bishop's name and read more text before I completely understood the context. The use of language in this meme does not reproduce the meaning of the image, and is not more or less critical to the message conveyed by the meme as a whole; rather, language and the image interact, or perhaps transact (Dewey and Bentley 1949), in our perception to produce a meaning that exceeds what either mode could have conveyed on its own.

Image: a meme showing a dog on the ground tethered by a long leash to a helicopter in the air, with the text, "Bronwyn Bishop just walking her dog."

Figure 3.1 A meme about Australian politician Bronwyn Bishop's misuse of government perks.

It is the relationship between language and other modes of communication that any theory of multimodality must explain, especially in the context of informal language learning. However, because language is both something in the world and the primary means of consciously objectifying and understanding that world, any theory of multimodality that might help to explain how second or additional languages are acquired informally – that is, through engagement with and reflection upon a language – must also begin with an account of how language relates to other modes of signification even as it remains separable from them. This task is further confounded rather than illuminated by folk beliefs and sayings about these relations, and by the fact that in attempting to understand these relations we are required to use the very thing, language, that we seek to understand. We are, as a species, it seems, inescapably logocentric, or bound through language to language as our primary medium for grasping how language relates to other modes of meaningful input.

This chapter takes the issues above into account in developing a theory of multimodality and its implications for informal language learning. This will not be an easy task, nor will a fully completed theory be presented within this chapter, because, without exaggeration, multimodality as a concept is as difficult to understand fully as is the concept of gravity. Like gravity, multimodality is a constant part of our experience, and as critical to our understanding of the world as gravity is to that world's existence. Like gravity, too, its effects are easily observed and recorded; and yet, just as science has yet to explain fully what gravity is and how and why it works, the social sciences and humanities have yet to provide a clear, compelling, and non‐contradictory explanation of how and why multiple and very different systems of meaning combine to produce a message whose sum may far exceed its parts. In fact, I will argue in this chapter that prevailing theories of multimodality in language and literacy are far behind current thinking in physics about gravity; they are pre‐Newtonian, and in some circles of discourse offer explanations that involve the denial of accepted linguistic facts.

Research on multimodal platforms for learning

Because multimodality's dynamics are so difficult to explain, perhaps the best approach to studying the phenomenon's implications for language learning might be to avoid questions of process entirely and instead focus on the observable outcomes of multimodal interactions. From this perspective, some researchers and theorists characterize multimodality as an enabling condition, in which meaning from two or more modes combines to produce a demonstrable learning outcome. For these researchers, multimodality itself is not under investigation as much as certain combinations of modes – print text and audiobooks; videos with subtitles in L1 or L2; video games with written or spoken chat – and their learning outcomes in comparison to unimodal or other combinations. Multimodality in these studies is an enabling feature, something that adds motivational and cognitive power to learners' acquisition of a second language. Researchers working from this perspective may employ a range of theoretical explanations, largely cognitive or psychological, that account for the learning but not for how multimodality itself functions within the learning process.

For example, Cummins et al. (2015) provided evidence from two studies of multilingual adolescents' literacy development to argue that the multimodal aspects of composing texts about their cultural identity were critical in improving their academic achievement, but how this happened and specifically how multimodality enabled learning was not explained. Other studies have focused on the cognitive benefits of multimodality in language learning. Chang (2009) compared listening to a text alone with simultaneously reading and listening to texts and found that while the improvement in the multimodal (reading + listening) condition was only a 10% increase in comprehension, students preferred the multimodal condition and claimed that it made it easier for them to listen to an audio recording for a longer time. In a related study, Dela Rosa et al. (2010) found that listening to a voice‐synthesized text while reading it improved scores on a cloze procedure test introducing new vocabulary words. Their explanation was that the combination of the two modes "reinforced" each other, without accounting for how that reinforcement occurred. Zarei and Khazaie (2011) studied the use of pictorial and written annotations as aids to learning vocabulary on mobile devices and found that these aids improved users' recognition and recall of new words. They framed their explanation of the efficacy of the multimodal conditions in cognitive terms and suggested that "cognitive style" (p. 369) was an important variable in learners' use of multimodal input. Similarly, Beatrice and Luna (2013) documented the positive effects of using songs to teach Spanish primary students English and framed their discussion of multimodality through reference to Gardner's (2006) theory of multiple intelligences.

Two areas of research that are highly relevant to the study of informal language learning, because they may also be attractive leisure‐time and out‐of‐school activities, are video gaming and captioned videos. Although these are two different formats, multimodally they share some similar features, such as moving images, audio tracks that are both linguistic and acoustic, and written captions or chat boxes. Because of their inherent multimodality, the development or use of theories of multimodality might be presumed to be central to this research. However, as multiple reviews (Hung et al. 2018; Peterson 2010a; Perez et al. 2013; Vanderplank 2010) of research indicate, such studies tend to be framed again in cognitive terms, or atheoretically and very practically. Findings from research on video gaming and video captioning across many combinations of languages, captioning, and visual images nearly all suggest that the multimodality of captioned video, and especially the combination of written and spoken texts in a target language, is a powerful aid to listening comprehension and vocabulary acquisition (e.g. Danan 2004; Johnson et al. 2005; Peterson 2010b).

Studies that have used eye‐tracking as a methodological element provide a window on relations among linguistic and nonlinguistic modalities. Hsu et al. (2014) provided Taiwanese students in an English course with filtered captions (not full transcriptions but key words written in English script, with translations in Chinese script) accompanying an English‐language video. They found that listening comprehension increased when videos were captioned, and that 76% of the students focused most of their attention (eye gazing) on the captions rather than on the other visual elements of the video (see also Specker 2008). However, Winke et al. (2013) found that when English‐speaking students used captioned video to learn one of four foreign languages (Arabic, Chinese, Russian, or Spanish), the amount of attention paid to the captions varied with each language's script, with less attention paid to written Arabic and Chinese, which are more distant from written English, and more attention paid to Russian and Spanish, which are less distant.

In summary, as Ritterfeld et al. (2009) note,

Multimodality is an important property of serious games, yet empirical research specifically testing the impact of multimodality is limited. Research on computer‐mediated communication often compares the social and psychological impact of different modalities such as text and voice but seldom discusses the distinct impact of media formats where multiple modalities are combined in one (e.g., digital games and hypertext)

(p. 691).

While there have been many studies published in recent years of multimodal digital platforms and their positive language‐learning outcomes, “the bulk of existing research has focused on the adaption of commercial platforms” (Peterson 2010a, p. 73), that is, on testing many different permutations of platform, written text, spoken text, images, and other features for their efficacy, rather than on trying to understand basic principles of multimodality or develop and test theory. This is unfortunate, because it means that what is learned, beyond the robust finding that multimodality is a critical feature of language acquisition, remains fragmented and generalizable only to specific platforms and permutations of modalities, and that where theorization sometimes does occur, it remains ad hoc and local, or “grounded” (Glaser and Strauss 1999) in its impact.

There is, then, an urgent need for a general theoretical framework (Dressman 2008) that could guide discussion across studies of how and why multimodality “works,” or how the combination of multiple modes of communication contributes to language‐learning outcomes that are more powerful than learning through any single mode. Such a theoretical framework would be complementary to but separate from linguistic, cognitive, and sociological approaches; it would be semiotic, meaning that it would focus on describing how different modes and their different sensory inputs, mainly visual and auditory but potentially also tactile, olfactory, gustatory, and kinesthetic, produce meaning, individually and in combination.

Social semiotics and multimodal research

In the last two decades, an extensive literature borrowed from the field of English language literacy and drawing on three main sources – a manifesto published by the New London Group (1996); the "systemic functional linguistics" of Halliday (1994); and especially the work of Kress and his associates (Jewitt and Kress 2003; Kress 2010; Kress and van Leeuwen 2006; Lankshear and Knobel 2011) – has attempted to create a "social semiotic" theory of multimodality. This approach is the prevailing framework in L2 contexts for language and literacy studies focusing on multimodality. It appears in articles advocating for the significance of multimodality in language education (e.g. Belcher 2017; Early et al. 2015; Elola and Oskoz 2017; Guichon and Cohen 2016; Hafner et al. 2015; Kress 2000; Lotherington and Jenson 2011; Reinhardt and Thorne 2011; van Leeuwen 2015; Yi 2014), and as part of the theoretical background of many empirical studies (Akyel and Erçetin 2009; Brown 2015; Guichon and McLornan 2008; Hampel 2003; Marchetti and Valente 2017; Nelson 2008; Smith et al. 2017; Sørensen and Meyer 2007; Vandommelle et al. 2017; Zheng et al. 2012). Relatively few studies have made extensive use of social semiotic theory, however (but see Baldry and Thibault 2006; Royce 2002, 2007; Shin and Cimasko 2008; Yang 2012). In short, despite widespread citation, the influence of social semiotic theory remains more honored in the breach than in the observance, limited largely to brief citation and advocacy for the general concept and significance of multimodality.

Acceptance of a social semiotic view of multimodality within L2 language and literacy studies has not been total, however, particularly within the field of second‐language writing. At issue is whether, in the digital age, written language is merely one among many semiotic resources available to an author, as advocates of social semiotics argue, or whether it remains fundamentally different from other resources such as images and audio, both in type and significance. Responses to an article by Belcher (2017) in the Journal of Second Language Writing embracing a social semiotic view of composition have taken up this issue in some detail. According to Belcher, the field of second language writing is behind the times and needs to embrace the new reality of digital composition, which is not going away. As one of multiple arguments for her position, she has argued that “language, as Canagarajah [2016] and others have observed, is now seen as ‘only one of the resources that goes into writing’ [266], and literacy may be more aptly conceived of in design terms [Kress and Jewitt 2003]” (p. 81).

Belcher's five commentators within the "Disciplinary Dialogues" section of the Journal of Second Language Writing, however, have taken a less sanguine view of the past, present, and future of second language writing and composition. Yi (2017), for example, embraced Belcher's general call for a focus on multimodal composition, but also noted that multimodality itself "is not new," raised a "cautionary note" that written language should not be "trivialized," and that "written language is not going to disappear and will probably continue to be the most powerful mode of formal learning" (p. 90). Miller‐Cochran (2017) also noted that "multimodal does not always mean digital" (p. 88) and argued for an expanded view of multimodality that would include posters and other non‐digital genres. In a twist on Belcher's argument, Manchón (2017) took an SLA (second language acquisition) position to ask what affordances in multimodal composition would facilitate the learning of language, arguing implicitly for the primacy of language in the classroom and for learners. Similarly, Warschauer (2017) questioned on practical grounds how much time spent using digital technology would contribute to language acquisition and how much to learning to use software. He cited Cummins et al. (2015) and argued that a central benefit of multimodal composition is the opportunity it provides for composing "identity texts" (p. 86) that engage learners in, again, the development of L1 and L2 proficiency. Last, and most directly, Qu (2017), who titled his response, "For L2 Learners, It Is Always the Problem of Language" (p. 92), argued that while multimodality is important, the ability to compose with images and sound remains secondary to the demands on L2 learners to learn to use language both to comprehend and express themselves for academic purposes.

The issues raised within second language writing studies about the relation of written language to other modes of communication speak to broader, fundamental problems with the theoretical base of social semiotics, and especially with language's practical and theoretical relationship to other modes of expression and communication. Space does not permit a full discussion of those problems here, but as I have previously demonstrated (Dressman 2016), problems begin with Kress's (Kress and van Leeuwen 2006; Kress 2010) assertion that language is not, as Ferdinand de Saussure and virtually every linguist of the last two centuries would agree, arbitrary and unmotivated – that is, that bird in English signifies a flying, feathered animal as a matter of historical and linguistic convention, and not because of anything inherent in the spelling or sound of bird itself (it could just as easily be oiseau, as in French, or con chim, as in Vietnamese). Instead, Kress (2010) and Kress and van Leeuwen (2006) argue that meaning derives from a user's intentionality. Kress illustrates his point with the story of a child who draws circles on a paper and says, “This is a car” – implying that the child has written “This is a car,” but ignoring that without the child's statement the circles would be unintelligible (Kress and van Leeuwen 2006, p. 43). Kress must violate the linguistic principle of the arbitrary and unmotivated nature of language because he wants to argue that other types of signs, such as photographs and paintings, which clearly are non‐arbitrary and motivated in that they typically resemble the things they represent, are all part of the same social semiotic system. They have their own grammar, and are also a form of literacy, in that they are encoded and can be decoded, or “read,” just as can print text, using grammatical rules that bear a striking resemblance to traditional English school grammar (and all in the service of escaping from print‐centric and logocentric thinking; see Dressman 2016). Thus, his grammatical analyses of images are typically old‐style diagrammings (Florey 2007) of the captions below the images, not the images themselves; reinterpret the image and rewrite the caption to fit it, and the grammatical analysis of the image itself is changed (Dressman 2016).

In short, Kress and his associates' social semiotic approach yields a framework for multimodal analysis that is filled with non sequiturs, ad hoc rules, and thus, highly interpretable and unreliable readings of multimodal texts. It also provides facile grounds for promoting the idea of multimodality within language studies without providing a coherent basis for analyzing relations among different modes of communication. In making this very strong criticism, I am also fully aware of how strongly I am swimming against the tide of scholarly opinion; yet the evidence and reasoning in support of my critique and against the illogic of social semiotics is incontrovertible. In addition, I intend no insult or criticism of the sincerity or diligence of the scholars or of research that cites the work of Kress and his associates as its theoretical foundation; but I would urge scholars to return to that work and reread it carefully and with a critical eye, paying attention to non sequiturs such as the argument that language is not arbitrary or unmotivated, before deciding whether social semiotics is a viable platform for their research.

An alternative semiotic approach

A very different but in many ways more appropriate approach to understanding multimodality and its role in informal language learning is the theory of semiosis, or the process of semiotic understanding, of American pragmatist Charles Sanders Peirce (Peirce 1955; Parmentier 1987). Peirce was not a linguist but a philosopher, a logician and phenomenologist who sought to explain how meaning is created from otherwise inchoate sensory experience. Peirce's semiotics is grounded in scholastic philosophy, in traditions extending from Aristotle through Augustine, Duns Scotus, William of Ockham, and into the seventeenth century and the philosophy of John Locke (Beuchot and Deely 1995). A sign, in Peirce's theory, is composed of three parts: an object (the thing that is sensed); its representamen, or its representation as a sign (its mental, or general representation); and its interpretant, or meaning (what it is taken to be, relative to other signs). We see/hear/feel/touch/taste something; we relate it to our prior experience of such objects; and we begin to recognize its significance to us and to other signs: This is the process of semiosis.

But the process is more complicated than simply sensing an object, relating it to prior knowledge, and conjuring its interpretation. An object and its representamen are related, or grounded, in one of three ways: (i) through iconicity, or resemblance (I recognize an object as a chair because it looks like other chairs I've seen); (ii) through indexicality, or spatial/temporal contiguity (as I enter a house, I recognize smells indicating that dinner is being made); or (iii) symbolically, through convention (as when I use my knowledge of written English to read an email message). These three types of relation govern all meaning‐making, and correspond to three levels of understanding: Firstness, in which we grasp the quality of something, intuitively, even viscerally; Secondness, in which relations between and among signs are registered and largely nonverbal levels of understanding develop; and Thirdness, in which systematic understanding of relations is expressed through propositions, rules, and arguments in language and other symbolic systems such as mathematics. Finally, there is a progressive evolution, or "chain of semiosis," from Firstness to Secondness to Thirdness, or from primary experiences, to the location of those experiences relative to others, to the full articulation of experience abstractly and formally – that is, from sensibility to intelligibility, or from basic awareness to the full articulation of knowledge about the world.

Relatively few signs in an environment are exclusively iconic (convey meaning only through resemblance), indexic (simply indicate, or point to, meaning), or symbolic (are completely abstract), however; instead, most signs are combinations of all three relations. For example, a sign on an office door that reads "Dept. Head" is symbolic, because of its use of the linguistic conventions of writing; but it is also indexical in that its placement on a particular office door indicates the identity of the person behind the door and that person's placement within an organizational structure. The sign is also iconic in that its function is recognizable because it resembles signs on other doors we've seen. Table 3.1 provides a fuller description of Peirce's expanded classification of signs, showing how, in combination, iconic, indexical, and symbolic relations create 10 different signs. Peirce termed signs that are principally iconic, qualisigns; signs that are principally indexic, sinsigns; and signs that are principally symbolic, legisigns.

Peircean semiotics provides a framework for theorizing and investigating the role of multimodality in informal language learning that is logical, coherent, verifiable, and grounded in centuries of Western philosophical reasoning. Three different types of sign based on differences in how an object is recognized and becomes knowable – through resemblance (icons/qualisigns), through spatial/temporal contiguity (indices/sinsigns), or through convention (symbols/legisigns) – clearly and parsimoniously map the ways that the multiple modes of a video game, a captioned video, an illustrated text or website, or even a song convey meaning to learners. Semiosis, or the development of understanding from Firstness, or primary experience of phenomena, through Secondness, or an emerging understanding of experience in relation to other signs, to Thirdness, the symbolic expression and articulation of understanding, also suggests a process whereby the initial experience of multimodal texts – of visual images, sound, voice, and writing – is the foundation of more articulated, abstract, and general levels of understanding: of knowledge that moves beyond immediate experience to knowledge and meaning applicable across a wide variety of contexts. An unfortunate problem with Peircean semiotics, however, is its unique terminology and logic, such as the terms rhematic (a sign which refers to the quality of its object) and dicent (a sign which references its object within a singular and immediate, contiguous setting), and the use of numerical notations such as 1‐1‐2 or 2‐2‐3 (First‐First‐Second; Second‐Second‐Third), which provide a sense of the continuous progression of signification from Firstness (qualitative experience) to Thirdness (symbolic or principled understanding). An excellent website providing detailed explanations and references to Peirce's work in semiotics and other topics can be found at Commens.org (http://www.commens.org/).

Table 3.1 Peirce's ten classifications of signs and terminology.

| | Sign | Type | Description | Examples |
|---|---|---|---|---|
| Firstness: Qualisign | 1. Rhematic iconic qualisign | 1-1-1 | The sense of something prior to its identification; a quality | Up; down; sense of font shape; a smell in the air |
| | 2. Rhematic iconic sinsign | 1-1-2 | Recognition of a type or token; something nameable | A ball; a hand; sound of a whistle |
| Sinsigns | 3. Rhematic indexical sinsign | 1-2-2 | A basic signal or indication of something's presence | A baby's cry; a train whistle; a moving arrow |
| | 4. Dicent indexical sinsign | 2-2-2 | A definite association between two objects; a signal or indicator | A weathervane; an automobile turn indicator |
| Secondness | 5. Rhematic iconic legisign | 1-1-3 | Represents reality through conventions of resemblance; a genre of indicator | Blueprints; maps; schematic diagrams |
| | 6. Rhematic indexical legisign | 1-2-3 | Communicates/instructs/leads to action through modes of resemblance | A hand in a PDF; toolbar icons; exclamations (Hello!) |
| Legisigns | 7. Dicent indexical legisign | 2-2-3 | Directs action; simple instructions; basic rules | "Merge left"; "Look out!"; "No Trespassing." |
| | 8. Rhematic symbol (legisign) | 1-3-3 | Universally recognized symbol representing an idea or ideal | A flag; a logo; a slogan; Guernica |
| Thirdness | 9. Dicent symbol (legisign) | 2-3-3 | A statement; an ordering of multiple symbols to express an idea | A propositional sentence; a simple judgment or statement |
| | 10. An argument (symbolic legisign) | 3-3-3 | A system; a coherent ordering of symbols to express complex relations | E = mc²; a novel; the Napoleonic Code |
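The notation in Table 3.1 also has a simple combinatorial reading: each class is a non-decreasing triple over the values 1 (Firstness), 2 (Secondness), and 3 (Thirdness), read in the order interpretant-object-sign, and exactly ten such triples exist. The short sketch below is a minimal illustration of that reading, not part of Peirce's own apparatus; the dictionary names, the non-decreasing constraint, and the generated labels (which approximate the table's wording) are my own assumptions about how the notation can be modeled.

```python
from itertools import combinations_with_replacement

# Names for each trichotomy, keyed by the Firstness/Secondness/Thirdness value (1-3),
# in the order used by the chapter's notation: interpretant-object-sign.
INTERPRETANT = {1: "rhematic", 2: "dicent", 3: "argument"}
OBJECT_RELATION = {1: "iconic", 2: "indexical", 3: "symbolic"}
SIGN_IN_ITSELF = {1: "qualisign", 2: "sinsign", 3: "legisign"}

def peirce_classes():
    """Enumerate the ten sign classes of Table 3.1 as non-decreasing triples
    (interpretant, object, sign) over the values 1-3."""
    for i, o, s in combinations_with_replacement((1, 2, 3), 3):
        # combinations_with_replacement yields i <= o <= s, which is exactly
        # the constraint implicit in the table's notation (hence ten classes).
        label = f"{INTERPRETANT[i]} {OBJECT_RELATION[o]} {SIGN_IN_ITSELF[s]}"
        yield (i, o, s), label

if __name__ == "__main__":
    # Prints the ten classes, from 1-1-1 (rhematic iconic qualisign) to
    # 3-3-3 (argument symbolic legisign), in lexicographic rather than table order.
    for n, ((i, o, s), label) in enumerate(peirce_classes(), start=1):
        print(f"{n:2d}. {i}-{o}-{s}  {label}")
```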

A Peircean approach also addresses questions about the role of language vis‐à‐vis other modes of communication within a multimodal context. Because language as a system is arbitrary and unmotivated, meaning that its signs – its paradigmatic (lexical) and syntagmatic (syntactic) elements – bear no resemblance to what they represent, and because it produces propositional statements, rules, and arguments that are abstract, symbolic expressions of general and not specific contexts and conditions, language functions as a legisign, as a symbol whose interpretant must also be another symbol. However, in its use – that is, in speaking, listening, writing, and reading – it produces functions that are iconic and/or indexic, and that link language‐in‐use to other modes of communication and action. For example, in writing, punctuation is indexical; it indicates when to stop, when to pause, how a previous idea relates to a coming idea, and so on; in speaking, a raised tone at the end of a sentence indicates a question; and in tonal languages, changes in tone index the meaning of a word. Basic phrases and expressions are also indexic in their function; a shout, “Hello!” or “Watch out!” directs, or indexes, attention; expletives, such as “Damn!” or “OMG!” indicate astonishment, frustration, and so on. Other aspects of language are iconic: we recognize the phonemes and morphemes of speech or the characters and words of a written language because they are categorical and resemble other phonemes and morphemes or other characters and words. A story or a genre is recognized as a story or genre because we've seen or heard it before; they are partially iconic. However, these iconic and indexical functions of language are limited in that they largely produce meaning about language within a context of use, not meaning through language as a symbolic system; they are highly facilitative, but they do not constitute the message or the meaning of any extended proposition, rule, or argument itself.

What language does, then – how it functions as part of human beings' being and action in the world – is explicate and make general our understanding of more concrete, or iconic and indexical, signs as we encounter them. Unless one wants to subscribe to an exclusively Cartesian view of human reality, imagining that the corporeal aspects of existence are illusory and fleeting and that, ergo, cogito ergo sum, then language does not stand in any superior position to other modes of communication and understanding, but rather in complement to them. We are logocentric/printcentric beings, but our logocentrism is in the service of coming to understand and function in the world. Unless one imagines that life can (or should) be lived exclusively cerebrally, in one's head, then it is Firstness and the experience of qualisigns, and Secondness and the experience of sinsigns, that are the foundations of our daily lives and indeed, of nearly all knowledge about the world.

It is from this perspective of relations between language and other types of signs that the power of multimodality in language learning becomes clear. If, as every progressive, creative educator knows, learning begins in and with experience of the world, then to learn a language as an abstracted system of meaning, with only very limited access to the world, is an almost pointless and ultimately dreary and very frustrating enterprise. Yet, that is how languages for years were, and in most places still are, formally taught: as symbolic systems divorced from the world they were created for. I speak here not only about the much‐pooh‐poohed grammar‐translation method, but also the more "progressive," approved approaches to language pedagogy, such as communicative language teaching (CLT), with its focus on dialogues and authentic texts composed of nothing but language and more language, or similarly, task‐based language teaching (TBLT) and its problem‐based approach in which the medium of problem‐solving is, again, nothing but or little more than language, or even the writing process, with its focus on self‐expression through, again, nothing but…language. Here is logocentrism in the service of nothing but more logocentrism, and for once, aficionados of social semiotics are right: language learning under these conditions is unnatural, forced, largely meaningless, and all too often counterproductive to the goals of learning a language in the first place, be it in the digital age, before, or into the future.

Peircean multimodality in research and design

There is, however, a revolution underway in second and additional language learning, and it is being driven not by teachers or researchers or theorists like me or the readers of this chapter, but by learners who have found places online, on television, in music, and in face‐to‐face, live contact, where second and additional languages serve their growing knowledge of those worlds. These are learners who have learned to harness the resource of multimodality – a resource composed not of independent modes, each with its own unique "affordances" competing for primacy and attention, but of multiple types of signs working in coordination through the three relational principles of resemblance, contiguity, and convention to produce an articulated experience that is whole and that encompasses all our ways of knowing, from the visceral and qualitative to the relational and rational to the cerebral and intellectual.

How do they do it, and what is it about the condition of multimodality that enables this process? The basics of Peircean semiotics, as discussed above, provide a starting point for an explanation of the process of multimodality in informal language learning. But alas, there has been very little to no research from a Peircean perspective – yet. With the goal of future studies in mind, then, I will suggest four principles that might begin to explain how relations among language and other signs produce a context or contexts for language learning that may exceed those of most formal instruction.

Reciprocal indexicality

The first principle is reciprocal indexicality. It refers to the dynamic that takes place when two signs, typically of different levels, such as a qualisign (a sign linked to its object mainly through resemblance), in combination with a legisign (a sign linked to its object mainly through convention) produce an interpretant, or new sign, which is typically a sinsign, or a sign that indexes its relation to other signs. For example, consider the meme (Figure 3.1) that introduced this chapter. In the photo, three rhematic iconic sinsigns (Sign type 2; see Table 3.1), or images of a helicopter in the air, a long leash, and a dog on the ground, are paired with a proposition or dicent symbol (Sign type 9), “Bronwyn Bishop just walking her dog.” In combination, the three images are perceivable as the highly anomalous and somewhat alarming single sign of a dog tethered to a helicopter. Because this sign indexes or points to something (danger for the dog), it can be classified as a dicent indexical sinsign (Sign type 4). Yet, beyond an indication of alarm, the image itself communicates little more. What does it mean? Grasping the meaning of the image, then, requires more: It requires symbolic understanding, or in this case, language, which is supplied by the written proposition.

It might seem, then, that in this case it is a linguistic sign that supplies the meaning of the meme. However, without the photo, notice that the proposition itself is meaningless. Bronwyn Bishop is “just walking her dog”: So, what? With the image, however, possible meanings are now indexed: Bronwyn Bishop is in the helicopter; the dog is hers; she's walking it. But why? If only we knew who Bronwyn Bishop was. The meme as a whole, then, becomes a rhematic iconic legisign, or Sign type 5 that points or directs our attention toward further understanding (requiring one, perhaps, to google “Bronwyn Bishop”), although for Australians or those familiar with Australian politics it might be considered a rhematic symbol (Sign type 8) of the state of Australian politics or even a dicent symbol (Sign type 9), a proposition about the state of Australian politics. In either case, the relationship between image(s) and written text is a reciprocal one, in which each points to the other to create a more fully developed understanding of its implications, or meaning.
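To suggest how such an analysis might be recorded in research, the following minimal sketch codes the meme's component signs and the higher-order interpretants they jointly produce, using the triples from Table 3.1. The class name, field names, and structure are my own hypothetical illustration of one possible annotation scheme, not an established method of the author's.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class CodedSign:
    mode: str                     # channel of input, e.g. "image" or "written text"
    content: str                  # what the sign presents
    triple: Tuple[int, int, int]  # interpretant-object-sign notation from Table 3.1
    label: str                    # Peircean class name

# Component signs of the Bronwyn Bishop meme (Figure 3.1), coded as in the text.
components = [
    CodedSign("image", "helicopter in the air", (1, 1, 2), "rhematic iconic sinsign"),
    CodedSign("image", "long leash", (1, 1, 2), "rhematic iconic sinsign"),
    CodedSign("image", "dog on the ground", (1, 1, 2), "rhematic iconic sinsign"),
    CodedSign("written text", "Bronwyn Bishop just walking her dog.",
              (2, 3, 3), "dicent symbol (legisign)"),
]

# Reciprocal indexicality: the three images combine into a single alarming sign...
combined_image = CodedSign("image", "dog tethered to a helicopter",
                           (2, 2, 2), "dicent indexical sinsign")

# ...and image and caption together yield a higher-order interpretant
# (Sign type 5 for outsiders; type 8 or 9 for viewers who know the political context).
meme_as_whole = CodedSign("image + text",
                          "comment on Bishop's misuse of government perks",
                          (1, 1, 3), "rhematic iconic legisign")

for sign in components + [combined_image, meme_as_whole]:
    print(f"{sign.triple}  {sign.label:28s} {sign.mode}: {sign.content}")
```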

How, then, does the principle of reciprocal indexicality contribute to informal language learning? Let's consider the example of a captioned video, in which an audio voice track is combined with its captioned transcription and a series of video images that are the visual representation of (and that index) the meaning of the audio track and transcribed captions (for an example, watch this video in Spanish about making chocolate chip cookies: https://youtu.be/-Ma4xsmQV98). In this example, reciprocity happens in at least two ways. First, the visual signs of recognizable objects and actions provide a concrete and almost visceral level of understanding to what is being talked about, or represented in linguistic, symbolic terms. Second, the written, transcribed captions visually index the spoken language of the video, pointing to individual words, to the phonetic transcription of words, and to aspects of language that are often elided in speech. In combination, the reciprocity between images and language and between spoken and written forms of language provides a context for language learning that is immediate, meaningful across multiple levels of understanding, and when the topic is of interest to a learner, highly engrossing.

The principle of reciprocal indexicality makes multimodal texts a powerful tool for language learning because it dramatically increases the amount and quality of comprehensible input available to a learner (Krashen 1985). In Vygotskian (Vygotsky 1980) terms, it lowers the threshold of the zone of proximal development, or space in which a learner can function within a task environment. If the only input a learner has is a written or spoken text in the target language, the ability to comprehend and learn is limited essentially to clues provided within the context of the text itself and her or his prior knowledge; but with the support of a complementary written or spoken transcription of the text, and even more so with the addition of a visual representation of the text offered in close contiguity, comprehensibility increases exponentially. And so, too, according to Krashen, does the opportunity to acquire and know how to use new vocabulary and grammatical structures appropriately.

However, the conditions described here are optimal and not often met. The opportunities for reciprocal indexicality of many captioned videos are typically weaker than what might be expected because the visual images do not correspond very clearly to the linguistic text. This is the case even in many videos created expressly for the purpose of language education, as in the case of a series of French‐language videos on YouTube called "Frenchy French" (https://youtu.be/EaNqp4FXh-s). In Episode One, for example, in which two young women discuss a letter received from a friend in the United States, the general situation of the dialogue is graspable from the images, but only the letter is written and transcribed in the caption, and all discussion about the letter's contents is done between the two women's "talking heads." Gestures and expressions provide some level of support to the tone of the discussion, but not enough to provide the letter's or the conversation's meaning. Similarly, consider the videos of Fluent U (https://www.fluentu.com/blog/french/french-tv-series-to-learn-french/), a website devoted to teaching French through "authentic" (not produced for instruction) captioned French‐language videos. These two examples may have their flaws, but they still make better use of their multimodal resources than many others, in which a host simply speaks to the camera (e.g. "How to Stop Translating in Your Head and Start Thinking in English Like a Native" [https://youtu.be/FUW_FN8uzy0], which has received a phenomenal eight million plus views already).

Finally, consider the case of videos or other multimodal texts that are not captioned but subtitled, in which a translation of the spoken text is provided along with an image. Multimodal texts in this configuration provide learners with exposure to the target language and a clear means of comprehending the video's meaning, but a lack of indexical reciprocity between the spoken target language and the written subtitle language (especially if their scripts are different), and often between visual and linguistic signs, creates a weak foundation for language learning. And yet, as findings from multiple studies in this volume (see Chapters 10, 12, 14, and 21) suggest, subtitled videos downloaded from the internet or watched on satellite television are often a learner's first source of input for learning a new language.

Reciprocal indexicality, then, is ironically both a powerful and a highly inefficient principle for language learning within multimodal contexts, suggesting that additional factors, such as learner motivation and the support of formal, systematic instruction, are also key to learners' success in using multimodal texts.

Chained semiosis

A second principle of multimodality in language learning derived from Peircean semiotics is chained semiosis, or the principle that understanding moves from primary experience and an unarticulated grasp of the meaning of sensory input, to increasing articulation and then to fully developed statements and “theorizing” about the world. As an object is perceived and mentally represented, its interpretation (or interpretant, in Peirce's terms) becomes a new sign, which is higher in Peirce's classification scheme than its representation (or representamen). In turn, the new sign's interpretant is a sign of a higher order than its predecessor, and so on. This is the simplest explanation of the process, and it is also the least common, because signs are seldom perceived in isolation but usually within the context of other signs, so that, in fact, Peirce's “chain” is more like a multistranded rope or perhaps a thick web of associations, progressing toward increasingly abstract, symbolic levels of comprehension and expression. With respect to multimodal texts, reciprocal indexicality likely plays a key role, as can be seen in the analysis above of the Bronwyn Bishop meme, in which images and an otherwise ambiguous written text combine to produce a chain of semiosis, or signification that moves from Firstness (Sign type 2) to Secondness (Sign type 4) to Thirdness (Sign types 8 or 9).

The principle of chained semiosis provides a very clear account of relations among different types of signs and sign “systems” – visual/imagistic, nonlinguistic audio (music, background sounds), spoken language, written language – common in multimodal texts used by informal language learners across a wide variety of formats and platforms, based on the grounds of their signification: iconic (resemblance), indexic (contiguity), or symbolic (convention). Social semioticians following the arguments of Kress and his associates might criticize these relations as hierarchical and implicitly logocentric, with nonlinguistic signs (pictures, gestures, sounds) at the “bottom” of Peirce's classification and predominantly linguistic signs (propositions, arguments) at the “top”; but another interpretation, and one that is critical to understanding how languages are learned informally, is to argue that the primary signs of initial experience – sights, sounds, smells, tastes, touches – are foundational to acquiring language and linguistic modes of understanding and expression. The chapters of this Handbook attest again and again to the advanced capacity of informal language learners to acquire and use a target language pragmatically and communicatively, in ways that learners who only learn from textbooks and in formal classroom settings seldom learn to do. That capacity is surely due to the robust experiential grounding in the target language that comes from first encountering that language in fully multimodal contexts, that is, to a “bottom‐up” grounding in the multimodal, multisensory context of language‐in‐use.

Overdetermination

The third principle is overdetermination, a term borrowed from Freudian psychoanalysis (Freud 2010) and structural Marxism (Althusser 1985) to describe the multiple underlying causes of a single event. Dreams, for Freud, were overdetermined because they were considered the product of multiple experiences of the dreamer. In structural Marxism, the term was applied to describe the multiple social, economic, and historical forces at work to maintain the stability of capitalist systems despite inherent and contradictory flaws in that system. With respect to multimodality and informal language learning, overdetermination refers not only to the multiple channels of input available to a learner that provide the opportunity for reciprocal indexicality but to the multiple opportunities over time that are provided within a video game, a captioned video or movie, an encounter with a native speaker, or even a well‐illustrated written text for a connection between a familiar, already known sign and the sign of the target language to be made at multiple levels of understanding. For example, in a captioned movie, uses of vocabulary or structures of speech recurring during the movie provide multiple iterations of the use of these signs, each within a slightly different but overlapping situation, so that if one occurrence of a word or phrase is ambiguous or unnoticed by a learner, other later occurrences may be less so, and learning will occur. The overdetermination of multimodal learning experiences thus helps to compensate for their inefficiency, that is, for the frequent lack in many multimodal contexts of a clear connection between what is read and/or heard and what is seen.

Overdetermination is frequently described in studies of multimodality as the condition of “redundancy,” in which multiple, overlapping channels of input are integrated to create a signal that is more robust and complex than the input of any single channel (see, for example, Oviatt 1999; Partan and Marler 1999). However, redundancy also implies that these channels are the same in the ways they produce meaning, and that they strengthen the signal to a perceiving individual through repetition rather than through their complementarity. For a language learner in a multimodal context, the power of multimodal input is that multiple channels produce multiple types, or levels of input and of understanding – qualitative, indicative, symbolic – simultaneously. “Overdetermination” thus seems a more accurate description of this principle of multimodality.

Cooperative principle

The fourth principle of multimodality and informal language learning comes not directly from Peircean semiotics but rather through Peirce's philosophical pragmatism and its connection to linguistic pragmatics. This principle is borrowed from Grice (1975), whose work focuses on the contextual aspects of linguistic communication. The cooperative principle refers to an understanding on the part of communicants that messages are meaningful and coherent and that within an interaction the goal is mutual communication and understanding. A corollary is Grice's concept of implicature, which refers to features of communication that are implied but not explicitly stated or shown. For example, if someone asks, “Would you like some cake?” and the listener replies, “I'm on a diet,” the cooperative principle states that these two seemingly unrelated utterances are in fact related – that “I'm on a diet” is a reply to the first speaker's question, with the implicature that being on a diet actually means “No, thank you, I can't eat cake because I'm on a diet.”

The cooperative principle and implicature are typically applied only to language and to conversation, but they are easily extended to include other forms and modes of communication. In the Bronwyn Bishop meme (Figure 3.1), for example, the cooperative principle is present in the assumption that the image of dog and helicopter is explained by the written caption. The improbability of the image is an example of visual implicature, in that it implies that perhaps (hopefully!) the image was photoshopped and does not represent something that in reality happened. The full naming of someone in the caption, “Bronwyn Bishop,” implies that this is someone of note, whom we should know (and that if we knew who Bronwyn Bishop was, we'd understand the meme), whereas the inclusion of the word “just,” or “simply,” underscores the irony of the situation in the image. The combined modes of communication in the meme are an implicature suggesting that the full meaning of what is being seen and read depends on a context that can be understood once Bronwyn Bishop's identity is discovered.

Because implicature is a quite sophisticated aspect of linguistic communication, it might be assumed that second language learners need a high level of proficiency before implicature becomes functional for them. However, within multimodal contexts, nonlinguistic modes of input provide additional context clues to a learner that can lower the threshold of understanding significantly. The "foreignness" of linguistically and culturally specific settings and situations can also be mitigated by learners' prior experiences and knowledge of the conventions of a genre of video game or video. The ability to grasp what is implied but not said within a language that is being learned is thus likely to be enhanced within multimodal contexts rather than impeded.

Conclusion: Why do we need a theory of multimodality?

In summary, current research on multimodality and both formal and informal language learning can be divided into two categories: research that largely focuses on the learning outcomes of a variety of multimodal platforms or combinations with limited focus on multimodality as a process; and research/advocacy for multimodality as a process, grounded in the manifesto of the New London Group, the systemic functional linguistics of Halliday, and the social semiotics of Kress and his associates. In the case of the former, I argued that research lacking a theory of multimodality may demonstrate its power and effectiveness but provides only a partial account of why and how platforms and programs are effective, whereas in the case of the latter, I demonstrated that a theory grounded in linguistic principles is unavoidably logocentric and fraught with irreconcilable contradictions. In its place, I have proposed a new theory of multimodality and language learning based on four principles grounded in the semiotics of Peirce, structuralism, and the pragmatics of communication.

But I must also admit, in conclusion, that there is still much work to be done before the four principles I have proposed could be called a full theory of multimodality. There may be more principles involved, or some of the ones proposed may need revision. There is also an obvious need for research and verification. Do these principles apply across multiple multimodal contexts? What do they not help to explain? What new questions do they suggest?

There is also the problem of the complexity of Peircean semiotics. Peirce's 10 sign categories are not exhaustive, because combinations of qualisigns, sinsigns, and legisigns are not limited to combinations of three that produce 10 sign types, but continue through combinations of four, five, six, and more. In a paper subsequent to the original, Peirce proposed 75 sign categories, indicating that the number of different types of signs could, in fact, be infinite. His categories, then, are only heuristics, approximations of what type of sign any natural phenomenon might be, and they can be difficult to assign with accuracy (Dressman 2016).

Because of the complexity of multimodality and the challenges inherent in attempts to accurately code, or map, any multimodal event, it may be that no theory of multimodality will ever fully account for its dynamics. The issues may be so great that perhaps it is sensible to ask whether researchers of SLA in general and informal language learning more specifically would do better to avoid theorizing entirely and simply accept the power of multimodality as a given feature of the broader phenomenon under investigation. But I reject that "solution" for two reasons. The first is that to do so would be anti‐intellectual and a denial of the fundamental principles of scholarship. The second reason is more practical. It is that researchers of human communication need a robust, intellectually tenable and usable theory of multimodality in second language research for the same reasons that physicists need a theory of gravity that explains more than Newton's theory does: because without such a theory, we remain trapped in a world whose possibilities are even less than Newtonian, feeling our way along at gross levels of understanding, struggling to design better ways of teaching and learning languages in ways that are haphazard and reactive and tethered to our own prejudices about what language and communication are and can be. Just as gravity stands at the center of understanding the physical universe, so, too, multimodality is central to understanding how languages are learned and how they may be learned in better ways.

REFERENCES

  1. Akyel, A. and Erçetin, G. (2009). Hypermedia reading strategies employed by advanced learners of English. System 37: 136–152.
  2. Althusser, L. (1985). For Marx. (trans. Ben Brewster). New York: Verso.
  3. Baldry, A. and Thibault, P.J. (2006). Multimodal Transcription and Text Analysis. Sheffield, UK: Equinox Publishing.
  4. Beatrice, B.V. and Luna, R.M. (2013). Teaching English through music: a proposal for multimodal learning activities for primary school children. Encuentro 22: 61–28.
  5. Belcher, D.D. (2017). On becoming facilitators of multimodal composing and digital design. Journal of Second Language Writing 38: 80–85.
  6. Beuchot, M. and Deely, J. (1995). Common sources for the semiotic of Charles Peirce and John Poinsot. The Review of Metaphysics 48 (3): 539–566.
  7. Brown, D.J. (2015). Approaching the grammatical count/mass distinction from a multimodal perspective. TESOL Quarterly 49 (3): 601–608.
  8. Canagarajah, S. (2016). Translingual writing and teacher development in composition. College English 78: 265–273.
  9. Chang, A.C.‐S. (2009). Gains to listeners from reading while listening vs. listening only in comprehending stories. System 37: 652–663.
  10. Cummins, J., Hu, S., Markus, P., and Montero, M.K. (2015). Identity texts and academic achievement: connecting the dots in multilingual school contexts. TESOL Quarterly 49 (3): 555–581.
  11. Danan, M. (2004). Captioning and subtitling: undervalued language learning strategies. Meta 49 (1): 67–77.
  12. Dela Rosa, K., Parent, G., and Eskenazi, M. (2010). Multimodal learning of words: a study on the use of speech synthesis to reinforce written text in L2 language learning. Paper P2‐10, Second Language Studies: Acquisition, Learning, Education and Technology (September 22–24): Tokyo.
  13. Dewey, J. and Bentley, A. (1949). Knowing and the Known. Boston: Beacon Press.
  14. Dressman, M. (2008). Using Social Theory in Educational Research: A Practical Guide. London: Routledge.
  15. Dressman, M. (2016). Reading as the interpretation of signs. Reading Research Quarterly 51 (1): 111–136.
  16. Early, M., Kendrick, M., and Potts, D. (2015). Multimodality: out from the margins of English language teaching. TESOL Quarterly 49 (3): 447–460.
  17. Elola, I. and Oskoz, A. (2017). Writing with 21st century social tools in the L2 classroom: New literacies, genres, and writing practices. Journal of Second Language Writing 36: 52–60.
  18. Florey, K.B. (2007). Sister Bernadette's Barking Dog: The Quirky History and Lost Art of Diagramming Sentences. New York: Houghton Mifflin Harcourt.
  19. Freud, S. (2010). The Interpretation of Dreams. (trans. James Strachey). New York: Basic Books.
  20. Gardner, H. (2006). Multiple Intelligences: New Horizons. New York: Basic Books.
  21. Glaser, B. and Strauss, A. (1999). The Discovery of Grounded Theory: Strategies for Qualitative Research. Piscataway, NJ: Transaction Publishers.
  22. Grice, H.P. (1975). Logic and conversation. In: Syntax and Semantics, Vol. 3: Speech Acts (eds. P. Cole and J.L. Morgan), 41–58. New York: Academic Press.
  23. Guichon, N. and Cohen, C. (2016). Multimodality and CALL. In: The Routledge Handbook of Language Learning and Technology (eds. F. Farr and L. Murray), 509–521. London: Routledge.
  24. Guichon, N. and McLornan, S. (2008). The effects of multimodality on L2 learners: implications for CALL resource design. System 36: 85–93.
  25. Hafner, C.A., Chik, A., and Jones, R.H. (2015). Digital literacies and language learning. Language Learning and Technology 19 (3): 1–7.
  26. Halliday, M.A.K. (1994). An Introduction to Functional Grammar. London: Routledge.
  27. Hampel, R. (2003). Theoretical perspectives and new practices in audio‐graphic conferencing for language learning. ReCALL 15 (1): 21–36.
  28. Hsu, C.‐K., Hwang, G.‐J., and Chang, C.‐K. (2014). An automatic caption filtering and partial hiding approach to improving the English listening comprehension of EFL students. Educational Technology and Society 17 (2): 270–283.
  29. Hung, H.‐T., Yang, J.C., Hwang, G.‐J. et al. (2018). A scoping review of research on digital game‐based language learning. Computers and Education. https://doi.org/10.1016/j.compedu.2018.07.001.
  30. Jewitt, C. and Kress, G. (2003). Multimodal Literacy. New York, NY: Peter Lang.
  31. Johnson, W.L., Vilhjálmsson, H.H., and Marsella, S. (2005). Serious games for language learning: How much game, how much AI? AIED 125: 306–313.
  32. Krashen, S.D. (1985). The Input Hypothesis: Issues and Implications. London: Longman.
  33. Kress, G. (2000). Multimodality: challenges to thinking about language. TESOL Quarterly 34 (2): 337–340.
  34. Kress, G. (2010). Multimodality: A Social Semiotic Approach to Contemporary Communication. New York: Routledge.
  35. Kress, G. and Jewitt, C. (2003). Introduction. In: Multimodal Literacy (eds. C. Jewitt and G. Kress), 1–18. New York: Peter Lang.
  36. Kress, G. and van Leeuwen, T. (2006). Reading Images: The Grammar of Visual Design, 2e. New York, NY: Routledge.
  37. Lankshear, C. and Knobel, M. (2011). New Literacies: Everyday Practices and Social Learning, 3e. New York: Open University Press.
  38. van Leeuwen, T. (2015). Multimodality in education: some directions and some questions. TESOL Quarterly 49 (3): 582–589.
  39. Lotherington, H. and Jenson, J. (2011). Teaching multimodal and digital literacy in L2 settings: new literacies, new basics, new pedagogies. Annual Review of Applied Linguistics 31: 226–246.
  40. Manchón, R.M. (2017). The potential impact of multimodal composition on language learning. Journal of Second Language Writing 38: 94–95.
  41. Marchetti, E. and Valente, A. (2017). Interactivity and multimodality in language learning: the untapped potential of audiobooks. Universal Access in the Information Society 17 (2): 257–274.
  42. Miller‐Cochran, S. (2017). Understanding multimodal composing in an L2 writing context. Journal of Second Language Writing 38: 88–89.
  43. Nelson, M.E. (2008). Multimodal synthesis and the voice of the multimedia author in a Japanese EFL context. Innovation in Language Learning and Teaching 2 (1): 65–82.
  44. New London Group (1996). A pedagogy of multiliteracies: designing social futures. Harvard Educational Review 66 (1): 60–92.
  45. Oviatt, S. (1999). Ten myths of multimodal interaction. Communications of the ACM 42 (11): 74–81.
  46. Parmentier, R.J. (1987). Peirce divested for non‐intimates. RSSI: Recherches Sémiotiques/Semiotic Inquiry 7 (1): 19–39.
  47. Partan, S. and Marler, P. (1999). Communication goes multimodal. Science 283 (5405): 1272–1273.
  48. Peirce, C.S. (1955). Logic as semiotic: the theory of signs. In: Philosophical Writings of Peirce (ed. J. Buchler), 98–119. New York: Dover.
  49. Perez, M.M., Van Den Noortgate, W., and Desmet, P. (2013). Captioned video for L2 listening and vocabulary learning: a meta‐analysis. System 41: 720–739.
  50. Peterson, M. (2010a). Computerized games and simulations in computer‐assisted language learning: a meta‐analysis of research. Simulation & Gaming 41 (1): 72–93.
  51. Peterson, M. (2010b). Massively multi‐player online role‐playing games as arenas for second language learning. Computer Assisted Language Learning 23 (5): 429–439.
  52. Qu, W. (2017). For L2 writers, it is always the problem of the language. Journal of Second Language Writing 38: 92–93.
  53. Reinhardt, J. and Thorne, S.L. (2011). Beyond comparisons: frameworks for developing digital L2 literacies. In: Present and Future Promises of CALL: From Theory and Research to New Directions in Language Teaching (eds. N. Arnold and L. Ducate), 257–280. San Marcos, TX: CALICO.
  54. Ritterfeld, U., Shen, C., Wang, H. et al. (2009). Multimodality and interactivity: connecting properties of serious games with educational outcomes. Cyberpsychology and Behavior 12 (9): 691–697.
  55. Royce, T. (2002). Multimodality in the TESOL classroom: exploring visual‐verbal synergy. TESOL Quarterly 36 (2): 191–205.
  56. Royce, T. (2007). Multimodal communicative competence in second language contexts. In: New Directions in the Analysis of Multimodal Discourse (eds. T. Royce and W.L. Bowcher), 361–390. Mahwah, NJ: Lawrence Erlbaum.
  57. Shin, D.‐s. and Cimasko, T. (2008). Multimodal composition in a college ESL class: new tools, traditional norms. Computers and Composition 25: 376–395.
  58. Smith, B.E., Pacheco, M.B., and de Almeida, C.R. (2017). Multimodal codemeshing: bilingual adolescents' processes composing across modes and languages. Journal of Second Language Writing 36: 6–22.
  59. Sørensen, B.H. and Meyer, B. (2007). Serious games in language learning and teaching: a theoretical perspective. In: Proceedings of the DiGRA Conference, 559–566.
  60. Specker, E.A. (2008). L1/L2 eye movement reading of closed captioning: a multimodal analysis of multimodal use. PhD dissertation. University of Arizona.
  61. Vanderplank, R. (2010). Déjà vu? A decade of research on language laboratories, television and video in language learning. Language Teaching 43 (1): 1–37.
  62. Vandommele, G., Van den Branden, K., Van Gorp, K., and De Maeyer, S. (2017). In‐school and out‐of‐school multimodal writing as an L2 writing resource for beginner learners of Dutch. Journal of Second Language Writing 36: 23–36.
  63. Vygotsky, L.S. (1980). Mind in Society: The Development of Higher Psychological Processes. Cambridge, MA: Harvard University Press.
  64. Warschauer, M. (2017). The pitfalls and potential of multimodal composing. Journal of Second Language Writing 38: 86–87.
  65. Winke, P., Sydorenko, T., and Gass, S. (2013). Factors influencing the use of captions by foreign language learners: an eye‐tracking study. The Modern Language Journal 97 (1): 254–275.
  66. Yang, Y.‐F. (2012). Multimodal composing in digital storytelling. Computers and Composition 29: 221–238.
  67. Yi, Y. (2014). Possibilities and challenges of multimodal literacy practices in teaching and learning English as an additional language. Language and Linguistics Compass 8 (4): 158–169.
  68. Yi, Y. (2017). Establishing multimodal literacy research in the field of L2 writing: let's move the field forward. Journal of Second Language Writing 38: 90–91.
  69. Zarei, G.R. and Khazaie, S. (2011). L2 vocabulary learning through multimodal representations. Procedia Social and Behavioral Sciences 15: 369–375.
  70. Zheng, D., Newgarden, K., and Young, M.F. (2012). Multimodal analysis of language learning in world of warcraft play: languaging as values‐realizing. ReCALL 24 (3): 339–360.