Natural Language Processing and Computational Linguistics
by Mohamed Zakaria Kurdi
Table of Contents
Cover
Title
Copyright
Introduction
I.1. The definition of NLP
I.2. The structure of this book
1 Linguistic Resources for NLP
1.1. The concept of a corpus
1.2. Corpus taxonomy
1.3. Who collects and distributes corpora?
1.4. The lifecycle of a corpus
1.5. Examples of existing corpora
2 The Sphere of Speech
2.1. Linguistic studies of speech
2.2. Speech processing
3 Morphology Sphere
3.1. Elements of morphology
3.2. Automatic morphological analysis
4 Syntax Sphere
4.1. Basic syntactic concepts
4.2. Elements of formal syntax
4.3. Syntactic formalisms
4.4. Automatic parsing
Bibliography
Index
End User License Agreement
List of Illustrations
1 Linguistic Resources for NLP
Figure 1.1.
Extract from a parallel corpus [MCE 96]
Figure 1.2.
Lifecycle of a corpus
Figure 1.3.
Data collection system using the Wizard of Oz method
Figure 1.4.
Diagram of a corpus data collection system using a prototype
Figure 1.5.
Transcription example using the software Transcriber
Figure 1.6.
Segment of a corpus analyzed using parts of speech
Figure 1.7.
Extract from the Penn Treebank
Figure 1.8.
Extract from a tree corpus for French
Figure 1.9.
Semantic annotation with a has_target relationship
2 The Sphere of Speech
Figure 2.1.
Communication system
Figure 2.2.
Speech organs
Figure 2.3.
Position of the soft palate during the production of French vowels
Figure 2.4.
Parts and aperture of the tongue
Figure 2.5.
Degree of aperture
Figure 2.6.
Displacement of air molecules by the vibrations of a tuning fork
Figure 2.7.
Frequency and amplitude of a simple wave
Figure 2.8.
An aperiodic wave
Figure 2.9.
Analysis of a complex wave
Figure 2.10.
A collection of tuning forks plays the role of a spectrograph
Figure 2.11.
Spectrogram of a French speaker saying “la rose est rouge” generated using the Praat software
Figure 2.12.
Spectrograms of the French vowels: [a], [i] and [u]
Figure 2.13.
Spectrograms of several nonsense words with consonants in the center
Figure 2.14.
Physiology of the ear
Figure 2.15.
Lip rounding
Figure 2.16.
Front vowels and back vowels
Figure 2.17.
French vowel trapezium
Figure 2.18.
Nasal and oral consonants
Figure 2.19.
Examples of some possible syllabic structures in French
Figure 2.20.
Examples of how double consonants are dealt with by the timing tier
Figure 2.21.
Propagation of nasality in Warao
Figure 2.22.
General architecture of speech recognition systems
Figure 2.23.
Markovian model of Xavier’s moods
Figure 2.24.
HMM diagram of Xavier’s behavior and his moods
Figure 2.25.
Markov chain for the word “ouvre” (open)
Figure 2.26.
Markov chain for the recognition of vocal commands
Figure 2.27.
HMM for the word “ouvre” (open)
Figure 2.28.
Trellis with three possible paths
Figure 2.29.
Typical architecture of an SS system
Figure 2.30.
General architecture of a concatenation synthesis system
Figure 2.31.
Serial and parallel architecture of formant speech synthesis systems
3 Morphology Sphere
Figure 3.1.
FSM for expressions of encouragement
Figure 3.2.
Examples of regular expressions with their FSM equivalence
Figure 3.3.
Conjugation of the verbs poser and porter in the present indicative tense
Figure 3.4.
Correspondence pair for the word houses
Figure 3.5.
FST for some words in French with the prefix “anti–”
Figure 3.6.
Partial FST for the derivation of some French words
Figure 3.7.
Kay and Kaplan diagram
Figure 3.8.
Xerox approach to the use of FST in morphological analysis
Figure 3.9.
A micro-text tagged with POS
Figure 3.10.
Tag sequences for “The written history of the Gauls is known”
Figure 3.11.
Architecture of the Brill tagger [BRI 95]
Figure 3.12.
Example of transformation-based learning
4 Syntax Sphere
Figure 4.1.
The role of grammar according to Chomsky
Figure 4.2.
Relationships in the framework of formalism, WG [HUD 10]
Figure 4.3.
Analysis of a simple sentence by the formalism of WG
Figure 4.4.
Example of an analysis by chunks [ABN 91a]
Figure 4.5.
Example of attachment ambiguity of a prepositional phrase
Figure 4.6.
Syntax trees of some noun phrases
Figure 4.7.
Grammar for the structures as shown in Figure 4.6
Figure 4.8.
Syntax trees and rewrite rules of an adjective phrase
Figure 4.9.
Grammar for the structures presented in Figure 4.8
Figure 4.10.
Grammar for the noun phrase with a recursion
Figure 4.11.
Examples of VP with different complement types
Figure 4.12.
Analysis of two types of sentences with two types of complements
Figure 4.13.
Example of analysis of two relative sentences
Figure 4.14.
Examples of the coordination of two phrases and two sentences
Figure 4.15.
Two syntax trees for a syntactically ambiguous sentence
Figure 4.16.
Hierarchy of formal grammars
Figure 4.17.
Grammar for the language a^n b^n c^n
Figure 4.18.
Syntax tree for the strings: abc and aabbcc
Figure 4.19.
The derivation of strings: ab, aabb, aaabbb
Figure 4.20.
Example of a grammar in Chomsky normal form with examples of syntax trees
Figure 4.21.
Syntax tree of an NP in Chomsky normal form
Figure 4.22.
Example of grammar in Greibach normal form
Figure 4.23.
Regular grammar that generates the language a^n b^m
Figure 4.24.
Types of branching in complex sentences
Figure 4.25.
Type-2 grammar modified to account for the agreement
Figure 4.26.
Feature structures of the noun “house” and of the verb “love”
Figure 4.27.
CFS of a simple sentence
Figure 4.28.
Feature graphs for the agreement feature for the words “house” and “love”
Figure 4.29.
Example of structures of shared value and of a reentrant structure
Figure 4.30.
Example of structures of shared value and of a reentrant structure
Figure 4.31.
Examples of feature structures with subsumption relationships
Figure 4.32.
Examples of unifications
Figure 4.33.
DCG Grammar
Figure 4.34.
DCG enriched with FS
Figure 4.35.
Rewrite rule and syntax tree of a complex noun phrase
Figure 4.36.
Examples of phrases with their heads
Figure 4.37.
Diagrams of the two basic rules
Figure 4.38.
Examples of noun phrases
Figure 4.39.
Diagram and example of a determiner phrase according to [ABN 87]
Figure 4.40.
Example of the processing of a verb phrase with the X-bar theory
Figure 4.41.
Diagram and example of analysis of entire sentences
Figure 4.42.
Analysis of a completive subordinate
Figure 4.43.
Diagram of a typed FS in HPSG
Figure 4.44.
Simplified lexical entry of “house”
Figure 4.45.
Some abbreviations of FS in HPSG
Figure 4.46.
Enriched FS of the words “house” and “John”
Figure 4.47.
Some simplified FS of verbs
Figure 4.48.
FS of the verb “sees”
Figure 4.49.
General diagram of l-rules
Figure 4.50.
Rule of plural
Figure 4.51.
Rule of derivation of an agent noun from the verb
Figure 4.52.
Head-Complement Rule
Figure 4.53.
Head-Complement Rule applied to a transitive verb
Figure 4.54.
Head-Modifier Rule
Figure 4.55.
Head-Specifier Rule
Figure 4.56.
Lexical entry of the determiner “the”
Figure 4.57.
Feature structures of the noun phrase: the house
Figure 4.58.
Analysis of the verb phrase: sees the house
Figure 4.59.
The FS of the pronoun “he”
Figure 4.60.
The analysis of the sentence: he sees the house
Figure 4.61.
Examples of initial and auxiliary elementary trees
Figure 4.62.
Diagram and example of substitution in LTAG
Figure 4.63.
General diagram and example of adjunction
Figure 4.64.
An example of a derived tree and a corresponding derivation tree
Figure 4.65.
Examples of feature structures associated with elementary trees
Figure 4.66.
An example of a substitution with unification
Figure 4.67.
Diagram of an addition with unification
Figure 4.68.
Example of a recursive transition network
Figure 4.69.
A DCG and the corresponding RTNs TRVIDF PP
Figure 4.70.
Context-free grammars for the parsing of a fragment
Figure 4.71.
Example of parsing with a top-down algorithm
Figure 4.72.
Basic top-down algorithms
Figure 4.73.
Micro-grammar with a left recursion
Figure 4.74.
Left recursion with a top-down algorithm
Figure 4.75.
Example of parsing with a bottom-up algorithm
Figure 4.76.
Basic top-down algorithms
Figure 4.77.
CFG Grammar
Figure 4.78.
Repeated backtracking with a top-down algorithm
Figure 4.79.
Left-corner algorithm
Figure 4.80.
Example of parsing with the left-corner algorithm
Figure 4.81.
Table of an incomplete parsing
Figure 4.82.
Table of a complete parsing of a sentence
Figure 4.83.
Partial active chart
Figure 4.84.
Diagram of the first fundamental rule
Figure 4.85.
Example of application of the fundamental rule
Figure 4.86.
Tabular parsing algorithm with a bottom-up approach
Figure 4.87.
Example of a probabilistic context-free grammar for a fragment of French
Figure 4.88.
Parsing tree for a sentence from the PCFG of the Figure 4.87
Figure 4.89.
Supervised learning of a PCFG
Figure 4.90.
General structure of the parse table of the CYK algorithm
Figure 4.91.
The first step in the execution of the CYK algorithm
Figure 4.92.
The second step in the execution of the CYK algorithm
Figure 4.93.
The third step in the execution of the CYK algorithm
Figure 4.94.
The fourth step in the execution of the CYK algorithm
Figure 4.95.
Architecture of a neural network for handwritten digit recognition [NIE 14]
Figure 4.96.
Example of a recurring network
List of Tables
2 The Sphere of Speech
Table 2.1.
Examples of IPA transcriptions from French and English
Table 2.2.
The first three formants of the vowels [a], [i] and [u]
Table 2.3.
Examples of rounded and unrounded vowels in French
Table 2.4.
Nasal vowels in French
Table 2.5.
Oral vowels in French
Table 2.6.
Places of articulation of French consonants
Table 2.7.
French semi-vowels
Table 2.8.
Examples of distinctive features according to the taxonomy by Chomsky and Halle [CHO 68]
Table 2.9.
Constraint forbidding three successive consonants in Egyptian Arabic
Table 2.10.
Constraints involved in the case of joining (liaison) in French
Table 2.11.
Classification parameters of speech recognition systems
Table 2.12.
Probabilities of Xavier’s moods tomorrow, with the knowledge of his mood today
Table 2.13.
Probability of Xavier’s behavior, knowing his mood
Table 2.14.
Micro-corpus unigrams
Table 2.15.
Bigrams in the micro-corpus with their frequencies
Table 2.16.
Abbreviations to be normalized before synthesis
Table 2.17.
Examples of transcriptions with the Arpabet format
3 Morphology Sphere
Table 3.1.
Examples of Arabic words derived from the stem k-t-b
Table 3.2.
Examples of words in Turkish
Table 3.3.
Examples of prefixes commonly used in English
Table 3.4.
Examples of suffixes commonly used in English
Table 3.5.
Examples of collocations in three French literary corpora [LEG 12]
Table 3.6.
Examples of colligation
Table 3.7.
Successors of the word read [FRA 92]
Table 3.8.
Bigrams of the words bonbon and bonbonne
Table 3.9.
Some regular expressions with simple sequences
Table 3.10.
Regular expressions with character categories
Table 3.11.
Priority of operators in regular expressions
Table 3.12.
FSM transition table for expressions of encouragement
Table 3.13.
A minimal list of tags
4 Syntax Sphere
Table 4.1.
Clefting patterns
Table 4.2.
Examples of restrictive negation
Table 4.3.
A few examples of variation of the word order at the oral framework
Table 4.4.
Examples of noun phrases and their morphological sequences
Table 4.5.
Summary of formal grammars
Table 4.6.
Adopted notation and variants in the literature
Table 4.7.
Types in HPSG formalism [POL 97]
Table 4.8.
Labels adopted for the annotation of RTN
Table 4.10.
Table of left-corners of the grammar of the Figure 4.70
Table 4.11.
Summary of spaces required by the three parsing approaches [RES 92a]