A.2 Formal Language Theory Review

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

A.2 Formal Language Theory Review

A.2.1 Strings

Let ∑ be a finite, non-empty set of symbols, called an alphabet.

A string (or word) over ∑ is a finite sequence of symbols put side-by-side.

For example, abacc is a string over {a, b, c}.

The length of string s over ∑, denoted by |s|, is the total number of occurrences of symbols in s.

For example, |abacc| = 5.

A string of length 0 is called the empty string or null string, denoted by ε or λ.

A.2.2 Binary Operation on Strings

Let x = a₁ a₂ … a_m and y = b₁ b₂ … b_n be strings over ∑, of lengths m and n respectively. The concatenation of x and y, denoted by xy, or xoy is the string a₁ a₂ … a_m b₁ b₂ …b_n .

A.2.3 Relationships Between Strings

A string x is called a prefix of a string z iff ∃y, [z = xy]. A string x is called a suffix of a string z iff ∃y, [z = yx]. A string x is called a substring of a string z iff ∃u ∃ v [z = uxv].

x is a proper prefix of z iff x is a prefix of z and x ≠ z. x is a proper suffix of z iff x is a suffix of z and x ≠ z. x is a proper substring of z iff x is a substring of z and x ≠ z.

∑* denotes the set of all strings over ∑.

Theorem A.2.1 Let ∑ be an alphabet. (∑*, o, ε) forms a monoid, where οis concatenation and ε is the empty string.

This theorem means that strings as defined have associative operator οand a unique identity element ε.

Sometimes, especially when applying the formal language theory to programming languages, we use an alternative nomenclature.

Definition A.2.1 (Vocabulary) Vocabulary V is a finite, non-empty set of symbols:

V = v₁, v₂, …

Definition A.2.2 (Strings) Strings are formed by concatenation operation ‘.’, for example, ‘a.b’, which can also be written as ‘ab’ If there is no chance of confusion between symbols ‘a’, ‘b’ and ‘ab’. In short, it means zero or more symbols from some vocabulary V, placed side-by-side in a series or sequence.

Definition A.2.3 (Length of a string) A string has length equal to number of symbols in it and isdenoted by / s /.

Definition A.2.4 (Set of strings) Let V¹ denote the set of all the strings of length 1 from V, V² denote the set of all the strings of length 2 from V, etc.

Then, V² = V¹.V¹; similarly for V³, V⁴, etc.

Then V⁺ = V¹ ∪ V² ∪… and V* = {ε} ∪ V¹ ∪ V² ∪….

String Sets as Algebraic System

λ or ε is the identity element in the algebraic system: < V*, ., λ >, because for any s ∈ V*, s.λ = λ.s = s. Comparing with < N, +, 0 >, we find that the two systems are isomorphic. This result is important and forms basis of many of our later derivations.

A.2.4 Languages

In short, a Language L over vocabulary V is given by L ⊂ V*. Usually, some “alphabet” V_T (a set of Terminal symbols) is used.

We shall call the strings in a language a sentence. Although structurally they look similar, note that “string” and “sentence” carry different meanings. A string is any arbitrary member of the set V*, while a sentence has to be a member of a particular language L, which may be defined over V*, i.e. L ⊂ V*. A language has some form of discipline built into it, as specified by its grammar (see below).

We have already seen that ∑* denotes the set of all strings over ∑.

Inductive Definition of ∑*

Base Clause: ε is in ∑*.
Inductive Clause: If x is in ∑* and b ∈ ∑, then xb ∈ ∑*.
Extremal Clause: ∑* includes only those strings which can be constructed by a finite number of applications of (1) and (2).

A language over ∑ is a subset of ∑*.

For example, let ∑= {a, b}.

Language L = {aⁿ bⁿ | n ≥ 0} = λ, ab, aabb, aaabbb, …

A.2.5 Binary Operation on Languages

Let X and Y be languages over ∑. The concatenation of X and Y, denoted by XY, is defined as

XY = {xy | x ∈ X and y ∈ Y}.

For example,

X = {a, ab, bb, baa}, Y = {0, 11} and XY = {a0, a11, ab0, ab11, bb0, bb11, baa0, baa11}

Note that XYand YX are different in general.

A.2.6 Power Xⁱ f a Language X

Inductive definition of Xⁱof a language X is:

Base Clause: X 0 = {ε}
Inductive Clause: Xⁱ⁺¹ = (Xⁱ) X

For example,

Y = {0, 11}, Y² = Y¹ {0,11} = (Y⁰ {0, 11}){0, 11}) = ({λ}{0, 11}){0, 11} = {00, 011, 110, 1111}

A.2.7 Closure of a Language

Let X be a language over ∑. The closure of X, denoted by X*, is defined as follows:

X* = ∪_i≥₀ Xⁱ =x⁰∪x¹∪x²∪…

The positive closure of X, denoted by X⁺, is defined as follows:

X⁺ = ∪_i≥₁ Xⁱ = x¹∪x²∪…

Note that the set of all strings over ∑ is equal to the closure of ∑, where ∑ is viewed as a set of strings of length 1 rather than an alphabet. This is the reason we denote the set of all strings over ∑ by ∑*.

We often regard X* as a unary operation to X, called Kleene star.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for A.2 Formal Language Theory Review

Create new playlist

Sign In

Sign Up

A.2 Formal Language Theory Review

A.2.1 Strings

A.2.2 Binary Operation on Strings

A.2.3 Relationships Between Strings

String Sets as Algebraic System

A.2.4 Languages

Inductive Definition of ∑*

A.2.5 Binary Operation on Languages

A.2.6 Power Xi f a Language X

A.2.7 Closure of a Language

Table of Contents for
A.2 Formal Language Theory Review

A.2.6 Power Xⁱ f a Language X