This section presents an EM-based algorithm for problems that possess partial data with multiple clusters. The algorithm is referred to as doubly-stochastic EM. To facilitate the derivation, adopt the following notation:
$X = \{x_t \in \mathbb{R}^D;\ t = 1, \ldots, T\}$ is a sequence of partial data.
$Z = \{z_t \in C;\ t = 1, \ldots, T\}$ is the set of hidden-states.
$C = \{C^{(1)}, \ldots, C^{(J)}\}$, where $J$ is the number of hidden-states.
$\Gamma = \{\gamma^{(1)}, \ldots, \gamma^{(K)}\}$ is the set of values that $x_t$ can attain, where $K$ is the number of possible values for $x_t$.
Also define two sets of indicator variables as
$$\delta_t^{(j)} = \begin{cases} 1 & \text{if } z_t = C^{(j)} \\ 0 & \text{otherwise,} \end{cases} \qquad \epsilon_t^{(k)} = \begin{cases} 1 & \text{if } x_t = \gamma^{(k)} \\ 0 & \text{otherwise.} \end{cases}$$
The expectations of these indicators under $\theta_n$ give the posterior probabilities appearing in the Q-function below.
Using these notations and those defined in Section 3.2, $Q(\theta|\theta_n)$ can be written as
$$Q(\theta|\theta_n) = \sum_{t=1}^{T} \sum_{j=1}^{J} \sum_{k=1}^{K} h_t^{(j,k)}(\theta_n) \left[ \log \pi^{(j)} + \log p\bigl(x_t = \gamma^{(k)} \mid z_t = C^{(j)}, \theta\bigr) \right], \tag{3.4.1}$$
where
$$h_t^{(j,k)}(\theta_n) = P\bigl(z_t = C^{(j)}, x_t = \gamma^{(k)} \mid x_t \in \Gamma, \theta_n\bigr)$$
is the joint posterior probability that $x_t$ is generated by cluster $C^{(j)}$ and attains the value $\gamma^{(k)}$.
If $\theta$ defines a GMM, that is, $\theta = \{\pi^{(j)}, \mu^{(j)}, \Sigma^{(j)}\}_{j=1}^{J}$, then
$$p\bigl(x_t = \gamma^{(k)} \mid z_t = C^{(j)}, \theta\bigr) = \mathcal{N}\bigl(\gamma^{(k)}; \mu^{(j)}, \Sigma^{(j)}\bigr),$$
the Gaussian density with mean $\mu^{(j)}$ and covariance $\Sigma^{(j)}$ evaluated at $\gamma^{(k)}$.
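For the one-dimensional case, the E-step of Eq. 3.4.1 amounts to computing the joint posteriors $h_t^{(j,k)}(\theta_n)$. The following is a minimal sketch, assuming a uniform prior over the attainable values in $\Gamma$; the function name and example parameters are illustrative, not taken from the text.

```python
import math

def gaussian(x, mu, var):
    """Univariate Gaussian density N(x; mu, var)."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def e_step_partial(gamma, pi, mu, var):
    """Joint posteriors h[j][k] = P(z = C^(j), x = gamma^(k) | x in Gamma, theta_n)
    for one partial observation whose attainable values are listed in gamma."""
    joint = [[pi[j] * gaussian(g, mu[j], var[j]) for g in gamma]
             for j in range(len(pi))]
    total = sum(sum(row) for row in joint)
    return [[v / total for v in row] for row in joint]

# Example: J = 2 clusters and an observation known only to lie in {5, 6}
h = e_step_partial(gamma=[5.0, 6.0], pi=[0.5, 0.5], mu=[0.0, 4.0], var=[1.0, 1.0])
```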
This section demonstrates how the general formulation in Eq. 3.4.1 can be applied to problems with a single cluster and partially observable data. Referring to Example 2 shown in Figure 3.3(b), let $X = \{x_1, x_2, x_3, x_4, y\} = \{1, 2, 3, 4, \{5 \text{ or } 6\}\}$ be the observed data, where $y = \{5 \text{ or } 6\}$ is the observation with missing information. The information is missing because the exact value of $y$ is unknown. Also let $z \in \Gamma$, where $\Gamma = \{\gamma^{(1)}, \gamma^{(2)}\} = \{5, 6\}$, be the missing information. Since there is only one cluster and $x_1$ to $x_4$ are certain, define $\theta \equiv \{\mu, \sigma^2\}$, set $\pi^{(1)} = 1.0$, and write Eq. 3.4.1 as
$$Q(\theta|\theta_n) = \sum_{t=1}^{4} \log p(x_t \mid \theta) + \sum_{k=1}^{2} P\bigl(y = \gamma^{(k)} \mid y \in \Gamma, \theta_n\bigr) \log p\bigl(y = \gamma^{(k)} \mid y \in \Gamma, \theta\bigr). \tag{3.4.2}$$
Note that the discrete density $p(y = \gamma^{(k)} \mid y \in \Gamma, \theta)$ can be interpreted as the product of the density $p(y = \gamma^{(k)} \mid y \in \Gamma)$ and the functional value of $p(y|\theta)$ at $y = \gamma^{(k)}$, as shown in Figure 3.7.
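This product interpretation can be checked numerically. Below is a minimal sketch, assuming a uniform prior $p(y = \gamma^{(k)} \mid y \in \Gamma)$ and evaluated at the initial parameters used in the next paragraph.

```python
from statistics import NormalDist

def value_posterior(gamma, prior, mu, sigma):
    """P(y = gamma^(k) | y in Gamma, theta_n): the prior p(y = gamma^(k) | y in Gamma)
    multiplied by the Gaussian density p(y | theta_n) at gamma^(k), then normalized."""
    w = [p * NormalDist(mu, sigma).pdf(g) for g, p in zip(gamma, prior)]
    total = sum(w)
    return [wk / total for wk in w]

# With theta_0 = {mu_0, sigma_0^2} = {0, 1}, nearly all posterior mass falls on y = 5:
print(value_posterior([5.0, 6.0], [0.5, 0.5], mu=0.0, sigma=1.0))
# -> approximately [0.996, 0.004]
```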
Assume that at the start of the iterations, $n = 0$ and $\theta_0 = \{\mu_0, \sigma_0^2\} = \{0, 1\}$. Then, Eq. 3.4.2 becomes
$$Q(\theta|\theta_0) \approx \sum_{t=1}^{4} \log p(x_t \mid \theta) + 0.996 \log p\bigl(y = 5 \mid y \in \Gamma, \theta\bigr) + 0.004 \log p\bigl(y = 6 \mid y \in \Gamma, \theta\bigr). \tag{3.4.3}$$
In the M-step, compute $\theta_1$ according to
$$\theta_1 = \arg\max_{\theta} Q(\theta|\theta_0),$$
that is, a sample mean and variance in which the attainable values of $y$ are weighted by their posterior probabilities.
The next iteration replaces $\theta_0$ in Eq. 3.4.3 with $\theta_1$ to compute $Q(\theta|\theta_1)$. The procedure continues until convergence. Table 3.4 shows the values of $\mu$ and $\sigma^2$ in the course of the EM iterations when their initial values are $\mu_0 = 0$ and $\sigma_0^2 = 1$. Figure 3.8 depicts the movement of the Gaussian density function specified by $\mu$ and $\sigma^2$ during the EM iterations.
Table 3.4. Values of Q(θ∣θn), μ, and σ² in the course of the EM iterations.

| Iteration (n) | Q(θ∣θn) | μ | σ² |
|---|---|---|---|
| 0 | −∞ | 0.00 | 1.00 |
| 1 | −29.12 | 3.00 | 7.02 |
| 2 | −4.57 | 3.08 | 8.62 |
| 3 | −4.64 | 3.09 | 8.69 |
| 4 | −4.64 | 3.09 | 8.69 |
| 5 | −4.64 | 3.09 | 8.69 |
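For reference, the entire E-step/M-step loop of this example fits in a few lines. The sketch below assumes a uniform prior over $\Gamma$ and the usual posterior-weighted mean and variance updates; it illustrates the procedure rather than reproducing Table 3.4 digit for digit.

```python
from statistics import NormalDist

certain = [1.0, 2.0, 3.0, 4.0]   # the certain observations x_1, ..., x_4
gamma = [5.0, 6.0]               # attainable values of the partial observation y

mu, var = 0.0, 1.0               # theta_0 = {mu_0, sigma_0^2} = {0, 1}
for n in range(5):
    # E-step: posterior over the attainable values of y under theta_n
    w = [NormalDist(mu, var ** 0.5).pdf(g) for g in gamma]
    h = [wk / sum(w) for wk in w]
    # M-step: weighted sample mean and variance over all five observations
    T = len(certain) + 1
    mu = (sum(certain) + sum(hk * g for hk, g in zip(h, gamma))) / T
    var = (sum((x - mu) ** 2 for x in certain)
           + sum(hk * (g - mu) ** 2 for hk, g in zip(h, gamma))) / T
    print(f"n = {n + 1}: mu = {mu:.2f}, sigma^2 = {var:.2f}")
```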
Here, the one-dimensional example shown in Figure 3.9 is used to illustrate the application of Eq. 3.4.1 to problems with both partial data and hidden-states. Review the following definitions:
$X = \{x_1, x_2, \ldots, x_6, y_1, y_2\}$ is the available data, with certain observations $\{x_1, \ldots, x_6\}$ and uncertain observations $\{y_1, y_2\}$.
$Z = \{z_t \in C;\ t = 1, \ldots, 8\}$, where $C = \{C^{(1)}, C^{(2)}\}$ is the set of hidden-states.
$\Gamma_1 = \{\gamma_1^{(1)}, \gamma_1^{(2)}\}$ and $\Gamma_2 = \{\gamma_2^{(1)}, \gamma_2^{(2)}\}$, such that $y_1 \in \Gamma_1$ and $y_2 \in \Gamma_2$, are the sets of values attainable by $y_1$ and $y_2$, respectively.
$J = 2$ and $K = 2$.
Using the preceding notations results in
$$Q(\theta|\theta_n) = \sum_{t=1}^{6} \sum_{j=1}^{2} P\bigl(z_t = C^{(j)} \mid x_t, \theta_n\bigr) \left[ \log \pi^{(j)} + \log p\bigl(x_t \mid C^{(j)}, \theta\bigr) \right] + \sum_{t'=1}^{2} \sum_{j=1}^{2} \sum_{k=1}^{2} h_{t'}^{(j,k)}(\theta_n) \left[ \log \pi^{(j)} + \log p\bigl(y_{t'} = \gamma_{t'}^{(k)} \mid C^{(j)}, \theta\bigr) \right], \tag{3.4.4}$$
where $h_{t'}^{(j,k)}(\theta_n) = P\bigl(z_{t'} = C^{(j)}, y_{t'} = \gamma_{t'}^{(k)} \mid y_{t'} \in \Gamma_{t'}, \theta_n\bigr)$ is the posterior probability that $y_{t'}$ is equal to $\gamma_{t'}^{(k)}$ given that $y_{t'}$ is generated by cluster $C^{(j)}$. Note that when the values of $y_1$ and $y_2$ are certain (e.g., it is known that $y_1 = 5$, and $\gamma_2^{(1)}$ and $\gamma_2^{(2)}$ become so close that we can consider $y_2 = 9$), then $K = 1$, $\Gamma_1 = \{\gamma_1\} = \{5\}$, and $\Gamma_2 = \{\gamma_2\} = \{9\}$. In such cases, the second term of Eq. 3.4.4 becomes
$$\sum_{t'=1}^{2} \sum_{j=1}^{2} P\bigl(z_{t'} = C^{(j)} \mid y_{t'}, \theta_n\bigr) \left[ \log \pi^{(j)} + \log p\bigl(y_{t'} \mid C^{(j)}, \theta\bigr) \right]. \tag{3.4.5}$$
Replacing the second term of Eq. 3.4.4 by Eq. 3.4.5 and setting $x_7 = y_1$ and $x_8 = y_2$ results in
$$Q(\theta|\theta_n) = \sum_{t=1}^{8} \sum_{j=1}^{2} P\bigl(z_t = C^{(j)} \mid x_t, \theta_n\bigr) \left[ \log \pi^{(j)} + \log p\bigl(x_t \mid C^{(j)}, \theta\bigr) \right],$$
which is the Q-function of a GMM without partially unknown data, that is, with all observable data being certain.
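To make the reduction concrete, the sketch below evaluates the two terms of Eq. 3.4.4 for two one-dimensional Gaussian clusters; all parameter values are hypothetical. Passing single-element value sets ($K = 1$) for $y_1$ and $y_2$ makes the second term take exactly the form of Eq. 3.4.5, so the result coincides with the standard GMM Q-function above.

```python
import math
from statistics import NormalDist

def q_function(certain, partial, theta, theta_n):
    """Evaluate Q(theta | theta_n) of Eq. 3.4.4 for 1-D Gaussian clusters.
    theta and theta_n are (pi, mu, sigma) triples; `partial` lists, for each
    uncertain observation y_t', its set of attainable values Gamma_t'."""
    pi, mu, sigma = theta
    pi_n, mu_n, sigma_n = theta_n
    J = len(pi)
    q = 0.0
    # First term: certain observations, posterior over clusters only.
    for x in certain:
        post = [pi_n[j] * NormalDist(mu_n[j], sigma_n[j]).pdf(x) for j in range(J)]
        s = sum(post)
        for j in range(J):
            q += (post[j] / s) * (math.log(pi[j])
                                  + math.log(NormalDist(mu[j], sigma[j]).pdf(x)))
    # Second term: partial observations, joint posterior over clusters and values.
    for gammas in partial:
        post = [[pi_n[j] * NormalDist(mu_n[j], sigma_n[j]).pdf(g) for g in gammas]
                for j in range(J)]
        s = sum(map(sum, post))
        for j in range(J):
            for k, g in enumerate(gammas):
                q += (post[j][k] / s) * (math.log(pi[j])
                                         + math.log(NormalDist(mu[j], sigma[j]).pdf(g)))
    return q

# Hypothetical setup: six certain points, with y_1 and y_2 collapsed to single
# values, which reproduces the standard GMM Q-function of the last equation.
theta0 = ([0.5, 0.5], [2.0, 8.0], [1.0, 1.0])
q = q_function([1.0, 2.0, 3.0, 6.0, 7.0, 8.0], [[5.0], [9.0]], theta0, theta0)
```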