Gaussian mixture

In Chapter 2, Introduction to Semi-Supervised Learning, we discussed the generative Gaussian mixture model in the context of semi-supervised learning. In this section, we're going to apply the EM algorithm to derive the formulas for the parameter updates.

Let's start by considering a dataset, X, drawn from a data generating process, pdata:
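$$X = \{x_1, x_2, \ldots, x_N\}, \quad x_i \in \mathbb{R}^n, \quad x_i \sim p_{data}$$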

We assume that the whole distribution is generated by the sum of k Gaussian distributions so that the probability of each sample can be expressed as follows:
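$$p(x_i) = \sum_{j=1}^{k} w_j \, N(x_i;\, \mu_j, \Sigma_j)$$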

In the previous expression, the term wj = P(N=j) is the relative weight of the jth Gaussian, while μj and Σj are its mean and covariance matrix. For consistency with the laws of probability, we also need to impose the following:
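$$\sum_{j=1}^{k} w_j = 1, \quad w_j \geq 0$$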

Unfortunately, if we try to solve the problem directly, we need to manage the logarithm of a sum and the procedure becomes very complex. However, we have learned that it's possible to use latent variables as helpers, whenever this trick can simplify the solution.

Let's consider a single parameter set θ = (wj, μj, Σj) and a latent indicator matrix Z, where each element zij is equal to 1 if the point xi has been generated by the jth Gaussian, and 0 otherwise. Therefore, each zij is Bernoulli distributed with parameter equal to p(j|xi, θt).

The joint log-likelihood can hence be expressed using the exponential-indicator notation, as follows:
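$$\log L(\theta;\, X, Z) = \log \prod_{i=1}^{N} \prod_{j=1}^{k} p(x_i, j \,|\, \theta)^{z_{ij}}$$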

The index i refers to the samples, while j refers to the Gaussian distributions. If we apply the chain rule and the properties of logarithms, the expression becomes as follows:
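$$\log L(\theta;\, X, Z) = \sum_{i=1}^{N} \sum_{j=1}^{k} z_{ij} \left( \log p(x_i \,|\, j, \theta) + \log p(j \,|\, \theta) \right)$$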

The first term represents the probability of xi under the jth Gaussian, while the second one is the relative weight of the jth Gaussian. We can now compute the Q(θ;θt) function using the joint log-likelihood:
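$$Q(\theta;\, \theta_t) = E_{Z|X,\theta_t}\left[ \log L(\theta;\, X, Z) \right] = E\left[ \sum_{i=1}^{N} \sum_{j=1}^{k} z_{ij} \left( \log p(x_i \,|\, j, \theta) + \log p(j \,|\, \theta) \right) \right]$$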

Exploiting the linearity of E[•], the previous expression becomes as follows:
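$$Q(\theta;\, \theta_t) = \sum_{i=1}^{N} \sum_{j=1}^{k} p(j \,|\, x_i, \theta_t) \left( \log p(x_i \,|\, j, \theta) + \log p(j \,|\, \theta) \right)$$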

The term p(j|xi, θt) corresponds to the expected value of zij considering the complete data, and it expresses the probability of the jth Gaussian given the sample xi. It can be rewritten using Bayes' theorem:
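$$p(j \,|\, x_i, \theta_t) = \frac{p(x_i \,|\, j, \theta_t) \, p(j \,|\, \theta_t)}{p(x_i \,|\, \theta_t)} = \alpha \, p(x_i \,|\, j, \theta_t) \, w_j^{(t)}$$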

The first term is the probability of xi under the jth Gaussian with parameters θt, while the second one is the weight of the jth Gaussian considering the same parameter set θt. In order to derive the iterative expressions for the parameters, it's useful to write the complete formula for the logarithm of a multivariate Gaussian distribution:
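$$\log N(x_i;\, \mu_j, \Sigma_j) = -\frac{n}{2}\log 2\pi - \frac{1}{2}\log \left| \Sigma_j \right| - \frac{1}{2}(x_i - \mu_j)^T \Sigma_j^{-1} (x_i - \mu_j)$$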

To simplify this expression, we use the trace trick. In fact, as (xi - μj)T Σj-1 (xi - μj) is a scalar, we can exploit the properties tr(AB) = tr(BA) and tr(c) = c, where A and B are matrices and c is a scalar:
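$$(x_i - \mu_j)^T \Sigma_j^{-1} (x_i - \mu_j) = \mathrm{tr}\left( \Sigma_j^{-1} (x_i - \mu_j)(x_i - \mu_j)^T \right)$$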

Let's start by considering the estimation of the mean (only the first term of Q(θ; θt) depends on the means and covariances). Computing the gradient with respect to μj, we get the following:
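$$\nabla_{\mu_j} Q(\theta;\, \theta_t) = \sum_{i=1}^{N} p(j \,|\, x_i, \theta_t) \, \Sigma_j^{-1} (x_i - \mu_j)$$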

Setting the derivative equal to zero, we get the following:
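$$\mu_j^{(t+1)} = \frac{\sum_{i=1}^{N} p(j \,|\, x_i, \theta_t) \, x_i}{\sum_{i=1}^{N} p(j \,|\, x_i, \theta_t)}$$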

In the same way, we obtain the expression of the covariance matrix:
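$$\Sigma_j^{(t+1)} = \frac{\sum_{i=1}^{N} p(j \,|\, x_i, \theta_t) \, (x_i - \mu_j^{(t+1)})(x_i - \mu_j^{(t+1)})^T}{\sum_{i=1}^{N} p(j \,|\, x_i, \theta_t)}$$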

To obtain the iterative expressions for the weights, the procedure is a little more complex, because we need to use Lagrange multipliers (further information can be found at http://www.slimy.com/~steuard/teaching/tutorials/Lagrange.html). Considering that the sum of the weights must always be equal to 1, it's possible to write the following Lagrangian:
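$$\mathcal{L}(w, \lambda) = Q(\theta;\, \theta_t) + \lambda \left( 1 - \sum_{j=1}^{k} w_j \right)$$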

Setting both derivatives equal to zero, from the first one, considering that wj = p(j|θ), we get the following:
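$$\frac{\partial \mathcal{L}}{\partial w_j} = \sum_{i=1}^{N} \frac{p(j \,|\, x_i, \theta_t)}{w_j} - \lambda = 0 \quad \Rightarrow \quad \sum_{i=1}^{N} p(j \,|\, x_i, \theta_t) = \lambda \, w_j$$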

From the second derivative, and summing the previous result over j, we obtain the following:
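$$\frac{\partial \mathcal{L}}{\partial \lambda} = 1 - \sum_{j=1}^{k} w_j = 0 \quad \Rightarrow \quad \lambda = \sum_{i=1}^{N} \sum_{j=1}^{k} p(j \,|\, x_i, \theta_t) = N$$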

The last step derives from the fundamental condition: 
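$$\sum_{j=1}^{k} p(j \,|\, x_i, \theta_t) = 1 \quad \forall\, x_i \in X$$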

Therefore, the final expression of the weights is as follows:
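$$w_j^{(t+1)} = \frac{1}{N} \sum_{i=1}^{N} p(j \,|\, x_i, \theta_t)$$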

At this point, we can formalize the Gaussian mixture algorithm:

  • Set random initial values for wj(0), μj(0), and Σj(0)
  • E-Step: Compute p(j|xi, θt) using Bayes' theorem: p(j|xi, θt) = α wj(t) p(xi|j, θt), where α is the normalizing constant that makes the probabilities sum to 1 over j
  • M-Step: Compute wj(t+1), μj(t+1), and Σj(t+1) using the formulas derived previously

The process must be iterated until the parameters become stable. In general, the best practice is to use both a threshold and a maximum number of iterations, as shown in the sketch that follows.
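To make the procedure concrete, the following is a minimal NumPy/SciPy sketch of the algorithm (the function name gaussian_mixture_em, the tolerance-based stopping rule, and the small diagonal term added to the covariances for numerical stability are illustrative choices, not prescribed by the derivation):

import numpy as np
from scipy.stats import multivariate_normal

def gaussian_mixture_em(X, k, n_iter=100, tol=1e-4, seed=0):
    # EM for a Gaussian mixture, following the update formulas derived above
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Random initialization of wj(0), μj(0), and Σj(0)
    w = np.full(k, 1.0 / k)
    mu = X[rng.choice(n, size=k, replace=False)]
    sigma = np.stack([np.cov(X, rowvar=False) + 1e-6 * np.eye(d)] * k)

    for _ in range(n_iter):
        # E-step: responsibilities p(j|xi, θt) via Bayes' theorem
        resp = np.column_stack([
            w[j] * multivariate_normal.pdf(X, mean=mu[j], cov=sigma[j])
            for j in range(k)
        ])
        resp /= resp.sum(axis=1, keepdims=True)  # row-wise normalization (α)

        # M-step: closed-form updates
        nj = resp.sum(axis=0)                    # Σi p(j|xi, θt)
        w_new = nj / n                           # weight update
        mu_new = (resp.T @ X) / nj[:, None]      # mean update
        sigma_new = np.empty_like(sigma)
        for j in range(k):
            diff = X - mu_new[j]
            sigma_new[j] = (resp[:, j, None] * diff).T @ diff / nj[j]
            sigma_new[j] += 1e-6 * np.eye(d)     # keep the covariance invertible

        # Stop when the parameters become stable (threshold + max iterations)
        converged = np.abs(mu_new - mu).max() < tol
        w, mu, sigma = w_new, mu_new, sigma_new
        if converged:
            break

    return w, mu, sigma

For real applications, the GaussianMixture class provided by scikit-learn implements the same EM procedure with more robust initialization strategies.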
