GMM uses the expectation-maximization (EM) algorithm to identify the components of a mixture of Gaussian distributions. The goal is to learn the parameters of these probability distributions from unlabeled data.
The algorithm proceeds iteratively as follows (see the code sketch after this list):
- Initialization: Assume random centroids (obtained, for example, by running k-Means)
- Repeat the following two steps until convergence, that is, until changes in the assignments drop below a threshold:
    - Expectation step: Soft assignment, that is, compute for each point the probability that it was generated by each distribution
    - Maximization step: Adjust the parameters of the normal distributions to make the data points most likely, given their soft assignments
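To make the two steps concrete, here is a minimal NumPy sketch of EM for a Gaussian mixture. The function name `fit_gmm`, the random initialization, the covariance regularization, and the tolerance are illustrative assumptions rather than code from this section:

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_gmm(X, k, n_iter=100, tol=1e-6, seed=0):
    """Minimal EM sketch for a k-component Gaussian mixture."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Initialization: pick k random points as centroids (k-Means would also work)
    means = X[rng.choice(n, k, replace=False)]
    covs = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(k)])
    weights = np.full(k, 1.0 / k)

    prev_resp = None
    for _ in range(n_iter):
        # Expectation step: soft assignment -- for each point, the probability
        # (responsibility) that each component generated it
        dens = np.column_stack([
            w * multivariate_normal.pdf(X, mean=m, cov=c)
            for w, m, c in zip(weights, means, covs)
        ])                                        # shape (n, k)
        resp = dens / dens.sum(axis=1, keepdims=True)

        # Maximization step: update weights, means, and covariances to make
        # the data most likely under the current soft assignments
        nk = resp.sum(axis=0)                     # effective points per component
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + 1e-6 * np.eye(d)

        # Convergence: stop once the soft assignments barely change
        if prev_resp is not None and np.abs(resp - prev_resp).max() < tol:
            break
        prev_resp = resp
    return weights, means, covs, resp

# Example usage on three synthetic, well-separated blobs
X = np.vstack([np.random.default_rng(1).normal(loc, 0.5, size=(100, 2))
               for loc in ([0, 0], [5, 5], [0, 5])])
weights, means, covs, resp = fit_gmm(X, k=3)
```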
The following plot shows the GMM cluster membership probabilities for the Iris dataset as contour lines:
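The original figure's code is not shown here, but membership-probability contours like these can be produced with scikit-learn's `GaussianMixture` and `predict_proba`. The choice of the first two Iris features and the styling below are assumptions for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

X = load_iris().data[:, :2]          # sepal length and sepal width
gmm = GaussianMixture(n_components=3, random_state=42).fit(X)

# Evaluate the highest component membership probability on a grid
xx, yy = np.meshgrid(np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 200),
                     np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 200))
grid = np.c_[xx.ravel(), yy.ravel()]
zz = gmm.predict_proba(grid).max(axis=1).reshape(xx.shape)

# Contour lines of membership probability, with points colored by cluster
plt.contour(xx, yy, zz, levels=10)
plt.scatter(X[:, 0], X[:, 1], c=gmm.predict(X), s=15)
plt.xlabel('sepal length (cm)')
plt.ylabel('sepal width (cm)')
plt.show()
```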