Cluster algorithms

Cluster algorithms use a measure of similarity to identify observations or data attributes that contain similar information. They summarize a dataset by assigning a large number of data points to a smaller number of clusters so that the cluster members are more closely related to each other than to members of other clusters.

Cluster algorithms primarily differ with respect to the type of clusters that they will produce, which implies different assumptions about the data generation process, listed as follows:

  • K-means clustering: Data points belong to one of the k clusters of equal size that take an elliptical form
  • Gaussian mixture models: Data points have been generated by any of the various multivariate normal distributions
  • Density-based clusters: Clusters are of an arbitrary shape and are defined only by the existence of a minimum number of nearby data points
  • Hierarchical clusters: Data points belong to various supersets of groups that are formed by successively merging smaller clusters
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
44.222.146.114