Summary

In this chapter, we looked at:

  • Partition-based clustering.
  • The k-means algorithm is a partition-based clustering algorithm. The centroids of clusters are defined as a representative of each cluster. In k-means clustering, a set of n data points in a D-dimensional space and an integer k are given. The problem is to distribute a set of k points in the centers to minimize the SSE.
  • The k-medoids algorithm is a partition-based clustering algorithm. The representatives of each resulting clusters are chosen from the dataset itself, that is, the data objects belong to it.
  • CLARA depends on sampling. It draws a sample from the original dataset instead of the entire dataset. PAM is then applied to each sampling. Then, the best result is kept during all the iterations.
  • CLARANS is a clustering algorithm based on randomized search.
  • The affinity propagation clustering algorithm recursively passes affinity messages between objects or points and converges to exemplars adaptively.
  • Spectral clustering is used to construct graph partitions based on eigenvectors of the adjacency matrix.
  • Hierarchical clustering decomposes a dataset D into levels of nested clusters; this is represented by a dendrogram, a tree that iteratively splits the dataset D into smaller subsets. The process stops only after each subset consists of only one object.

The next chapter will cover more advanced topics related to clustering algorithms, density-based algorithms, grid-based algorithms, the EM algorithm, high-dimensional algorithms, constraint-based clustering algorithms, and so on.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.17.174.0