Density-based clustering algorithms

In density-based methods, clusters are considered as regions where the multiple objects' density is high, which are separated by regions with a low density of objects.

The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is one of the first density clustering algorithms. The basis of this algorithm is several statements, detailed as follows:

  • The  of an object is a neighborhood of the  radius of an object.
  • The root object is an object whose  contains a minimum non-zero number of objects. Assume that this minimum number equals to a predefined value named MinPts.
  • The p object is directly densely accessible from the q object if p is in the  of q and q is the root object.
  • The p object is densely accessible from the q object for the given  and MinPts if there is a sequence of  objects, where  and , such that  is directly densely accessible from , .
  • The p object is densely connected to the q object for the given  and MinPts if there is an o object such that p and q are densely accessible from o.

The DBSCAN algorithm checks the neighborhood of each object to search for clusters. If the  of the p object contains more points than MinPts, then a new cluster is created with the p object as a root object. DBSCAN then iteratively collects objects directly densely accessible from root objects, which can lead to the union of several densely accessible clusters. The process ends when no new objects can be added to any cluster.

Unlike the partition-based methods, DBSCAN does not require the number of clusters to be specified in advance; it only requires the values of the  and MinPts parameters, which directly affect the result of clustering. The optimal values of these parameters are difficult to determine, especially for multidimensional data spaces. Also, the distribution of data in such spaces is often asymmetrical, which makes it impossible to use global density parameters for their clustering. For clustering multidimensional data spaces, there is the SUBCLU (Subspace Clustering) algorithm, which is based on the DBSCAN algorithm.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.133.126.199