Coding a hierarchical clustering algorithm

Let's look at how to code a hierarchical clustering algorithm in Python:

  1. First, we import AgglomerativeClustering from the sklearn.cluster module, along with the pandas and numpy packages:
from sklearn.cluster import AgglomerativeClustering
import pandas as pd
import numpy as np
  2. Then we create 20 data points in a two-dimensional problem space:
dataset = pd.DataFrame({
    'x': [11, 21, 28, 17, 29, 33, 24, 45, 45, 52, 51, 52, 55, 53, 55, 61, 62, 70, 72, 10],
    'y': [39, 36, 30, 52, 53, 46, 55, 59, 63, 70, 66, 63, 58, 23, 14, 8, 18, 7, 24, 10]
})



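  3. Then we create the hierarchical cluster by specifying the hyperparameters. We use the fit_predict function to run the algorithm and return a cluster label for each data point:
# Ward linkage only supports Euclidean distance, which is the default,
# so the affinity argument (renamed to metric in newer scikit-learn releases) is omitted.
cluster = AgglomerativeClustering(n_clusters=2, linkage='ward')
cluster.fit_predict(dataset)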
  4. Now let's look at the assignment of each data point to one of the two clusters that were created:
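The output for this step is not reproduced here. As a minimal, self-contained sketch, the assignments could be inspected as follows. The column name `cluster` and the variable names `labels` and `result` are illustrative choices, not taken from the original text, and the distance argument is left at its default because Ward linkage works with Euclidean distance only.

```python
import pandas as pd
from sklearn.cluster import AgglomerativeClustering

# Same 20 two-dimensional points as in the steps above
dataset = pd.DataFrame({
    'x': [11, 21, 28, 17, 29, 33, 24, 45, 45, 52, 51, 52, 55, 53, 55, 61, 62, 70, 72, 10],
    'y': [39, 36, 30, 52, 53, 46, 55, 59, 63, 70, 66, 63, 58, 23, 14, 8, 18, 7, 24, 10]
})

# Ward linkage with two clusters; Euclidean distance is the default
cluster = AgglomerativeClustering(n_clusters=2, linkage='ward')
labels = cluster.fit_predict(dataset)

# fit_predict returns the same array that is stored in cluster.labels_
# ('cluster' is an illustrative column name)
result = dataset.assign(cluster=labels)
print(result)
```

Each row of the printed frame shows a point's coordinates next to its cluster label (0 or 1). The label values themselves are arbitrary identifiers; only the grouping matters.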

You can see that the cluster assignments produced by the hierarchical and k-means algorithms are very similar.
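One way to check this similarity numerically is sketched below. It assumes k-means is run on the same dataset with two clusters; the `n_init` and `random_state` values are assumptions made for reproducibility, not settings from the original text. Because cluster label values are arbitrary, the sketch compares the groupings with the adjusted Rand index rather than comparing labels directly.

```python
import pandas as pd
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import adjusted_rand_score

# Same 20 two-dimensional points as in the steps above
dataset = pd.DataFrame({
    'x': [11, 21, 28, 17, 29, 33, 24, 45, 45, 52, 51, 52, 55, 53, 55, 61, 62, 70, 72, 10],
    'y': [39, 36, 30, 52, 53, 46, 55, 59, 63, 70, 66, 63, 58, 23, 14, 8, 18, 7, 24, 10]
})

# Hierarchical (Ward) clustering with two clusters
hier_labels = AgglomerativeClustering(n_clusters=2, linkage='ward').fit_predict(dataset)

# k-means with two clusters; n_init and random_state are assumed values
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(dataset)

# A score of 1.0 means identical groupings; values near 0 mean chance-level agreement
ari = adjusted_rand_score(hier_labels, kmeans_labels)
print(f"Adjusted Rand index: {ari:.2f}")

# Cross-tabulation of the two assignments
print(pd.crosstab(hier_labels, kmeans_labels,
                  rownames=['hierarchical'], colnames=['k-means']))
```

In the cross-tabulation, strong agreement shows up as most points falling into one cell per row, whichever label numbers each algorithm happened to assign.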
