Implementing k-means with scikit-learn

Having developed our own k-means clustering model, we will now see how scikit-learn offers a quicker solution. Perform the following steps:

First, import the KMeans class and initialize a model with three clusters as follows:

>>> from sklearn.cluster import KMeans
>>> kmeans_sk = KMeans(n_clusters=3, random_state=42)

The KMeans class takes in the following important parameters:
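
As a rough illustration, the constructor arguments that are most often adjusted can be written out explicitly. The values below are illustrative choices made for this sketch, not settings prescribed by the text, and the name kmeans_explicit is hypothetical:

```python
from sklearn.cluster import KMeans

# Illustrative values only (assumptions for this sketch)
kmeans_explicit = KMeans(
    n_clusters=3,       # number of clusters (and centroids) to form
    init='k-means++',   # how initial centroids are chosen: 'k-means++' or 'random'
    n_init=10,          # number of runs with different initial centroids; the best run is kept
    max_iter=300,       # maximum number of iterations per run
    tol=1e-4,           # tolerance on centroid movement used to declare convergence
    random_state=42,    # seed, so that repeated runs give the same result
)

# get_params() returns the configured values as a dictionary
params = kmeans_explicit.get_params()
print(params['n_clusters'], params['init'], params['max_iter'])
```

Setting random_state matters here because the outcome of k-means depends on where the initial centroids are placed.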

We then fit the model on the data:

>>> kmeans_sk.fit(X)

After that, we can obtain the clustering results, namely the cluster label assigned to each data sample and the centroid of each cluster:

>>> clusters_sk = kmeans_sk.labels_
>>> centroids_sk = kmeans_sk.cluster_centers_
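
To make the shape and meaning of these two attributes concrete, here is a minimal, self-contained sketch. It assumes a synthetic two-dimensional dataset generated with make_blobs as a stand-in for X, and the names ending in _demo are hypothetical:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for X (assumption): 150 two-dimensional points around 3 centers
X_demo, _ = make_blobs(n_samples=150, centers=3, n_features=2, random_state=42)

# fit() returns the fitted estimator, so the call can be chained
kmeans_demo = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X_demo)

labels_demo = kmeans_demo.labels_              # one integer label per sample
centroids_demo = kmeans_demo.cluster_centers_  # one row of coordinates per cluster

print(labels_demo.shape)     # (150,)
print(centroids_demo.shape)  # (3, 2)

# Number of samples assigned to each cluster
cluster_sizes = np.bincount(labels_demo, minlength=3)

# predict() assigns new points to the nearest centroid;
# each centroid is therefore assigned to its own cluster
nearest = kmeans_demo.predict(centroids_demo)
```

The cluster indices themselves are arbitrary: a different seed can produce the same grouping with the labels permuted.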

Similarly, we plot the clusters along with the centroids:

>>> for i in range(k):
...     cluster_i = np.where(clusters_sk == i)
...     plt.scatter(X[cluster_i, 0], X[cluster_i, 1])
>>> plt.scatter(centroids_sk[:, 0], centroids_sk[:, 1],
...             marker='*', s=200, c='#050505')
>>> plt.show()

This will result in the following output:
