Drawing a t-SNE plot for our data

Let's first reorder the data points according to the handwritten numbers:

import numpy as np
X = np.vstack([digits.data[digits.target==i] for i in range(10)])
y = np.hstack([digits.target[digits.target==i] for i in range(10)])

y will become array([0, 0, 0, ..., 9, 9, 9]).
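
As a cross-check on the reordering, the same grouping can be obtained with a stable sort on the labels. The following is a minimal sketch on a small made-up array (toy_data and toy_target are hypothetical stand-ins for digits.data and digits.target), not the approach used in this recipe:

```python
import numpy as np

# Hypothetical stand-ins for digits.data and digits.target
toy_target = np.array([2, 0, 1, 0, 2, 1])
toy_data = np.arange(12).reshape(6, 2)  # one row per sample

# Stacking approach, as in the recipe (3 classes here instead of 10)
X_stack = np.vstack([toy_data[toy_target == i] for i in range(3)])
y_stack = np.hstack([toy_target[toy_target == i] for i in range(3)])

# A stable argsort keeps the original order of samples within each class
order = np.argsort(toy_target, kind='stable')
X_sort = toy_data[order]
y_sort = toy_target[order]

# True when both approaches produce identical rows and labels
same_order = np.array_equal(X_stack, X_sort) and np.array_equal(y_stack, y_sort)
print(same_order, y_sort)
```

Both approaches group the samples by label while preserving their original order within each group, so either could be used before plotting.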

Note that the t-SNE transformation can take minutes to compute on a regular laptop, although the command itself is simple to run. We will first try running t-SNE with 250 iterations:

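%%time
#Here we run tSNE with 250 iterations and time it; the magic must be the first line of the cell, and %%time (unlike %%timeit) keeps the result
from sklearn.manifold import TSNE
tsne_iter_250 = TSNE(init='pca',method='exact',n_components=2,n_iter=250).fit_transform(X) #n_iter is named max_iter in scikit-learn 1.5 and later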
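
The cell magic above only works in IPython or Jupyter. Outside a notebook, a single run can be timed with the standard library. This is a minimal sketch; the helper name timed is hypothetical and not part of scikit-learn, and a cheap built-in function stands in for the t-SNE call:

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Demonstration with a cheap stand-in function
result, elapsed = timed(sum, range(1000))
print(result, f"{elapsed:.6f} s")

# Hypothetical use with the recipe's objects (not run here):
# tsne = TSNE(init='pca', method='exact', n_components=2, n_iter=250)
# tsne_iter_250, seconds = timed(tsne.fit_transform, X)
```

Because the function is called exactly once, the returned result can be reused for plotting, at the cost of a noisier timing than an average over repeated runs.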

Let's draw a scatter plot to see how the data cluster:

#We import the pandas and matplotlib libraries
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
matplotlib.style.use('seaborn') #this style is named 'seaborn-v0_8' in matplotlib 3.6 and later
#Here we plot the tSNE results in a reduced two-dimensional space
df = pd.DataFrame(tsne_iter_250)
plt.scatter(df[0],df[1],c=y,cmap='tab10')
plt.show()

We can see that the clusters are not well separated at 250 iterations:

Let's now try running with 2000 iterations:

#Here we run tSNE for 2000 iterations
tsne_iter_2000 = TSNE(init='pca',method='exact',n_components=2,n_iter=2000).fit_transform(X)
#Here we plot the figure
df2 = pd.DataFrame(tsne_iter_2000)
plt.scatter(df2[0],df2[1],c=y,cmap='tab10')
plt.show()

As seen from the following screenshot, the samples now appear as 10 distinct clusters. Running 2000 iterations gives far more satisfying results than 250:
