Understanding t-Distributed Stochastic Neighbor Embedding

The t-SNE method was proposed by van der Maaten and Hinton in 2008 in the paper Visualizing Data using t-SNE. It is a nonlinear dimension reduction method designed for visualizing high-dimensional data. t-SNE describes the similarities between data points with probability distributions, and can use random walks on neighborhood graphs, to find the structure within the data. The mathematical details of t-SNE are beyond the scope of this book, and readers are advised to read the paper for more details.
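
For readers who want a rough feel for the probability distributions mentioned above without working through the paper, the following is a minimal sketch in plain Python. It is not the full algorithm: it uses one fixed Gaussian bandwidth (the sigma parameter, an assumption made here for brevity) where t-SNE tunes a bandwidth per point from a perplexity setting, and it only scores two hand-made embeddings rather than optimizing one. The function names and the toy points are hypothetical and chosen for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def high_dim_affinities(points, sigma=1.0):
    """Joint probabilities p_ij built from Gaussian kernels in the original space.

    Simplification: a single fixed sigma is used for every point.
    """
    n = len(points)
    cond = []
    for i in range(n):
        weights = [
            0.0 if j == i
            else math.exp(-sq_dist(points[i], points[j]) / (2 * sigma ** 2))
            for j in range(n)
        ]
        total = sum(weights)
        cond.append([w / total for w in weights])  # conditional p_{j|i}
    # Symmetrize the conditionals so that the whole matrix sums to 1.
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)]
            for i in range(n)]

def low_dim_affinities(embedding):
    """Joint probabilities q_ij built from a Student-t kernel (1 degree of freedom)."""
    n = len(embedding)
    weights = [
        [0.0 if i == j else 1.0 / (1.0 + sq_dist(embedding[i], embedding[j]))
         for j in range(n)]
        for i in range(n)
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(P || Q), the cost that t-SNE minimizes."""
    return sum(
        pij * math.log(pij / qij)
        for prow, qrow in zip(p, q)
        for pij, qij in zip(prow, qrow)
        if pij > 0 and qij > 0
    )

# Toy data: two well-separated pairs of points in three dimensions.
points = [(0, 0, 0), (1, 0, 0), (10, 10, 10), (11, 10, 10)]
p = high_dim_affinities(points)

# One embedding keeps each pair together; the other swaps the two middle points.
good_embedding = [(0, 0), (1, 0), (10, 0), (11, 0)]
bad_embedding = [(0, 0), (10, 0), (1, 0), (11, 0)]

kl_good = kl_divergence(p, low_dim_affinities(good_embedding))
kl_bad = kl_divergence(p, low_dim_affinities(bad_embedding))
print("KL for the embedding that keeps neighbors together:", kl_good)
print("KL for the embedding that separates neighbors:", kl_bad)
```

The embedding that keeps neighboring points together gets the lower cost. t-SNE searches for a low-dimensional layout that makes this cost as small as it can.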

In short, t-SNE is a way to capture nonlinear relationships in high-dimensional data. This is particularly useful when we are trying to extract features from high-dimensional matrices, such as those that arise in image processing, biological data, and network information. It enables us to reduce high-dimensional data to two or three dimensions. One interesting feature of t-SNE is that it is stochastic: each run can produce a different final result, and each of these results is an equally valid representation of the data. Therefore, to get the best performance from t-SNE dimension reduction, it is advisable to first perform PCA dimension reduction on a big dataset and then pass the PCA dimensions to t-SNE for the subsequent dimension reduction. This way, you get more consistent and replicable results.
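
The PCA-then-t-SNE workflow described above can be sketched as follows. This is only an illustration under several assumptions that do not come from the text: it uses Python with scikit-learn, the bundled handwritten digits dataset, 30 PCA components, a perplexity of 30, and the helper name embed. Treat all of these as placeholder choices to adapt to your own data.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Illustrative data: the first 500 handwritten digit images,
# each flattened to 64 pixel features.
X = load_digits().data[:500]

# Step 1: PCA compresses the 64 features to 30 components.
# The value 30 is an arbitrary choice for this sketch.
X_pca = PCA(n_components=30, random_state=0).fit_transform(X)

def embed(seed):
    """Step 2: run t-SNE on the PCA output to get a 2-D embedding.

    The seed controls the random parts of the optimization, so changing
    it can change the layout that comes back.
    """
    tsne = TSNE(n_components=2, perplexity=30, random_state=seed)
    return tsne.fit_transform(X_pca)

emb_a = embed(0)
emb_b = embed(1)

print("PCA output shape:", X_pca.shape)
print("t-SNE output shape:", emb_a.shape)
```

Plotting emb_a and emb_b side by side is one way to see the stochastic behavior: the two layouts may differ in orientation and spacing. Fixing the seed is a common way to make a particular run repeatable.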
