Bisecting K-means clustering of the neighborhood using Spark MLlib

In the previous section, we saw how to cluster similar houses together to determine the neighborhood. Bisecting K-means is similar to regular K-means, except that model training takes different parameters, as follows:

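import org.apache.spark.mllib.clustering.BisectingKMeans

// Cluster the data into five clusters using bisecting K-means
val bkm = new BisectingKMeans()
  .setK(5) // Desired number of leaf clusters (groups of similar houses)
  .setMaxIterations(20) // Maximum number of K-means iterations per split
  .setSeed(12345) // Fixed seed so that results are reproducible
val model = bkm.run(landRDD) // landRDD: the RDD of feature vectors from the previous section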
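To make the training call easier to try in isolation, here is a minimal, self-contained sketch that runs the same API on a tiny synthetic dataset. The object name BisectingKMeansSketch, the helper train, the local master setting, and the six sample points are all assumptions made for illustration; they are not part of the housing example, where landRDD would be passed in instead.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.{BisectingKMeans, BisectingKMeansModel}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD

// Hypothetical helper object, used only for this illustration.
object BisectingKMeansSketch {

  // Trains a bisecting K-means model with the same settings as in the text.
  def train(points: RDD[Vector], k: Int): BisectingKMeansModel =
    new BisectingKMeans()
      .setK(k)              // Desired number of leaf clusters
      .setMaxIterations(20) // Maximum K-means iterations per split
      .setSeed(12345L)      // Fixed seed for reproducibility
      .run(points)

  def main(args: Array[String]): Unit = {
    // Local Spark context (assumed setup for a quick experiment).
    val conf = new SparkConf().setAppName("BisectingKMeansSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Synthetic stand-in for landRDD: two well-separated groups of points.
    val points = sc.parallelize(Seq(
      Vectors.dense(1.0, 1.0), Vectors.dense(1.2, 0.8), Vectors.dense(0.9, 1.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 8.8), Vectors.dense(8.9, 9.2)
    )).cache()

    val model = train(points, k = 2)

    // Inspect the learned cluster centers.
    model.clusterCenters.zipWithIndex.foreach { case (center, i) =>
      println(s"Cluster $i center: $center")
    }

    // Assign a new point to one of the learned clusters.
    val newPoint = Vectors.dense(8.5, 9.5)
    println(s"Point $newPoint belongs to cluster ${model.predict(newPoint)}")

    sc.stop()
  }
}
```

The clusterCenters and predict calls work the same way on the model trained from landRDD, which is one way to see which neighborhood cluster a given house falls into.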

You should refer to the previous example and reuse its steps to prepare the training data. Now let's evaluate the clustering by computing the Within Set Sum of Squared Errors (WSSSE) as follows:

val wssse = model.computeCost(landRDD) // Sum of squared distances from each point to its cluster center
println("Within-Cluster Sum of Squares = " + wssse) // Lower is better

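You should observe the following output: Within-Cluster Sum of Squares = 2.096980212594632E11. The absolute value depends on the scale of the features, so it is most useful for comparing models trained on the same data. For further analysis, please refer to step 5 in the previous section.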
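To clarify what this cost measures, the following plain-Scala sketch (no Spark required) computes a sum of squared distances on a toy dataset. The object WssseSketch and its functions are hypothetical names introduced here, and the sketch assumes each point is assigned to its nearest center. The bisecting model assigns points by descending its cluster tree, so treat this as an illustration of the quantity rather than as the library's implementation.

```scala
// Plain-Scala illustration of a within-cluster sum of squared errors.
// All names here are hypothetical and exist only for this sketch.
object WssseSketch {
  type Point = Array[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def squaredDistance(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Sum, over all points, of the squared distance to the nearest center
  // (assumption: nearest-center assignment).
  def wssse(points: Seq[Point], centers: Seq[Point]): Double =
    points.map(p => centers.map(c => squaredDistance(p, c)).min).sum

  def main(args: Array[String]): Unit = {
    val points  = Seq(Array(0.0, 0.0), Array(2.0, 0.0), Array(10.0, 10.0))
    val centers = Seq(Array(1.0, 0.0), Array(10.0, 10.0))
    // Contributions: 1.0 + 1.0 + 0.0
    println("WSSSE = " + wssse(points, centers)) // prints "WSSSE = 2.0"
  }
}
```

Because every term is a squared distance in the original feature units, features with large magnitudes dominate the total, which is why unscaled data can yield very large values.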
