7.7 Summary

In this chapter we implemented the K-means algorithm, a cluster analysis technique that is simple to implement but can be sensitive to the choice of initial centroids and to outliers in the data. The approach applies to a wide variety of application domains and runs fairly efficiently. Other cluster analysis techniques exist, and we urge you to try some of them for comparison.

As part of the implementation, we revisited the notion of iteration and took a more detailed look at the while statement. We used lists (specifically, parallel lists) and dictionaries to organize our data. Finally, we created a visualization of our cluster analysis results.
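As a compact reminder of how these pieces fit together, the sketch below combines a while loop, a list of centroids, and a dictionary of clusters in one function. It is a minimal illustration under stated assumptions, not the chapter's own listing: the names `euclidean` and `k_means`, the `max_passes` and `seed` parameters, and the choice to stop when assignments no longer change are all hypothetical.

```python
import math
import random


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_means(points, k, max_passes=100, seed=0):
    """Group points into k clusters and return (centroids, clusters).

    clusters is a dictionary that maps a centroid index to a list of
    indices into points. All names here are illustrative assumptions.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)       # initialization: k data points
    clusters = {}
    passes = 0
    changed = True
    while changed and passes < max_passes:  # indefinite iteration
        # Assign every point to its nearest centroid.
        new_clusters = {i: [] for i in range(k)}
        for idx, point in enumerate(points):
            distances = [euclidean(point, c) for c in centroids]
            nearest = distances.index(min(distances))
            new_clusters[nearest].append(idx)

        # Change of state: the loop ends once assignments stop moving.
        changed = new_clusters != clusters
        clusters = new_clusters

        # Recompute each centroid as the mean of its members.
        for i, members in clusters.items():
            if members:  # an empty cluster keeps its old centroid
                dims = zip(*(points[m] for m in members))
                centroids[i] = tuple(sum(d) / len(members) for d in dims)
        passes += 1
    return centroids, clusters
```

Calling `k_means(points, 2)` on a list of coordinate tuples returns the final centroids along with the dictionary of cluster memberships. The `max_passes` cap is a guard against an infinite loop should the assignments never settle.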

Key Terms

  • centroid

  • change of state

  • cluster analysis

  • clusters

  • condition

  • data mining

  • definite iteration

  • Euclidean distance

  • indefinite iteration

  • infinite loop

  • initialization

  • K-means algorithm

  • outliers

  • visualization

Python Keywords

  • while

Programming Exercises

  7.1  Find another data set that interests you and use the clustering techniques described in this chapter on that data. Possibilities include sports statistics, weather data, and medical data.

  7.2  Research other cluster analysis algorithms, and implement them on the data from this chapter. How do the results differ?

Design Credits: Calculator Icon made by Smashicons from www.flaticon.com