4.2 What Is Data?

Every year, thousands of earthquakes occur around the world. Most of them are so mild that they are barely noticed except by scientists with monitoring equipment. If we consider only those earthquakes that people feel, then perhaps 35 occur on a typical day. Some days, there may be only 10; on other days, there may be 40 or 50.

What we have just described is one way that earthquake occurrence might be reported through data: an assortment of items, often numeric, that have been observed, measured, or collected by some means. These data items pertain to some experiment, event, or activity that we are interested in exploring. Data, sometimes referred to as “raw data,” is the starting point for analysis aimed at discovering underlying characteristics. Data can contain information in various forms, and that information can be used in a variety of ways. Sometimes it allows us to make generalizations about the data items. We may also be able to make predictions about future events based on the data. This kind of analysis is based on the mathematical science of statistics.

This chapter centers on data, information, and statistics. We will focus on ways that a programming language like Python can help us perform some of the basic data processing tasks that are commonly carried out with large amounts of data.
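
As a small preview of this kind of processing, the sketch below summarizes one week of daily counts of felt earthquakes. The values in `daily_counts` are invented for illustration and are not real measurements. The variable names are our own, and the example relies only on Python's built-in functions and its standard `statistics` module.

```python
import statistics

# Hypothetical daily counts of felt earthquakes for one week.
# These values are made up for illustration; they are not real observations.
daily_counts = [35, 10, 42, 50, 28, 37, 40]

# Simple summaries of the raw data.
smallest = min(daily_counts)                     # quietest day
largest = max(daily_counts)                      # busiest day
mean_count = statistics.mean(daily_counts)       # arithmetic average
median_count = statistics.median(daily_counts)   # middle value when sorted

print("Fewest in a day:", smallest)
print("Most in a day:", largest)
print("Mean per day:", round(mean_count, 1))
print("Median per day:", median_count)
```

Even this short program turns a list of raw numbers into information: the range of daily activity and two different notions of a "typical" day.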
