List of Figures

Chapter 1. Philosophies of data science

Figure 1.1. Some stereotypical perspectives on data science

Figure 1.2. The data science process

Chapter 2. Setting goals by asking good questions

Figure 2.1. The first step of the preparation phase of the data science process: setting goals

Figure 2.2. The recipe for a useful answer in a data science project

Chapter 3. Data all around us: the virtual wilderness

Figure 3.1. The second step of the preparation phase of the data science process: exploring available data

Figure 3.2. We’re currently in the refinement phase of big data collection and use and in the widespread adoption phase of statistical analysis of big data.

Figure 3.3. Three ways a data scientist might access data: from a file system, database, or API

Chapter 4. Data wrangling: from capture to domestication

Figure 4.1. The third step of the preparation phase of the data science process: data wrangling

Chapter 5. Data assessment: poking and prodding

Figure 5.1. The fourth and final step of the preparation phase of the data science process: assessing available data and progress so far

Figure 5.2. Two graphs redrawn from Klimt and Yang’s “The Enron Corpus: A New Dataset for Email Classification Research” (published by Springer in Machine Learning: ECML 2004).

Figure 5.3. The logarithms of the best men’s 400 m performances of all time seem to fit the tail of a normal distribution.

Figure 5.4. A plot of three classes, given by shape, in two dimensions

Chapter 6. Developing a plan

Figure 6.1. The first step of the build phase of the data science process: planning

Figure 6.2. A flowchart showing a possible plan for developing a beer recommendation application

Figure 6.3. A flowchart showing the basic plan for my gene interaction project

Figure 6.4. A flowchart showing the basic plan for my project involving the analysis of track and field performances

Figure 6.5. A flowchart showing the basic plan for the Enron project

Chapter 7. Statistics and modeling: concepts and foundations

Figure 7.1. An important aspect of the build phase of the data science process: statistical data analysis

Figure 7.2. A representation of a linear model (line) and some data (dots) that the model attempts to describe. The line is a mathematical model, and its optimal parameters—slope and intercept—can be found using statistical modeling techniques.

Chapter 8. Software: statistics in action

Figure 8.1. An important aspect of the build phase of the data science process: statistical software and engineering

Figure 8.2. First page of the spreadsheet I used to simulate the management of an apartment building in my college finance class

Chapter 9. Supplementary software: bigger, faster, more efficient

Figure 9.1. An important aspect of the build phase of the data science process: using supplementary software to optimize the product

Chapter 10. Plan execution: putting it all together

Figure 10.1. The final step of the build phase of the data science process: executing the plan efficiently and carefully

Figure 10.2. A directed acyclic graph (DAG) representing a model of the comparison of gene expression measurements based on different laboratory techniques

Figure 10.3. The main table of results for the microarray protocol comparison project, as clipped from a draft submitted to a scientific journal

Chapter 11. Delivering a product

Figure 11.1. The first step of the finishing phase of the data science process: product delivery

Figure 11.2. The inverted pyramid of journalism

Chapter 12. After product delivery: problems and revisions

Figure 12.1. The second step of the finishing phase of the data science process: revising the product after initial delivery to the customer

Chapter 13. Wrapping up: putting the project away

Figure 13.1. The final step of the finishing phase and the final step overall of the data science process: wrapping up the project neatly
