Title Page
Copyright and Credits
  Hands-On Exploratory Data Analysis with Python
About Packt
  Why subscribe?
Contributors
  About the authors
  About the reviewer
  Packt is searching for authors like you
Preface
  Who this book is for
  What this book covers
  To get the most out of this book
    Download the example code files
    Download the color images
    Conventions used
  Get in touch
    Reviews

Section 1: The Fundamentals of EDA

Exploratory Data Analysis Fundamentals
  Understanding data science
  The significance of EDA
    Steps in EDA
  Making sense of data
    Numerical data
      Discrete data
      Continuous data
    Categorical data
    Measurement scales
      Nominal
      Ordinal
      Interval
      Ratio
  Comparing EDA with classical and Bayesian analysis
  Software tools available for EDA
  Getting started with EDA
    NumPy
    Pandas
    SciPy
    Matplotlib
  Summary
  Further reading

Visual Aids for EDA
  Technical requirements
  Line chart
    Steps involved
  Bar charts
  Scatter plot
    Bubble chart
    Scatter plot using seaborn
  Area plot and stacked plot
  Pie chart
  Table chart
  Polar chart
  Histogram
  Lollipop chart
  Choosing the best chart
  Other libraries to explore
  Summary
  Further reading

EDA with Personal Email
  Technical requirements
  Loading the dataset
  Data transformation
    Data cleansing
    Loading the CSV file
    Converting the date
    Removing NaN values
    Applying descriptive statistics
    Data refactoring
    Dropping columns
    Refactoring timezones
  Data analysis
    Number of emails
    Time of day
    Average emails per day and hour
    Number of emails per day
    Most frequently used words
  Summary
  Further reading

Data Transformation
  Technical requirements
  Background
  Merging database-style dataframes
    Concatenating along with an axis
    Using df.merge with an inner join
    Using the pd.merge() method with a left join
    Using the pd.merge() method with a right join
    Using pd.merge() methods with outer join
    Merging on index
    Reshaping and pivoting
  Transformation techniques
    Performing data deduplication
    Replacing values
    Handling missing data
      NaN values in pandas objects
      Dropping missing values
        Dropping by rows
        Dropping by columns
      Mathematical operations with NaN
      Filling missing values
      Backward and forward filling
      Interpolating missing values
    Renaming axis indexes
    Discretization and binning
    Outlier detection and filtering
    Permutation and random sampling
      Random sampling without replacement
      Random sampling with replacement
    Computing indicators/dummy variables
    String manipulation
  Benefits of data transformation
    Challenges
  Summary
  Further reading

Section 2: Descriptive Statistics

Descriptive Statistics
  Technical requirements
  Understanding statistics
    Distribution function
      Uniform distribution
      Normal distribution
      Exponential distribution
      Binomial distribution
    Cumulative distribution function
  Descriptive statistics
    Measures of central tendency
      Mean/average
      Median
      Mode
    Measures of dispersion
      Standard deviation
      Variance
      Skewness
      Kurtosis
        Types of kurtosis
      Calculating percentiles
      Quartiles
        Visualizing quartiles
  Summary
  Further reading

Grouping Datasets
  Technical requirements
  Understanding groupby()
  Groupby mechanics
    Selecting a subset of columns
    Max and min
    Mean
  Data aggregation
    Group-wise operations
      Renaming grouped aggregation columns
    Group-wise transformations
  Pivot tables and cross-tabulations
    Pivot tables
    Cross-tabulations
  Summary
  Further reading

Correlation
  Technical requirements
  Introducing correlation
  Types of analysis
    Understanding univariate analysis
    Understanding bivariate analysis
    Understanding multivariate analysis
  Discussing multivariate analysis using the Titanic dataset
  Outlining Simpson's paradox
  Correlation does not imply causation
  Summary
  Further reading

Time Series Analysis
  Technical requirements
  Understanding the time series dataset
    Fundamentals of TSA
      Univariate time series
    Characteristics of time series data
  TSA with Open Power System Data
    Data cleaning
    Time-based indexing
    Visualizing time series
    Grouping time series data
    Resampling time series data
  Summary
  Further reading

Section 3: Model Development and Evaluation

Hypothesis Testing and Regression
  Technical requirements
  Hypothesis testing
    Hypothesis testing principle
    statsmodels library
    Average reading time
    Types of hypothesis testing
    T-test
  p-hacking
  Understanding regression
    Types of regression
      Simple linear regression
      Multiple linear regression
      Nonlinear regression
  Model development and evaluation
    Constructing a linear regression model
    Model evaluation
    Computing accuracy
    Understanding accuracy
    Implementing a multiple linear regression model
  Summary
  Further reading

Model Development and Evaluation
  Technical requirements
  Types of machine learning
  Understanding supervised learning
    Regression
    Classification
  Understanding unsupervised learning
    Applications of unsupervised learning
    Clustering using MiniBatch K-means clustering
      Extracting keywords
      Plotting clusters
      Word cloud
  Understanding reinforcement learning
    Difference between supervised and reinforcement learning
    Applications of reinforcement learning
  Unified machine learning workflow
    Data preprocessing
      Data collection
      Data analysis
      Data cleaning, normalization, and transformation
    Data preparation
    Training sets and corpus creation
    Model creation and training
    Model evaluation
    Best model selection and evaluation
    Model deployment
  Summary
  Further reading

EDA on Wine Quality Data Analysis
  Technical requirements
  Disclosing the wine quality dataset
    Loading the dataset
    Descriptive statistics
    Data wrangling
  Analyzing red wine
    Finding correlated columns
    Alcohol versus quality
    Alcohol versus pH
  Analyzing white wine
    Red wine versus white wine
    Adding a new attribute
    Converting into a categorical column
    Concatenating dataframes
    Grouping columns
    Univariate analysis
    Multivariate analysis on the combined dataframe
    Discrete categorical attributes
    3-D visualization
  Model development and evaluation
  Summary
  Further reading

Appendix
  String manipulation
    Creating strings
    Accessing characters in Python
    String slicing
    Deleting/updating from a string
    Escape sequencing in Python
    Formatting strings
  Using pandas vectorized string functions
    Using string functions with a pandas DataFrame
  Using regular expressions
  Further reading

Other Books You May Enjoy
  Leave a review - let other readers know what you think