Index
Title Page
Copyright and Credits
Hands-On Exploratory Data Analysis with Python
About Packt
Why subscribe?
Contributors
About the authors
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Section 1: The Fundamentals of EDA
Exploratory Data Analysis Fundamentals
Understanding data science
The significance of EDA
Steps in EDA
Making sense of data
Numerical data
Discrete data
Continuous data
Categorical data
Measurement scales
Nominal
Ordinal
Interval
Ratio
Comparing EDA with classical and Bayesian analysis
Software tools available for EDA
Getting started with EDA
NumPy
Pandas
SciPy
Matplotlib
Summary
Further reading
Visual Aids for EDA
Technical requirements
Line chart
Steps involved
Bar charts
Scatter plot
Bubble chart
Scatter plot using seaborn
Area plot and stacked plot
Pie chart
Table chart
Polar chart
Histogram
Lollipop chart
Choosing the best chart
Other libraries to explore
Summary
Further reading
EDA with Personal Email
Technical requirements
Loading the dataset
Data transformation
Data cleansing
Loading the CSV file
Converting the date
Removing NaN values
Applying descriptive statistics
Data refactoring
Dropping columns
Refactoring timezones
Data analysis
Number of emails
Time of day
Average emails per day and hour
Number of emails per day
Most frequently used words
Summary
Further reading
Data Transformation
Technical requirements
Background
Merging database-style dataframes
Concatenating along with an axis
Using df.merge with an inner join
Using the pd.merge() method with a left join
Using the pd.merge() method with a right join
Using pd.merge() methods with outer join
Merging on index
Reshaping and pivoting
Transformation techniques
Performing data deduplication
Replacing values
Handling missing data
NaN values in pandas objects
Dropping missing values
Dropping by rows
Dropping by columns
Mathematical operations with NaN
Filling missing values
Backward and forward filling
Interpolating missing values
Renaming axis indexes
Discretization and binning
Outlier detection and filtering
Permutation and random sampling
Random sampling without replacement
Random sampling with replacement
Computing indicators/dummy variables
String manipulation
Benefits of data transformation
Challenges
Summary
Further reading
Section 2: Descriptive Statistics
Descriptive Statistics
Technical requirements
Understanding statistics
Distribution function
Uniform distribution
Normal distribution
Exponential distribution
Binomial distribution
Cumulative distribution function
Descriptive statistics
Measures of central tendency
Mean/average
Median
Mode
Measures of dispersion
Standard deviation
Variance
Skewness
Kurtosis
Types of kurtosis
Calculating percentiles
Quartiles
Visualizing quartiles
Summary
Further reading
Grouping Datasets
Technical requirements
Understanding groupby()
Groupby mechanics
Selecting a subset of columns
Max and min
Mean
Data aggregation
Group-wise operations
Renaming grouped aggregation columns
Group-wise transformations
Pivot tables and cross-tabulations
Pivot tables
Cross-tabulations
Summary
Further reading
Correlation
Technical requirements
Introducing correlation
Types of analysis
Understanding univariate analysis
Understanding bivariate analysis
Understanding multivariate analysis
Discussing multivariate analysis using the Titanic dataset
Outlining Simpson's paradox
Correlation does not imply causation
Summary
Further reading
Time Series Analysis
Technical requirements
Understanding the time series dataset
Fundamentals of TSA
Univariate time series
Characteristics of time series data
TSA with Open Power System Data
Data cleaning
Time-based indexing
Visualizing time series
Grouping time series data
Resampling time series data
Summary
Further reading
Section 3: Model Development and Evaluation
Hypothesis Testing and Regression
Technical requirements
Hypothesis testing
Hypothesis testing principle
statsmodels library
Average reading time
Types of hypothesis testing
T-test
p-hacking
Understanding regression
Types of regression
Simple linear regression
Multiple linear regression
Nonlinear regression
Model development and evaluation
Constructing a linear regression model
Model evaluation
Computing accuracy
Understanding accuracy
Implementing a multiple linear regression model
Summary
Further reading
Model Development and Evaluation
Technical requirements
Types of machine learning
Understanding supervised learning
Regression
Classification
Understanding unsupervised learning
Applications of unsupervised learning
Clustering using MiniBatch K-means clustering
Extracting keywords
Plotting clusters
Word cloud
Understanding reinforcement learning
Difference between supervised and reinforcement learning
Applications of reinforcement learning
Unified machine learning workflow
Data preprocessing
Data collection
Data analysis
Data cleaning, normalization, and transformation
Data preparation
Training sets and corpus creation
Model creation and training
Model evaluation
Best model selection and evaluation
Model deployment
Summary
Further reading
EDA on Wine Quality Data Analysis
Technical requirements
Disclosing the wine quality dataset
Loading the dataset
Descriptive statistics
Data wrangling
Analyzing red wine
Finding correlated columns
Alcohol versus quality
Alcohol versus pH
Analyzing white wine
Red wine versus white wine
Adding a new attribute
Converting into a categorical column
Concatenating dataframes
Grouping columns
Univariate analysis
Multivariate analysis on the combined dataframe
Discrete categorical attributes
3-D visualization
Model development and evaluation
Summary
Further reading
Appendix
String manipulation
Creating strings
Accessing characters in Python
String slicing
Deleting/updating from a string
Escape sequencing in Python
Formatting strings
Using pandas vectorized string functions
Using string functions with a pandas DataFrame
Using regular expressions
Further reading
Other Books You May Enjoy
Leave a review - let other readers know what you think
