Cover
Table of Contents
Data Science with Python
Meet Your Course Guide
What's so cool about Data Science?
Course Structure
Course Journey
The Course Roadmap and Timeline
1. Course Module 1: Python Fundamentals
1. Introduction and First Steps – Take a Deep Breath
Enter the Python
About Python
What are the drawbacks?
Who is using Python today?
Setting up the environment
What you need for this course
How you can run a Python program
How is Python code organized
Python's execution model
Guidelines on how to write good code
The Python culture
A note on the IDEs
2. Object-oriented Design
Objects and classes
Specifying attributes and behaviors
Hiding details and creating the public interface
Composition
Inheritance
Case study
3. Objects in Python
Modules and packages
Organizing module contents
Who can access my data?
Third-party libraries
Case study
4. When Objects Are Alike
Multiple inheritance
Polymorphism
Abstract base classes
Case study
5. Expecting the Unexpected
Case study
6. When to Use Object-oriented Programming
Adding behavior to class data with properties
Manager objects
Case study
7. Python Data Structures
Tuples and named tuples
Dictionaries
Lists
Sets
Extending built-ins
Queues
Case study
8. Python Object-oriented Shortcuts
An alternative to method overloading
Functions are objects too
Case study
9. Strings and Serialization
Regular expressions
Serializing objects
Case study
10. The Iterator Pattern
Iterators
Comprehensions
Generators
Coroutines
Case study
11. Python Design Patterns I
The observer pattern
The strategy pattern
The state pattern
The singleton pattern
The template pattern
12. Python Design Patterns II
The facade pattern
The flyweight pattern
The command pattern
The abstract factory pattern
The composite pattern
13. Testing Object-oriented Programs
Unit testing
Testing with py.test
Imitating expensive objects
How much testing is enough?
Case study
14. Concurrency
Multiprocessing
Futures
AsyncIO
Case study
2. Course Module 2: Data Analysis
1. Introducing Data Analysis and Libraries
An overview of the libraries in data analysis
Python libraries in data analysis
2. NumPy Arrays and Vectorized Computation
Array functions
Data processing using arrays
Linear algebra with NumPy
NumPy random numbers
3. Data Analysis with pandas
The pandas data structure
The essential basic functionality
Indexing and selecting data
Computational tools
Working with missing data
Advanced uses of pandas for data analysis
4. Data Visualization
Exploring plot types
Legends and annotations
Plotting functions with pandas
Additional Python data visualization tools
5. Time Series
Working with date and time objects
Resampling time series
Downsampling time series data
Upsampling time series data
Timedeltas
Time series plotting
6. Interacting with Databases
Interacting with data in binary format
Interacting with data in MongoDB
Interacting with data in Redis
7. Data Analysis Application Examples
Data aggregation
Grouping data
3. Course Module 3: Data Mining
1. Getting Started with Data Mining
A simple affinity analysis example
A simple classification example
What is classification?
2. Classifying with scikit-learn Estimators
Preprocessing using pipelines
Pipelines
3. Predicting Sports Winners with Decision Trees
Decision trees
Sports outcome prediction
Random forests
4. Recommending Movies Using Affinity Analysis
The movie recommendation problem
The Apriori implementation
Extracting association rules
5. Extracting Features with Transformers
Feature selection
Feature creation
Creating your own transformer
6. Social Media Insight Using Naive Bayes
Text transformers
Naive Bayes
Application
7. Discovering Accounts to Follow Using Graph Mining
Finding subgraphs
8. Beating CAPTCHAs with Neural Networks
Creating the dataset
Training and classifying
Improving accuracy using a dictionary
9. Authorship Attribution
Function words
Support vector machines
Character n-grams
Using the Enron dataset
10. Clustering News Articles
Extracting text from arbitrary websites
Grouping news articles
Clustering ensembles
Online learning
11. Classifying Objects in Images Using Deep Learning
Application scenario and goals
Deep neural networks
GPU optimization
Setting up the environment
Application
12. Working with Big Data
Application scenario and goals
MapReduce
Application
13. Next Steps…
Chapter 2 – Classifying with scikit-learn Estimators
Chapter 3 – Predicting Sports Winners with Decision Trees
Chapter 4 – Recommending Movies Using Affinity Analysis
Chapter 5 – Extracting Features with Transformers
Chapter 6 – Social Media Insight Using Naive Bayes
Chapter 7 – Discovering Accounts to Follow Using Graph Mining
Chapter 8 – Beating CAPTCHAs with Neural Networks
Chapter 9 – Authorship Attribution
Chapter 10 – Clustering News Articles
Chapter 11 – Classifying Objects in Images Using Deep Learning
Chapter 12 – Working with Big Data
More resources
4. Course Module 4: Machine Learning
1. Giving Computers the Ability to Learn from Data
The three different types of machine learning
An introduction to the basic terminology and notations
A roadmap for building machine learning systems
Using Python for machine learning
2. Training Machine Learning Algorithms for Classification
Implementing a perceptron learning algorithm in Python
Adaptive linear neurons and the convergence of learning
3. A Tour of Machine Learning Classifiers Using scikit-learn
First steps with scikit-learn
Modeling class probabilities via logistic regression
Maximum margin classification with support vector machines
Solving nonlinear problems using a kernel SVM
Decision tree learning
K-nearest neighbors – a lazy learning algorithm
4. Building Good Training Sets – Data Preprocessing
Handling categorical data
Partitioning a dataset in training and test sets
Bringing features onto the same scale
Selecting meaningful features
Assessing feature importance with random forests
5. Compressing Data via Dimensionality Reduction
Supervised data compression via linear discriminant analysis
Using kernel principal component analysis for nonlinear mappings
6. Learning Best Practices for Model Evaluation and Hyperparameter Tuning
Using k-fold cross-validation to assess model performance
Debugging algorithms with learning and validation curves
Fine-tuning machine learning models via grid search
Looking at different performance evaluation metrics
7. Combining Different Models for Ensemble Learning
Implementing a simple majority vote classifier
Evaluating and tuning the ensemble classifier
Bagging – building an ensemble of classifiers from bootstrap samples
Leveraging weak learners via adaptive boosting
8. Predicting Continuous Target Variables with Regression Analysis
Exploring the Housing Dataset
Implementing an ordinary least squares linear regression model
Fitting a robust regression model using RANSAC
Evaluating the performance of linear regression models
Using regularized methods for regression
Turning a linear regression model into a curve – polynomial regression
A. Reflect and Test Yourself! Answers
Module 3: Data Mining
Module 4: Machine Learning
B. Bibliography
Index