
Contents
Preface
  Conventions Used in This Book
  Using Code Examples
  O’Reilly Safari
  How to Contact Us
  Acknowledgments
1. Probably Approximately Correct Software
  Writing Software Right
    SOLID
      Single Responsibility Principle
      Open/Closed Principle
      Liskov Substitution Principle
      Interface Segregation Principle
      Dependency Inversion Principle
    Testing or TDD
    Refactoring
  Writing the Right Software
    Writing the Right Software with Machine Learning
    What Exactly Is Machine Learning?
    The High Interest Credit Card Debt of Machine Learning
    SOLID Applied to Machine Learning
      SRP
      OCP
      LSP
      ISP
      DIP
    Machine Learning Code Is Complex but Not Impossible
    TDD: Scientific Method 2.0
    Refactoring Our Way to Knowledge
  The Plan for the Book
2. A Quick Introduction to Machine Learning
  What Is Machine Learning?
  Supervised Learning
  Unsupervised Learning
  Reinforcement Learning
  What Can Machine Learning Accomplish?
  Mathematical Notation Used Throughout the Book
  Conclusion
3. K-Nearest Neighbors
  How Do You Determine Whether You Want to Buy a House?
  How Valuable Is That House?
  Hedonic Regression
  What Is a Neighborhood?
  K-Nearest Neighbors
  Mr. K’s Nearest Neighborhood
  Distances
    Triangle Inequality
    Geometrical Distance
      Cosine similarity
    Computational Distances
      Manhattan distance
      Levenshtein distance
    Statistical Distances
      Mahalanobis distance
      Jaccard distance
  Curse of Dimensionality
  How Do We Pick K?
    Guessing K
    Heuristics for Picking K
      Use coprime class and K combinations
      Choose a K that is greater than or equal to the number of classes plus one
      Choose a K that is low enough to avoid noise
    Algorithms for picking K
  Valuing Houses in Seattle
    About the Data
    General Strategy
    Coding and Testing Design
    KNN Regressor Construction
    KNN Testing
  Conclusion
4. Naive Bayesian Classification
  Using Bayes’ Theorem to Find Fraudulent Orders
  Conditional Probabilities
  Probability Symbols
  Inverse Conditional Probability (aka Bayes’ Theorem)
  Naive Bayesian Classifier
    The Chain Rule
  Naiveté in Bayesian Reasoning
    Pseudocount
  Spam Filter
    Setup Notes
    Coding and Testing Design
    Data Source
    Email Class
    Tokenization and Context
    SpamTrainer
      Storing training data
      Building the Bayesian classifier
      Calculating a classification
    Error Minimization Through Cross-Validation
      Minimizing false positives
      Building the two folds
      Cross-validation and error measuring
  Conclusion
5. Decision Trees and Random Forests
  The Nuances of Mushrooms
  Classifying Mushrooms Using a Folk Theorem
  Finding an Optimal Switch Point
    Information Gain
    GINI Impurity
    Variance Reduction
  Pruning Trees
    Ensemble Learning
      Bagging
      Random forests
  Writing a Mushroom Classifier
    Coding and testing design
    MushroomProblem
    Testing
  Conclusion
6. Hidden Markov Models
  Tracking User Behavior Using State Machines
  Emissions/Observations of Underlying States
  Simplification Through the Markov Assumption
    Using Markov Chains Instead of a Finite State Machine
  Hidden Markov Model
  Evaluation: Forward-Backward Algorithm
    Mathematical Representation of the Forward-Backward Algorithm
    Using User Behavior
  The Decoding Problem Through the Viterbi Algorithm
  The Learning Problem
  Part-of-Speech Tagging with the Brown Corpus
    Setup Notes
    Coding and Testing Design
    The Seam of Our Part-of-Speech Tagger: CorpusParser
    Writing the Part-of-Speech Tagger
    Cross-Validating to Get Confidence in the Model
    How to Make This Model Better
  Conclusion
7. Support Vector Machines
  Customer Happiness as a Function of What They Say
    Sentiment Classification Using SVMs
  The Theory Behind SVMs
    Decision Boundary
    Maximizing Boundaries
    Kernel Trick: Feature Transformation
    Optimizing with Slack
  Sentiment Analyzer
    Setup Notes
    Coding and Testing Design
    SVM Testing Strategies
    Corpus Class
    CorpusSet Class
    Model Validation and the Sentiment Classifier
  Aggregating Sentiment
    Exponentially Weighted Moving Average
  Mapping Sentiment to Bottom Line
  Conclusion
8. Neural Networks
  What Is a Neural Network?
  History of Neural Nets
  Boolean Logic
  Perceptrons
  How to Construct Feed-Forward Neural Nets
    Input Layer
      Standard inputs
      Symmetric inputs
    Hidden Layers
    Neurons
      Activation Functions
    Output Layer
    Training Algorithms
      The Delta Rule
      Back Propagation
      QuickProp
      RProp
  Building Neural Networks
    How Many Hidden Layers?
    How Many Neurons for Each Layer?
    Tolerance for Error and Max Epochs
  Using a Neural Network to Classify a Language
    Setup Notes
    Coding and Testing Design
    The Data
    Writing the Seam Test for Language
    Cross-Validating Our Way to a Network Class
    Tuning the Neural Network
    Precision and Recall for Neural Networks
    Wrap-Up of Example
  Conclusion
9. Clustering
  Studying Data Without Any Bias
  User Cohorts
  Testing Cluster Mappings
    Fitness of a Cluster
    Silhouette Coefficient
    Comparing Results to Ground Truth
  K-Means Clustering
    The K-Means Algorithm
    Downside of K-Means Clustering
  EM Clustering
    Algorithm
      Expectation
      Maximization
  The Impossibility Theorem
  Example: Categorizing Music
    Setup Notes
    Gathering the Data
    Coding Design
    Analyzing the Data with K-Means
    EM Clustering Our Data
    The Results from the EM Jazz Clustering
  Conclusion
10. Improving Models and Data Extraction
  Debate Club
  Picking Better Data
    Feature Selection
      Exhaustive Search
      Random Feature Selection
      A Better Feature Selection Algorithm
      Minimum Redundancy Maximum Relevance Feature Selection
  Feature Transformation and Matrix Factorization
    Principal Component Analysis
    Independent Component Analysis
  Ensemble Learning
    Bagging
    Boosting
  Conclusion
11. Putting It Together: Conclusion
  Machine Learning Algorithms Revisited
  How to Use This Information to Solve Problems
  What’s Next for You?
Index