Index
Half-title page
Title page
Copyright page
Contents
Preface
1 Data Mining
1.1 What is Data Mining?
1.2 Statistical Limits on Data Mining
1.3 Things Useful to Know
1.4 Outline of the Book
1.5 Summary of Chapter 1
1.6 References for Chapter 1
2 MapReduce and the New Software Stack
2.1 Distributed File Systems
2.2 MapReduce
2.3 Algorithms Using MapReduce
2.4 Extensions to MapReduce
2.5 The Communication Cost Model
2.6 Complexity Theory for MapReduce
2.7 Summary of Chapter 2
2.8 References for Chapter 2
3 Finding Similar Items
3.1 Applications of Near-Neighbor Search
3.2 Shingling of Documents
3.3 Similarity-Preserving Summaries of Sets
3.4 Locality-Sensitive Hashing for Documents
3.5 Distance Measures
3.6 The Theory of Locality-Sensitive Functions
3.7 LSH Families for Other Distance Measures
3.8 Applications of Locality-Sensitive Hashing
3.9 Methods for High Degrees of Similarity
3.10 Summary of Chapter 3
3.11 References for Chapter 3
4 Mining Data Streams
4.1 The Stream Data Model
4.2 Sampling Data in a Stream
4.3 Filtering Streams
4.4 Counting Distinct Elements in a Stream
4.5 Estimating Moments
4.6 Counting Ones in a Window
4.7 Decaying Windows
4.8 Summary of Chapter 4
4.9 References for Chapter 4
5 Link Analysis
5.1 PageRank
5.2 Efficient Computation of PageRank
5.3 Topic-Sensitive PageRank
5.4 Link Spam
5.5 Hubs and Authorities
5.6 Summary of Chapter 5
5.7 References for Chapter 5
6 Frequent Itemsets
6.1 The Market-Basket Model
6.2 Market Baskets and the A-Priori Algorithm
6.3 Handling Larger Datasets in Main Memory
6.4 Limited-Pass Algorithms
6.5 Counting Frequent Items in a Stream
6.6 Summary of Chapter 6
6.7 References for Chapter 6
7 Clustering
7.1 Introduction to Clustering Techniques
7.2 Hierarchical Clustering
7.3 K-means Algorithms
7.4 The CURE Algorithm
7.5 Clustering in Non-Euclidean Spaces
7.6 Clustering for Streams and Parallelism
7.7 Summary of Chapter 7
7.8 References for Chapter 7
8 Advertising on the Web
8.1 Issues in On-Line Advertising
8.2 On-Line Algorithms
8.3 The Matching Problem
8.4 The Adwords Problem
8.5 Adwords Implementation
8.6 Summary of Chapter 8
8.7 References for Chapter 8
9 Recommendation Systems
9.1 A Model for Recommendation Systems
9.2 Content-Based Recommendations
9.3 Collaborative Filtering
9.4 Dimensionality Reduction
9.5 The NetFlix Challenge
9.6 Summary of Chapter 9
9.7 References for Chapter 9
10 Mining Social-Network Graphs
10.1 Social Networks as Graphs
10.2 Clustering of Social-Network Graphs
10.3 Direct Discovery of Communities
10.4 Partitioning of Graphs
10.5 Finding Overlapping Communities
10.6 Simrank
10.7 Counting Triangles
10.8 Neighborhood Properties of Graphs
10.9 Summary of Chapter 10
10.10 References for Chapter 10
11 Dimensionality Reduction
11.1 Eigenvalues and Eigenvectors
11.2 Principal-Component Analysis
11.3 Singular-Value Decomposition
11.4 CUR Decomposition
11.5 Summary of Chapter 11
11.6 References for Chapter 11
12 Large-Scale Machine Learning
12.1 The Machine-Learning Model
12.2 Perceptrons
12.3 Support-Vector Machines
12.4 Learning from Nearest Neighbors
12.5 Comparison of Learning Methods
12.6 Summary of Chapter 12
12.7 References for Chapter 12
Index