Cover
Table of Contents
Title Page
Copyright
List of Contributors
Preface
Acknowledgments
Acronyms
Introduction
1 An Introduction: What's a Modern Big Data Platform
1.1 Defining Modern Big Data Platform
1.2 Fundamentals of a Modern Big Data Platform
2 A Bird's Eye View on Big Data
2.1 A Bit of History
2.2 What Makes Big Data
2.3 Components of Big Data Architecture
2.4 Making Use of Big Data
3 A Minimal Data Processing and Management System
3.1 Problem Definition
3.2 Processing Large Data with Linux Commands
3.3 Processing Large Data with PostgreSQL
3.4 Cost of Big Data
4 Big Data Storage
4.1 Big Data Storage Patterns
4.2 On‐Premise Storage Solutions
4.3 Cloud Storage Solutions
4.4 Hybrid Storage Solutions
5 Offline Big Data Processing
5.1 Defining Offline Data Processing
5.2 MapReduce Technologies
5.3 Apache Spark
5.4 Apache Flink
5.5 Presto
6 Stream Big Data Processing
6.1 The Need for Stream Processing
6.2 Defining Stream Data Processing
6.3 Streams via Message Brokers
6.4 Streams via Stream Engines
7 Data Analytics
7.1 Log Collection
7.2 Transferring Big Data Sets
7.3 Aggregating Big Data Sets
7.4 Data Pipeline Scheduler
7.5 Patterns and Practices
7.6 Exploring Data Visually
8 Data Science
8.1 Data Science Applications
8.2 Data Science Life Cycle
8.3 Data Science Toolbox
8.4 Productionalizing Data Science
9 Data Discovery
9.1 Need for Data Discovery
9.2 Data Governance
9.3 Data Discovery Tools
10 Data Security
10.1 Infrastructure Security
10.2 Data Privacy
10.3 Law Enforcement
10.4 Data Security Tools
11 Putting All Together
11.1 Platforms
11.2 Big Data Systems and Tools
11.3 Challenges
12 An Ideal Platform
12.1 Event Sourcing
12.2 Kappa Architecture
12.3 Data Mesh
12.4 Data Reservoirs
12.5 Data Catalog
12.6 Self‐service Platform
12.7 Abstraction
12.8 Data Guild
12.9 Trade‐offs
12.10 Data Ethics
Appendix A: Further Systems and Patterns
A.1 Lambda Architecture
A.2 Apache Cassandra
A.3 Apache Beam
Appendix B: Recipes
B.1 Activity Tracking Recipe
B.2 Data Quality Assurance
B.3 Estimating Time to Delivery
B.4 Incident Response Recipe
B.5 Leveraging Spark SQL Metrics
B.6 Airbnb Price Prediction
Bibliography
Index
End User License Agreement