About This eBook
Title Page
Copyright Page
Dedication Page
About the Author
Contents
Preface
Acknowledgments
Introduction
Who Should Read This Book?
The Purpose of This Book
How to Read This Book
How This Book Is Organized
Part I: Introduction
Part II: Key-Value Databases
Part III: Document Databases
Part IV: Column Family Databases
Part V: Graph Databases
Part VI: Choosing a Database for Your Application
Part VII: Appendices
Part I: Introduction
1. Different Databases for Different Requirements
Relational Database Design
E-commerce Application
Early Database Management Systems
Flat File Data Management Systems
Hierarchical Data Model Systems
Network Data Management Systems
Summary of Early Database Management Systems
The Relational Database Revolution
Relational Database Management Systems
Motivations for Not Just/No SQL (NoSQL) Databases
Scalability
Cost
Flexibility
Availability
Summary
Case Study
Review Questions
References
Bibliography
2. Variety of NoSQL Databases
Data Management with Distributed Databases
Store Data Persistently
Maintain Data Consistency
Ensure Data Availability
Balancing Response Times, Consistency, and Durability
Consistency, Availability, and Partitioning: The CAP Theorem
ACID and BASE
ACID: Atomicity, Consistency, Isolation, and Durability
BASE: Basically Available, Soft State, Eventually Consistent
Types of Eventual Consistency
Four Types of NoSQL Databases
Key-Value Pair Databases
Document Databases
Column Family Databases
Graph Databases
Summary
Review Questions
References
Bibliography
Part II: Key-Value Databases
3. Introduction to Key-Value Databases
From Arrays to Key-Value Databases
Arrays: Key Value Stores with Training Wheels
Associative Arrays: Taking Off the Training Wheels
Caches: Adding Gears to the Bike
In-Memory and On-Disk Key-Value Database: From Bikes to Motorized Vehicles
Essential Features of Key-Value Databases
Simplicity: Who Needs Complicated Data Models Anyway?
Speed: There Is No Such Thing as Too Fast
Scalability: Keeping Up with the Rush
Keys: More Than Meaningless Identifiers
How to Construct a Key
Using Keys to Locate Values
Values: Storing Just About Any Data You Want
Values Do Not Require Strong Typing
Limitations on Searching for Values
Summary
Review Questions
References
Bibliography
4. Key-Value Database Terminology
Key-Value Database Data Modeling Terms
Key
Value
Namespace
Partition
Partition Key
Schemaless
Key-Value Architecture Terms
Cluster
Ring
Replication
Key-Value Implementation Terms
Hash Function
Collision
Compression
Summary
Review Questions
References
5. Designing for Key-Value Databases
Key Design and Partitioning
Keys Should Follow a Naming Convention
Well-Designed Keys Save Code
Dealing with Ranges of Values
Keys Must Take into Account Implementation Limitations
How Keys Are Used in Partitioning
Designing Structured Values
Structured Data Types Help Reduce Latency
Large Values Can Lead to Inefficient Read and Write Operations
Limitations of Key-Value Databases
Look Up Values by Key Only
Key-Value Databases Do Not Support Range Queries
No Standard Query Language Comparable to SQL for Relational Databases
Design Patterns for Key-Value Databases
Time to Live (TTL) Keys
Emulating Tables
Aggregates
Atomic Aggregates
Enumerable Keys
Indexes
Summary
Case Study: Key-Value Databases for Mobile Application Configuration
Review Questions
References
Part III: Document Databases
6. Introduction to Document Databases
What Is a Document?
Documents Are Not So Simple After All
Documents and Key-Value Pairs
Managing Multiple Documents in Collections
Avoid Explicit Schema Definitions
Basic Operations on Document Databases
Inserting Documents into a Collection
Deleting Documents from a Collection
Updating Documents in a Collection
Retrieving Documents from a Collection
Summary
Review Questions
References
7. Document Database Terminology
Document and Collection Terms
Document
Collection
Embedded Document
Schemaless
Polymorphic Schema
Types of Partitions
Vertical Partitioning
Horizontal Partitioning or Sharding
Data Modeling and Query Processing
Normalization
Denormalization
Query Processor
Summary
Review Questions
References
8. Designing for Document Databases
Normalization, Denormalization, and the Search for Proper Balance
One-to-Many Relations
Many-to-Many Relations
The Need for Joins
Executing Joins: The Heavy Lifting of Relational Databases
What Would a Document Database Modeler Do?
Planning for Mutable Documents
Avoid Moving Oversized Documents
The Goldilocks Zone of Indexes
Read-Heavy Applications
Write-Heavy Applications
Modeling Common Relations
One-to-Many Relations in Document Databases
Many-to-Many Relations in Document Databases
Modeling Hierarchies in Document Databases
Summary
Case Study: Customer Manifests
Embed or Not Embed?
Choosing Indexes
Separate Collections by Type?
Review Questions
References
Part IV: Column Family Databases
9. Introduction to Column Family Databases
In the Beginning, There Was Google BigTable
Utilizing Dynamic Control over Columns
Indexing by Row, Column Name, and Time Stamp
Controlling Location of Data
Reading and Writing Atomic Rows
Maintaining Rows in Sorted Order
Differences and Similarities to Key-Value and Document Databases
Column Family Database Features
Column Family Database Similarities to and Differences from Document Databases
Column Family Database Versus Relational Databases
Architectures Used in Column Family Databases
HBase Architecture: Variety of Nodes
Cassandra Architecture: Peer-to-Peer
Getting the Word Around: Gossip Protocol
Thermodynamics and Distributed Database: Why We Need Anti-Entropy
Hold This for Me: Hinted Handoff
When to Use Column Family Databases
Summary
Review Questions
References
10. Column Family Database Terminology
Basic Components of Column Family Databases
Keyspace
Row Key
Column
Column Families
Structures and Processes: Implementing Column Family Databases
Internal Structures and Configuration Parameters of Column Family Databases
Old Friends: Clusters and Partitions
Taking a Look Under the Hood: More Column Family Database Components
Processes and Protocols
Replication
Anti-Entropy
Gossip Protocol
Hinted Handoff
Summary
Review Questions
References
11. Designing for Column Family Databases
Guidelines for Designing Tables
Denormalize Instead of Join
Make Use of Valueless Columns
Use Both Column Names and Column Values to Store Data
Model an Entity with a Single Row
Avoid Hotspotting in Row Keys
Keep an Appropriate Number of Column Value Versions
Avoid Complex Data Structures in Column Values
Guidelines for Indexing
When to Use Secondary Indexes Managed by the Column Family Database System
When to Create and Manage Secondary Indexes Using Tables
Tools for Working with Big Data
Extracting, Transforming, and Loading Big Data
Analyzing Big Data
Tools for Monitoring Big Data
Summary
Case Study: Customer Data Analysis
Understanding User Needs
Review Questions
References
Part V: Graph Databases
12. Introduction to Graph Databases
What Is a Graph?
Graphs and Network Modeling
Modeling Geographic Locations
Modeling Infectious Diseases
Modeling Abstract and Concrete Entities
Modeling Social Media
Advantages of Graph Databases
Query Faster by Avoiding Joins
Simplified Modeling
Multiple Relations Between Entities
Summary
Review Questions
References
13. Graph Database Terminology
Elements of Graphs
Vertex
Edge
Path
Loop
Operations on Graphs
Union of Graphs
Intersection of Graphs
Graph Traversal
Properties of Graphs and Nodes
Isomorphism
Order and Size
Degree
Closeness
Betweenness
Types of Graphs
Undirected and Directed Graphs
Flow Network
Bipartite Graph
Multigraph
Weighted Graph
Summary
Review Questions
References
14. Designing for Graph Databases
Getting Started with Graph Design
Designing a Social Network Graph Database
Queries Drive Design (Again)
Querying a Graph
Cypher: Declarative Querying
Gremlin: Query by Graph Traversal
Tips and Traps of Graph Database Design
Use Indexes to Improve Retrieval Time
Use Appropriate Types of Edges
Watch for Cycles When Traversing Graphs
Consider the Scalability of Your Graph Database
Summary
Case Study: Optimizing Transportation Routes
Understanding User Needs
Designing a Graph Analysis Solution
Review Questions
References
Part VI: Choosing a Database for Your Application
15. Guidelines for Selecting a Database
Choosing a NoSQL Database
Criteria for Selecting Key-Value Databases
Use Cases and Criteria for Selecting Document Databases
Use Cases and Criteria for Selecting Column Family Databases
Use Cases and Criteria for Selecting Graph Databases
Using NoSQL and Relational Databases Together
Summary
Review Questions
References
Part VII: Appendices
A. Answers to Chapter Review Questions
Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9
Chapter 10
Chapter 11
Chapter 12
Chapter 13
Chapter 14
Chapter 15
B. List of NoSQL Databases
Glossary
Index
Code Snippets