Index
Foreword
Foreword
Preface
Why Apache Cassandra?
Is This Book for You?
What’s in This Book?
New for the Second Edition
Conventions Used in This Book
Using Code Examples
Safari® Books Online
How to Contact Us
Acknowledgments
1. Beyond Relational Databases
What’s Wrong with Relational Databases?
A Quick Review of Relational Databases
RDBMSs: The Awesome and the Not-So-Much
Transactions, ACID-ity, and two-phase commit
Schema
Sharding and shared-nothing architecture
Web Scale
The Rise of NoSQL
Summary
2. Introducing Cassandra
The Cassandra Elevator Pitch
Cassandra in 50 Words or Less
Distributed and Decentralized
Elastic Scalability
High Availability and Fault Tolerance
Tuneable Consistency
Brewer’s CAP Theorem
Row-Oriented
High Performance
Where Did Cassandra Come From?
Release History
Is Cassandra a Good Fit for My Project?
Large Deployments
Lots of Writes, Statistics, and Analysis
Geographical Distribution
Evolving Applications
Getting Involved
Summary
3. Installing Cassandra
Installing the Apache Distribution
Extracting the Download
What’s In There?
Building from Source
Additional Build Targets
Running Cassandra
On Windows
On Linux
Starting the Server
Stopping Cassandra
Other Cassandra Distributions
Running the CQL Shell
Basic cqlsh Commands
cqlsh Help
Describing the Environment in cqlsh
Creating a Keyspace and Table in cqlsh
Writing and Reading Data in cqlsh
Summary
4. The Cassandra Query Language
The Relational Data Model
Cassandra’s Data Model
Clusters
Keyspaces
Tables
Columns
Timestamps
Time to live (TTL)
CQL Types
Numeric Data Types
Textual Data Types
Time and Identity Data Types
Other Simple Data Types
Collections
User-Defined Types
Secondary Indexes
Summary
5. Data Modeling
Conceptual Data Modeling
RDBMS Design
Design Differences Between RDBMS and Cassandra
No joins
No referential integrity
Denormalization
Query-first design
Designing for optimal storage
Sorting is a design decision
Defining Application Queries
Logical Data Modeling
Hotel Logical Data Model
Reservation Logical Data Model
Physical Data Modeling
Hotel Physical Data Model
Reservation Physical Data Model
Materialized Views
Evaluating and Refining
Calculating Partition Size
Calculating Size on Disk
Breaking Up Large Partitions
Defining Database Schema
DataStax DevCenter
Summary
6. The Cassandra Architecture
Data Centers and Racks
Gossip and Failure Detection
Snitches
Rings and Tokens
Virtual Nodes
Partitioners
Replication Strategies
Consistency Levels
Queries and Coordinator Nodes
Memtables, SSTables, and Commit Logs
Caching
Hinted Handoff
Lightweight Transactions and Paxos
Tombstones
Bloom Filters
Compaction
Anti-Entropy, Repair, and Merkle Trees
Staged Event-Driven Architecture (SEDA)
Managers and Services
Cassandra Daemon
Storage Engine
Storage Service
Storage Proxy
Messaging Service
Stream Manager
CQL Native Transport Server
System Keyspaces
Summary
7. Configuring Cassandra
Cassandra Cluster Manager
Creating a Cluster
Seed Nodes
Partitioners
Murmur3 Partitioner
Random Partitioner
Order-Preserving Partitioner
ByteOrderedPartitioner
Snitches
Simple Snitch
Property File Snitch
Gossiping Property File Snitch
Rack Inferring Snitch
Cloud Snitches
Dynamic Snitch
Node Configuration
Tokens and Virtual Nodes
Network Interfaces
Data Storage
Startup and JVM Settings
Adding Nodes to a Cluster
Dynamic Ring Participation
Replication Strategies
SimpleStrategy
NetworkTopologyStrategy
Changing the Replication Factor
Summary
8. Clients
Hector, Astyanax, and Other Legacy Clients
DataStax Java Driver
Development Environment Configuration
Clusters and Contact Points
Protocol version
Compression
Authentication and encryption
Sessions and Connection Pooling
Statements
Simple statement
Asynchronous execution
Prepared statement
Bound statement
Built statement and the Query Builder
Object mapper
Policies
Load balancing policy
Retry policy
Speculative execution policy
Address translator
Metadata
Node discovery
Schema access
Debugging and Monitoring
Logging
Metrics
DataStax Python Driver
DataStax Node.js Driver
DataStax Ruby Driver
DataStax C# Driver
DataStax C/C++ Driver
DataStax PHP Driver
Summary
9. Reading and Writing Data
Writing
Write Consistency Levels
The Cassandra Write Path
Writing Files to Disk
Commit log files
SSTable files
Lightweight Transactions
Batches
Reading
Read Consistency Levels
The Cassandra Read Path
Read Repair
Range Queries, Ordering and Filtering
Functions and Aggregates
User-defined functions
User-defined aggregates
Built-in functions and aggregates
Paging
Speculative Retry
Deleting
Summary
10. Monitoring
Logging
Tailing
Examining Log Files
Monitoring Cassandra with JMX
Connecting to Cassandra via JConsole
Overview of MBeans
Cassandra’s MBeans
Database MBeans
Storage Service MBean
Storage Proxy MBean
ColumnFamilyStoreMBean
CacheServiceMBean
CommitLogMBean
Compaction Manager MBean
Snitch MBeans
HintedHandoffManagerMBean
Networking MBeans
FailureDetectorMBean
GossiperMBean
StreamManagerMBean
Metrics MBeans
Threading MBeans
Service MBeans
Security MBeans
Monitoring with nodetool
Getting Cluster Information
describecluster
status
info
ring
Getting Statistics
Using tpstats
Using tablestats
Summary
11. Maintenance
Health Check
Basic Maintenance
Flush
Cleanup
Repair
Full repair, incremental repair, and anti-compaction
Sequential and parallel repair
Partitioner range repair
Subrange repair
Rebuilding Indexes
Moving Tokens
Adding Nodes
Adding Nodes to an Existing Data Center
Adding a Data Center to a Cluster
Handling Node Failure
Repairing Nodes
Recovering from disk failure
Replacing Nodes
Removing Nodes
Decommissioning a node
Removing a node
Assassinating a node
Upgrading Cassandra
Backup and Recovery
Taking a Snapshot
Clearing a Snapshot
Enabling Incremental Backup
Restoring from Snapshot
SSTable Utilities
Maintenance Tools
DataStax OpsCenter
Netflix Priam
Summary
12. Performance Tuning
Managing Performance
Setting Performance Goals
Monitoring Performance
Analyzing Performance Issues
Tracing
Tuning Methodology
Caching
Key Cache
Row Cache
Counter Cache
Saved Cache Settings
Memtables
Commit Logs
SSTables
Hinted Handoff
Compaction
Concurrency and Threading
Networking and Timeouts
JVM Settings
Memory
Garbage Collection
Using cassandra-stress
Summary
13. Security
Authentication and Authorization
Password Authenticator
Configuring the authenticator
Additional authentication providers
Adding users
Authenticating via the DataStax Java driver
Using CassandraAuthorizer
Role-Based Access Control
Encryption
SSL, TLS, and Certificates
Node-to-Node Encryption
Client-to-Node Encryption
JMX Security
Securing JMX Access
Security MBeans
PermissionsCacheMBean
Summary
14. Deploying and Integrating
Planning a Cluster Deployment
Sizing Your Cluster
Selecting Instances
Storage
Network
Cloud Deployment
Amazon Web Services
Microsoft Azure
Google Cloud Platform
Integrations
Apache Lucene, SOLR, and Elasticsearch
Apache Hadoop
Apache Spark
Use cases for Spark with Cassandra
Deploying Spark with Cassandra
The spark-cassandra-connector
Summary
Index