Index
Mastering Apache Cassandra
Table of Contents; Mastering Apache Cassandra; Credits; About the Author; Acknowledgments; About the Reviewers; www.PacktPub.com
Support files, eBooks, discount offers and more
Why Subscribe? Free Access for Packt account holders
Preface
What this book covers; What you need for this book; Who this book is for; Conventions; Reader feedback; Customer support
Downloading the example code; Errata; Piracy; Questions
1. Quick Start
Introduction to Cassandra
Distributed database; High availability; Replication; Multiple data centers
A brief introduction to a data model; Installing Cassandra locally; CRUD with cassandra-cli; Cassandra in action
Modeling data; Writing code
Setting up Application
Summary
2. Cassandra Architecture
Problems in the RDBMS world; Enter NoSQL
The CAP theorem
Consistency; Availability; Partition-tolerance
Significance of the CAP theorem
Cassandra; Cassandra architecture
Ring representation; How Cassandra works
Write in action; Read in action
Components of Cassandra
Messaging service; Gossip; Failure detection; Partitioner; Replication; Log Structured Merge tree; CommitLog; MemTable; SSTable
Bloom filter; Index files; Datafiles
Compaction; Tombstones; Hinted handoff; Read repair and Anti-entropy
Merkle tree
Summary
3. Design Patterns
The Cassandra data model
The counter column; The expiring column; The super column; The column family; Keyspaces; Data types – comparators and validators
Writing a custom comparator; The primary index; The wide-row index; Simple groups; Sorting for free, free as in speech; An inverse index with a super column family; An inverse index with composite keys; The secondary index
Patterns and antipatterns
Avoid storing an entity in a single column (wherever possible); Atomic update; Managing time series data
Wide-row time series; High throughput rows and hotspots; Advanced time series
Avoid super columns; Transaction woes; Use expiring columns; batch_mutate
Summary
4. Deploying a Cluster
Evaluating requirements
Hard disk capacity
RAM; CPU; Nodes; Network
System configurations
Optimizing user limits; Swapping memory; Clock synchronization; Disk readahead
The required software
Installing Oracle Java 6
RHEL and CentOS systems; Debian and Ubuntu systems
Installing the Java Native Access (JNA) library
Installing Cassandra
Installing from a tarball; Installing from ASFRepository for Debian/Ubuntu; Anatomy of the installation
Cassandra binaries; Configuration files
Setting up Cassandra's data directory and commit log directory
Configuring a Cassandra cluster
The cluster name; The seed node
Listen, broadcast, and RPC addresses
Initial token; Partitioners
The random partitioner; The byte-ordered partitioner; The Murmur3 partitioner
Snitches
SimpleSnitch; PropertyFileSnitch; GossipingPropertyFileSnitch; RackInferringSnitch; EC2Snitch; EC2MultiRegionSnitch
Replica placement strategies
SimpleStrategy; NetworkTopologyStrategy
NetworkTopologyStrategy and multiple data center setups
Launching a cluster with a script; Creating a keyspace
Authorization and authentication; Summary
5. Performance Tuning
Stress testing; Performance tuning
Write performance; Read performance
Choosing the right compaction strategy; Size tiered compaction strategy; Leveled compaction; Row cache; Key cache; Cache settings; Enabling compression; Tuning the bloom filter
More tuning via cassandra.yaml
index_interval; commitlog_sync; column_index_size_in_kb; commitlog_total_space_in_mb
Tweaking JVM
Java heap; Garbage collection; Other JVM options
Scaling horizontally and vertically; Network
Summary
6. Managing a Cluster – Scaling, Node Repair, and Backup
Scaling
Adding nodes to a cluster; Removing nodes from a cluster
Removing a live node; Removing a dead node
Replacing a node; Backup and restoration
Using Cassandra bulk loader to restore the data
Load balancing; Priam – managing large clusters on AWS; Summary
7. Monitoring
Cassandra JMX interface
Accessing MBeans using JConsole
Cassandra nodetool
Monitoring with nodetool
cfstats; netstats; ring and describering; tpstats; compactionstats; info
Administrating with nodetool
drain; decommission; move; removetoken; repair; upgradesstable; snapshot
DataStax OpsCenter
OpsCenter Features; Installing OpsCenter and an agent
Prerequisites; Running a Cassandra cluster; Installing OpsCenter from Tarball; Setting up an OpsCenter agent
Monitoring and administrating with OpsCenter; Other features of OpsCenter
Nagios – monitoring and notification
Installing Nagios
Prerequisites; Preparation; Installation
Installing Nagios; Configuring Apache httpd; Installing Nagios plugins; Setting up Nagios as a service
Nagios plugins
Nagios plugins for Cassandra; Executing remote plugins via an NRPE plugin
Installing NRPE on host machines; Installing NRPE plugin on a Nagios machine
Setting things up to monitor; Monitoring and notification using Nagios
Cassandra log
Enabling Java Options for GC Logging
Troubleshooting
High CPU usage; High memory usage; Hotspots; OpenJDK may behave erratically; Disk performance; Slow snapshot; Getting help from the mailing list
Summary
8. Integration
Using Hadoop; Hadoop and Cassandra
Introduction to Hadoop
HDFS – Hadoop Distributed File System; Data management
NameNode; DataNodes
Hadoop MapReduce
JobTracker; TaskTracker
Reliability of data and process in Hadoop
Setting up local Hadoop; Testing the installation
Cassandra with Hadoop MapReduce
ColumnFamilyInputFormat; ColumnFamilyOutputFormat; ConfigHelper
Wide-row support; Bulk loading; Secondary index support
Cassandra and Hadoop in action
Executing, debugging, monitoring, and looking at results
Hadoop in Cassandra cluster
Cassandra filesystem
Integration with Pig
Installing Pig; Integrating Pig and Cassandra
Cassandra and Solr
Development note on Solandra
DataStax Enterprise – the next level Solr integration
Summary
9. Introduction to CQL 3 and Cassandra 1.2
CQL – the Cassandra Query Language; CQL 3 for Thrift refugees
Wide rows; Composite columns
CQL 3 basics
The CREATE KEYSPACE query; The CREATE TABLE query; Compact storage; Creating a secondary index; The INSERT query; The SELECT query; select expression; The WHERE clause; The ORDER BY clause; The LIMIT clause; The USING CONSISTENCY clause; The UPDATE query; The DELETE query; The TRUNCATE query; The ALTER TABLE query
Adding a new column; Dropping an existing column; Modifying the data type of an existing column; Altering table options
The ALTER KEYSPACE query; BATCH querying; The DROP INDEX query; The DROP TABLE query; The DROP KEYSPACE query; The USE statement
What's new in Cassandra 1.2?
Virtual Nodes; Off-heap Bloom filters; JBOD improvements; Parallel leveled compaction; Murmur3 partitioner; Atomic batches; Query profiling; Collections support
Sets; Lists; Maps
Support for programming languages; Summary
Index