Index
HBase: The Definitive Guide
Foreword
Preface
  General Information
    HBase Version
    Building the Examples
    Hush: The HBase URL Shortener
    Running Hush
  Conventions Used in This Book
  Using Code Examples
  Safari® Books Online
  How to Contact Us
  Acknowledgments
1. Introduction
  The Dawn of Big Data
  The Problem with Relational Database Systems
  Nonrelational Database Systems, Not-Only SQL or NoSQL?
    Dimensions
    Scalability
    Database (De-)Normalization
  Building Blocks
    Backdrop
    Tables, Rows, Columns, and Cells
    Auto-Sharding
    Storage API
    Implementation
    Summary
  HBase: The Hadoop Database
    History
    Nomenclature
    Summary
2. Installation
  Quick-Start Guide
  Requirements
    Hardware
      Servers
      Networking
    Software
      Operating system
      Filesystem
      Java
      Hadoop
      SSH
      Domain Name Service
      Synchronized time
      File handles and process limits
      Datanode handlers
      Swappiness
      Windows
  Filesystems for HBase
    Local
    HDFS
    S3
    Other Filesystems
  Installation Choices
    Apache Binary Release
    Building from Source
  Run Modes
    Standalone Mode
    Distributed Mode
      Pseudodistributed mode
      Fully distributed mode
        Specifying region servers
        ZooKeeper setup
        Using the existing ZooKeeper ensemble
  Configuration
    hbase-site.xml and hbase-default.xml
    hbase-env.sh
    regionserver
    log4j.properties
    Example Configuration
      hbase-site.xml
      regionservers
      hbase-env.sh
    Client Configuration
  Deployment
    Script-Based
    Apache Whirr
    Puppet and Chef
  Operating a Cluster
    Running and Confirming Your Installation
    Web-based UI Introduction
    Shell Introduction
    Stopping the Cluster
3. Client API: The Basics
  General Notes
  CRUD Operations
    Put Method
      Single Puts
      The KeyValue class
      Client-side write buffer
      List of Puts
      Atomic compare-and-set
    Get Method
      Single Gets
      The Result class
      List of Gets
      Related retrieval methods
    Delete Method
      Single Deletes
      List of Deletes
      Atomic compare-and-delete
  Batch Operations
  Row Locks
  Scans
    Introduction
    The ResultScanner Class
    Caching Versus Batching
  Miscellaneous Features
    The HTable Utility Methods
    The Bytes Class
4. Client API: Advanced Features
  Filters
    Introduction to Filters
      The filter hierarchy
      Comparison operators
      Comparators
    Comparison Filters
      RowFilter
      FamilyFilter
      QualifierFilter
      ValueFilter
      DependentColumnFilter
    Dedicated Filters
      SingleColumnValueFilter
      SingleColumnValueExcludeFilter
      PrefixFilter
      PageFilter
      KeyOnlyFilter
      FirstKeyOnlyFilter
      InclusiveStopFilter
      TimestampsFilter
      ColumnCountGetFilter
      ColumnPaginationFilter
      ColumnPrefixFilter
      RandomRowFilter
    Decorating Filters
      SkipFilter
      WhileMatchFilter
    FilterList
    Custom Filters
    Filters Summary
  Counters
    Introduction to Counters
    Single Counters
    Multiple Counters
  Coprocessors
    Introduction to Coprocessors
    The Coprocessor Class
    Coprocessor Loading
      Loading from the configuration
      Loading from the table descriptor
    The RegionObserver Class
      Handling region life-cycle events
        State: pending open
        State: open
        State: pending close
      Handling client API events
      The RegionCoprocessorEnvironment class
      The ObserverContext class
      The BaseRegionObserver class
    The MasterObserver Class
      The MasterCoprocessorEnvironment class
      The BaseMasterObserver class
    Endpoints
      The CoprocessorProtocol interface
      The BaseEndpointCoprocessor class
  HTablePool
  Connection Handling
5. Client API: Administrative Features
  Schema Definition
    Tables
    Table Properties
    Column Families
  HBaseAdmin
    Basic Operations
    Table Operations
    Schema Operations
    Cluster Operations
    Cluster Status Information
6. Available Clients
  Introduction to REST, Thrift, and Avro
  Interactive Clients
    Native Java
    REST
      Operation
      Supported formats
        Plain (text/plain)
        XML (text/xml)
        JSON (application/json)
        Protocol Buffer (application/x-protobuf)
        Raw binary (application/octet-stream)
      REST Java client
    Thrift
      Installation
      Operation
      Example: PHP
    Avro
      Installation
      Operation
    Other Clients
  Batch Clients
    MapReduce
      Native Java
      Clojure
    Hive
    Pig
    Cascading
  Shell
    Basics
    Commands
      General
      Data definition
      Data manipulation
      Tools
      Replication
    Scripting
  Web-based UI
    Master UI
      Main page
      User Table page
      ZooKeeper page
    Region Server UI
      Main page
    Shared Pages
7. MapReduce Integration
  Framework
    MapReduce Introduction
    Classes
      InputFormat
      Mapper
      Reducer
      OutputFormat
    Supporting Classes
    MapReduce Locality
    Table Splits
  MapReduce over HBase
    Preparation
      Static Provisioning
      Dynamic Provisioning
    Data Sink
    Data Source
    Data Source and Sink
    Custom Processing
8. Architecture
  Seek Versus Transfer
    B+ Trees
    Log-Structured Merge-Trees
  Storage
    Overview
    Write Path
    Files
      Root-level files
      Table-level files
      Region-level files
      Region splits
      Compactions
    HFile Format
    KeyValue Format
  Write-Ahead Log
    Overview
    HLog Class
    HLogKey Class
    WALEdit Class
    LogSyncer Class
    LogRoller Class
    Replay
      Single log
      Log splitting
      Edits recovery
    Durability
  Read Path
  Region Lookups
  The Region Life Cycle
  ZooKeeper
  Replication
    Life of a Log Edit
      Normal processing
      Non-Responding slave clusters
    Internals
      Choosing region servers to replicate to
      Keeping track of logs
      Reading, filtering, and sending edits
      Cleaning logs
      Region server failover
9. Advanced Usage
  Key Design
    Concepts
    Tall-Narrow Versus Flat-Wide Tables
    Partial Key Scans
    Pagination
    Time Series Data
    Time-Ordered Relations
  Advanced Schemas
  Secondary Indexes
  Search Integration
  Transactions
  Bloom Filters
  Versioning
    Implicit Versioning
    Custom Versioning
10. Cluster Monitoring
  Introduction
  The Metrics Framework
    Contexts, Records, and Metrics
    Master Metrics
    Region Server Metrics
    RPC Metrics
    JVM Metrics
    Info Metrics
  Ganglia
    Installation
      Ganglia-related steps
        Ganglia monitoring daemon
        Ganglia meta daemon
        Ganglia web frontend
      HBase-related steps
    Usage
  JMX
    JConsole
    JMX Remote API
  Nagios
11. Performance Tuning
  Garbage Collection Tuning
  Memstore-Local Allocation Buffer
  Compression
    Available Codecs
      Snappy
      LZO
      GZIP
    Verifying Installation
      Compression test tool
      Startup check
    Enabling Compression
  Optimizing Splits and Compactions
    Managed Splitting
    Region Hotspotting
    Presplitting Regions
  Load Balancing
  Merging Regions
  Client API: Best Practices
  Configuration
  Load Tests
    Performance Evaluation
    YCSB
12. Cluster Administration
  Operational Tasks
    Node Decommissioning
    Rolling Restarts
    Adding Servers
      Pseudodistributed mode
        Adding a local backup master
        Adding a local region server
      Fully distributed cluster
        Adding a backup master
        Adding a region server
  Data Tasks
    Import and Export Tools
    CopyTable Tool
    Bulk Import
      Bulk load procedure
      Using the importtsv tool
      Using the completebulkload Tool
      Advanced usage
    Replication
  Additional Tasks
    Coexisting Clusters
    Required Ports
  Changing Logging Levels
  Troubleshooting
    HBase Fsck
    Analyzing the Logs
    Common Issues
      Basic setup checklist
        File handles
        DataNode connections
        Compression
        Garbage collection/memory tuning
      Stability issues
        ZooKeeper problems
        “Could not obtain block” errors
A. HBase Configuration Properties
B. Road Map
  HBase 0.92.0
  HBase 0.94.0
C. Upgrade from Previous Releases
  Upgrading to HBase 0.90.x
    From 0.20.x or 0.89.x
    Within 0.90.x
  Upgrading to HBase 0.92.0
D. Distributions
  Cloudera’s Distribution Including Apache Hadoop
E. Hush SQL Schema
F. HBase Versus Bigtable
Index
About the Author
Colophon