Index
HBase - The Definitive Guide - 2nd Edition
Foreword: Michael Stack
Foreword: Carter Page
Preface
General Information
HBase Version
What is in this Book?
Target Audience
What is New in the Second Edition?
Conventions Used in This Book
Using Code Examples
Safari® Books Online
How to Contact Us
Acknowledgments
1. Introduction
The Dawn of Big Data
The Problem with Relational Database Systems
Nonrelational Database Systems, Not-Only SQL or NoSQL?
Dimensions
Scalability
Database (De-)Normalization
Building Blocks
Backdrop
Namespaces, Tables, Rows, Columns, and Cells
Auto-Sharding
Storage API
Implementation
Summary
HBase: The Hadoop Database
History
Nomenclature
Summary
2. Installation
Quick-Start Guide
Requirements
Hardware
Servers
Networking
Software
Operating system
Filesystem
Java
Hadoop
ZooKeeper
SSH
Domain Name Service
Synchronized time
File handles and process limits
Datanode handlers
Swappiness
Filesystems for HBase
Local
HDFS
S3
Other Filesystems
Installation Choices
Apache Binary Release
Building from Source
Run Modes
Standalone Mode
Distributed Mode
Pseudo-distributed mode
Fully distributed mode
Configuration
hbase-site.xml and hbase-default.xml
hbase-env.sh and hbase-env.cmd
regionservers
log4j.properties
Example Configuration
hbase-site.xml
regionservers
hbase-env.sh
Client Configuration
Deployment
Script-Based
Apache Whirr
Puppet and Chef
Operating a Cluster
Running and Confirming Your Installation
Web-based UI Introduction
Shell Introduction
Stopping the Cluster
3. Client API: The Basics
General Notes
Data Types and Hierarchy
Generic Attributes
Operations: Fingerprint and ID
Query versus Mutation
Durability, Consistency, and Isolation
The Cell
API Building Blocks
Resource Sharing
CRUD Operations
Put Method
Single Puts
Client-side Write Buffer
List of Puts
Atomic Check-and-Put
Get Method
Single Gets
The Result class
List of Gets
Delete Method
Single Deletes
List of Deletes
Atomic Check-and-Delete
Append Method
Mutate Method
Single Mutations
Atomic Check-and-Mutate
Batch Operations
Scans
Introduction
The ResultScanner Class
Scanner Caching
Scanner Batching
Slicing Rows
Load Column Families on Demand
Scanner Metrics
Miscellaneous Features
The Table Utility Methods
The Bytes Class
4. Client API: Advanced Features
Filters
Introduction to Filters
The Filter Hierarchy
Comparison Operators
Comparators
Comparison Filters
RowFilter
FamilyFilter
QualifierFilter
ValueFilter
DependentColumnFilter
Dedicated Filters
PrefixFilter
PageFilter
KeyOnlyFilter
FirstKeyOnlyFilter
FirstKeyValueMatchingQualifiersFilter
InclusiveStopFilter
FuzzyRowFilter
ColumnCountGetFilter
ColumnPaginationFilter
ColumnPrefixFilter
MultipleColumnPrefixFilter
ColumnRangeFilter
SingleColumnValueFilter
SingleColumnValueExcludeFilter
TimestampsFilter
RandomRowFilter
Decorating Filters
SkipFilter
WhileMatchFilter
FilterList
Custom Filters
Custom Filter Loading
Filter Parser Utility
Filters Summary
Counters
Introduction to Counters
Single Counters
Multiple Counters
Coprocessors
Introduction to Coprocessors
The Coprocessor Class Trinity
Coprocessor Loading
Loading from Configuration
Loading from Table Descriptor
Loading from HBase Shell
Endpoints
The Service Interface
Implementing Endpoints
Observers
The ObserverContext Class
The RegionObserver Class
Handling Region Life-Cycle Events
Handling Client API Events
The RegionCoprocessorEnvironment Class
The BaseRegionObserver Class
The MasterObserver Class
The MasterCoprocessorEnvironment Class
The BaseMasterObserver Class
The BaseMasterAndRegionObserver Class
The RegionServerObserver Class
The RegionServerCoprocessorEnvironment Class
The BaseRegionServerObserver Class
The WALObserver Class
The WALCoprocessorEnvironment Class
The BaseWALObserver Class
The BulkLoadObserver Class
The EndpointObserver Class
5. Client API: Administrative Features
Schema Definition
Namespaces
Tables
Serialization
The RegionLocator Class
Server and Region Names
Table Properties
Column Families
HBaseAdmin
Basic Operations
Namespace Operations
Table Operations
Schema Operations
Cluster Operations
Region Operations
Table Operations: Snapshots
Server Operations
Cluster Status Information
ReplicationAdmin
6. Available Clients
Introduction
Gateways
Frameworks
Gateway Clients
Native Java
REST
Operation
Supported Formats
REST Java Client
Thrift
Installation
Thrift Operations
Example: PHP
Example: Java
Thrift2
SQL over NoSQL
Framework Clients
MapReduce
Native Java
Hive
Introduction
Mapping Managed Tables
Mapping Existing Tables
Advanced Column Mapping Features
Mapping Existing Table Snapshots
Block Load Data
Pig
Cascading
Other Clients
Shell
Basics
Commands
General Commands
Namespace and Data Definition Commands
Data Manipulation Commands
Snapshot Commands
Tool Commands
Replication Commands
Security Commands
Scripting
Web-based UI
Master UI Status Page
Main Page
Warning Messages
Region Servers
Dead Region Servers
Backup Masters
Tables
Regions in Transition
Tasks
Software Attributes
Master UI Related Pages
Backup Master UI
Table Information Page
ZooKeeper page
Snapshot
Region Server UI Status Page
Main page
Server Metrics
Block Cache
Regions
Software Attributes
Shared Pages
7. Hadoop Integration
Framework
MapReduce Introduction
Processing Classes
InputFormat
Mapper
Reducer
OutputFormat
Supporting Classes
MapReduce Locality
Table Splits
MapReduce over Tables
Preparation
Static Provisioning
Dynamic Provisioning
Debugging Job Submission Problems
Table as a Data Sink
Table as a Data Source
Table as both Data Source and Sink
Custom Processing
MapReduce over Snapshots
Bulk Loading Data
A. Upgrade from Previous Releases
Upgrading to HBase 0.90.x
From 0.20.x or 0.89.x
Within 0.90.x
Upgrading to HBase 0.92.0
Upgrading to HBase 0.98.x
Migrate API to HBase 1.0.x
Migrate Coprocessors to post HBase 0.96
Migrate Custom Filters to post HBase 0.96
About the Author
Copyright