HBase - The Definitive Guide - 2nd Edition by Lars George

Index

HBase - The Definitive Guide - 2nd Edition
Foreword: Michael Stack
Foreword: Carter Page
Preface

General Information

HBase Version

What is in this Book?
Target Audience
What is New in the Second Edition?
Conventions Used in This Book
Using Code Examples
Safari® Books Online
How to Contact Us
Acknowledgments

1. Introduction

The Dawn of Big Data
The Problem with Relational Database Systems
Nonrelational Database Systems, Not-Only SQL or NoSQL?

Dimensions
Scalability
Database (De-)Normalization

Building Blocks

Backdrop
Namespaces, Tables, Rows, Columns, and Cells
Auto-Sharding
Storage API
Implementation
Summary

HBase: The Hadoop Database

History
Nomenclature
Summary

2. Installation

Quick-Start Guide
Requirements

Hardware

Servers
Networking

Software

Operating system
Filesystem
Java
Hadoop
ZooKeeper
SSH
Domain Name Service
Synchronized time
File handles and process limits
Datanode handlers
Swappiness

Filesystems for HBase

Local
HDFS
S3
Other Filesystems

Installation Choices

Apache Binary Release
Building from Source

Run Modes

Standalone Mode
Distributed Mode

Pseudo-distributed mode
Fully distributed mode

Configuration

hbase-site.xml and hbase-default.xml
hbase-env.sh and hbase-env.cmd
regionservers
log4j.properties
Example Configuration

hbase-site.xml
regionservers
hbase-env.sh

Client Configuration

Deployment

Script-Based
Apache Whirr
Puppet and Chef

Operating a Cluster

Running and Confirming Your Installation
Web-based UI Introduction
Shell Introduction
Stopping the Cluster

3. Client API: The Basics

General Notes
Data Types and Hierarchy

Generic Attributes
Operations: Fingerprint and ID
Query versus Mutation
Durability, Consistency, and Isolation
The Cell
API Building Blocks

Resource Sharing

CRUD Operations

Put Method

Single Puts
Client-side Write Buffer
List of Puts
Atomic Check-and-Put

Get Method

Single Gets
The Result class
List of Gets

Delete Method

Single Deletes
List of Deletes
Atomic Check-and-Delete

Append Method
Mutate Method

Single Mutations
Atomic Check-and-Mutate

Batch Operations
Scans

Introduction
The ResultScanner Class
Scanner Caching
Scanner Batching
Slicing Rows
Load Column Families on Demand
Scanner Metrics

Miscellaneous Features

The Table Utility Methods
The Bytes Class

4. Client API: Advanced Features

Filters

Introduction to Filters

The Filter Hierarchy
Comparison Operators
Comparators

Comparison Filters

RowFilter
FamilyFilter
QualifierFilter
ValueFilter
DependentColumnFilter

Dedicated Filters

PrefixFilter
PageFilter
KeyOnlyFilter
FirstKeyOnlyFilter
FirstKeyValueMatchingQualifiersFilter
InclusiveStopFilter
FuzzyRowFilter
ColumnCountGetFilter
ColumnPaginationFilter
ColumnPrefixFilter
MultipleColumnPrefixFilter
ColumnRangeFilter
SingleColumnValueFilter
SingleColumnValueExcludeFilter
TimestampsFilter
RandomRowFilter

Decorating Filters

SkipFilter
WhileMatchFilter

FilterList
Custom Filters

Custom Filter Loading

Filter Parser Utility
Filters Summary

Counters

Introduction to Counters
Single Counters
Multiple Counters

Coprocessors

Introduction to Coprocessors
The Coprocessor Class Trinity
Coprocessor Loading

Loading from Configuration
Loading from Table Descriptor
Loading from HBase Shell

Endpoints

The Service Interface
Implementing Endpoints

Observers
The ObserverContext Class
The RegionObserver Class

Handling Region Life-Cycle Events
Handling Client API Events
The RegionCoprocessorEnvironment Class
The BaseRegionObserver Class

The MasterObserver Class

The MasterCoprocessorEnvironment Class
The BaseMasterObserver Class
The BaseMasterAndRegionObserver Class

The RegionServerObserver Class

The RegionServerCoprocessorEnvironment Class
The BaseRegionServerObserver Class

The WALObserver Class

The WALCoprocessorEnvironment Class
The BaseWALObserver Class

The BulkLoadObserver Class
The EndpointObserver Class

5. Client API: Administrative Features

Schema Definition

Namespaces
Tables

Serialization
The RegionLocator Class
Server and Region Names

Table Properties
Column Families

HBaseAdmin

Basic Operations
Namespace Operations
Table Operations
Schema Operations
Cluster Operations

Region Operations
Table Operations: Snapshots
Server Operations

Cluster Status Information

ReplicationAdmin

6. Available Clients

Introduction

Gateways
Frameworks

Gateway Clients

Native Java
REST

Operation
Supported Formats
REST Java Client

Thrift

Installation
Thrift Operations
Example: PHP
Example: Java

Thrift2
SQL over NoSQL

Framework Clients

MapReduce

Native Java

Hive

Introduction
Mapping Managed Tables

Mapping Existing Tables

Advanced Column Mapping Features

Mapping Existing Table Snapshots

Bulk Load Data

Pig
Cascading
Other Clients

Shell

Basics
Commands

General Commands
Namespace and Data Definition Commands
Data Manipulation Commands
Snapshot Commands
Tool Commands
Replication Commands
Security Commands

Scripting

Web-based UI

Master UI Status Page

Main Page
Warning Messages
Region Servers
Dead Region Servers
Backup Masters
Tables
Regions in Transition
Tasks
Software Attributes

Master UI Related Pages

Backup Master UI
Table Information Page
ZooKeeper page
Snapshot

Region Server UI Status Page

Main page
Server Metrics
Block Cache
Regions
Software Attributes

Shared Pages

7. Hadoop Integration

Framework

MapReduce Introduction
Processing Classes

InputFormat
Mapper
Reducer
OutputFormat

Supporting Classes
MapReduce Locality
Table Splits

MapReduce over Tables

Preparation

Static Provisioning
Dynamic Provisioning
Debugging Job Submission Problems

Table as a Data Sink
Table as a Data Source
Table as both Data Source and Sink
Custom Processing

MapReduce over Snapshots
Bulk Loading Data

A. Upgrade from Previous Releases

Upgrading to HBase 0.90.x

From 0.20.x or 0.89.x
Within 0.90.x

Upgrading to HBase 0.92.0
Upgrading to HBase 0.98.x
Migrate API to HBase 1.0.x

Migrate Coprocessors to post HBase 0.96
Migrate Custom Filters to post HBase 0.96

About the Author
Copyright
