Imperial Library

Index
Hadoop Operations Dedication SPECIAL OFFER: Upgrade this ebook with O’Reilly Preface
Conventions Used in This Book Using Code Examples Safari® Books Online How to Contact Us Acknowledgments
1. Introduction 2. HDFS
Goals and Motivation Design Daemons Reading and Writing Data
The Read Path The Write Path
Managing Filesystem Metadata Namenode High Availability Namenode Federation Access and Integration
Command-Line Tools FUSE REST Support
3. MapReduce
The Stages of MapReduce Introducing Hadoop MapReduce
Daemons
Jobtracker Tasktracker
When It All Goes Wrong
Child task failures Tasktracker/worker node failures Jobtracker failures HDFS failures
YARN
4. Planning a Hadoop Cluster
Picking a Distribution and Version of Hadoop
Apache Hadoop Cloudera’s Distribution Including Apache Hadoop Versions and Features What Should I Use?
Hardware Selection
Master Hardware Selection
Namenode considerations Secondary namenode hardware Jobtracker hardware
Worker Hardware Selection Cluster Sizing Blades, SANs, and Virtualization
Operating System Selection and Preparation
Deployment Layout Software Hostnames, DNS, and Identification Users, Groups, and Privileges
Kernel Tuning
vm.swappiness vm.overcommit_memory
Disk Configuration
Choosing a Filesystem
ext3 ext4 xfs
Mount Options
Network Design
Network Usage in Hadoop: A Review
HDFS MapReduce
1 Gb versus 10 Gb Networks Typical Network Topologies
Traditional tree Spine fabric
5. Installation and Configuration
Installing Hadoop
Apache Hadoop
Tarball installation Package installation
CDH
Configuration: An Overview
The Hadoop XML Configuration Files
Environment Variables and Shell Scripts Logging Configuration HDFS
Identification and Location Optimization and Tuning Formatting the Namenode Creating a /tmp Directory
Namenode High Availability
Fencing Options Basic Configuration Automatic Failover Configuration
Initializing ZooKeeper State
Format and Bootstrap the Namenodes
Namenode Federation MapReduce
Identification and Location Optimization and Tuning
Rack Topology Security
6. Identity, Authentication, and Authorization
Identity Kerberos and Hadoop
Kerberos: A Refresher Kerberos Support in Hadoop
Configuring Hadoop security
Authorization
HDFS MapReduce Other Tools and Systems
Apache Hive Apache HBase Apache Oozie Hue Apache Sqoop Apache Flume Apache ZooKeeper Apache Pig, Cascading, and Crunch
Tying It Together
7. Resource Management
What Is Resource Management? HDFS Quotas MapReduce Schedulers
The FIFO Scheduler
Configuration
The Fair Scheduler
Configuration
The Capacity Scheduler
Configuration
The Future
8. Cluster Maintenance
Managing Hadoop Processes
Starting and Stopping Processes with Init Scripts Starting and Stopping Processes Manually
HDFS Maintenance Tasks
Adding a Datanode Decommissioning a Datanode Checking Filesystem Integrity with fsck Balancing HDFS Block Data Dealing with a Failed Disk
MapReduce Maintenance Tasks
Adding a Tasktracker Decommissioning a Tasktracker Killing a MapReduce Job Killing a MapReduce Task Dealing with a Blacklisted Tasktracker
9. Troubleshooting
Differential Diagnosis Applied to Systems Common Failures and Problems
Humans (You) Misconfiguration Hardware Failure Resource Exhaustion Host Identification and Naming Network Partitions
“Is the Computer Plugged In?”
E-SPORE
Treatment and Care War Stories
A Mystery Bottleneck There’s No Place Like 127.0.0.1
10. Monitoring
An Overview Hadoop Metrics
Apache Hadoop 0.20.0 and CDH3 (metrics1)
JMX Support REST Interface
Using the metrics servlet Using the JMX JSON servlet
Apache Hadoop 0.20.203 and Later, and CDH4 (metrics2) What about SNMP?
Health Monitoring
Host-Level Checks All Hadoop Processes HDFS Checks MapReduce Checks
11. Backup and Recovery
Data Backup
Distributed Copy (distcp) Parallel Data Ingestion
Namenode Metadata
A. Deprecated Configuration Properties Index About the Author Colophon SPECIAL OFFER: Upgrade this ebook with O’Reilly Copyright

