Practical Hadoop Ecosystem

Authors: Vohra, Deepak
Publisher: Apress
ISBN: 9781484221983
Date: 2016-10-01
Size: 11.89 MB
Language: English

Learn how to use the Apache Hadoop projects, including MapReduce, HDFS, Apache Hive, Apache HBase, Apache Kafka, Apache Mahout, and Apache Solr. From setting up the environment to running sample applications, each chapter in this book is a practical tutorial on using an Apache Hadoop ecosystem project.

While several books on Apache Hadoop are available, most cover only the main projects, MapReduce and HDFS, and none discusses the other Apache Hadoop ecosystem projects and how they all work together as a cohesive big data development platform.

What You Will Learn:

Set up the environment in Linux for Hadoop projects using Cloudera Hadoop Distribution CDH 5

Run a MapReduce job

Store data with Apache Hive and Apache HBase

Index data in HDFS with Apache Solr

Develop a Kafka messaging system

Stream logs to HDFS with Apache Flume

Transfer data from a MySQL database to Hive, HDFS, and HBase with Sqoop

Create a Hive table over Apache Solr

Develop a Mahout user recommender system

Who This Book Is For:

Apache Hadoop developers. Prerequisite knowledge of Linux and some knowledge of Hadoop are required.