Imperial Library
Index
Cover Page
Understanding Big Data
Copyright Page
Contents
Foreword
Acknowledgments
About this Book
Part I Big Data: From the Business Perspective
1 What Is Big Data? Hint: You’re a Part of It Every Day
Characteristics of Big Data
Can There Be Enough? The Volume of Data
Variety Is the Spice of Life
How Fast Is Fast? The Velocity of Data
Data in the Warehouse and Data in Hadoop (It’s Not a Versus Thing)
Wrapping It Up
2 Why Is Big Data Important?
When to Consider a Big Data Solution
Big Data Use Cases: Patterns for Big Data Deployment
IT for IT Log Analytics
The Fraud Detection Pattern
They Said What? The Social Media Pattern
The Call Center Mantra: “This Call May Be Recorded for Quality Assurance Purposes”
Risk: Patterns for Modeling and Management
Big Data and the Energy Sector
3 Why IBM for Big Data?
Big Data Has No Big Brother: It’s Ready, but Still Young
What Can Your Big Data Partner Do for You?
The IBM $100 Million Big Data Investment
A History of Big Data Innovation
Domain Expertise Matters
Part II Big Data: From the Technology Perspective
4 All About Hadoop: The Big Data Lingo Chapter
Just the Facts: The History of Hadoop
Components of Hadoop
The Hadoop Distributed File System
The Basics of MapReduce
Hadoop Common Components
Application Development in Hadoop
Pig and PigLatin
Hive
Jaql
Getting Your Data into Hadoop
Basic Copy Data
Flume
Other Hadoop Components
ZooKeeper
HBase
Oozie
Lucene
Avro
Wrapping It Up
5 InfoSphere BigInsights: Analytics for Big Data at Rest
Ease of Use: A Simple Installation Process
Hadoop Components Included in BigInsights 1.2
A Hadoop-Ready Enterprise-Quality File System: GPFS-SNC
Extending GPFS for Hadoop: GPFS Shared Nothing Cluster
What Does a GPFS-SNC Cluster Look Like?
GPFS-SNC Failover Scenarios
GPFS-SNC POSIX-Compliance
GPFS-SNC Performance
GPFS-SNC Hadoop Gives Enterprise Qualities
Compression
Splittable Compression
Compression and Decompression
Administrative Tooling
Security
Enterprise Integration
Netezza
DB2 for Linux, UNIX, and Windows
JDBC Module
InfoSphere Streams
InfoSphere DataStage
R Statistical Analysis Applications
Improved Workload Scheduling: Intelligent Scheduler
Adaptive MapReduce
Data Discovery and Visualization: BigSheets
Advanced Text Analytics Toolkit
Machine Learning Analytics
Large-Scale Indexing
BigInsights Summed Up
6 IBM InfoSphere Streams: Analytics for Big Data in Motion
InfoSphere Streams Basics
Industry Use Cases for InfoSphere Streams
How InfoSphere Streams Works
What’s a Stream?
The Streams Processing Language
Source and Sink Adapters
Operators
Streams Toolkits
Enterprise Class
High Availability
Consumability: Making the Platform Easy to Use
Integration Is the Apex of Enterprise Class Analysis