Table of Contents

Cover image

Title page

Copyright

List of contributors

About the Editors

Preface

Organization of the Book

Part I: Big Data Science

Part II: Big Data Infrastructures and Platforms

Part III: Big Data Security and Privacy

Part IV: Big Data Applications

Acknowledgments

Part I: Big Data Science

Chapter 1: Big Data Analytics = Machine Learning + Cloud Computing

Abstract

1.1 Introduction

1.2 A Historical Review of Big Data

1.3 Historical Interpretation of Big Data

1.4 Defining Big Data From 3Vs to 32Vs

1.5 Big Data Analytics and Machine Learning

1.6 Big Data Analytics and Cloud Computing

1.7 Hadoop, HDFS, MapReduce, Spark, and Flink

1.8 ML + CC → BDA and Guidelines

1.9 Conclusion

Chapter 2: Real-Time Analytics

Abstract

2.1 Introduction

2.2 Computing Abstractions for Real-Time Analytics

2.3 Characteristics of Real-Time Systems

2.4 Real-Time Processing for Big Data — Concepts and Platforms

2.5 Data Stream Processing Platforms

2.6 Data Stream Analytics Platforms

2.7 Data Analysis and Analytic Techniques

2.8 Finance Domain Requirements and a Case Study

2.9 Future Research Challenges

Chapter 3: Big Data Analytics for Social Media

Abstract

Acknowledgments

3.1 Introduction

3.2 NLP and Its Applications

3.3 Text Mining

3.4 Anomaly Detection

Chapter 4: Deep Learning and Its Parallelization

Abstract

4.1 Introduction

4.2 Concepts and Categories of Deep Learning

4.3 Parallel Optimization for Deep Learning

4.4 Discussions

Chapter 5: Characterization and Traversal of Large Real-World Networks

Abstract

Acknowledgments

5.1 Introduction

5.2 Background

5.3 Characterization and Measurement

5.4 Efficient Complex Network Traversal

5.5 k-Core-Based Partitioning for Heterogeneous Graph Processing

5.6 Future Directions

5.7 Conclusions

Part II: Big Data Infrastructures and Platforms

Chapter 6: Database Techniques for Big Data

Abstract

6.1 Introduction

6.2 Background

6.3 NoSQL Movement

6.4 NoSQL Solutions for Big Data Management

6.5 NoSQL Data Models

6.6 Future Directions

6.7 Conclusions

Chapter 7: Resource Management in Big Data Processing Systems

Abstract

7.1 Introduction

7.2 Types of Resource Management

7.3 Big Data Processing Systems and Platforms

7.4 Single-Resource Management in the Cloud

7.5 Multiresource Management in the Cloud

7.6 Related Work on Resource Management

7.7 Open Problems

7.8 Summary

Chapter 8: Local Resource Consumption Shaping: A Case for MapReduce

Abstract

8.1 Introduction

8.2 Motivation

8.3 Local Resource Shaper

8.4 Evaluation

8.5 Related Work

8.6 Conclusions

Appendix CPU Utilization With Different Slot Configurations and LRS

Chapter 9: System Optimization for Big Data Processing

Abstract

9.1 Introduction

9.2 Basic Framework of the Hadoop Ecosystem

9.3 Parallel Computation Framework: MapReduce

9.4 Job Scheduling of Hadoop

9.5 Performance Optimization of HDFS

9.6 Performance Optimization of HBase

9.7 Performance Enhancement of Hadoop System

9.8 Conclusions and Future Directions

Chapter 10: Packing Algorithms for Big Data Replay on Multicore

Abstract

10.1 Introduction

10.2 Performance Bottlenecks

10.3 The Big Data Replay Method

10.4 Packing Algorithms

10.5 Performance Analysis

10.6 Summary and Future Directions

Part III: Big Data Security and Privacy

Chapter 11: Spatial Privacy Challenges in Social Networks

Abstract

Acknowledgments

11.1 Introduction

11.2 Background

11.3 Spatial Aspects of Social Networks

11.4 Cloud-Based Big Data Infrastructure

11.5 Spatial Privacy Case Studies

11.6 Conclusions

Chapter 12: Security and Privacy in Big Data

Abstract

12.1 Introduction

12.2 Secure Queries Over Encrypted Big Data

12.3 Other Big Data Security

12.4 Privacy on Correlated Big Data

12.5 Future Directions

12.6 Conclusions

Chapter 13: Location Inferring in Internet of Things and Big Data

Abstract

Acknowledgments

13.1 Introduction

13.2 Device-Based Sensing Using Big Data

13.3 Device-Free Sensing Using Big Data

13.4 Conclusion

Part IV: Big Data Applications

Chapter 14: A Framework for Mining Thai Public Opinions

Abstract

Acknowledgments

14.1 Introduction

14.2 XDOM

14.3 Implementation

14.4 Validation

14.5 Case Studies

14.6 Summary and Conclusions

Chapter 15: A Case Study in Big Data Analytics: Exploring Twitter Sentiment Analysis and the Weather

Abstract

Acknowledgments

15.1 Background

15.2 Big Data System Components

15.3 Machine-Learning Methodology

15.4 System Implementation

15.5 Key Findings

15.6 Summary and Conclusions

Chapter 16: Dynamic Uncertainty-Based Analytics for Caching Performance Improvements in Mobile Broadband Wireless Networks

Abstract

16.1 Introduction

16.2 Background

16.3 Related Work

16.4 VoD Architecture

16.5 Overview

16.6 Data Generation

16.7 Edge and Core Components

16.8 INCA Caching Algorithm

16.9 QoE Estimation

16.10 Theoretical Framework

16.11 Experiments and Results

16.12 Synthetic Dataset

16.13 Conclusions and Future Directions

Chapter 17: Big Data Analytics on a Smart Grid: Mining PMU Data for Event and Anomaly Detection

Abstract

Acknowledgments

17.1 Introduction

17.2 Smart Grid With PMUs and PDCs

17.3 Improving Traditional Workflow

17.4 Characterizing Normal Operation

17.5 Identifying Unusual Phenomena

17.6 Identifying Known Events

17.7 Related Efforts

17.8 Conclusion and Future Directions

Chapter 18: eScience and Big Data Workflows in Clouds: A Taxonomy and Survey

Abstract

18.1 Introduction

18.2 Background

18.3 Taxonomy and Review of eScience Services in the Cloud

18.4 Resource Provisioning for eScience Workflows in Clouds

18.5 Open Problems

18.6 Summary

Index