Index
Enterprise Data Workflows with Cascading
Preface
Requirements
Enterprise Data Workflows
Complexity, More So Than Bigness
Origins of the Cascading API
Using Code Examples
Safari® Books Online
How to Contact Us
Kudos
1. Getting Started
Programming Environment Setup
Example 1: Simplest Possible App in Cascading
Build and Run
Cascading Taxonomy
Example 2: The Ubiquitous Word Count
Flow Diagrams
Predictability at Scale
2. Extending Pipe Assemblies
Example 3: Customized Operations
Scrubbing Tokens
Example 4: Replicated Joins
Stop Words and Replicated Joins
Comparing with Apache Pig
Comparing with Apache Hive
3. Test-Driven Development
Example 5: TF-IDF Implementation
Example 6: TF-IDF with Testing
A Word or Two About Testing
4. Scalding—A Scala DSL for Cascading
Why Use Scalding?
Getting Started with Scalding
Example 3 in Scalding: Word Count with Customized Operations
A Word or Two about Functional Programming
Example 4 in Scalding: Replicated Joins
Build Scalding Apps with Gradle
Running on Amazon AWS
5. Cascalog—A Clojure DSL for Cascading
Why Use Cascalog?
Getting Started with Cascalog
Example 1 in Cascalog: Simplest Possible App
Example 4 in Cascalog: Replicated Joins
Example 6 in Cascalog: TF-IDF with Testing
Cascalog Technology and Uses
6. Beyond MapReduce
Applications and Organizations
Lingual, a DSL for ANSI SQL
Using the SQL Command Shell
Using the JDBC Driver
Integrating with Desktop Tools
Pattern, a DSL for Predictive Model Markup Language
Getting Started with Pattern
Predefined App for PMML
Integrating Pattern into Cascading Apps
Customer Experiments
Technology Roadmap for Pattern
7. The Workflow Abstraction
Key Insights
Pattern Language
Literate Programming
Separation of Concerns
Functional Relational Programming
Enterprise vs. Start-Ups
8. Case Study: City of Palo Alto Open Data
Why Open Data?
City of Palo Alto
Moving from Raw Sources to Data Products
Calibrating Metrics for the Recommender
Spatial Indexing
Personalization
Recommendations
Build and Run
Key Points of the Recommender Workflow
A. Troubleshooting Workflows
Build and Runtime Problems
Anti-Patterns
Workflow Bottlenecks
Other Resources
Index
About the Author
Colophon
Copyright