Imperial Library
Index
Agile Data Science
Preface
Who This Book Is For How This Book Is Organized Conventions Used in This Book Using Code Examples Safari® Books Online How to Contact Us
I. Setup
1. Theory
Agile Big Data Big Words Defined Agile Big Data Teams
Recognizing the Opportunity and Problem Adapting to Change
Harnessing the power of generalists Leveraging agile platforms Sharing intermediate results
Agile Big Data Process Code Review and Pair Programming Agile Environments: Engineering Productivity
Collaboration Space Private Space Personal Space
Realizing Ideas with Large-Format Printing
2. Data
Email Working with Raw Data
Raw Email Structured Versus Semistructured Data
SQL NoSQL
Serialization Extracting and Exposing Features in Evolving Schemas Data Pipelines
Data Perspectives
Networks Time Series Natural Language Probability Conclusion
3. Agile Tools
Scalability = Simplicity Agile Big Data Processing Setting Up a Virtual Environment for Python Serializing Events with Avro
Avro for Python
Installation Testing
Collecting Data Data Processing with Pig
Installing Pig
Publishing Data with MongoDB
Installing MongoDB Installing MongoDB’s Java Driver Installing mongo-hadoop Pushing Data to MongoDB from Pig
Searching Data with ElasticSearch
Installation ElasticSearch and Pig with Wonderdog
Installing Wonderdog Wonderdog and Pig Searching our data Python and ElasticSearch with pyelasticsearch
Reflecting on our Workflow Lightweight Web Applications
Python and Flask
Flask Echo ch03/python/flask_echo.py Python and Mongo with pymongo Displaying sent_counts in Flask
Presenting Our Data
Installing Bootstrap Booting Bootstrap Visualizing Data with D3.js and nvd3.js
Conclusion
4. To the Cloud!
Introduction GitHub dotCloud
Echo on dotCloud Python Workers
Amazon Web Services
Simple Storage Service Elastic MapReduce MongoDB as a Service
Pushing data from Pig to MongoDB at dotCloud
Instrumentation
Google Analytics Mortar Data
II. Climbing the Pyramid
5. Collecting and Displaying Records
Putting It All Together Collect and Serialize Our Inbox Process and Publish Our Emails Presenting Emails in a Browser
Serving Emails with Flask and pymongo Rendering HTML5 with Jinja2
Agile Checkpoint Listing Emails
Listing Emails with MongoDB Anatomy of a Presentation
Reinventing the wheel? Prototyping back from HTML
Searching Our Email
Indexing Our Email with Pig, ElasticSearch, and Wonderdog Searching Our Email on the Web
Conclusion
6. Visualizing Data with Charts
Good Charts Extracting Entities: Email Addresses
Extracting Emails
Visualizing Time Conclusion
7. Exploring Data with Reports
Building Reports with Multiple Charts Linking Records Extracting Keywords from Emails with TF-IDF Conclusion
8. Making Predictions
Predicting Response Rates to Emails Personalization Conclusion
9. Driving Actions
Properties of Successful Emails Better Predictions with Naive Bayes P(Reply | From & To) P(Reply | Token) Making Predictions in Real Time Logging Events Conclusion
Index About the Author Colophon Copyright