Clojure Data Analysis Cookbook
Table of Contents
Clojure Data Analysis Cookbook
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Importing Data for Analysis
Introduction
Creating a new project
Getting ready
How to do it...
How it works...
Reading CSV data into Incanter datasets
Getting ready
How to do it…
How it works…
There's more…
See also
Reading JSON data into Incanter datasets
Getting ready
How to do it…
How it works…
Reading data from Excel with Incanter
Getting ready
How to do it…
Reading data from JDBC databases
Getting ready
How to do it…
How it works…
See also
Reading XML data into Incanter datasets
Getting ready
How to do it…
How it works…
There's more…
Navigating structures with zippers
Processing in a pipeline
Comparing XML and JSON
Scraping data from tables in web pages
Getting ready
How to do it…
How it works…
See also
Scraping textual data from web pages
Getting ready
How to do it…
How it works…
Reading RDF data
Getting ready
How to do it…
How it works…
See also
Reading RDF data with SPARQL
Getting ready
How to do it…
How it works…
See also
Aggregating data from different formats
Getting ready
How to do it…
Creating the triple store
Scraping exchange rates
Loading currency data and tying it all together
How it works…
2. Cleaning and Validating Data
Introduction
Cleaning data with regular expressions
Getting ready
How to do it…
How it works…
There's more...
See also...
Maintaining consistency with synonym maps
Getting ready
How to do it…
How it works…
See also…
Identifying and removing duplicate data
Getting ready
How to do it…
How it works…
There's more…
Normalizing numbers
Getting ready
How to do it…
How it works…
Rescaling values
Getting ready
How to do it…
How it works…
Normalizing dates and times
Getting ready
How to do it…
There's more…
Lazily processing very large data sets
Getting ready
How to do it…
How it works…
Sampling from very large data sets
How to do it…
Sampling by percentage
Sampling exactly
How it works…
Fixing spelling errors
Getting ready
How to do it…
How it works…
There's more…
Parsing custom data formats
Getting ready
How to do it…
How it works…
Validating data with Valip
Getting ready
How to do it…
How it works…
3. Managing Complexity with Concurrent Programming
Introduction
Managing program complexity with STM
Getting ready
How to do it…
How it works…
See also
Managing program complexity with agents
Getting ready
How to do it…
How it works…
There's more...
See also
Getting better performance with commute
Getting ready
How to do it…
How it works…
Combining agents and STM
Getting ready
How to do it…
How it works…
Maintaining consistency with ensure
Getting ready
How to do it…
How it works…
Introducing safe side effects into the STM
Getting ready
How to do it…
Maintaining data consistency with validators
Getting ready
How to do it…
How it works…
See also
Tracking processing with watchers
Getting ready
How to do it…
How it works…
Debugging concurrent programs with watchers
Getting ready
How to do it…
There's more...
Recovering from errors in agents
How to do it…
Failing on errors
Continuing on errors
Using a custom error handler
There's more...
Managing input with sized queues
How to do it…
How it works...
4. Improving Performance with Parallel Programming
Introduction
Parallelizing processing with pmap
How to do it…
How it works…
There's more…
See also
Parallelizing processing with Incanter
Getting ready
How to do it…
How it works…
Partitioning Monte Carlo simulations for better pmap performance
Getting ready
How to do it…
How it works…
Estimating with Monte Carlo simulations
Chunking data for pmap
Finding the optimal partition size with simulated annealing
Getting ready
How to do it…
How it works…
There's more…
Parallelizing with reducers
Getting ready
How to do it…
How it works…
There's more...
See also
Generating online summary statistics with reducers
Getting ready
How to do it…
Harnessing your GPU with OpenCL and Calx
Getting ready
How to do it…
How it works…
Writing the GPU code in C
Wrapping it in Calx
There's more…
Using type hints
Getting ready
How to do it…
How it works…
See also
Benchmarking with Criterium
Getting ready
How to do it…
How it works…
There's more…
5. Distributed Data Processing with Cascalog
Introduction
Distributed processing with Cascalog and Hadoop
Getting ready
How to do it…
How it works…
See also
Querying data with Cascalog
Getting ready
How to do it…
How it works…
There's more…
Distributing data with Apache HDFS
Getting ready
How to do it…
How it works…
Parsing CSV files with Cascalog
Getting ready
How to do it…
How it works…
There's more…
Complex queries with Cascalog
Getting ready
How to do it…
Aggregating data with Cascalog
Getting ready
How to do it…
There's more…
Defining new Cascalog operators
Getting ready
How to do it…
Creating Map operators
Creating Map concatenation operations
Creating filter operators
Creating buffer operators
Creating aggregate operators
Creating parallel aggregate operators
Composing Cascalog queries
Getting ready
How to do it…
How it works…
Handling errors in Cascalog workflows
Getting ready
How to do it…
Transforming data with Cascalog
Getting ready
How to do it…
How it works…
Executing Cascalog queries in the Cloud with Pallet
Getting ready
How to do it...
How it works...
6. Working with Incanter Datasets
Introduction
Loading Incanter's sample datasets
Getting ready
How to do it…
How it works…
There's more...
Loading Clojure data structures into datasets
Getting ready
How to do it…
How it works…
See also
Viewing datasets interactively with view
Getting ready
How to do it…
How it works…
See also
Converting datasets to matrices
Getting ready
How to do it…
How it works…
There's more…
See also
Using infix formulas in Incanter
Getting ready
How to do it…
How it works…
Selecting columns with $
Getting ready
How to do it…
How it works…
There's more…
See also
Selecting rows with $
Getting ready
How to do it…
How it works…
Filtering datasets with $where
Getting ready
How to do it…
How it works…
There's more…
Grouping data with $group-by
Getting ready
How to do it…
How it works…
Saving datasets to CSV and JSON
Getting ready
How to do it…
Saving data as CSV
Saving data as JSON
How it works…
See also
Projecting from multiple datasets with $join
Getting ready
How to do it…
How it works…
7. Preparing for and Performing Statistical Data Analysis with Incanter
Introduction
Generating summary statistics with $rollup
Getting ready
How to do it…
How it works…
Differencing variables to show changes
Getting ready
How to do it…
How it works…
Scaling variables to simplify variable relationships
Getting ready
How to do it…
How it works…
Working with time series data with Incanter Zoo
Getting ready
How to do it…
There's more...
Smoothing variables to decrease noise
Getting ready
How to do it…
How it works…
Validating sample statistics with bootstrapping
Getting ready
How to do it…
How it works…
There's more…
Modeling linear relationships
Getting ready
How to do it…
How it works…
There's more…
Modeling non-linear relationships
Getting ready
How to do it…
How it works...
Modeling multimodal Bayesian distributions
Getting ready
How to do it…
How it works…
There's more...
Finding data errors with Benford's law
Getting ready
How to do it…
How it works…
There's more…
8. Working with Mathematica and R
Introduction
Setting up Mathematica to talk to Clojuratica for Mac OS X and Linux
Getting ready
How to do it…
How it works…
There's more…
Setting up Mathematica to talk to Clojuratica for Windows
Getting ready
How to do it...
How it works...
Calling Mathematica functions from Clojuratica
Getting ready
How to do it…
How it works…
Sending matrices to Mathematica from Clojuratica
Getting ready
How to do it…
How it works…
Evaluating Mathematica scripts from Clojuratica
Getting ready
How to do it…
How it works…
Creating functions from Mathematica
Getting ready
How to do it…
How it works…
Processing functions in parallel in Mathematica
Getting ready
How to do it…
How it works…
Setting up R to talk to Clojure
Getting ready
How to do it…
Setting up R
Setting up Clojure
How it works…
Calling R functions from Clojure
Getting ready
How to do it…
How it works…
There's more…
Passing vectors into R
Getting ready
How to do it…
How it works…
Evaluating R files from Clojure
Getting ready
How to do it…
How it works…
There's more…
Plotting in R from Clojure
Getting ready
How to do it…
How it works…
There's more…
9. Clustering, Classifying, and Working with Weka
Introduction
Loading CSV and ARFF files into Weka
Getting ready
How to do it…
How it works…
There's more…
See also
Filtering and renaming columns in Weka datasets
Getting ready
How to do it…
Renaming columns
Removing columns
Hiding columns
How it works…
Discovering groups of data using K-means clustering
Getting ready
How to do it…
How it works…
Clustering with K-means
Analyzing the results
Building macros
See also
Finding hierarchical clusters in Weka
Getting ready
How to do it…
How it works…
There's more…
Clustering with SOMs in Incanter
Getting ready
How to do it…
How it works…
There's more…
Classifying data with decision trees
Getting ready
How to do it…
How it works…
There's more…
Classifying data with the Naive Bayesian classifier
Getting ready
How to do it…
How it works…
There's more…
Classifying data with support vector machines
Getting ready
How to do it…
How it works…
There's more…
Finding associations in data with the Apriori algorithm
Getting ready
How to do it…
How it works…
There's more…
10. Graphing in Incanter
Introduction
Creating scatter plots with Incanter
Getting ready
How to do it…
How it works…
There's more…
See also
Creating bar charts with Incanter
Getting ready
How to do it…
How it works…
Graphing non-numeric data in bar charts
Getting ready
How to do it…
How it works…
Creating histograms with Incanter
Getting ready
How to do it…
How it works…
Creating function plots with Incanter
Getting ready
How to do it…
How it works…
See also
Adding equations to Incanter charts
Getting ready
How to do it…
There's more…
Adding lines to scatter charts
Getting ready
How to do it…
How it works…
See also
Customizing charts with JFreeChart
Getting ready
How to do it…
How it works…
See also
Saving Incanter graphs to PNG
Getting ready
How to do it…
How it works…
Using PCA to graph multi-dimensional data
Getting ready
How to do it…
How it works…
There's more…
Creating dynamic charts with Incanter
Getting ready
How to do it…
How it works…
11. Creating Charts for the Web
Introduction
Serving data with Ring and Compojure
Getting ready
How to do it…
Configuring and setting up the web application
Serving data
Defining routes and handlers
Running the server
How it works…
There's more…
Creating HTML with Hiccup
Getting ready
How to do it…
How it works…
There's more…
Setting up to use ClojureScript
Getting ready
How to do it…
How it works…
There's more…
Creating scatter plots with NVD3
Getting ready
How to do it…
How it works…
There's more…
Creating bar charts with NVD3
Getting ready
How to do it…
How it works…
Creating histograms with NVD3
Getting ready
How to do it…
How it works…
Visualizing graphs with force-directed layouts
Getting ready
How to do it…
How it works…
There's more…
Creating interactive visualizations with D3
Getting ready
How to do it…
How it works…
There's more…
Index