Index
Title Page | Copyright and Credits
Hands-On Data Analysis with Scala
Dedication | About Packt
Why subscribe? | Packt.com
Contributors
About the author | About the reviewer | Packt is searching for authors like you
Preface
Who this book is for | What this book covers | To get the most out of this book
Download the example code files | Download the color images | Conventions used
Get in touch
Reviews
Section 1: Scala and Data Analysis Life Cycle | Scala Overview
Getting started with Scala
Running Scala code online
Scastie | ScalaFiddle
Installing Scala on your computer
Installing command-line tools | Installing IDE
Overview of object-oriented and functional programming
Object-oriented programming using Scala | Functional programming using Scala
Scala case classes and the collection API
Scala case classes | Scala collection API
Array | List | Map
Overview of Scala libraries for data analysis
Apache Spark | Breeze | Breeze-viz | DeepLearning | Epic | Saddle | Scalalab | Smile | Vegas
Summary
Data Analysis Life Cycle
Data journey | Sourcing data
Data formats
XML | JSON | CSV
Understanding data
Using statistical methods for data exploration
Using Scala | Other Scala tools
Using data visualization for data exploration
Using the vegas-viz library for data visualization | Other libraries for data visualization
Using ML to learn from data
Setting up Smile | Running Smile
Creating a data pipeline | Summary
Data Ingestion
Data extraction
Pull-oriented data extraction | Push-oriented data delivery
Data staging
Why is the staging important?
Cleaning and normalizing | Enriching | Organizing and storing | Summary
Data Exploration and Visualization
Sampling data
Selecting the sample
Selecting samples using Saddle
Performing ad hoc analysis | Finding a relationship between data elements | Visualizing data
Vegas viz for data visualization | Spark Notebook for data visualization
Downloading and installing Spark Notebook | Creating a Spark Notebook with simple visuals | More charts with Spark Notebook
Box plot | Histogram | Bubble chart
Summary
Applying Statistics and Hypothesis Testing
Basics of statistics
Summary level statistics | Correlation statistics
Vector level statistics | Random data generation
Pseudorandom numbers | Random numbers with normal distribution | Random numbers with Poisson distribution
Hypothesis testing | Summary
Section 2: Advanced Data Analysis and Machine Learning | Introduction to Spark for Distributed Data Analysis
Spark setup and overview
Spark core concepts
Spark Datasets and DataFrames | Sourcing data using Spark
Parquet file format | Avro file format | Spark JDBC integration
Using Spark to explore data | Summary
Traditional Machine Learning for Data Analysis
ML overview
Characteristics of ML | Categories or types of ML
Decision trees
Implementing decision trees
Decision tree algorithms
Implementing decision tree algorithms in our example | Evaluating the results
Using our model with a decision tree
Random forest
Random forest algorithms
Ridge and lasso regression
Characteristics of ridge regression | Characteristics of lasso regression
k-means cluster analysis | Natural language processing for data analysis | Algorithm selections | Summary
Section 3: Real-Time Data Analysis and Scalability | Near Real-Time Data Analysis Using Streaming
Overview of streaming | Spark Streaming overview
Word count using pure Scala | Word count using Scala and Spark | Word count using Scala and Spark Streaming | Deep dive into the Spark Streaming solution
Streaming a k-means clustering algorithm using Spark | Streaming linear regression using Spark | Summary
Working with Data at Scale
Working with data at scale | Cost considerations
Data storage | Data governance
Reliability considerations
Input data errors | Processing failures
Summary
Another Book You May Enjoy
Leave a review - let other readers know what you think