Index
Chapter 1. Preparing Your Data Science Environment
A traditional cookbook contains culinary recipes of interest to the authors and helps readers expand their repertoire of foods to prepare. Many might believe that the end product of a recipe is the dish itself, and one can read this book in much the same way. Every chapter guides the reader through the application of the stages of the data science pipeline to different datasets with various goals. Also, just as in cooking, the final product can simply be the analysis applied to a particular set.
We hope that you will take a broader view, however. Data scientists learn by doing, ensuring that every iteration and hypothesis improves the practitioner's knowledge base. By taking multiple datasets through the data science pipeline using two different programming languages (R and Python), we hope that you will start to abstract out the analysis patterns, see the bigger picture, and achieve a deeper understanding of this rather ambiguous field o
Chapter 2. Driving Visual Analysis with Automobile Data with R
In this chapter, we will cover the following:
Acquiring automobile fuel efficiency data
Preparing R for your first project
Importing automobile fuel efficiency data into R
Exploring and describing fuel efficiency data
Analyzing automobile fuel efficiency over time
Investigating the makes and models of automobiles
Chapter 3. Creating Application-Oriented Analyses Using Tax Data and Python
In this chapter, we will cover:
Preparing for the analysis of top incomes
Importing and exploring the world's top incomes dataset
Analyzing and visualizing the top income data of the US
Furthering the analysis of the top income groups of the US
Reporting with Jinja2
Repeating the analysis in R
Chapter 4. Modeling Stock Market Data
In this chapter, we will cover:
Acquiring stock market data
Summarizing the data
Cleaning and exploring the data
Generating relative valuations
Screening stocks and analyzing historical prices
Chapter 5. Visually Exploring Employment Data
In this chapter, we will cover:
Preparing for analysis
Importing employment data into R
Exploring the employment data
Obtaining and merging additional data
Adding geographical information
Extracting state- and county-level wage and employment information
Visualizing geographical distributions of pay
Exploring where the jobs are, by industry
Animating maps for a geospatial time series
Benchmarking performance for some common tasks
Chapter 6. Driving Visual Analyses with Automobile Data
In this chapter, we will cover:
Getting started with Jupyter
Exploring Jupyter Notebook
Preparing to analyze automobile fuel efficiencies
Exploring and describing fuel efficiency data with Python
Analyzing automobile fuel efficiency over time with Python
Investigating the makes and models of automobiles with Python
Chapter 7. Working with Social Graphs
In this chapter, we will cover:
Preparing to work with social networks in Python
Importing networks
Exploring subgraphs within a heroic network
Finding strong ties
Finding key players
Exploring characteristics of entire networks
Clustering and community detection in social networks
Visualizing graphs
Social networks in R
Chapter 8. Recommending Movies at Scale (Python)
In this chapter, we will cover the following recipes:
Modeling preference expressions
Understanding the data
Ingesting the movie review data
Finding the highest-scoring movies
Improving the movie-rating system
Measuring the distance between users in the preference space
Computing the correlation between users
Finding the best critic for a user
Predicting movie ratings for users
Collaboratively filtering item by item
Building a non-negative matrix factorization model
Loading the entire dataset into the memory
Dumping the SVD-based model to the disk
Training the SVD-based model
Testing the SVD-based model
Chapter 9. Harvesting and Geolocating Twitter Data (Python)
In this chapter, we will cover the following recipes:
Creating a Twitter application
Understanding the Twitter API v1.1
Determining your Twitter followers and friends
Pulling Twitter user profiles
Making requests without running afoul of Twitter's rate limits
Storing JSON data to the disk
Setting up MongoDB for storing Twitter data
Storing user profiles in MongoDB using PyMongo
Exploring the geographic information available in profiles
Plotting geospatial data in Python
Chapter 10. Forecasting New Zealand Overseas Visitors
In this chapter, we will cover the following recipes:
Creating time series objects
Visualizing time series data
Exploratory methods and insights
Trend and season analysis
ARIMA modeling
Accuracy assessment
Fitting Seasonal ARIMA modeling
Chapter 11. German Credit Data Analysis
In this chapter, we will cover the following recipes:
Transforming the data
Visualizing categorical data
Discriminant analysis for identifying defaults
Fitting logistic regression model
A decision tree for the German Data
Finer aspects of decision trees