Imperial Library

Index
Cover Page
Title
Copyright
Contents
Preface: Data Cleaning Pocket Primer
What Is the Goal?
Is This Book for Me and What Will I Learn?
How Were the Code Samples Created?
What You Need to Know for This Book
Which bash Commands are Excluded?
How Do I Set Up a Command Shell?
What Are the “Next Steps” after Finishing This Book?
Chapter 1: Introduction
What Is Unix?
Available Shell Types
What Is bash?
Getting Help for bash Commands
Navigating Around Directories
The history Command
Listing Filenames with the ls Command
Displaying Contents of Files
The cat Command
The head and tail Commands
The Pipe Symbol
The fold Command
File Ownership: Owner, Group, and World
Hidden Files
Handling Problematic Filenames
Working with Environment Variables
The env Command
Useful Environment Variables
Setting the PATH Environment Variable
Specifying Aliases and Environment Variables
Finding Executable Files
What Are Shell Scripts?
A Simple Shell Script
Using a Semicolon to Separate Commands
The printf Command and the echo Command
The echo Command and Whitespaces
Command Substitution (“back tick”)
Setting Environment Variables via Shell Scripts
Sourcing or “Dotting” a Shell Script
Working with Arrays
Working with Nested Loops
The paste Command
Inserting Blank Lines with the paste Command
The cut Command
Working with Metacharacters
Working with Character Classes
The “pipe” Symbol and Multiple Commands
A Simple Use Case
Another Simple Use Case
Summary
Chapter 2: Useful Commands
The join Command
The fold Command
The split Command
The sort Command
The uniq Command
How to Compare Files
The od Command
The tr Command
A Simple Use Case
The find Command
The tee Command
File Compression Commands
The tar Command
The cpio Command
The gzip and gunzip Commands
The bunzip2 Command
The zip Command
Commands for zip Files and bz Files
Internal Field Separator (IFS)
Data from a Range of Columns in a Dataset
Working with Uneven Rows in Datasets
Working with Functions in Shell Scripts
Recursion and Shell Scripts
Iterative Solutions for Factorial Values
Summary
Chapter 3: Filtering Data with grep
What Is the grep Command?
Metacharacters and the grep Command
Escaping Metacharacters with the grep Command
Useful Options for the grep Command
Character Classes and the grep Command
Working with the -c Option in grep
Matching a Range of Lines
Using Back References in the grep Command
Finding Empty Lines in Datasets
Using Keys to Search Datasets
The Backslash Character and the grep Command
Multiple Matches in the grep Command
The grep Command and the xargs Command
Searching zip Files for a String
Checking for a Unique Key Value
Redirecting Error Messages
The egrep Command and the fgrep Command
Displaying “Pure” Words in a Dataset with egrep
The fgrep Command
A Simple Use Case
Summary
Chapter 4: Transforming Data with sed
What Is the sed Command?
The sed Execution Cycle
Matching String Patterns Using sed
Substituting String Patterns Using sed
Replacing Vowels from a String or a File
Deleting Multiple Digits and Letters from a String
Search and Replace with sed
Datasets with Multiple Delimiters
Useful Switches in sed
Working with Datasets
Printing Lines
Character Classes and sed
Removing Control Characters
Counting Words in a Dataset
Back References in sed
Displaying Only “Pure” Words in a Dataset
One-Line sed Commands
Summary
Chapter 5: Doing Everything Else with awk
The awk Command
Built-in Variables That Control awk
How Does the awk Command Work?
Aligning Text with the printf Command
Conditional Logic and Control Statements
The while Statement
A for loop in awk
A for loop with a break Statement
The next and continue Statements
Deleting Alternate Lines in Datasets
Merging Lines in Datasets
Printing File Contents as a Single Line
Joining Groups of Lines in a Text File
Joining Alternate Lines in a Text File
Matching with Metacharacters and Character Sets
Printing Lines Using Conditional Logic
Splitting Filenames with awk
Working with Postfix Arithmetic Operators
Numeric Functions in awk
One-Line awk Commands
Useful Short awk Scripts
Printing the Words in a Text String in awk
Count Occurrences of a String in Specific Rows
Printing a String in a Fixed Number of Columns
Printing a Dataset in a Fixed Number of Columns
Aligning Columns in Datasets
Aligning Columns and Multiple Rows in Datasets
Removing a Column from a Text File
Subsets of Columns of Even Rows in Datasets
Counting Word Frequency in Datasets
Displaying Only “Pure” Words in a Dataset
Working with Multiline Records in awk
A Simple Use Case
Another Use Case
Summary
Appendix: Other Code Samples
Examples for Chapter 1
Examples for Chapter 2
Calculating Fibonacci Numbers
Calculating the GCD of Two Positive Integers
Calculating the LCM of Two Positive Integers
Calculating Prime Divisors
Examples for Chapter 3
Simulating Relational Data with the grep Command
Checking Updates in a Logfile
Examples for Chapter 4
Examples for Chapter 5
Processing Multiline Records
Adding the Contents of Records
Using the split Function in awk
Scanning Diagonal Elements in Datasets
Adding Values from Multiple Datasets (1)
Adding Values from Multiple Datasets (2)
Adding Values from Multiple Datasets (3)
Calculating Combinations of Field Values
Summary
Index