Imperial Library
Index
Regular Expressions Cookbook
Preface
Caught in the Snarls of Different Versions Intended Audience Technology Covered Organization of This Book Conventions Used in This Book Using Code Examples Safari® Books Online How to Contact Us Acknowledgments
1. Introduction to Regular Expressions
Regular Expressions Defined
Many Flavors of Regular Expressions Regex Flavors Covered by This Book
Search and Replace with Regular Expressions
Many Flavors of Replacement Text
Tools for Working with Regular Expressions
RegexBuddy RegexPal RegexMagic More Online Regex Testers
RegexPlanet regex.larsolavtorvik.com Nregex Rubular myregexp.com
More Desktop Regular Expression Testers
Expresso The Regulator SDL Regex Fuzzer
grep
PowerGREP Windows Grep RegexRenamer
Popular Text Editors
2. Basic Regular Expression Skills
2.1. Match Literal Text
Problem Solution Discussion Variations
Block escape Case-insensitive matching
See Also
2.2. Match Nonprintable Characters
Problem Solution Discussion Variations on Representations of Nonprinting Characters
The 26 control characters The 7-bit character set
See Also
2.3. Match One of Many Characters
Problem Solution
Calendar with misspellings Hexadecimal character Nonhexadecimal character
Discussion Variations
Shorthands Case insensitivity
Flavor-Specific Features
.NET character class subtraction Java character class union, intersection, and subtraction
See Also
2.4. Match Any Character
Problem Solution
Any character except line breaks Any character including line breaks
Discussion
Any character except line breaks Any character including line breaks Dot abuse
Variations See Also
2.5. Match Something at the Start and/or the End of a Line
Problem Solution
Start of the subject End of the subject Start of a line End of a line
Discussion
Anchors and lines Start of the subject End of the subject Start of a line End of a line Zero-length matches
Variations See Also
2.6. Match Whole Words
Problem Solution
Word boundaries Nonboundaries
Discussion
Word boundaries Nonboundaries
Word Characters See Also
2.7. Unicode Code Points, Categories, Blocks, and Scripts
Problem Solution
Unicode code point Unicode category Unicode block Unicode script Unicode grapheme
Discussion
Unicode code point Unicode category Unicode block Unicode script Unicode grapheme
Variations
Negated variant Character classes Listing all characters
See Also
2.8. Match One of Several Alternatives
Problem Solution Discussion See Also
2.9. Group and Capture Parts of the Match
Problem Solution Discussion Variations
Noncapturing groups Group with mode modifiers
See Also
2.10. Match Previously Matched Text Again
Problem Solution Discussion See Also
2.11. Capture and Name Parts of the Match
Problem Solution
Named capture Named backreferences
Discussion
Named capture Named backreferences Groups with the same name
See Also
2.12. Repeat Part of the Regex a Certain Number of Times
Problem Solution
Googol Hexadecimal number Hexadecimal number with optional suffix Floating-point number
Discussion
Fixed repetition Variable repetition Infinite repetition Making something optional Repeating groups
See Also
2.13. Choose Minimal or Maximal Repetition
Problem Solution Discussion See Also
2.14. Eliminate Needless Backtracking
Problem Solution Discussion See Also
2.15. Prevent Runaway Repetition
Problem Solution Discussion Variations See Also
2.16. Test for a Match Without Adding It to the Overall Match
Problem Solution Discussion
Lookaround Negative lookaround Different levels of lookbehind Matching the same text twice Lookaround is atomic
Alternative to Lookbehind Solution Without Lookbehind See Also
2.17. Match One of Two Alternatives Based on a Condition
Problem Solution Discussion See Also
2.18. Add Comments to a Regular Expression
Problem Solution Discussion
Free-spacing mode Java has free-spacing character classes
Variations
2.19. Insert Literal Text into the Replacement Text
Problem Solution Discussion
When and how to escape characters in replacement text .NET and JavaScript Java PHP Perl Python and Ruby More escape rules for string literals
See Also
2.20. Insert the Regex Match into the Replacement Text
Problem Solution
Regular expression Replacement
Discussion See Also
2.21. Insert Part of the Regex Match into the Replacement Text
Problem Solution
Regular expression Replacement
Discussion
Replacements using capturing groups $10 and higher References to nonexistent groups
Solution Using Named Capture
Regular expression Replacement Flavors that support named capture
See Also
2.22. Insert Match Context into the Replacement Text
Problem Solution Discussion See Also
3. Programming with Regular Expressions
Programming Languages and Regex Flavors
Languages Covered in This Chapter More Programming Languages
3.1. Literal Regular Expressions in Source Code
Problem Solution
C# VB.NET Java JavaScript XRegExp PHP Perl Python Ruby
Discussion
C# VB.NET Java JavaScript XRegExp PHP Perl Python Ruby
See Also
3.2. Import the Regular Expression Library
Problem Solution
C# VB.NET XRegExp Java Python
Discussion
C# VB.NET Java JavaScript XRegExp PHP Perl Python Ruby
3.3. Create Regular Expression Objects
Problem Solution
C# VB.NET Java JavaScript XRegExp Perl Python Ruby
Discussion
.NET Java JavaScript XRegExp PHP Perl Python Ruby
Compiling a Regular Expression Down to CIL
C# VB.NET
Discussion See Also
3.4. Set Regular Expression Options
Problem Solution
C# VB.NET Java JavaScript XRegExp PHP Perl Python Ruby
Discussion
.NET Java JavaScript XRegExp PHP Perl Python Ruby
Additional Language-Specific Options
.NET Java JavaScript XRegExp PHP Perl Python Ruby
See Also
3.5. Test If a Match Can Be Found Within a Subject String
Problem Solution
C# VB.NET Java JavaScript PHP Perl Python Ruby
Discussion
C# and VB.NET Java JavaScript PHP Perl Python Ruby
See Also
3.6. Test Whether a Regex Matches the Subject String Entirely
Problem Solution
C# VB.NET Java JavaScript PHP Perl Python Ruby
Discussion
C# and VB.NET Java JavaScript PHP Perl Python Ruby
See Also
3.7. Retrieve the Matched Text
Problem Solution
C# VB.NET Java JavaScript PHP Perl Python Ruby
Discussion
.NET Java JavaScript PHP Perl Python Ruby
See Also
3.8. Determine the Position and Length of the Match
Problem Solution
C# VB.NET Java JavaScript PHP Perl Python Ruby
Discussion
.NET Java JavaScript PHP Perl Python Ruby
See Also
3.9. Retrieve Part of the Matched Text
Problem Solution
C# VB.NET Java JavaScript PHP Perl Python Ruby
Discussion
.NET Java JavaScript PHP Perl Python Ruby
Named Capture
C# VB.NET Java XRegExp PHP Perl Python Ruby
See Also
3.10. Retrieve a List of All Matches
Problem Solution
C# VB.NET Java JavaScript PHP Perl Python Ruby
Discussion
.NET Java JavaScript PHP Perl Python Ruby
See Also
3.11. Iterate over All Matches
Problem Solution
C# VB.NET Java JavaScript XRegExp PHP Perl Python Ruby
Discussion
.NET Java JavaScript XRegExp PHP Perl Python Ruby
See Also
3.12. Validate Matches in Procedural Code
Problem Solution
C# VB.NET Java JavaScript XRegExp PHP Perl Python Ruby
Discussion See Also
3.13. Find a Match Within Another Match
Problem Solution
C# VB.NET Java JavaScript XRegExp PHP Perl Python Ruby
Discussion See Also
3.14. Replace All Matches
Problem Solution
C# VB.NET Java JavaScript PHP Perl Python Ruby
Discussion
.NET Java JavaScript PHP Perl Python Ruby
See Also
3.15. Replace Matches Reusing Parts of the Match
Problem Solution
C# VB.NET Java JavaScript PHP Perl Python Ruby
Discussion
.NET Java JavaScript PHP Perl Python Ruby
Named Capture
C# VB.NET Java 7 XRegExp PHP Perl Python Ruby
See Also
3.16. Replace Matches with Replacements Generated in Code
Problem Solution
C# VB.NET Java JavaScript PHP Perl Python Ruby
Discussion
C# VB.NET Java JavaScript PHP Perl Python Ruby
See Also
3.17. Replace All Matches Within the Matches of Another Regex
Problem Solution
C# VB.NET Java JavaScript PHP Perl Python Ruby
Discussion See Also
3.18. Replace All Matches Between the Matches of Another Regex
Problem Solution
C# VB.NET Java JavaScript PHP Perl Python Ruby
Discussion
Perl and Ruby Python
See Also
3.19. Split a String
Problem Solution
C# VB.NET Java JavaScript XRegExp PHP Perl Python Ruby
Discussion
C# and VB.NET Java JavaScript XRegExp PHP Perl Python Ruby
See Also
3.20. Split a String, Keeping the Regex Matches
Problem Solution
C# VB.NET Java JavaScript XRegExp PHP Perl Python Ruby
Discussion
.NET Java JavaScript XRegExp PHP Perl Python Ruby
See Also
3.21. Search Line by Line
Problem Solution
C# VB.NET Java JavaScript PHP Perl Python Ruby
Discussion See Also
3.22. Construct a Parser
Problem Solution
C# VB.NET Java JavaScript XRegExp Perl Python PHP Ruby
Discussion See Also
4. Validation and Formatting
4.1. Validate Email Addresses
Problem Solution
Simple Simple, with restrictions on characters Simple, with all valid local part characters No leading, trailing, or consecutive dots Top-level domain has two to six letters
Discussion
About email addresses Regular expression syntax Building a regex step-by-step
Variations See Also
4.2. Validate and Format North American Phone Numbers
Problem Solution
Regular expression Replacement C# example JavaScript example Other programming languages
Discussion Variations
Eliminate invalid phone numbers Find phone numbers in documents Allow a leading “1” Allow seven-digit phone numbers
See Also
4.3. Validate International Phone Numbers
Problem Solution
Regular expression JavaScript example
Discussion Variations
Validate international phone numbers in EPP format
See Also
4.4. Validate Traditional Date Formats
Problem Solution Discussion Variations See Also
4.5. Validate Traditional Date Formats, Excluding Invalid Dates
Problem Solution
C# Perl Pure regular expression
Discussion
Regex with procedural code Pure regular expression
Variations See Also
4.6. Validate Traditional Time Formats
Problem Solution Discussion Variations See Also
4.7. Validate ISO 8601 Dates and Times
Problem Solution
Dates Weeks Times Date and time XML Schema dates and times
Discussion See Also
4.8. Limit Input to Alphanumeric Characters
Problem Solution
Regular expression Ruby example
Discussion Variations
Limit input to ASCII characters Limit input to ASCII noncontrol characters and line breaks Limit input to shared ISO-8859-1 and Windows-1252 characters Limit input to alphanumeric characters in any language
See Also
4.9. Limit the Length of Text
Problem Solution
Regular expression Perl example
Discussion Variations
Limit the length of an arbitrary pattern Limit the number of nonwhitespace characters Limit the number of words
See Also
4.10. Limit the Number of Lines in Text
Problem Solution
Regular expression PHP (PCRE) example
Discussion Variations
Working with esoteric line separators
See Also
4.11. Validate Affirmative Responses
Problem Solution
Regular expression JavaScript example
Discussion See Also
4.12. Validate Social Security Numbers
Problem Solution
Regular expression Python example
Discussion Variations
Find Social Security numbers in documents
See Also
4.13. Validate ISBNs
Problem Solution
Regular expressions JavaScript example, with checksum validation Python example, with checksum validation
Discussion
ISBN-10 checksum ISBN-13 checksum
Variations
Find ISBNs in documents Eliminate incorrect ISBN identifiers
See Also
4.14. Validate ZIP Codes
Problem Solution
Regular expression VB.NET example
Discussion See Also
4.15. Validate Canadian Postal Codes
Problem Solution Discussion See Also
4.16. Validate U.K. Postcodes
Problem Solution Discussion See Also
4.17. Find Addresses with Post Office Boxes
Problem Solution
Regular expression C# example
Discussion See Also
4.18. Reformat Names From “FirstName LastName” to “LastName, FirstName”
Problem Solution
Regular expression Replacement JavaScript example
Discussion Variations
List surname particles at the beginning of the name
See Also
4.19. Validate Password Complexity
Problem Solution
Length between 8 and 32 characters ASCII visible and space characters only One or more uppercase letters One or more lowercase letters One or more numbers One or more special characters Disallow three or more sequential identical characters Example JavaScript solution, basic Example JavaScript solution, with x out of y validation Example JavaScript solution, with password security ranking
Discussion
Example JavaScript solutions
Variations
Validate multiple password rules with a single regex
See Also
4.20. Validate Credit Card Numbers
Problem Solution
Strip spaces and hyphens Validate the number Example web page with JavaScript
Discussion
Strip spaces and hyphens Validate the number Incorporating the solution into a web page
Extra Validation with the Luhn Algorithm See Also
4.21. European VAT Numbers
Problem Solution
Strip whitespace and punctuation Validate the number
Discussion
Strip whitespace and punctuation Validate the number
Variations See Also
5. Words, Lines, and Special Characters
5.1. Find a Specific Word
Problem Solution Discussion See Also
5.2. Find Any of Multiple Words
Problem Solution
Using alternation Example JavaScript solution
Discussion
Using alternation Example JavaScript solution
See Also
5.3. Find Similar Words
Problem Solution
Color or colour Bat, cat, or rat Words ending with “phobia” Steve, Steven, or Stephen Variations of “regular expression”
Discussion
Use word boundaries to match complete words Color or colour Bat, cat, or rat Words ending with “phobia” Steve, Steven, or Stephen Variations of “regular expression”
See Also
5.4. Find All Except a Specific Word
Problem Solution Discussion Variations
Find words that don’t contain another word
See Also
5.5. Find Any Word Not Followed by a Specific Word
Problem Solution Discussion Variations See Also
5.6. Find Any Word Not Preceded by a Specific Word
Problem Solution
Lookbehind you Words not preceded by “cat” Simulate lookbehind
Discussion
Fixed, finite, and infinite length lookbehind Simulate lookbehind
Variations See Also
5.7. Find Words Near Each Other
Problem Solution Discussion Variations
Using a conditional Match three or more words near each other
Exponentially increasing permutations The ugly solution Exploiting empty backreferences JavaScript backreferences by its own rules
Multiple words, any distance from each other
See Also
5.8. Find Repeated Words
Problem Solution Discussion Variations See Also
5.9. Remove Duplicate Lines
Problem Solution
Option 1: Sort lines and remove adjacent duplicates Option 2: Keep the last occurrence of each duplicate line in an unsorted file Option 3: Keep the first occurrence of each duplicate line in an unsorted file
Discussion
Option 1: Sort lines and remove adjacent duplicates Option 2: Keep the last occurrence of each duplicate line in an unsorted file Option 3: Keep the first occurrence of each duplicate line in an unsorted file
See Also
5.10. Match Complete Lines That Contain a Word
Problem Solution Discussion Variations See Also
5.11. Match Complete Lines That Do Not Contain a Word
Problem Solution Discussion See Also
5.12. Trim Leading and Trailing Whitespace
Problem Solution Discussion Variations See Also
5.13. Replace Repeated Whitespace with a Single Space
Problem Solution
Clean any whitespace characters Clean horizontal whitespace characters
Discussion
Clean any whitespace characters Clean horizontal whitespace characters
See Also
5.14. Escape Regular Expression Metacharacters
Problem Solution
Built-in solutions Regular expression Replacement Example JavaScript function
Discussion Variations See Also
6. Numbers
6.1. Integer Numbers
Problem Solution Discussion See Also
6.2. Hexadecimal Numbers
Problem Solution Discussion See Also
6.3. Binary Numbers
Problem Solution Discussion See Also
6.4. Octal Numbers
Problem Solution Discussion See Also
6.5. Decimal Numbers
Problem Solution Discussion See Also
6.6. Strip Leading Zeros
Problem Solution
Regular expression Replacement Getting the numbers in Perl Stripping leading zeros in PHP
Discussion See Also
6.7. Numbers Within a Certain Range
Problem Solution Discussion See Also
6.8. Hexadecimal Numbers Within a Certain Range
Problem Solution Discussion See Also
6.9. Integer Numbers with Separators
Problem Solution Discussion See Also
6.10. Floating-Point Numbers
Problem Solution Discussion See Also
6.11. Numbers with Thousand Separators
Problem Solution Discussion See Also
6.12. Add Thousand Separators to Numbers
Problem Solution
Basic solution Match separator positions only, using lookbehind
Discussion
Introduction Basic solution Match separator positions only, using lookbehind
Variations
Don’t add commas after a decimal point
Use infinite lookbehind Search-and-replace within matched numbers
See Also
6.13. Roman Numerals
Problem Solution Discussion Convert Roman Numerals to Decimal See Also
7. Source Code and Log Files
7.1. Keywords
Problem Solution Discussion Variations See Also
7.2. Identifiers
Problem Solution Discussion See Also
7.3. Numeric Constants
Problem Solution Discussion See Also
7.4. Operators
Problem Solution Discussion
7.5. Single-Line Comments
Problem Solution Discussion See Also
7.6. Multiline Comments
Problem Solution Discussion Variations See Also
7.7. All Comments
Problem Solution Discussion See Also
7.8. Strings
Problem Solution Discussion Variations See Also
7.9. Strings with Escapes
Problem Solution Discussion Variations See Also
7.10. Regex Literals
Problem Solution Discussion See Also
7.11. Here Documents
Problem Solution Discussion See Also
7.12. Common Log Format
Problem Solution Discussion Variations See Also
7.13. Combined Log Format
Problem Solution Discussion See Also
7.14. Broken Links Reported in Web Logs
Problem Solution Discussion See Also
8. URLs, Paths, and Internet Addresses
8.1. Validating URLs
Problem Solution Discussion See Also
8.2. Finding URLs Within Full Text
Problem Solution Discussion See Also
8.3. Finding Quoted URLs in Full Text
Problem Solution Discussion See Also
8.4. Finding URLs with Parentheses in Full Text
Problem Solution Discussion See Also
8.5. Turn URLs into Links
Problem Solution Discussion See Also
8.6. Validating URNs
Problem Solution Discussion See Also
8.7. Validating Generic URLs
Problem Solution Discussion See Also
8.8. Extracting the Scheme from a URL
Problem Solution
Extract the scheme from a URL known to be valid Extract the scheme while validating the URL
Discussion See Also
8.9. Extracting the User from a URL
Problem Solution
Extract the user from a URL known to be valid Extract the user while validating the URL
Discussion See Also
8.10. Extracting the Host from a URL
Problem Solution
Extract the host from a URL known to be valid Extract the host while validating the URL
Discussion See Also
8.11. Extracting the Port from a URL
Problem Solution
Extract the port from a URL known to be valid Extract the port while validating the URL
Discussion See Also
8.12. Extracting the Path from a URL
Problem Solution Discussion See Also
8.13. Extracting the Query from a URL
Problem Solution Discussion See Also
8.14. Extracting the Fragment from a URL
Problem Solution Discussion See Also
8.15. Validating Domain Names
Problem Solution Discussion See Also
8.16. Matching IPv4 Addresses
Problem Solution
Regular expression Perl
Discussion See Also
8.17. Matching IPv6 Addresses
Problem Solution
Standard notation Mixed notation Standard or mixed notation Compressed notation Compressed mixed notation Standard, mixed, or compressed notation
Discussion
Standard notation Mixed notation Standard or mixed notation Compressed notation Compressed mixed notation Standard, mixed, or compressed notation
See Also
8.18. Validate Windows Paths
Problem Solution
Drive letter paths Drive letter and UNC paths Drive letter, UNC, and relative paths
Discussion
Drive letter paths Drive letter and UNC paths Drive letter, UNC, and relative paths
See Also
8.19. Split Windows Paths into Their Parts
Problem Solution
Drive letter paths Drive letter and UNC paths Drive letter, UNC, and relative paths
Discussion
Drive letter paths Drive letter and UNC paths Drive letter, UNC, and relative paths
See Also
8.20. Extract the Drive Letter from a Windows Path
Problem Solution Discussion See Also
8.21. Extract the Server and Share from a UNC Path
Problem Solution Discussion See Also
8.22. Extract the Folder from a Windows Path
Problem Solution Discussion See Also
8.23. Extract the Filename from a Windows Path
Problem Solution Discussion See Also
8.24. Extract the File Extension from a Windows Path
Problem Solution Discussion See Also
8.25. Strip Invalid Characters from Filenames
Problem Solution
Regular expression Replacement
Discussion See Also
9. Markup and Data Formats
Processing Markup and Data Formats with Regular Expressions
Basic Rules for Formats Covered in This Chapter
9.1. Find XML-Style Tags
Problem Solution
Quick and dirty Allow > in attribute values (X)HTML tags (loose) (X)HTML tags (strict) XML tags (strict)
Discussion
A few words of caution Quick and dirty Allow > in attribute values (X)HTML tags (loose) (X)HTML tags (strict) XML tags (strict)
Skip Tricky (X)HTML and XML Sections
Outer regex for (X)HTML Outer regex for XML
See Also
9.2. Replace <b> Tags with <strong>
Problem Solution Discussion Variations
Replace a list of tags
See Also
9.3. Remove All XML-Style Tags Except <em> and <strong>
Problem Solution
Solution 1: Match tags except <em> and <strong> Solution 2: Match tags except <em> and <strong>, and any tags that contain attributes
Discussion Variations
Whitelist specific attributes
See Also
9.4. Match XML Names
Problem Solution
XML 1.0 names (approximate) XML 1.1 names (exact)
Discussion
XML 1.0 names XML 1.1 names
Variations See Also
9.5. Convert Plain Text to HTML by Adding <p> and <br> Tags
Problem Solution
Step 1: Replace HTML special characters with named character references Step 2: Replace all line breaks with <br> Step 3: Replace double <br> tags with </p><p> Step 4: Wrap the entire string with <p>⋯</p> Example JavaScript solution
Discussion
Step 1: Replace HTML special characters with named character references Step 2: Replace all line breaks with <br> Step 3: Replace double <br> tags with </p><p> Step 4: Wrap the entire string with <p>⋯</p>
See Also
9.6. Decode XML Entities
Problem Solution
Regular expression Replace matches with their corresponding literal characters Example JavaScript solution
Discussion See Also
9.7. Find a Specific Attribute in XML-Style Tags
Problem Solution
Tags that contain an id attribute (quick and dirty) Tags that contain an id attribute (more reliable) <div> tags that contain an id attribute Tags that contain an id attribute with the value “my-id” Tags that contain “my-class” within their class attribute value
Discussion See Also
9.8. Add a cellspacing Attribute to <table> Tags That Do Not Already Include It
Problem Solution
Solution 1, simplistic Solution 2, more reliable Insert the new attribute
Discussion See Also
9.9. Remove XML-Style Comments
Problem Solution Discussion
How it works When comments can’t be removed
Variations
Find valid XML comments Find valid HTML comments
See Also
9.10. Find Words Within XML-Style Comments
Problem Solution
Two-step approach Single-step approach
Discussion
Two-step approach Single-step approach
Variations See Also
9.11. Change the Delimiter Used in CSV Files
Problem Solution
Example web page with JavaScript
Discussion See Also
9.12. Extract CSV Fields from a Specific Column
Problem Solution
Example web page with JavaScript
Discussion Variations
Match a CSV record and capture the field in column 1 to backreference 1 Match a CSV record and capture the field in column 2 to backreference 1 Match a CSV record and capture the field in column 3 or higher to backreference 1 Replacement string
See Also
9.13. Match INI Section Headers
Problem Solution Discussion Variations See Also
9.14. Match INI Section Blocks
Problem Solution Discussion See Also
9.15. Match INI Name-Value Pairs
Problem Solution Discussion See Also
Index About the Authors Colophon Copyright
