Regular expressions cheatsheet

A quick reference for some common regular expressions.

13 July 2023

Alternation

Find this OR that. Text containing either alternative, or both, produces a match.

# | = OR

this|that

Matches this OR that
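
As a rough illustration, alternation can be tried out with Python's standard re module. This is a minimal sketch; the sample strings are made up for the example.

```python
import re

# The alternation pattern from above: match "this" OR "that".
pattern = re.compile(r"this|that")

# Hypothetical sample strings, chosen only for illustration.
samples = ["I want this", "I want that", "this and that", "neither one"]

# search() looks for the pattern anywhere in the string.
results = [bool(pattern.search(s)) for s in samples]
print(results)  # [True, True, True, False]
```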

Character sets

Character sets, denoted by a pair of brackets [], let us match one character from a series of characters, allowing for matches with incorrect or different spellings.

con[sc]en[sc]us

Matches consensus, concensus, consencus, concencus.

[cat]

Matches a single c, a, or t, but not the text cat as a whole.

# ^ = NOT
[^cat]

Matches any single character that is NOT c, a, or t, e.g. each of the letters in dog.
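
A minimal Python sketch of the three character-set patterns above, using the standard re module. The word list and the test string "cart dog" are assumptions added for illustration.

```python
import re

# One character from each set: s or c in both positions.
spelling = re.compile(r"con[sc]en[sc]us")
words = ["consensus", "concensus", "consencus", "concencus", "consentus"]
matched = [w for w in words if spelling.fullmatch(w)]
print(matched)  # the first four spellings; "consentus" is rejected

# [cat] matches one character at a time, never the word "cat" as a unit.
single = re.findall(r"[cat]", "cat")
print(single)  # ['c', 'a', 't']

# [^cat] matches any single character except c, a, or t.
negated = re.findall(r"[^cat]", "cart dog")
print(negated)  # ['r', ' ', 'd', 'o', 'g']
```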

Wildcards

Wildcards will match any single character (letter, number, symbol or whitespace) in a piece of text. In most regex flavours the one exception, by default, is the newline character. They are useful when we do not care about the specific value of a character, but only that a character exists.

# . = Wildcard
# \. = Escaped wildcard (matches a literal full stop)
......... 

Matches any 9 character text

I have . cats

Matches I have 2 cats, I have 8 cats, I have X cats etc.
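
The wildcard examples can be checked with a short Python sketch using the standard re module. The sample strings are assumptions, and the newline behaviour shown is Python's default; other regex engines may differ.

```python
import re

# Nine wildcards match any text of exactly nine characters.
nine = bool(re.fullmatch(r".........", "abc 123!?"))
print(nine)  # True

# A single wildcard stands in for exactly one character.
sentences = ["I have 2 cats", "I have X cats", "I have 10 cats"]
cats = [bool(re.fullmatch(r"I have . cats", s)) for s in sentences]
print(cats)  # [True, True, False]

# Escaping the dot matches only a literal full stop.
literal_dots = re.findall(r"\.", "v1.2.3")
print(literal_dots)  # ['.', '.']

# In Python the dot skips newlines unless re.DOTALL is passed.
newline_default = bool(re.fullmatch(r".", "\n"))
newline_dotall = bool(re.fullmatch(r".", "\n", re.DOTALL))
print(newline_default, newline_dotall)  # False True
```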

Ranges

Ranges allow us to specify a range of characters in which we can make a match without having to type out each individual character.

[0-9]

Matches any one digit from 0 to 9

[a-z]

Matches any one lowercase letter from a to z

[A-Za-z]

Matches any one uppercase or lowercase letter, A-Z or a-z
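
A small Python sketch of the ranges above, using the standard re module. The string "Room 42B" is a made-up sample. The last line shows why [A-Za-z] is written as two ranges: a single range [A-z] would also cover the ASCII symbols that sit between Z and a, such as the underscore.

```python
import re

text = "Room 42B"  # hypothetical sample text

digits = re.findall(r"[0-9]", text)
print(digits)  # ['4', '2']

lowercase = re.findall(r"[a-z]", text)
print(lowercase)  # ['o', 'o', 'm']

letters = re.findall(r"[A-Za-z]", text)
print(letters)  # ['R', 'o', 'o', 'm', 'B']

# Pitfall: [A-z] spans the ASCII characters between Z and a as well.
too_wide = re.findall(r"[A-z]", "a_b")
print(too_wide)  # ['a', '_', 'b']
```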

Shorthand Character Classes

\w # Word character

Represents the regex range [A-Za-z0-9_], and it matches a single uppercase character, lowercase character, digit or underscore

\d # Digit character

Represents the regex range [0-9], and it matches a single digit character

\s # Whitespace character

Represents the regex range [ \t\r\n\f\v], matching a single space, tab, carriage return, line break, form feed, or vertical tab.

\d\s\w\w\w\w\w\w\w

Matches a digit character, followed by a whitespace character, followed by 7 word characters, e.g. 3 monkeys

\W # Non-word character
\D # Non-digit character
\S # Non-whitespace character
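
The shorthand classes and their negations can be sketched in Python with the standard re module. The sample strings are assumptions. Note that on Python 3 text strings, \w and \d also match non-ASCII letters and digits unless the re.ASCII flag is used, so the bracketed ranges above describe the ASCII case.

```python
import re

# Digit, whitespace, then seven word characters.
monkeys = bool(re.fullmatch(r"\d\s\w\w\w\w\w\w\w", "3 monkeys"))
print(monkeys)  # True

# The uppercase forms match everything the lowercase forms do not.
non_word = re.findall(r"\W", "a-b c")
print(non_word)  # ['-', ' ']

non_digit = re.findall(r"\D", "a1b2")
print(non_digit)  # ['a', 'b']

non_space = re.findall(r"\S", "a b\tc")
print(non_space)  # ['a', 'b', 'c']
```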

Grouping

Put it (here|there)

Matches Put it here OR Put it there
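
A minimal Python sketch of why the parentheses matter, using the standard re module. Without the group, the alternation splits the whole pattern in two, so the bare word there would match on its own.

```python
import re

grouped = re.compile(r"Put it (here|there)")

match = grouped.fullmatch("Put it there")
captured = match.group(1)  # the alternative that was chosen
print(captured)  # there

# The grouped pattern needs the "Put it " prefix in both cases.
bare_word_grouped = bool(grouped.fullmatch("there"))
print(bare_word_grouped)  # False

# Without the group, the pattern reads "Put it here" OR "there".
bare_word_ungrouped = bool(re.fullmatch(r"Put it here|there", "there"))
print(bare_word_ungrouped)  # True
```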

Quantifiers

\w{3} 

Matches exactly three consecutive word characters, e.g. cat (or the first three characters of a longer word).

\w{4,7}

Matches a run of between 4 and 7 word characters.

roa{3}r
# Matches roaaar (r, o, exactly three a's, r)
roa{3,7}r 

Matches between 3 and 7 a's. Quantifiers are greedy by default, so the match takes as many a's as it can, up to 7.
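
The counted quantifiers can be explored with a short Python sketch using the standard re module. The sample strings are assumptions, and the \b word boundaries and the lazy form {3,7}? are additions not covered above, included only to contrast with the default behaviour.

```python
import re

# \w{3} takes any three word characters, even from inside a longer word.
triples = re.findall(r"\w{3}", "cat horse")
print(triples)  # ['cat', 'hor']

# Word boundaries (\b) restrict the match to whole words.
three_letter_words = re.findall(r"\b\w{3}\b", "cat horse")
print(three_letter_words)  # ['cat']

four_to_seven = re.findall(r"\b\w{4,7}\b", "cat horse elephants")
print(four_to_seven)  # ['horse']

exact = bool(re.fullmatch(r"roa{3}r", "roaaar"))
print(exact)  # True

# Greedy by default: with nine a's available, {3,7} takes seven.
long_roar = "ro" + "a" * 9 + "r"
greedy = re.search(r"roa{3,7}", long_roar).group()
print(greedy)  # roaaaaaaa

# A trailing ? makes the quantifier lazy: it takes the minimum instead.
lazy = re.search(r"roa{3,7}?", long_roar).group()
print(lazy)  # roaaa
```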

? # Optional quantifier 
colou?r

Matches both color and colour, as the u is marked as optional.
To match a literal question mark, escape it as \?.

The monkey ate a (rotten )?banana

Matches both The monkey ate a rotten banana and The monkey ate a banana.
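
A minimal Python sketch of the optional quantifier, using the standard re module. The misspelling colouur and the question "Is it ripe?" are made-up samples for illustration.

```python
import re

# The u may appear zero or one time, but not twice.
spellings = ["color", "colour", "colouur"]
colour = [bool(re.fullmatch(r"colou?r", w)) for w in spellings]
print(colour)  # [True, True, False]

# A group followed by ? makes the whole group optional.
banana = re.compile(r"The monkey ate a (rotten )?banana")
with_rotten = bool(banana.fullmatch("The monkey ate a rotten banana"))
without_rotten = bool(banana.fullmatch("The monkey ate a banana"))
print(with_rotten, without_rotten)  # True True

# An escaped \? matches a literal question mark.
question = bool(re.search(r"Is it ripe\?", "Is it ripe?"))
print(question)  # True
```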

* # Kleene star - matches 0 or more times

coo*l 

Matches co followed by 0 or more o's, followed by l. So, col, cool, coooooooooooooool all match.

+ # Kleene plus - matches 1 or more times.
coo+l

Matches co followed by 1 or more o's, followed by l
So, cool, coool, coooooooooooooool all match.
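
The difference between the Kleene star and the Kleene plus shows up on the shortest word. A small Python sketch using the standard re module, with the star written as coo*l and the plus written as coo+l:

```python
import re

star = re.compile(r"coo*l")  # co, then zero or more o's, then l
plus = re.compile(r"coo+l")  # co, then one or more o's, then l

words = ["col", "cool", "coool"]

star_results = [bool(star.fullmatch(w)) for w in words]
print(star_results)  # [True, True, True]

# "col" has no extra o after "co", so the plus pattern rejects it.
plus_results = [bool(plus.fullmatch(w)) for w in words]
print(plus_results)  # [False, True, True]
```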

Anchors

Anchor metacharacters ensure we don't match unintended text. ^ and $ are used to mark the start and the end of a string.

^Red is my favourite colour$

Matches Red is my favourite colour, but not Orange Red is my favourite colour of car.
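
A minimal Python sketch of the anchor example, using the standard re module. The unanchored pattern is added only for contrast.

```python
import re

anchored = re.compile(r"^Red is my favourite colour$")
unanchored = re.compile(r"Red is my favourite colour")

exact_text = "Red is my favourite colour"
longer_text = "Orange Red is my favourite colour of car"

# With anchors, the pattern has to span the whole string.
anchored_results = [bool(anchored.search(t)) for t in (exact_text, longer_text)]
print(anchored_results)  # [True, False]

# Without anchors, the pattern may match anywhere inside the string.
unanchored_results = [bool(unanchored.search(t)) for t in (exact_text, longer_text)]
print(unanchored_results)  # [True, True]
```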