Regex Cheat Sheet

Anchors

Pattern Description Example
^ Start of string (or start of line in multiline mode) ^Hello matches 'Hello world' but not 'Say Hello'
$ End of string (or end of line in multiline mode) world$ matches 'Hello world' but not 'world peace'
\A Start of string (never matches at line breaks) \AHello matches only if string starts with 'Hello'
\Z End of string (before optional final newline) world\Z matches 'Hello world' at end
\z Absolute end of string end\z matches absolute end of string
\b Word boundary (between \w and \W) \bcat\b matches 'cat' in 'the cat sat' but not 'catch'
\B Non-word boundary \Bcat\B matches 'cat' in 'concatenate' but not standalone 'cat'
\G Position where the previous match ended, or start of string on the first attempt (useful with global flag) \Gfoo matches consecutive 'foo' occurrences starting from the end of the last match
(?m)^ Start of each line in multiline mode (?m)^line matches 'line' at start of any line
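
As a rough illustration of the anchors above, the following Python sketch compares ^ with and without multiline mode, \A, and \b. The sample strings are made up for this example. It sticks to anchors that Python's standard re module is known to support; \G, for instance, is not available there.

```python
import re

text = "line one\nline two\nlast line"

# Without re.MULTILINE, ^ only matches at the very start of the string.
starts_default = re.findall(r"^line", text)

# With re.MULTILINE, ^ also matches right after each newline.
starts_multiline = re.findall(r"^line", text, flags=re.MULTILINE)

# \A ignores re.MULTILINE and only matches at the start of the string.
starts_absolute = re.findall(r"\Aline", text, flags=re.MULTILINE)

# \b needs a word boundary on both sides, so 'catch' and 'concatenate' are skipped.
whole_words = re.findall(r"\bcat\b", "the cat sat; catch; concatenate")

print(starts_default)    # ['line']
print(starts_multiline)  # ['line', 'line']
print(starts_absolute)   # ['line']
print(whole_words)       # ['cat']
```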

Quantifiers

Pattern Description Example
* Match 0 or more times (greedy) ab*c matches 'ac', 'abc', 'abbc', 'abbbc'
+ Match 1 or more times (greedy) ab+c matches 'abc', 'abbc' but not 'ac'
? Match 0 or 1 time (greedy) colou?r matches 'color' and 'colour'
{n} Match exactly n times a{3} matches 'aaa' only
{n,} Match n or more times a{2,} matches 'aa', 'aaa', 'aaaa', etc.
{n,m} Match between n and m times (inclusive) a{2,4} matches 'aa', 'aaa', 'aaaa'
*? Match 0 or more times (lazy/non-greedy) <.*?> matches '<b>' in '<b>bold</b>' instead of whole string
+? Match 1 or more times (lazy/non-greedy) a+? matches a single 'a' in 'aaa' (as few as possible)
?? Match 0 or 1 time (lazy/non-greedy) colou??r tries 'color' first but still matches 'colour' when the 'u' is needed
{n,m}? Match between n and m times (lazy) a{2,4}? matches 'aa' in 'aaaa', preferring fewer repetitions
*+ Possessive: match 0 or more, never backtrack a*+b — possessive, won't give back matched a's
++ Possessive: match 1 or more, never backtrack \d++[abc] — possessive digit matching
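
The quantifier rows above can be checked with a small Python sketch. It assumes the standard re module and made-up input strings. Possessive quantifiers are left out because they are only available in newer Python versions.

```python
import re

# {n,m} bounds the repetition count; fullmatch requires the whole string to match.
counts = {s: bool(re.fullmatch(r"a{2,4}", s)) for s in ["a", "aa", "aaaa", "aaaaa"]}

# Greedy a{2,4} takes as many as allowed; lazy a{2,4}? stops at the minimum.
greedy = re.match(r"a{2,4}", "aaaaa").group()
lazy = re.match(r"a{2,4}?", "aaaaa").group()

# A lazy ?? tries "zero times" first but still backtracks when needed,
# so colou??r matches both spellings.
spellings = [bool(re.fullmatch(r"colou??r", s)) for s in ["color", "colour"]]

print(counts)     # {'a': False, 'aa': True, 'aaaa': True, 'aaaaa': False}
print(greedy)     # aaaa
print(lazy)       # aa
print(spellings)  # [True, True]
```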

Character Classes

Pattern Description Example
. Any character except newline (by default) a.b matches 'axb', 'a2b', 'a b' but not 'a\nb'
\d Any digit [0-9] \d+ matches '123', '42', '0'
\D Any non-digit [^0-9] \D+ matches 'abc', 'foo bar'
\w Any word character [a-zA-Z0-9_] \w+ matches 'hello', 'foo_bar', 'Test123'
\W Any non-word character [^a-zA-Z0-9_] \W+ matches '!@#', ' ', '->'
\s Any whitespace character (space, tab, newline, etc.) \s+ matches spaces, tabs, newlines between words
\S Any non-whitespace character \S+ matches words or tokens without spaces
[abc] Character class: matches a, b, or c [aeiou] matches any vowel
[^abc] Negated class: matches any char except a, b, c [^0-9] matches any non-digit character
[a-z] Range: matches any lowercase letter a through z [a-z]+ matches 'hello', 'world'
[a-zA-Z] Range: matches any letter (upper or lower) [a-zA-Z]+ matches alphabetic strings
[0-9a-fA-F] Hexadecimal digit [0-9a-fA-F]{6} matches hex color like 'FF5733'
\p{L} Unicode letter (PCRE/Unicode mode) \p{L}+ matches letters in any language
\p{N} Unicode number \p{N}+ matches numeric characters including non-ASCII digits
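
The exact meaning of shorthand classes such as \d depends on the engine. As a sketch under the assumption that Python's standard re module is used: \d matches any Unicode decimal digit for str patterns unless re.ASCII is set, and \p{L} / \p{N} are not supported by re at all (the third-party regex package is one option for those). The sample string is invented for illustration.

```python
import re

# \u0663\u0664 are the Arabic-Indic digits three and four.
sample = "id_42 costs \u0663\u0664 units"

# For str patterns, \d matches any Unicode decimal digit by default.
unicode_digits = re.findall(r"\d+", sample)

# re.ASCII restricts \d (and \w, \s) to the ASCII ranges listed in the table.
ascii_digits = re.findall(r"\d+", sample, flags=re.ASCII)

# An explicit class behaves the same in every mode.
is_hex_color = bool(re.fullmatch(r"[0-9a-fA-F]{6}", "FF5733"))
not_hex_color = bool(re.fullmatch(r"[0-9a-fA-F]{6}", "GG5733"))

# A negated class matches runs of anything except the listed characters.
non_digits = re.findall(r"[^0-9]+", "a1b22c")

print(unicode_digits)  # ['42', '\u0663\u0664'] (printed as the actual digits)
print(ascii_digits)    # ['42']
print(non_digits)      # ['a', 'b', 'c']
```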

Groups & References

Pattern Description Example
(abc) Capturing group — captures matched text (\d{4})-(\d{2})-(\d{2}) captures year, month, day
(?:abc) Non-capturing group — groups without capturing (?:foo|bar)+ matches 'foo', 'bar', 'foofoo', 'foobar'
(?P<name>abc) Named capturing group (Python/PCRE syntax) (?P<year>\d{4}) captures year by name
(?<name>abc) Named capturing group (.NET/PCRE2 syntax) (?<year>\d{4}) captures year by name
\1 Backreference to group 1 (\w+) \1 matches repeated words like 'hello hello'
\k<name> Named backreference (?<word>\w+) \k<word> matches repeated named word
(?|...) Branch reset group — subgroups share numbers (?|(a)|(b)) both alternatives use group 1
(?>abc) Atomic group: no backtracking inside (?>a|ab)c matches 'ac' but not 'abc', because once 'a' matches the 'ab' alternative is never retried
\g{1} Backreference using \g syntax (PCRE) (\w+) \g{1} same as \1 but clearer syntax
\g<name> Recursive reference to named group (?<balanced>\((?:[^()]|\g<balanced>)*\)) recursive match
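
A minimal Python sketch of capturing groups, named groups, and backreferences follows. Python uses the (?P<name>...) form and writes named backreferences as (?P=name) rather than \k<name>. Branch reset groups and recursive references are not part of the standard re module, so they are not shown. The input strings are assumptions made for the example.

```python
import re

date = re.fullmatch(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})", "2024-01-31")
by_number = (date.group(1), date.group(2), date.group(3))
by_name = date.groupdict()

# \1 must repeat exactly what group 1 captured.
repeated = re.search(r"\b(\w+) \1\b", "it was was fine")

# Same idea with a named group and Python's (?P=name) backreference.
repeated_named = re.search(r"\b(?P<word>\w+) (?P=word)\b", "it was was fine")

# With no capturing groups in the pattern, findall returns the full matches.
runs = re.findall(r"(?:foo|bar)+", "foobar baz barfoofoo")

print(by_number)                     # ('2024', '01', '31')
print(by_name)                       # {'year': '2024', 'month': '01', 'day': '31'}
print(repeated.group())              # was was
print(repeated_named.group("word"))  # was
print(runs)                          # ['foobar', 'barfoofoo']
```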

Lookarounds

Pattern Description Example
(?=abc) Positive lookahead — matches if followed by abc \d+(?= dollars) matches number only if followed by ' dollars'
(?!abc) Negative lookahead: matches if NOT followed by abc \d+(?! dollars) matches digits not followed by ' dollars' (in '100 dollars' it still matches '10' after backtracking)
(?<=abc) Positive lookbehind — matches if preceded by abc (?<=\$)\d+ matches digits preceded by '$'
(?<!abc) Negative lookbehind — matches if NOT preceded by abc (?<!\$)\d+ matches digits NOT preceded by '$'
(?=.*abc) Lookahead that allows characters before the target ^(?=.*\d)(?=.*[a-z]).{8,}$ password with digit and lowercase
(?<=\b)\w+ Lookbehind at word boundary (?<=\bpre)\w+ matches suffix after 'pre'
(?=(?:...)*$) Lookahead for repeated pattern to end of string ^(?=(?:\d{3})*$)\d+ matches digit strings whose length is a multiple of 3
(?<!\\)" Match quote not preceded by backslash (?<!\\)" matches unescaped double quotes
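
Two of the rows above, sketched in Python with made-up inputs: the stacked-lookahead password check and the "unescaped quote" lookbehind. Lookarounds test a position without consuming characters, which is why several lookaheads can be chained at the start of a pattern.

```python
import re

# Each lookahead is checked from the start of the string without consuming anything;
# .{8,} then consumes the whole string.
password = re.compile(r"^(?=.*\d)(?=.*[a-z]).{8,}$")
checks = {p: bool(password.search(p)) for p in ["abc12345", "abcdefgh", "a1"]}

# (?<!\\)" matches a double quote unless a backslash comes right before it.
text = r'say "hi" and \"skip\" this'
unescaped_positions = [m.start() for m in re.finditer(r'(?<!\\)"', text)]

print(checks)               # {'abc12345': True, 'abcdefgh': False, 'a1': False}
print(unescaped_positions)  # [4, 7]
```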

Flags / Modifiers

Pattern Description Example
i Case-insensitive matching /hello/i matches 'Hello', 'HELLO', 'hello'
g Global: find all matches (not just first) /\d+/g finds all numbers in a string
m Multiline: ^ and $ match start/end of each line /^foo/m matches 'foo' at start of any line
s Dotall: . matches newline characters too /a.b/s matches 'a\nb' with dot matching newline
u Unicode: treat pattern and string as Unicode /\p{Emoji}/u matches emoji characters
y Sticky: match only from lastIndex position /foo/y matches 'foo' only at current position
x Extended: ignore whitespace and allow comments /hello # greeting/x ignores space and comment
(?i) Inline flag for case-insensitive (embedded in pattern) (?i)hello matches case-insensitively from that point
(?m) Inline multiline flag (?m)^start matches 'start' at beginning of each line
(?s) Inline dotall flag (?s)begin.*end matches across newlines
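
The /pattern/flags notation in the table is JavaScript-style. As a sketch of rough equivalents in Python's re module: flags are passed as constants, there is no g flag (findall and finditer already return every match), and inline flags such as (?i) are placed at the start of the pattern. The sample strings are illustrative.

```python
import re

# re.IGNORECASE corresponds to the i flag.
case_insensitive = re.findall(r"hello", "Hello HELLO hello", flags=re.IGNORECASE)

# re.DOTALL corresponds to the s flag: . also matches a newline.
without_dotall = re.search(r"a.b", "a\nb")
with_dotall = re.search(r"a.b", "a\nb", flags=re.DOTALL)

# re.VERBOSE corresponds to the x flag: whitespace and # comments are ignored.
verbose = re.compile(
    r"""
    (\d{4})  # year
    -
    (\d{2})  # month
    """,
    flags=re.VERBOSE,
)
year_month = verbose.search("released 2024-01").groups()

# Inline flag at the start of the pattern.
inline = re.findall(r"(?i)hello", "Hello HELLO")

print(case_insensitive)  # ['Hello', 'HELLO', 'hello']
print(without_dotall)    # None
print(with_dotall.group() == "a\nb")  # True
print(year_month)        # ('2024', '01')
print(inline)            # ['Hello', 'HELLO']
```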

Common Patterns

Pattern Description Example
^[\w.-]+@[\w.-]+\.\w{2,}$ Basic email address validation user@example.com, first.last@mail.example.org
^https?:\/\/[\w\-.]+(?:\.[\w\-.]+)+[\/\w\-.?=%&#]*$ URL validation (http and https) https://example.com/path?q=1&r=2
^(?:\d{1,3}\.){3}\d{1,3}$ IPv4 address (basic) 192.168.1.1, 10.0.0.255
^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$ MAC address 00:1A:2B:3C:4D:5E
^#?([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$ Hex color code #FF5733 or #F57 or FF5733
^\+?[1-9]\d{1,14}$ International phone number (E.164) +14155552671, +442071838750
^\d{4}-\d{2}-\d{2}$ Date in YYYY-MM-DD format 2024-01-31, 2000-12-25
^([01]\d|2[0-3]):[0-5]\d(:[0-5]\d)?$ Time in HH:MM or HH:MM:SS (24-hour) 14:30, 09:05:00, 23:59:59
^[a-zA-Z0-9_-]{3,16}$ Username: 3-16 chars, letters, digits, underscore, hyphen john_doe, user-123, MyName
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$ Strong password: min 8 chars, upper, lower, digit, special (one of @$!%*?&) P@ssw0rd!, Secure@123
^\d{5}(-\d{4})?$ US ZIP code (5 digit or ZIP+4) 90210, 10001-1234
^[A-Z]{2}\d{2}[A-Z0-9]{4}\d{7}([A-Z0-9]?){0,16}$ IBAN bank account number GB29NWBK60161331926819
<([a-z][a-z0-9]*)\b[^>]*>(.*?)<\/\1> Basic HTML tag with content (non-nested) <b>bold</b>, <span class="x">text</span>
^([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]))$ Strict date YYYY-MM-DD with range validation 2024-01-31, 1999-12-25
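
Several of these patterns check format only. The sketch below, in Python, applies three of them and shows where extra logic is still needed. The helper is_ipv4 is a hypothetical name introduced for this example, and the range check it adds is not part of the cheat sheet's pattern.

```python
import re

IPV4_BASIC = re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$")
DATE_STRICT = re.compile(r"^([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]))$")
TIME_24H = re.compile(r"^([01]\d|2[0-3]):[0-5]\d(:[0-5]\d)?$")


def is_ipv4(text: str) -> bool:
    """Hypothetical helper: format check via the regex plus a 0-255 range check."""
    if not IPV4_BASIC.fullmatch(text):
        return False
    return all(int(part) <= 255 for part in text.split("."))


# The basic pattern accepts out-of-range octets; the helper rejects them.
regex_only = bool(IPV4_BASIC.fullmatch("999.999.999.999"))
with_range_check = is_ipv4("999.999.999.999")

# The strict date pattern rejects month 13 but cannot know month lengths.
month_13 = bool(DATE_STRICT.fullmatch("2024-13-01"))
feb_30 = bool(DATE_STRICT.fullmatch("2024-02-30"))

times = [bool(TIME_24H.fullmatch(t)) for t in ["14:30", "23:59:59", "24:00"]]

print(regex_only, with_range_check)  # True False
print(month_13, feb_30)              # False True
print(times)                         # [True, True, False]
```

For real calendar validation, parsing the matched string with a date library after the regex check is a common design choice.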

Frequently Asked Questions

A regular expression (regex) is a sequence of characters that defines a search pattern. Regexes are used to search, match, and manipulate text in programming languages, text editors, and command-line tools. For example, the pattern \d{3}-\d{4} matches a phone number format such as "555-1234".

Most programming languages have built-in regex support. In JavaScript, use /pattern/flags with methods such as .test(), .match(), or .replace(). In Python, import the re module and use re.search(), re.findall(), or re.sub(). In PHP, use preg_match(), preg_match_all(), or preg_replace(). Most patterns in this cheat sheet can be copied into these functions, although some syntax (for example \G, possessive quantifiers, or \p{L}) is engine-specific and may need adjusting.
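
As a minimal sketch of the Python functions named above (the sample text is invented), patterns are written as raw strings so that backslashes reach the regex engine unchanged:

```python
import re

text = "Order 66 shipped on 2024-01-31, order 67 on 2024-02-02."

# re.search returns the first match (or None).
first_number = re.search(r"\d+", text)

# re.findall returns every non-overlapping match as a list of strings.
all_dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)

# re.sub replaces every match with the given replacement.
masked = re.sub(r"\d{4}-\d{2}-\d{2}", "<date>", text)

print(first_number.group())  # 66
print(all_dates)             # ['2024-01-31', '2024-02-02']
print(masked)                # Order 66 shipped on <date>, order 67 on <date>.
```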

Greedy quantifiers (*, +, {n,}) match as much text as possible. Lazy (non-greedy) quantifiers (*?, +?, {n,}?) match as little as possible. For example, given the input "<b>bold</b> and <i>italic</i>", the greedy pattern <.+> matches the entire string from the first < to the last >, while the lazy <.+?> matches only "<b>" and then each individual tag.
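
The greedy versus lazy difference can be seen on a short HTML snippet. This Python sketch assumes the sample string shown in the code:

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last '>' in the string.
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first '>' it can, so each tag is matched separately.
lazy = re.findall(r"<.+?>", html)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```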

A lookahead (?=...) checks that what follows the current position matches a pattern, without including it in the match. A lookbehind (?<=...) checks that what precedes the current position matches a pattern. The negative versions (?!...) and (?<!...) check that the pattern does not match.
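
A short Python sketch of lookahead and lookbehind on an invented sample string. It also shows a common surprise with negative lookahead: the engine may backtrack to a shorter match that satisfies the condition, so adding word boundaries is one way to keep whole numbers together.

```python
import re

text = "100 dollars, 250 euros, $75"

# Positive lookahead: digits followed by ' dollars' (the suffix is not in the match).
followed = re.findall(r"\d+(?= dollars)", text)

# Positive lookbehind: digits preceded by '$'.
preceded = re.findall(r"(?<=\$)\d+", text)

# Negative lookahead alone: '100' is rejected, but '10' (followed by '0') is accepted.
naive = re.findall(r"\d+(?! dollars)", text)

# Word boundaries force whole numbers, so '100' is skipped entirely.
whole_numbers = re.findall(r"\b\d+\b(?! dollars)", text)

print(followed)       # ['100']
print(preceded)       # ['75']
print(naive)          # ['10', '250', '75']
print(whole_numbers)  # ['250', '75']
```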

Regexes are widely used to validate input formats (email addresses, phone numbers, postal codes), search and replace text in code editors, parse log files and extract specific data, route URLs in web frameworks, validate forms on the client and server side, and scrape web pages. The "Common Patterns" section of this cheat sheet provides ready-to-use examples for many of these scenarios.