Regex Cheat Sheet
Quick reference for regular expressions: search, copy, and use
Anchors
| Pattern | Description | Example |
|---|---|---|
| `^` | Start of string (or start of line in multiline mode) | `^Hello` matches 'Hello world' but not 'Say Hello' |
| `$` | End of string (or end of line in multiline mode) | `world$` matches 'Hello world' but not 'world peace' |
| `\A` | Start of string (never matches at line breaks) | `\AHello` matches only if the string starts with 'Hello' |
| `\Z` | End of string, before an optional final newline (in Python, `\Z` is the absolute end) | `world\Z` matches 'Hello world' at the end |
| `\z` | Absolute end of string | `end\z` matches only at the absolute end of the string |
| `\b` | Word boundary (between `\w` and `\W`, or between `\w` and the start or end of the string) | `\bcat\b` matches 'cat' in 'the cat sat' but not in 'catch' |
| `\B` | Non-word boundary | `\Bcat\B` matches 'cat' in 'concatenate' but not standalone 'cat' |
| `\G` | Position where the previous match ended (start of string on the first attempt); useful for chained global matching | `\Gfoo` matches consecutive 'foo' occurrences, each starting where the last match ended |
| `(?m)^` | Start of each line in multiline mode | `(?m)^line` matches 'line' at the start of any line |
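The anchor rows above can be checked with a short Python sketch. The sample strings are made up for illustration. The standard `re` module does not support `\G`, and its `\Z` behaves like the absolute-end anchor, so those differences are called out in the comments.

```python
import re

text = "Hello world\nSay Hello"

# ^ matches only at the start of the string by default.
start_default = re.findall(r"^Hello", "Say Hello")
start_match = re.findall(r"^Hello", text)

# With re.MULTILINE, ^ also matches right after each newline.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# \A matches only at the very start, even in multiline mode.
abs_start = re.findall(r"\A\w+", text, flags=re.MULTILINE)

# \b needs a word boundary on both sides, so 'catch' is skipped.
whole_words = re.findall(r"\bcat\b", "the cat sat on a catch")

# \B needs the absence of a boundary, so only the 'cat' inside
# 'concatenate' matches, not the standalone word.
embedded = re.search(r"\Bcat\B", "cat concatenate")

# In Python, $ tolerates one trailing newline but \Z does not.
dollar_end = re.search(r"world$", "Hello world\n")
strict_end = re.search(r"world\Z", "Hello world\n")

print(line_starts, whole_words, embedded.start())
print(dollar_end is not None, strict_end is not None)
```

Other engines (PCRE, Java, .NET) treat `\Z` as described in the table, so it is worth testing anchors in the engine you actually deploy to.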
Quantifiers
| Pattern | Description | Example |
|---|---|---|
| `*` | Match 0 or more times (greedy) | `ab*c` matches 'ac', 'abc', 'abbc', 'abbbc' |
| `+` | Match 1 or more times (greedy) | `ab+c` matches 'abc', 'abbc' but not 'ac' |
| `?` | Match 0 or 1 time (greedy) | `colou?r` matches 'color' and 'colour' |
| `{n}` | Match exactly n times | `a{3}` matches 'aaa' only |
| `{n,}` | Match n or more times | `a{2,}` matches 'aa', 'aaa', 'aaaa', etc. |
| `{n,m}` | Match between n and m times (inclusive) | `a{2,4}` matches 'aa', 'aaa', 'aaaa' |
| `*?` | Match 0 or more times (lazy/non-greedy) | `<.*?>` matches `<b>` in `<b>bold</b>` instead of the whole string |
| `+?` | Match 1 or more times (lazy/non-greedy) | `a+?` matches a single 'a' in 'aaa' (as few as possible) |
| `??` | Match 0 or 1 time (lazy/non-greedy) | `colou??r` still matches 'color' and 'colour'; the engine tries skipping the 'u' first |
| `{n,m}?` | Match between n and m times (lazy) | `a{2,4}?` matches 'aa', preferring fewer repetitions |
| `*+` | Possessive: match 0 or more, never backtrack | `a*+b` is possessive and won't give back matched a's |
| `++` | Possessive: match 1 or more, never backtrack | `\d++[abc]` matches digits possessively |
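As a minimal sketch of the quantifier rows, the following Python snippet uses `re.fullmatch` to test whole strings and `re.match` to compare greedy and lazy bounded repetition. The candidate strings are illustrative. Possessive quantifiers are left out because their availability depends on the engine and version.

```python
import re

candidates = ["ac", "abc", "abbc"]

# fullmatch succeeds only if the entire string fits the pattern.
star = [s for s in candidates if re.fullmatch(r"ab*c", s)]
plus = [s for s in candidates if re.fullmatch(r"ab+c", s)]
optional = [
    s for s in ["color", "colour", "colouur"]
    if re.fullmatch(r"colou?r", s)
]

# Greedy {2,4} takes as many as allowed; lazy {2,4}? stops at the minimum.
greedy = re.match(r"a{2,4}", "aaaaa").group()
lazy = re.match(r"a{2,4}?", "aaaaa").group()

print(star, plus, optional)
print(greedy, lazy)  # aaaa aa
```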
Character Classes
| Pattern | Description | Example |
|---|---|---|
| `.` | Any character except newline (by default) | `a.b` matches 'axb', 'a2b', 'a b' but not 'a\nb' |
| `\d` | Any digit [0-9] | `\d+` matches '123', '42', '0' |
| `\D` | Any non-digit [^0-9] | `\D+` matches 'abc', 'foo bar' |
| `\w` | Any word character [a-zA-Z0-9_] | `\w+` matches 'hello', 'foo_bar', 'Test123' |
| `\W` | Any non-word character [^a-zA-Z0-9_] | `\W+` matches '!@#', ' ', '->' |
| `\s` | Any whitespace character (space, tab, newline, etc.) | `\s+` matches spaces, tabs, newlines between words |
| `\S` | Any non-whitespace character | `\S+` matches words or tokens without spaces |
| `[abc]` | Character class: matches a, b, or c | `[aeiou]` matches any vowel |
| `[^abc]` | Negated class: matches any char except a, b, c | `[^0-9]` matches any non-digit character |
| `[a-z]` | Range: matches any lowercase letter a through z | `[a-z]+` matches 'hello', 'world' |
| `[a-zA-Z]` | Range: matches any letter (upper or lower) | `[a-zA-Z]+` matches alphabetic strings |
| `[0-9a-fA-F]` | Hexadecimal digit | `[0-9a-fA-F]{6}` matches a hex color like 'FF5733' |
| `\p{L}` | Unicode letter (PCRE/Unicode mode) | `\p{L}+` matches letters in any language |
| `\p{N}` | Unicode number | `\p{N}+` matches numeric characters including non-ASCII digits |
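The snippet below is a small Python sketch of the character classes above, run against a made-up sample string. One caveat worth knowing: for `str` patterns, Python's `\d` and `\w` are Unicode-aware by default, so they match more than the ASCII ranges shown in the table unless `re.ASCII` is passed. The standard `re` module also has no `\p{L}` or `\p{N}`, so those rows are not exercised here.

```python
import re

sample = "Order #42: 3 x foo_bar @ 0xFF5733"

digits = re.findall(r"\d+", sample)            # runs of digits
words = re.findall(r"\w+", sample)             # runs of word characters
hex6 = re.findall(r"[0-9a-fA-F]{6}", sample)   # six consecutive hex digits
no_vowels = re.sub(r"[aeiou]", "", "regular")  # drop lowercase vowels

# U+0663 is ARABIC-INDIC DIGIT THREE: a digit, but not in [0-9].
unicode_digit = re.fullmatch(r"\d", "\u0663")
ascii_digit = re.fullmatch(r"\d", "\u0663", flags=re.ASCII)

print(digits, hex6, no_vowels)
print(unicode_digit is not None, ascii_digit is not None)
```

If input must be limited to ASCII digits, writing `[0-9]` explicitly or passing `re.ASCII` avoids surprises.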
Groups & References
| Pattern | Description | Example |
|---|---|---|
| `(abc)` | Capturing group: captures matched text | `(\d{4})-(\d{2})-(\d{2})` captures year, month, day |
| `(?:abc)` | Non-capturing group: groups without capturing | `(?:foo\|bar)+` matches 'foo', 'bar', 'foofoo', 'foobar' |
| `(?P<name>abc)` | Named capturing group (Python/PCRE syntax) | `(?P<year>\d{4})` captures year by name |
| `(?<name>abc)` | Named capturing group (.NET/PCRE2 syntax) | `(?<year>\d{4})` captures year by name |
| `\1` | Backreference to group 1 | `(\w+) \1` matches repeated words like 'hello hello' |
| `\k<name>` | Named backreference | `(?<word>\w+) \k<word>` matches a repeated named word |
| `(?\|...)` | Branch reset group: subgroups share numbers | `(?\|(a)\|(b))` both alternatives use group 1 |
| `(?>abc)` | Atomic group: no backtracking inside | `(?>a\|ab)c` won't retry alternatives, so it matches 'ac' but not 'abc' |
| `\g{1}` | Backreference using \g syntax (PCRE) | `(\w+) \g{1}` same as `\1` but clearer syntax |
| `\g<name>` | Recursive reference to named group | `(?<balanced>\((?:[^()]\|\g<balanced>)*\))` matches balanced parentheses recursively |
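A short Python sketch of capturing groups and backreferences follows. Python's `re` uses `(?P<name>...)` for named groups and `(?P=name)` for named backreferences rather than `\k<name>`, and it has no branch reset or `\g<name>` recursion, so only the portable rows are shown. The sample sentences are invented.

```python
import re

# Numbered groups: one capture per pair of parentheses.
m = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", "2024-01-31")
year, month, day = m.groups()

# Named groups can be read by name or as a dict.
named = re.fullmatch(r"(?P<year>\d{4})-(?P<month>\d{2})", "2024-01")
parts = named.groupdict()

# \1 must repeat exactly what group 1 captured.
dup = re.search(r"\b(\w+) \1\b", "say hello hello again")

# Python spelling of a named backreference: (?P=word).
dup_named = re.search(r"\b(?P<word>\w+) (?P=word)\b", "it is is fine")

# A non-capturing group repeats without creating a capture.
nc = re.fullmatch(r"(?:foo|bar)+", "foobar")

print(year, month, day, parts)
print(dup.group(1), dup_named.group("word"), nc.groups())
```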
Lookarounds
| Pattern | Description | Example |
|---|---|---|
| `(?=abc)` | Positive lookahead: matches if followed by abc | `\d+(?= dollars)` matches a number only if followed by ' dollars' |
| `(?!abc)` | Negative lookahead: matches if NOT followed by abc | `\d+(?! dollars)` matches digits not followed by ' dollars'; because `\d+` can backtrack, it still matches '10' in '100 dollars' |
| `(?<=abc)` | Positive lookbehind: matches if preceded by abc | `(?<=\$)\d+` matches digits preceded by '$' |
| `(?<!abc)` | Negative lookbehind: matches if NOT preceded by abc | `(?<!\$)\d+` matches digits NOT preceded by '$'; in '$100' it still matches '00' |
| `(?=.*abc)` | Lookahead that allows characters before the target | `^(?=.*\d)(?=.*[a-z]).{8,}$` requires at least 8 characters including a digit and a lowercase letter |
| `(?<=\b)\w+` | Lookbehind at word boundary | `(?<=\bpre)\w+` matches the rest of a word after the prefix 'pre' |
| `(?=(?:...)*$)` | Lookahead for repeated pattern to end of string | `^(?=(?:\d{3})*$)\d+` matches a digit string whose length is a multiple of 3 |
| `(?<!\\)"` | Match quote not preceded by backslash | `(?<!\\)"` matches unescaped double quotes |
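The lookaround rows can be tried with the Python sketch below. It also shows a common pitfall: a negative lookaround next to `\d+` can still match part of a number, because the engine is free to shorten or shift the digit run. The sample string and the "fixed" patterns are illustrative assumptions, not the only way to write them.

```python
import re

prices = "cost: $100, qty 25, 40 dollars"

followed = re.findall(r"\d+(?= dollars)", prices)
preceded = re.findall(r"(?<=\$)\d+", prices)

# Naive negatives match fragments of the numbers they should skip.
naive_ahead = re.findall(r"\d+(?! dollars)", prices)
naive_behind = re.findall(r"(?<!\$)\d+", prices)

# Also forbidding an adjacent digit keeps whole numbers together.
fixed_ahead = re.findall(r"\d+(?!\d| dollars)", prices)
fixed_behind = re.findall(r"(?<![$\d])\d+", prices)

# Stacked lookaheads: each condition is checked from the same position.
policy = r"^(?=.*\d)(?=.*[a-z]).{8,}$"
ok = re.search(policy, "abcdefg1") is not None
missing_digit = re.search(policy, "abcdefgh") is not None

print(followed, preceded)
print(naive_ahead, fixed_ahead)
print(naive_behind, fixed_behind)
print(ok, missing_digit)
```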
Flags / Modifiers
| Pattern | Description | Example |
|---|---|---|
| `i` | Case-insensitive matching | `/hello/i` matches 'Hello', 'HELLO', 'hello' |
| `g` | Global: find all matches (not just first) | `/\d+/g` finds all numbers in a string |
| `m` | Multiline: ^ and $ match start/end of each line | `/^foo/m` matches 'foo' at the start of any line |
| `s` | Dotall: . matches newline characters too | `/a.b/s` matches 'a\nb' with dot matching newline |
| `u` | Unicode: treat pattern and string as Unicode | `/\p{Emoji}/u` matches emoji characters |
| `y` | Sticky: match only from lastIndex position | `/foo/y` matches 'foo' only at the current position |
| `x` | Extended: ignore whitespace and allow comments | `/hello # greeting/x` ignores the space and the comment |
| `(?i)` | Inline flag for case-insensitive (embedded in pattern) | `(?i)hello` matches case-insensitively from that point |
| `(?m)` | Inline multiline flag | `(?m)^start` matches 'start' at the beginning of each line |
| `(?s)` | Inline dotall flag | `(?s)begin.*end` matches across newlines |
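The `/pattern/flags` notation in the table is literal syntax from languages such as JavaScript and Perl. As a rough Python equivalent, flags are passed as `re` constants or written inline. There is no `g` flag: `re.findall` and `re.finditer` play that role. The sample text is illustrative.

```python
import re

text = "foo\nFOO bar"

# i -> re.IGNORECASE, m -> re.MULTILINE (combine with |).
case_insensitive = re.findall(r"foo", text, flags=re.IGNORECASE)
line_starts = re.findall(r"^foo", text, flags=re.IGNORECASE | re.MULTILINE)

# s -> re.DOTALL: the dot also matches a newline.
dot_default = re.fullmatch(r"a.b", "a\nb")
dot_all = re.fullmatch(r"a.b", "a\nb", flags=re.DOTALL)

# x -> re.VERBOSE: whitespace and # comments in the pattern are ignored.
year_month = re.compile(
    r"""
    \d{4}   # year
    -
    \d{2}   # month
    """,
    re.VERBOSE,
)

# Inline form: in Python a global inline flag goes at the start of the pattern.
inline = re.findall(r"(?i)hello", "Hello HELLO")

print(case_insensitive, line_starts)
print(dot_default is None, dot_all is not None)
print(year_month.fullmatch("2024-01") is not None, inline)
```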
Common Patterns
| Pattern | Description | Example |
|---|---|---|
| `^[\w.-]+@[\w.-]+\.\w{2,}$` | Basic email address validation | [email protected], [email protected] |
| `^https?:\/\/[\w\-.]+(?:\.[\w\-.]+)+[\/\w\-.?=%&#]*$` | URL validation (http and https) | https://example.com/path?q=1&r=2 |
| `^(?:\d{1,3}\.){3}\d{1,3}$` | IPv4 address (basic; does not check that each octet is at most 255) | 192.168.1.1, 10.0.0.255 |
| `^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$` | MAC address | 00:1A:2B:3C:4D:5E |
| `^#?([A-Fa-f0-9]{6}\|[A-Fa-f0-9]{3})$` | Hex color code | #FF5733 or #F57 or FF5733 |
| `^\+?[1-9]\d{1,14}$` | International phone number (E.164) | +14155552671, +442071838750 |
| `^\d{4}-\d{2}-\d{2}$` | Date in YYYY-MM-DD format (shape only, no range checks) | 2024-01-31, 2000-12-25 |
| `^([01]\d\|2[0-3]):[0-5]\d(:[0-5]\d)?$` | Time in HH:MM or HH:MM:SS (24-hour) | 14:30, 09:05:00, 23:59:59 |
| `^[a-zA-Z0-9_-]{3,16}$` | Username: 3-16 chars, letters, digits, underscore, hyphen | john_doe, user-123, MyName |
| `^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$` | Strong password: min 8 chars, upper, lower, digit, and a special character from @$!%*?& | P@ssw0rd! matches; Secure#123 does not, because '#' is not in the allowed set |
| `^\d{5}(-\d{4})?$` | US ZIP code (5 digit or ZIP+4) | 90210, 10001-1234 |
| `^[A-Z]{2}\d{2}[A-Z0-9]{4}\d{7}([A-Z0-9]?){0,16}$` | IBAN bank account number | GB29NWBK60161331926819 |
| `<([a-z][a-z0-9]*)\b[^>]*>(.*?)<\/\1>` | Basic HTML tag with content (non-nested) | `<b>bold</b>`, `<span class="x">text</span>` |
| `^([12]\d{3}-(0[1-9]\|1[0-2])-(0[1-9]\|[12]\d\|3[01]))$` | Stricter date YYYY-MM-DD: month 01-12 and day 01-31 (does not check days per month) | 2024-01-31, 1999-12-25 |
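A few of the patterns above can be wrapped in a small validator, sketched here in Python. The `PATTERNS` dict and the `is_valid` helper are hypothetical names introduced only for this example. It uses `re.fullmatch` because in Python `$` alone would also accept a single trailing newline.

```python
import re

# Patterns copied from the table above; the dict keys are illustrative.
PATTERNS = {
    "date": r"^\d{4}-\d{2}-\d{2}$",
    "strict_date": r"^([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]))$",
    "time": r"^([01]\d|2[0-3]):[0-5]\d(:[0-5]\d)?$",
    "hex_color": r"^#?([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$",
    "zip": r"^\d{5}(-\d{4})?$",
    "ipv4": r"^(?:\d{1,3}\.){3}\d{1,3}$",
}


def is_valid(kind: str, value: str) -> bool:
    """Return True if value fully matches the named pattern."""
    return re.fullmatch(PATTERNS[kind], value) is not None


print(is_valid("date", "2024-13-45"))         # shape only, so this passes
print(is_valid("strict_date", "2024-13-45"))  # month 13 is rejected
print(is_valid("strict_date", "2024-02-31"))  # days per month are not checked
print(is_valid("time", "24:00"), is_valid("hex_color", "#F57"))
print(is_valid("ipv4", "999.999.999.999"))    # octet range is not checked
```

The permissive results show why these patterns are best treated as a first-pass format check, with real validation (calendar dates, octet ranges) done in code afterwards.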
Frequently Asked Questions
A regular expression (regex) is a sequence of characters that defines a search pattern. Regexes are used to search, match, and manipulate text in programming languages, text editors, and command-line tools. For example, the pattern `\d{3}-\d{4}` matches a phone-number format such as "555-1234".
Most programming languages have built-in regex support. In JavaScript, use `/pattern/flags` with methods such as `.test()`, `.match()`, or `.replace()`. In Python, import the `re` module and use `re.search()`, `re.findall()`, or `re.sub()`. In PHP, use `preg_match()`, `preg_match_all()`, or `preg_replace()`. You can copy any pattern from this cheat sheet and paste it into these functions, after checking that your engine supports the syntax: features such as `\G`, possessive quantifiers, and branch reset groups are not available everywhere.
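As a minimal Python illustration of the three functions named above, the sketch below reuses the `\d{3}-\d{4}` phone pattern. The sample sentence and the mask string are made up.

```python
import re

text = "Call 555-1234 or 555-9876 today"
phone = r"\d{3}-\d{4}"

# re.search returns the first match object, or None if nothing matches.
first = re.search(phone, text)

# re.findall returns every non-overlapping match as a list of strings.
all_numbers = re.findall(phone, text)

# re.sub replaces every match with the given replacement.
masked = re.sub(phone, "XXX-XXXX", text)

print(first.group(), all_numbers)
print(masked)
```

Raw string literals (`r"..."`) keep backslashes such as `\d` from being interpreted by Python before the regex engine sees them.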
Greedy quantifiers (`*`, `+`, `{n,}`) match as much text as possible. Lazy (non-greedy) quantifiers (`*?`, `+?`, `{n,}?`) match as little as possible. For example, with the input `<b>bold</b> and <i>italic</i>`, the greedy pattern `<.+>` matches the entire string from the first `<` to the last `>`, while the lazy `<.+?>` matches only `<b>` and then each individual tag.
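The difference is easy to see in a short Python sketch. The HTML sample string is an assumption chosen to contain two tagged words.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the end, then backs up only as far as the last '>'.
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first '>' it can reach, giving one tag per match.
lazy = re.findall(r"<.+?>", html)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```

A negated class such as `<[^>]+>` gives the same tags here without relying on lazy matching, which is often easier to reason about.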
A lookahead `(?=...)` checks that what follows the current position matches a pattern, without including it in the match. A lookbehind `(?<=...)` checks that what precedes the current position matches a pattern. The negative versions `(?!...)` and `(?<!...)` check the opposite: that the pattern does not match at that position.
Regexes are widely used to validate input formats (email addresses, phone numbers, postal codes), search and replace text in code editors, parse log files and extract specific data, route URLs in web frameworks, validate forms on the client and the server, and scrape web pages. The "Common Patterns" section of this cheat sheet provides ready-to-use examples for many of these scenarios.
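As one illustration of the log-parsing use case, here is a hedged Python sketch. The log line layout (date, time, level, message separated by single spaces), the `LOG_LINE` pattern, and the `parse_line` helper are all hypothetical. Real log formats vary and need their own pattern.

```python
import re

# Hypothetical format: "YYYY-MM-DD HH:MM:SS LEVEL message"
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)$"
)


def parse_line(line: str):
    """Return the named fields as a dict, or None if the line does not fit."""
    m = LOG_LINE.match(line)
    return m.groupdict() if m else None


entry = parse_line("2024-01-31 14:30:00 ERROR disk full")
print(entry["level"], entry["message"])
print(parse_line("not a log line"))
```

Named groups keep the extraction code readable when fields are added or reordered.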