Regex Cheat Sheet

Anchors

Pattern Description Example
^ Start of string (or start of line in multiline mode) ^Hello matches 'Hello world' but not 'Say Hello'
$ End of string (or end of line in multiline mode) world$ matches 'Hello world' but not 'world peace'
\A Start of string (never matches at line breaks) \AHello matches only if string starts with 'Hello'
\Z End of string (before optional final newline) world\Z matches 'Hello world' at end
\z Absolute end of string end\z matches absolute end of string
\b Word boundary (between \w and \W) \bcat\b matches 'cat' in 'the cat sat' but not 'catch'
\B Non-word boundary \Bcat\B matches 'cat' in 'concatenate' but not standalone 'cat'
\G Position where the previous match ended (start of string on the first attempt; useful with global flag) \Gfoo matches consecutive 'foo' occurrences, each starting where the last match ended
(?m)^ Start of each line in multiline mode (?m)^line matches 'line' at start of any line
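
The anchors above can be tried out with a minimal Python sketch. The sample strings are made up for illustration. One caveat when porting: in Python's `re` module, `\Z` matches only at the very end of the string, which corresponds to `\z` in the table.

```python
import re

text = "Hello world\nSay Hello"

# ^ anchors at the start of the string only, unless re.MULTILINE is set.
starts_default = re.findall(r"^\w+", text)
starts_multiline = re.findall(r"^\w+", text, flags=re.MULTILINE)

# \b requires a word boundary on both sides, so 'catch' is skipped.
whole_words = re.findall(r"\bcat\b", "the cat sat on a catch")

# \B requires the opposite: 'cat' must sit inside a longer word.
inner = re.findall(r"\Bcat\B", "cat concatenate")

print(starts_default)    # ['Hello']
print(starts_multiline)  # ['Hello', 'Say']
print(whole_words)       # ['cat']
print(inner)             # ['cat']
```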

Quantifiers

Pattern Description Example
* Match 0 or more times (greedy) ab*c matches 'ac', 'abc', 'abbc', 'abbbc'
+ Match 1 or more times (greedy) ab+c matches 'abc', 'abbc' but not 'ac'
? Match 0 or 1 time (greedy) colou?r matches 'color' and 'colour'
{n} Match exactly n times a{3} matches 'aaa' only
{n,} Match n or more times a{2,} matches 'aa', 'aaa', 'aaaa', etc.
{n,m} Match between n and m times (inclusive) a{2,4} matches 'aa', 'aaa', 'aaaa'
*? Match 0 or more times (lazy/non-greedy) <.*?> matches '<b>' in '<b>bold</b>' instead of whole string
+? Match 1 or more times (lazy/non-greedy) a+? matches a single 'a' in 'aaa' (as few repetitions as possible)
?? Match 0 or 1 time (lazy/non-greedy) colou??r tries to skip the 'u' first, but still matches both 'color' and 'colour'
{n,m}? Match between n and m times (lazy) a{2,4}? matches 'aa' preferring fewer repetitions
*+ Possessive: match 0 or more, never backtrack a*+b — possessive, won't give back matched a's
++ Possessive: match 1 or more, never backtrack \d++[abc] — possessive digit matching
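
A short Python sketch of the optional and counted quantifiers from the table, with illustrative inputs. Possessive quantifiers are left out here because Python's `re` module only accepts them from version 3.11 onward.

```python
import re

# ? makes the preceding 'u' optional.
spellings = re.findall(r"colou?r", "color colour colr")

# {2,4} is greedy: it takes up to four 'a's in one match.
greedy_runs = re.findall(r"a{2,4}", "aaaaa")

# {2,4}? is lazy: it stops as soon as two 'a's are matched.
lazy_runs = re.findall(r"a{2,4}?", "aaaaa")

print(spellings)    # ['color', 'colour']
print(greedy_runs)  # ['aaaa']
print(lazy_runs)    # ['aa', 'aa']
```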

Character Classes

Pattern Description Example
. Any character except newline (by default) a.b matches 'axb', 'a2b', 'a b' but not 'a\nb'
\d Any digit [0-9] \d+ matches '123', '42', '0'
\D Any non-digit [^0-9] \D+ matches 'abc', 'foo bar'
\w Any word character [a-zA-Z0-9_] \w+ matches 'hello', 'foo_bar', 'Test123'
\W Any non-word character [^a-zA-Z0-9_] \W+ matches '!@#', ' ', '->'
\s Any whitespace character (space, tab, newline, etc.) \s+ matches spaces, tabs, newlines between words
\S Any non-whitespace character \S+ matches words or tokens without spaces
[abc] Character class: matches a, b, or c [aeiou] matches any vowel
[^abc] Negated class: matches any char except a, b, c [^0-9] matches any non-digit character
[a-z] Range: matches any lowercase letter a through z [a-z]+ matches 'hello', 'world'
[a-zA-Z] Range: matches any letter (upper or lower) [a-zA-Z]+ matches alphabetic strings
[0-9a-fA-F] Hexadecimal digit [0-9a-fA-F]{6} matches hex color like 'FF5733'
\p{L} Unicode letter (PCRE/Unicode mode) \p{L}+ matches letters in any language
\p{N} Unicode number \p{N}+ matches numeric characters including non-ASCII digits
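
The following Python sketch exercises a few of the classes above on a made-up sample string. Note that for Python 3 `str` patterns, `\d` also matches non-ASCII decimal digits unless `re.ASCII` is set, so `[0-9]` and `\d` are not always interchangeable. Python's built-in `re` does not support `\p{L}` or `\p{N}`, so those are not shown.

```python
import re

sample = "Order #42: 3 items, colour FF5733, ref foo_bar"

# \d+ collects every run of digits, including the ones inside 'FF5733'.
digits = re.findall(r"\d+", sample)

# A six-character hex run delimited by word boundaries.
hex_colors = re.findall(r"\b[0-9a-fA-F]{6}\b", sample)

# A character class inside re.sub: strip all lowercase vowels.
no_vowels = re.sub(r"[aeiou]", "", "regular expression")

# Arabic-Indic digits count as \d by default, but not under re.ASCII.
unicode_digits = re.fullmatch(r"\d+", "٤٢") is not None
ascii_digits = re.fullmatch(r"\d+", "٤٢", flags=re.ASCII) is not None

print(digits)      # ['42', '3', '5733']
print(hex_colors)  # ['FF5733']
print(no_vowels)   # rglr xprssn
print(unicode_digits, ascii_digits)  # True False
```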

Groups & References

Pattern Description Example
(abc) Capturing group — captures matched text (\d{4})-(\d{2})-(\d{2}) captures year, month, day
(?:abc) Non-capturing group — groups without capturing (?:foo|bar)+ matches 'foo', 'bar', 'foofoo', 'foobar'
(?P<name>abc) Named capturing group (Python/PCRE syntax) (?P<year>\d{4}) captures year by name
(?<name>abc) Named capturing group (.NET/PCRE2 syntax) (?<year>\d{4}) captures year by name
\1 Backreference to group 1 (\w+) \1 matches repeated words like 'hello hello'
\k<name> Named backreference (?<word>\w+) \k<word> matches repeated named word
(?|...) Branch reset group — subgroups share numbers (?|(a)|(b)) both alternatives use group 1
(?>abc) Atomic group — no backtracking inside (?>a|ab)c — atomic, won't retry alternatives
\g{1} Backreference using \g syntax (PCRE) (\w+) \g{1} same as \1 but clearer syntax
\g<name> Recursive reference to named group (?<balanced>\((?:[^()]|\g<balanced>)*\)) recursive match
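
As a rough illustration, here is how capturing groups, named groups, and backreferences look in Python. Python uses `(?P<name>...)` for named groups and `(?P=name)` for named backreferences. The `\k<name>` form, branch reset groups, and recursion from the table are not available in the built-in `re` module. The sample strings are invented.

```python
import re

# Named groups give each captured part a readable key.
date_re = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
parts = date_re.search("Released on 2024-01-31.").groupdict()

# \g<name> in the replacement string refers back to a named group.
reordered = date_re.sub(r"\g<day>/\g<month>/\g<year>", "2024-01-31")

# \1 repeats whatever group 1 captured, which finds doubled words.
doubled = re.search(r"\b(\w+) \1\b", "say hello hello world").group(1)

# The same check with a named backreference.
doubled_named = re.search(r"\b(?P<word>\w+) (?P=word)\b", "it is is fine").group("word")

# A non-capturing group repeats an alternation without creating a group.
runs = re.findall(r"(?:foo|bar)+", "foobar foo")

print(parts)          # {'year': '2024', 'month': '01', 'day': '31'}
print(reordered)      # 31/01/2024
print(doubled)        # hello
print(doubled_named)  # is
print(runs)           # ['foobar', 'foo']
```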

Lookarounds

Pattern Description Example
(?=abc) Positive lookahead — matches if followed by abc \d+(?= dollars) matches number only if followed by ' dollars'
(?!abc) Negative lookahead — matches if NOT followed by abc \d+(?! dollars) matches number not followed by ' dollars' (in '100 dollars' it still matches '10' by backtracking; use \b\d+\b(?! dollars) to prevent that)
(?<=abc) Positive lookbehind — matches if preceded by abc (?<=\$)\d+ matches digits preceded by '$'
(?<!abc) Negative lookbehind — matches if NOT preceded by abc (?<!\$)\d+ matches digits NOT preceded by '$'
(?=.*abc) Lookahead that allows characters before the target ^(?=.*\d)(?=.*[a-z]).{8,}$ password with digit and lowercase
(?<=\b)\w+ Lookbehind at word boundary (?<=\bpre)\w+ matches suffix after 'pre'
(?=(?:...)*$) Lookahead for repeated pattern to end of string ^(?=(?:\d{3})*$)\d+ matches digit strings whose length is a multiple of 3
(?<!\\)" Match quote not preceded by backslash (?<!\\)" matches unescaped double quotes
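
The lookarounds above can be sketched in Python as follows. The price string and passwords are invented for illustration. The example also shows a common pitfall: a negative lookahead after `\d+` can be satisfied by giving back a digit, so word boundaries are added to pin the number down.

```python
import re

prices = "100 dollars, 250 euros, $75"

# Lookahead and lookbehind check context without consuming it.
before_dollars = re.findall(r"\d+(?= dollars)", prices)
after_sign = re.findall(r"(?<=\$)\d+", prices)

# Pitfall: \d+ backtracks from '100' to '10', which is followed by '0', not ' dollars'.
naive = re.findall(r"\d+(?! dollars)", prices)

# \b on both sides forces the whole number to be matched or nothing.
whole_numbers = re.findall(r"\b\d+\b(?! dollars)", prices)

# Stacked lookaheads: at least one digit and one lowercase letter, 8+ characters.
password_re = re.compile(r"^(?=.*\d)(?=.*[a-z]).{8,}$")
ok = password_re.search("abcdefg1") is not None
no_lowercase = password_re.search("ABCDEFG1") is not None

print(before_dollars)  # ['100']
print(after_sign)      # ['75']
print(naive)           # ['10', '250', '75']
print(whole_numbers)   # ['250', '75']
print(ok, no_lowercase)  # True False
```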

Flags / Modifiers

Pattern Description Example
i Case-insensitive matching /hello/i matches 'Hello', 'HELLO', 'hello'
g Global: find all matches (not just first) /\d+/g finds all numbers in a string
m Multiline: ^ and $ match start/end of each line /^foo/m matches 'foo' at start of any line
s Dotall: . matches newline characters too /a.b/s matches 'a\nb' with dot matching newline
u Unicode: treat pattern and string as Unicode /\p{Emoji}/u matches emoji characters
y Sticky: match only from lastIndex position /foo/y matches 'foo' only at current position
x Extended: ignore whitespace and allow comments /hello # greeting/x ignores space and comment
(?i) Inline flag for case-insensitive (embedded in pattern) (?i)hello matches case-insensitively from that point
(?m) Inline multiline flag (?m)^start matches 'start' at beginning of each line
(?s) Inline dotall flag (?s)begin.*end matches across newlines
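
The slash-delimited examples above follow JavaScript/Perl notation. As a sketch of the Python equivalents: flags are passed as `re.IGNORECASE`, `re.MULTILINE`, `re.DOTALL`, and `re.VERBOSE`; there is no `g` flag because `re.findall()` and `re.finditer()` already return all matches; and `str` patterns are Unicode by default. The sample inputs are made up.

```python
import re

# i: case-insensitive matching.
case_insensitive = re.search(r"hello", "Say HELLO", re.IGNORECASE)

# m: ^ matches at the start of every line.
line_starts = re.findall(r"^foo", "foo\nfoo\nbar", re.MULTILINE)

# s: without DOTALL the dot stops at a newline.
dot_default = re.search(r"a.b", "a\nb")
dot_all = re.search(r"a.b", "a\nb", re.DOTALL)

# x: whitespace and # comments inside the pattern are ignored.
phone = re.compile(r"""
    \d{3}   # three-digit prefix
    -       # literal hyphen
    \d{4}   # four-digit line number
""", re.VERBOSE)

# Inline flag at the start of the pattern, same effect as re.IGNORECASE.
inline = re.search(r"(?i)hello", "Say HELLO")

print(case_insensitive.group())  # HELLO
print(line_starts)               # ['foo', 'foo']
print(dot_default, dot_all is not None)  # None True
print(phone.search("call 555-1234").group())  # 555-1234
print(inline.group())            # HELLO
```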

Common Patterns

Pattern Description Example
^[\w.-]+@[\w.-]+\.\w{2,}$ Basic email address validation [email protected], [email protected]
^https?:\/\/[\w\-.]+(?:\.[\w\-.]+)+[\/\w\-.?=%&#]*$ URL validation (http and https) https://example.com/path?q=1&r=2
^(?:\d{1,3}\.){3}\d{1,3}$ IPv4 address (basic; checks format only, so 999.999.999.999 also passes) 192.168.1.1, 10.0.0.255
^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$ MAC address 00:1A:2B:3C:4D:5E
^#?([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$ Hex color code #FF5733 or #F57 or FF5733
^\+?[1-9]\d{1,14}$ International phone number (E.164) +14155552671, +442071838750
^\d{4}-\d{2}-\d{2}$ Date in YYYY-MM-DD format 2024-01-31, 2000-12-25
^([01]\d|2[0-3]):[0-5]\d(:[0-5]\d)?$ Time in HH:MM or HH:MM:SS (24-hour) 14:30, 09:05:00, 23:59:59
^[a-zA-Z0-9_-]{3,16}$ Username: 3-16 chars, letters, digits, underscore, hyphen john_doe, user-123, MyName
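^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$ Strong password: min 8 chars, upper, lower, digit, special (only @$!%*?& are allowed as specials) P@ssw0rd!, Secure!123 (but not Secure#123, because '#' is outside the allowed set)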
^\d{5}(-\d{4})?$ US ZIP code (5 digit or ZIP+4) 90210, 10001-1234
^[A-Z]{2}\d{2}[A-Z0-9]{4}\d{7}([A-Z0-9]?){0,16}$ IBAN bank account number GB29NWBK60161331926819
<([a-z][a-z0-9]*)\b[^>]*>(.*?)<\/\1> Basic HTML tag with content (non-nested) <b>bold</b>, <span class="x">text</span>
^([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]))$ Strict date YYYY-MM-DD with month and day range validation (does not check days per month, so 2024-02-31 passes) 2024-01-31, 1999-12-25
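
Several of these patterns check only the shape of the input, so a validator usually pairs the regex with a second check. The sketch below is one possible way to do that in Python. The helper names `is_ipv4` and `is_real_date` are hypothetical, and `fullmatch` is used so that a trailing newline is not silently accepted by `$`.

```python
import re
from datetime import datetime

# Patterns copied from the table above.
IPV4_RE = re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$")
STRICT_DATE_RE = re.compile(r"^([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]))$")
TIME_RE = re.compile(r"^([01]\d|2[0-3]):[0-5]\d(:[0-5]\d)?$")


def is_ipv4(text):
    """Hypothetical helper: format check by regex, then a 0-255 range check."""
    if not IPV4_RE.fullmatch(text):
        return False
    return all(int(part) <= 255 for part in text.split("."))


def is_real_date(text):
    """Hypothetical helper: regex for the shape, datetime for calendar validity."""
    if not STRICT_DATE_RE.fullmatch(text):
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True


print(is_ipv4("192.168.1.1"), is_ipv4("999.1.1.1"))             # True False
print(is_real_date("2024-01-31"), is_real_date("2024-02-31"))   # True False
print(TIME_RE.fullmatch("23:59:59") is not None)                # True
print(TIME_RE.fullmatch("24:00") is not None)                   # False
```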

Frequently Asked Questions

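A regular expression (regex) is a string that defines a search pattern. It is used in programming languages, text editors, and command-line tools to search, match, and transform text. For example, the pattern \d{3}-\d{4} matches phone-number formats such as '555-1234'.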

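Most programming languages have built-in regex support. In JavaScript, use /pattern/flags with methods such as .test(), .match(), and .replace(). In Python, import the re module and use re.search(), re.findall(), and re.sub(). In PHP, use preg_match(), preg_match_all(), and preg_replace(). Most patterns in this cheat sheet can be copied and pasted into these functions as they are, although engine-specific syntax (for example possessive quantifiers, \z, or branch reset groups) is not supported everywhere.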
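
A minimal Python sketch of the three `re` functions named above, run on an invented log line:

```python
import re

log = "user=alice id=17; user=bob id=42"

# re.search returns the first match (or None).
first_id = re.search(r"id=(\d+)", log).group(1)

# re.findall returns every match; with one group it returns the group text.
all_ids = re.findall(r"id=(\d+)", log)

# re.sub replaces every match.
masked = re.sub(r"user=\w+", "user=***", log)

print(first_id)  # 17
print(all_ids)   # ['17', '42']
print(masked)    # user=*** id=17; user=*** id=42
```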

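Greedy quantifiers (*, +, {n,}) try to match as much text as possible. Lazy (non-greedy) quantifiers (*?, +?, {n,}?) try to match as little as possible. For example, given the input '<b>bold</b><i>italic</i>', the greedy pattern <.+> matches everything from the first < to the last >, while the lazy <.+?> matches only the individual tags, such as '<b>'.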
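
The difference is easy to see in Python. The HTML snippet below is an assumed sample input, chosen to have two adjacent tag pairs:

```python
import re

html = "<b>bold</b><i>italic</i>"

# Greedy: .+ runs to the end, then backs off only as far as the last '>'.
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first '>' it can reach.
lazy = re.findall(r"<.+?>", html)

print(greedy)  # ['<b>bold</b><i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```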

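A lookahead (?=...) asserts that the text after the current position matches the pattern, without including it in the match. A lookbehind (?<=...) asserts that the text before the current position matches the pattern. The negative forms (?!...) and (?<!...) assert that the pattern does not match.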

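Regular expressions are widely used for validating input formats (email addresses, phone numbers, postal codes), searching and replacing text in code editors, parsing log files and extracting data, URL routing in web frameworks, form validation on both client and server, and data scraping. The "Common Patterns" section of this cheat sheet provides ready-to-use samples for these scenarios.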