Regex Cheat Sheet

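Q: How do I use a regex pattern in my code?

Most programming languages have built-in regex support. In JavaScript, write /pattern/flags and use it with methods such as .test(), .match(), and .replace(). In Python, import the re module and call re.search(), re.findall(), or re.sub(). In PHP, use preg_match(), preg_match_all(), and preg_replace(). Most patterns on this cheat sheet can be copied and pasted straight into these functions; a few constructs (for example possessive quantifiers, \G, and branch reset groups) work only in some engines.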
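As a minimal sketch of the Python functions named above, the snippet below runs one date pattern through re.search(), re.findall(), and re.sub(). The sample sentence and the "<date>" placeholder are made up for illustration.

```python
import re

# Hypothetical sample text, used only for this illustration.
text = "Order 66 shipped on 2024-01-31; order 67 ships on 2024-02-02."
date_pattern = r"\d{4}-\d{2}-\d{2}"

# re.search returns a match object for the first match, or None.
first_date = re.search(date_pattern, text)

# re.findall returns every non-overlapping match as a list of strings.
all_dates = re.findall(date_pattern, text)

# re.sub replaces every match with the replacement string.
masked = re.sub(date_pattern, "<date>", text)

print(first_date.group())
print(all_dates)
print(masked)
```

Writing the pattern as a raw string (the r"..." prefix) keeps Python from interpreting the backslashes before the regex engine sees them.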

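Q: What are the most common real-world uses of regex?

Regular expressions are widely used for validating input formats (email addresses, phone numbers, postal codes), search and replace in code editors, parsing log files and extracting data, URL routing in web frameworks, form validation on both client and server, and data scraping. The "Common Patterns" section of this cheat sheet has ready-to-use samples for these scenarios.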

Regex quick reference: search, copy, and apply

Anchors

Pattern	Description	Example
`^`	Start of string (or start of line in multiline mode)	`^Hello matches 'Hello world' but not 'Say Hello'`
`$`	End of string (or end of line in multiline mode)	`world$ matches 'Hello world' but not 'world peace'`
`\A`	Start of string (never matches at line breaks)	`\AHello matches only if string starts with 'Hello'`
`\Z`	End of string (before optional final newline)	`world\Z matches 'Hello world' at end`
`\z`	Absolute end of string	`end\z matches absolute end of string`
`\b`	Word boundary (between \w and \W)	`\bcat\b matches 'cat' in 'the cat sat' but not 'catch'`
`\B`	Non-word boundary	`\Bcat\B matches 'cat' in 'concatenate' but not standalone 'cat'`
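`\G`	Position where the previous match ended, or the start of the string on the first attempt (not available in JavaScript or Python's re)	`\Gfoo matches consecutive 'foo' occurrences, each starting where the last match ended`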
`(?m)^`	Start of each line in multiline mode	`(?m)^line matches 'line' at start of any line`
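The difference between ^ with and without multiline mode, and the effect of \b, can be checked with a short Python sketch. The sample strings are invented for the example.

```python
import re

text = "Hello world\nSay Hello\nHello again"

# Without MULTILINE, ^ matches only at the very start of the string.
start_only = re.findall(r"^Hello", text)

# With MULTILINE, ^ also matches right after each newline.
each_line = re.findall(r"^Hello", text, flags=re.MULTILINE)

# \b stops 'cat' from matching inside a longer word such as 'catch'.
whole_words = re.findall(r"\bcat\b", "the cat sat near the catch")

print(start_only, each_line, whole_words)
```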

Quantifiers

Pattern	Description	Example
`*`	Match 0 or more times (greedy)	`ab*c matches 'ac', 'abc', 'abbc', 'abbbc'`
`+`	Match 1 or more times (greedy)	`ab+c matches 'abc', 'abbc' but not 'ac'`
`?`	Match 0 or 1 time (greedy)	`colou?r matches 'color' and 'colour'`
`{n}`	Match exactly n times	`a{3} matches 'aaa' only`
`{n,}`	Match n or more times	`a{2,} matches 'aa', 'aaa', 'aaaa', etc.`
`{n,m}`	Match between n and m times (inclusive)	`a{2,4} matches 'aa', 'aaa', 'aaaa'`
`*?`	Match 0 or more times (lazy/non-greedy)	`<.*?> matches '<b>' in '<b>bold</b>' instead of whole string`
`+?`	Match 1 or more times (lazy/non-greedy)	`a+? matches a single 'a' at a time in 'aaa'`
`??`	Match 0 or 1 time (lazy/non-greedy)	`colou??r matches 'color' and 'colour'; the engine tries skipping the 'u' first`
`{n,m}?`	Match between n and m times (lazy)	`a{2,4}? matches 'aa' preferring fewer repetitions`
`*+`	Possessive: match 0 or more, never backtrack	`a*+b — possessive, won't give back matched a's`
`++`	Possessive: match 1 or more, never backtrack	`\d++[abc] — possessive digit matching`
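To see how the basic quantifiers differ, the following Python sketch filters a small word list with re.fullmatch(), which requires the whole string to match. The word list is an assumption chosen to mirror the examples in the table.

```python
import re

words = ["ac", "abc", "abbc", "abbbbc"]

# Each list keeps only the words that match the pattern in full.
star = [w for w in words if re.fullmatch(r"ab*c", w)]         # zero or more 'b'
plus = [w for w in words if re.fullmatch(r"ab+c", w)]         # one or more 'b'
bounded = [w for w in words if re.fullmatch(r"ab{1,2}c", w)]  # one or two 'b'

print(star)
print(plus)
print(bounded)
```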

Character Classes

Pattern	Description	Example
`.`	Any character except newline (by default)	`a.b matches 'axb', 'a2b', 'a b' but not 'a\nb'`
`\d`	Any digit [0-9]	`\d+ matches '123', '42', '0'`
`\D`	Any non-digit [^0-9]	`\D+ matches 'abc', 'foo bar'`
`\w`	Any word character [a-zA-Z0-9_]	`\w+ matches 'hello', 'foo_bar', 'Test123'`
`\W`	Any non-word character [^a-zA-Z0-9_]	`\W+ matches '!@#', ' ', '->'`
`\s`	Any whitespace character (space, tab, newline, etc.)	`\s+ matches spaces, tabs, newlines between words`
`\S`	Any non-whitespace character	`\S+ matches words or tokens without spaces`
`[abc]`	Character class: matches a, b, or c	`[aeiou] matches any vowel`
`[^abc]`	Negated class: matches any char except a, b, c	`[^0-9] matches any non-digit character`
`[a-z]`	Range: matches any lowercase letter a through z	`[a-z]+ matches 'hello', 'world'`
`[a-zA-Z]`	Range: matches any letter (upper or lower)	`[a-zA-Z]+ matches alphabetic strings`
`[0-9a-fA-F]`	Hexadecimal digit	`[0-9a-fA-F]{6} matches hex color like 'FF5733'`
`\p{L}`	Unicode letter (PCRE/Unicode mode)	`\p{L}+ matches letters in any language`
`\p{N}`	Unicode number	`\p{N}+ matches numeric characters including non-ASCII digits`
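The shorthand classes and bracket classes above can be tried out with Python's re module, as in this small sketch. The sample strings are hypothetical. Note that in Python 3, \d and \w also match non-ASCII digits and letters unless re.ASCII is set.

```python
import re

text = "user_1 paid 42 USD"

word_tokens = re.findall(r"\w+", text)     # letters, digits, and underscore
digit_runs = re.findall(r"\d+", text)      # also finds the '1' inside 'user_1'
upper_runs = re.findall(r"[A-Z]+", text)   # uppercase ASCII letters only

# \s+ treats any run of spaces, tabs, or newlines as one separator.
fields = re.split(r"\s+", "a  b\tc\nd")

# A bracket class with a fixed count: exactly six hexadecimal digits.
is_hex = re.fullmatch(r"[0-9a-fA-F]{6}", "FF5733") is not None
not_hex = re.fullmatch(r"[0-9a-fA-F]{6}", "GG5733") is not None

print(word_tokens, digit_runs, upper_runs, fields, is_hex, not_hex)
```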

Groups & References

Pattern	Description	Example
`(abc)`	Capturing group: captures matched text	`(\d{4})-(\d{2})-(\d{2}) captures year, month, day`
`(?:abc)`	Non-capturing group: groups without capturing	`(?:foo|bar)+ matches 'foo', 'bar', 'foofoo', 'foobar'`
`(?P<name>abc)`	Named capturing group (Python/PCRE syntax)	`(?P<year>\d{4}) captures year by name`
`(?<name>abc)`	Named capturing group (.NET, PCRE2, Java, JavaScript syntax)	`(?<year>\d{4}) captures year by name`
`\1`	Backreference to group 1	`(\w+) \1 matches repeated words like 'hello hello'`
`\k<name>`	Named backreference	`(?<word>\w+) \k<word> matches repeated named word`
`(?|...)`	Branch reset group: alternatives share group numbers (PCRE/Perl)	`(?|(a)|(b)) both alternatives use group 1`
`(?>abc)`	Atomic group: no backtracking into the group once it has matched	`(?>a|ab)c is atomic and won't retry the 'ab' alternative, so it fails on 'abc'`
`\g{1}`	Backreference using \g syntax (PCRE)	`(\w+) \g{1} same as \1, but unambiguous when digits follow`
`\g<name>`	Recursive reference to named group	`(?<balanced>\((?:[^()]|\g<balanced>)*\)) matches balanced parentheses recursively`
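A short Python sketch of capturing groups, named groups, and backreferences follows. It uses the (?P<name>...) syntax, since that is the form Python's re module accepts. The sample strings are made up.

```python
import re

# Named groups can be read by name or by position.
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})",
              "Released 2024-01-31")
parts = m.groupdict()

# \1 inside the pattern refers back to whatever group 1 captured.
dup = re.search(r"\b(\w+) \1\b", "this is is a test")

# In the replacement string, \1, \2, \3 insert the captured groups.
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", "2024-01-31")

print(parts, m.group(2), dup.group(1), swapped)
```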

Lookarounds

Pattern	Description	Example
`(?=abc)`	Positive lookahead — matches if followed by abc	`\d+(?= dollars) matches number only if followed by ' dollars'`
`(?!abc)`	Negative lookahead — matches if NOT followed by abc	`\d+(?! dollars) matches number not followed by ' dollars'`
`(?<=abc)`	Positive lookbehind — matches if preceded by abc	`(?<=\$)\d+ matches digits preceded by '$'`
`(?<!abc)`	Negative lookbehind — matches if NOT preceded by abc	`(?<!\$)\d+ matches digits NOT preceded by '$'`
`(?=.*abc)`	Lookahead that allows characters before the target	`^(?=.*\d)(?=.*[a-z]).{8,}$ requires at least 8 characters including a digit and a lowercase letter`
`(?<=\b)\w+`	Lookbehind at word boundary	`(?<=\bpre)\w+ matches suffix after 'pre'`
`(?=(?:...)*$)`	Lookahead for repeated pattern to end of string	`^(?=(?:\d{3})*$)\d+ matches digit strings whose length is a multiple of 3`
`(?<!\\)"`	Match quote not preceded by backslash	`(?<!\\)" matches unescaped double quotes`
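Stacked lookaheads are a common way to require several conditions at once. The sketch below wraps a pattern of that shape (at least 8 characters, one digit, one lowercase letter) in a helper. The names PASSWORD_RE and is_acceptable are hypothetical, chosen only for this illustration.

```python
import re

# Each lookahead scans ahead from the start without consuming text,
# so both conditions are checked before .{8,} matches the whole string.
PASSWORD_RE = re.compile(r"^(?=.*\d)(?=.*[a-z]).{8,}$")


def is_acceptable(candidate: str) -> bool:
    """Return True if candidate satisfies every lookahead and the length rule."""
    return PASSWORD_RE.search(candidate) is not None


for sample in ["abc12345", "abcdefgh", "12345678", "a1"]:
    print(sample, is_acceptable(sample))
```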

Flags / Modifiers

Pattern	Description	Example
`i`	Case-insensitive matching	`/hello/i matches 'Hello', 'HELLO', 'hello'`
`g`	Global: find all matches (not just first)	`/\d+/g finds all numbers in a string`
`m`	Multiline: ^ and $ match start/end of each line	`/^foo/m matches 'foo' at start of any line`
`s`	Dotall: . matches newline characters too	`/a.b/s matches 'a\nb' with dot matching newline`
`u`	Unicode: treat pattern and string as Unicode	`/\p{Emoji}/u matches emoji characters`
`y`	Sticky: match only from lastIndex position	`/foo/y matches 'foo' only at current position`
`x`	Extended: ignore whitespace and allow comments	`/hello # greeting/x ignores space and comment`
`(?i)`	Inline flag for case-insensitive (embedded in pattern)	`(?i)hello matches case-insensitively from that point`
`(?m)`	Inline multiline flag	`(?m)^start matches 'start' at beginning of each line`
`(?s)`	Inline dotall flag	`(?s)begin.*end matches across newlines`
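The slash-delimited flags in the table are JavaScript/PCRE style. In Python the same behaviour is selected through constants on the re module, and there is no g flag because re.findall() and re.finditer() already return all matches. A minimal sketch with invented sample strings:

```python
import re

text = "Hello\nhello\nHELLO"

# i: case-insensitive matching.
any_case = re.findall(r"hello", text, flags=re.IGNORECASE)

# m: ^ and $ match at line boundaries; flags combine with |.
no_multiline = re.findall(r"^hello$", text, flags=re.IGNORECASE)
per_line = re.findall(r"^hello$", text, flags=re.IGNORECASE | re.MULTILINE)

# s: by default . stops at a newline; DOTALL lets it match one.
dot_default = re.search(r"a.b", "a\nb")
dot_all = re.search(r"a.b", "a\nb", flags=re.DOTALL)

# Inline flag at the start of the pattern, equivalent to re.IGNORECASE.
inline = re.search(r"(?i)hello", "HELLO")

# x: VERBOSE ignores pattern whitespace and allows # comments.
year_month = re.compile(r"""
    (\d{4})   # year
    -
    (\d{2})   # month
""", re.VERBOSE)

print(any_case, no_multiline, per_line)
print(dot_default, dot_all.group(), inline.group())
print(year_month.search("2024-01").groups())
```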

Common Patterns

Pattern	Description	Example
`^[\w.-]+@[\w.-]+\.\w{2,}$`	Basic email address validation	`user@example.com, first.last@example.org`
`^https?:\/\/[\w\-.]+(?:\.[\w\-.]+)+[\/\w\-.?=%&#]*$`	URL validation (http and https)	`https://example.com/path?q=1&r=2`
`^(?:\d{1,3}\.){3}\d{1,3}$`	IPv4 address (basic)	`192.168.1.1, 10.0.0.255`
`^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$`	MAC address	`00:1A:2B:3C:4D:5E`
`^#?([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$`	Hex color code	`#FF5733 or #F57 or FF5733`
`^\+?[1-9]\d{1,14}$`	International phone number (E.164)	`+14155552671, +442071838750`
`^\d{4}-\d{2}-\d{2}$`	Date in YYYY-MM-DD format	`2024-01-31, 2000-12-25`
`^([01]\d|2[0-3]):[0-5]\d(:[0-5]\d)?$`	Time in HH:MM or HH:MM:SS (24-hour)	`14:30, 09:05:00, 23:59:59`
`^[a-zA-Z0-9_-]{3,16}$`	Username: 3-16 chars, letters, digits, underscore, hyphen	`john_doe, user-123, MyName`
`^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%?&])[A-Za-z\d@$!%?&]{8,}$`	Strong password: min 8 chars, upper, lower, digit, special (one of @$!%?&)	`P@ssw0rd!, Secure!123`
`^\d{5}(-\d{4})?$`	US ZIP code (5 digit or ZIP+4)	`90210, 10001-1234`
`^[A-Z]{2}\d{2}[A-Z0-9]{4}\d{7}([A-Z0-9]?){0,16}$`	IBAN bank account number	`GB29NWBK60161331926819`
`<([a-z][a-z0-9]*)\b[^>]*>(.*?)<\/\1>`	Basic HTML tag with content (non-nested)	`<b>bold</b>, <span class="x">text</span>`
`^([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]))$`	Strict date YYYY-MM-DD with range validation	`2024-01-31, 1999-12-25`
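One way to put a few of these patterns to work is a small lookup table of validators, sketched below in Python. The PATTERNS dictionary and the is_valid helper are hypothetical names. The ^ and $ anchors are dropped because re.fullmatch() already requires the whole string to match.

```python
import re

# Patterns taken from the table above, without the ^...$ anchors.
PATTERNS = {
    "hex_color": r"#?([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})",
    "time_24h": r"([01]\d|2[0-3]):[0-5]\d(:[0-5]\d)?",
    "us_zip": r"\d{5}(-\d{4})?",
    "strict_date": r"[12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])",
}


def is_valid(kind: str, value: str) -> bool:
    """Return True if value matches the named pattern from start to end."""
    return re.fullmatch(PATTERNS[kind], value) is not None


print(is_valid("hex_color", "#F57"))
print(is_valid("time_24h", "24:00"))
print(is_valid("us_zip", "10001-1234"))
print(is_valid("strict_date", "2024-13-01"))
```

Using re.fullmatch() also avoids a Python quirk: with re.search(), $ still matches just before a trailing newline, so "90210\n" would pass an anchored ZIP pattern. Format checks like these do not catch impossible values such as 2024-02-31.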

Frequently Asked Questions

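Q: What is a regular expression (regex)?

A regular expression (regex) is a string that defines a search pattern. It is used in programming languages, text editors, and command-line tools to search, match, and transform text. For example, the pattern \d{3}-\d{4} matches a phone-number format such as "555-1234".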

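Q: What is the difference between greedy and lazy quantifiers?

Greedy quantifiers (*, +, {n,}) match as much text as possible. Lazy (non-greedy) quantifiers (*?, +?, {n,}?) match as little text as possible. For example, given the input <b>bold</b> and <i>italic</i>, the greedy pattern <.+> matches everything from the first < to the last >, while the lazy <.+?> matches only the individual tags, such as <b>.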
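The greedy and lazy behaviour described above can be checked directly. This Python sketch assumes an input of the form <b>bold</b> and <i>italic</i>, chosen for illustration.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the end of the line, then backs off to the last '>'.
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first '>' it can reach, so each tag matches alone.
lazy = re.findall(r"<.+?>", html)

print(greedy)
print(lazy)
```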

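Q: What is the difference between lookahead and lookbehind?

A lookahead (?=...) asserts that the text after the current position matches the pattern, without including it in the match. A lookbehind (?<=...) asserts that the text before the current position matches the pattern. The negative versions, (?!...) and (?<!...), assert that the pattern does not match at that position.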
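A small Python sketch of lookahead and lookbehind on a made-up price sentence is shown below. Note that Python's re module only accepts fixed-width lookbehind patterns, which these are.

```python
import re

text = "Coffee costs $40 and tea costs 3 dollars"

# Lookbehind: digits that come right after a '$' (the '$' is not captured).
after_sign = re.findall(r"(?<=\$)\d+", text)

# Lookahead: digits that are followed by ' dollars' (the word is not captured).
before_word = re.findall(r"\d+(?= dollars)", text)

# Negative lookbehind: the class also excludes a preceding digit, otherwise
# the '0' in '$40' would match because it follows '4' rather than '$'.
no_sign = re.findall(r"(?<![$\d])\d+", text)

print(after_sign, before_word, no_sign)
```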
