Reference guide to Java regex patterns, based on Oracle's official documentation. The constructs below are taken from the java.util.regex.Pattern class.
Construct | Matches | Category | Examples | Action |
---|---|---|---|---|
x | The character x | Characters | a Z 5 | |
\\ | The backslash character | Characters | \ | |
\0n | The character with octal value 0n (0 <= n <= 7) | Characters | \01 \07 | |
\0nn | The character with octal value 0nn (0 <= n <= 7) | Characters | \012 \077 | |
\0mnn | The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7) | Characters | \0123 \0377 | |
\xhh | The character with hexadecimal value 0xhh | Characters | \x41 \xFF | |
\uhhhh | The character with hexadecimal value 0xhhhh | Characters | \u0041 \u4E2D | |
\x{h...h} | The character with hexadecimal value 0xh...h | Characters | \x{41} \x{1F600} | |
\t | The tab character ('\u0009') | Characters | \t | |
\n | The newline (line feed) character ('\u000A') | Characters | \n | |
\r | The carriage-return character ('\u000D') | Characters | \r | |
\f | The form-feed character ('\u000C') | Characters | \f | |
\a | The alert (bell) character ('\u0007') | Characters | \a | |
\e | The escape character ('\u001B') | Characters | \e | |
\cx | The control character corresponding to x | Characters | \cA \cZ | |
[abc] | a, b, or c (simple class) | Character Classes | [aeiou] [123] | |
[^abc] | Any character except a, b, or c (negation) | Character Classes | [^aeiou] [^0-9] | |
[a-zA-Z] | a through z or A through Z, inclusive (range) | Character Classes | [a-z] [A-Z] [0-9] | |
[a-d[m-p]] | a through d, or m through p: [a-dm-p] (union) | Character Classes | [a-c[x-z]] | |
[a-z&&[def]] | d, e, or f (intersection) | Character Classes | [a-z&&[aeiou]] | |
[a-z&&[^bc]] | a through z, except for b and c: [ad-z] (subtraction) | Character Classes | [a-z&&[^aeiou]] | |
[a-z&&[^m-p]] | a through z, and not m through p: [a-lq-z] (subtraction) | Character Classes | [0-9&&[^456]] | |
. | Any character (may or may not match line terminators) | Predefined Character Classes | . | |
\d | A digit: [0-9] | Predefined Character Classes | \d | |
\D | A non-digit: [^0-9] | Predefined Character Classes | \D | |
\s | A whitespace character: [ \t\n\x0B\f\r] | Predefined Character Classes | \s | |
\S | A non-whitespace character: [^\s] | Predefined Character Classes | \S | |
\w | A word character: [a-zA-Z_0-9] | Predefined Character Classes | \w | |
\W | A non-word character: [^\w] | Predefined Character Classes | \W | |
\p{Lower} | A lower-case alphabetic character: [a-z] | POSIX Character Classes | \p{Lower} | |
\p{Upper} | An upper-case alphabetic character: [A-Z] | POSIX Character Classes | \p{Upper} | |
\p{ASCII} | All ASCII: [\x00-\x7F] | POSIX Character Classes | \p{ASCII} | |
\p{Alpha} | An alphabetic character: [\p{Lower}\p{Upper}] | POSIX Character Classes | \p{Alpha} | |
\p{Digit} | A decimal digit: [0-9] | POSIX Character Classes | \p{Digit} | |
\p{Alnum} | An alphanumeric character: [\p{Alpha}\p{Digit}] | POSIX Character Classes | \p{Alnum} | |
\p{Punct} | Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{\|}~ | POSIX Character Classes | \p{Punct} | |
\p{Graph} | A visible character: [\p{Alnum}\p{Punct}] | POSIX Character Classes | \p{Graph} | |
\p{Print} | A printable character: [\p{Graph}\x20] | POSIX Character Classes | \p{Print} | |
\p{Blank} | A space or a tab: [ \t] | POSIX Character Classes | \p{Blank} | |
\p{Cntrl} | A control character: [\x00-\x1F\x7F] | POSIX Character Classes | \p{Cntrl} | |
\p{XDigit} | A hexadecimal digit: [0-9a-fA-F] | POSIX Character Classes | \p{XDigit} | |
\p{Space} | A whitespace character: [ \t\n\x0B\f\r] | POSIX Character Classes | \p{Space} | |
\p{javaLowerCase} | Equivalent to java.lang.Character.isLowerCase() | Java Character Classes | \p{javaLowerCase} | |
\p{javaUpperCase} | Equivalent to java.lang.Character.isUpperCase() | Java Character Classes | \p{javaUpperCase} | |
\p{javaWhitespace} | Equivalent to java.lang.Character.isWhitespace() | Java Character Classes | \p{javaWhitespace} | |
\p{javaMirrored} | Equivalent to java.lang.Character.isMirrored() | Java Character Classes | \p{javaMirrored} | |
\p{IsLatin} | A Latin script character (script) | Unicode Classes | \p{IsLatin} | |
\p{InGreek} | A character in the Greek block (block) | Unicode Classes | \p{InGreek} | |
\p{Lu} | An uppercase letter (category) | Unicode Classes | \p{Lu} | |
\p{IsAlphabetic} | An alphabetic character (binary property) | Unicode Classes | \p{IsAlphabetic} | |
\p{Sc} | A currency symbol | Unicode Classes | \p{Sc} | |
\P{InGreek} | Any character except one in the Greek block (negation) | Unicode Classes | \P{InGreek} | |
[\p{L}&&[^\p{Lu}]] | Any letter except an uppercase letter (subtraction) | Unicode Classes | [\p{L}&&[^\p{Lu}]] | |
^ | The beginning of a line | Boundary Matchers | ^ | |
$ | The end of a line | Boundary Matchers | $ | |
\b | A word boundary | Boundary Matchers | \b | |
\B | A non-word boundary | Boundary Matchers | \B | |
\A | The beginning of the input | Boundary Matchers | \A | |
\G | The end of the previous match | Boundary Matchers | \G | |
\Z | The end of the input but for the final terminator, if any | Boundary Matchers | \Z | |
\z | The end of the input | Boundary Matchers | \z | |
X? | X, once or not at all | Greedy Quantifiers | a? colou?r | |
X* | X, zero or more times | Greedy Quantifiers | a* \d* | |
X+ | X, one or more times | Greedy Quantifiers | a+ \d+ | |
X{n} | X, exactly n times | Greedy Quantifiers | a{3} \d{4} | |
X{n,} | X, at least n times | Greedy Quantifiers | a{3,} \d{2,} | |
X{n,m} | X, at least n but not more than m times | Greedy Quantifiers | a{2,4} \d{3,5} | |
X?? | X, once or not at all (reluctant) | Reluctant Quantifiers | a?? | |
X*? | X, zero or more times (reluctant) | Reluctant Quantifiers | a*? | |
X+? | X, one or more times (reluctant) | Reluctant Quantifiers | a+? | |
X{n}? | X, exactly n times (reluctant) | Reluctant Quantifiers | a{3}? | |
X{n,}? | X, at least n times (reluctant) | Reluctant Quantifiers | a{3,}? | |
X{n,m}? | X, at least n but not more than m times (reluctant) | Reluctant Quantifiers | a{2,4}? | |
X?+ | X, once or not at all (possessive) | Possessive Quantifiers | a?+ | |
X*+ | X, zero or more times (possessive) | Possessive Quantifiers | a*+ | |
X++ | X, one or more times (possessive) | Possessive Quantifiers | a++ | |
X{n}+ | X, exactly n times (possessive) | Possessive Quantifiers | a{3}+ | |
X{n,}+ | X, at least n times (possessive) | Possessive Quantifiers | a{3,}+ | |
X{n,m}+ | X, at least n but not more than m times (possessive) | Possessive Quantifiers | a{2,4}+ | |
XY | X followed by Y | Logical Operators | ab \d\w | |
X\|Y | Either X or Y | Logical Operators | cat\|dog \d+\|\w+ | |
(X) | X, as a capturing group | Logical Operators | (abc) (\d+) | |
\n | Whatever the nth capturing group matched (here n is a group number, not the newline escape) | Back References | \1 \2 | |
\Q | Nothing, but quotes all characters until \E | Quotation | \Q...\E | |
\E | Nothing, but ends quoting started by \Q | Quotation | \Q...\E | |
(?:X) | X, as a non-capturing group | Special Constructs | (?:abc) (?:\d+) | |
(?idmsuxU-idmsuxU) | Nothing, but turns match flags i d m s u x U on - off | Special Constructs | (?i) (?-i) | |
(?idmsux-idmsux:X) | X, as a non-capturing group with the given flags i d m s u x on - off | Special Constructs | (?i:abc) | |
(?=X) | X, via zero-width positive lookahead | Special Constructs | (?=\d) | |
(?!X) | X, via zero-width negative lookahead | Special Constructs | (?!\d) | |
(?<=X) | X, via zero-width positive lookbehind | Special Constructs | (?<=\d) | |
(?<!X) | X, via zero-width negative lookbehind | Special Constructs | (?<!\d) | |
(?>X) | X, as an independent, non-capturing group | Special Constructs | (?>\d+) | |
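
The difference between the greedy, reluctant, and possessive quantifier rows is easiest to see by running all three against the same input. The following is a minimal sketch, not part of Oracle's documentation: the class name `QuantifierDemo` and the helper `firstMatch` are hypothetical names chosen for illustration. Note that every backslash in the table must be doubled inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: returns the first substring of input that regex
    // matches, or null if there is no match.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";

        // Greedy: .* takes the whole input, then backtracks until "foo" fits.
        System.out.println(firstMatch(".*foo", input));   // xfooxxxxxxfoo

        // Reluctant: .*? grows one character at a time and stops at the first "foo".
        System.out.println(firstMatch(".*?foo", input));  // xfoo

        // Possessive: .*+ takes the whole input and never gives characters back,
        // so the trailing "foo" can never match.
        System.out.println(firstMatch(".*+foo", input));  // null

        // Character class subtraction: lower-case letters that are not vowels.
        System.out.println(firstMatch("[a-z&&[^aeiou]]+", "eastern"));  // st
    }
}
```

Possessive quantifiers trade flexibility for speed: because they never backtrack, they can fail quickly on inputs where a greedy quantifier would try many alternatives first.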
This comprehensive reference is based on Oracle's official Java SE 7 documentation for the java.util.regex.Pattern class.
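
Several rows in the table describe constructs that match "nothing" or match at a position rather than a character (back references, quotation, lookarounds, boundary matchers, inline flags). A short sketch may make their behavior more concrete. The class name `ConstructDemo` and the helpers `firstMatch` and `found` are hypothetical, introduced only for this example; the sample strings are likewise illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConstructDemo {
    // Hypothetical helper: first substring of input matched by regex, or null.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // Hypothetical helper: true if regex matches anywhere in input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Back reference: \1 must repeat whatever group 1 captured.
        System.out.println(firstMatch("(\\w)\\1", "hello"));           // ll

        // Quotation: between \Q and \E the + is a literal plus sign.
        System.out.println(Pattern.matches("\\Q1+1\\E", "1+1"));       // true
        System.out.println(Pattern.matches("1+1", "1+1"));             // false

        // Positive lookahead: digits only if followed by " USD" (not consumed).
        System.out.println(firstMatch("\\d+(?= USD)", "pay 100 USD")); // 100

        // Positive lookbehind: digits only if preceded by a dollar sign.
        System.out.println(firstMatch("(?<=\\$)\\d+", "cost $42"));    // 42

        // \Z tolerates one final line terminator; \z does not.
        System.out.println(found("abc\\Z", "abc\n"));                  // true
        System.out.println(found("abc\\z", "abc\n"));                  // false

        // Inline flags: (?i) turns on case-insensitive matching,
        // (?m) makes ^ match at the start of every line.
        System.out.println(found("(?i)java", "JAVA"));                 // true
        System.out.println(firstMatch("(?m)^b\\w+", "alpha\nbeta"));   // beta
    }
}
```

Lookarounds are zero-width, so the surrounding text they check (here " USD" and "$") is not part of the returned match.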