Quick reference to Regular Expressions (and search) in Python.

Pattern | Does what |
---|---|
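^Python | Match "Python" at the start of the string (or at the start of any line when re.M is set). |
Python$ | Match "Python" at the end of the string or just before a trailing newline (or at the end of any line when re.M is set). |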
\APython | Match "Python" at the start of a string. |
Python\Z | Match "Python" at the end of a string. |
\bPython\b | Match "Python" at a word boundary. |
\bfoo\B | \B is nonword boundary: match "foo" in "food" and "fool" but not alone. |
Python(?=!) | Match "Python" if followed by an exclamation point. |
Python(?!!) | Match "Python" if not followed by an exclamation point. |
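
The anchors and lookaheads in the table above can be tried out with a short sketch like the following. The sample strings and variable names are made up for illustration; only the `re` calls themselves come from the standard library.

```python
import re

text = "Python is fun\nLearn Python!"

# Without flags, ^ anchors only at the start of the whole string.
starts_default = re.findall(r"^\w+", text)
# With re.M, ^ also anchors right after every newline.
starts_multiline = re.findall(r"^\w+", text, re.M)
# \A never matches at an inner line start, even with re.M.
a_anchor = re.findall(r"\A\w+", text, re.M)

# \b requires a word boundary on both sides of "Python".
whole_words = re.findall(r"\bPython\b", "Python Pythonic")
# \B requires that "foo" is followed by another word character.
foo_starts = [m.start() for m in re.finditer(r"\bfoo\B", "foo food fool")]

# Lookaheads test what follows without consuming it.
followed = re.search(r"Python(?=!)", text)      # the "Python" before "!"
not_followed = re.search(r"Python(?!!)", text)  # the "Python" before " is"

print(starts_default, starts_multiline, a_anchor)
print(whole_words, foo_starts)
print(followed.start(), not_followed.start())
```

Note that a lookahead match contains only "Python": the exclamation point is checked but is not part of the matched text.
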

Pattern, method, or flag | Does what |
---|---|
<.*> | Greedy repetition: matches "<python>perl>". |
<.*?> | Nongreedy repetition: matches "<python>" in "<python>perl>". |
group(num=0) | This method returns the entire match (or the specific subgroup num). |
groups() | This method returns all subgroups of the match in a tuple (empty if the pattern contains no groups). |
re.I | Option flag: Performs case-insensitive matching. |
re.L | Option flag: Interprets words according to the current locale. This interpretation affects the alphabetic group (\w and \W), as well as word boundary behavior (\b and \B). In Python 3 it can be used only with bytes patterns. |
re.M | Option flag: Makes $ match the end of a line (not just the end of the string) and makes ^ match the start of any line (not just the start of the string). |
re.S | Option flag: Makes a period (dot) match any character including a newline. |
re.U | Option flag: Interprets letters according to the Unicode character set. This flag affects the behavior of \w, \W, \b, \B. In Python 3 this is already the default for str patterns, so the flag is redundant there. |
re.X | Option flag: Permits "nice" regular expression syntax. It ignores whitespace (except inside a set [] or when escaped by a backslash) and treats unescaped # as a comment marker. |
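
As a rough illustration of the rows above, the sketch below contrasts greedy and nongreedy repetition, reads groups from a match object, and applies each commonly used flag. All input strings and names are invented for the example.

```python
import re

tagged = "<python>perl>"
greedy = re.search(r"<.*>", tagged).group()   # runs to the last ">"
lazy = re.search(r"<.*?>", tagged).group()    # stops at the first ">"

# group() returns the whole match, group(n) one subgroup, groups() all of them.
m = re.search(r"(\w+)@(\w+)\.com", "mail bob@example.com now")
whole, user, parts = m.group(), m.group(1), m.groups()
# A pattern without groups yields an empty tuple.
no_groups = re.search(r"\w+", "hi").groups()

# re.I: case-insensitive matching.
any_case = re.findall(r"python", "Python PYTHON", re.I)
# re.M: ^ and $ also match at line boundaries.
line_starts = re.findall(r"^\w+", "one\ntwo", re.M)
# re.S: the dot also matches a newline.
dot_default = re.search(r"a.b", "a\nb")
dot_all = re.search(r"a.b", "a\nb", re.S)

# re.X: whitespace and # comments inside the pattern are ignored.
date_re = re.compile(r"""
    (\d{4})   # year
    -         # separator
    (\d{2})   # month
""", re.X)
year_month = date_re.search("released 2024-05").groups()

print(greedy, lazy, whole, user, parts, no_groups)
print(any_case, line_starts, dot_default, bool(dot_all), year_month)
```
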

Pattern | Does what |
---|---|
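^ | Matches the beginning of the string. With re.M it also matches at the beginning of each line. |
$ | Matches the end of the string or just before a trailing newline. With re.M it also matches at the end of each line. |
. | Matches any single character except newline. The re.S flag allows it to match newline as well. |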
[...] | Matches any single character in brackets. |
[^...] | Matches any single character not in brackets. |
re* | Matches 0 or more occurrences of preceding expression. |
re+ | Matches 1 or more occurrences of preceding expression. |
re? | Matches 0 or 1 occurrence of preceding expression. |
re{n} | Matches exactly n number of occurrences of preceding expression. |
re{n,} | Matches n or more occurrences of preceding expression. |
re{n,m} | Matches at least n and at most m occurrences of preceding expression. |
a\|b | Matches either a or b. |
(re) | Groups regular expressions and remembers matched text. |
(?imx) | Turns on the i, m, or x options for the entire regular expression. In Python it must appear at the very start of the pattern. |
(?-imx) | Not supported as a standalone construct in Python's re. Options can be turned off only inside a group, as in (?-imx:re). |
(?:re) | Groups regular expressions without remembering matched text. |
(?imx:re) | Turns on the i, m, or x options within the parentheses only (Python 3.6+). |
(?-imx:re) | Turns off the i, m, or x options within the parentheses only (Python 3.6+). |
(?#...) | Comment. |
(?=re) | Positive lookahead: asserts that re matches at this position without consuming any characters. |
(?!re) | Negative lookahead: asserts that re does not match at this position without consuming any characters. |
(?>re) | Atomic group: matches an independent pattern without backtracking into it (Python 3.11+). |
\w | Matches word characters. |
\W | Matches nonword characters. |
\s | Matches whitespace. Equivalent to [ \t\n\r\f\v] for ASCII; str patterns also match other Unicode whitespace. |
\S | Matches nonwhitespace. |
\d | Matches digits. Equivalent to [0-9] for bytes patterns or with re.A; str patterns match any Unicode decimal digit. |
\D | Matches nondigits. |
\A | Matches beginning of string. |
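\Z | Matches only at the end of the string. Unlike $, it does not match before a trailing newline in Python. |
\z | Matches end of string in other regex flavors. Python versions before 3.14 reject \z; use \Z there. |
\G | Matches the point where the last match finished in other regex flavors. Not supported by Python's re module. |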
\b | Matches word boundaries when outside brackets. Matches backspace (0x08) when inside brackets. |
\B | Matches nonword boundaries. |
\n, \t, etc. | Matches newlines, carriage returns, tabs, etc. |
\1 ... \9 | Backreference: Matches nth grouped subexpression. |
\10 | Backreference: Matches the 10th grouped subexpression (Python accepts group numbers 1 to 99). A number with a leading 0 or with three octal digits, such as \012, is read as the octal representation of a character code instead. |
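
The quantifier, character-class, and backreference rows above can be checked with a small sketch such as this one. The sample inputs are arbitrary, and the comments describe Python's behavior specifically, which differs from some other regex flavors for \Z.

```python
import re

# Quantifiers: {n}, {n,} and {n,m}.
exactly_three = re.fullmatch(r"\d{3}", "123")
too_short = re.fullmatch(r"\d{2,}", "1")
one_or_two_b = re.findall(r"ab{1,2}", "ab abb abbb")

# Character classes and their negation.
vowels = re.findall(r"[aeiou]", "regex")
non_vowels = re.findall(r"[^aeiou]", "regex")

# Alternation, and a non-capturing group next to a capturing one.
pets = re.findall(r"cat|dog", "cat dog bird")
only_c = re.search(r"(?:ab)+(c)", "ababc").groups()

# \1 must repeat exactly what group 1 matched (a doubled word here).
doubled = re.search(r"\b(\w+) \1\b", "it is is fine").group()

# \s includes the vertical tab; \d is Unicode-aware unless re.A is given.
spaces = re.findall(r"\s", "a b\tc\vd")
digits_unicode = re.findall(r"\d", "7\u0663")
digits_ascii = re.findall(r"\d", "7\u0663", re.A)

# In Python, $ tolerates one trailing newline but \Z does not.
dollar = re.search(r"end$", "the end\n")
big_z = re.search(r"end\Z", "the end\n")

print(one_or_two_b, vowels, non_vowels, pets, only_c, doubled)
print(spaces, digits_unicode, digits_ascii, bool(dollar), big_z)
```
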

Syntax | Does what |
---|---|
([Pp])ython&\1erl | Backreference: Match python&perl or Python&Perl. |
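(['"])(?:(?!\1).)*\1 | Backreference: Single or double-quoted string. "\1" matches whatever the 1st group matched. "\2" matches whatever the 2nd group matched, etc. In Python, \1 inside a character class such as [^\1] is an octal escape rather than a backreference, so a negative lookahead is used to exclude the opening quote. |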
R(?#comment) | Matches "R". The rest is a comment. |
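(?i)Ruby | Case-insensitive for the whole pattern. Python requires a global inline flag at the very start of the expression; "R(?i)uby" raises an error in Python 3.11+. |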
R(?i:uby) | Case-insensitive while matching "uby". |
rub(?:y\|le) | Group only without creating the "\1" backreference. |
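
A minimal sketch of the examples above in Python follows. The quoted-string pattern here uses a negative lookahead, `(?:(?!\1).)*`, which is one possible way to say "anything but the opening quote"; the test strings are invented, and the scoped flag `(?i:...)` assumes Python 3.6 or later.

```python
import re

# \1 forces the second capital or lowercase letter to agree with the first.
pair_re = re.compile(r"([Pp])ython&\1erl")
both_upper = pair_re.fullmatch("Python&Perl")
both_lower = pair_re.fullmatch("python&perl")
mixed = pair_re.fullmatch("Python&perl")

# Quoted string: consume characters until the same quote that opened it.
quoted_re = re.compile(r"""(['"])(?:(?!\1).)*\1""")
quoted = quoted_re.search("say \"it's\" now").group()

# (?#...) is ignored by the engine.
commented = re.fullmatch(r"R(?#comment)", "R")

# Scoped flag: only "uby" is case-insensitive, the "R" is not.
scoped_ok = re.fullmatch(r"R(?i:uby)", "RUBY")
scoped_bad = re.fullmatch(r"R(?i:uby)", "rUBY")

# A non-capturing group matches but leaves groups() empty.
ruble = re.search(r"rub(?:y|le)", "ruble")

print(bool(both_upper), bool(both_lower), mixed)
print(quoted, bool(commented), bool(scoped_ok), scoped_bad)
print(ruble.group(), ruble.groups())
```
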