What is Regex?
A regular expression (regex or regexp) is a sequence of characters that defines a search pattern. Regex patterns can match, extract, replace, and validate text in ways that simple string operations cannot. They are supported in virtually every programming language and are used extensively in log parsing, form validation, URL routing, and code search.
Regular expression syntax originates from formal language theory (Kleene's regular languages, 1951) and has been extended by implementations over the decades since. Core syntax elements include: literal characters that match themselves, . matching any character except newline (by default), ^ and $ anchoring to the start and end of a string, * (zero or more), + (one or more), ? (zero or one), {n,m} (between n and m times, inclusive), | for alternation, [] for character classes, and () for grouping and capture.
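The core elements above can be tried out directly. The following is a minimal sketch using Python's standard re module; the sample strings and variable names are made up for illustration.

```python
import re

# . matches any single character except a newline (by default).
dot = re.findall(r"c.t", "cat cot c\nt cut")           # ['cat', 'cot', 'cut']

# * (zero or more), + (one or more), ? (zero or one).
star = re.findall(r"ab*", "a ab abbb")                 # ['a', 'ab', 'abbb']
plus = re.findall(r"ab+", "a ab abbb")                 # ['ab', 'abbb']
optional = re.findall(r"colou?r", "color colour")      # ['color', 'colour']

# {n,m} bounds repetition; ^ and $ anchor the pattern to the whole string.
bounded = bool(re.search(r"^[0-9]{2,4}$", "2024"))     # True
too_long = bool(re.search(r"^[0-9]{2,4}$", "20245"))   # False

# | alternates, () groups and captures, [] lists the allowed characters.
m = re.search(r"(cat|dog)s? [xyz]", "two dogs y")
captured = m.group(1)                                  # 'dog'
```

Raw string literals (the r prefix) keep backslashes intact, which is why most Python regex examples use them.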
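Character class shortcuts include \d (digit), \w (word character), \s (whitespace), and their uppercase inverses (\D, \W, \S) for negation. In ASCII mode \d is equivalent to [0-9] and \w to [a-zA-Z0-9_]; Unicode-aware engines may also match digits and letters from other scripts. Flags modify matching behaviour. Using the JavaScript flag letters: i for case-insensitive, g for global (find all matches rather than only the first), m for multiline (^ and $ match at line boundaries), and s for dotAll (. also matches newlines). Other languages expose the same options under different names, and some have no g flag, offering separate find-all functions instead.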
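As a rough illustration of the shortcuts and flags, here is a sketch in Python, where i, m, and s correspond to re.IGNORECASE, re.MULTILINE, and re.DOTALL. Python has no g flag: re.search returns the first match, while re.findall and re.finditer return all of them. The sample text is invented for the example.

```python
import re

text = "Order 66\nshipped 2024-05-01\nORDER 7"

# Shorthand classes.
digits = re.findall(r"\d+", text)                  # ['66', '2024', '05', '01', '7']
words = re.findall(r"\w+", "snake_case v2!")       # ['snake_case', 'v2']
non_space = re.findall(r"\S+", "a  b\tc")          # ['a', 'b', 'c']

# For str patterns, \d is Unicode-aware; re.ASCII restricts it to [0-9].
unicode_digits = re.findall(r"\d", "7\u0663")                  # two matches
ascii_digits = re.findall(r"\d", "7\u0663", flags=re.ASCII)    # ['7']

# i: case-insensitive matching.
orders = re.findall(r"order", text, flags=re.IGNORECASE)       # ['Order', 'ORDER']

# m: ^ and $ also match at the start and end of each line.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)    # ['Order', 'shipped', 'ORDER']

# s: . also matches the newline between "66" and "shipped".
with_dotall = re.search(r"66.shipped", text, flags=re.DOTALL) is not None   # True
without_dotall = re.search(r"66.shipped", text) is not None                 # False
```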
Named capture groups (?<name>...) let you extract matched substrings by name rather than position; some dialects spell this differently (Python's re module writes (?P<name>...)). Lookaheads (?=...) and lookbehinds (?<=...) assert context without consuming characters, enabling patterns that match only when preceded or followed by specific text.
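A short sketch of both features in Python follows. Note that Python's re module writes named groups as (?P<name>...) rather than (?<name>...); the date, unit, and price strings are illustrative only.

```python
import re

# Named groups: pull out date parts by name instead of by position.
date = re.search(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})",
    "released 2024-05-01",
)
year = date.group("year")      # '2024'
parts = date.groupdict()       # {'year': '2024', 'month': '05', 'day': '01'}

# Lookahead: numbers that are followed by "px"; "px" is not part of the match.
pixels = re.findall(r"\d+(?=px)", "10px 20em 30px")                 # ['10', '30']

# Lookbehind: numbers that are preceded by "$"; "$" is not part of the match.
prices = re.findall(r"(?<=\$)\d+", "cost $15, qty 3, fee $2")       # ['15', '2']
```

Because the lookarounds consume nothing, the matched text contains only the digits, so no post-processing is needed to strip the unit or currency symbol.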
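Performance traps: catastrophic backtracking can cause a backtracking regex engine to run for exponential time on crafted inputs. ReDoS (Regular Expression Denial of Service) exploits this in web applications. The problem appears when the overall match fails and the engine has many equivalent ways to retry, so avoid nested quantifiers on overlapping patterns (e.g., (a+)+$ against "aaaa...b", where the trailing b forces the engine to try every way of splitting the run of a's) and use atomic groups or possessive quantifiers where supported.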
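The effect is easy to observe on small inputs. The sketch below, in Python, times a nested-quantifier pattern against a simpler pattern that accepts the same strings; the helper time_match and the chosen input lengths are assumptions made for this example, and actual timings depend on the machine and engine version. The input sizes are kept small on purpose, since each extra character roughly doubles the work for the risky pattern.

```python
import re
import time

risky = re.compile(r"^(a+)+$")   # nested quantifiers over the same characters
safe = re.compile(r"^a+$")       # accepts exactly the same strings

def time_match(pattern, text):
    """Return (match_or_None, elapsed_seconds) for one search call."""
    start = time.perf_counter()
    result = pattern.search(text)
    return result, time.perf_counter() - start

for n in (14, 16, 18, 20):
    text = "a" * n + "b"         # the trailing "b" makes the overall match fail
    _, slow = time_match(risky, text)
    _, fast = time_match(safe, text)
    print(f"n={n}: risky {slow:.4f}s, safe {fast:.6f}s")
```

Rewriting the pattern so that only one way of matching exists, as in the second pattern, is usually the simplest fix. Python 3.11 and later also accept atomic groups (?>...) and possessive quantifiers such as a++ in the re module.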
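Different languages implement slightly different regex dialects. JavaScript follows the ECMA-262 specification, Python uses the re module with some unique extensions, and PCRE (Perl Compatible Regular Expressions) is the library behind PHP's preg functions and many other tools. Java ships its own engine, java.util.regex, whose syntax is Perl-like but is not PCRE. Most differences are minor for everyday use, but features such as lookbehind, named groups, and possessive quantifiers are not supported identically everywhere, so check the target engine before reusing a pattern.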