A regular expression (RE) is a string representing a set of strings.
Each character in the string usually represents itself, but there are some special characters:
char | name | meaning |
---|---|---|
* | asterisk | any string of zero or more chars |
? | question mark | any single char |
[charset] | square brackets | a single char in the given set |
\ | backslash | treats the special char immediately following it literally |
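
The matching rules above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the names `tokenize` and `re_match` are hypothetical, `[charset]` is left out to keep the sketch short, and a trailing backslash with nothing after it is assumed to stand for itself.

```python
def tokenize(pattern):
    """Split a pattern into ('star', None), ('any', None) and ('lit', ch) tokens."""
    tokens = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Backslash: the next char is taken literally.
            tokens.append(("lit", pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            tokens.append(("star", None))
        elif ch == "?":
            tokens.append(("any", None))
        else:
            # Includes a trailing backslash (assumption: it stands for itself).
            tokens.append(("lit", ch))
        i += 1
    return tokens


def re_match(pattern, text):
    """Return True if the whole text belongs to the set described by pattern."""
    # reachable[j] is True when the tokens consumed so far can match text[:j].
    reachable = [True] + [False] * len(text)
    for kind, ch in tokenize(pattern):
        if kind == "star":
            # '*' absorbs zero or more chars: every later position becomes reachable.
            seen = False
            for j in range(len(text) + 1):
                seen = seen or reachable[j]
                reachable[j] = seen
        else:
            # '?' or a literal consumes exactly one char.
            nxt = [False] * (len(text) + 1)
            for j in range(len(text)):
                if reachable[j] and (kind == "any" or text[j] == ch):
                    nxt[j + 1] = True
            reachable = nxt
    return reachable[len(text)]
```

The pattern must match the whole string, which is why `a*` means "starts with 'a'" rather than "contains an 'a'".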
A charset is a string defining a set of characters.
A well-formed charset is composed of one or more items, separated by ',' (comma). An item with no operator is added to the resulting charset (union).
Each character in the charset definition usually represents itself, but there are some special characters:
char | name | meaning |
---|---|---|
, | comma | item separator |
! | bang | not operator, only at item beginning: adds the complement of the item |
- | dash | minus operator, only at item beginning: subtracts the item's characters from the set |
- | dash | range operator, between two chars |
\ | backslash | treats the special char immediately following it literally |
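
One way to read these rules is as a membership test evaluated item by item, from left to right. The sketch below assumes exactly that order of evaluation, assumes the complement is taken over all possible characters, and treats a dash with no char on both sides as a literal dash. The function names are hypothetical.

```python
def split_items(definition):
    """Split on unescaped commas; each item is a list of (char, is_escaped) pairs."""
    items, current, i = [], [], 0
    while i < len(definition):
        ch = definition[i]
        if ch == "\\" and i + 1 < len(definition):
            current.append((definition[i + 1], True))
            i += 2
            continue
        if ch == ",":
            items.append(current)
            current = []
        else:
            current.append((ch, False))
        i += 1
    items.append(current)
    return items


def parse_item(tokens):
    """Turn one item into (operator, chars); operator is '+', '!' or '-'."""
    op = "+"
    if tokens and not tokens[0][1] and tokens[0][0] in "!-":
        op = tokens[0][0]
        tokens = tokens[1:]
    chars = set()
    i = 0
    while i < len(tokens):
        ch = tokens[i][0]
        # A range needs an unescaped dash with a char on each side.
        if i + 2 < len(tokens) and tokens[i + 1] == ("-", False):
            last = tokens[i + 2][0]
            chars.update(chr(c) for c in range(ord(ch), ord(last) + 1))
            i += 3
        else:
            chars.add(ch)
            i += 1
    return op, chars


def charset_contains(definition, ch):
    """Return True if ch belongs to the charset described by definition."""
    member = False
    for op, chars in map(parse_item, split_items(definition)):
        inside = ch in chars
        if op == "+":
            member = member or inside        # union
        elif op == "!":
            member = member or not inside    # union with the complement
        else:
            member = member and not inside   # subtraction
    return member
```

For instance, `charset_contains("A-Z,-AEIOU", "B")` first adds the range A to Z and then removes the vowels, so consonants pass and vowels do not.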
An expression of regular expressions (REE) is an expression whose terms are regular expressions, combined with the following operators, listed from highest priority to lowest:
operator | name | meaning |
---|---|---|
() | parentheses | groups a subexpression, nesting allowed |
! | bang | NOT |
& | ampersand | AND |
| | pipe | OR |
\ | backslash | treats the special char immediately following it literally |
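
The operator priorities map naturally onto a small recursive-descent evaluator. The sketch below is illustrative only: `ree_match` is a hypothetical name, backslash escaping is not handled, and Python's `fnmatch.fnmatchcase` is used as a stand-in for the RE matcher. That stand-in understands `*`, `?`, `[seq]` and `[!seq]` but not the comma-separated charset items described above, so a full implementation would pass its own matcher as `re_match`.

```python
from fnmatch import fnmatchcase


def ree_match(ree, text, re_match=lambda pat, s: fnmatchcase(s, pat)):
    """Evaluate an REE against text. Priority: () then ! then & then |."""
    pos = 0

    def peek():
        return ree[pos] if pos < len(ree) else ""

    def parse_or():
        nonlocal pos
        value = parse_and()
        while peek() == "|":
            pos += 1
            rhs = parse_and()  # always parse, so the position advances
            value = value or rhs
        return value

    def parse_and():
        nonlocal pos
        value = parse_not()
        while peek() == "&":
            pos += 1
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        nonlocal pos
        if peek() == "!":
            pos += 1
            return not parse_not()
        if peek() == "(":
            pos += 1
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing ')' at position %d" % pos)
            pos += 1
            return value
        return parse_term()

    def parse_term():
        nonlocal pos
        start = pos
        in_brackets = False
        while pos < len(ree):
            ch = ree[pos]
            if ch == "[":
                in_brackets = True
            elif ch == "]":
                in_brackets = False
            elif ch in "&|)" and not in_brackets:
                break  # an operator ends the current RE term
            pos += 1
        return re_match(ree[start:pos], text)

    result = parse_or()
    if pos != len(ree):
        raise ValueError("unexpected %r at position %d" % (ree[pos], pos))
    return result
```

Each priority level gets its own function, and the lower-priority function calls the higher-priority one, which is what makes `&` bind tighter than `|` without any explicit precedence table.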
Examples:
REE | meaning |
---|---|
a* | any string starting with 'a' |
*z | any string ending with 'z' |
a*m*z | any string starting with 'a', containing at least one 'm', ending with 'z' |
??? | any string exactly three characters long |
[AEIOU] | an uppercase vowel |
[A-Z,-AEIOU] | an uppercase consonant |
[!AEIOU] | any single character except an uppercase vowel |
[0123456789] | a decimal digit, same as [0-9] |
[0-9A-Fa-f] | a hexadecimal digit, same as [0-9,A-F,a-f] |
[A-Z,a-z] | an alphabetic character, uppercase or lowercase |
!*j* | any string NOT containing a 'j' |
?*@?*.?* | a string shaped like an email address: at least one character, '@', at least one character, '.', at least one character |
s??|(*m*&?????) | a three-character string starting with 's' OR a five-character string containing an 'm' |
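
For readers more used to classical regular expressions, the wildcard examples in the table can be translated mechanically. The sketch below is an assumption-laden illustration: `wildcard_to_regex` is a hypothetical helper that handles only `*` and `?` (no charsets, no backslash escapes, no REE operators), and it relies on Python's `re` module.

```python
import re


def wildcard_to_regex(pattern):
    """Translate '*' and '?' into their classical regex equivalents."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # zero or more chars
        elif ch == "?":
            parts.append(".")    # exactly one char
        else:
            parts.append(re.escape(ch))  # everything else stands for itself
    # DOTALL lets '.' match newlines too, so "any char" really means any char.
    return re.compile("".join(parts), re.DOTALL)


EMAIL = wildcard_to_regex("?*@?*.?*")

# fullmatch is used because a pattern describes the whole string.
print(bool(EMAIL.fullmatch("user@example.com")))
print(bool(EMAIL.fullmatch("@example.com")))
```

The translation also shows how loose the email pattern is: it only checks for an '@' followed later by a '.', with at least one character in each of the three positions, so a string such as `a b@c.d` is accepted as well.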