XQuery
Priscilla Walmsley (pwalmsley@datypic.com)
ISBN: 0596006349
1st edition, O'Reilly Media, Inc.
Chapter 18: Regular expressions
Regular expression | Strings that match | Strings that do not match |
---|---|---|
`fo` | fo | f, foo |
`fo?` | f, fo | foo |
`fo*` | f, fo, foo, fooo, … | fx |
`fo+` | fo, foo, fooo, … | f |
`fo{2}` | foo | fo, fooo |
`fo{2,}` | foo, fooo, foooo, … | f, fo |
`fo{2,3}` | foo, fooo | f, fo, foooo |
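
The table appears to count a string as matching only when the expression covers the whole string (`fo` is listed as not matching `foo`). The `matches` function, by contrast, succeeds when any substring matches. A minimal sketch, assuming you want the whole-string behavior in a query, anchors the pattern with `^` and `$`; the test strings are taken from the last table row:

```xquery
(: Anchoring with ^ and $ forces the pattern to cover the whole string;
   without the anchors, matches() would also accept "foooo". :)
for $s in ("f", "fo", "foo", "fooo", "foooo")
return concat($s, ": ", matches($s, "^fo{2,3}$"))
```

Under that assumption only `foo` and `fooo` should report `true`, in line with the `fo{2,3}` row.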
Regular expression | Strings that match | Strings that do not match |
---|---|---|
`(fo)+z` | foz, fofoz | z, fz, fooz, ffooz |
`(fo\|xy)z` | foz, xyz | z |
`(fo\|xy)+z` | fofoz, foxyz, xyfoz | z |
`(f+o)+z` | foz, ffoz, foffoz | z, fz, fooz |
`yes\|no` | yes, no | |
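
One practical consequence of combining alternation with anchors: the `|` operator binds more loosely than `^` and `$`, so parentheses are needed when the anchors should apply to every branch. The sketch below is illustrative only; the test strings `yesterday` and `piano` are not from the table.

```xquery
(: "^yes|no$" reads as (^yes)|(no$): starts with "yes" OR ends with "no".
   "^(yes|no)$" requires the whole string to be "yes" or "no". :)
for $s in ("yes", "no", "yesterday", "piano")
return concat($s, ": ",
              matches($s, "^yes|no$"), " / ",
              matches($s, "^(yes|no)$"))
```

The first result for each string is `true`; the second is `true` only for `yes` and `no`.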
Regular expression | Strings that match | Strings that do not match |
---|---|---|
`d` | d | g |
`d+efg+` | defg, ddefgg | defgefg, deffgg |
`defg` | defg | d, efg |
`d\|e\|f` | d, e, f | g |
`f*o` | fo, ffo, fffo | `f*o` |
`f\*o` | `f*o` | fo, ffo, fffo |
`déf` | déf | def, df |
Regular expression | Strings that match | Strings that do not match |
---|---|---|
`f.o` | fao, fbo, f2o | fo, fbbo |
`f..o` | faao, fbco, f12o | fo, fao |
`f.*o` | fo, fao, fbcde23o | f<br>o |
`f\.o` | f.o | fao |

In the third example, assume a line feed character between f and o. This string does not match unless you are in dot-all mode.
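
The dot-all behavior can be checked directly. This is a small sketch that builds the two-line string with `codepoints-to-string(10)` (a line feed) and uses the `s` flag of `matches` to switch on dot-all mode:

```xquery
(: $s is "f", a line feed, then "o" :)
let $s := concat("f", codepoints-to-string(10), "o")
return (
  matches($s, "^f.*o$"),       (: false: "." does not match a line feed by default :)
  matches($s, "^f.*o$", "s")   (: true: in dot-all mode "." matches any character :)
)
```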
Regular expression | Strings that match | Strings that do not match | Comment |
---|---|---|---|
`f\d` | f0, f1 | f, f01 | multi-character escape |
`f\d*` | f, f0, f012 | ff | multi-character escape |
`f\s*o` | fo, f o | foo | multi-character escape |
`\p{Ll}` | a, b | A, B, 1, 2 | category escape |
`\P{Ll}` | A, B, 1, 2 | a, b | category escape |
`\p{L}` | a, b, A, B | 1, 2 | category escape |
`\P{L}` | 1, 2 | a, b, A, B | category escape |
`\p{IsBasicLatin}` | a, b | â, ß | block escape |
`\P{IsBasicLatin}` | â, ß | a, b | block escape |
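
To see the category and block escapes side by side, the following sketch tests a few single characters against `\p{Ll}`, `\p{L}`, and `\p{IsBasicLatin}`. The patterns are anchored so that each test covers the whole one-character string; the choice of test characters is illustrative.

```xquery
(: Backslashes need no extra escaping inside XQuery string literals. :)
for $c in ("a", "A", "1", "ß")
return concat($c, ": ",
              "Ll=", matches($c, "^\p{Ll}$"), ", ",
              "L=", matches($c, "^\p{L}$"), ", ",
              "BasicLatin=", matches($c, "^\p{IsBasicLatin}$"))
```

For example, `ß` is a lowercase letter outside the Basic Latin block, so it should report `Ll=true, L=true, BasicLatin=false`.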
Regular expression | Strings that match | Strings that do not match | Comment |
---|---|---|---|
`[def]` | d, e, f | def | Single characters |
`[def]*` | d, eee, dfed | a, b | Single characters, repeating |
`[\p{Ll}\d]` | a, b, 1 | A, B | Single characters with escapes |
`[d-f]` | d, e, f | a, D | Range of characters |
`[0-9d-fD-F]` | 3, d, F | a, 3dF | Multiple ranges |
`[0-9stu]` | 4, 9, t | a, 4t | Range plus single characters |
`[s-u\d]` | 4, 9, t | a, t4 | Range plus multi-character escape |
`[a-x-[f]]` | a, d, x | f, 2 | Subtracting from a range |
`[a-x-[fg]]` | a, d, x | f, g, 2 | Subtracting from a range |
`[a-x-[e-g]]` | a, d, x | e, g, 2 | Subtracting from a range with a range |
`[^def]` | a, g, 2 | d, e, f | Negating single characters |
`[^\[]` | a, b, c | [ | Negating a single-character escape |
`[^\d]` | d, E | 1, 2, 3 | Negating a multi-character escape |
`[^a-cj-l]` | d, 4 | b, j, l | Negating a range |
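
Character class subtraction is the least familiar construct in this table, so here is a short sketch that runs the `[a-x-[e-g]]` row through `matches`. The pattern is anchored so that only single-character strings can match; the test characters come from the table plus `e` and `g` at the edges of the subtracted range.

```xquery
(: [a-x-[e-g]] means: any character from a to x, except e, f, and g. :)
for $c in ("a", "d", "e", "g", "x", "2")
return concat($c, ": ", matches($c, "^[a-x-[e-g]]$"))
```

Here `a`, `d`, and `x` should report `true`, while `e`, `g`, and `2` report `false`.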
Example | Return value |
---|---|
`replace("reluctant", "r.*t", "X")` | X |
`replace("reluctant", "r.*?t", "X")` | Xant |
`replace("aaah", "a{2,3}", "X")` | Xh |
`replace("aaah", "a{2,3}?", "X")` | Xah |
`replace("aaaah", "a{2,3}", "X")` | Xah |
`replace("aaaah", "a{2,3}?", "X")` | XXh |
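
A common place where the greedy/reluctant distinction matters is removing delimited fragments from a string. The sketch below is illustrative (the tagged input string is made up for this example) and contrasts `.*` with `.*?` between angle brackets:

```xquery
let $s := "<b>bold</b> and <i>italic</i>"
return (
  replace($s, "<.*>", ""),    (: greedy: one match runs from the first "<" to the last ">" :)
  replace($s, "<.*?>", "")    (: reluctant: each tag is matched and removed separately :)
)
```

The greedy version returns a zero-length string; the reluctant version returns `bold and italic`.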
Regular expression | Strings that match | Strings that do not match |
---|---|---|
`str` | str, str5, 5str, 5str5 | st, sttr |
`^str$` | str | 5str5, str5, 5str |
`^str` | str, str5 | 5str5, 5str |
`str$` | str, 5str | 5str5, str5 |
Regular expression | Strings that match | Strings that do not match |
---|---|---|
`str` | str | st |
`^str$` | str, 555<br>str<br>555 | 555str<br>555 |
`^str` | str555, 555<br>str555 | 555str<br>555 |
`str$` | 555str, 555str<br>555 | 555<br>str555 |

Some of the examples span several lines (shown here with `<br>` line breaks); individual examples are separated by commas.
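
The effect of multi-line mode on the anchors can be reproduced with the `m` flag of `matches`. This sketch builds the three-line string from the `^str$` row, using `codepoints-to-string(10)` for the line feeds:

```xquery
let $nl := codepoints-to-string(10)          (: line feed :)
let $s := concat("555", $nl, "str", $nl, "555")
return (
  matches($s, "^str$"),        (: false: ^ and $ match only at the ends of the whole string :)
  matches($s, "^str$", "m")    (: true: in multi-line mode they also match at line boundaries :)
)
```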
Example | Return value |
---|---|
`matches($address, "Street.*City")` | false |
`matches($address, "Street.*City", "s")` | true |
`matches($address, "Street$")` | false |
`matches($address, "Street$", "m")` | true |
`matches($address, "street")` | false |
`matches($address, "street", "i")` | true |
`matches($address, "Main Street")` | true |
`matches($address, "Main Street", "x")` | false |
`matches($address, "Main \s Street", "x")` | true |
`matches($address, "street$", "im")` | true |
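
This excerpt does not show the value bound to `$address`. The sketch below assumes a two-line address with a line feed after the word "Street"; the specific street and city text is an assumption, chosen only because it is consistent with every result in the table.

```xquery
(: Assumed value: the excerpt does not define $address. :)
let $address := concat("123 Main Street",
                       codepoints-to-string(10),
                       "Traverse City, MI 49684")
return (
  matches($address, "Street.*City"),            (: false: "." stops at the line feed :)
  matches($address, "Street.*City", "s"),       (: true: dot-all mode :)
  matches($address, "Street$", "m"),            (: true: $ matches at the end of line 1 :)
  matches($address, "Main Street", "x"),        (: false: the space in the pattern is ignored :)
  matches($address, "Main \s Street", "x"),     (: true: \s supplies the whitespace explicitly :)
  matches($address, "street$", "im")            (: true: flags can be combined :)
)
```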

    declare namespace functx = "http://www.functx.com";

    (: Splits $string into <match> and <non-match> elements, in order of occurrence. :)
    declare function functx:get-matches-and-non-matches
      ($string as xs:string?,
       $regex as xs:string) as element()* {
      let $iomf := functx:index-of-match-first($string, $regex)
      return
        if (empty($iomf))
        then <non-match>{$string}</non-match>
        else if ($iomf > 1)
        then (<non-match>{substring($string, 1, $iomf - 1)}</non-match>,
              functx:get-matches-and-non-matches(
                substring($string, $iomf), $regex))
        else
          let $length :=
            string-length($string) -
            string-length(functx:replace-first($string, $regex, ''))
          return (<match>{substring($string, 1, $length)}</match>,
                  if (string-length($string) > $length)
                  then functx:get-matches-and-non-matches(
                         substring($string, $length + 1), $regex)
                  else ())
    };

    (: Returns the 1-based position of the first match, or () if there is none. :)
    declare function functx:index-of-match-first
      ($arg as xs:string?,
       $pattern as xs:string) as xs:integer? {
      if (matches($arg, $pattern))
      then string-length(tokenize($arg, $pattern)[1]) + 1
      else ()
    };

    (: Replaces only the first match of $pattern in $arg. :)
    declare function functx:replace-first
      ($arg as xs:string?,
       $pattern as xs:string,
       $replacement as xs:string) as xs:string {
      replace($arg, concat('(^.*?)', $pattern),
              concat('$1', $replacement))
    };

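
The recursion in the listing relies on locating the first match with `tokenize`: the length of the first token is the number of characters before the first match. The sketch below copies that logic into a hypothetical `local:` function so it can run on its own, without the FunctX declarations; the input strings are illustrative.

```xquery
(: Stand-alone copy of the position logic, in the local: namespace. :)
declare function local:index-of-match-first
  ($arg as xs:string?, $pattern as xs:string) as xs:integer? {
  if (matches($arg, $pattern))
  then string-length(tokenize($arg, $pattern)[1]) + 1
  else ()
};

(
  local:index-of-match-first("abc123def", "\d+"),  (: 4: three characters precede the match :)
  local:index-of-match-first("123def", "\d+"),     (: 1: the first token is a zero-length string :)
  local:index-of-match-first("abcdef", "\d+")      (: empty sequence: no match :)
)
```

With the full listing in scope, a call such as `functx:get-matches-and-non-matches("abc123def", "\d+")` should yield a `non-match` element for `abc`, a `match` element for `123`, and a `non-match` element for `def`.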
Example | Return value |
---|---|
`replace("Chap 2…Chap 3…Chap 4…","Chap (\d)", "Sec $1.0")` | Sec 2.0…Sec 3.0…Sec 4.0… |
`replace("abc123", "([a-z])", "$1x")` | axbxcx123 |
`replace("2315551212", "(\d{3})(\d{3})(\d{4})", "($1) $2-$3")` | (231) 555-1212 |
`replace("2006-10-18", "\d{2}(\d{2})-(\d{2})-(\d{2})", "$2/$3/$1")` | 10/18/06 |
`replace("25", "(\d+)", "\$$1.00")` | $25.00 |
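
Two more uses of sub-expression references, as a brief sketch with made-up input strings: reordering captured groups, and combining a literal dollar sign (written `\$` in the replacement string) with group references.

```xquery
(
  (: Swap the two captured words. :)
  replace("Walmsley, Priscilla", "(\w+), (\w+)", "$2 $1"),
  (: "\$" is a literal dollar sign; "$1" and "$2" refer to the groups. :)
  replace("19.99", "(\d+)\.(\d+)", "\$$1 and $2 cents")
)
```

The first call returns `Priscilla Walmsley` and the second returns `$19 and 99 cents`.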