Commonly used Unicode character properties

Here is a list of commonly used Unicode character properties for regular expressions that need to match Unicode text:

Unicode character class    Meaning
\p{L}                      Match any letter from any language
\p{Lu}                     Match any uppercase letter from any language
\p{Ll}                     Match any lowercase letter from any language
\p{N}                      Match any kind of numeric character from any language (use \p{Nd} for decimal digits only)
\p{P}                      Match any punctuation character from any language
\p{Z}                      Match any kind of whitespace or invisible separator
\p{C}                      Match any invisible control character or unused code point
\p{Sc}                     Match any currency symbol
\R                         Match any Unicode linebreak sequence; equivalent to \u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029]
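As a minimal sketch of how these properties behave, the following example assumes Java's java.util.regex package (the text above does not name an engine, so this is an assumption). The class name UnicodePropertyDemo and the helper findAll are hypothetical names introduced only for illustration. Non-ASCII characters are written as Unicode escapes so that the source file does not depend on its encoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodePropertyDemo {
    // Hypothetical helper: collects every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // The sample text is: Prix: 25€, Été!
        String text = "Prix: 25\u20ac, \u00c9t\u00e9!";

        System.out.println(findAll("\\p{L}+", text)); // [Prix, Été]
        System.out.println(findAll("\\p{Lu}", text)); // [P, É]
        System.out.println(findAll("\\p{N}+", text)); // [25]
        System.out.println(findAll("\\p{Sc}", text)); // [€]
    }
}
```

Note that each backslash is doubled inside a Java string literal, so the regex \p{L} is written as "\\p{L}".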
It is recommended to use \R to match any line break, even when dealing with ASCII text, because it treats the two-character sequence \r\n as a single line break.
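The following sketch illustrates the difference, again assuming Java's java.util.regex (where \R is available from Java 8 onward). The class name LinebreakDemo and the method splitLines are hypothetical names used only for this example.

```java
import java.util.Arrays;
import java.util.List;

class LinebreakDemo {
    // Hypothetical helper: splits text on any Unicode linebreak sequence.
    static List<String> splitLines(String text) {
        return Arrays.asList(text.split("\\R"));
    }

    public static void main(String[] args) {
        // Mixed line endings: Windows (\r\n), Unix (\n), and U+2028 LINE SEPARATOR.
        String text = "one\r\ntwo\nthree\u2028four";

        // \R consumes \r\n as one unit, so no empty or stray pieces appear.
        System.out.println(splitLines(text)); // [one, two, three, four]

        // Splitting on \n alone leaves the carriage return attached to "one"
        // and does not split at U+2028 at all.
        String[] naive = text.split("\n");
        System.out.println(naive.length);            // 3
        System.out.println(naive[0].endsWith("\r")); // true
    }
}
```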