Examples of matching Unicode text in regular expressions

The following regex matches input that consists entirely of Unicode letters, including accented characters such as "à":

    ^\p{L}+$
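
As a minimal Java sketch of the pattern above: the class name `UnicodeLetterDemo` and the method `isLettersOnly` are illustrative names, not from the original text. Note that the backslash must be doubled inside a Java string literal.

```java
import java.util.regex.Pattern;

public class UnicodeLetterDemo {
    // \p{L} is the Unicode "letter" category; it covers letters of any script.
    private static final Pattern LETTERS_ONLY = Pattern.compile("^\\p{L}+$");

    // Returns true when the whole input is one or more Unicode letters.
    static boolean isLettersOnly(String input) {
        return LETTERS_ONLY.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isLettersOnly("voil\u00E0"));   // true: precomposed "à" is a letter
        System.out.println(isLettersOnly("voila1"));       // false: a digit is not a letter
        System.out.println(isLettersOnly("voila\u0300"));  // false: a combining accent is a mark, not a letter
    }
}
```

The last call shows a limitation worth keeping in mind: an accent stored as a separate combining character belongs to the mark category (`\p{M}`), so such input would need normalizing first or a pattern that also allows marks.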

The following regex matches text consisting only of Latin-script characters and Unicode space separators:

    ^[\p{IsLatin}\p{Zs}]+$
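
The following sketch shows how this pattern might behave in Java. `LatinTextDemo` and `isLatinText` are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

public class LatinTextDemo {
    // \p{IsLatin} matches characters of the Latin script;
    // \p{Zs} matches the "space separator" category (ordinary space, no-break space, ...).
    private static final Pattern LATIN_AND_SPACES =
            Pattern.compile("^[\\p{IsLatin}\\p{Zs}]+$");

    static boolean isLatinText(String input) {
        return LATIN_AND_SPACES.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isLatinText("caf\u00E9 au lait"));               // true
        System.out.println(isLatinText("caf\u00E9\tau lait"));              // false: tab is not in \p{Zs}
        System.out.println(isLatinText("caf\u00E9 \u043C\u0438\u0440"));    // false: Cyrillic letters
    }
}
```

Tabs and line breaks are control characters rather than space separators, so input containing them is rejected by this pattern.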

The following regex should be used to detect the presence of a Hebrew character in input:

    \p{InHebrew}
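
Because this pattern has no anchors, it is meant to be used with a search rather than a full match. A small sketch, using the illustrative names `HebrewDetectDemo` and `containsHebrew`:

```java
import java.util.regex.Pattern;

public class HebrewDetectDemo {
    // \p{InHebrew} matches any character in the Unicode "Hebrew" block.
    private static final Pattern HEBREW_CHAR = Pattern.compile("\\p{InHebrew}");

    // find() succeeds as soon as one Hebrew-block character appears anywhere.
    static boolean containsHebrew(String input) {
        return HEBREW_CHAR.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsHebrew("shalom \u05E9\u05DC\u05D5\u05DD")); // true
        System.out.println(containsHebrew("shalom"));                          // false
    }
}
```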

The following regex should be used to detect an input that contains only Arabic text:

    ^\p{InArabic}+$
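
A short sketch of the Arabic-only check in Java follows; `ArabicOnlyDemo` and `isArabicOnly` are names invented for this example.

```java
import java.util.regex.Pattern;

public class ArabicOnlyDemo {
    // \p{InArabic} matches any character in the Unicode "Arabic" block.
    private static final Pattern ARABIC_ONLY = Pattern.compile("^\\p{InArabic}+$");

    // matches() requires every character of the input to be in the block.
    static boolean isArabicOnly(String input) {
        return ARABIC_ONLY.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isArabicOnly("\u0633\u0644\u0627\u0645"));        // true
        System.out.println(isArabicOnly("\u0633\u0644\u0627\u0645 ok"));     // false: Latin letters present
        System.out.println(isArabicOnly("\u0633\u0644\u0627\u0645 \u0633")); // false: the space is outside the block
    }
}
```

As the third call shows, an ordinary space is not part of the Arabic block, so multi-word input fails unless the character class is widened, for instance with `\p{Zs}`.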

How can we match Urdu text? Urdu is a language rather than a Unicode script (it is written in an extended form of the Arabic script), so we need to match specific Unicode code point ranges. These are as follows:

    U+0600 to U+06FF
    U+0750 to U+077F
    U+FB50 to U+FDFF
    U+FE70 to U+FEFF

A Java regex to detect the presence of any Urdu character will be:

    [\u0600-\u06FF\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFF]
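
The character class can be tried out with a sketch like the one below. The names `UrduDetectDemo` and `containsUrduChar` are hypothetical, and the ranges are the four listed above.

```java
import java.util.regex.Pattern;

public class UrduDetectDemo {
    // Character class covering U+0600-U+06FF, U+0750-U+077F,
    // U+FB50-U+FDFF and U+FE70-U+FEFF. The backslashes are doubled so that
    // the regex engine, not the Java compiler, interprets the escapes.
    private static final Pattern URDU_CHAR = Pattern.compile(
            "[\\u0600-\\u06FF\\u0750-\\u077F\\uFB50-\\uFDFF\\uFE70-\\uFEFF]");

    // True if at least one character from the listed ranges occurs in the input.
    static boolean containsUrduChar(String input) {
        return URDU_CHAR.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsUrduChar("\u0627\u0631\u062F\u0648")); // true
        System.out.println(containsUrduChar("Urdu \u0679"));              // true: one character is enough
        System.out.println(containsUrduChar("Urdu"));                     // false
    }
}
```

These ranges cover Arabic-script blocks in general, so the same pattern also reports a match for Arabic or Persian text; it detects the script ranges used by Urdu, not the language itself.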