Examples of matching Unicode text in regular expressions

The following regex will match text consisting only of Unicode letters, including accented characters such as "à":

    ^\p{L}+$
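As a minimal sketch of how this pattern might be used from Java, the class below compiles it once and tests whole strings against it. The class name `UnicodeLetters` and the method `isLettersOnly` are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

public class UnicodeLetters {
    // \p{L} matches any code point in the Unicode "Letter" general category.
    // The backslash is doubled because this is a Java string literal.
    private static final Pattern LETTERS_ONLY = Pattern.compile("^\\p{L}+$");

    // Returns true only when every character of the input is a letter.
    static boolean isLettersOnly(String input) {
        return LETTERS_ONLY.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Precomposed accented letter (U+00E0) at the end of the word.
        System.out.println(isLettersOnly("voil\u00E0"));
        // A space and a digit are not letters.
        System.out.println(isLettersOnly("caf\u00E9 1"));
    }
}
```

Note that an accent written as a separate combining mark (for example "a" followed by U+0300) belongs to the Mark category rather than the Letter category, so such input would not match this pattern.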

The following regex will match text consisting only of Latin-script characters and Unicode space separators:

    ^[\p{IsLatin}\p{Zs}]+$
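The following sketch shows one way to apply this pattern in Java and illustrates what the space separator category does and does not cover. The class name `LatinText` and the method `isLatinWithSpaces` are hypothetical.

```java
import java.util.regex.Pattern;

public class LatinText {
    // \p{IsLatin} matches characters of the Latin script;
    // \p{Zs} matches the "Space Separator" general category.
    private static final Pattern LATIN_AND_SPACES =
            Pattern.compile("^[\\p{IsLatin}\\p{Zs}]+$");

    static boolean isLatinWithSpaces(String input) {
        return LATIN_AND_SPACES.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Ordinary space (U+0020) is a space separator.
        System.out.println(isLatinWithSpaces("Hello World"));
        // No-break space (U+00A0) is also in Zs.
        System.out.println(isLatinWithSpaces("Hello\u00A0World"));
        // A tab is a control character, not a space separator.
        System.out.println(isLatinWithSpaces("Hello\tWorld"));
        // Punctuation such as a comma is not part of the Latin script.
        System.out.println(isLatinWithSpaces("Hello, World"));
    }
}
```

If tabs and line breaks should also be accepted, the character class would need an extra member for them, since Zs covers only space separators.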

The following regex should be used to detect the presence of a Hebrew character in input:

    \p{InHebrew}

The following regex should be used to detect an input that contains only Arabic text:

    ^\p{InArabic}+$
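The difference between detecting the presence of a character and requiring that the whole input come from one block can be sketched in Java as follows. The class name `BlockChecks` and both method names are hypothetical. The sketch assumes `find()` for the presence check and `matches()` for the whole-input check.

```java
import java.util.regex.Pattern;

public class BlockChecks {
    // Unanchored: succeeds if any character is in the Hebrew block.
    private static final Pattern HEBREW = Pattern.compile("\\p{InHebrew}");
    // Anchored: every character must be in the Arabic block.
    private static final Pattern ARABIC_ONLY = Pattern.compile("^\\p{InArabic}+$");

    static boolean containsHebrew(String input) {
        return HEBREW.matcher(input).find();
    }

    static boolean isArabicOnly(String input) {
        return ARABIC_ONLY.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Latin text followed by four Hebrew letters.
        System.out.println(containsHebrew("abc \u05E9\u05DC\u05D5\u05DD"));
        // Four Arabic letters and nothing else.
        System.out.println(isArabicOnly("\u0633\u0644\u0627\u0645"));
        // The ASCII space and Latin letters are outside the Arabic block.
        System.out.println(isArabicOnly("\u0633\u0644\u0627\u0645 abc"));
    }
}
```

Because an ordinary space lies outside the Arabic block, the anchored pattern rejects multi-word input as written. Accepting spaces would require adding them to the pattern explicitly.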

How can we match Urdu text? Urdu is a language rather than a Unicode script or block, so there is no dedicated property for it, and we will need to match certain Unicode code point ranges instead. These are as follows:

    U+0600 to U+06FF
    U+0750 to U+077F
    U+FB50 to U+FDFF
    U+FE70 to U+FEFF

A Java regex to detect the presence of any Urdu character will be:

    [\u0600-\u06FF\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFF]
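A minimal sketch of using this character class from Java code is shown below. The class name `UrduDetector` and the method `containsUrduCharacter` are hypothetical. The sketch assumes the pattern is written as a string literal with doubled backslashes, so that the regex engine, rather than the compiler, interprets the escapes.

```java
import java.util.regex.Pattern;

public class UrduDetector {
    // The four ranges listed above, written as regex Unicode escapes.
    // Each backslash is doubled because this is a Java string literal.
    private static final Pattern URDU_CHAR = Pattern.compile(
            "[\\u0600-\\u06FF\\u0750-\\u077F\\uFB50-\\uFDFF\\uFE70-\\uFEFF]");

    // Returns true if at least one character falls in the listed ranges.
    static boolean containsUrduCharacter(String input) {
        return URDU_CHAR.matcher(input).find();
    }

    public static void main(String[] args) {
        // Latin text followed by four letters from the U+0600 to U+06FF range.
        System.out.println(containsUrduCharacter("Hello \u0627\u0631\u062F\u0648"));
        // Plain ASCII text contains no character from the listed ranges.
        System.out.println(containsUrduCharacter("Hello"));
    }
}
```

The first range, U+0600 to U+06FF, is the Arabic block itself, so this check also succeeds on Arabic or Persian text. It detects characters that Urdu uses, not the Urdu language as such.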
