Suppose we need to match characters that belong to one class but not to another. Composite character classes have no dedicated subtraction operator. Instead, subtraction is expressed by combining the intersection operator, &&, with a negated inner character class.
For example, consider the following composite character class:
[0-9&&[^3-6]]
It matches the digits 0 to 9, except the digits 3 to 6. This character class can also be written as a union of two character classes:
[[0-2][7-9]]
We can also just use a simple character class, as follows:
[0-27-9]
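The equivalence of these three forms can be checked with a small program. The following is only a minimal sketch: the class name DigitSubtractionCheck and the helper method matchedDigits are hypothetical names introduced for illustration. It collects the digits that each pattern accepts so that the results can be compared.

```java
import java.util.regex.Pattern;

public class DigitSubtractionCheck {
    // The three equivalent forms discussed above
    static final Pattern SUBTRACTION = Pattern.compile("[0-9&&[^3-6]]");
    static final Pattern UNION = Pattern.compile("[[0-2][7-9]]");
    static final Pattern SIMPLE = Pattern.compile("[0-27-9]");

    // Hypothetical helper: returns the digits 0-9 that the pattern matches
    static String matchedDigits(Pattern p) {
        StringBuilder sb = new StringBuilder();
        for (char c = '0'; c <= '9'; c++) {
            if (p.matcher(String.valueOf(c)).matches()) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Each line prints 012789
        System.out.println(matchedDigits(SUBTRACTION));
        System.out.println(matchedDigits(UNION));
        System.out.println(matchedDigits(SIMPLE));
    }
}
```

All three patterns accept the same six digits, so the choice between them is mostly a matter of readability. The subtraction form tends to pay off when the excluded set is easier to state than the included one.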
To match all the uppercase English consonants, we can subtract the five vowels from the uppercase letters, as in the following regex:
[A-Z&&[^AEIOU]]
We can also reverse the order of the two sets used in the preceding regex and use the following regex:
[[^AEIOU]&&A-Z]
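As a quick check that the order of the two sets does not matter here, the following sketch runs both consonant patterns over the letters A to Z. The class name ConsonantCheck and the helper matchedUppercase are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

public class ConsonantCheck {
    // Uppercase letters minus vowels, written in both orders
    static final Pattern LETTERS_FIRST = Pattern.compile("[A-Z&&[^AEIOU]]");
    static final Pattern NEGATION_FIRST = Pattern.compile("[[^AEIOU]&&A-Z]");

    // Hypothetical helper: returns the letters A-Z that the pattern matches
    static String matchedUppercase(Pattern p) {
        StringBuilder sb = new StringBuilder();
        for (char c = 'A'; c <= 'Z'; c++) {
            if (p.matcher(String.valueOf(c)).matches()) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String first = matchedUppercase(LETTERS_FIRST);
        String second = matchedUppercase(NEGATION_FIRST);
        System.out.println(first);
        System.out.println(second);
        System.out.println("Same result: " + first.equals(second));
    }
}
```

Intersection is commutative, so both patterns should accept the same 21 consonants and reject the vowels as well as any lowercase letter.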
Suppose we want to match all punctuation characters except the four basic math operators: +, -, *, and /. We can use the following composite character class, which relies on the subtraction operation:
[\p{Punct}&&[^+*/-]]
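Note that the hyphen is placed last in the negated class so that it is read as a literal character rather than as a range operator. Here is a test class that tests the preceding subtraction character class: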
package example.regex;

import java.util.regex.*;

public class SubtractionExample
{
  public static void main(String[] args)
  {
    // The backslash must be doubled inside a Java string literal
    final Pattern p = Pattern.compile("[\\p{Punct}&&[^+*/-]]");

    final String[] arr = new String[] {
      "!", "@", "#", "$", "%", "+", "-", "*", "/", "1", "M", "d"
    };

    // Test each single-character input against the whole pattern
    for (String s: arr)
    {
      Matcher m = p.matcher(s);
      System.out.printf("[%s] %s%n", s,
        (m.matches() ? "matches" : "does not match"));
    }
  }
}
Compiling and running this program produces the following output:
[!] matches
[@] matches
[#] matches
[$] matches
[%] matches
[+] does not match
[-] does not match
[*] does not match
[/] does not match
[1] does not match
[M] does not match
[d] does not match
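As is evident from this output, the pattern matches every punctuation character tested except the four listed math operators. The digit and the letters do not match because they are not punctuation characters in the first place.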