To replace all the semi-colons with hyphens, we can use the following:
input = input.replaceAll(";", "-");
To remove all the non-digits from the input, we can use:
input = input.replaceAll("\\D+", "");
To remove all the leading and trailing commas from an input, we can use a regex with alternation:
input = input.replaceAll("^,+|,+$", "");
To replace all the occurrences of two or more white spaces with a single space, we can use:
input = input.replaceAll("\\s{2,}", " ");
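The four one-liners above can be tried together in a small sketch like the following. The class name, method names, and sample inputs are illustrative assumptions, not part of the original listing.

```java
// Minimal sketch; BasicReplacements and its method names are hypothetical.
final class BasicReplacements {

    // Replace every semicolon with a hyphen.
    static String semicolonsToHyphens(String input) {
        return input.replaceAll(";", "-");
    }

    // Remove every run of non-digit characters.
    static String digitsOnly(String input) {
        return input.replaceAll("\\D+", "");
    }

    // Remove commas at the start (^,+) or at the end (,+$) of the input.
    static String trimCommas(String input) {
        return input.replaceAll("^,+|,+$", "");
    }

    // Collapse two or more whitespace characters into a single space.
    static String collapseSpaces(String input) {
        return input.replaceAll("\\s{2,}", " ");
    }

    public static void main(String[] args) {
        System.out.println(semicolonsToHyphens("a;b;c"));            // a-b-c
        System.out.println(digitsOnly("+1 (555) 010-9999"));         // 15550109999
        System.out.println(trimCommas(",,red,green,,"));             // red,green
        System.out.println(collapseSpaces("too   many    spaces"));  // too many spaces
    }
}
```

Note that each regex shorthand such as \D or \s needs a doubled backslash inside a Java string literal, because the compiler consumes one level of escaping before the regex engine sees the pattern.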
How can we escape all the dollar signs that appear just before the % character? In other words, to replace all the occurrences of $% with \$%, we can use:
input = input.replaceAll("\\$%", "\\\\\\$%");
Note that we are using \\\\ (four backslashes) to enter a single \, and we're using \\$ to enter a single $ in the replacement, whereas % will just be a literal. The Java compiler halves each pair of backslashes first, and the replacement syntax then treats the remaining \\ as a literal \ and \$ as a literal $.
Consider the following input:
$200 $%apple% $%banana% $%orange%
It will be converted into this:
$200 \$%apple% \$%banana% \$%orange%
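To make the two layers of escaping visible, the following sketch prints the strings exactly as the regex engine receives them and then runs the replacement on the sample input. The class, constant, and method names are assumptions made for illustration.

```java
// Minimal sketch; EscapeLayers and its members are hypothetical names.
final class EscapeLayers {

    // Java source literals; the compiler turns each \\ into one backslash.
    static final String REGEX = "\\$%";            // engine sees: \$%
    static final String REPLACEMENT = "\\\\\\$%";  // engine sees: \\\$%

    // Put a backslash in front of every $ that is followed by %.
    static String escapeDollars(String input) {
        return input.replaceAll(REGEX, REPLACEMENT);
    }

    public static void main(String[] args) {
        System.out.println(REGEX);        // \$%
        System.out.println(REPLACEMENT);  // \\\$%
        System.out.println(escapeDollars("$200 $%apple% $%banana% $%orange%"));
        // $200 \$%apple% \$%banana% \$%orange%
    }
}
```

The $ in $200 is left alone because it is not followed by %.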
We can also leverage the group reference $0 here, which is populated with the entire text matched by the regex. Since $0 refers to the text $% matched by our regex, the code can be simplified to this:
input = input.replaceAll("\\$%", "\\\\$0");
Another nice trick is to use the static method Matcher.quoteReplacement from the Matcher API. This method escapes the special characters (\ and $) in a replacement string so that they are treated literally. Now, our code can become this:
input = input.replaceAll("\\$%", Matcher.quoteReplacement("\\") + "$0");
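As a quick check that the $0 form and the Matcher.quoteReplacement form behave the same way, here is a minimal sketch. The class and method names are illustrative assumptions; Matcher.quoteReplacement itself is the standard java.util.regex method.

```java
import java.util.regex.Matcher;

// Minimal sketch; EscapeWithGroupZero and its method names are hypothetical.
final class EscapeWithGroupZero {

    // Hand-escaped backslash followed by the whole match ($0).
    static String viaGroupZero(String input) {
        return input.replaceAll("\\$%", "\\\\$0");
    }

    // Let quoteReplacement escape the backslash, then append $0 unquoted
    // so that it still acts as a group reference.
    static String viaQuoteReplacement(String input) {
        String backslash = Matcher.quoteReplacement("\\");
        return input.replaceAll("\\$%", backslash + "$0");
    }

    public static void main(String[] args) {
        String input = "$200 $%apple% $%banana% $%orange%";
        System.out.println(viaGroupZero(input));
        System.out.println(viaQuoteReplacement(input));
        // both lines print: $200 \$%apple% \$%banana% \$%orange%
    }
}
```

Only the backslash is passed through quoteReplacement here. Quoting the whole replacement, including $0, would turn the group reference into the literal text $0.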
Let's solve an interesting problem. We need to replace every run of two or more occurrences of the same non-word character with a single instance of that character.
Consider the following input text:
Let''''''s learn::: how to write cool regex...
The expected output is:
Let's learn: how to write cool regex.
Note that we are collapsing repeated non-word characters only; repeated word characters, such as the oo in cool, are left as they are.
Here is the code listing to solve this problem:
package example.regex;

public class StringReplaceAll
{
    public static void main(String[] args)
    {
        // our input string
        String input = "Let''''''s learn::: how to write cool regex...";
        // call replaceAll and assign replaced string to same variable;
        // (\\W) captures one non-word character and \\1+ matches its repeats
        input = input.replaceAll("(\\W)\\1+", "$1");
        // print the result
        System.out.printf("Replaced result: %s%n", input);
    }
}
Here are a few points about this solution:
- We are using the predefined class, \W, to match a non-word character
- We are using a capturing group around the non-word character to be able to use a back-reference later in the regex and in the replacement
- The pattern, (\W)\1+, is used to match two or more occurrences of the same non-word character
- \1 represents the back-reference to the first captured group
- In the replacement, we are using the reference, $1, to place the captured non-word character back in the replaced string
- $1 represents the reference to the first captured group
- Using the named group directives that you learnt in the previous chapter, we can also write the replaceAll method call as follows:
input = input.replaceAll("(?<nwchar>\\W)\\k<nwchar>+", "${nwchar}");
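The numbered and named forms can be compared side by side with a sketch like this one. The class name, method names, and the second sample input are assumptions added for illustration.

```java
// Minimal sketch; CollapseRepeats and its method names are hypothetical.
final class CollapseRepeats {

    // Numbered group: \1 in the regex, $1 in the replacement.
    static String withNumberedGroup(String input) {
        return input.replaceAll("(\\W)\\1+", "$1");
    }

    // Named group: \k<nwchar> in the regex, ${nwchar} in the replacement.
    static String withNamedGroup(String input) {
        return input.replaceAll("(?<nwchar>\\W)\\k<nwchar>+", "${nwchar}");
    }

    public static void main(String[] args) {
        String input = "Let''''''s learn::: how to write cool regex...";
        System.out.println(withNumberedGroup(input)); // Let's learn: how to write cool regex.
        System.out.println(withNamedGroup(input));    // Let's learn: how to write cool regex.

        // Adjacent but different non-word characters are not merged.
        System.out.println(withNumberedGroup("wow!!! really?!")); // wow! really?!
    }
}
```

The last call shows why the back-reference matters: a plain \W+ would treat ?! as one run, whereas (\W)\1+ only matches repeats of the same captured character.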