Another useful shell command is tr. It translates, squeezes, or deletes characters read from standard input. The syntax is as follows:
tr [OPTION]... SET1 [SET2]
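The options for the tr command are as follows:
-c, -C    Use the complement of SET1
-d        Delete the characters in SET1
-s        Squeeze each run of a repeated character listed in the last specified SET into a single occurrence
-t        Truncate SET1 to the length of SET2 before translating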
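SETs are strings of characters that can be specified using forms such as the following:
CHAR1-CHAR2    All characters from CHAR1 to CHAR2 in ascending order, for example a-z
\n, \t, \\     Backslash escapes for newline, horizontal tab, and backslash
[:alnum:]      All letters and digits
[:alpha:]      All letters
[:digit:]      All digits
[:lower:]      All lowercase letters
[:upper:]      All uppercase letters
[:space:]      All horizontal and vertical whitespace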
To read input from a file and write output to a file, we can use the file redirection operators: < (less than, for input) and > (greater than, for output).
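As a minimal sketch of translation combined with both redirection operators, the following maps lowercase letters to uppercase. The file names in.txt and out.txt and the temporary directory are hypothetical and used only for illustration.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)

# Hypothetical input file with two short lines.
printf 'hello tr\nsecond line\n' > "$workdir/in.txt"

# Each character of SET1 ('a-z') is replaced by the character at the
# same position in SET2 ('A-Z'); everything else passes through unchanged.
tr 'a-z' 'A-Z' < "$workdir/in.txt" > "$workdir/out.txt"

cat "$workdir/out.txt"
# HELLO TR
# SECOND LINE
```

Because tr reads only from standard input, the < operator is what connects it to the file. The temporary directory can be removed with rm -r "$workdir" afterwards.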
Sometimes it is important to remove a few unnecessary characters from the input text. For example, suppose our input text is in the tr.txt file:
$ cat tr.txt
This is a text file for demonstrating tr command.
This input file contains digit 2 3 4 and 5 as well.
THIS IS CAPS LINE
this a lowercase line
Suppose we want to remove all the uppercase letters from this file. We can use the -d option with SET1 as 'A-Z':
$ tr -d 'A-Z' < tr.txt
his is a text file for demonstrating tr command.
his input file contains digit 2 3 4 and 5 as well.
   
this a lowercase line
The output no longer contains any uppercase letters: the leading T of each sentence is gone, and the all-caps line is reduced to the spaces that separated its words. We can also remove newlines and spaces from a file as follows:
$ tr -d ' \n' < tr.txt > tr_out1.txt
Here, we have redirected the output to tr_out1.txt:
$ cat tr_out1.txt
Thisisatextfilefordemonstratingtrcommand.Thisinputfilecontainsdigit234and5aswell.THISISCAPSLINEthisalowercaseline
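The -d option also accepts character classes and escapes in SET1. The following is a small sketch using inline sample text piped through printf rather than a file; the sample strings are made up for illustration.

```shell
# Delete every digit; [:digit:] is the character class for 0-9.
# The spaces around the digits are not in SET1, so they remain.
no_digits=$(printf 'digit 2 3 4 and 5\n' | tr -d '[:digit:]')
printf '[%s]\n' "$no_digits"

# Delete both spaces and newlines; \n inside the quotes is
# interpreted by tr itself as a newline character.
joined=$(printf 'a b\nc d\n' | tr -d ' \n')
printf '%s\n' "$joined"
# abcd
```

The brackets in the first printf only make the leftover spaces visible; they are not part of the tr output.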
The -s option is useful when we don't want to delete a character throughout the input text but instead want to squeeze consecutive repeated occurrences of it down to a single occurrence.
One use case is when there are multiple spaces between words and we want to reduce them to a single space between any two words or strings in the input text. Consider the tr1.txt file as an example:
$ cat tr1.txt
India            China              Canada
USA    Japan               Russia
Germany        France               Italy
Australia   Nepal
Looking at this file, it is clear that the text is not properly aligned: there are multiple spaces between words. We can squeeze each run of spaces down to one space using tr with the -s option:
$ tr -s ' ' < tr1.txt
India China Canada
USA Japan Russia
Germany France Italy
Australia Nepal
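The squeeze behavior can be tried without a file, as in this short sketch. The second command is an additional illustration, not taken from the text above: when -s is given together with two sets, tr first translates and then squeezes repeats of the characters in SET2.

```shell
# Squeeze each run of spaces into a single space.
squeezed=$(printf 'India    China   Canada\n' | tr -s ' ')
printf '%s\n' "$squeezed"
# India China Canada

# Translate spaces to newlines, then squeeze repeated newlines,
# which yields one word per line regardless of the original spacing.
one_per_line=$(printf 'USA   Japan  Russia\n' | tr -s ' ' '\n')
printf '%s\n' "$one_per_line"
# USA
# Japan
# Russia
```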
The tr command also provides the -c or -C option, which complements SET1 so that tr operates on every character that is not in the set. This is useful when it is easier to specify what should not be translated or deleted.
For example, suppose we want to keep only alphanumeric characters, newlines, and spaces in the text. Everything else should be deleted from the input text. Here, it is easier to specify what not to delete than what to delete.
Consider the tr2.txt file, whose content is as follows:
$ cat tr2.txt
This is an input file.
It conatins special character like ?, ! etc
&^var is an invalid shll variable.
_var1_ is a valid shell variable
To delete characters other than alphanumeric, newline, and white-space, we can run the following command:
$ tr -cd '[:alnum:] \n' < tr2.txt
This is an input file
It conatins special character like   etc
var is an invalid shll variable
var1 is a valid shell variable
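The same complement idea works with other sets. The sketch below uses made-up sample strings: the first command keeps only digits and newlines, and the second combines -c with -s to turn every run of non-alphanumeric characters into a single newline, which is a common way to split text into words.

```shell
# Keep only digits and newlines; every other character is deleted.
digits_only=$(printf 'order #42, qty: 7\n' | tr -cd '[:digit:]\n')
printf '%s\n' "$digits_only"
# 427

# Map every character NOT in [:alnum:] to a newline, then squeeze
# repeated newlines so that each word ends up on its own line.
words=$(printf 'one, two; three!\n' | tr -cs '[:alnum:]' '\n')
printf '%s\n' "$words"
# one
# two
# three
```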