Replacing tabs with spaces in a text file

Searching for and replacing file content is a common task that is easy to automate with Groovy scripts. This recipe shows how to do it for one typical case.

Getting ready

Let's assume that we have an input.txt file that contains some tab characters. We want to replace the tab characters with spaces and save the result into an output.txt file.

To perform any action on these files (as in the Filtering a text file's content recipe), we need to create two java.io.File instances:

def inputFile = new File('input.txt')
def outputFile = new File('output.txt')
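
If no suitable file is at hand, a small sample can be generated first. This is only a sketch: the content below is made up for illustration, and any text containing tab characters would do.

```groovy
// Hypothetical sample content: two lines, each with one tab character.
// Double quotes are used so that \t and \n are readable escape sequences.
new File('input.txt').text = "name\tvalue\nfoo\t1\n"
```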

How to do it...

Let's go through several ways to achieve the desired result:

  1. First of all, we will take advantage of the transformLine method that Groovy adds to the java.io.Reader class, as well as the withWriter and withReader methods, which are described in more detail in the Writing to a file and Reading from a file recipes:
    outputFile.withWriter { Writer writer ->
      inputFile.withReader { Reader reader ->
        reader.transformLine(writer) { String line ->
          line.replaceAll('\t', '  ')
        }
      }
    }
  2. A more concise form of the above code snippet looks like this:
    inputFile.withReader { reader ->
      reader.transformLine(outputFile.newWriter()) { line ->
        line.replaceAll('\t', '  ')
      }
    }
  3. Another way to do this is with the help of the getText and setText extension methods of java.io.File:
    outputFile.text = inputFile.text.replaceAll('\t', '  ')

    Although this approach is the shortest one, it has a drawback, which we will describe in the next section.

How it works...

Groovy adds an extension method, transformLine, to the java.io.Reader class. We used this method in the first and second examples. The method takes two parameters, a Writer and a Closure. The closure receives each line as a String and returns the transformed String. The writer is used to output the transformed lines. This means that, given a reader and a writer, we can use transformLine to perform a line-based data transformation.
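
Because transformLine works on any Reader and Writer, its behavior can be tried without touching the file system. The following is a minimal sketch that uses in-memory StringReader and StringWriter objects as stand-ins for the files; the sample text is made up for illustration.

```groovy
// In-memory stand-ins for the input and output files.
def reader = new StringReader("a\tb\nc\td\n")
def writer = new StringWriter()

// The closure is called once per line (without its line terminator)
// and returns the text that should be written in its place.
reader.transformLine(writer) { String line ->
    line.replaceAll('\t', '  ')
}

def result = writer.toString()
println result
```

Each transformed line is written followed by a line separator, so the result is best compared line by line (for example with readLines()) rather than against a string with hard-coded line endings.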

Since the transformLine method automatically closes the writer, we can omit the outer call to withWriter and just pass a new Writer instance to the transformLine method. That is what the second example does.
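
The closing behavior can be checked with a small experiment. In this sketch, TrackingWriter is a hypothetical helper class written only for this illustration (it is not part of Groovy); it records whether close() has been called.

```groovy
// Hypothetical helper: a StringWriter that remembers whether close() was called.
class TrackingWriter extends StringWriter {
    boolean closed = false

    @Override
    void close() {
        closed = true
        super.close()
    }
}

def trackingWriter = new TrackingWriter()
new StringReader("a\tb\n").transformLine(trackingWriter) { String line ->
    line.replaceAll('\t', '  ')
}

// No explicit close() call was made above, yet the writer has been closed.
assert trackingWriter.closed
```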

The third approach, of course, looks the most concise, but it has one disadvantage. The whole file content is loaded into memory, and, in the case of a very large file, we are at risk of getting an OutOfMemoryError. The first and second approaches read the input one line at a time, so they avoid this problem.

You have to choose which approach is more appropriate based on your input file sizes.
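
One way to make that choice at runtime is to check the file size first. The following is a sketch under stated assumptions: replaceTabs is a hypothetical helper name, and the 10 MB default limit is an arbitrary value chosen for illustration that should be tuned to the memory available to your JVM.

```groovy
// Hypothetical helper: picks the whole-file or the line-based approach
// depending on the size of the input file.
def replaceTabs(File input, File output, long limit = 10 * 1024 * 1024) {
    if (input.length() <= limit) {
        // Small file: read everything into memory and write it back in one go.
        output.text = input.text.replaceAll('\t', '  ')
    } else {
        // Large file: stream the content line by line.
        input.withReader { reader ->
            reader.transformLine(output.newWriter()) { line ->
                line.replaceAll('\t', '  ')
            }
        }
    }
}
```

With the files from this recipe, the call would be replaceTabs(inputFile, outputFile). Note that the two branches can differ in how they treat line endings: the line-based branch writes the platform line separator after every line.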

There's more...

Just as we used transformLine, you can also use the transformChar method to make character-based input transformations. For example, this code replaces each tab character with a single space character:

inputFile.withReader { Reader reader ->
  reader.transformChar(outputFile.newWriter()) { String chr ->
    chr == '\t' ? ' ' : chr
  }
}
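
As with transformLine, the method can be tried on in-memory data. This is a minimal sketch with made-up sample text; the closure receives each character as a one-character String.

```groovy
def charReader = new StringReader("a\tb\tc")
def charWriter = new StringWriter()

// Called once per character; whatever is returned is written to the writer.
charReader.transformChar(charWriter) { String chr ->
    chr == '\t' ? ' ' : chr
}

assert charWriter.toString() == 'a b c'
```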

See also

The following recipes give an introduction to I/O operations in Groovy:

  • Reading from a file
  • Writing to a file

Also, it's worth looking at what additional functionality Groovy offers in the java.io.Reader class.
