Filtering a text file's content

Filtering a text file's content is a rather common task. In this recipe, we will show how it can be easily achieved with Groovy.

Getting ready

Let's assume we want to filter out comment lines from a Bash script stored in the script.sh file and save the result in the new_script.sh file. First of all, we need to define two variables of the java.io.File type, inputFile and outputFile, that point to these files:

def inputFile = new File('script.sh')
def outputFile = new File('new_script.sh')
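To try the recipe without touching a real script, it may help to generate a small input file first. The following is only a sketch: the temporary directory and the sample script content are assumptions made for illustration and are not part of the original recipe.

```groovy
import java.nio.file.Files

// Hypothetical sandbox directory so that no real files are overwritten.
def dir = Files.createTempDirectory('filter-recipe').toFile()

// Sample Bash script (made-up content) with several kinds of '#' lines.
def inputFile = new File(dir, 'script.sh')
inputFile.text = '''\
#!/bin/bash
# Print a greeting
echo "Hello"
  # indented comment
echo "Done" # trailing comment
'''
def outputFile = new File(dir, 'new_script.sh')

// readLines() returns the file content as a list with one entry per line.
def lines = inputFile.readLines()
assert lines.size() == 5

// Remove the sandbox again.
dir.deleteDir()
```

In a real run, inputFile and outputFile would simply point to script.sh and new_script.sh as shown above.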

How to do it...

File filtering can be implemented in several ways:

  1. We can make use of the closure-based methods (that is, eachLine and withPrintWriter) that we became familiar with in the Reading a text file line by line and Writing to a file recipes:
    outputFile.withPrintWriter { writer ->
      inputFile.eachLine { line ->
        if (!line.startsWith('#')) {
          writer.println(line)
        }
      }
    }
  2. Another way to achieve the same result is to use the filterLine method. It takes a Writer and a closure as input parameters. The closure receives each line as a string and should return true if the line is to be kept and false if it is to be dropped. We can rewrite the original code snippet in the following way:
    outputFile.withWriter { writer ->
      inputFile.filterLine(writer) { line ->
        !line.startsWith('#')
      }
    }
  3. Actually, the filterLine method also closes the given writer automatically. So, we can omit one closure from the previous code and just pass a new Writer instance to the method:
    inputFile.filterLine(outputFile.newWriter()) { line ->
      !line.startsWith('#')
    }

How it works...

In the first code example, we called the withPrintWriter method and passed it a closure in which we iterated through the text lines with the help of the eachLine method. The inner closure passed to eachLine has access to both the writer and the line objects. Inside that closure, we added a simple conditional statement to write out only those lines that do not start with #.

In the second snippet, we passed a closure to the filterLine method. That closure receives each line and is expected to return a boolean that indicates whether the line should be written to the final output (writer) or not. The third snippet works the same way, except that it relies on filterLine to close the writer.

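All code examples achieve the same result. After the Groovy code from the previous section has run, the filtered Bash script contains no lines that begin with #. Note that the startsWith('#') test is deliberately simple: it also drops a shebang line such as #!/bin/bash, while indented comments and comments that follow a command on the same line are kept.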
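The claim that the three variants are equivalent can be checked with a short script. This is a minimal sketch under a few assumptions: the temporary directory, the out1.sh to out3.sh file names, the sample input, and the keep closure are hypothetical helpers introduced only for this comparison.

```groovy
import java.nio.file.Files

// Hypothetical sandbox and sample input.
def dir = Files.createTempDirectory('filter-compare').toFile()
def inputFile = new File(dir, 'script.sh')
inputFile.text = '#!/bin/bash\n# comment\necho "Hello"\n  # indented\necho "Done" # trailing\n'

// The filter condition shared by all three variants.
def keep = { String line -> !line.startsWith('#') }

// Variant 1: eachLine inside withPrintWriter.
def out1 = new File(dir, 'out1.sh')
out1.withPrintWriter { writer ->
  inputFile.eachLine { line ->
    if (keep(line)) {
      writer.println(line)
    }
  }
}

// Variant 2: filterLine inside withWriter.
def out2 = new File(dir, 'out2.sh')
out2.withWriter { writer ->
  inputFile.filterLine(writer, keep)
}

// Variant 3: filterLine with a fresh writer, which it closes itself.
def out3 = new File(dir, 'out3.sh')
inputFile.filterLine(out3.newWriter(), keep)

// Only lines whose first character is '#' are gone, including the shebang.
def expected = ['echo "Hello"', '  # indented', 'echo "Done" # trailing']
def results = [out1, out2, out3]*.readLines()
assert results.every { it == expected }

dir.deleteDir()
```

If the shebang line or indented comments need different treatment, the condition in keep is the only place that has to change, for example by trimming the line or checking for '#!' first.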

There's more...

There is another overloaded version of the filterLine method that returns an instance of groovy.lang.Writable. Writable has only one method, writeTo(java.io.Writer writer). It's a Groovy abstraction that allows postponing content creation until the content is actually streamed to the final output target. Instances of Writable can be used in most output operations, such as print, write, append, and leftShift. That's why this filterLine version does not need a Writer. Taking all this into account, we can rewrite the previous code as a one-liner:

outputFile << inputFile.filterLine { !it.startsWith('#') }

The Writable result of the filterLine method is sent to outputFile with the help of the leftShift operator (you can read more about it in the Writing to a file recipe). We also omitted the line variable and simply referred to Groovy's default it closure parameter. Written this way, the code looks almost like an OS command: short and clear. Keep in mind that << appends to the file, so if new_script.sh already exists, the filtered lines are added to the end of it rather than replacing its content.
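The deferred nature of Writable and the appending behavior of << can be observed in a small experiment. The sketch below assumes a throwaway temporary directory and made-up file content; the buffer and appended variables are hypothetical names used only here.

```groovy
import java.nio.file.Files

// Hypothetical sandbox and sample input.
def dir = Files.createTempDirectory('filter-writable').toFile()
def inputFile = new File(dir, 'script.sh')
inputFile.text = '# comment\necho "Hello"\n'

// Nothing is written at this point; we only receive a Writable.
Writable filtered = inputFile.filterLine { !it.startsWith('#') }

// The lines are read and filtered when the Writable is streamed to a Writer.
def buffer = new StringWriter()
filtered.writeTo(buffer)
assert buffer.toString().readLines() == ['echo "Hello"']

// << appends to the target file, so existing content is preserved.
def outputFile = new File(dir, 'new_script.sh')
outputFile.text = 'existing line\n'
outputFile << inputFile.filterLine { !it.startsWith('#') }
def appended = outputFile.readLines()
assert appended == ['existing line', 'echo "Hello"']

dir.deleteDir()
```

A new Writable is requested for the second write because the first one has already consumed its underlying reader. When the output file must contain only the filtered lines, one option is to delete it before using <<, or to fall back to the Writer-based filterLine variants shown earlier.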

See also

Check the following recipes for some additional insights:

  • Reading a text file line by line
  • Writing to a file
