File Operations

I said earlier that regular expressions are often used to process data stored in files on disk. In some earlier examples, I read in data from a disk file, did some pattern matching, and displayed the results on the screen. Here is one more example in which I count the words in a file. You do this by scanning each line in order to create an array of words (that is, sequences of alphanumeric characters) and then adding the size of each array to the variable, count:

wordcount.rb

count = 0
File.foreach( 'regex1.rb' ){ |line|
    count += line.scan( /[a-z0-9A-Z]+/ ).size
}
puts( "There are #{count} words in this file." )
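To see exactly what scan returns before relying on the total, it can help to try the same pattern on a short string. The following is only a minimal sketch: the count_words method and the sample text are my own illustrative names, not part of the example files. Note that this pattern treats any run of ASCII letters and digits as a word, so an identifier containing an underscore, such as my_var, counts as two words.

```ruby
# Count the "words" (runs of ASCII letters and digits) in a string.
# count_words is a hypothetical helper used only for this illustration.
def count_words( text )
    count = 0
    text.each_line{ |line|
        count += line.scan( /[a-z0-9A-Z]+/ ).size
    }
    return count
end

# scan returns an array containing every match found in the string
p( "x = 10 # set x".scan( /[a-z0-9A-Z]+/ ) )   # => ["x", "10", "set", "x"]

sample = "x = 10 # set x\nputs( x )\n"
puts( count_words( sample ) )                   # => 6
```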

If you want to verify that the word count is correct, you could display a numbered list of the words read in from the file. This is what I do here:

wordcount2.rb

count = 0
File.foreach( 'regex1.rb' ){ |line|
    line.scan( /[a-z0-9A-Z]+/ ).each{ |word|
        count += 1
        print( "[#{count}] #{word}\n" )
    }
}

Now let’s see how to deal with two files at once: one for reading, another for writing. The next example opens the file testfile1.txt for writing and passes the file variable, f, into a block. Inside that block, I open a second file, regex1.rb, for reading and use File.foreach to pass each line of text read from this file into a second block. I use a simple regular expression to match lines that start with a Ruby-style comment and to build a new string from each one: the code substitutes C-style comment characters (//) for the Ruby comment character (#) when that character is the first nonwhitespace character on a line. Each line is then written to testfile1.txt, with code lines unmodified (because there are no matches on those) and comment lines changed to C-style comment lines:

regexp_file1.rb

File.open( 'testfile1.txt', 'w' ){ |f|
    File.foreach( 'regex1.rb' ){ |line|
        f.puts( line.sub(/(^\s*)#(.*)/, '\1//\2')  )
    }
}
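The substitution is easier to follow when applied to a few strings directly. In this sketch, the to_c_comment method and the sample lines are hypothetical names of my own. The first group, (^\s*), captures any leading whitespace, and the second group, (.*), captures everything after the # character; the replacement string puts them back as \1 and \2 with // in between. The replacement is written in single quotes so that the backslashes reach sub unchanged.

```ruby
# Convert a Ruby line comment to a C-style line comment.
# to_c_comment is a hypothetical helper used only for this illustration.
def to_c_comment( line )
    return line.sub( /(^\s*)#(.*)/, '\1//\2' )
end

puts( to_c_comment( "# a comment" ) )            # => // a comment
puts( to_c_comment( "    # indented" ) )         # =>     // indented
puts( to_c_comment( "x = 1 # trailing" ) )       # => x = 1 # trailing
```

The last line is left unchanged because its # is not the first nonwhitespace character, so the pattern does not match.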

This illustrates just how much can be done with regular expressions and very little coding. The next example shows how you might read in one file (here the file regex1.rb) and write out two new files—one of which (comments.txt) contains only line comments, while the other (nocomments.txt) contains all the other lines.

regexp_file2.rb

file_out1 = File.open( 'comments.txt', 'w' )
file_out2 = File.open( 'nocomments.txt', 'w' )

File.foreach( 'regex1.rb' ){ |line|
    if line =~ /^\s*#/ then
        file_out1.puts( line )
    else
        file_out2.puts( line )
    end
}

file_out1.close
file_out2.close
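As an alternative to closing the two output files explicitly, the block form of File.open used in regexp_file1.rb can be nested so that both files are closed automatically when their blocks exit. This is only a sketch of that variation: the split_comments method, its parameter names and the sample file created in a temporary directory are my own assumptions, chosen so that the code runs without needing regex1.rb.

```ruby
require 'tmpdir'

# Write the comment lines of src to comments_path and all other lines
# to others_path. split_comments is a hypothetical helper; each File.open
# block closes its file when the block finishes.
def split_comments( src, comments_path, others_path )
    File.open( comments_path, 'w' ){ |comments|
        File.open( others_path, 'w' ){ |others|
            File.foreach( src ){ |line|
                if line =~ /^\s*#/ then
                    comments.puts( line )
                else
                    others.puts( line )
                end
            }
        }
    }
end

# Try it on a small sample file in a temporary directory
Dir.mktmpdir{ |dir|
    src = File.join( dir, 'sample.rb' )
    File.write( src, "# a comment\nx = 1\n  # indented comment\n" )
    split_comments( src, File.join( dir, 'comments.txt' ), File.join( dir, 'nocomments.txt' ) )
    print( File.read( File.join( dir, 'comments.txt' ) ) )
}
```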