Expanding and Compressing Tabs

Problem

You need to convert space characters to tab characters in a file, or vice versa. You might want to replace spaces with tabs to save space on disk, or go the other way to deal with a device or program that can’t handle tabs.

Solution

Use my Tabs class along with the EnTab and Detab programs that rely on it.

Discussion

Example 3-5 is a listing of EnTab, complete with a sample main program. The program works a character at a time; if the character is a space, we check whether it can be coalesced with the preceding spaces into a single tab character. EnTab depends on the Tabs class, which we’ll come to shortly; Tabs decides which column positions are tab stops and which are not. The code also has several Debug printouts. (Debug was introduced in Section 1.12.)

Example 3-5. EnTab.java

import com.darwinsys.util.Debug;
import java.io.*;

/** entab- replace blanks by tabs and blanks.
 * Transmuted from Kernighan & Plauger's Software Tools book into C.
 * Transmuted again, years later, into Java.
 */
public class EnTab {

    /** Main program: just create an EnTab object, and pass
     * the standard input or the named file(s) through it.
     */
    public static void main(String[] argv) throws IOException {
        EnTab et = new EnTab(8);
        if (argv.length == 0)    // do standard input
            et.entab(new BufferedReader(
                new InputStreamReader(System.in)));
        else for (int i=0; i<argv.length; i++) {    // do each file
            et.entab(new BufferedReader(new FileReader(argv[i])));
        }
    }

    /** The Tabs (tab logic handler) */
    protected Tabs tabHandler;
    /** A symbolic constant for end-of-file */
    public static final int EOF = -1;

    /** Constructor: just save the tab values.
     * @param n The number of spaces each tab is to replace.
     */
    public EnTab(int n) {
        tabHandler = new Tabs(n);
    }

    /** putchar - convenience routine for printing one character */
    protected void putchar(int ch) {
        System.out.print((char)ch);
    }

    /** entab: process one entire file, replacing blanks with tabs.
     * @param is A BufferedReader opened to the file to be read.
     */
    public void entab(BufferedReader is) throws IOException {
        int c, col = 0, newcol;

        // main loop: process entire file one char at a time.
        do {
            newcol = col;
            // If we get a space, increment column count; if this
            // takes us to a tab stop, output a tab character.
            while ((c = is.read(  )) == ' ') {
                Debug.println("space", "Got space at " + col);
                newcol++;
                if (tabHandler.tabpos(newcol)) {
                    Debug.println("tab", "Got a Tab Stop " + newcol);
                    putchar('\t');
                    col = newcol;
                }
            }
            // If we're just past a tab stop, we need to put the
            // "leftover" spaces back out, since we just consumed 
            // them in the "while c ... == ' ')" loop above.
            while (col < newcol) {
                Debug.println("pad", "Padding space at " + col);
                putchar(' ');
                col++;
            }
            Debug.println("out", "End of loop, c is " + c);

            // Now either we're at the end of the input file,
            // or we have a plain character to output.
            // If the "plain" char happens to be \r or \n, then
            // output it, but also reset col for the new line.
            // This code for \r and \n should satisfy Unix, Mac and MS.
            if (c != EOF) {
                putchar(c);
                // Columns are counted from 0 (col starts at 0 above),
                // so a line ending takes col back to 0.
                col = (c == '\n' || c == '\r') ? 0 : col + 1;
            }
        } while (c != EOF);
        System.out.flush(  );    // output everything for this file.
    }
}

As the comments state, this code was patterned after a program in Kernighan and Plauger’s classic work Software Tools. Their version was written in a language called RatFor (Rational Fortran); mine has been through several translations since then, though I’ve tried to preserve the overall structure. This is not the most “natural” way of writing the code in Java, which would be to work a line at a time. I’ve left this C-language relic to provide some hints on translating a working C program written in this character-at-a-time style into Java. This version tries to work correctly on Windows, Unix, or the Macintosh, since it resets the column count whenever it finds either a return (\r) or a newline (\n); see Section 2.5. Java is platform independent, but it’s still possible to write platform-dependent code, and I would have done so here had the code not handled both characters. The code still may not work on some odd platforms that don’t use either of the two line-ending characters.

The Detab program in Example 3-6 doesn’t have this line-ending problem, as it reads a line at a time and leaves the detection of line endings to readLine( ).

Example 3-6. Detab.java

public void detab(BufferedReader is) throws IOException {
    String line;
    char c;
    int col;
    while ((line = is.readLine(  )) != null) {
        col = 0;
        for (int i=0; i<line.length(  ); i++) {
            // Either ordinary character or tab.
            if ((c=line.charAt(i)) != '\t') {
                System.out.print(c); // Ordinary
                ++col;
                continue;
            }
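            // tabpos( ) is the Tabs method shown in Example 3-7; this
            // excerpt assumes the enclosing class can call it directly.
            do { // Tab: expand it; must put >=1 space
                System.out.print(' ');
            } while (!tabpos(++col));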
        }
        System.out.println(  );
    }
}
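
The same line-at-a-time style can be applied to entab. What follows is only a rough sketch, not one of the original listings: the class name LineEntab and its methods are made up for illustration, and it computes equally spaced tab stops with a modulo test instead of consulting a Tabs object, so the MAXLINE cutoff in Tabs does not apply.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

/** Hypothetical line-at-a-time entab; not one of the book's listings. */
class LineEntab {

    /** Entab one line (without its terminator); stops every tabSpace columns. */
    static String entabLine(String line, int tabSpace) {
        StringBuilder out = new StringBuilder();
        int col = 0;      // 0-based column of the next character to emit
        int pending = 0;  // blanks read but not yet written
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == ' ') {
                pending++;
                if ((col + pending) % tabSpace == 0) {
                    // The run of blanks reached a tab stop: emit one tab instead.
                    out.append('\t');
                    col += pending;
                    pending = 0;
                }
            } else {
                // Blanks that stopped short of a tab stop go out unchanged.
                for (; pending > 0; pending--, col++) {
                    out.append(' ');
                }
                // As in Example 3-5, an input tab counts as an ordinary character.
                out.append(c);
                col++;
            }
        }
        for (; pending > 0; pending--) {
            out.append(' ');  // trailing blanks at end of line
        }
        return out.toString();
    }

    /** Entab a whole stream; readLine() strips the line terminator for us. */
    static void entab(BufferedReader in, Appendable out, int tabSpace)
            throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            out.append(entabLine(line, tabSpace)).append('\n');
        }
    }

    public static void main(String[] args) throws IOException {
        StringBuilder sb = new StringBuilder();
        entab(new BufferedReader(
                new StringReader("        x\r\nab      cd\n")), sb, 8);
        // Make the tabs visible by printing them as the two characters \t.
        System.out.print(sb.toString().replace("\t", "\\t"));
    }
}
```

Because readLine( ) discards the terminator, this sketch writes a plain newline after every line, so it normalizes line endings where the character-at-a-time version passes them through unchanged.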

The Tabs class provides two methods, settabs( ) and tabpos( ). Example 3-7 is the source for the Tabs class.

Example 3-7. Tabs.java

import com.darwinsys.util.Debug;

/** Basic tab-character handling stuff.
 * <p>
 * N.B. Can only handle equally-spaced tab stops as written.
 */
public class Tabs {
    /** tabs every so often */
    public final static int DEFTABSPACE =   8;
    /** the current tab stop setting. */
    protected int tabSpace = DEFTABSPACE;
    /** The longest line that we worry about tabs for. */
    public final static int MAXLINE  = 250;
    /** the current tab stops */
    protected boolean[] tabstops;

    /** Construct a Tabs object with a given tab stop setting */
    public Tabs(int n) {
        tabstops = new boolean[MAXLINE];
        tabSpace = n;
        settabs(  );
    }

    /** Construct a Tabs object with the default tab stop setting */
    public Tabs(  ) {
        tabstops = new boolean[MAXLINE];
        settabs(  );
    }

    /** settabs - set initial tab stops */
    public void settabs(  ) {
        int i;
        for (i = 0; i < tabstops.length; i++) {
            tabstops[i] = 0 == (i % tabSpace);
            Debug.println("settabs", "Tabs[" + i + "]=" + tabstops[i]);
        }
    }

    /** tabpos - returns true if given column is a tab stop.
     * If the current input line is too long, we just put tabs wherever;
     * no exception is thrown.
     * @param col the current column number
     */
    boolean tabpos(int col) {
        if (col > tabstops.length-1)
            return true;
        else 
            return tabstops[col];
    }
}
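
To make the tab stop rule concrete, here is a small self-contained sketch. TabStopDemo and its methods are hypothetical stand-ins written for this illustration, not part of the Tabs class: isTabStop( ) is meant to give the same answer as tabpos( ) for non-negative columns, and detabLine( ) uses the same do/while shape as Example 3-6 but returns a String instead of printing.

```java
/** Hypothetical stand-in for the Tabs rule, used only for this illustration. */
class TabStopDemo {
    /** Same limit as Tabs.MAXLINE in Example 3-7. */
    static final int MAXLINE = 250;

    /**
     * True if col is a tab stop: every tabSpace-th column, and every
     * column at or beyond MAXLINE, as tabpos( ) reports for long lines.
     */
    static boolean isTabStop(int col, int tabSpace) {
        return col >= MAXLINE || col % tabSpace == 0;
    }

    /** Expand the tabs in one line into blanks. */
    static String detabLine(String line, int tabSpace) {
        StringBuilder out = new StringBuilder();
        int col = 0;  // 0-based column of the next character to emit
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c != '\t') {
                out.append(c);  // ordinary character
                col++;
                continue;
            }
            do {                // tab: always emit at least one blank
                out.append(' ');
            } while (!isTabStop(++col, tabSpace));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Brackets make the expanded blanks easy to count.
        System.out.println("[" + detabLine("a\tb", 8) + "]");
        System.out.println("[" + detabLine("\tx", 4) + "]");
    }
}
```

With stops every 8 columns, the tab after "a" expands to seven blanks, so "b" lands in column 8. The arithmetic test only works because the stops are equally spaced; the boolean table in Tabs costs a little memory but could be filled with unequal stops later.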