12.3.1. Design of the Query Program

Image

A good way to start the design of a program is to list the program’s operations. Knowing what operations we need can help us see what data structures we’ll need. Starting from requirements, the tasks our program must do include the following:

• When it reads the input, the program must remember the line(s) in which each word appears. Hence, the program will need to read the input a line at a time and break up the lines from the input file into its separate words

• When it generates output,

– The program must be able to fetch the line numbers associated with a given word

– The line numbers must appear in ascending order with no duplicates

– The program must be able to print the text appearing in the input file at a given line number.

These requirements can be met quite neatly by using various library facilities:

• We’ll use a vector<string> to store a copy of the entire input file. Each line in the input file will be an element in this vector. When we want to print a line, we can fetch the line using its line number as the index.

• We’ll use an istringstream8.3, p. 321) to break each line into words.

• We’ll use a set to hold the line numbers on which each word in the input appears. Using a set guarantees that each line will appear only once and that the line numbers will be stored in ascending order.

• We’ll use a map to associate each word with the set of line numbers on which the word appears. Using a map will let us fetch the set for any given word.

For reasons we’ll explain shortly, our solution will also use shared_ptrs.

Data Structures

Although we could write our program using vector, set, and map directly, it will be more useful if we define a more abstract solution. We’ll start by designing a class to hold the input file in a way that makes querying the file easy. This class, which we’ll name TextQuery, will hold a vector and a map. The vector will hold the text of the input file; the map will associate each word in that file to the set of line numbers on which that word appears. This class will have a constructor that reads a given input file and an operation to perform the queries.

The work of the query operation is pretty simple: It will look inside its map to see whether the given word is present. The hard part in designing this function is deciding what the query function should return. Once we know that a word was found, we need to know how often it occurred, the line numbers on which it occurred, and the corresponding text for each of those line numbers.

The easiest way to return all those data is to define a second class, which we’ll name QueryResult, to hold the results of a query. This class will have a print function to print the results in a QueryResult.

Sharing Data between Classes

Our QueryResult class is intended to represent the results of a query. Those results include the set of line numbers associated with the given word and the corresponding lines of text from the input file. These data are stored in objects of type TextQuery.

Because the data that a QueryResult needs are stored in a TextQuery object, we have to decide how to access them. We could copy the set of line numbers, but that might be an expensive operation. Moreover, we certainly wouldn’t want to copy the vector, because that would entail copying the entire file in order to print (what will usually be) a small subset of the file.

We could avoid making copies by returning iterators (or pointers) into the TextQuery object. However, this approach opens up a pitfall: What happens if the TextQuery object is destroyed before a corresponding QueryResult? In that case, the QueryResult would refer to data in an object that no longer exists.

This last observation about synchronizing the lifetime of a QueryResult with the TextQuery object whose results it represents suggests a solution to our design problem. Given that these two classes conceptually “share” data, we’ll use shared_ptrs (§ 12.1.1, p. 450) to reflect that sharing in our data structures.

Using the TextQuery Class

When we design a class, it can be helpful to write programs using the class before actually implementing the members. That way, we can see whether the class has the operations we need. For example, the following program uses our proposed TextQuery and QueryResult classes. This function takes an ifstream that points to the file we want to process, and interacts with a user, printing the results for the given words:

void runQueries(ifstream &infile)
{
    // infile is an ifstream that is the file we want to query
    TextQuery tq(infile);  //  store the file and build the query map
    // iterate with the user: prompt for a word to find and print results
    while (true) {
        cout << "enter word to look for, or q to quit: ";
        string s;
        // stop if we hit end-of-file on the input or if a 'q' is entered
        if (!(cin >> s) || s == "q") break;
        // run the query and print the results
        print(cout, tq.query(s)) << endl;
    }
}

We start by initializing a TextQuery object named tq from a given ifstream. The TextQuery constructor reads that file into its vector and builds the map that associates the words in the input with the line numbers on which they appear.

The while loop iterates (indefinitely) with the user asking for a word to query and printing the related results. The loop condition tests the literal true2.1.3, p. 41), so it always succeeds. We exit the loop through the break5.5.1, p. 190) after the first if. That if checks that the read succeeded. If so, it also checks whether the user entered a q to quit. Once we have a word to look for, we ask tq to find that word and then call print to print the results of the search.


Exercises Section 12.3.1

Exercise 12.27: The TextQuery and QueryResult classes use only capabilities that we have already covered. Without looking ahead, write your own versions of these classes.

Exercise 12.28: Write a program to implement text queries without defining classes to manage the data. Your program should take a file and interact with a user to query for words in that file. Use vector, map, and set containers to hold the data for the file and to generate the results for the queries.

Exercise 12.29: We could have written the loop to manage the interaction with the user as a do while5.4.4, p. 189) loop. Rewrite the loop to use a do while. Explain which version you prefer and why.


..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.191.168.8