CHAPTER 13

image

Searching Tabular Data

In previous chapters, you saw how to create and use tabular data. This chapter returns to the issue of processing tabular data to examine the operation of the SEARCH and SEARCH ALL verbs. SEARCH is used for linear searches, and SEARCH ALL is used for binary searches.

The chapter begins by noting that when you use SEARCH or SEARCH ALL, the table they are searching must have an associated table index. You learn the metalanguage for the INDEXED BY clause used to specify the table index and explore the nature of the index data item. Because index data items can’t be manipulated by ordinary COBOL verbs, the chapter introduces the versions of the SET verb that are used to assign, increment, and decrement table index values.

With the background material covered, you see how the SEARCH verb operates on single-dimension tables and work through an example. The chapter highlights the limitations of the SEARCH with regard to searching multidimensional tables and suggests, and demonstrates, a solution.

The SEARCH verb searches a table serially. To search a table using a binary search, you must use SEARCH ALL. You learn that the SEARCH ALL can only work correctly if the table is ordered, and the chapter discusses the extension to the OCCURS clause that allows you to identify the data item on which the table is ordered. You see how a binary search works along with an example of the operation of SEARCH ALL.

Finally, the chapter introduces the notion of variable-length tables. Although variable-length tables are not truly dynamic, they are still useful, because the variable size limitations are obeyed by COBOL verbs such as the SEARCH and SEARCH ALL. You see this with an example program.

SEARCHING Tabular Data

The task of searching a table to determine whether it contains a particular value is a common operation. The method used to search a table depends heavily on the way the values are organized in the table. If the values are not ordered, then the only strategy available is a linear search. A linear search starts at the first element and then examines each succeeding element until the item is found or until the end of the table is reached (item not found). If the values are ordered, then you have the option of using either a linear search or a binary search. A binary search works by dividing the table in half and determining whether the item sought is in the top half of the table or the bottom half. This process continues until the item is found or it is determined that the item is not in the table.

COBOL has special verbs that let you search tables using either strategy. The SEARCH verb is used for linear searches, and the SEARCH ALL verb is used for binary searches.

Searching Using SEARCH and SEARCH ALL

One advantage of using SEARCHor SEARCH ALL rather than a handcrafted search is that because these are specialized instructions, their operation can be optimized. Part of that optimization involves creating a special subscript to be used when searching the table. You create this special subscript using an extension to the OCCURS clause called the INDEXED BY clause.

INDEXED BY Clause

Before you can use SEARCH or SEARCH ALL to search a table, you must define the table as having an index item associated with it. Using an index makes the searching more efficient. Because the index is linked to a particular table, the compiler—taking into account the size of the table—can choose the most efficient representation possible for the index. This speeds up the search.

The index is specified by the IndexName given in an INDEXED BY clause attached to the OCCURS clause. The extended OCCURS clause metalanguage is shown in Figure 13-1.

9781430262534_Fig13-01.jpg

Figure 13-1. OCCURS metalanguage, including the INDEXED BY clause

The following are some things to consider about Figure 13-1:

  • The index defined in a table declaration is associated with that table and is the subscript that SEARCH or SEARCH ALL uses to access the table.
  • The only entry that needs to be made for an IndexName is to use it in an INDEXED BY phrase. It does not require a PICTURE clause, because the compiler handles its declaration automatically.
  • Because of its special binary representation, the table index cannot be displayed, and its value cannot be manipulated using ordinary COBOL verbs such as MOVE, ADD, and SUBTRACT. Only four COBOL verbs can change the value of a table index: SEARCH, SEARCH ALL, PERFORM..VARYING, and SET.
  • Index names must be unique.
  • An index is only valid for the table to which it is bound. An index bound to one table cannot be used with another table.

Using SET to Manipulate the Table Index

A table index is a special data item. It has no PICTURE clause, it is associated with a particular table, and the compiler defines the index using the most computationally efficient representation possible. Because of its special binary representation, the table index cannot be displayed and can only be assigned a value, or have its value assigned, by the SET verb. Similarly, the SET verb must be used to increment or decrement the value of an index item.

The metalanguage for the formats of the SET verb that are used to manipulate the value of an index item are given in Figure 13-2.

9781430262534_Fig13-02.jpg

Figure 13-2. Metalanguage for SET formats used to manipulate index values

The SEARCH Verb

When the values in a table are not ordered, the only searching strategy available is a linear search. You start at the first element and then search through the table serially, element by element, until either you find the item you seek or you reach the end of the table. You use the SEARCH verb when you want to search a table serially. The metalanguage for the SEARCH verb is given in Figure 13-3.

9781430262534_Fig13-03.jpg

Figure 13-3. Metalanguage for the SEARCH verb

Note the following about Figure 13-3:

  • Before you can use SEARCH to search TableName, you must define a table index for the table in an INDEXED BY clause attached to the OCCURS clause that defines the table. The index specified in the INDEXED BY clause of TableName is the controlling index (subscript) of SEARCH. The controlling index controls the submission of the elements, or element items, for examination by the WHEN phrase of SEARCH. A SEARCH can have only one controlling index.
  • TableName must identify a data item in the table hierarchy with both OCCURS and INDEXED BY clauses.
  • SEARCH searches a table serially, starting at the element pointed to by the table index. This means the table index is under your control.
  • Because the table index is under your control, before SEARCH executes you must SET the table index to point to one of the elements in the table (usually the first element).
  • When SEARCH executes, the table index cannot have a value less than one or greater than the size of the table, or SEARCH will immediately terminate.
  • The VARYING phrase is used for a number of purposes:
    • When more than one index is attached to the table (note the ellipsis after IndexName in Figure 13-1), IndexItem identifies the IndexName that SEARCH uses as the table index.
    • When IndexItem is an index attached to another table or is a data item defined as USAGE IS INDEX, SEARCH increments the IndexItem at the same time and by the same amount as the table index.
    • When a non-index data item is used, SEARCH increments the data item by one each time it increments the table index.
  • If AT END is specified and the index is incremented beyond the highest legal occurrence for the table (that is, the item has not been found), then the statement following AT END is executed and SEARCH terminates.
  • The WHEN conditions attached to SEARCH are evaluated in turn. As soon as one is true, the statements following the WHEN phrase are executed, SEARCH ends, and the table index remains set at the value it had when the condition was satisfied.

SEARCH Examples

This section contains a number of examples that show how SEARCH is used. The section starts with a simple example by way of introduction and then ratchets up the complexity.

Letter Position Example

The example shown in Listing 13-1 uses SEARCH to discover the alphabet position of a letter entered by the user.

Listing 13-1. Finding the Position of a Letter in the Alphabet

IDENTIFICATION DIVISION.
PROGRAM-ID.  Listing13-1.
AUTHOR.  Michael Coughlan.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 LetterTable.
   02 TableValues.
      03 FILLER PIC X(13)
         VALUE "ABCDEFGHIJKLM".
      03 FILLER PIC X(13)
         VALUE "NOPQRSTUVWXYZ".
   02 FILLER REDEFINES TableValues.
      03 Letter PIC X OCCURS 26 TIMES
                      INDEXED BY LetterIdx.
 
01 IdxValue  PIC 99 VALUE ZEROS.
 
01 LetterIn  PIC X.
   88 ValidLetter VALUE "A" THRU "Z".
 
PROCEDURE DIVISION.
FindAlphabetLetterPosition.
   PERFORM WITH TEST AFTER UNTIL ValidLetter
      DISPLAY "Enter an uppercase letter please - " WITH NO ADVANCING
      ACCEPT LetterIn
   END-PERFORM
   SET LetterIdx TO 1
   SEARCH Letter
      WHEN Letter(LetterIdx) = LetterIn
          SET IdxValue TO LetterIdx
          DISPLAY LetterIn, " is in position ", IdxValue
   END-SEARCH
   STOP RUN.

9781430262534_unFig13-01.jpg

I use a loop to get a valid uppercase letter from the user. Because the loop will exit only when a valid letter has been entered, the AT END clause is not used in SEARCH because the letter is always found in the table.

LetterIdx is the table index. It is automatically incremented by SEARCH. Note how it is associated with the table by means of the INDEXED BY clause.

Before SEARCH executes, the SET verb is used to set the table index (LetterIdx) to the position in the table where I want SEARCH to start.

Finally, because LetterIdx is a special binary index item, you can’t display its value directly. So IdxValue, a numeric data item whose value can be displayed, is set to value of LetterIdx, and then IdxValue is displayed.

American States Example

The program in Listing 13-2 uses SEARCH to interrogate a table of American states, their ISO two-letter codes, and their capitals. The user is asked to choose the state code, state name, or state capital as their search term. Whichever is chosen, the program displays the other two. For instance, if the user chooses to search on the state code, then the program displays the state name and the state capital. If the state name is chosen, then the program displays the state code and state capital.

Listing 13-2. Given One of StateCode, StateName, or StateCapital, Display the Other Two

IDENTIFICATION DIVISION.
PROGRAM-ID.  Listing13-2.
AUTHOR.  Michael Coughlan.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 StatesTable.
   02 StateValues.
      03 FILLER PIC X(60)
         VALUE "ALAlabama       Montgomery    AKAlaska        Juneau".
      03 FILLER PIC X(60)
         VALUE "AZArizona       Phoenix       ARArkansas      Little Rock".
      03 FILLER PIC X(60)
         VALUE "CACalifornia    Sacramento    COColorado      Denver".
      03 FILLER PIC X(60)
         VALUE "CTConnecticut   Hartford      DEDelaware      Dover".
      03 FILLER PIC X(60)
         VALUE "FLFlorida       Tallahassee   GAGeorgia       Atlanta".
      03 FILLER PIC X(60)
         VALUE "HIHawaii        Honolulu      IDIdaho         Boise".
      03 FILLER PIC X(60)
         VALUE "ILIllinois      Springfield   INIndiana       Indianapolis".
      03 FILLER PIC X(60)
         VALUE "IAIowa          Des Moines    KSKansas        Topeka".
      03 FILLER PIC X(60)
         VALUE "KYKentucky      Frankfort     LALouisiana     Baton Rouge".
      03 FILLER PIC X(60)
         VALUE "MEMaine         Augusta       MDMaryland      Annapolis".
      03 FILLER PIC X(60)
         VALUE "MAMassachusetts Boston        MIMichigan      Lansing".
      03 FILLER PIC X(60)
         VALUE "MNMinnesota     Saint Paul    MSMississippi   Jackson".
      03 FILLER PIC X(60)
         VALUE "MOMissouri      Jefferson CityMTMontana       Helena".
      03 FILLER PIC X(60)
         VALUE "NENebraska      Lincoln       NVNevada        Carson City".
      03 FILLER PIC X(60)
         VALUE "NHNew Hampshire Concord       NJNew Jersey    Trenton".
      03 FILLER PIC X(60)
         VALUE "NMNew Mexico    Santa Fe      NYNew York      Albany".
      03 FILLER PIC X(60)
         VALUE "NCNorth CarolinaRaleigh       NDNorth Dakota  Bismarck".
      03 FILLER PIC X(60)
         VALUE "OHOhio          Columbus      OKOklahoma      Oklahoma City".
      03 FILLER PIC X(60)
         VALUE "OROregon        Salem         PAPennsylvania  Harrisburg".
      03 FILLER PIC X(60)
         VALUE "RIRhode Island  Providence    SCSouth CarolinaColumbia".
      03 FILLER PIC X(60)
         VALUE "SDSouth Dakota  Pierre        TNTennessee     Nashville".
      03 FILLER PIC X(60)
         VALUE "TXTexas         Austin        UTUtah          Salt Lake City".
      03 FILLER PIC X(60)
         VALUE "VTVermont       Montpelier    VAVirginia      Richmond".
      03 FILLER PIC X(60)
         VALUE "WAWashington    Olympia       WVWest Virginia Charleston".
      03 FILLER PIC X(60)
         VALUE "WIWisconsin     Madison       WYWyoming       Cheyenne".
   02 FILLER REDEFINES StateValues.
      03 State OCCURS 50 TIMES
               INDEXED BY StateIdx.
         04 StateCode    PIC XX.
         04 StateName    PIC X(14).
         04 StateCapital PIC X(14).
 
01 StateNameIn           PIC X(14).
 
01 StateCapitalIn        PIC X(14).
 
01 StateCodeIn           PIC XX.
 
01 SearchChoice          PIC 9 VALUE ZERO.
   88 ValidSearchChoice  VALUES 1, 2, 3, 4.
   88 EndOfInput         VALUE 4.
 
PROCEDURE DIVISION.
Begin.
   PERFORM WITH TEST AFTER UNTIL EndOfInput
      PERFORM WITH TEST AFTER UNTIL ValidSearchChoice
         DISPLAY SPACES
         DISPLAY "Search by StateCode (1), StateName (2), StateCapital (3), STOP (4) - "
                 WITH NO ADVANCING
         ACCEPT SearchChoice
      END-PERFORM
      SET StateIdx TO 1
      EVALUATE SearchChoice
         WHEN 1 PERFORM GetNameAndCapital
         WHEN 2 PERFORM GetCodeAndCapital
         WHEN 3 PERFORM GetCodeAndName
      END-EVALUATE
   END-PERFORM
   STOP RUN.
 
GetNameAndCapital.
   DISPLAY "Enter the two letter State Code - " WITH NO ADVANCING
   ACCEPT StateCodeIn
   MOVE FUNCTION UPPER-CASE(StateCodeIn) TO StateCodeIn
   SEARCH State
       AT END DISPLAY "State code " StateCodeIn " does not exist"
       WHEN StateCode(StateIdx) = StateCodeIn
            DISPLAY "State Name    = " StateName(StateIdx)
            DISPLAY "State Capital = " StateCapital(StateIdx)
   END-SEARCH.
 
GetCodeAndCapital.
   DISPLAY "Enter the State Name - " WITH NO ADVANCING
   ACCEPT StateNameIn
   SEARCH State
       AT END DISPLAY "State Name " StateNameIn " does not exist"
       WHEN FUNCTION UPPER-CASE(StateName(StateIdx)) = FUNCTION UPPER-CASE(StateNameIn)
            DISPLAY "State Code    = " StateCode(StateIdx)
            DISPLAY "State Capital = " StateCapital(StateIdx)
   END-SEARCH.
                          
GetCodeAndName.
   DISPLAY "Enter the State Capital - " WITH NO ADVANCING
   ACCEPT StateCapitalIn
   SEARCH State
       AT END DISPLAY "State capital " StateCapitalIn " does not exist"
       WHEN FUNCTION UPPER-CASE(StateCapital(StateIdx)) = FUNCTION UPPER-CASE(StateCapitalIn)
            DISPLAY "State Code = " StateCode(StateIdx)
            DISPLAY "State Name = " StateName(StateIdx)
   END-SEARCH.

9781430262534_unFig13-02.jpg

The program contains a table prefilled with the state codes, state names, and state capitals of the American states. The user provides any one of the three (state code, state name, or state capital), and SEARCH returns the other two from the table.

Most of the program is straightforward and doesn’t require any explanation. However, each of the three paragraphs GetNameAndCapital, GetCodeAndCapital, and GetCodeAndName makes use of intrinsic functions. You have not encountered intrinsic functions yet. You won’t examine them formally until Chapter 15, but I have introduced them here by way of a preview.

A function is a closed subroutine (block of code) that substitutes a returned value for its invocation. In Java, a method with a non-void return value type is a function. COBOL does not have user-defined functions, but it does have a number of built-in system functions called intrinsic functions.

The problem with using user input for comparison purposes is that you have to compare like with like. Alaska is not the same as alaska or ALASKA or aLaska. In GetCodeAndCapital and GetCodeAndName, the intrinsic function UPPER-CASE is used to convert the table data item and the data entered by the user to uppercase to ensure that the program is comparing like with like. In the comparison, the intrinsic function invocation is replaced by the returned function result, and then the comparison is done. For instance, an IF statement such as

IF FUNCTION UPPER-CASE("rEdMond") = FUNCTION UPPER-CASE("REDmond")
 
becomes
 
IF "REDMOND" = "REDMOND"

In GetNameAndCapital, I could have used the intrinsic function the same way as in GetCodeAndCapital and GetCodeAndName; but because the state code is already in uppercase, I took the opportunity to show another way of using intrinsic functions. In this paragraph, I use the intrinsic function to convert the user input to uppercase by moving the converted input data back into the input data item StateCodeIn.

Searching Multidimensional Tables

In the notes on SEARCH, I observed that SEARCH can have only one controlling index (the IndexName specified in the INDEXED BY phrase attached to the table being searched). Because SEARCH can have only one controlling index, SEARCH can only be used to search a single dimension of a table at a time. If the table to be searched is multidimensional, then you must control the indexes of the other dimensions.

Listing 13-3 is a small program that demonstrates how to use SEARCH to search a two-dimensional table. The program sets Appointment(3, 2) and Location(3, 2) to “Peter’s Wedding” and “Saint John’s Church”. SEARCH is then used to search the appointments timetable for the appointment details of “Peter’s Wedding”. When found, these details are displayed.

Listing 13-3. Program Demonstrating How to Search a Two-Dimensional Table

IDENTIFICATION DIVISION.
PROGRAM-ID. Listing13-3.
AUTHOR. Michael Coughlan.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 MyTimeTable.
   02 DayOfApp OCCURS 5 TIMES INDEXED BY DayIdx.
      03 HourOfApp OCCURS 9 TIMES INDEXED BY HourIdx.
         04 Appointment      PIC X(15).
         04 Location         PIC X(20).
            
01 AppointmentType           PIC X(15).
 
01 DaySub                    PIC 9.
01 HourSub                   PIC 9.
 
01 FILLER                    PIC 9 VALUE ZERO.
   88 AppointmentNotFound    VALUE ZERO.
   88 AppointmentFound       VALUE 1.
 
01 DayValues VALUE "MonTueWedThuFri".
   02 DayName   PIC XXX OCCURS 5 TIMES.
 
01 TimeValues  VALUE " 9:0010:0011:0012:0013:0014:0015:0016:0017:00".
   02 TimeValue   PIC X(5) OCCURS 9 TIMES.
          
PROCEDURE DIVISION.
Begin.
   MOVE "Peter's Wedding" TO AppointmentType, Appointment(2, 3)
   MOVE "Saint John's Church" TO Location(2, 3)
   SET DayIdx TO 1.
   PERFORM UNTIL AppointmentFound OR DayIdx > 5
      SET HourIdx TO 1
      SEARCH HourOfApp
         AT END SET DayIdx UP BY 1
         WHEN AppointmentType = Appointment(DayIdx, HourIdx)
              SET AppointmentFound TO TRUE
              SET HourSub TO HourIdx
              SET DaySub TO DayIdx
              DISPLAY AppointmentType " is on " DayName(DaySub)
              DISPLAY "at " TimeValue(HourSub) " in " Location(DayIdx, HourIdx)
     END-SEARCH
   END-PERFORM
   IF AppointmentNotFound
      DISPLAY "Appointment " AppointmentType " was not in the timetable"
   END-IF
   STOP RUN.

9781430262534_unFig13-03.jpg

The table used to hold the appointment timetable is described in Example 13-1 and is graphically depicted in Figure 13-4.

Example 13-1. Declarations for the Table Used to Record Appointments

01 MyTimeTable.
   02 DayOfApp OCCURS 5 TIMES INDEXED BY DayIdx.
      03 HourOfApp OCCURS 9 TIMES INDEXED BY HourIdx.
         04 Appointment      PIC X(15) VALUE SPACES.
         04 Location         PIC X(20) VALUE SPACES.

9781430262534_Fig13-04.jpg

Figure 13-4. Graphical depiction of the two-dimensional table MyTimeTable

As you can see by examining Figure 13-4, in this two-dimensional table each day element consists of a table of hour elements. The table of hour elements is the SEARCH target. In most searches of a multidimensional table, the SEARCH target is the lowest data item in the hierarchy that contains both a OCCURS and an INDEXED BY clause. In this case, HourOfApp is the SEARCH target. In SEARCH, the controlling index is the item attached to the target table by an INDEXED BY clause. In this case, it is HourIdx.

Because SEARCH can have only one controlling index, you have to control the other. In this program, the SET verb is used to control the value in DayIdx, and HourIdx is under the control of SEARCH.

When SEARCH executes, it searches whichever of the HourOfApp tables is pointed to by DayIdx. If the appointment is not found, AT END activates, DayIdx is incremented, and SEARCH is executed again, this time examining the HourOfApp table in the next DayOfApp element. If the appointment is found, WHEN activates, and the HourIdx and DayIdx values are used to display the time (24-hour format) and day of the appointment. DayName and TimeValue are set up using the facility introduced in the ANS 85 version of COBOL that allows you to create prefilled tables without using the REDEFINES clause.

Searching the First Dimension of a Two-Dimensional Table

Listing 13-3 uses SEARCH to search the second dimension of the two-dimensional table. It does not normally make sense to search the first dimension of a two-dimensional table, because as you can see from Figure 13-4, each element at that level contains a table, not a discrete value. Sometimes, though, you need to perform such a search.

Suppose you have a two-dimensional table that records the number of jeans sold in three different colors in 150 shops. Suppose for each group of jeans sales totals, you also record the shop name. The table to record this information is described in Example 13-2 and is graphically depicted in Figure 13-5.

Example 13-2. Description of JeansSalesTable

01 JeansSalesTable.
   02 Shop OCCURS 150 TIMES INDEXED BY ShopIdx.
      03 ShopName        PIC X(15).
      03 JeansColor OCCURS 3 TIMES INDEXED BY ColorIdx.
         04 TotalSold        PIC 9(5).

9781430262534_Fig13-05.jpg

Figure 13-5. Graphical depiction of JeansSalesTable

Listing 13-4 is a simple program that shows how to use SEARCH to search the first dimension of a two-dimensional table. To keep the program simple, I haven’t filled the table with all the values shown in Figure 13-5; only the element Shop(3) is filled with data values.

Listing 13-4. Searching the First Dimension of a Two-Dimensional Table

IDENTIFICATION DIVISION.
PROGRAM-ID. Listing13-4.
AUTHOR. Michael Coughlan.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 JeansSalesTable.
   02 Shop OCCURS 150 TIMES INDEXED BY ShopIdx.
      03 ShopName        PIC X(15) VALUE SPACES.
      03 JeansColor OCCURS 3 TIMES.
         04 TotalSold    PIC 9(5)  VALUE ZEROS.
            
01 ShopQuery            PIC X(15).
 
01 PrnWhiteJeans.
   02 PrnWhiteTotal     PIC ZZ,ZZ9.
   02 FILLER            PIC X(12) VALUE " white jeans".
    
01 PrnBlueJeans.
   02 PrnBlueTotal      PIC ZZ,ZZ9.
   02 FILLER            PIC X(12) VALUE " blue  jeans".
    
01 PrnBlackJeans.
   02 PrnBlackTotal     PIC ZZ,ZZ9.
   02 FILLER            PIC X(12) VALUE " black jeans".
 
PROCEDURE DIVISION.
Begin.
    MOVE "Jean Therapy" TO ShopName(3), ShopQuery
    MOVE 00734 TO TotalSold(3, 1)
    MOVE 04075 TO TotalSold(3, 2)
    MOVE 01187 TO TotalSold(3, 3)
    
    SET ShopIdx TO 1
    SEARCH Shop
       AT END Display "Shop not found"
       WHEN ShopName(ShopIdx) = ShopQuery
           MOVE TotalSold(ShopIdx, 1) TO PrnWhiteTotal
           MOVE TotalSold(ShopIdx, 2) TO PrnBlueTotal
           MOVE TotalSold(ShopIdx, 3) TO PrnBlackTotal
           DISPLAY "Sold by " ShopQuery
           DISPLAY PrnWhiteJeans
           DISPLAY PrnBlueJeans
           DISPLAY PrnBlackJeans
    END-SEARCH
    STOP RUN.

9781430262534_unFig13-04.jpg

The SEARCH ALL Verb

As I noted earlier in this chapter, the method used to search a table depends heavily on the way the values are organized in the table. If the values are not ordered, then the only strategy available is a linear search. If the values are ordered, then you have the option of using either a linear search or a binary search. This section introduces SEARCH All, the COBOL verb used for binary searches.

Because SEARCH ALL implements a binary search, it only works on an ordered table. The table must be ordered on the values in the element or, where the element is a group item, on a data item within the element. The item on which the table is ordered is known as the key field and is identified using the KEY IS phrase in the table declaration.

KEY IS Clause

The KEY IS clause is used to identify the data item on which the table to be searched is ordered. If you want to search a table using SEARCH ALL, the table declaration must contain a KEY IS phrase. The OCCURS metalanguage that includes the KEY IS clause is given in Figure 13-6.

9781430262534_Fig13-06.jpg

Figure 13-6. OCCURS metalanguage including the KEY IS clause

How a Binary Search Works

I have mentioned that SEARCH ALL implements a binary search. Before discussing SEARCH ALL itself, let’s take the time to refresh your memory about how a binary search works.

A binary search works by repeatedly dividing the search area into a top half and a bottom half, deciding which half contains the required item, and making that half the new search area. The search continues halving the search area like this until the required item is found or the search discovers that the item is not in the table.

The algorithm for a binary search is given in Example 13-3.

Example 13-3. Binary Search Algorithm

PERFORM UNTIL ItemFound OR ItemNotInTable
   COMPUTE Middle = (Lower + Upper) / 2
   EVALUATE TRUE
      WHEN Lower > Upper THEN SET ItemNotInTable TO TRUE
      WHEN KeyField(Middle) < SearchItem THEN Lower = Middle + 1
      WHEN KeyField(Middle) > SearchItem THEN Upper = Middle -1      WHEN KeyField(Middle) = SearchItem THEN SET ItemFound TO TRUE
   END-EVALUATE
END-PERFORM

To illustrate how this algorithm works, let’s consider it in the context of a table containing the letters of the alphabet. The table holding the letters is described in Example 13-4, and its representation in memory is illustrated pictorially in Figure 13-7.

Example 13-4. Table Prefilled with the Letters of the Alphabet

01  LetterTable.
    02 LetterValues.
       03 FILLER PIC X(13)
          VALUE "ABCDEFGHIJKLM".
       03 FILLER PIC X(13)
          VALUE "NOPQRSTUVWXYZ".
    02 FILLER REDEFINES LetterValues.
       03 Letter PIC X OCCURS 26 TIMES
                       ASCENDING KEY IS Letter
                       INDEXED BY LetterIdx.

9781430262534_Fig13-07.jpg

Figure 13-7. Table containing the letters of the alphabet

Suppose you want to search LetterTable to find the position of the letter R. The general binary search algorithm introduced in Example 13-3 can be made more specific to the problem, as shown in Example 13-5. Figure 13-8 shows a succession of diagrams illustrating the application of this algorithm.

Example 13-5. Binary Search to Find the Letter R

PERFORM UNTIL ItemFound OR ItemNotInTable
   COMPUTE Middle = (Lower + Upper) / 2
   EVALUATE TRUE
      WHEN Lower > Upper THEN SET ItemNotInTable TO TRUE
      WHEN Letter(Middle) < "R" THEN Lower = Middle + 1
      WHEN Letter(Middle) > "R"  THEN Upper = Middle -1
      WHEN Letter(Middle) = "R"  THEN SET ItemFound TO TRUE

   END-EVALUATE
END-PERFORM

9781430262534_Fig13-08.jpg

Figure 13-8. Finding the letter R using a binary search

SEARCH ALL

The metalanguage for SEARCH ALL is given in Figure 13-9.

9781430262534_Fig13-09.jpg

Figure 13-9. Metalanguage for SEARCH ALL

Consider the following:

  • The OCCURS clause of the table to be searched must have a KEY IS clause in addition to an INDEXED BY clause. The KEY IS clause identifies the data item on which the table is ordered.
  • When you use SEARCH ALL, you do not need to set the table index to a starting value because SEARCH ALL controls it automatically.
  • ElementIdentifier must be the item referenced by the table’s KEY IS clause.
  • ConditionName may have only one value, and it must be associated with a data item referenced by the table’s KEY IS clause.

image Bug Alert  SEARCH ALL presents no problems when the table is fully loaded (all the elements have been assigned data values); but if the table is not fully loaded, then SEARCH ALL may not function correctly because the values in the unloaded part of the table will not be in key order. To rectify this problem, before you load the table you should fill it with HIGH-VALUES if the key is ascending or LOW-VALUES if the key is descending. See Listing 13-5 for an example.

Listing 13-5 is a simple program that displays the country name when the user enters a two-letter Internet code. Unlike the other example programs in this chapter, the table data in Listing 13-5 is obtained from a file that is loaded into the table at runtime. This is a much more realistic scenario for any sort of volatile data. Countries come and go and change their names with sufficient frequency that loading the table from a file makes good sense from a maintenance perspective. The table can hold up to 250 countries, but the country-code file contains only 243 entries at present.

Listing 13-5. Displaying the Corresponding Country Name When the User Enters a Country Code

IDENTIFICATION DIVISION.
PROGRAM-ID. Listing13-5.
AUTHOR. Michael Coughlan.
 
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
       SELECT CountryCodeFile ASSIGN TO "Listing13-5.dat"
                 ORGANIZATION IS LINE SEQUENTIAL.
                  
DATA DIVISION.
FILE SECTION.
FD CountryCodeFile.
01 CountryCodeRec.
   88 EndOfCountryCodeFile VALUE HIGH-VALUES.
   02 CountryCodeCF    PIC XX.
   02 CountryNameCF    PIC X(25).
 
WORKING-STORAGE SECTION.
01 CountryCodeTable.
   02 Country OCCURS 300 TIMES
              ASCENDING KEY IS CountryCode
              INDEXED BY Cidx.
      03 CountryCode   PIC XX.
      03 CountryName   PIC X(25).
      
01 CountryCodeIn       PIC XX.
   88 EndOfInput       VALUE SPACES.
    
01 FILLER              PIC 9 VALUE ZERO.
   88 ValidCountryCode VALUE 1.
 
PROCEDURE DIVISION.
Begin.
    PERFORM LoadCountryCodeTable
    PERFORM WITH TEST AFTER UNTIL EndOfInput
       PERFORM WITH TEST AFTER UNTIL ValidCountryCode OR EndOfInput
           DISPLAY "Enter a country code (space to stop) :- "
                    WITH NO ADVANCING

           ACCEPT CountryCodeIn
           SEARCH ALL Country
               AT END IF NOT EndOfInput
                         DISPLAY "Country code " CountryCodeIn " is not valid"
                      END-IF
               WHEN CountryCode(Cidx) =  FUNCTION UPPER-CASE(CountryCodeIn)
               DISPLAY CountryCodeIn " is " CountryName(Cidx)
           END-SEARCH
           DISPLAY SPACES
       END-PERFORM
    END-PERFORM
    STOP RUN.
 
LoadCountryCodeTable.
* Loads table with HIGH-VALUES so the SEARCH ALL works when the table is partially loaded
    MOVE HIGH-VALUES TO CountryCodeTable
    OPEN INPUT CountryCodeFile
    READ CountryCodeFile
       AT END SET EndOfCountryCodeFile TO TRUE
    END-READ
 
    PERFORM VARYING Cidx FROM 1 BY 1 UNTIL EndOfCountryCodeFile
        MOVE CountryCodeRec TO Country(Cidx)
        READ CountryCodeFile
           AT END SET EndOfCountryCodeFile TO TRUE
        END-READ
    END-PERFORM
    CLOSE CountryCodeFile.

9781430262534_unFig13-05.jpg

Variable-Length Tables

All the examples you have seen so far have used fixed-length tables. You may have wondered if COBOL supports variable-length tables. The answer is that it does support variable-length tables—of a sort.

You can declare variable-length tables using extensions to the OCCURS clause, as shown in Figure 13-10. Although you can dynamically alter the number of element occurrences in a variable-length table, the amount of storage allocated is fixed. It is defined by the value of LargestSize#i and is assigned at compile time. Standard COBOL has no mechanism for dynamically changing the amount of storage allocated to a table.

9781430262534_Fig13-10.jpg

Figure 13-10. Full OCCURS metalanguage, including the entries required for variable-length tables and the SEARCH and SEARCH ALL verbs

Note that this format of the OCCURS clause may only be used to vary the number of elements in the first dimension of a table.

An example declaration is shown in Example 13-6.

Example 13-6. Example Variable-Length Table Declaration

01 BooksReservedTable.
   02 BookId     PIC 9(7) OCCURS 1 TO 10
                 DEPENDING ON NumOfReservations.

The program in Listing 13-5 fills the table with HIGH-VALUES in order to get SEARCH ALL to work correctly, because the table was only partially populated (250 elements in size but only 243 countries). You could achieve the same effect by declaring the table as a variable-length table.

Although variable-length tables are not dynamic (the storage allocated is defined by the table’s maximum size), they are treated by COBOL verbs as if they were dynamic. For instance, when you use SEARCH or SEARCH ALL with the table, only the elements between SmallestSize#i and TableSize#i are interrogated.

Listing 13-6 revisits the program from Listing 13-5 to emphasize these points.

Listing 13-6. SEARCH ALL Used with a Variable-Length Table

IDENTIFICATION DIVISION.
PROGRAM-ID. Listing13-6.
AUTHOR. Michael Coughlan.
 
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
       SELECT CountryCodeFile ASSIGN TO "Listing13-6.dat"
                 ORGANIZATION IS LINE SEQUENTIAL.
                  
DATA DIVISION.
FILE SECTION.
FD CountryCodeFile.
01 CountryCodeRec.
   88 EndOfCountryCodeFile VALUE HIGH-VALUES.
   02 CountryCodeCF    PIC XX.
   02 CountryNameCF    PIC X(25).
 
WORKING-STORAGE SECTION.
01 CountryCodeTable.
   02 Country OCCURS 1 TO 300 TIMES
              DEPENDING ON NumberOfCountries
              ASCENDING KEY IS CountryCode
              INDEXED BY Cidx.
      03 CountryCode   PIC XX.
      03 CountryName   PIC X(25).
      
01 CountryCodeIn       PIC XX.
   88 EndOfInput       VALUE SPACES.
    
01 FILLER              PIC 9 VALUE ZERO.
   88 ValidCountryCode VALUE 1.
    
01 NumberOfCountries   PIC 999.
 
PROCEDURE DIVISION.
Begin.
    PERFORM LoadCountryCodeTable
    PERFORM WITH TEST AFTER UNTIL EndOfInput
       PERFORM WITH TEST AFTER UNTIL ValidCountryCode OR EndOfInput
           DISPLAY "Enter a country code (space to stop) :- "
                   WITH NO ADVANCING
           ACCEPT CountryCodeIn
           PERFORM SearchCountryCodeTable
           DISPLAY SPACES
       END-PERFORM
    END-PERFORM
    MOVE 244 TO NumberOfCountries
    MOVE "ZZ" TO CountryCodeIn
    PERFORM SearchCountryCodeTable
    STOP RUN.
    
SearchCountryCodeTable.
    SEARCH ALL Country
        AT END IF NOT EndOfInput
                  DISPLAY "Country code " CountryCodeIn " is not valid"
               END-IF
        WHEN CountryCode(Cidx) =  FUNCTION UPPER-CASE(CountryCodeIn)
             DISPLAY CountryCodeIn " is " CountryName(Cidx)
    END-SEARCH.
 
LoadCountryCodeTable.
    OPEN INPUT CountryCodeFile
    READ CountryCodeFile
    AT END SET EndOfCountryCodeFile TO TRUE
    END-READ
 
    PERFORM VARYING NumberOfCountries FROM 1 BY 1 UNTIL EndOfCountryCodeFile
       MOVE CountryCodeRec TO Country(NumberOfCountries)
       READ CountryCodeFile
          AT END SET EndOfCountryCodeFile TO TRUE
       END-READ
    END-PERFORM
    MOVE "ZZ **** FOUND ****" TO Country(244)
    CLOSE CountryCodeFile.

9781430262534_unFig13-06.jpg

The program in Listing 13-6 is the same as Listing 13-5, with the following changes:

  • Variable-length tables are used. When the country data is loaded into the table from the file, the table increases in size as each record is read (see VARYING NumberOfCountries).
  • Once the table has been loaded from the file, the value "ZZ **** FOUND ****" is loaded into element 244. The purpose of this is to prove that SEARCH ALL recognizes the table size specified in NumberOfCountries.
  • In this program, I moved SEARCH ALL to its own paragraph because I need to use it in two different parts of the program and I don’t want to repeat the code.
  • When the program runs, the user enters a number of country codes, and the country names are returned. Note that this all works correctly even though the table has not had HIGH-VALUES moved to it.
  • Then user enters ZZ. This is the country code of the entry I placed beyond the end of the table (as identified by NumberOfCountries). SEARCH ALL reports that it can’t find this country code.
  • When the loop exits, the program increases NumberOfCountries to 244, and the search is attempted again. This time SEARCH ALL does find the ZZ country code, because this time the code is the table.

Summary

This chapter examined SEARCH and SEARCH ALL, the COBOL verbs that allow you to search tabular data. The chapter introduced the INDEXED BY and KEY IS extensions to the OCCURS clause. These extensions are required when you want to use SEARCH and SEARCH ALL to search a table. You saw how to use SEARCH for linear searches of single-dimension tables and how, by controlling one of the indexes yourself, you can even use SEARCH to search a multidimensional table. The chapter showed how a binary search works and demonstrated how to use SEARCH ALL to search a table. Finally, you were introduced to the topic of variable-length tables and learned to declare and use them.

The next chapter introduces the SORT and MERGE verbs. SORT is generally used to sort files, but it may also be used to sort a table. As I noted when discussing sequential files in Chapter 7, many operations on sequential files are not possible unless the files are ordered. For this reason, many programs begin by sorting the file into the required order.

PROGRAMMING EXERCISE 1

Prepare your 2B pencil; it’s exercise time again. In this program, you will use your knowledge of variable-length tables and the SEARCH verb.

A program is required that will report the frequency of all the words in a document. To make the problem easier, the document has been split into individual words, one word per record. The program should be able to report on a maximum of 1,000 words.

The document words are held in an unordered sequential file called DocWords.dat. Each record has the following description:

image

Write a program to read a file of document words and produce a report that shows the top ten words in descending order of frequency. The report template is as follows:

Top Ten Words In Document
Pos   Occurs    Document Word
1.     XXX     XXXXXXXXXXXXXXXXXXXX
2.     XXX     XXXXXXXXXXXXXXXXXXXX
3.     XXX     XXXXXXXXXXXXXXXXXXXX
4.     XXX     XXXXXXXXXXXXXXXXXXXX
5.     XXX     XXXXXXXXXXXXXXXXXXXX
6.     XXX     XXXXXXXXXXXXXXXXXXXX
7.     XXX     XXXXXXXXXXXXXXXXXXXX
8.     XXX     XXXXXXXXXXXXXXXXXXXX
9.     XXX     XXXXXXXXXXXXXXXXXXXX
10.     XXX     XXXXXXXXXXXXXXXXXXXX

PROGRAMMING EXERCISE 2

The task in this exercise is to write a program that accepts ten numbers from the user, places them in a table, and then detects and reports on the following states:

  • No zeros found in the table
  • Only one zero found in the table
  • Two zeros found, but no numbers between the two zeros
  • Two zeros, and between them an even number of non-zeros
  • Two zeros, and between them an odd number of non-zeros

PROGRAMMING EXERCISE 1: ANSWER

Listing 13-7. Program to Find the Top Ten Words in a Document

IDENTIFICATION DIVISION.
PROGRAM-ID. Listing13-7.
AUTHOR. Michael Coughlan.

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
       SELECT DocWordsFile ASSIGN TO "Listing13-7.DAT"
                 ORGANIZATION IS LINE SEQUENTIAL.
                  
DATA DIVISION.
FILE SECTION.
FD DocWordsFile.
01 WordIn               PIC X(20).
   88 EndOfDocWordsFile VALUE HIGH-VALUES.

WORKING-STORAGE SECTION.
01 WordFreqTable.
   02 Word OCCURS 0 TO 2000 TIMES
              DEPENDING ON NumberOfWords
              INDEXED BY Widx.
      03 WordFound    PIC X(20).
      03 WordFreq     PIC 9(3).
      
01 TopTenTable.
   02 WordTT  OCCURS 11 TIMES
               INDEXED BY TTidx.
      03 WordFoundTT  PIC X(20) VALUE SPACES.
      03 WordFreqTT   PIC 9(3)  VALUE ZEROS.

01 NumberOfWords       PIC 9(4) VALUE ZERO.

01 ReportHeader        PIC X(27) VALUE "  Top Ten Words In Document".

01 SubjectHeader       PIC X(29) VALUE "Pos   Occurs    Document Word".

01 DetailLine.
   02 PrnPos           PIC Z9.
   02 FILLER           PIC X VALUE ".".
   02 PrnFreq          PIC BBBBBZZ9.
   02 PrnWord          PIC BBBBBX(20).

01 Pos                 PIC 99.

PROCEDURE DIVISION.
Begin.
    OPEN INPUT DocWordsFile
    READ DocWordsFile
       AT END SET EndOfDocWordsFile TO TRUE
    END-READ
    PERFORM LoadWordFreqTable UNTIL EndOfDocWordsFile
    PERFORM FindTopTenWords
            VARYING Widx FROM 1 BY 1 UNTIL Widx > NumberOfWords
    PERFORM DisplayTopTenWords
    CLOSE DocWordsFile
    STOP RUN.
    
LoadWordFreqTable.
* The AT END triggers when Widx is one greater than the current size of the
* table so all we have to do is extend the table and write into the new table
* element
    SET Widx TO 1
    SEARCH Word
       AT END ADD 1 TO NumberOfWords
              MOVE 1 TO WordFreq(Widx)
              MOVE FUNCTION LOWER-CASE(WordIn) TO WordFound(Widx)
       WHEN   FUNCTION LOWER-CASE(WordIn) = WordFound(Widx)
              ADD 1 TO WordFreq(Widx)
    END-SEARCH
    READ DocWordsFile
       AT END SET EndOfDocWordsFile TO TRUE
    END-READ.
    
FindTopTenWords.
   PERFORM VARYING TTidx FROM 10 BY -1 UNTIL TTidx < 1
      IF WordFreq(Widx) > WordFreqTT(TTidx)
         MOVE WordTT(TTidx) TO WordTT(TTidx + 1)
         MOVE Word(Widx) TO WordTT(TTidx)
      END-IF
   END-PERFORM.

DisplayTopTenWords.
   DISPLAY ReportHeader
   DISPLAY SubjectHeader
   PERFORM  VARYING TTidx FROM 1 BY 1 UNTIL TTIdx > 10
      SET Pos TO TTidx
      MOVE Pos TO PrnPos
      MOVE WordFoundTT(TTidx) TO PrnWord
      MOVE WordFreqTT(TTidx) TO PrnFreq
      DISPLAY DetailLine
   END-PERFORM

9781430262534_unFig13-07.jpg

PROGRAMMING EXERCISE 2: ANSWER

Listing 13-8. Program to Find the Number of Zeros in a List of Ten Numbers

IDENTIFICATION DIVISION.
PROGRAM-ID. Listing13-8.
AUTHOR. Michael Coughlan.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 NumberArray.
   02 Num PIC 99 OCCURS 10 TIMES
                 INDEXED BY Nidx.
  
01 FirstZeroPos            PIC 99 VALUE ZERO.
   88 NoZeros              VALUE 0.
  
01 SecondZeroPos           PIC 99 VALUE ZERO.
   88 OneZero              VALUE 0.
  
01 ValuesBetweenZeros      PIC 9 VALUE ZERO.
   88 NoneBetweenZeros     VALUE 0.

PROCEDURE DIVISION.
Begin.
  DISPLAY "Enter 10 two digit numbers "
  PERFORM VARYING Nidx FROM 1 BY 1 UNTIL Nidx > 10
     DISPLAY "Enter number - "  SPACE WITH NO ADVANCING
     ACCEPT Num(Nidx)
  END-PERFORM

  SET Nidx TO 1
  SEARCH Num
    AT END SET NoZeros TO TRUE
    WHEN Num(Nidx) = ZERO
     SET FirstZeroPos TO Nidx
     SET Nidx UP BY 1
     SEARCH Num
       AT END SET OneZero TO TRUE
       WHEN Num(Nidx) = ZERO
          SET SecondZeroPos TO Nidx
          COMPUTE ValuesBetweenZeros = (SecondZeroPos - 1) - FirstZeroPos
     END-SEARCH
  END-SEARCH
  
  EVALUATE TRUE
    WHEN NoZeros    DISPLAY "No zeros found"
    WHEN OneZero    DISPLAY "Only one zero found"
    WHEN NoneBetweenZeros DISPLAY "No numbers between the two zeros"
    WHEN FUNCTION REM(ValuesBetweenZeros, 2)= ZERO
                    DISPLAY "Even number of non-zeros between zeros"
    WHEN OTHER      DISPLAY "Odd number of non-zeros between zeros"
  END-EVALUATE
  STOP RUN.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.191.44.94