CHAPTER 2: COBOL Foundation

This chapter presents some of the foundational material you require before you can write COBOL programs. It starts by identifying some elements of COBOL that programmers of other languages find idiosyncratic and it explains the reasons for them. You’re then introduced to the unusual syntax notation (called metalanguage) used to describe COBOL verbs and shown some examples.

COBOL programs have to conform to a fairly rigid hierarchical structure. This chapter introduces the structural elements and explains how each fits into the overall hierarchy. Because the main structural element of a COBOL program is the division, you spend some time learning about the function and purpose of each of the four divisions.

COBOL programs, especially in restrictive coding shops, are required to conform to a number of coding rules. These rules are explained and placed in their historical context.

The chapter discusses the details of name construction; but because name construction is about more than just the mechanics, you also learn about the importance of using descriptive names for both data items and blocks of executable code. The importance of code formatting for visualizing data hierarchy and statement scope is also discussed.

To whet your appetite for what is coming in the succeeding chapters, the chapter includes a number of small example programs and gives brief explanations. The chapter ends by listing the most important COBOL compilers, both free and commercial, available for Windows and UNIX.

COBOL Idiosyncrasies

COBOL is one of the oldest programming languages still in use. As a result, it has some idiosyncrasies, which programmers used to other languages may find irritating. One of the design goals of COBOL was to assist readability by making the language as English-like as possible.¹ As a consequence, the structural concepts normally associated with English prose, such as division, section, paragraph, sentence, verb, and so on, are used in COBOL programs. To further aid readability, the concept of noise words was introduced. Noise words are words in a COBOL statement that have no semantic content and are used only to enhance readability by making the statement more English-like.

One consequence of these design decisions is that the COBOL reserved-word list is extensive and contains many hundreds of entries. The reserved words themselves also tend to be long, with words like UNSTRING, EVALUATE, and PERFORM being typical. The English-like structure, the long reserved words, and the noise words make COBOL programs seem verbose, especially when compared to languages such as C.
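To make the idea of noise words concrete, here is a minimal sketch that writes the same test twice, once with the optional words IS, THAN, and THEN and once without them. The program name and the data item Age are invented for illustration, and the layout assumes a compiler that accepts free-format source and *> comments.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. NoiseWordsDemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Age   PIC 99 VALUE 21.

PROCEDURE DIVISION.
Begin.
*> Verbose form: IS, THAN, and THEN only aid readability
   IF Age IS GREATER THAN 18 THEN
      DISPLAY "Adult (verbose form)"
   END-IF
*> Terse form: the same condition with the noise words omitted
   IF Age GREATER 18
      DISPLAY "Adult (terse form)"
   END-IF
   STOP RUN.
```

Both IF statements test the same condition, so both messages should appear.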

When COBOL was designed, today’s tools were not available. Programs were written on coding forms (see Figure 2-1), passed to punch-card operators for transfer onto punch cards (see Figure 2-2), and then submitted to the computer operator to be loaded into the computer using a punch-card reader. These media (coding sheets and punch cards) required adherence to a number of formatting restrictions that some COBOL implementations still enforce today, long after the need for them has gone. This book discusses these coding restrictions but doesn’t adhere to them. You should be aware, though, that depending on the coding rules in a particular coding shop, you might be obliged to abide by these archaic conventions.

Figure 2-1. COBOL coding sheet

Figure 2-2. COBOL punch card for line 11 of the coding sheet²

The final COBOL irritant is that although many of the constructs required to write well-structured programs have been introduced into modern COBOL (ANS 85 COBOL and OO-COBOL), the need for backward compatibility means some language elements remain that, if used, make it difficult and in some cases impossible to write good programs. ALTER verb, I’m thinking of you.

COBOL Syntax Metalanguage

COBOL syntax is defined using a notation sometimes called the COBOL metalanguage. In this notation:

Words in uppercase are reserved words. When underlined, they are mandatory. When not underlined, they are noise words, used for readability only, and are optional.
Words in mixed case represent names that must be devised by the programmer (such as the names of data items).
When material is enclosed in curly braces { }, a choice must be made from the options within the braces. If there is only one option, then that item is mandatory.
When material is enclosed in square brackets [ ], the material is optional and may be included or omitted as required.
When the ellipsis symbol ... (three dots) is used, it indicates that the preceding syntactic element may be repeated at your discretion.
To assist readability, the comma, semicolon, and space characters may be used as separators in a COBOL statement, but they have no semantic effect. For instance, the following statements are semantically identical:
```
ADD Num1 Num2   Num3  TO Result
ADD Num1, Num2, Num3  TO Result
ADD Num1; Num2; Num3  TO Result
```

In addition to the metalanguage diagrams, syntax rules govern the interpretation of metalanguage. For instance, the metalanguage for PERFORM..VARYING (see Figure 2-3) implies that you can have as many AFTER phrases as desired. In fact, as you will discover when I discuss this construct in Chapter 6, only two are allowed.

Figure 2-3. PERFORM..VARYING metalanguage

Some Notes on Syntax Diagrams

As mentioned in the previous section, the interpretation of the COBOL metalanguage is modified by syntax rules. Because it can be tedious to wade through all the rules for each COBOL construct, this book uses a modified form of the syntax diagram. In this modified diagram, special operand suffixes indicate the type of the operand; these are shown in Table 2-1.

Table 2-1. Special Metalanguage Operand Suffixes

Suffix	Meaning
$i	Uses an alphanumeric data item
$il	Uses an alphanumeric data item or a string literal
#i	Uses a numeric data item
#il	Uses a numeric data item or numeric literal
$#i	Uses a numeric or an alphanumeric data item

Example Metalanguage

As an example of how the metalanguage for a COBOL verb is interpreted, the syntax for the COMPUTE verb is shown in Figure 2-4. I’m presenting COMPUTE here because, as the COBOL arithmetic verb (the others are ADD, SUBTRACT, MULTIPLY, DIVIDE) that’s closest to the way things are done in many other languages, it will be a point of familiarity. The operation of COMPUTE is discussed in more detail in Chapter 4.

Figure 2-4. COMPUTE metalanguage syntax diagram

The COMPUTE verb assigns the result of an arithmetic expression to a variable or variables. The interpretation of the COMPUTE metalanguage is as follows:

A COMPUTE statement must start with the keyword COMPUTE.
The keyword must be followed by the name of a numeric data item that receives the result of the calculation (the suffix #i indicates that the operand must be the name of a numeric data item [variable]).
The equals sign (=) must be used.
An arithmetic expression must follow the equals sign.
The square brackets [ ] around the word ROUNDED indicate that rounding is optional. Because the word ROUNDED is underlined, the word must be used if rounding is required.
The ellipsis symbol (...) indicates that there can be more than one Result#i data item.
The ellipsis occurs outside the curly braces {}, which means each result field can have its own ROUNDED phrase.
- In other words, you could have a COMPUTE statement like
```
COMPUTE Result1 ROUNDED, Result2  =  ((9 * 9) + 8) / 5
```
- where Result1 would be assigned a value of 18 (rounded 17.8) and Result2 would be assigned a value of 17 (truncated 17.8), assuming both Result1 and Result2 were defined as PIC 99.
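If you want to try this for yourself, the statement can be wrapped in a small program. This is only a sketch: the program name is invented, the two result items are declared as PIC 99 as the text assumes, and the layout assumes free-format source.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. ComputeRounding.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  Result1   PIC 99 VALUE ZEROS.
01  Result2   PIC 99 VALUE ZEROS.

PROCEDURE DIVISION.
Begin.
*> 89 / 5 = 17.8; only Result1 carries the ROUNDED phrase
   COMPUTE Result1 ROUNDED, Result2 = ((9 * 9) + 8) / 5
   DISPLAY "Result1 (rounded)   = " Result1
   DISPLAY "Result2 (truncated) = " Result2
   STOP RUN.
```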

Structure of COBOL Programs

COBOL is much more rigidly structured than most other programming languages. COBOL programs are hierarchical in structure. Each element of the hierarchy consists of one or more subordinate elements. The program hierarchy consists of divisions, sections, paragraphs, sentences, and statements (see Figure 2-5).

Figure 2-5. Hierarchical COBOL program structure

A COBOL program is divided into distinct parts called divisions. A division may contain one or more sections. A section may contain one or more paragraphs. A paragraph may contain one or more sentences, and a sentence one or more statements.
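The hierarchy can be seen in even a tiny program. In the sketch below, the comments label each level; the names HierarchyDemo, MainLogic, and VisitCount are invented for illustration, and free-format source with *> comments is assumed.

```cobol
IDENTIFICATION DIVISION.                *> division
PROGRAM-ID. HierarchyDemo.              *> paragraph defined by the language
DATA DIVISION.                          *> division
WORKING-STORAGE SECTION.                *> section defined by the language
01  VisitCount   PIC 9 VALUE ZERO.

PROCEDURE DIVISION.                     *> division
MainLogic SECTION.                      *> section named by the programmer
Begin.                                  *> paragraph named by the programmer
   ADD 1 TO VisitCount                  *> first statement of a sentence
   DISPLAY "Visits: " VisitCount.       *> second statement; the period ends the sentence
   STOP RUN.                            *> a sentence made of a single statement
```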

Note Programmers unused to this sort of rigidity may find it irksome or onerous, but this layout offers some practical advantages. Many of the programmatic items that might need to be modified as a result of an environmental change are defined in the ENVIRONMENT DIVISION. External references, such as to devices, files, collating sequences, the currency symbol, and the decimal point symbol are all defined in the ENVIRONMENT DIVISION.

Divisions

The division is the major structural element in COBOL. Later in this chapter, I discuss the purpose of each division. For now, you can note that there are four divisions: the IDENTIFICATION DIVISION, the ENVIRONMENT DIVISION, the DATA DIVISION, and the PROCEDURE DIVISION.

Sections

A section is made up of one or more paragraphs. A section begins with the section name and ends where the next section name is encountered or where the program text ends.

A section name consists of a name devised by the programmer or defined by the language, followed by the word SECTION, followed by a period (full stop). Some examples of section names are given in Example 2-1.

In the first three divisions, sections are an organizational structure defined by the language. But in the PROCEDURE DIVISION, where you write the program’s executable statements, sections and paragraphs are used to identify blocks of code that can be executed using PERFORM or GO TO.

Example 2-1. Example Section Names

SelectTexasRecords SECTION.
FILE SECTION.
CONFIGURATION SECTION.
INPUT-OUTPUT SECTION.
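As a rough sketch of how a programmer-named section is used in the PROCEDURE DIVISION, the program below performs the section SelectTexasRecords, which executes every paragraph in that section in turn. The paragraph names and messages are invented for illustration, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. SectionDemo.
PROCEDURE DIVISION.
MainControl SECTION.
Begin.
*> Performing a section runs all the paragraphs it contains
   PERFORM SelectTexasRecords
   STOP RUN.

SelectTexasRecords SECTION.
FindRecords.
   DISPLAY "Finding Texas records".
ListRecords.
   DISPLAY "Listing Texas records".
```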

Paragraphs

A paragraph consists of one or more sentences. A paragraph begins with a paragraph name and ends where the next section name or paragraph name is encountered or where the program text ends.

In the first three divisions, paragraphs are an organizational structure defined by the language (see Example 2-2). But in the PROCEDURE DIVISION, paragraphs are used to identify blocks of code that can be executed using PERFORM or GO TO (see Example 2-3).

Example 2-2. ENVIRONMENT DIVISION Entries Required for a File Declaration

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT ExampleFile ASSIGN TO "Example.Dat"
           ORGANIZATION IS SEQUENTIAL.

Example 2-3. PROCEDURE DIVISION with Two Paragraphs (Begin and DisplayGreeting)

PROCEDURE DIVISION.
Begin.
   PERFORM  DisplayGreeting 10 TIMES.
   STOP RUN.
 
DisplayGreeting.
   DISPLAY "Greetings from COBOL".

Sentences

A sentence consists of one or more statements and is terminated by a period. There must be at least one sentence, and hence one period, in a paragraph. Example 2-4 shows two sentences. The first sentence also happens to be a statement; the second consists of three statements.

Example 2-4. Two Sentences

SUBTRACT Tax FROM GrossPay GIVING NetPay. 
 
MOVE .21 TO VatRate
COMPUTE VatAmount = ProductCost * VatRate
DISPLAY "The VAT amount is - " VatAmount.

Statements

In COBOL, language statements are referred to as verbs. A statement starts with the name of the verb and is followed by the operand or operands on which the verb acts. Example 2-5 shows three statements.

Example 2-5. Three Statements

DISPLAY "Enter name " WITH NO ADVANCING
ACCEPT  StudentName
DISPLAY "Name entered was " StudentName

In Table 2-2, the major COBOL verbs are categorized by type. The arithmetic verbs are used in computations, the file-handling verbs are used to manipulate files, the flow-of-control verbs are used to alter the normal sequential execution of program statements, the table-handling verbs are used to manipulate tables (arrays), and the string-handling verbs allow such operations as character counting, string splitting, and string concatenation.

Table 2-2. Major COBOL Verbs, Categorized by Type

The Four Divisions

At the top of the COBOL hierarchy are the four divisions. These divide the program into distinct structural elements.

Although some of the divisions may be omitted, the sequence in which they are specified is fixed and must be as follows. Just like section names and paragraph names, division names must be followed by a period:

IDENTIFICATION DIVISION. Contains information about the program
ENVIRONMENT DIVISION. Contains environment information
DATA DIVISION. Contains data descriptions
PROCEDURE DIVISION. Contains the program algorithms

IDENTIFICATION DIVISION

The purpose of the IDENTIFICATION DIVISION is to provide information about the program to you, the compiler, and the linker. The PROGRAM-ID paragraph is the only entry required. In fact, this entry is required in every program. Nowadays all the other entries have the status of comments (which are not processed when the program runs), but you may still find it useful to include paragraphs such as AUTHOR and DATE-WRITTEN.

The PROGRAM-ID is followed by a user-devised name that is used to identify the program internally. This name may be different from the file name given to the program when it was saved to backing storage. The metalanguage for the PROGRAM-ID is

PROGRAM-ID. UserAssignedProgramName
    [IS [COMMON] [INITIAL] PROGRAM].

The metalanguage items in square brackets apply only to subprograms, so I will reserve discussion of these items until later in the book.

When a number of independently compiled programs are combined by the linker into a single executable run-unit, each program is identified by the name given in its PROGRAM-ID. When control is passed to a particular program by means of a CALL verb, the target of the CALL invocation is the name given in the subprogram’s PROGRAM-ID; for instance:

CALL "PrintSummaryReport".

Example 2-6 shows an example IDENTIFICATION DIVISION. Pay particular attention to the periods — they are required.

Example 2-6. Sample IDENTIFICATION DIVISION

IDENTIFICATION DIVISION.
PROGRAM-ID. PrintSummaryReport.
AUTHOR. Michael Coughlan.
DATE-WRITTEN. 20th June 2013.

ENVIRONMENT DIVISION

The ENVIRONMENT DIVISION is used to describe the environment in which the program works. It isolates in one place all aspects of the program that are dependent on items in the environment in which the program runs. The idea is to make it easy to change the program when it has to run on a different computer or one with different peripheral devices or when the program is being used in a different country.

The ENVIRONMENT DIVISION consists of two sections: the CONFIGURATION SECTION and the INPUT-OUTPUT SECTION. In the CONFIGURATION SECTION, the SPECIAL-NAMES paragraph allows you to specify such environmental details as what alphabet to use, what currency symbol to use, and what decimal point symbol to use. In the INPUT-OUTPUT SECTION, the FILE-CONTROL paragraph lets you connect internal file names with external devices and files.

Example 2-7 shows some example CONFIGURATION SECTION entries. A few notes about the listing:

In some countries the meanings of the decimal point and the comma are reversed. For instance, the number 1,234.56 is written 1.234,56. The DECIMAL-POINT IS COMMA clause specifies that the program conforms to this scheme.
The SYMBOLIC CHARACTERS clause lets you assign a name to one of the unprintable characters. In this example, names for the escape, carriage return, and line-feed characters have been defined by specifying their ordinal position (not value) in the character set.
The SELECT and ASSIGN clauses let you connect the name you use for a file in the program with its actual name and location on disk.

Example 2-7. CONFIGURATION SECTION Examples

IDENTIFICATION DIVISION.
PROGRAM-ID. ConfigurationSectionExamples.
AUTHOR. Michael Coughlan.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
    DECIMAL-POINT IS COMMA.
    SYMBOLIC CHARACTERS   ESC  CR  LF
                     ARE  28   14  11.
 
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT StockFile ASSIGN TO  "D:\DataFiles\Stock.dat"
           ORGANIZATION IS SEQUENTIAL.

DATA DIVISION

The DATA DIVISION is used to describe most of the data that a program processes. The obvious exception to this is literal data, which is defined in situ as a string or numeric literal such as “Freddy Ryan” or -345.74.

The DATA DIVISION is divided into four sections:

The FILE SECTION
The WORKING-STORAGE SECTION
The LINKAGE SECTION
The REPORT SECTION

The first two are the main sections. The LINKAGE SECTION is used only in subprograms, and the REPORT SECTION is used only when generating reports. The LINKAGE and REPORT sections are discussed more fully when you encounter the elements that require them later in the book. For now, only the first two sections need concern you.

File Section

The FILE SECTION describes the data that is sent to, or comes from, the computer’s data storage peripherals. These include such devices as card readers, magnetic tape drives, hard disks, CDs, and DVDs.

Working-Storage Section

The WORKING-STORAGE SECTION describes the general variables used in the program. The COBOL metalanguage showing the general structure and syntax of the DATA DIVISION is given in Figure 2-6 and is followed by a fragment of an example COBOL program in Example 2-8.

Figure 2-6. DATA DIVISION metalanguage

Example 2-8. Simple Data Declarations

IDENTIFICATION DIVISION.
PROGRAM-ID.  SimpleDataDeclarations.
AUTHOR.  Michael Coughlan.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  CardinalNumber          PIC 99     VALUE ZEROS.
01  IntegerNumber           PIC S99    VALUE -14.
01  DecimalNumber           PIC 999V99 VALUE 543.21.
01  ShopName                PIC X(30)  VALUE SPACES.
01  ReportHeading           PIC X(25)  VALUE "=== Employment Report ===".

Data Hierarchy

All the data items in Example 2-8 are independent, elementary items. Although data hierarchy is too complicated a topic to deal with fully at this point, a preview of hierarchical data declaration is given in the BirthDate declaration (see Example 2-9).

Example 2-9. Example of a Hierarchical Data Declaration

01  BirthDate.
    02  YearOfBirth.
        03 CenturyOB       PIC 99.
        03 YearOB          PIC 99.
    02  MonthOfBirth       PIC 99.
    02  DayOfBirth         PIC 99.

In this declaration, the data hierarchy indicated by the level numbers tells you that the data item BirthDate consists of (is made up of) a number of subordinate data items. The immediate subordinate items (indicated by the 02 level numbers) are YearOfBirth, MonthOfBirth, and DayOfBirth. MonthOfBirth and DayOfBirth are elementary (atomic) items that are not further subdivided. However, YearOfBirth is a data item that is further subdivided (indicated by the 03 level numbers) into CenturyOB and YearOB.

In typed languages such as Pascal and Java, understanding what is happening to data in memory is not important. But understanding what is happening to the data moved into a data item is critical in COBOL. For this reason, when discussing data declarations and the assignment of values to data items, I often give a model of the storage. For instance, Figure 2-7 gives the model of the storage for the data items declared in Example 2-9 and shows what happens to the data when you execute the statement - MOVE "19451225" TO BirthDate.

Figure 2-7. Memory model for the data items declared in Example 2-9
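The effect of that MOVE can also be checked by running it. The sketch below reuses the declarations from Example 2-9 and displays each item after the move; the program name is invented, and free-format source is assumed. Because BirthDate is a group item, the eight characters are simply laid over its storage from left to right, so the subordinate items pick up 1945, 19, 45, 12, and 25 respectively.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. BirthDateDemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  BirthDate.
    02  YearOfBirth.
        03 CenturyOB       PIC 99.
        03 YearOB          PIC 99.
    02  MonthOfBirth       PIC 99.
    02  DayOfBirth         PIC 99.

PROCEDURE DIVISION.
Begin.
*> The string fills the group item's eight bytes from the left
   MOVE "19451225" TO BirthDate
   DISPLAY "BirthDate    = " BirthDate
   DISPLAY "YearOfBirth  = " YearOfBirth
   DISPLAY "CenturyOB    = " CenturyOB
   DISPLAY "YearOB       = " YearOB
   DISPLAY "MonthOfBirth = " MonthOfBirth
   DISPLAY "DayOfBirth   = " DayOfBirth
   STOP RUN.
```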

PROCEDURE DIVISION

The PROCEDURE DIVISION is where all the data described in the DATA DIVISION is processed and produced. It is here that you describe your algorithm. The PROCEDURE DIVISION is hierarchical in structure. It consists of sections, paragraphs, sentences, and statements. Only the section is optional; there must be at least one paragraph, one sentence, and one statement in the PROCEDURE DIVISION.

Whereas the paragraph and section names in the other divisions are defined by the language, in the PROCEDURE DIVISION they are chosen by you. The names chosen should reflect the function of the code contained in the paragraph or section.

In many legacy COBOL programs, paragraph and section names were used chiefly as labels to break up the program text and to act as the target of GO TO statements and, occasionally, PERFORM statements. In these programs, GO TOs were used to jump back and forward through the program text in a manner that made the program logic very difficult to follow. This programmatic style was derisively labeled spaghetti code.

In this book, I advocate a programming style that eschews the use of GO TOs as much as possible and that uses PERFORMs and paragraphs to create single-entry, single-exit, open subroutines. Although the nature of an open subroutine is that control can drop into it, adherence to the single-entry, single-exit philosophy should ensure that this does not happen.
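A minimal sketch of this style is shown below. Each paragraph is entered only through a PERFORM and left only at its end, and the STOP RUN in the controlling paragraph prevents control from dropping into the paragraphs that follow it. The program and paragraph names are invented for illustration, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. OpenSubroutineDemo.
PROCEDURE DIVISION.
Begin.
   PERFORM PrintHeading
   PERFORM PrintBody
*> Without STOP RUN, control would drop into PrintHeading a second time
   STOP RUN.

PrintHeading.
   DISPLAY "=== Report ===".

PrintBody.
   DISPLAY "Report body".
```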

Shortest COBOL Program

COBOL has a very bad reputation for verbosity, but most of the programs on which that reputation was built were written in ANS 68 or ANS 74 COBOL. Those programs are 40 years old. In modern versions of the language, program elements are not required unless explicitly used. For instance, in the ShortestProgram (see Listing 2-1), no entries are required for the ENVIRONMENT and DATA DIVISIONs because they are not used in this program. The IDENTIFICATION DIVISION is required because it holds the mandatory PROGRAM-ID paragraph. The PROCEDURE DIVISION is also required, there must be at least one paragraph in it (DisplayPrompt), and the paragraph must contain at least one sentence (DISPLAY "I did it".). STOP RUN, a COBOL instruction to halt execution of the program, would normally appear in a program but is not required here because the program will stop when it reaches the end of the program text.

Listing 2-1. Shortest COBOL Program

IDENTIFICATION DIVISION.
PROGRAM-ID. ShortestProgram.
PROCEDURE DIVISION.
DisplayPrompt.
     DISPLAY "I did it".

Note Some COBOL compilers require that all the divisions be present in a program. Others only require the IDENTIFICATION DIVISION and the PROCEDURE DIVISION.

COBOL Coding Rules

Traditionally, COBOL programs were written on coding sheets (see Figure 2-8), punched on to punch cards, and then loaded into the computer via a card reader. Although nowadays most programs are entered directly via screen and keyboard, some COBOL formatting conventions remain that derive from its ancient punch-card history:

On the coding sheet, the first six character positions are reserved for sequence numbers. Sequence numbers used to be a vital insurance against the disaster of dropping your stack of punch cards.
The seventh character position is reserved for the continuation character or for an asterisk that denotes a comment line. The continuation character is rarely used nowadays because any COBOL statement can be broken across two lines anywhere there is a space character (other than inside a quoted string).

COBOL Detail While other programming languages permit a variety of comment forms (Java, for instance, supports multiline comments, documentation comments, and end-of-line comments), COBOL allows only full-line comments. Comment lines are indicated by placing an asterisk in column 7 (if adhering to the strict formatting conventions; see Figure 2-8) or in the first column if using a version of COBOL that does not adhere to the archaic formatting conventions. One further note: the open source COBOL at Compileonline.com requires comments to begin with *> but, as in Java, you can also place these comments at the end of a line.

Figure 2-8. Fragment of a coding sheet showing the different program areas

The actual program text starts in column 8. The four positions from 8 to 11 are known as Area A, and the positions from 12 to 72 are called Area B.
The area from position 73 to 80 is the identification area; it was generally used to identify the program. This again was disaster insurance. If two stacks of cards were dropped, the identification allowed the cards belonging to the two programs to be identified.

When a COBOL compiler recognizes the Areas A and B, all division names, section names, paragraph names, file-description (FD) entries, and 01 level numbers must start in Area A. All other sentences must start in Area B.
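To show how these areas look in practice, here is a small sketch laid out for the traditional fixed format: sequence numbers in columns 1 to 6, the comment asterisk in column 7, division, section, and paragraph names and the 01 level number starting in Area A, and statements starting in Area B. The program name and data item are invented for illustration, and the identification area (columns 73 to 80) is left empty.

```cobol
      *A   B
      * A above marks column 8 (Area A); B marks column 12 (Area B)
000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. FixedFormatDemo.
000300* An asterisk in column 7 makes this whole line a comment
000400 DATA DIVISION.
000500 WORKING-STORAGE SECTION.
000600 01  Greeting   PIC X(20) VALUE "Greetings from COBOL".
000700 PROCEDURE DIVISION.
000800 Begin.
000900     DISPLAY Greeting.
001000     STOP RUN.
```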

In some COBOL compilers, it is possible to set a compiler option or include a compiler directive to free you from these archaic formatting conventions. For instance, the Micro Focus Net Express COBOL uses the compiler directive - $ SET SOURCEFORMAT"FREE". Although modern compilers may free you from formatting restrictions, it is probably still a good idea to position items according to the Area A and Area B rule.

Name Construction

COBOL has a number of different user-devised names, such as data names (variable names), paragraph names, section names, and mnemonic names. The rules for name construction are given here along with some advice which all programmers should embrace.

All user-defined names in COBOL must adhere to the following rules:

They must contain at least 1 character and not more than 30 characters.
They must contain at least one alphabetic character.
They must be constructed from the characters A to Z, the digits 0 to 9, and the hyphen. Because the hyphen can be mistaken for the minus sign, a name cannot begin or end with a hyphen.
Names are not case-sensitive. SalesDate is the same as salesDate or SALESDATE.
None of the many COBOL reserved words may be used as a user-defined name. The huge number of reserved words is one of the annoyances of COBOL. One strategy to avoid tripping over them is to use word doubles such as using IterCount instead of Count.

Here are some examples of user-defined names:

TotalPay
Gross-Pay
PrintReportHeadings
Customer10-Rec
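The sketch below puts the rules side by side: the declarations that follow the rules are live code, and some names that break them are left as comments so that the program still compiles. The program name, the picture clauses, and the broken names are all invented for illustration, and free-format source with *> comments is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. NameRulesDemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Names that follow the rules
01  TotalPay          PIC 9(5)V99 VALUE ZEROS.
01  Gross-Pay         PIC 9(5)V99 VALUE ZEROS.
01  Customer10-Rec    PIC X(40)   VALUE SPACES.
01  IterCount         PIC 99      VALUE ZEROS.

*> Names that break the rules (commented out deliberately)
*> 01  -GrossPay      PIC 9(5).    begins with a hyphen
*> 01  GrossPay-      PIC 9(5).    ends with a hyphen
*> 01  Count          PIC 99.      COUNT is a reserved word
*> 01  2013           PIC 9(4).    contains no alphabetic character

PROCEDURE DIVISION.
Begin.
*> Names are not case-sensitive, so ITERCOUNT refers to IterCount
   MOVE 3 TO ITERCOUNT
   DISPLAY "IterCount = " IterCount
   STOP RUN.
```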

Comments about Naming

Data-item names are used to identify variables. In COBOL, all variable data is defined in the DATA DIVISION rather than throughout the program as is done in many other languages. In the PROCEDURE DIVISION, section names and paragraph names are devised by you and are used to identify blocks of executable code.

The proper selection of data-item, section, and paragraph names is probably the most important thing you can do to make your programs understandable. The names you choose should be descriptive. Data-item names should be descriptive of the data they contain; for instance, it is fairly clear what data the data items TotalPay, GrossPay, and NetPay hold. Section and paragraph names should be descriptive of the function of the code contained in the paragraph or section; for instance, these seem fairly descriptive: ApplyValidInsertion, GetPostage, and ValidateCheckDigit. Difficulty in assigning a suitably descriptive name to a block of code should be taken as a sign that the program has been incorrectly partitioned and is likely to offend the Module Strength/Cohesion guidelines.³⁻⁶

Authors writing about other programming languages often make the same point: programmers should choose descriptive names. But in many of these languages, where succinctness appears to be a highly lauded characteristic, the ethos of the language seems to contradict this advice. In COBOL, the language is already so verbose that the added burden of descriptive names is not likely to be a problem.

Comments about Program Formatting

In COBOL, hierarchy is vitally important in the declaration of data. Proper indentation is a very useful aid to understanding data hierarchy (more on this later). Misleading or no indentation is often a source of programming errors. Good programmers seem to understand this instinctively: when student programs are graded, those that correctly implement the specification are often found to have excellent formatting, whereas those with programming errors are often poorly formatted. This is ironic, because the programmers who are most in need of the aid of a well-formatted program seem to be those who pay formatting the least attention. Weak programmers never appear to understand how a poorly formatted program conspires against them and makes it much more difficult to produce code that works.

Proper formatting is also important in the PROCEDURE DIVISION. Even though the scope of COBOL verbs is well-signaled using END delimiters, indentation is still a very useful aid to emphasize scope.
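As a small illustration, the nested IF below relies on END-IF to mark where each statement ends, while the indentation lets you see at a glance which statements belong to which IF. The program name, data items, and threshold values are invented for illustration, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. ScopeDemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  HoursWorked     PIC 99 VALUE 52.
01  OvertimeHours   PIC 99 VALUE ZEROS.

PROCEDURE DIVISION.
Begin.
   IF HoursWorked > 40
      COMPUTE OvertimeHours = HoursWorked - 40
*>    The inner IF is closed by its own END-IF
      IF OvertimeHours > 10
         DISPLAY "Overtime needs approval"
      END-IF
   ELSE
      MOVE ZEROS TO OvertimeHours
   END-IF
   DISPLAY "Overtime hours: " OvertimeHours
   STOP RUN.
```

With HoursWorked set to 52, the overtime is 12 hours, so the approval message is displayed before the total.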

This discussion brings me to an important piece of advice for COBOL programmers. This advice is a restatement of the Golden Rule promulgated by Jesus, Confucius, and others:

Write your programs as you would like them written if you were the one who had to maintain them.

Comments about Programming Style

As noted earlier, data names and reserved words are not case sensitive. The reserved words PROCEDURE DIVISION can be written as uppercase, lowercase, or mixed case. My preference, developed during years of reading program printouts, is to put COBOL reserved words in uppercase and user-defined words in mixed case with capitals at the beginning of each word. Sometimes, for clarity, the words may be separated by a hyphen. This is the style I have chosen for this book because I believe it is the best for a printed format.

I want to stress, though, that this stylistic scheme is a personal preference. Programmers in other languages may be more used to a different scheme, and as long as the scheme is consistently used, it should present no problem. It is worth mentioning that when you start to work in a programming shop, a naming scheme may be forced on you. So perhaps it is not a bad thing to get some practice fitting in with someone else’s scheme.
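To see that case really is only a matter of style, consider the following sketch, in which the reserved words are deliberately written in a mixture of cases and the same data item is referenced under three spellings. The program name and data item are invented for illustration, and free-format source is assumed.

```cobol
identification division.
program-id. CaseDemo.
Data Division.
Working-Storage Section.
01  SalesDate   PIC X(8) VALUE "20130620".

procedure division.
Begin.
*> All three references name the same data item
   display "SalesDate = " SalesDate
   Display "salesDate = " salesDate
   DISPLAY "SALESDATE = " SALESDATE
   stop run.
```

Each DISPLAY shows the same value, 20130620.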

Example Programs

This section provides some example programs to whet your appetite and give you a feel for how a full COBOL program looks. In particular, they give advance warning about how variables (data items) are declared in COBOL. This differs so much from other languages such as C, Java, and Pascal that it is likely to be a matter of some concern, if not consternation. These programs also introduce some of the more interesting and useful features of COBOL.

The COBOL Greeting Program

Let’s start with the program you last saw in the COBOL coding sheet (see Figure 2-1). In Listing 2-2 it has been modernized a little by introducing lowercase characters. This basic program demonstrates simple data declaration and simple iteration (looping). The variable IterNum is given a starting value of 5, and the PERFORM executes the paragraph DisplayGreeting five times:

Listing 2-2. The COBOL Greeting Program

IDENTIFICATION DIVISION.
PROGRAM-ID. CobolGreeting.
*>Program to display COBOL greetings
DATA DIVISION.
WORKING-STORAGE SECTION.
01  IterNum   PIC 9 VALUE 5.
 
PROCEDURE DIVISION.
BeginProgram.
   PERFORM DisplayGreeting IterNum TIMES.
   STOP RUN.
    
DisplayGreeting.
   DISPLAY "Greetings from COBOL".

The DoCalc Program

The DoCalc program in Listing 2-3 prompts the user to enter two single-digit numbers. The numbers are added together, and the result is displayed on the computer screen.

Listing 2-3. The DoCalc Example Program

IDENTIFICATION DIVISION.
PROGRAM-ID.  DoCalc.
AUTHOR.  Michael Coughlan.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 FirstNum       PIC 9     VALUE ZEROS.
01 SecondNum      PIC 9     VALUE ZEROS.
01 CalcResult     PIC 99    VALUE 0.
01 UserPrompt     PIC X(38) VALUE
                  "Please enter two single digit numbers".
PROCEDURE DIVISION.
CalculateResult.
   DISPLAY UserPrompt
   ACCEPT FirstNum
   ACCEPT SecondNum
   COMPUTE CalcResult = FirstNum + SecondNum
   DISPLAY "Result is = ", CalcResult
   STOP RUN.

The program declares three numeric data items (variables): FirstNum for the first number input, SecondNum for the second, and CalcResult to hold the result of the calculation. It also declares a data item to hold the string used to prompt the user to enter two single-digit numbers.

Data declarations in COBOL are very different from the type-based declaration you might be used to in other languages, so some explanation is required. In COBOL, every data-item declaration starts with a level number. Level numbers are used to represent data hierarchy. Because all the items in this example program are independent, elementary data items, they have a level number of 01.

Following the level number is the name of the data item, and this in turn is followed by a storage declaration for the data item. The storage declaration defines the type and size of the storage required. To do this, COBOL uses a kind of “declaration by example” strategy. An example, or picture (hence PIC), is given of the maximum value the data item can hold. The symbols used in the picture declaration indicate the basic type of the item (numeric = 9, alphanumeric = X, alphabetic = A), and the number of symbols used indicates the size.

Consider the following declarations in DoCalc:

01 FirstNum       PIC 9  VALUE ZEROS.
01 SecondNum      PIC 9  VALUE ZEROS.

These indicate that FirstNum and SecondNum can each hold a cardinal (unsigned whole) number with a value between 0 and 9. If these data items were required to hold negative values as well, the pictures would have to be defined as PIC S9 (signed numeric).
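As a rough sketch of the difference, the following program stores the same negative result in an unsigned item and in a signed item. SignDemo and its data names are invented for illustration and are not part of DoCalc, and the listing assumes a compiler set to accept free-format source. The unsigned item has nowhere to keep the sign, so only the magnitude is stored. How the sign of the signed item appears on the screen varies from compiler to compiler, so no output is shown.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. SignDemo.
*> Hypothetical example: contrasts PIC 9 with PIC S9
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Unsigned: can hold 0 through 9 only
01 UnsignedNum    PIC 9   VALUE ZEROS.
*> Signed: can hold -9 through +9
01 SignedNum      PIC S9  VALUE ZEROS.
PROCEDURE DIVISION.
Begin.
*> 3 - 5 is -2; the unsigned item keeps only the magnitude
   COMPUTE UnsignedNum = 3 - 5
*> The signed item keeps both the magnitude and the sign
   COMPUTE SignedNum = 3 - 5
   DISPLAY "Unsigned = " UnsignedNum
   DISPLAY "Signed   = " SignedNum
   STOP RUN.
```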

In this program, the picture clauses (which is what they are called) are followed by VALUE clauses specifying that FirstNum and SecondNum start with an initial value of zero. In COBOL, unless a variable is explicitly given an initial value, its value is undefined.

Bug Alert: Numeric data items must be given an explicit numeric starting value by means of the VALUE clause, using the INITIALIZE verb, or by assignment. If a numeric data item with an undefined value is used in a calculation, the program may crash. Of course, a data item with an undefined value may receive the result of a calculation because in that case any non-numeric data is overwritten with the calculation result.
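The three ways of giving a numeric item a starting value can be sketched as follows. This is a minimal illustration only; InitDemo and its data names are invented here, and free-format source is assumed. Each item is given a defined value before it is used as an operand in arithmetic.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. InitDemo.
*> Hypothetical example: three ways to give a numeric item a defined value
DATA DIVISION.
WORKING-STORAGE SECTION.
01 RunningTotal   PIC 999 VALUE ZEROS.
01 ItemCount      PIC 99.
01 ItemPrice      PIC 999.
PROCEDURE DIVISION.
Begin.
*> RunningTotal already has a value from its VALUE clause
*> ItemCount is set to zero by the INITIALIZE verb
   INITIALIZE ItemCount
*> ItemPrice gets its value by assignment
   MOVE 25 TO ItemPrice
*> All three items can now be used safely in calculations
   ADD 1 TO ItemCount
   ADD ItemPrice TO RunningTotal
   DISPLAY "Count = " ItemCount " Total = " RunningTotal
   STOP RUN.
```

Because ItemCount is PIC 99 and RunningTotal is PIC 999, the program displays Count = 01 Total = 025.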

The CalcResult data item is defined as follows:

01 CalcResult     PIC 99 VALUE 0.

This indicates that CalcResult can hold a cardinal number between 0 and 99. It too is initialized to zero, but in this case the value 0 is used rather than the word ZEROS. The word ZEROS is a special COBOL data item called a figurative constant. It has the effect of filling the data item with zeros. I have chosen to initialize this variable with the value 0 to make two points. First, numeric values can be used with the VALUE clause. Second, the figurative constant ZEROS should be used in preference to the numeric value because it is clearer than 0, which in some fonts can easily be mistaken for an O.
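Here is a small sketch of figurative constants in use, both in a VALUE clause and as the source of a MOVE. FigConstDemo and its data names are hypothetical, and free-format source is assumed. SPACES is the figurative constant that plays the same role for alphanumeric items that ZEROS plays for numeric ones. In each case the constant fills the whole item, whatever its size.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. FigConstDemo.
*> Hypothetical example: ZEROS and SPACES fill the entire data item
DATA DIVISION.
WORKING-STORAGE SECTION.
01 OrderCount     PIC 9(4)  VALUE ZEROS.
01 CustomerName   PIC X(10) VALUE SPACES.
PROCEDURE DIVISION.
Begin.
*> The brackets make the full width of each item visible
   DISPLAY "[" OrderCount "]"
   DISPLAY "[" CustomerName "]"
   MOVE 1234 TO OrderCount
*> ZEROS can also be the sending item of a MOVE
   MOVE ZEROS TO OrderCount
   DISPLAY "[" OrderCount "]"
   STOP RUN.
```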

The UserPrompt data item is defined as follows:

01 UserPrompt     PIC X(38) VALUE
                  "Please enter two single digit numbers".

This indicates that it can hold an alphanumeric value of up to 38 characters. It has been initialized to a starting string value.

COBOL Detail: UserPrompt should have been defined as a constant, but until recently COBOL did not allow constants to be created. The nearest you could get to a user-defined constant was to assign an initial value to a data item and then not change it. This serious deficiency was finally addressed in the ISO 2002 version of COBOL by means of the CONSTANT clause.
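For comparison, a prompt declared with the CONSTANT clause might look something like the following. Treat this as a sketch only: ConstantDemo is an invented program name, free-format source is assumed, and support for the clause varies between compilers, so check your compiler documentation before relying on it.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. ConstantDemo.
*> Hypothetical example: assumes a compiler that supports the
*> ISO 2002 CONSTANT clause
DATA DIVISION.
WORKING-STORAGE SECTION.
*> A constant has no PICTURE clause; its value is fixed at compile time
01 UserPrompt     CONSTANT AS "Please enter two single digit numbers".
01 FirstNum       PIC 9 VALUE ZEROS.
PROCEDURE DIVISION.
Begin.
   DISPLAY UserPrompt
   ACCEPT FirstNum
   DISPLAY "You entered " FirstNum
   STOP RUN.
```

Because UserPrompt is now a constant rather than a data item, a statement that tries to change it, such as a MOVE to UserPrompt, should be rejected by the compiler.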

COBOL PUZZLE

Given the description of BirthDate in Example 2-10, what do you think would be displayed by the COBOL code in Example 2-11?

Example 2-10. BirthDate Data Description

01 BirthDate.
   02 YearOfBirth.
      03 CenturyOB     PIC 99.
      03 YearOB        PIC 99.
   02 MonthOfBirth     PIC 99.
   02 DayOfBirth       PIC 99.

Example 2-11. Code That Manipulates BirthDate and Its Subordinate Items

MOVE 19750215 TO BirthDate
DISPLAY "Month is = " MonthOfBirth
DISPLAY "Century of birth is = " CenturyOB
DISPLAY "Year of birth is = " YearOfBirth
DISPLAY DayOfBirth "/" MonthOfBirth "/" YearOfBirth
MOVE ZEROS TO YearOfBirth
DISPLAY "Birth date = " BirthDate.

The answer is at the end of the chapter.

The Condition Names Program

The final example program for this chapter previews COBOL condition names and the EVALUATE verb. A condition name is a Boolean item that can only take the value true or false. But it is much more than that. A condition name is associated (via level 88) with a particular data item. Rather than setting the condition name to true or false directly, as you might do in other languages, a condition name automatically takes the value true or false depending on the value of its associated data item.

Listing 2-4 accepts a character from the user and displays a message to say whether the character entered was a vowel, a consonant, or a digit. When CharIn receives a character from the user, the associated condition names are all set to true or false depending on the value contained in CharIn.

The EVALUATE verb, which is COBOL’s version of switch or case, is shown here at its simplest. It is immensely powerful, complicated to explain, but intuitively easy to use. In this program, the particular WHEN branch executed depends on which condition name is true. See anything familiar, Ruby programmers?

Listing 2-4. Using the EVALUATE Verb

IDENTIFICATION DIVISION.
PROGRAM-ID.  ConditionNames.
AUTHOR.  Michael Coughlan.
*> Using condition names (level 88's) and the EVALUATE
DATA DIVISION.
WORKING-STORAGE SECTION.
01  CharIn             PIC X.
    88 Vowel           VALUE "a", "e", "i", "o", "u".
    88 Consonant       VALUE "b", "c", "d", "f", "g", "h"
                             "j" THRU "n", "p" THRU "t", "v" THRU "z".
    88 Digit           VALUE "0" THRU "9".
    88 ValidCharacter  VALUE "a" THRU "z", "0" THRU "9".
PROCEDURE DIVISION.
Begin.
    DISPLAY "Enter lower case character or digit. Invalid char ends."
    ACCEPT CharIn
    PERFORM UNTIL NOT ValidCharacter
      EVALUATE TRUE
        WHEN Vowel     DISPLAY "The letter " CharIn " is a vowel."
        WHEN Consonant DISPLAY "The letter " CharIn " is a consonant."
        WHEN Digit     DISPLAY CharIn " is a digit."
      END-EVALUATE
      ACCEPT CharIn
    END-PERFORM
    STOP RUN.

Chapter Exercise

Write a version of the ConditionNames program in your favorite language. See if you can convince yourself that your version is as clear, concise, readable, and maintainable as the COBOL version.

Where to Get a COBOL Compiler

Now that you’ve seen the basics of COBOL, it’s time to get the software. A couple of years ago, the question of where to get a free COBOL compiler would have been difficult to answer. The policies of COBOL vendors, who were locked into mainframe thought patterns and pricing structures, made it very difficult for interested students to get access to a COBOL compiler. In very recent years, though, and probably in response to the shortage of COBOL programmers, a number of options have become available.

Micro Focus Visual COBOL

Micro Focus COBOL is probably the best-known version of COBOL for Windows PCs. Micro Focus Visual COBOL is the company’s most recent version of COBOL. It implements the OO-COBOL standard and integrates either with Microsoft Visual Studio (where it acts as one of the standard .NET languages) or with Eclipse. It is available on Windows, Linux (Red Hat and SuSE), and Unix (AIX, HP-UX, and Solaris).

A personal edition of Visual COBOL is available that is free for non-commercial use. The Visual Studio version can be installed even if Visual Studio is not available, because in that case the Visual Studio Shell edition is installed.

www.microfocus.com/product-downloads/vcpe/index.aspx

OpenCOBOL

OpenCOBOL is an open source COBOL compiler. According to its web site, it implements a substantial part of the ANS 85 and ANS 2002 COBOL standards as well as many of the extensions introduced by vendors such as Micro Focus and IBM.

OpenCOBOL translates COBOL into C. The C code can be compiled using the native C compiler on a variety of platforms including Windows, Unix/Linux, and Mac OS X.

The compiler is free and is available from www.opencobol.org/.

Raincode COBOL

Raincode is a supplier of programming-language analysis and transformation tools. The company has a version of COBOL available that integrates with Microsoft Visual Studio and generates fully managed .NET code. The COBOL compiler is available free of charge from www.raincode.com/mainframe-rehosting/.

Compileonline COBOL

An online COBOL compiler is available at compileonline.com. Its data input is somewhat problematic, which limits its usefulness, but it can be handy if you just want a quick syntax check. See www.compileonline.com/compile_cobol_online.php.

Fujitsu NetCOBOL

Fujitsu NetCOBOL is a very well-known version of COBOL for Windows. NetCOBOL implements a version of the OO-COBOL standard, compiles on the .NET Framework, and can interoperate with other .NET languages such as C# and VB.NET.

A number of other versions of NetCOBOL are available, including a version for Linux. A trial version is available for download, but there is no free version: www.netcobol.com/product/netcobol-for-net/.

Summary

This chapter explored part of the foundational material required to write COBOL programs. Some of the material covered was informational, some practical, and some advisory. You saw how COBOL programs are organized structurally and learned the purpose of each of the four divisions. You examined COBOL metalanguage diagrams and the COBOL coding and name construction rules. I offered advice concerning name construction and the proper formatting of program code. Finally, you examined some simple COBOL programs as a preview of the material in coming chapters.

The next chapter examines how data is declared in COBOL. That chapter is only an introduction, though. Data declaration in COBOL is complicated and sophisticated because COBOL is mainly about data manipulation. COBOL data declarations offer many data-manipulation opportunities. Later chapters explore many advanced data-declaration concepts such as condition names, table declarations, the USAGE clause, and data redefinition using the REDEFINES clause.

References

1. Sammet J. The early history of COBOL, 2.6: intended purpose and users. ACM SIGPLAN Notices. 1978; 13(8): 121-161.

2. Kloth RD. Cardpunch emulator. www.kloth.net/services/cardpunch.php

3. Myers G. Composite/structured design. New York: Van Nostrand Reinhold; 1978.

4. Constantine L, with Yourdon E. Structured design. Yourdon Press; 1975.

5. Page-Jones M. Practical guide to structured systems design. 2nd ed. Englewood Cliffs (NJ): Prentice Hall; 1988.

6. Stevens W, Myers G, Constantine L. Structured design. In Yourdon E, editor. Classics in software engineering. Yourdon Press; 1979: 205-232.

COBOL PUZZLE ANSWER

Given the description of BirthDate in Listing 2-5, what do you think would be displayed by the COBOL code in Listing 2-6?

Listing 2-5. BirthDate data description

01 BirthDate.
   02 YearOfBirth.
      03 CenturyOB     PIC 99.
      03 YearOB        PIC 99.
   02 MonthOfBirth     PIC 99.
   02 DayOfBirth       PIC 99.

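Listing 2-6. Code manipulating BirthDate and its subordinate items

MOVE 19750215 TO BirthDate
DISPLAY "Month is = " MonthOfBirth
DISPLAY "Century of birth is = " CenturyOB
DISPLAY "Year of birth is = " YearOfBirth
DISPLAY DayOfBirth "/" MonthOfBirth "/" YearOfBirth
MOVE ZEROS TO YearOfBirth
DISPLAY "Birth date = " BirthDate.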

COBOL Puzzle Answer

Month is = 02
Century of birth is = 19
Year of birth is = 1975
15/02/1975
Birth date = 00000215
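
The answer follows from the way COBOL treats group items. A group item such as BirthDate or YearOfBirth is handled as an alphanumeric item, whatever the pictures of its subordinate items, so data moved into it is copied character by character starting from the left. The sketch below contrasts a move to a group item with a move to an elementary numeric item. GroupMoveDemo is an invented program name, only the BirthDate description comes from the puzzle, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. GroupMoveDemo.
*> Hypothetical example: group moves versus elementary numeric moves
DATA DIVISION.
WORKING-STORAGE SECTION.
01 BirthDate.
   02 YearOfBirth.
      03 CenturyOB     PIC 99.
      03 YearOB        PIC 99.
   02 MonthOfBirth     PIC 99.
   02 DayOfBirth       PIC 99.
PROCEDURE DIVISION.
Begin.
*> Group item: the eight digits are copied from the left, so the
*> first two land in CenturyOB and the next two in YearOB
   MOVE 19750215 TO BirthDate
   DISPLAY "Century = " CenturyOB " Year = " YearOB
*> A shorter value sent to a group item is left-justified and the
*> remaining character positions are filled with spaces
   MOVE 1975 TO BirthDate
   DISPLAY "[" BirthDate "]"
*> Elementary numeric item: the value is right-justified and
*> padded on the left with zeros
   MOVE 2 TO MonthOfBirth
   DISPLAY "[" BirthDate "]"
   STOP RUN.
```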
