In my previous tutorial, we talked about how to make a simple looping program and how to compile a COBOL program (plus some debugging tips). This time, we’ll talk about reading text files: how the file should be formatted, and how to display its contents properly with the right PIC format. I will skip the basics and jump straight to what’s important. If you want to go back through them, just click here. We move on to the ENVIRONMENT DIVISION.
1.) Defining Input & Output Devices and File Organization through the ENVIRONMENT DIVISION
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SAMPLE2.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IN-EMPLOYEE-FILE ASSIGN TO DISK
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT OUT-SALARY-FILE ASSIGN TO DISK
               ORGANIZATION IS LINE SEQUENTIAL.
Save the file as SAMPLE2.COB, matching the PROGRAM-ID in the code.
Here’s the explanation. The ENVIRONMENT DIVISION tells the program which devices and files it will work with. These are declared in the FILE-CONTROL paragraph of the INPUT-OUTPUT SECTION. In our case, ASSIGN TO DISK says that the files live on the local disk. Other targets are possible in that position, such as PRINTER or an explicit path like ‘c:/localpath/file.dat’. IN-EMPLOYEE-FILE and OUT-SALARY-FILE are user-defined file names: the rest of the program refers to the two files by these names. It doesn’t stop here. Each SELECT also carries the clause ORGANIZATION IS LINE SEQUENTIAL. Now, what does that mean?
It tells the compiler which type of file organization you are using. There are four types: LINE SEQUENTIAL, RECORD SEQUENTIAL, RELATIVE, and INDEXED. (The first two are compiler-specific spellings; standard COBOL simply calls the basic organization SEQUENTIAL.) I’ll define all four just so you know:
LINE SEQUENTIAL – Corresponds to an ordinary text file: each record is one line of text.
RECORD SEQUENTIAL – The simplest type of organization. Records are stored in the order they are written and are read back in the same order.
RELATIVE – Records are identified by their relative record number (their position in the file), so they can be accessed either randomly or sequentially.
INDEXED – Records are located through a unique key. This is similar to a primary key in a database table.
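Going back to the ASSIGN TO part for a moment, here is a minimal sketch of a SELECT that names the file directly instead of using DISK. DEMO-FILE, DEMO-REC, and DEMO.TXT are made-up names for illustration only, and I’m assuming fixed-format source and a compiler that accepts a quoted file name after ASSIGN TO (GnuCOBOL does, as far as I know). It is not part of SAMPLE2.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ASSIGN1.
      * Sketch only: DEMO-FILE, DEMO-REC and DEMO.TXT are
      * hypothetical names, not part of the tutorial program.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * The file name is given as a literal instead of DISK.
           SELECT DEMO-FILE ASSIGN TO "DEMO.TXT"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  DEMO-FILE.
       01  DEMO-REC              PIC X(20).
       PROCEDURE DIVISION.
       100-MAIN.
      * Write a single line of text and close the file.
           OPEN OUTPUT DEMO-FILE.
           MOVE "HELLO FROM COBOL" TO DEMO-REC.
           WRITE DEMO-REC.
           CLOSE DEMO-FILE.
           STOP RUN.
```

If your compiler accepts it, running this should leave a one-line text file named DEMO.TXT in the current directory.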
2.) Coding the DATA DIVISION – (FD) File Description
Now let’s move on to our DATA DIVISION.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-EMPLOYEE-FILE
           LABEL RECORDS ARE STANDARD
           RECORD CONTAINS 80 CHARACTERS
           VALUE OF FILE-ID IS "INPUT.TXT"
           DATA RECORD IS IN-EMPLOYEE-REC.
In the DATA DIVISION, you will notice a section called the FILE SECTION. This is where we describe the properties of the files we declared in the ENVIRONMENT DIVISION. The two are tied together by name: the name that follows FD must be the same name used in the SELECT clause, here IN-EMPLOYEE-FILE.
Note that we have not described OUT-SALARY-FILE yet!
Moving on, the entry starts with the keyword FD, which stands for File Description, followed by the file name. Below it, we place 4 more clauses.
Note that the period goes only after the last clause (DATA RECORD), not at the end of each line, because the whole FD is a single entry.
LABEL RECORDS ARE STANDARD – Label records are header and trailer records that identify a disk or tape file to the system; they are not part of your data. STANDARD says the file carries them in the system’s standard format. Devices such as printers do not use label records, so for those there is another clause, LABEL RECORDS ARE OMITTED. The singular and plural wordings are interchangeable, so LABEL RECORD IS STANDARD is also accepted.
RECORD CONTAINS 80 CHARACTERS – Declares that each record in the file is 80 characters long.
VALUE OF FILE-ID IS "INPUT.TXT" – Ties the name IN-EMPLOYEE-FILE to the physical file INPUT.TXT.
DATA RECORD IS IN-EMPLOYEE-REC – Names the record of the file. IN-EMPLOYEE-REC is the group item in which we will describe the layout of each line of INPUT.TXT.
You might be wondering where INPUT.TXT is, since we have not created it yet. Bingo, you are correct. Create it with the following content:
COBOL LORDSxxxxx    070002111111
ALBERT EINSTEIN     200002001111
ABRAHAM LINCOLN     030001000011
Note that your file must follow this format exactly: each name is padded with spaces to 20 columns, so the digits start in column 21. I’ll explain why in a short while.
3.) Coding the DATA DIVISION – File Properties and Group & Elementary Items
Now let’s define our IN-EMPLOYEE-REC
       01  IN-EMPLOYEE-REC.
           05  IN-EMPLOYEE-NAME       PIC X(20).
           05  IN-SALARY              PIC 9(5).
           05  IN-NO-OF-DEPENDENTS    PIC 9.
           05  IN-STATE-TAX           PIC 9(4)V99.
You will notice that this record is coded in the FILE SECTION, directly under its FD, and not in the WORKING-STORAGE SECTION. Let me explain why it is laid out that way.
Let’s take a look back at our INPUT.TXT. As I mentioned, your file should have the same format, because the record description has to match it column for column. Now let’s begin with 05 IN-EMPLOYEE-NAME, an elementary item of the group item 01 IN-EMPLOYEE-REC. It is defined with an alphanumeric picture of 20 characters, PIC X(20), because the name occupies the first 20 columns of every line.
The other elementary items work the same way: each one claims the next columns of the line, according to its size and PIC format. IN-SALARY takes columns 21–25, IN-NO-OF-DEPENDENTS takes column 26, and IN-STATE-TAX takes columns 27–32, where V marks an implied decimal point (so 111111 is read as 1111.11). That’s pretty much it. Now I’ll skip to the output file’s FD and record description:
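If you want to see the column mapping in action before touching any files, here is a small sketch. It copies the layout of IN-EMPLOYEE-REC into WORKING-STORAGE under made-up WS- names (WS-EMPLOYEE-REC, WS-TAX-DISPLAY, and so on are my own, not part of SAMPLE2), moves one line of INPUT.TXT into it as a literal, and displays each field. I’m assuming fixed-format source here.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LAYOUT1.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Same layout as IN-EMPLOYEE-REC, kept in WORKING-STORAGE
      * so the sketch runs without an input file.
      * All WS- names are hypothetical.
       01  WS-EMPLOYEE-REC.
           05  WS-EMPLOYEE-NAME       PIC X(20).
           05  WS-SALARY              PIC 9(5).
           05  WS-NO-OF-DEPENDENTS    PIC 9.
           05  WS-STATE-TAX           PIC 9(4)V99.
      * Edited field, used only to show the implied decimal point.
       01  WS-TAX-DISPLAY             PIC Z(3)9.99.
       PROCEDURE DIVISION.
       100-MAIN.
      * One line of INPUT.TXT: 20 columns of name, then 12 digits.
           MOVE "ALBERT EINSTEIN     200002001111"
               TO WS-EMPLOYEE-REC.
           MOVE WS-STATE-TAX TO WS-TAX-DISPLAY.
           DISPLAY "NAME      : " WS-EMPLOYEE-NAME.
           DISPLAY "SALARY    : " WS-SALARY.
           DISPLAY "DEPENDENTS: " WS-NO-OF-DEPENDENTS.
           DISPLAY "STATE TAX : " WS-TAX-DISPLAY.
           STOP RUN.
```

The group move simply lays the 32 characters over the record, and each elementary item then sees only its own columns. That is the same thing a READ does with each line of the file.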
       FD  OUT-SALARY-FILE
           LABEL RECORDS ARE STANDARD
           RECORD CONTAINS 80 CHARACTERS
           VALUE OF FILE-ID IS "OUTPUT.TXT"
           DATA RECORD IS OUT-SALARY-REC.
       01  OUT-SALARY-REC.
           05  OUT-EMPLOYEE-NAME      PIC X(20).
           05  FILLER                 PIC X(5).
           05  OUT-SALARY             PIC $(2),$(3).99.
           05  FILLER                 PIC X(5).
           05  OUT-STATE-TAX          PIC $Z(3)9.99.
This is pretty much the same as the input. The differences are how the PICs are formatted and the fields named FILLER. FILLER is a reserved word for a field that needs no name of its own; here the FILLER fields serve as blank gaps between the columns. PICs formatted for output are called numeric-edited fields, and they control how the numbers are displayed. Just so you know, for decimals we use an actual dot (.) in the output and V, an implied decimal point that takes up no space, in the input. Try and compare their elementary items 😉 To give you a glimpse, our expected output should look like this:
Magic isn’t it? The output is pretty much clean and formatted and there’s a logic behind the PIC’s we placed for our output 😉 Don’t know the logic behind it? The PIC format? The dollar sign? The spaces? Click here for my edited PIC manifesto (which was made for my class 🙂 ).
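To experiment with the two output pictures on their own, here is a minimal sketch that moves a salary and a tax value into fields with the same PICs as OUT-SALARY and OUT-STATE-TAX and displays them between brackets so the spacing is visible. The WS- names are made up for this sketch, and I’m assuming fixed-format source. One thing worth checking on your own compiler: as far as I know, the leftmost $ of a floating string such as $(2),$(3) only reserves the spot for the dollar sign itself, which would leave four digit positions for a five-digit salary.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EDIT1.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical WS- names; the pictures match the tutorial.
       01  WS-SALARY-IN      PIC 9(5)    VALUE 7000.
       01  WS-TAX-IN         PIC 9(4)V99 VALUE 11.11.
       01  WS-SALARY-OUT     PIC $(2),$(3).99.
       01  WS-TAX-OUT        PIC $Z(3)9.99.
       PROCEDURE DIVISION.
       100-MAIN.
      * The MOVE does the editing: the receiving PIC decides
      * where the dollar sign, comma and decimal point go.
           MOVE WS-SALARY-IN TO WS-SALARY-OUT.
           MOVE WS-TAX-IN    TO WS-TAX-OUT.
           DISPLAY "[" WS-SALARY-OUT "]".
           DISPLAY "[" WS-TAX-OUT "]".
           STOP RUN.
```

Try changing WS-SALARY-IN to 20000 and see what your compiler does with it; if the leading digit is lost, a wider picture such as $(3),$(3).99 may be what you want.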
4.) The WORKING-STORAGE SECTION
Next, let’s define our WORKING-STORAGE SECTION
       WORKING-STORAGE SECTION.
       01  WS-WORK-AREAS.
           05  ARE-THERE-MORE-RECORDS PIC X(3) VALUE 'YES'.
Plain and simple. ARE-THERE-MORE-RECORDS is a flag that starts as 'YES'. We will use it later to control the loop that reads the text file line by line.
5.) Coding the Procedure Division – Opening, Reading, and Closing the File
Let’s now move on to the procedure division.
       PROCEDURE DIVISION.
       100-MAIN-MODULE.
           OPEN INPUT  IN-EMPLOYEE-FILE
                OUTPUT OUT-SALARY-FILE.
           READ IN-EMPLOYEE-FILE
               AT END MOVE 'NO' TO ARE-THERE-MORE-RECORDS.
           PERFORM 200-CALC-RTN
               UNTIL ARE-THERE-MORE-RECORDS = 'NO'.
           CLOSE IN-EMPLOYEE-FILE
                 OUT-SALARY-FILE.
           STOP RUN.
Here’s the logic behind it. We created a paragraph named 100-MAIN-MODULE and:
Note: We still have not yet declared the paragraph 200-CALC-RTN. Relax.
OPEN INPUT IN-EMPLOYEE-FILE – Means we are opening INPUT.TXT for reading. Remember that IN-EMPLOYEE-FILE came from our ENVIRONMENT DIVISION and the FD which we defined earlier.
OUTPUT OUT-SALARY-FILE – Same as above, but it opens OUTPUT.TXT for writing.
READ IN-EMPLOYEE-FILE – Reads one line of IN-EMPLOYEE-FILE into IN-EMPLOYEE-REC. It has an extra clause, AT END MOVE 'NO' TO ARE-THERE-MORE-RECORDS, which means that when a READ finds no more lines left in the file, it assigns the literal 'NO' to ARE-THERE-MORE-RECORDS.
PERFORM 200-CALC-RTN UNTIL ARE-THERE-MORE-RECORDS = 'NO' – Runs the paragraph 200-CALC-RTN over and over until the flag becomes 'NO'.
CLOSE IN-EMPLOYEE-FILE OUT-SALARY-FILE – Closes both files *bow*
Do I still need to explain STOP RUN?
6.) Coding the Procedure Division – The Moving and Looping Logic
Now let’s create our 2nd paragraph and name it 200-CALC-RTN.
       200-CALC-RTN.
           MOVE SPACES TO OUT-SALARY-REC.
           MOVE IN-EMPLOYEE-NAME TO OUT-EMPLOYEE-NAME.
           MOVE IN-SALARY TO OUT-SALARY.
           MOVE IN-STATE-TAX TO OUT-STATE-TAX.
           WRITE OUT-SALARY-REC.
           READ IN-EMPLOYEE-FILE
               AT END MOVE 'NO' TO ARE-THERE-MORE-RECORDS.
The logic is simple. It moves each field from the input record to the matching field of the output record, and the output PICs take care of the formatting. If you recall, both records are described under the FDs (File Descriptions) which we coded earlier. You will notice a new word, SPACES. It is a figurative constant that stands for blank characters, so MOVE SPACES TO OUT-SALARY-REC blanks out the whole record, FILLER fields included, before we fill it in. After writing, the paragraph reads the next line, so the loop in 100-MAIN-MODULE stops once the file runs out. Now run the program. It shouldn’t display anything on the command prompt, BUT it should create a file named OUTPUT.TXT, as defined in our DATA DIVISION, containing the output. It should look like this:
7.) Debriefing and Summary
In the end, your code should look like this:
[Screenshots of the complete listing: lines 1 – 44 and lines 17 – 58]
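For reference, here is the program in one piece, assembled from the fragments in the sections above. Treat it as a sketch rather than a guaranteed-to-compile listing: it assumes fixed-format source, and clauses such as ASSIGN TO DISK and VALUE OF FILE-ID are dialect-specific, so your compiler may warn about them or want them written differently.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SAMPLE2.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IN-EMPLOYEE-FILE ASSIGN TO DISK
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT OUT-SALARY-FILE ASSIGN TO DISK
               ORGANIZATION IS LINE SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
      * Input file: one employee per line of INPUT.TXT.
       FD  IN-EMPLOYEE-FILE
           LABEL RECORDS ARE STANDARD
           RECORD CONTAINS 80 CHARACTERS
           VALUE OF FILE-ID IS "INPUT.TXT"
           DATA RECORD IS IN-EMPLOYEE-REC.
       01  IN-EMPLOYEE-REC.
           05  IN-EMPLOYEE-NAME       PIC X(20).
           05  IN-SALARY              PIC 9(5).
           05  IN-NO-OF-DEPENDENTS    PIC 9.
           05  IN-STATE-TAX           PIC 9(4)V99.
      * Output file: formatted lines written to OUTPUT.TXT.
       FD  OUT-SALARY-FILE
           LABEL RECORDS ARE STANDARD
           RECORD CONTAINS 80 CHARACTERS
           VALUE OF FILE-ID IS "OUTPUT.TXT"
           DATA RECORD IS OUT-SALARY-REC.
       01  OUT-SALARY-REC.
           05  OUT-EMPLOYEE-NAME      PIC X(20).
           05  FILLER                 PIC X(5).
           05  OUT-SALARY             PIC $(2),$(3).99.
           05  FILLER                 PIC X(5).
           05  OUT-STATE-TAX          PIC $Z(3)9.99.

       WORKING-STORAGE SECTION.
       01  WS-WORK-AREAS.
           05  ARE-THERE-MORE-RECORDS PIC X(3) VALUE 'YES'.

       PROCEDURE DIVISION.
       100-MAIN-MODULE.
           OPEN INPUT  IN-EMPLOYEE-FILE
                OUTPUT OUT-SALARY-FILE.
      * First read, so the loop starts with a record in hand.
           READ IN-EMPLOYEE-FILE
               AT END MOVE 'NO' TO ARE-THERE-MORE-RECORDS.
           PERFORM 200-CALC-RTN
               UNTIL ARE-THERE-MORE-RECORDS = 'NO'.
           CLOSE IN-EMPLOYEE-FILE
                 OUT-SALARY-FILE.
           STOP RUN.

       200-CALC-RTN.
      * Blank the output record, copy the fields, write the line.
           MOVE SPACES TO OUT-SALARY-REC.
           MOVE IN-EMPLOYEE-NAME TO OUT-EMPLOYEE-NAME.
           MOVE IN-SALARY TO OUT-SALARY.
           MOVE IN-STATE-TAX TO OUT-STATE-TAX.
           WRITE OUT-SALARY-REC.
      * Read the next line; AT END stops the loop.
           READ IN-EMPLOYEE-FILE
               AT END MOVE 'NO' TO ARE-THERE-MORE-RECORDS.
```

STOP RUN sits at the end of 100-MAIN-MODULE so that, once the files are closed, control does not fall through into 200-CALC-RTN.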
It would be better to code this from scratch: if you run into bugs along the way, you will learn how to fix them and be more aware of them next time.
Thanks and Happy Coding!