Tweaking COBOL Compiler Options for Improved Performance
COBOL is still the most widely used language on the mainframe, both for online transaction processing and for massive batch processing. Even a minor performance improvement in frequently executed COBOL programs directly saves CPU and hence cost. One simple and often neglected way to optimize COBOL performance is to use the right set of compiler options.
In quite a few performance tuning engagements, the culprit turns out to be the compiler options (set in the JCL or the configuration tool) shared by almost all programs: they are mostly left at their defaults, and some of those defaults bring down performance. Even veteran COBOL programmers tend to ignore these and focus on the programs alone when trying to improve performance. In this article, I would like to highlight the COBOL compiler options that affect performance.
The OPTIMIZE compiler option can be used to improve the efficiency of the generated code. NOOPTIMIZE is the default. OPTIMIZE(STD) results in the following optimizations:
- Eliminate unnecessary transfers of control and inefficient branches (including those generated by the compiler).
- Simplify PERFORM statements and CALL statements to a contained (nested) program by placing the statements inline where possible, eliminating the need for linkage code.
- Eliminate duplicate computations (such as subscript computations and repeated statements).
- Eliminate constant computations by performing them when the program is compiled.
- Eliminate constant conditional expressions.
- Aggregate moves of contiguous items (for example, those generated by MOVE CORRESPONDING) into a single move.
- Delete from the program, code that can never be performed (unreachable code elimination).
OPTIMIZE(FULL) performs all of the above and additionally does the following:
- Discard unreferenced data items from the DATA DIVISION, and suppress generation of the code that initializes these data items to their VALUE clauses. (If the program relies upon unreferenced level 01 or level 77 data items, do not use OPTIMIZE(FULL).)
Note that OPTIMIZE requires more CPU time for compiles than NOOPTIMIZE, but generally produces more efficient run-time code. NOOPTIMIZE is suggested while a program is being developed: compiles are frequent at that stage, and debugging is relatively easier because the compiler does not move code around.
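As a minimal sketch of how the option can be set for a single program, the CBL (PROCESS) statement below requests OPTIMIZE(STD) ahead of the IDENTIFICATION DIVISION header. The program name and data items are hypothetical, and the option spelling is the one used in this article; the same option could instead be passed through the compile step in the JCL.

```cobol
       CBL OPTIMIZE(STD)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. OPTDEMO.
      * Hypothetical program, for illustration only.
      * During development, NOOPTIMIZE could be used instead so
      * that compiles are cheaper and debugging is easier.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-RATE            PIC 9(3)V99 VALUE 1.25.
       01  WS-TOTAL           PIC 9(7)V99 VALUE ZERO.
       PROCEDURE DIVISION.
      * (60 * 24) is a constant computation, a candidate for
      * evaluation at compile time rather than at run time.
           COMPUTE WS-TOTAL = WS-RATE * (60 * 24)
           DISPLAY 'TOTAL: ' WS-TOTAL
           GOBACK.
```

Options coded on the CBL statement apply only to that compilation, which makes it a convenient way to try an option on one program before changing a shop-wide default.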
With the DYNAM option, all subprograms invoked through the CALL literal statement are loaded dynamically at run time. NODYNAM is the default. DYNAM allows common subprograms to be shared and gives control over the use of virtual storage (which can be freed using the CANCEL statement), but it carries a performance penalty: the call must go through a library routine, whereas with the NODYNAM option the call goes directly to the subprogram. Detailed information is available at http://itknowledgeexchange.techtarget.com/enterprise-IT-tech-trends/static-dynamic-linking-in-ibm-cobol/.
According to IBM, for a CALL intensive application, the average overhead associated with the CALL using DYNAM ranged from 40% to 100% compared to that of NODYNAM.
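The difference between the two call forms can be sketched as follows. The subprogram names DATEFMT and TAXCALC and the data items are hypothetical. Only the CALL literal form is affected by DYNAM/NODYNAM; a CALL through a data item is dynamic in either case.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CALLDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-PGM-NAME        PIC X(8)  VALUE 'TAXCALC'.
       01  WS-PARM            PIC X(80) VALUE SPACES.
       PROCEDURE DIVISION.
      * CALL literal: static under NODYNAM, dynamic under DYNAM.
           CALL 'DATEFMT' USING WS-PARM
      * CALL identifier: always resolved dynamically at run time,
      * whichever of DYNAM or NODYNAM is in effect.
           CALL WS-PGM-NAME USING WS-PARM
      * CANCEL frees the storage of a dynamically loaded subprogram.
           CANCEL WS-PGM-NAME
           GOBACK.
```

One possible design choice, under these assumptions, is to compile with NODYNAM so that frequently called routines are linked statically, and to use CALL identifier only for the subprograms that genuinely need to be shared or cancelled.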
Using the FASTSRT compiler option improves the performance of most sort operations. NOFASTSRT is the default. With FASTSRT, the DFSORT product (instead of Enterprise COBOL) performs the I/O on the input and output files named in the SORT . . . USING and SORT . . . GIVING statements.
One program that processed 100,000 records was 45% faster when using FASTSRT compared to using NOFASTSRT and used 4,000 fewer EXCPs.
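A sort of the kind FASTSRT targets might look like the sketch below, where both the input and the output are named files on USING and GIVING. The file names, DD names and record layout are hypothetical, and the files are assumed to be ordinary sequential (QSAM) files; other eligibility conditions may apply.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SRTDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUST-IN   ASSIGN TO CUSTIN.
           SELECT CUST-OUT  ASSIGN TO CUSTOUT.
           SELECT SORT-WORK ASSIGN TO SORTWK.
       DATA DIVISION.
       FILE SECTION.
       FD  CUST-IN  RECORDING MODE F.
       01  CUST-IN-REC        PIC X(80).
       FD  CUST-OUT RECORDING MODE F.
       01  CUST-OUT-REC       PIC X(80).
       SD  SORT-WORK.
       01  SORT-REC.
           05  SR-CUST-ID     PIC X(10).
           05  FILLER         PIC X(70).
       PROCEDURE DIVISION.
      * USING and GIVING name the files directly, so with FASTSRT
      * the sort product can read and write them itself.
           SORT SORT-WORK
               ON ASCENDING KEY SR-CUST-ID
               USING CUST-IN
               GIVING CUST-OUT
           IF SORT-RETURN NOT = 0
               DISPLAY 'SORT FAILED, RC=' SORT-RETURN
           END-IF
           GOBACK.
```

If the sort used an INPUT PROCEDURE or OUTPUT PROCEDURE instead, the COBOL program would be handling the records on that side itself, so the benefit described above would not be expected there.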
The XMLPARSE(XMLSS) option (the default) selects the z/OS XML System Services parser, whereas XMLPARSE(COMPAT) uses the parser built into the COBOL run time. While XMLSS provides additional capabilities, at present the COMPAT option is found to be faster by 20-108%. It is important to note, however, that as IBM focuses more on the XML parser, the performance difference is likely to narrow.
The THREAD option enables multi-threading in COBOL. It can also be specified for a non-threaded application (ref. http://itknowledgeexchange.techtarget.com/enterprise-IT-tech-trends/multithreading-in-cobol/), but it degrades run-time performance because of the overhead of the serialization logic that is automatically generated. NOTHREAD is the default.
The ARITH option controls the maximum number of digits allowed for decimal numbers. ARITH(COMPAT), the default, allows a maximum of 18 digits (which should serve well for most requirements), while ARITH(EXTEND) allows up to 31.
ARITH(EXTEND) causes performance degradation for all decimal data types because of larger intermediate results. On average the performance impact is 16%, while for programs with heavy use of decimals it could be as high as 40%.
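To illustrate the digit limits, the sketch below declares one item that fits within 18 digits and one that needs 31. The program and data names are hypothetical; the second declaration is the only reason ARITH(EXTEND) is requested here.

```cobol
       CBL ARITH(EXTEND)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ARITHDM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * 18 digits: within the limit of ARITH(COMPAT).
       01  WS-AMOUNT-18       PIC S9(16)V99 COMP-3 VALUE ZERO.
      * 31 digits: accepted only under ARITH(EXTEND).
       01  WS-AMOUNT-31       PIC S9(29)V99 COMP-3 VALUE ZERO.
       PROCEDURE DIVISION.
           ADD 1 TO WS-AMOUNT-31
           DISPLAY WS-AMOUNT-31
           GOBACK.
```

If no item in a program needs more than 18 digits, staying with ARITH(COMPAT) avoids the larger intermediate results for every decimal computation in that program.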
The AWO option implicitly activates the APPLY WRITE-ONLY clause for all physical sequential, variable-length, blocked files (irrespective of whether it is specified in the program). NOAWO is the default. Using the APPLY WRITE-ONLY clause makes optimum use of buffer and device space. With APPLY WRITE-ONLY specified, the file buffer is written to the output device only when the next record does not fit in the unused portion of the buffer. Without APPLY WRITE-ONLY specified, a file buffer is written to the output device when it does not have enough space for a maximum-size record.
According to IBM, a program using variable-length blocked files and AWO was 86% faster than NOAWO (as the result of using 98% fewer EXCPs to process the writes).
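A small worked example, with made-up numbers and ignoring control-field overhead, shows the difference. Suppose the buffer holds 4,000 bytes, the maximum record size is 400 bytes, 3,700 bytes are already filled, and the next record is 100 bytes long. Without APPLY WRITE-ONLY, the 300 bytes remaining cannot hold a maximum-size record, so the buffer is written out first. With APPLY WRITE-ONLY, the 100-byte record fits in the remaining 300 bytes, so it is added to the same block. The sketch below shows the explicit per-file clause that AWO makes unnecessary; the file, DD and data names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. AWODEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT LOG-OUT ASSIGN TO LOGOUT.
       I-O-CONTROL.
      * Explicit per-file form; compiling with AWO applies the same
      * behavior to every eligible file without this clause.
           APPLY WRITE-ONLY ON LOG-OUT.
       DATA DIVISION.
       FILE SECTION.
       FD  LOG-OUT
           RECORDING MODE V
           BLOCK CONTAINS 0 RECORDS
           RECORD IS VARYING IN SIZE FROM 20 TO 400 CHARACTERS
               DEPENDING ON WS-LOG-LEN.
       01  LOG-REC            PIC X(400).
       WORKING-STORAGE SECTION.
       01  WS-LOG-LEN         PIC 9(4) COMP VALUE 100.
       PROCEDURE DIVISION.
           OPEN OUTPUT LOG-OUT
           MOVE 'SAMPLE LOG ENTRY' TO LOG-REC
           WRITE LOG-REC
           CLOSE LOG-OUT
           GOBACK.
```

The gain is largest when most records are much shorter than the maximum, since that is when blocks would otherwise be written out with the most unused space.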
The BLOCK0 option changes the default for QSAM files from unblocked to blocked, thus gaining the benefit of system-determined blocking for output files. NOBLOCK0 is the default. BLOCK0 applies to each file that meets all of the following criteria:
- The FILE-CONTROL paragraph either specifies ORGANIZATION SEQUENTIAL or omits the ORGANIZATION clause.
- The FD entry does not specify RECORDING MODE U.
- The FD entry does not specify a BLOCK CONTAINS clause.
AWO might apply to more files than it otherwise would, if BLOCK0 is also specified (as AWO applies only for blocked variable-length records). One program using BLOCK0 was found to be 88% faster than using NOBLOCK0 (using 98% fewer EXCPs).
Note that specifying BLOCK0 for existing programs might change the behavior of the program, especially for files opened as INPUT without a block size.
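The criteria above can be read off the FD entries. In the FILE SECTION fragment below (file and record names are hypothetical, and both files are assumed to be sequential files whose SELECT entries omit the ORGANIZATION clause), the first file would be picked up by BLOCK0 and the second would not.

```cobol
       FILE SECTION.
      * Eligible for BLOCK0: no RECORDING MODE U and no
      * BLOCK CONTAINS clause, so it is treated as blocked.
       FD  RPT-OUT
           RECORDING MODE F.
       01  RPT-REC            PIC X(133).
      * Not affected by BLOCK0: BLOCK CONTAINS is coded
      * explicitly, so the program's own blocking stands.
       FD  AUDIT-OUT
           RECORDING MODE F
           BLOCK CONTAINS 10 RECORDS.
       01  AUDIT-REC          PIC X(80).
```

Reviewing FD entries this way before switching an existing program to BLOCK0 is one way to spot the files whose behavior might change.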
NUMPROC(PFD) improves the performance of processing numeric internal decimal and zoned decimal data. With NUMPROC(PFD), the compiler assumes that the data has the correct sign and bypasses the sign fix-up processing. Use this option only if your program data agrees exactly with the IBM system standards for sign representation.
Note that NUMPROC(NOPFD) is the default, and it is recommended if the numeric internal decimal and zoned decimal data might not use proper signs (especially if the program has to process external data files). Also note that NUMPROC(NOPFD) or NUMPROC(MIG) should be used if a COBOL program calls programs written in PL/I or FORTRAN.
NUMPROC(PFD), which can provide a performance benefit of 5-20%, is advisable for performance-sensitive applications after ensuring that the necessary conditions are met.
TRUNC(OPT) is another tuning option for performance-sensitive applications. It should be used only when the data in the application program conforms to the PICTURE and USAGE specifications.
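The condition on TRUNC(OPT) can be made concrete with a binary item. In the sketch below (program and data names are hypothetical), WS-COUNT is declared with four decimal digits, but the halfword that holds it can physically store larger values. Adding 1 to 9999 produces a value that no longer conforms to the PICTURE, which is exactly the situation TRUNC(OPT) assumes never happens.

```cobol
       CBL TRUNC(OPT)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TRUNCDM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * PICTURE allows -9999 to +9999; the binary halfword
      * underneath can hold values outside that range.
       01  WS-COUNT           PIC S9(4) COMP VALUE 9999.
       PROCEDURE DIVISION.
      * The result exceeds the PICTURE. Under TRUNC(OPT) the
      * compiler does not guarantee how this is handled, so
      * the displayed value should not be relied upon.
           ADD 1 TO WS-COUNT
           DISPLAY WS-COUNT
           GOBACK.
```

A program like this one is a poor candidate for TRUNC(OPT); programs whose binary items always stay within their PICTURE limits are the ones that can safely take the faster code.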
While the above points are related to runtime efficiency, the following two options are worth noting in a development environment:
- Use BUFSIZE to allocate an amount of main storage to the buffer for each compiler work data set. Usually, a large buffer size improves the performance of the compiler.
- Use the COMPILE option only if you want to force full compilation even in the presence of serious errors. All diagnostics and object code will be generated. Do not try to run the object code if the compilation resulted in serious errors: the results could be unpredictable or an abnormal termination could occur.
With judicious use of the compiler options, suited to the given environment, performance benefits can be achieved without modifying the programs.