Improving Legacy COBOL
When computer programs have been around for a while, they tend not to age gracefully. Whatever the history of a program, the chances are that it has been updated by a number of programmers with differing experience and expertise.
The result is often that the program is no longer in the best of health. This shows in much the same way that old people show their age:
- They take longer to do what they do. In the immortal words of Fred Wedlock, “it takes me all night to do what I used to do all night”; unfortunately, I am beginning to understand what he meant.
- They tend to forget what they have just done, and repeat themselves a bit.
- They tend to make mistakes that they never used to make.
- Sometimes they just go on and on.
- Sometimes they just fall over and are really hard to get up again.
- Most of the time, you find yourself wondering what the hell they are talking about.
“So what?” you may find yourself asking. As long as the program is right most of the time, and it finishes before it breaches any Service Level Agreement, what is the problem?
The problems come in quite subtle ways. You may find yourself at three in the morning scratching your head with an incident manager asking you for updates more and more frantically as you wonder how to decide which records you can drop to get this program re-run and you back to your bed; or you may find yourself being asked about a transaction that is not quite right and not having a clue how to go about finding the cause of the problem; or, worst scenario of the lot, you may find yourself being asked to make changes to two thousand lines of procedure division, most of which is unintelligible.
And guess what? Management expect you to estimate the change based on the number of lines you are changing, not the complexity of the existing program. So three weeks later, with a two-day estimate, you see your ratings going down the pan.
But there is hope! When I was a fair bit younger, I was involved in a project to create an optimising compiler. This meant looking at a lot of Complex Redundant And Poor code to identify improvements.
Some years later, but not very recently, I encountered the Worst Ever Program Ever! So I put some of what I learned in my optimising days into practice and developed a set of techniques to improve the code. Unfortunately, the approach was not popular with management, because it took time (but much less time than they thought), cost money (but much less money than it saved) and was risky (but not nearly as risky as running the original program).
So what do you do when you are confronted by a huge, unsightly, old, undocumented, unreadable program that you do not understand, that is regularly producing incorrect results, falling over, and needing changes?
The first thing you need to do, unless you are prepared to spend a couple of days of your own time cutting code, is to get your management on side. Tell them your concerns, tell them it will not take as long as writing a good version of the program from scratch, tell them that it will be cheaper and that you will end up with a safer version. Smile, bat your eyelids and buy them a doughnut. This has generally worked with the managers I have worked with, but be warned, the women are harder to get round. :-)
Assuming that the above ploys work and you get the go-ahead (or, like me, you will do it for the hell of it), what is the approach?
- Don’t try to understand the program at first, just make the changes you identify – this is an exercise in low-level code improvement
- Run the production program with copies of depersonalised production data in the test environment, using the output as a baseline set for comparison purposes
- If files are updated in place by the program rather than just read or written, set up the necessary copy job to reset them to their starting state before each test run
- Change a little (less than 30 minutes coding) and test a lot
- Use Comparex to prove all output files match the original baseline
- After each test, secure the source before making any more changes
- Once you are happy with the program, perform parallel running between the production and test versions, using Comparex to do your checking
So what smells are you looking to fix? (Please note, I have not gone back to the old people analogy; I am using the wording from the classic re-engineering book ‘Refactoring to Patterns’.)
- The first smell I like to remove is duplicate code. This is especially common where the program interacts with the system, for example file reads or writes in more than one place. The quickest and neatest way to approach this issue is:
- Move all reads and writes into their own sections, one section per file access
- Ensure each file has its own inviolate status code – don’t let two files share one status code
- Ensure that the status of each read and write is fully checked. Don’t ever just check for good reads and end of file; if any other return code occurs, you will never know about it. Always anticipate a bad code and put in a catch-all error processing section.
- Use a flag to ensure that you do not try to read beyond the end of file. The best way, especially when reading to compare against other files, is to read into working storage and, when you get an EOF status, move high-values to the working storage record.
- Within each file access section, wrap the code in an if-else statement: if the previous return code was good then read, else report the error and end
Test the program via the Comparex method at this point.
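To make the file-handling rules above a little more concrete, here is a minimal sketch of a single read section. It is an illustration only, not lifted from any real program: the file name CUSTIN, the 80-byte record and every data name are made up, and ORGANIZATION IS LINE SEQUENTIAL is only there so that it can be tried with GnuCOBOL against a text file. On the mainframe you would normally assign to a DD name and drop that clause.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. READDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * Each file gets a status field that no other file shares.
           SELECT CUST-FILE ASSIGN TO "CUSTIN"
               ORGANIZATION IS LINE SEQUENTIAL
               FILE STATUS IS WS-CUST-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  CUST-FILE.
       01  CUST-FILE-REC           PIC X(80).
       WORKING-STORAGE SECTION.
       01  WS-CUST-STATUS          PIC XX.
           88  CUST-OK             VALUE "00".
           88  CUST-EOF            VALUE "10".
       01  WS-CUST-REC             PIC X(80).
       01  WS-CUST-COUNT           PIC 9(7) VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-LINE SECTION.
           OPEN INPUT CUST-FILE
           IF NOT CUST-OK
               PERFORM FILE-ERROR
           END-IF
           PERFORM READ-CUST
      * HIGH-VALUES in the working storage record means end of file.
           PERFORM UNTIL WS-CUST-REC = HIGH-VALUES
               ADD 1 TO WS-CUST-COUNT
               PERFORM READ-CUST
           END-PERFORM
           CLOSE CUST-FILE
           DISPLAY "RECORDS READ: " WS-CUST-COUNT
           STOP RUN.
       READ-CUST SECTION.
      * The only READ of CUST-FILE in the whole program.
      * It reads only if the previous file operation was good.
           IF CUST-OK
               READ CUST-FILE INTO WS-CUST-REC
               EVALUATE TRUE
                   WHEN CUST-OK
                       CONTINUE
                   WHEN CUST-EOF
                       MOVE HIGH-VALUES TO WS-CUST-REC
      * Any other status is treated as an error, never ignored.
                   WHEN OTHER
                       PERFORM FILE-ERROR
               END-EVALUATE
           ELSE
               PERFORM FILE-ERROR
           END-IF.
       FILE-ERROR SECTION.
      * Catch-all: report the status and end with a bad return code.
           DISPLAY "CUSTIN I/O ERROR, STATUS " WS-CUST-STATUS
           MOVE 16 TO RETURN-CODE
           STOP RUN.
```

The mainline never looks at the status code itself; it simply loops until the working storage record contains high-values. A stray second read after end of file drops into the ELSE branch and is reported, rather than being quietly ignored. The return code of 16 is an arbitrary choice for the sketch; use whatever your site expects.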
- The second smell to remove is deeply embedded conditions. These typically cover several pages, go down three levels of ifs, come up two elses and go down two more ifs, coming up one else and ending with a period. If you are lucky, indenting will be used to help you line up elses with ifs that are three pages apart, and if you are very lucky the indentations will be accurate. Out-of-date explanations will be embedded within this structure, as will twenty lines of commented-out code dated from seventeen years ago, left in just in case it is needed.
- First, remove all commented-out code. It ain’t running, it’s history, get rid of it.
- Take the lowest level if that you can find, and try to understand what the code executed is doing. For instance it may be calculating available balance, and every other level may be reporting on various error conditions. Take the executed code, its enclosing if and the else condition if it exists, and move it to a section called CALCULATE-AVAILABLE-BALANCE-OR-REPORT-ERROR. Now the more alert amongst us will recognise that this is too long a name in mainframe COBOL (user-defined names are limited to 30 characters) and will take steps to shorten it, still keeping the meaning clear.
- Test, secure the source if the test is successful, and repeat. If the test fails, discard your changes, go back to the previous version of the source and be more careful.
- Repeat until all sections containing conditions are less than one page long, preferably about twenty lines allowing them to be seen on a single screen.
- Long sections should be examined to see if they are functionally primitive. This means each should do one thing only. For example, a section that moves fields and calculates results to go into a record does one thing, and you would call it SET-UP-RECORD.
- All IF statements should have corresponding END-IF statements. Never use the period to terminate a condition.
- All EVALUATE statements should have a well-considered default action, coded as WHEN OTHER. Don’t believe, just because the Analyst has said it, that an unexpected value cannot occur. Believe me, it can, and it can be a career stopper.
- When all the code has been flattened and normalised so that it occurs once in small sections, and the more complex sections consist of a series of PERFORM statements with no logic controlling processes, then you have the opportunity to see what the program is doing. Determine what the real line of processing is, what I call the nominal processing – what the program would be doing in the ideal world, and organise the mainline of code to reflect that sequence of events.
- Once you have reached this state, you should find that your program is much more maintainable, with a surprising number of glaring errors that you can see how to fix. Remember, you should NOT be correcting code whilst performing this process; you are re-engineering to see what it does and aiming for understanding. Only when this process is complete can you proceed to making corrections, otherwise your results deviate from the production version and you can no longer test against the baseline.
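As a rough picture of where the flattening ends up, here is a small self-contained sketch. Everything in it is hypothetical: the data names, the values and the business rule are invented for illustration, and CALC-AVAIL-BAL-OR-ERROR is just one way of shortening the long section name mentioned above to fit the 30-character limit.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FLATDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical test values, hard-coded to keep the sketch short.
       01  WS-TXN-TYPE             PIC X        VALUE "D".
       01  WS-TXN-AMOUNT           PIC S9(7)V99 VALUE 250.00.
       01  WS-LEDGER-BAL           PIC S9(7)V99 VALUE 1000.00.
       01  WS-HELD-FUNDS           PIC S9(7)V99 VALUE 100.00.
       01  WS-AVAIL-BAL            PIC S9(7)V99 VALUE ZERO.
       01  WS-ERROR-FLAG           PIC X        VALUE "N".
           88  TXN-IN-ERROR        VALUE "Y".
       PROCEDURE DIVISION.
       MAIN-LINE SECTION.
      * Nominal processing only: a sequence of PERFORMs, no logic.
           PERFORM CALC-AVAIL-BAL-OR-ERROR
           PERFORM APPLY-TRANSACTION
           PERFORM REPORT-RESULT
           STOP RUN.
       CALC-AVAIL-BAL-OR-ERROR SECTION.
      * Stands in for the innermost IF lifted out of a nested block.
      * The IF is closed by END-IF, not by the period.
           IF WS-HELD-FUNDS > WS-LEDGER-BAL
               SET TXN-IN-ERROR TO TRUE
           ELSE
               SUBTRACT WS-HELD-FUNDS FROM WS-LEDGER-BAL
                   GIVING WS-AVAIL-BAL
           END-IF.
       APPLY-TRANSACTION SECTION.
           IF NOT TXN-IN-ERROR
               EVALUATE WS-TXN-TYPE
                   WHEN "D"
                       ADD WS-TXN-AMOUNT TO WS-AVAIL-BAL
                   WHEN "W"
                       SUBTRACT WS-TXN-AMOUNT FROM WS-AVAIL-BAL
      * A value that "cannot happen" still gets an action.
                   WHEN OTHER
                       SET TXN-IN-ERROR TO TRUE
               END-EVALUATE
           END-IF.
       REPORT-RESULT SECTION.
           IF TXN-IN-ERROR
               DISPLAY "TRANSACTION REJECTED, TYPE " WS-TXN-TYPE
           ELSE
               DISPLAY "TRANSACTION APPLIED"
           END-IF.
```

Each section fits on one screen and does one thing, and the mainline reads as the nominal sequence of events. Whether the error check belongs inside APPLY-TRANSACTION or in a section of its own is a judgement call; the point of the sketch is only that the mainline itself carries no conditions.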
