Why do bugs exist and where in the development cycle do they creep in? Spending time and effort on understanding why bugs exist is the first step to writing bug-free code. The second step is to take action and institute policies that either eliminate the problem or help detect it. Most important, make sure the entire programming staff knows about and understands the new policies.
A good friend of mine who works at a different company started to use run-time parameter validation in the modules that he wrote and the code that he modified. Run-time parameter validation is a good idea. However, management and other programmers at the site were reluctant to make this programming methodology mandatory. One day my friend was modifying some existing code on the project and, while he was at it, he added parameter validation to the functions that he had modified. He tested the code and checked it back into the source-code control system. A few weeks later the new validation began reporting parameter errors that were traced to code written years earlier. Unbelievably, some programmers wanted the parameter validation removed. After all, they reasoned, code that once worked was now producing errors, so the fault must lie in the new parameter validation code, not the old existing code.
This is an extreme example, but it demonstrates that everyone involved with a project must fully understand new programming methodologies instituted in the middle of a programming project.
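To make the idea of run-time parameter validation concrete, here is a minimal sketch. It is not the code from the anecdote; the names ParamFailed, PARAM_BAD and CopyName are hypothetical and exist only for illustration. The function checks its arguments on every call and reports the failing expression along with the file and line of the check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: report a failed parameter check, then return nonzero. */
static int ParamFailed(const char *pszExpr, const char *pszFile, int nLine)
{
    fprintf(stderr, "Parameter error: %s (%s, line %d)\n",
            pszExpr, pszFile, nLine);
    return 1;
}

/* Hypothetical macro: nonzero when the check fails, after reporting it. */
#define PARAM_BAD(expr) (!(expr) && ParamFailed(#expr, __FILE__, __LINE__))

/* Copies at most cbDest-1 characters of pszSrc into pszDest.
   Returns 0 on success and -1 when a parameter is invalid. */
int CopyName(char *pszDest, size_t cbDest, const char *pszSrc)
{
    size_t i;

    if (PARAM_BAD(pszDest != NULL) ||
        PARAM_BAD(pszSrc != NULL) ||
        PARAM_BAD(cbDest > 0)) {
        return -1;   /* refuse to run with bad arguments */
    }
    for (i = 0; i + 1 < cbDest && pszSrc[i] != '\0'; ++i) {
        pszDest[i] = pszSrc[i];
    }
    pszDest[i] = '\0';
    return 0;
}
```

A call such as CopyName(NULL, 4, "x") returns -1 and names the failed check on stderr. Note what this means for the story above: the report points at a caller passing a bad argument, which is exactly the kind of old error that had been going unnoticed.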
1.1 Small versus Large Projects
What if you needed to write a hex dump utility to be called DUMP? The program takes as an argument on the command line the name of the file you wish to display in hex. Would it be written without a single bug? Yes, probably so, but why? Because the task is small, well defined and isolated.
So what happens when you are asked to work on project ALPHA, a contract programming project your company is working on for another company? The project is several hundred thousand lines long and has ten programmers already working on the project. The deadlines are approaching and the company needs your programming talent. Do you think you could jump right in and write new code without introducing bugs into the project? I couldn't, at least not without the proper programming methodologies in place to catch the programming errors any beginner on the project is bound to make.
Think of the largest project you have worked on. How many include files were there and what did the include files contain? How many source files were there and what did they contain? You had no problem working on the DUMP utility, so what makes the large project so difficult to work on? Why is the large project not simply like working on ten or one hundred small projects?
Let's examine your small programming project. It consists of a couple of source files and a couple of include files. The include files contain function prototypes, data structure declarations, #define's, typedef's and whatever else. You have knowledge of everything, and because the number of files is relatively small, you can handle it all. Now multiply this by ten or one hundred for the large project, and all of a sudden the project becomes unmanageable.
You have too much information in the include files to manage and now the project is falling behind, so you add more people to the project to get it completed faster. This only compounds the problem, because now you have even more people adding to the pool of information that everyone else needs to learn and know about. It is a vicious cycle that all too many projects fall into.
1.2 Too Many Data Structures in Include Files
We can all agree that one major problem in large projects is that there is too much information to become familiar with in a short period of time. If you could somehow eliminate some of the information, this would make it easier, since there would be less information to become familiar with.
The root of the problem is that there is too much information placed into include files, the biggest contributor being data structure declarations. When you start a project, you place a couple of declarations in the include file. As the project continues, you place more and more declarations in the include file, some of which refer to or contain previous declarations. Before you know it, you have a real mess on your hands. The majority of your source files have knowledge of the data structures and directly reference elements from the structures.
Making changes in an environment where many data structures directly refer to other data structures becomes, at best, a headache. Consider what happens when you change a data structure. This change forces you to recompile every source file that directly, or more importantly indirectly, refers to the changed data structure. This happens when a source file refers to a data structure that refers to a data structure that refers to the changed data structure. A change to the data structure may force you to modify some code, possibly in multiple source files.
The class methodology (§4) solves this data structure problem.
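One widely used C technique for keeping structure layouts out of include files is the incomplete (opaque) type. The sketch below illustrates that general technique and should not be read as the class methodology itself. The Counter type, its functions, and the counter.h/counter.c split are hypothetical, and both "files" are shown in one listing so that it compiles as a unit.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- counter.h (hypothetical public include file) ---- */
/* Clients see only an incomplete type. They cannot reference its members,
   so a change to the layout never touches this include file. */
typedef struct Counter Counter;

Counter *CounterCreate(int nStart);
void     CounterIncrement(Counter *pCounter);
int      CounterValue(const Counter *pCounter);
void     CounterDestroy(Counter *pCounter);

/* ---- counter.c (hypothetical private source file) ---- */
struct Counter {
    int nValue;   /* layout known only to this source file */
};

Counter *CounterCreate(int nStart)
{
    Counter *pCounter = malloc(sizeof *pCounter);
    if (pCounter != NULL) {
        pCounter->nValue = nStart;
    }
    return pCounter;   /* NULL when out of memory */
}

void CounterIncrement(Counter *pCounter)
{
    assert(pCounter != NULL);
    ++pCounter->nValue;
}

int CounterValue(const Counter *pCounter)
{
    assert(pCounter != NULL);
    return pCounter->nValue;
}

void CounterDestroy(Counter *pCounter)
{
    free(pCounter);   /* free(NULL) is harmless */
}
```

Because the members of struct Counter appear only in the one source file, adding or rearranging them means recompiling that file alone. No other source file can directly reference an element of the structure, so none has to be modified.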
1.3 Using Techniques That Do Not Scale
As you have just seen, large projects have problems all their own. Because of this, you must be careful to select programming methodologies that work well in small projects as well as in large programming projects. In other words, a programming methodology must scale: it must work with any project size.
Let's say that Joe Programmer institutes a policy that all data structures must be declared in include files so that all source files have direct access to them. He reasons that by doing this, he will gain a speed advantage over his competition and his product will be superior.
This may work for the first release of his product, but what happens when the size of the project grows to the point that it becomes unmanageable because there are too many public data declarations? His job is at stake. The programming methodology Joe chose worked great for the small project when it started, but it failed miserably when the project grew. And all successful small projects grow into large projects.
Make sure the programming methodologies that you develop work equally well for both small and large projects.
1.4 Too Much Global Information
Global variables (variables known to more than one source file) should be avoided. Their usage does not scale well to large applications. Just think what eventually happens in a large application when the number of global variables gets larger and larger. Before you know it, there are so many of them that the variables become unmanageable.
Whenever you are about to use a global variable, ask yourself why you need direct access to the variable. Would a function call to the module where the variable is defined work just as well? Most of the time the answer is yes. If the global variable is modified (read and written), then you should be asking yourself how the global variable is being acted upon and whether it would be better to specify what to do through a function call, leaving the how to the function.
Some global variables are OK, but my experience has been that only a handful are ever needed, no matter how large a project gets.
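As a small illustration of replacing a global variable with function calls, consider the sketch below. The error-count module and its names (ErrorReport, ErrorCount) are hypothetical. The variable is declared static, so no other source file can reach it directly; callers say what happened and the module decides how to record it.

```c
#include <stdio.h>

/* ---- errlog.c (hypothetical module) ---- */

/* File scope only: other source files cannot read or write this directly. */
static int nErrorCount = 0;

/* The caller states what happened; how it is counted and reported
   stays inside this module. */
void ErrorReport(const char *pszMessage)
{
    ++nErrorCount;
    fprintf(stderr, "error %d: %s\n", nErrorCount, pszMessage);
}

/* Read-only access for callers that need the current total. */
int ErrorCount(void)
{
    return nErrorCount;
}
```

If the counting policy changes later, for example to ignore duplicate messages, only this one module changes. Had nErrorCount been a global, every source file that incremented it would need to be found and fixed.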
1.5 Relying on Debuggers
Debuggers are great tools, but they are no substitute for good programming methodologies that help eliminate bugs in the first place.
Debuggers certainly make finding bugs easier, but at what price? Does knowing that a powerful debugger is there to help you when you get in trouble cause you to design and write code faster, and test it sooner, than you should? The result is a bug that forces you to use the debugger, a bug that would not be there had you spent more time in the design and coding phases of the project.
I personally feel that programmers whose first course of action against a bug is to use a debugger begin to rely more and more on a debugger over time. Programmers who use a debugger only after all other options have been exercised begin to rely less and less on a debugger over time.
Instead of a debugger, why not look at the code and do a quick code review, using available evidence from the program crash or problem? Often, the evidence plus looking at the code pinpoints the problem faster than the time it takes to start the program in the debugger.
By now you are probably wondering how often I use the debugger in my programming environment. Because of all the programming methodologies that I have in my bag of techniques, not very often.
Your goal should be to develop and use programming techniques that catch your bugs for you. Do not rely on a debugger to catch your bugs.
1.6 Fixing the Symptom and Not the Problem
Let's say you encounter a bug in your code. How do you go about fixing the problem? Before you answer, what is the problem? Is the problem the bug itself, or is it what caused the bug to occur?
All too often, the bug or symptom is being fixed and not the actual problem which caused the bug in the first place. A simple test is to think how many times you have encountered essentially the same bug. If never, then great, you are already fixing the problem and not the symptom. Otherwise, give some careful thought to what you are fixing. You probably want to come up with a new programming methodology that helps to fix the problem once and for all.
A prime example is storage leaks. A storage leak is a memory object that is allocated but never freed by your program. In most cases, a storage leak does not directly cause problems, but what if it happens on an object that is allocated often and never freed? You run out of memory at some point.
Running out of memory is what caused you to start looking for the problem. You eventually find the missing line of code that should have deallocated the memory object and add the memory deallocation into the program. Bug fixed, right?
Wrong! The underlying problem of storage leaks going undetected is still in the program. In fact, had the heap manager told you where there was a storage leak in your program, you would not have wasted your time tracking down the storage leak. To fix the problem once and for all, the heap manager must tell you that there is a storage leak. This programming methodology is discussed in Chapter 5.
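To show what it means for a heap manager to tell you about a leak, here is a minimal sketch of a tracking layer over malloc and free. It is not the heap manager of Chapter 5. The names TrackedAlloc, TrackedFree, ReportLeaks and TRACKED_ALLOC are hypothetical, and the linear list search is chosen for brevity rather than speed.

```c
#include <stdio.h>
#include <stdlib.h>

/* One record per live allocation (hypothetical bookkeeping structure). */
typedef struct AllocRecord {
    struct AllocRecord *pNext;
    void               *pMem;
    size_t              cbSize;
    const char         *pszFile;
    int                 nLine;
} AllocRecord;

static AllocRecord *pLiveList = NULL;

/* Allocates memory and remembers which file and line asked for it. */
void *TrackedAlloc(size_t cbSize, const char *pszFile, int nLine)
{
    AllocRecord *pRec = malloc(sizeof *pRec);
    void *pMem = malloc(cbSize);

    if (pRec == NULL || pMem == NULL) {
        free(pRec);
        free(pMem);
        return NULL;
    }
    pRec->pMem = pMem;
    pRec->cbSize = cbSize;
    pRec->pszFile = pszFile;
    pRec->nLine = nLine;
    pRec->pNext = pLiveList;
    pLiveList = pRec;
    return pMem;
}

/* Frees memory and drops its record. Unknown pointers are reported, not freed. */
void TrackedFree(void *pMem)
{
    AllocRecord **ppRec;

    if (pMem == NULL) {
        return;
    }
    for (ppRec = &pLiveList; *ppRec != NULL; ppRec = &(*ppRec)->pNext) {
        if ((*ppRec)->pMem == pMem) {
            AllocRecord *pDead = *ppRec;
            *ppRec = pDead->pNext;
            free(pDead);
            free(pMem);
            return;
        }
    }
    fprintf(stderr, "TrackedFree: pointer %p was not allocated here\n", pMem);
}

/* Lists every allocation that is still live and returns how many there are.
   Intended to be called at program exit. */
int ReportLeaks(void)
{
    int nLeaks = 0;
    const AllocRecord *pRec;

    for (pRec = pLiveList; pRec != NULL; pRec = pRec->pNext) {
        fprintf(stderr, "leak: %lu bytes allocated at %s, line %d\n",
                (unsigned long)pRec->cbSize, pRec->pszFile, pRec->nLine);
        ++nLeaks;
    }
    return nLeaks;
}

/* Records the call site automatically. */
#define TRACKED_ALLOC(cb) TrackedAlloc((cb), __FILE__, __LINE__)
```

With every allocation going through TRACKED_ALLOC, a call to ReportLeaks at exit names the file and line of each object that was never freed. The missing deallocation is then found without waiting for the program to run out of memory.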
1.7 Unmaintainable Code
We all have had the experience of modifying code that someone else has written, either to add a new feature or to fix a problem in a module. I don't know about you, but for me it is typically not an enjoyable experience because the code is often so hard to understand that I end up spending the majority of my time just figuring out what is going on.
Good code runs. Great code runs and is also easily maintainable. In your code reviews, you should look for code that not only works, but also for code that is straightforward in how it works. What good is a coding technique if it cannot be understood? Consider the following.
Is this code fragment hard for you to understand? If not, then you know the technique being used. If you do not know the technique, then it is certainly frustrating to figure out. This code is one way to compute the ceiling of nTotal/nX, as the ceil function would.
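As an illustration of the kind of technique in question, one common integer idiom for the ceiling of nTotal/nX is sketched below, with the comment it needs to be maintainable. This is a sketch of a well-known idiom and not necessarily the exact fragment discussed above. The function name CeilDiv is hypothetical, and the idiom assumes nTotal >= 0, nX > 0, and that nTotal + nX - 1 does not overflow an int.

```c
#include <assert.h>

/* Returns the ceiling of nTotal / nX using integer arithmetic only.
   Assumes nTotal >= 0, nX > 0 and no overflow in nTotal + nX - 1. */
int CeilDiv(int nTotal, int nX)
{
    assert(nTotal >= 0 && nX > 0);

    /* Integer division truncates. Adding nX - 1 first pushes any nonzero
       remainder past the next multiple of nX, so the truncated result is
       rounded up. An exact multiple is left unchanged. */
    return (nTotal + nX - 1) / nX;
}
```

Without the comment, the expression (nTotal + nX - 1) / nX gives the reader no hint that rounding up is the intent, which is precisely the maintenance problem being described.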
Properly commenting your code is a good first step. However, keep in the back of your mind that someone else is reading your code and avoid obscure programming techniques unless they are fully commented and documented in the project.
1.9 Chapter Summary
This book was previously published by Pearson Education, Inc.,
formerly known as Prentice Hall. ISBN: 0-13-183898-9