Writing Bug-Free C Code
A Programming Style That Automatically Detects Bugs in C Code
by Jerry Jongerius / January 1995
Chapter 1: Understand Why Bugs Exist
Why do bugs exist and where in the development cycle do they creep in?
Understanding why bugs exist is the first step to writing bug-free
code. The second step is to take action and institute policies that
eliminate the problem or help detect it. Most important, make sure
the entire programming staff knows about and understands the new
policies.
|
The first step in writing bug-free code is to understand why bugs
exist. The second step is to take action.
|
A good friend of mine who works at a different company started to use
run-time parameter validation in the modules that he wrote and the
code that he modified. Run-time parameter validation is a good
idea. However, management and other programmers at the site were
reluctant to make this programming methodology mandatory. Well, one
day my friend was modifying some existing code on the project and
while he was at it, he added parameter validation to the functions
that he had modified. He tested the code and checked it back into
the source-code control system. A few weeks later the code started
to display parameter errors from code that it had called that had
been written years ago. Unbelievably, some programmers wanted the
parameter validation removed. After all, they reasoned, code that
once worked was now producing errors, so it must be the new
parameter validation code, not the old existing code.
This is an extreme example, but it demonstrates that everyone
involved with a project must fully understand new programming
methodologies instituted in the middle of a programming project.
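As a minimal sketch of what run-time parameter validation can look
like, a function checks its arguments before using them and reports
any violation instead of silently misbehaving. The names CopyName,
ReportParamError and ParamErrorCount are hypothetical and are not
taken from the project in the story; a real project would route the
report to its own error facility.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical error hook: counts and prints validation failures.
   The counter is static, so it is private to this source file. */
static int nParamErrors = 0;

static void ReportParamError(const char *pszFunc, const char *pszWhat)
{
    assert(pszFunc != NULL && pszWhat != NULL);
    ++nParamErrors;
    fprintf(stderr, "Parameter error in %s: %s\n", pszFunc, pszWhat);
}

/* Number of validation failures reported so far. */
int ParamErrorCount(void)
{
    return nParamErrors;
}

/* Copies pszSrc into pDst, which holds nDstSize bytes including the
   terminating NUL.  Returns 1 on success, 0 if an argument is invalid. */
int CopyName(char *pDst, size_t nDstSize, const char *pszSrc)
{
    if (pDst == NULL || nDstSize == 0) {
        ReportParamError("CopyName", "no destination buffer");
        return 0;
    }
    if (pszSrc == NULL) {
        ReportParamError("CopyName", "pszSrc is NULL");
        return 0;
    }
    if (strlen(pszSrc) >= nDstSize) {
        ReportParamError("CopyName", "pszSrc does not fit in pDst");
        return 0;
    }
    strcpy(pDst, pszSrc);
    return 1;
}
```

A caller that passes a NULL pointer or an oversized string now gets a
message naming the function that detected it, which is the kind of
report that exposed the old errors in the story above.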
1.1 Small versus Large Projects
What if you needed to write a hex dump utility to be called DUMP?
The program takes as an argument on the command line the name of the
file you wish to display in hex. Would it be written without a
single bug? Yes, probably so, but why? Because the task is small,
well defined and isolated.
So what happens when you are asked to work on project ALPHA, a
contract programming project your company is working on for another
company? The project is several hundred thousand lines long and has
ten programmers already working on the project. The deadlines are
approaching and the company needs your programming talent. Do you
think you could jump right in and write new code without introducing
bugs into the project? I couldn't, at least not without the proper
programming methodologies in place to catch the programming errors
any beginner on the project is bound to make.
Think of the largest project you have worked on. How many include
files were there and what did the include files contain? How many
source files were there and what did they contain? You had no
problem working on the DUMP utility, so what makes the large project
so difficult to work on? Why is the large project not simply like
working on ten or one hundred small projects?
|
Programming methodologies must make it easy for new programmers to
jump into a project without introducing a slew of new bugs.
|
Let's examine your small programming project. It consists of a
couple of source files and a couple of include files. The include
files contain function prototypes, data structure declarations,
#defines, typedefs and whatever else. You must know about all of
it, but because the number of files is relatively small, you
can handle it all. Now, multiply this by ten or one hundred times
for the large project and all of a sudden the project becomes
unmanageable.
You have too much information in the include files to manage and now
the project is getting behind, so you add more people to the project
to get it completed faster. This only compounds the problem,
because now you have even more people adding information to the pool
of information that everyone else needs to learn and know about. It
is a vicious cycle that all too many projects fall into.
1.2 Too Many Data Structures in Include Files
We can all agree that one major problem in large projects is that
there is too much information to become familiar with in a short
period of time. If you could somehow eliminate some of that
information, programmers would come up to speed on the project
more quickly.
The root of the problem is that there is too much information placed
into include files, the biggest contributor being data structure
declarations. When you start a project, you place a couple of
declarations in the include file. As the project continues, you
place more and more declarations in the include file, some of which
refer to or contain previous declarations. Before you know it, you
have a real mess on your hands. The majority of your source files
have knowledge of the data structures and directly reference
elements from the structures.
|
A technique that helps eliminate data structures from include files
needs to be found.
|
Making changes in an environment where many data structures directly
refer to other data structures becomes, at best, a headache.
Consider what happens when you change a data structure. This change
forces you to recompile every source file that directly, or more
importantly indirectly, refers to the changed data structure. This
happens when a source file refers to a data structure that refers to
a data structure that refers to the changed data structure. A
change to the data structure may force you to modify some code,
possibly in multiple source files.
The class methodology (Chapter 4) solves this data structure
problem.
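One way to keep a structure declaration out of an include file, shown
here only as a sketch using standard C incomplete types and not
necessarily the technique Chapter 4 develops, is to publish an opaque
handle and a set of functions. The Counter module and all of its
names are hypothetical. The file boundaries are marked with comments
so the example fits in one listing.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- counter.h: everything other source files get to see ---- */
typedef struct Counter *HCOUNTER;   /* opaque handle, layout hidden */

HCOUNTER CounterCreate(int nStart);
void     CounterIncrement(HCOUNTER hCounter);
int      CounterGetValue(HCOUNTER hCounter);
void     CounterDestroy(HCOUNTER hCounter);

/* ---- counter.c: the only file that knows the layout ---- */
struct Counter {
    int nValue;
};

HCOUNTER CounterCreate(int nStart)
{
    HCOUNTER hCounter = malloc(sizeof *hCounter);
    if (hCounter != NULL) {
        hCounter->nValue = nStart;
    }
    return hCounter;   /* NULL if the allocation failed */
}

void CounterIncrement(HCOUNTER hCounter)
{
    assert(hCounter != NULL);
    ++hCounter->nValue;
}

int CounterGetValue(HCOUNTER hCounter)
{
    assert(hCounter != NULL);
    return hCounter->nValue;
}

void CounterDestroy(HCOUNTER hCounter)
{
    free(hCounter);    /* free(NULL) is a no-op */
}
```

Because only counter.c sees struct Counter, adding or reordering its
members forces a recompile of that one file and nothing else.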
1.3 Using Techniques That Do Not Scale
As you have just seen, large projects have problems all their own.
Because of this, you must be careful in selecting programming
methodologies that work well in small projects as well as in large
programming projects. In other words, a programming methodology
must scale or work with any project size.
|
Programming methodologies must work equally well for both small and
large projects.
|
Let's say that Joe Programmer institutes a policy that all data
structures must be declared in include files so that all source
files have direct access to the data structures. He reasons that by
doing this, he will gain a speed advantage over his competition and
his product will be superior.
This may work for the first release of his product, but what happens
when the size of the project grows to the point that it becomes
unmanageable because there are too many public data declarations?
His job is at stake. The programming methodology Joe chose worked
great for the small project when it started, but it failed miserably
when the project grew. And all successful small projects grow into
large projects.
Make sure the programming methodologies that you develop work equally
well for both small and large projects.
1.4 Too Much Global Information
Global variables (variables known to more than one source file)
should be avoided. Their usage does not scale well to large
applications. Just think what eventually happens in a large
application when the number of global variables gets larger and
larger. Before you know it, there are so many of them that the
variables become unmanageable.
Whenever you are about to use a global variable, ask yourself why you
need direct access to the variable. Would a function call to the
module where the variable is defined work just as well? Most of the
time the answer is yes. If the global variable is modified (read
and written), then you should be asking yourself how the global
variable is being acted upon and whether it would be better to
specify what to do through a function call, leaving the how to the
function.
|
Variables known to more than one source file should be avoided.
|
Some global variables are OK, but my experience has been that only a
handful are ever needed, no matter how large a project gets.
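A sketch of the alternative, assuming a hypothetical error-count
module: the variable is made static so that no other source file can
touch it, and callers go through functions that say what they want
done.

```c
#include <assert.h>
#include <limits.h>

/* errcount.c (hypothetical module): the variable is private to this file. */
static int nErrorCount = 0;

/* Callers state WHAT they want; HOW the count is stored stays here. */
void ErrorCountIncrement(void)
{
    assert(nErrorCount < INT_MAX);   /* guard against overflow */
    ++nErrorCount;
}

int ErrorCountGet(void)
{
    return nErrorCount;
}

void ErrorCountReset(void)
{
    nErrorCount = 0;
}
```

If the count later needs locking, logging or a different type, only
this one file changes.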
1.5 Relying on Debuggers
Debuggers are great tools, but they are no substitute for good
programming methodologies that help eliminate bugs in the first
place.
Debuggers certainly make finding bugs easier, but at what price?
Does knowing that there is a powerful debugger there to help you
when you get in trouble cause you to design and write code faster
and test it sooner than you should, causing a bug that forces
you to use the debugger? That bug would not be there had you
spent more time in the design and coding phases of the project.
I personally feel that programmers whose first course of action
against a bug is to use a debugger begin to rely more and more on a
debugger over time. Programmers who use a debugger only after all
other options have been exercised begin to rely less and less on a
debugger over time.
Instead of a debugger, why not look at the code and do a quick code
review, using available evidence from the program crash or problem?
Often, the evidence plus looking at the code pinpoints the problem
faster than the time it takes to start the program in the debugger.
|
Use a debugger only as a last resort. Having to resort to a debugger
means your programming methodologies have failed.
|
By now you are probably wondering how often I use the debugger in my
programming environment. Because of all the programming
methodologies that I have in my bag of techniques, not very often.
Your goal should be to develop and use programming techniques that
catch your bugs for you. Do not rely on a debugger to catch your
bugs.
1.6 Fixing the Symptom and Not the Problem
Let's say you encounter a bug in your code. How do you go about
fixing the problem? Before you answer, what is the problem? Is the
problem the bug itself, or is it what caused the bug to occur?
All too often, the symptom is fixed and not the actual problem
that caused the bug in the first place. A simple test is
to think how many times you have encountered essentially the same
bug. If never, then great, you are already fixing the problem and
not the symptom. Otherwise, give some careful thought to what you
are fixing. You probably want to come up with a new programming
methodology that helps to fix the problem once and for all.
|
When fixing a bug, fix the underlying problem or cause of the bug and
not just the bug itself.
|
A prime example is storage leaks. A storage leak is defined as a
memory object that is allocated but never freed by your program. In
most cases, the storage leak does not directly cause problems, but
what if it happens to an object that is allocated often and never
freed? You run out of memory at some point.
Running out of memory is what caused you to start looking for the
problem. You eventually find the missing line of code that should
have deallocated the memory object and add the memory deallocation
into the program. Bug fixed, right?
Wrong! The underlying problem of storage leaks going undetected is
still in the program. In fact, had the heap manager told you where
there was a storage leak in your program, you would not have wasted
your time tracking down the storage leak. To fix the problem once
and for all, the heap manager must tell you that there is a storage
leak. This programming methodology is discussed in
Chapter 5.
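To make the idea concrete, here is a minimal sketch of a heap layer
that remembers where each block was allocated and can list the blocks
that were never freed. It is an illustration built on malloc and
free, not the heap manager presented in Chapter 5, and the names
TrackedAlloc, TrackedFree, TrackedReportLeaks and TRACKED_ALLOC are
hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* One node per live allocation, kept in a singly linked list. */
typedef struct TrackNode {
    struct TrackNode *pNext;
    void *pMem;            /* block handed to the caller */
    const char *pszFile;   /* where it was allocated */
    int nLine;
} TrackNode;

static TrackNode *pLiveList = NULL;

void *TrackedAlloc(size_t nSize, const char *pszFile, int nLine)
{
    TrackNode *pNode = malloc(sizeof *pNode);
    if (pNode == NULL) {
        return NULL;
    }
    pNode->pMem = malloc(nSize);
    if (pNode->pMem == NULL) {
        free(pNode);
        return NULL;
    }
    pNode->pszFile = pszFile;
    pNode->nLine = nLine;
    pNode->pNext = pLiveList;
    pLiveList = pNode;
    return pNode->pMem;
}

void TrackedFree(void *pMem)
{
    TrackNode **ppLink;
    if (pMem == NULL) {
        return;
    }
    for (ppLink = &pLiveList; *ppLink != NULL; ppLink = &(*ppLink)->pNext) {
        if ((*ppLink)->pMem == pMem) {
            TrackNode *pNode = *ppLink;
            *ppLink = pNode->pNext;      /* unlink, then release both */
            free(pNode->pMem);
            free(pNode);
            return;
        }
    }
    assert(!"TrackedFree: block was not allocated by TrackedAlloc");
}

/* Lists every block still allocated; returns the number of leaks. */
int TrackedReportLeaks(FILE *pOut)
{
    int nLeaks = 0;
    const TrackNode *pNode;
    for (pNode = pLiveList; pNode != NULL; pNode = pNode->pNext) {
        fprintf(pOut, "Leak: block allocated at %s(%d)\n",
                pNode->pszFile, pNode->nLine);
        ++nLeaks;
    }
    return nLeaks;
}

/* Records the caller's file and line automatically. */
#define TRACKED_ALLOC(nSize) TrackedAlloc((nSize), __FILE__, __LINE__)
```

Calling TrackedReportLeaks(stderr) at program exit names the file and
line of every allocation that was never freed, so the leak is reported
long before the program runs out of memory.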
1.7 Unmaintainable Code
We all have had the experience of modifying code that someone else
has written, either to add a new feature or to fix a problem in a
module. I don't know about you, but for me it is typically not an
enjoyable experience because the code is often so hard to understand
that I end up spending the majority of my time just figuring out
what is going on.
|
Strive to write code that is understandable by other programmers.
|
Good code runs. Great code runs and is also easily maintainable. In
your code reviews, you should look for code that not only works, but
also for code that is straightforward in how it works. What good is
a coding technique if it cannot be understood? Consider the
following.
Hard to understand code?
nY = (nTotal-1)/nX + 1;
|
Is this code fragment hard for you to understand? If not, then you
know the technique being used. If you do not know the technique,
then it is frustrating to figure out. This code is one way to
compute the ceiling of nTotal/nX using only integer arithmetic,
assuming nTotal and nX are both positive integers.
Properly commenting your code is a good first step. However, keep in
the back of your mind that someone else is reading your code and
avoid obscure programming techniques unless they are fully commented
and documented in the project.
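One possible way to document the fragment above, sketched with a
hypothetical helper named CeilDiv, is to give the technique a name,
state its assumptions in a comment, and check them at run time:

```c
#include <assert.h>

/* Returns the ceiling of nTotal/nX using integer arithmetic only.
   Subtracting 1 before the truncating divide and adding 1 afterwards
   rounds any non-zero remainder up to the next whole number.
   Valid only for nTotal >= 1 and nX >= 1. */
int CeilDiv(int nTotal, int nX)
{
    assert(nTotal >= 1 && nX >= 1);
    return (nTotal - 1) / nX + 1;
}
```

For example, CeilDiv(10, 3) is 4 and CeilDiv(9, 3) is 3. The original
line then becomes nY = CeilDiv(nTotal, nX); and no longer needs to be
decoded by the reader.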
1.8 Not Using the Windows Debug Kernel
I am primarily a Microsoft Windows developer and it astounds me the
number of times I have run commercial Windows applications on my
machine only to have them bomb because the Windows debug kernel is
reporting a programming error such as an invalid handle. The error
is in such an obvious place that you know the developers did not use
the debugging kernel of Windows during their development. Why would
anyone develop an application and not use the debugging kernel of
the underlying environment? It catches errors in your code
automatically.
The Windows development environment allows for running either the
retail kernel or the debugging kernel. The retail kernel is the
kernel as it is shipped to the customer. The debugging kernel has
extra error checking not present in the retail kernel. Error
messages from the kernel may then be redirected to a secondary
monochrome screen attached to the system. This redirection is an
option in the DBWIN.EXE program provided with Microsoft C8. You
should run with the debugging kernel all the time.
|
Use the debugging kernel of your development environment.
|
The Windows SDK provides D2N.BAT (debug to non-debug) and N2D.BAT
(non-debug to debug) batch files in the BIN directory to switch
between debug and non-debug Windows. It is easy to accidentally
leave your version of Windows in the non-debug mode. It has
happened to me a couple of times. To detect this situation, I
finally printed a special symbol in my application's status line to
signify that it is being run under debug Windows. I suggest that
you do something similar in a place that is easy to spot in your
Windows application. To detect if you are running under debug
Windows, use the GetSystemMetrics(SM_DEBUG) call. It returns zero
under retail Windows and a non-zero value under debug Windows.
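A minimal sketch of such a status-line marker follows. Only
GetSystemMetrics(SM_DEBUG) comes from the text above; the function
names IsDebugWindows and FormatStatusLine and the "[D] " marker are
assumptions made for illustration. The Windows call is guarded so the
listing still compiles on other systems, where it simply reports
non-debug.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Returns 1 when running under debug Windows, 0 otherwise.  On
   non-Windows builds this sketch always returns 0. */
int IsDebugWindows(void)
{
#ifdef _WIN32
    return GetSystemMetrics(SM_DEBUG) != 0;
#else
    return 0;
#endif
}

/* Builds the status-line text, prefixing a marker when bDebug is
   non-zero.  The "[D] " marker is an arbitrary choice. */
void FormatStatusLine(char *pszOut, size_t nOutSize,
                      const char *pszText, int bDebug)
{
    assert(pszOut != NULL && nOutSize > 0 && pszText != NULL);
    snprintf(pszOut, nOutSize, "%s%s", bDebug ? "[D] " : "", pszText);
}
```

An application would call FormatStatusLine(szLine, sizeof szLine,
"Ready", IsDebugWindows()) whenever it redraws its status line.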
|
1.9 Chapter Summary
- The first step in writing bug-free code is to understand why bugs
exist. The second step is to take action. That is what this book
is all about.
- Programming methodologies that are developed to prevent and detect
bugs must work equally well for both small and large programming
projects.
- A technique that helps eliminate data structure declarations from
include files needs to be found. Doing so will allow programmers to
come up to speed on an existing project much more quickly.
- Global variables that are known to more than one source file should
be avoided. Global variables make it hard to maintain a project.
- Debuggers should be used only as a last resort. Having to resort
to a debugger means that your programming methodologies used to
detect bugs have failed.
- When you fix a bug, make sure you are really fixing the underlying
cause of the bug and not just the symptom of the bug. Ask yourself
how many times you have fixed the same type of bug.
- Strive to write code that is straightforward and easily
understandable by others. Avoid writing code that pulls a lot of
tricks.
- Finally, make sure that you use the Windows debug kernel all the
time. It contains extra error checking that can automatically
detect certain types of bugs that go undetected in the retail
release of Windows.
Copyright © 1993-1995, 2002-2008 Jerry Jongerius
This book was previously published by Pearson Education, Inc.,
formerly known as Prentice Hall. ISBN: 0-13-183898-9