This chapter introduces a new heap manager interface that checks for common mistakes on the part of the programmer allocating and freeing memory and also fulfills the requirements of the class methodology introduced in Chapter 4. All of the code introduced in this chapter can also be found in one convenient location in the Code Listings Appendix.
5.1 The Traditional Heap Manager
A good friend of mine was visiting one day when we happened to start talking about programming techniques. He had landed a job with a well-known, large computer company. I explained to him the replacement heap manager that I had come up with and the reasons why I felt that it was necessary. I then questioned my friend on the programming guidelines concerning memory deallocations that were in place at his company.
As it turns out, there were none. The standard C library routines were it. Throughout the code there were direct calls to the memory manager to allocate and free memory. I then asked what was done when an invalid memory pointer was inadvertently passed to free(). Nothing was done. It was assumed that memory pointers passed to free() were always valid. Whenever a memory management bug occurred, it would be tracked down and fixed.
Tracking down memory management bugs as they occur violates the principle of solving the problem that caused the bug and not just the manifestation of the bug.
5.1.1 The Interface
The standard C library routines for allocating and freeing memory are malloc() and free().
The malloc() function takes one argument, the size of the memory object to allocate, in bytes. It returns a void pointer to the allocated block of memory or NULL if there is insufficient memory or if some error exists. The memory is guaranteed to be properly aligned for any type of object.
The free() function takes one argument, a void memory pointer that was previously allocated through a malloc() call. The free() function has no return value. If a NULL pointer is passed to free(), the call is ignored. If an invalid memory pointer is passed to free(), the behavior of free() is undefined.
While there are several more standard functions for manipulating memory, for the sake of discussion, we are interested only in malloc() and free().
5.1.2 The Problem
The single biggest problem with the standard C library interface to memory is that it assumes the programmer never makes a mistake in calling the memory management functions. Consider the following.
Using free() on a memory pointer that has already been passed to free() is a common error in many large C programs. How free() behaves in such a case is undefined. On some systems, the run-time library may tell you that you have made a mistake and terminate your program. Other systems will continue, even though the validity of the application's heap is in question.
C was designed to be portable and produce small, efficient code. To accomplish this feat so successfully, a lot of small but important decisions have been left to be decided by the particular implementor. The designers of C wanted to give as much leeway as possible to each particular implementation.
It is up to the programmer to build upon this framework.
5.2 A New Heap Manager Specification
A clear set of goals is needed before a new heap manager specification can be designed. The job is finished when the goals have been accomplished.
5.2.1 Flat and Segmented Architectures
Flat memory model. For programmers using machine architectures that are based upon a flat (non-segmented) memory model, the choice for a heap manager interface is straightforward. There are no choices about which pointer size to use. Only one address size exists and it is usually 32 bits.
Segmented architectures. For programmers using machine architectures that are segmented, the choice for a heap manager interface is not at all straightforward. In fact, it is filled with a lot of choices. The primary reason for this is usually a concern for speed.
The following discussion centers around the segmented architecture used by Intel.
An address in a segmented architecture is composed of two parts. Part of the address is used to determine which segment is being used. The other part of the address is used to determine the byte within the segment. (See Figure 5-1).
There are two primary ways an address can be specified. The first is by specifying the full segment and offset (far address). The second is by specifying only the offset (near address), where the segment is implied to be one of the segment registers contained in the CPU. In terms of execution speed, specifying only the offset part of an address is fastest.
You may be thinking that the choice of which addressing method to use is easy. Pick near addresses for speed. The problem here is that a segment can address only 64 kilobytes of memory (i.e., 65,536 bytes). If the total amount of memory that you need to allocate is less than 64K bytes, then great; use near addresses for your heap pointers. However, if the total amount of memory that you need to allocate is more than 64K bytes, then you are forced to use multiple segments and hence far addresses.
All compilers allow different memory model programs to be built because of the different addressing possibilities: small, for when code and data together are less than 64K; compact, for code less than 64K and data greater than 64K; medium, for code greater than 64K and data less than 64K; large, for code and data greater than 64K.
This situation is complicated by the fact that all C compilers allow pointers to be tagged on a case-by-case basis as either near or far. This is called mixed-model programming.
All segmented and non-segmented issues can be isolated through the use of a set a macros that act as an interface to memory. A heap manager interface can be tailored to the specific needs of your environment and program. There is no need to come up with a super memory management routine that allows megabytes of memory to be allocated if you are writing small utility programs requiring only a few kilobytes of memory.
I feel that segmented architectures have gotten a bum rap. It is true that a 64K segment is a bit limiting, but this is only one implementation. It can be implemented in a better way. With 64-bit architectures coming down the road, there are a lot of possibilities! The biggest advantage of a segmented architecture is that every segment is protected from every other because all memory references are checked for validity in two very special ways. First, the segment value must be a valid segment. Second, the offset must be within the limits of the segment. If either check fails, the memory access generates a protection violation. Memory overwrites on a heap object are detected by the hardware, even an overwrite of 1 byte. This feature is great for debugging.
The requirements of the new heap manager interface are as follows.
Invalid memory pointers are detected and reported. First and foremost on the list is to plug the gap left by the standard C malloc() and free() functions. Invalid memory pointers passed to the memory deallocation routine are automatically detected and reported.
Memory allocation/deallocation performance must not be adversely affected. The last thing we want to do is come up with a set of requirements that are expensive to implement in terms of execution time. All objects in our class methodology are dynamically allocated and freed and we want to be careful not to adversely impact the performance of the object system.
Support for run-time type checking. All objects in the heap must have a header containing type information preceding each object. The object system (Chapter 4) requires a specific format for the two data items immediately preceding the object.
Memory is zero initialized. Memory that is allocated and freed is zero initialized. Memory is zero initialized for convenience. Knowing that all items of an object, after a NEWOBJ(), are bit initialized to zero is a nice feature to have. Memory is zero initialized when being deallocated for any incorrect references to the object after the object is deallocated. This is highly unlikely, however, because the FREE() macro sets the object pointer to NULL.
Memory allocations do not fail. This requirement makes sense only in a virtual memory environment. If a call is made to the heap management memory allocator, the call returns successfully. In the case of an error, it does not return at all and reports the error. This prevents having to place failure checks everywhere in the code. In the majority of applications, the maximum amount of memory that can be allocated by the application is well within the virtual memory limits for an individual program. In these cases, this requirement makes a lot of sense.
For the minority of applications whose memory allocations may not fit into the virtual address space limits, the memory allocator should return a failure status (NULL). Error checks must then be made throughout the code.
In the spirit of the design of C, it is up to you to decide how to implement this requirement.
Memory overwrites past the end of the object are detected and reported. The heap manager provides memory space for any type of object. This includes true objects in the object system as well as strings and whatever else is needed. Writing past the end of a dynamically allocated string should be detected and reported. Accidentally writing past the end of a class object is highly unlikely.
Storage leak detection. As a program executes, it allocates and frees memory. At program termination, if there is any allocated memory remaining in the heap, that memory is considered to be a storage leak. Any storage leaks should be reported.
Filename and line number tags. All heap objects should be tagged with the filename and line number at which the object was created. This information is useful for producing heap dumps and is required for producing useful information on storage leaks.
Symbolic heap dumps. The heap manager must know how many objects there are in the heap and must be able to walk the heap, producing useful information about all objects in the heap. The heap dumps are primarily for tracking down the cause of storage leaks.
Alignment preservation. Any special data alignment requirements of the CPU must be meet. Most RISC architectures require that data items be aligned on 2, 4, 8 or even 16-byte boundaries. Even on architectures with no absolute alignment requirements, it is usually more efficient to have aligned data items because the hardware expends extra CPU cycles to align unaligned data.
Be as portable as possible. The ideal interface would be easy to port to any system. While this is possible, the execution time on the varied platforms would probably be considerable. For this reason, two layers should be used. The first is a set of macros that present to the programmer a logical, high-level view of memory. An example of a macro like this is the NEWOBJ() macro. The second layer is the heap manager function call specification, which is used by the macros.
The only way to allocate and deallocate memory is through this set of macros. The macros, in turn, call the actual heap manager functions. If you port to another platform, simply tailor the heap manager to the particular environment, change the macros that call the new heap manager and recompile.
In practice, I have found that this technique works well.
Freeing the NULL pointer is OK. Calling the memory deallocator with the NULL pointer is allowed and the call is simply ignored. In practice, I have found this feature to be useful because it prevents having to bracket every memory deallocation in an if statement, checking to see if the pointer is non-NULL.
5.2.3 The Interface
The heap manager interface that I recommend is based upon the 32-bit model. Whether the architecture of the machine is segmented or not is not an issue (except for its performance impact). Users of the heap interface do not know and do not need to know the true nature of the 32-bit pointers being returned by the heap interface. The returned address could be composed of both segment and offset or it could be a flat memory pointer. The calling code works the same in either case.
To support run-time type checking §4.4, the memory allocator must be passed the address of a class descriptor. If the object being allocated is not a class object, NULL should be passed as the class descriptor address.
To support storage leak detection and symbolic heap dumps, a source filename and line number must be passed to the memory allocator. It is assumed that the source filename address is always accessible.
All the other heap manager requirements can be met within the heap manager. With this information, the prototypes to the heap manager interface can be written as follows.
FmNew() (far memory new) takes a memory size, a class descriptor pointer, a source filename and line number as arguments. It returns a pointer to the allocated memory.
FmFree() (far memory free) takes a pointer to a previously allocated block of memory as an argument and returns the NULL pointer. Passing NULL to FmFree() is allowed and no action is taken in this case.
FmRealloc() (far memory realloc) is used to reallocate the size of a block of memory.
FmStrDup() (far memory string dup) is used to duplicate a string. It is intended to replace strdup().
FmWalkHeap() (far memory heap walk) walks the heap displaying every item in the heap.
FmIsPtrOk() (far memory address validation) is used to validate that the given pointer is a valid address.
5.2.4 The Design
The new heap manager takes the classic approach of providing a wrapper around a system's current heap manager. The wrapper includes both code and data. (See Figure 5-2).
The code wrapper is in the form of the new FmNew() and FmFree() function calls that sit upon a system's existing malloc() and free() calls.
The data wrapper is in the form of a prefix structure that sits in front of a block of allocated memory and a postfix structure that sits after the block of memory.
The new heap manager is the only code that interfaces with the old heap manager. This is important since in order to run-time type check objects in the heap, the run-time type checking code assumes the new heap manager view, one in which heap objects are prefixed with type checking information.
The prefix structure is a container for holding information that needs to be associated with the object that is allocated. It contains type information, source filename and line number information and whatever else is needed to meet the heap manager's requirements.
The postfix structure is for detecting memory overwrites past the end of an allocated object. While it is no substitute for hardware overwrite protection and does not catch all memory overwrites, it catches the majority of memory overwrites. Most memory overwrites are caused by overwriting the end of a string buffer. This type of overwrite is caught by the usage of a postfix structure.
A way to walk the heap is still needed in order to provide symbolic heap dumps and to detect storage leaks. Some environments provide a mechanism for walking the heap, while others provide no such mechanism. Rather than making any assumptions about the environment that you are running under, heap walking support is provided by the new heap manager. The simplest way to implement this is through a doubly linked list. This requires a next and previous pointer stored in the prefix structure.
All other requirements previously specified are implementation details. The layout of the prefix and postfix structures can now be designed.
The prefix structure contains lpPrev and lpNext for maintaining a linked list of objects in the heap; lpPostfix, for pointing to the postfix structure after the heap object; lpFilename and lLineNumber, for source filename and line number support; lpMem and lpClassDesc, for run-time type checking support.
The postfix structure could contain anything since it is being used for memory overwrite detection. The contents of the structure are initialized when the heap object is created and verified when the heap object is destroyed. However, instead of filling in the structure with a constant value when the heap object is created and checking that constant when the heap object is destroyed, it is better to use a value that varies from heap object to heap object. A mechanism for achieving this result is to use a pointer back to the prefix object.
The only real outstanding issue is alignment preservation. Because malloc() returns a pointer that is correctly aligned, the prefix structure is properly aligned. The pointer that is returned from FmNew() immediately follows the prefix structure. You have two choices to make sure this pointer is aligned: Either align it in code, or make sure that the size of the prefix structure is a multiple of the alignment. The simplest technique is to make sure that the prefix structure is a multiple of the alignment. If it is not, pad the structure with dummy data items. In doing so, however, make sure that you pad at the beginning of the structure and not at the end. This is because the lpMem and lpClassDesc data items must be next to the allocated object. The prefix structure is currently correct for an alignment of sizeof(int). You can change the ALIGNMENT macro if sizeof(int) is incorrect for your environment. The CompilerAssert() macro should be used to guarantee the correct alignment.
The ALIGNMENT define and the ISPOWER2 CompilerAssert are placed near the top of the source file. The PREFIX CompilerAssert is placed in the source file after the prefix structure has been declared.
The pointer returned from FmNew() is now properly aligned. To properly align the postfix structure, use the same trick as before, but this time in code. Simply take whatever size is passed into FmNew() and increase its size, if needed, to the nearest alignment boundary.
The DOALIGN() macro aligns a number up to the nearest ALIGNMENT boundary. It assumes that there is no overflow and that ALIGNMENT is a power of two (the reason for the ISPOWER2 CompilerAssert() above).
5.2.5 FmNew() Implementation
The implementation of FmNew() is straightforward. The first thing to do is align wSize up to the nearest alignment boundary. A block of memory is then allocated, the size of which is big enough to hold the prefix structure, the user block of memory and the postfix structure.
The call is to malloc(), a 32-bit memory interface. The 32-bit memory allocator may be called by other names in other environments. Simply replace malloc() with a call that is appropriate to your environment.
If the memory allocation fails, the FmNew() code essentially exits with NULL. You may want to implement a policy where memory allocations must succeed and if they do not, the FmNew() function does not return to the calling code.
If the memory allocation succeeds, the first thing to do is add the memory block to the doubly linked list of memory blocks, by calling AddToLinkedList().
Once added to the linked list, lpPostfix is filled in to point to the postfix structure. The lpPrefix data item within the postfix structure is then filled in. It simply points back to the prefix structure. Next, the lpFilename and lLineNumber data items are initialized from the arguments passed to FmNew(). Finally, the lpMem and lpClassDesc data items are initialized for purposes of run-time type checking.
Finally, the block of user memory is zero initialized by calling memset().
5.2.6 FmFree() Implementation
The FmFree() implementation is straightforward, provided that an effective VerifyHeapPointer() can be written. This function has all the implementation problems that run-time type checking has. VerifyHeapPointer(), in order to be completely robust, must be tailored to a specific machine architecture.
Provided that the memory pointer passed to FmFree() is indeed a valid heap pointer, VerifyHeapPointer() allows the body of FmFree() to execute. The first thing to do is calculate a pointer to the prefix structure. The size of the object is calculated next and assigned to wSize. The heap object pointed to is removed by calling RemoveFromLinkedList().
Once the memory object has been removed from the doubly linked list, the memory block is initialized to zero.
Finally, free() is called to deallocate the memory block. The 32-bit memory deallocator may be called by other names in other environments. Simply replace free() with a call that is appropriate to your environment.
NULL is always returned from the FmFree() function. This is done to help implement the policy that a pointer is either valid or it is NULL (see §7.12). A pointer variable should never point to an invalid memory object. The memory macros use the NULL return value to store in the memory pointer passed to FmFree().
5.2.7 VerifyHeapPointer() Implementation
VerifyHeapPointer() code first checks to determine whether the pointer passed to it is NULL or not. Next, the lpMem pointer is validated to make sure that it is a valid pointer into the heap. This is done by calling FmIsPtrOk(). Then, a pointer to the prefix structure is obtained and the memory pointer in the structure is checked against the memory pointer being validated. Finally, the prefix pointer in the postfix structure is validated. Most types of memory overwrites past the end of an object trash the prefix pointer in the postfix structure. This is how memory overwrites are detected.
5.2.8 FmIsPtrOk() Implementation
The VerifyHeapPointer() must be called only by the heap management functions and is a local function to the heap management module. This function is complete except for the FmIsPtrOk() function. FmIsPtrOk() validates that a pointer passed to it is a realistic heap pointer.
For a non-protected-mode application, the FmIsPtrOk() function is as follows.
In a non-protected-mode environment, FmIsPtrOk() can assume only that the pointer is OK if the pointer is non-NULL and properly aligned.
5.2.9 FmWalkHeap() Implementation
An important part of any heap manager is the ability to display a symbolic heap dump.
The implementation of FmWalkHeap() is pretty straightforward. It walks the heap getting descriptions of objects and prints them out. The implementation of getting a description of an object is left to RenderDesc().
This is one possible implementation of RenderDesc(). It displays the address of an object, its filename and line number, if any, and its class descriptor name, if any.
5.2.10 FmRealloc() Implementation
Another memory management function that needs a new interface is realloc(). While not presented here, the source for FmRealloc() can be found in the Code Listings Appendix.
5.3 An Interface for Strings
How should strings be dynamically allocated from the heap? Who is responsible for calling the heap interface? The best way to approach this problem is through the utilization of an abstraction layer. It is best to implement the string interface through a set of macros.
5.3.1 NEWSTRING() Macro
The first macro is NEWSTRING(). It is used to allocate storage for a string from the heap.
NEWSTRING() can almost be considered a replacement for malloc(). The macro takes two arguments. The first argument is the name of a variable that is to contain the pointer of the allocated block of memory. The second argument is the size in bytes of the block of memory to allocate. This macro was designed this way so that the _LPV() macro could be used to avoid type casting of the pointer returned from FmNew().
To free the memory allocated through NEWSTRING(), simply call FREE().
5.3.2 MYLSTRDUP() Macro
The second macro is MYLSTRDUP(). It is a replacement for strdup() that uses the new heap manager interface.
MYLSTRDUP() is a macro interface around FmStrDup(). It is a new heap manager function that does the dirty work of duplicating the string.
If a NULL pointer is passed into FmStrDup(), a NULL pointer is returned. FmStrDup() works by first calculating the string length and then adding one for the null character. This number of bytes is then allocated by calling FmNew() and finally the string is copied into the new memory buffer.
5.4 An Interface for Arrays
Just as with strings, using macros as a level ob abstraction greatly simplifies using arrays.
5.4.1 NEWARRAY() Macro
The NEWARRAY() macro is used to create a new array with a specific number of array elements.
NEWARRAY() takes as its first argument the name of the variable that is to contain the pointer of the allocated array. The second argument is the number of array elements (not bytes) to allocate.
To free the memory allocated through NEWARRAY(), simply call FREE().
5.4.2 SIZEARRAY() Macro
The SIZEARRAY() macro is used to resize an array.
SIZEARRAY() takes as its first argument the name of the variable that points to an allocated array. If it is NULL, SIZEARRAY() allocates a new array. The second argument is the new size of the array in array elements (not bytes).
5.5 Detecting Storage Leaks
If a program allocates an object from the heap, it should also free the object before the program exits. If the object is not freed, it is an error and is called a storage leak.
Do you have any storage leaks in the programs that you write? How do you detect storage leaks? Unless you have a formal methodology in place, how do you really know?
I once wrote a large program that did not have any formal storage leak detection. The program had no problems and I would have claimed that there were no storage leaks present. However, adding storage leak detection actually turned up a few obscure leaks. They were so obscure that no matter how long the program was run, the storage leaks would never have caused a problem and this is why they were never found.
No matter how small or large a program you write, the program should always have some form of storage leak detection in place.
The simplest way to detect storage leaks is to call FmWalkHeap() the moment your program is about to exit. You will then see exactly what objects are left in the heap, if any. Because the filename and line number where the object are allocated is actually displayed, tracking down the storage leak is a snap.
5.7 Chapter Summary
This book was previously published by Pearson Education, Inc.,
formerly known as Prentice Hall. ISBN: 0-13-183898-9