What is a module? Is it simply a source file that contains functions? In a high-level sense, yes, but it is also much more. Anyone can write some functions, place them in a file and call them a module. For me, a module is the result of applying a well-founded set of techniques to solving a problem.
6.1 Small versus Large Projects
How are small projects usually designed? It has been my experience that small projects are usually written by one person over the course of several hours to several days. In this environment, there is no real concern given to splitting the project into several well-defined source files. More than likely, the small project ends up simply being one source file.
For small projects, this one source file design works quite well, but what about large projects with hundreds of thousands of lines of code and hundreds of source files?
The challenge in working on any project is applying a well-founded set of techniques from day one, because every project, no matter how big it is today, started out as a small project.
6.2 The Key
The key to successfully coding a hierarchy of modules is that you must always code what to do, not how to do it.
In other words, implement the solution to a problem once, not multiple times. A simple but effective example is copying a string from one location to another. Do you use while (*pDst++=*pSrc++); or do you use strcpy(pDst, pSrc);? This may seem like an absurd example, but study it carefully. The while loop specifies how to copy a string from one location to another, while strcpy() specifies what to do, leaving the how to strcpy().
This point is so key that I will repeat it. The while loop specifies how to copy a string from one location to another, while strcpy() specifies what to do, leaving the how to strcpy().
I chose strcpy() on purpose because I do not know anyone who would argue for using the while loop instead of strcpy(). Why is the choice of which one to use so obvious?
More than likely, some of the code you have written contains code fragments that specify how to do something instead of specifying what to do.
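To make the contrast concrete, here is a minimal sketch. CopyHow() and CopyWhat() are hypothetical names used only for illustration. Both produce the same result, but only the second leaves the implementation to the library.

```c
#include <assert.h>
#include <string.h>

/* "How": the caller spells out the copy loop itself. */
void CopyHow(char *pDst, const char *pSrc)
{
    assert(pDst != NULL && pSrc != NULL);
    while ((*pDst++ = *pSrc++) != '\0')
        ;
} /* CopyHow */

/* "What": the caller states the intent and leaves the how to strcpy(). */
void CopyWhat(char *pDst, const char *pSrc)
{
    assert(pDst != NULL && pSrc != NULL);
    strcpy(pDst, pSrc);
} /* CopyWhat */
```

If a faster copy technique becomes available, code written in the second style picks it up when strcpy() changes, while every hand-written loop has to be found and edited.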
6.2.1 An Example
Let me give you a prime example that comes straight from programming in a GUI environment. In a dialog box, there exists an edit control that contains text that the user can edit, but suppose you want to limit the number of characters that the user can type. How do you do it? In Microsoft Windows, you send a message to the edit control, informing it of the text limit. This is done with the SendDlgItemMessage() function and the EM_LIMITTEXT message.
So, whenever you want to limit the size of an edit control, you call SendDlgItemMessage(), right? I do not think so!
The problem is that by calling SendDlgItemMessage() directly, you are specifying how to limit the text size.
The solution is to code a new function, let's call it EmLimitText(), that calls SendDlgItemMessage(). Whenever you want to limit the text size of an edit control, you call EmLimitText() instead of SendDlgItemMessage().
By doing this, you now have the basis for an entire module. For all messages that can be sent to edit controls, provide code wrappers that specify what to do and let the functions call SendDlgItemMessage(), which specifies how to do it.
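A sketch of such a wrapper follows. So that it compiles outside Windows, the Windows types and SendDlgItemMessage() itself are replaced by simplified stand-ins that only record the last call; these stand-ins and the g_ variables are assumptions of this example. In a real Windows build you would include <windows.h> and delete them. The argument list chosen for EmLimitText() is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the Windows declarations (simplified, illustration only). */
typedef void *HWND;
typedef unsigned int UINT;
typedef size_t WPARAM;
typedef long LPARAM;
typedef long LRESULT;
#define EM_LIMITTEXT 0x00C5

/* Stub that only records the last message it was asked to send. */
static UINT   g_uLastMsg;
static int    g_nLastId;
static WPARAM g_wLastParam;

LRESULT SendDlgItemMessage(HWND hDlg, int nId, UINT uMsg,
                           WPARAM wParam, LPARAM lParam)
{
    (void)hDlg;
    (void)lParam;
    g_uLastMsg = uMsg;
    g_nLastId = nId;
    g_wLastParam = wParam;
    return 0;
} /* SendDlgItemMessage */

/* What to do: limit the text of an edit control. The how stays in here. */
void EmLimitText(HWND hDlg, int nId, unsigned nMaxChars)
{
    assert(hDlg != NULL);
    (void)SendDlgItemMessage(hDlg, nId, EM_LIMITTEXT, (WPARAM)nMaxChars, 0L);
} /* EmLimitText */
```

Callers now write EmLimitText(hDlg, nId, 32) and never mention the message name or its parameter packing.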
6.2.2 Another Example
The class methodology is a prime example of specifying what to do and not how to do it. Consider what coding is like without the class methodology. It is up to each and every module to directly access a data object to perform whatever action is necessary. By directly accessing the object, the code is specifying how to perform the action.
The class methodology forces modules to use method functions to perform an action upon the object. The code is specifying what to do, leaving the how up to the method function.
6.2.3 The Advantages
By far the single biggest advantage of using the technique of specifying what to do instead of how to do it is that it allows you to radically change the how (the implementation), without having to change the what (the function calls).
Another advantage of this technique is that it allows you to make changes to the implementation and recompile only one source file. Since the function interface remains the same, no other source files have to be recompiled.
The turnaround time in making a change and testing it is also dramatically reduced because only one source file needs to be recompiled instead of tens or hundreds.
Another benefit of this technique is that code size is reduced. Instead of repeatedly spelling out in code how to do something, you are now making a function call, which in most cases reduces the code size.
6.2.4 The Goal
The goal in using this technique in your programs is to reach the point at which implementing new functionality is simply a matter of making a few function calls. This is obviously ideal and does not happen all the time, but the more this technique is used, the more frequently it starts to happen.
The key to this technique is to always code what to do, not how to do it. In other words, you never want to reinvent the wheel. Instead, implement it once and be done with it! Be careful, however, that you design a solid interface that stands up to the test of time. You do not want to change the interface later, since this requires all modules that use the interface to change as well.
This technique is easy to use when the implementation of something is hundreds of lines long and the obvious choice is to make a function call. However, the real power of this technique comes when it is used extensively in a project, even for implementations that are several lines to one line long. It takes a lot of discipline to use the technique in these cases, but the payoff comes the next time when all you need is a function call.
6.3 What Is a Module?
A module is a source file with well-defined entry points, or method functions (methods), that can be called by other modules. It is the result of applying a well-founded set of techniques to solving a problem. There are primarily two types of modules: code wrapper modules and class implementation modules. Some code wrapper modules also implement a class and hence are considered class implementation modules as well.
Code wrapper modules provide functions that are primarily code wrappers. Most code wrapper functions are independent and stand on their own. Because of this, they can be moved to another module without any compilation problems. Functions are grouped in a module because they provide similar functionality.
An example is the heap manager module. It sits on top of C's memory allocation routines to provide a cleaner, more robust interface into the system.
Another example is a module that provides code wrappers around all messages that can be sent to edit controls in a GUI environment. An example is using EmLimitText() instead of calling SendDlgItemMessage() directly.
6.4 Designing a Class Implementation Module
Class implementation modules provide functions that implement a class. The functions must be present in the module for a successful compilation. Functions are grouped in a module because the functions, as a whole, implement the class.
An example of a class implementation module is the random number generator discussed in §4.7.
6.4.1 The Interface or API
By far the toughest and most important part of designing a module is designing the API (Application Programmer's Interface) to the new module.
Once an API is chosen and implemented, it is not likely to change much over time. This is because as a module gets used more and more by other modules, a change in the API is expensive in terms of the time and effort required to implement and debug the change.
The first step in creating a new class implementation module API is deciding upon a handle name that is used to refer to objects created by the API. The handle name must begin with an H and be in uppercase. Choose a name that is obvious, unique and relatively short, and that yields a good mixed-case variable name, usually spelled the same as the handle. You also need a module name so that functions of the module follow the module/verb/noun naming convention.
In the random number generator, HRAND was chosen as the handle name. It is pretty obvious, unique from all other handle names and relatively short. The variable name used to refer to objects of the class is hRand. Rand is the module name used to prefix method functions, as in RandCreate() and RandDestroy().
Suppose we are creating a class module that provides a code wrapper around opening, reading, writing and closing operating system files. In this case, the code wrapper module is a class implementation module. I would choose a handle name of HDOSFH (Handle to the DOS File Handle), a variable name of hDosFh, a module name of Dos and method function names of DosOpenFile(), DosRead(), DosWrite() and DosCloseFile().
All modules that implement a class have at least two method functions: one method function that creates the object (for example, RandCreate() and DosOpenFile()) and another method function that destroys the created object (for example, RandDestroy() and DosCloseFile()).
An important concept of class modules is that the method functions act upon dynamically allocated objects, objects that are created and destroyed using method functions. The method functions do not act upon static objects.
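The file example above can be sketched as follows. The handle and function names come from the text; the argument lists, the use of C's stdio as the underlying file layer, and having DosCloseFile() return NULL are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Opaque handle: other modules see only the pointer type. */
typedef struct tagHDOSFH *HDOSFH;

/* Private to this module; stdio is an assumed backing layer. */
struct tagHDOSFH {
    FILE *pFile;
};

/* Create method: returns a handle to a dynamically allocated object. */
HDOSFH DosOpenFile(const char *pszName, const char *pszMode)
{
    HDOSFH hDosFh;
    assert(pszName != NULL && pszMode != NULL);
    hDosFh = malloc(sizeof(*hDosFh));
    if (hDosFh != NULL) {
        hDosFh->pFile = fopen(pszName, pszMode);
        if (hDosFh->pFile == NULL) {
            free(hDosFh);
            hDosFh = NULL;
        }
    }
    return hDosFh;
} /* DosOpenFile */

size_t DosRead(HDOSFH hDosFh, void *pBuffer, size_t nBytes)
{
    assert(hDosFh != NULL && pBuffer != NULL);
    return fread(pBuffer, 1, nBytes, hDosFh->pFile);
} /* DosRead */

size_t DosWrite(HDOSFH hDosFh, const void *pBuffer, size_t nBytes)
{
    assert(hDosFh != NULL && pBuffer != NULL);
    return fwrite(pBuffer, 1, nBytes, hDosFh->pFile);
} /* DosWrite */

/* Destroy method: releases the object and returns NULL. */
HDOSFH DosCloseFile(HDOSFH hDosFh)
{
    if (hDosFh != NULL) {
        fclose(hDosFh->pFile);
        free(hDosFh);
    }
    return NULL;
} /* DosCloseFile */
```

Writing hDosFh = DosCloseFile(hDosFh); at the call site leaves no stale handle behind.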
6.4.2 An Interface Specification Template
The interface specification for new class modules follows a general template. First, there is a NEWHANDLE(HOBJ) declaration at the top of a global include file. This adds a new type-checkable data type, named HOBJ, to the system. It is declared near the top of the include file, next to all other NEWHANDLE() declarations, so that, if needed, HOBJ may be used in the prototypes of other USE_ sections.
The NEWHANDLE(HOBJ) declaration is kept outside the USE_HOBJ section on purpose. If it were inside, any other USE_ section whose prototypes mention the HOBJ data type would require the entire USE_HOBJ section to be included first. Why include the entire section when all that is needed is access to NEWHANDLE(HOBJ)?
The function prototypes for the method functions of the class appear later in the include file. A create method that returns a handle to the created object is present along with all other method functions for the class (which expect the object handle as the first argument). By convention, NULL is always returned by the ObjDestroy() method. The arguments and return type of all other method functions depend upon the class implementation. Usage of APIENTRY is discussed later in this chapter.
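A sketch of the template, collapsed into one file so that it compiles on its own. The definition of NEWHANDLE() shown here, the empty APIENTRY, the ObjMethod() function and the small implementation at the bottom are all assumptions for illustration; only the overall layout is taken from the description above.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed definitions; the real ones live in the global include file. */
#define NEWHANDLE(Handle) typedef struct tag##Handle *Handle
#define APIENTRY                       /* expands to nothing in this sketch */

/* --- top of the global include file: every handle type --- */
NEWHANDLE(HOBJ);

/* A module that needs the HOBJ prototypes defines USE_HOBJ first. */
#define USE_HOBJ

/* --- later in the global include file --- */
#ifdef USE_HOBJ
HOBJ APIENTRY ObjCreate(void);
long APIENTRY ObjMethod(HOBJ hObj, long lArg);   /* hypothetical method */
HOBJ APIENTRY ObjDestroy(HOBJ hObj);             /* always returns NULL */
#endif

/* --- minimal implementation so the sketch links --- */
struct tagHOBJ {
    long lValue;
};

HOBJ APIENTRY ObjCreate(void)
{
    return calloc(1, sizeof(struct tagHOBJ));
} /* ObjCreate */

long APIENTRY ObjMethod(HOBJ hObj, long lArg)
{
    assert(hObj != NULL);
    hObj->lValue += lArg;
    return hObj->lValue;
} /* ObjMethod */

HOBJ APIENTRY ObjDestroy(HOBJ hObj)
{
    free(hObj);
    return NULL;
} /* ObjDestroy */
```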
Notice that the module prototypes are enclosed within an #ifdef / #endif section. This implements the USE_* include file model, which is discussed next.
6.5 The USE_* Include File Model
When I first started coding in C, I followed the traditional technique of placing data declarations and function prototypes in a separate include file and explicitly including it in whatever source files needed access to the information.
This traditional technique works for small projects, but it does not scale to large projects well, because before you know it, you end up with a lot of include files and a lot of interdependencies between them. To avoid headaches you end up including almost every include file in every source file. At least that is what happened to me. Also, compilation times got longer and longer.
Using the class mechanism described in Chapter 4 eliminated the data interdependency problem, but it still left me with long compile-times and a lot of include files. How could I eliminate all the include files and speed up the compile-times?
As the project I was working on got larger and larger, I began to notice something interesting. The include file models for a small project and a large project are different.
A small project tends to have a few source files with a lot of functions that perform a lot of varied tasks. This requires a lot of include files in each source file.
A large project tends to have highly specialized modules that contain a few method functions and a few internal support functions which implement a specific class object. This requires just a few include files in each source file.
The solution to all my problems came in the form of one global include file with all but a few sections enclosed in #ifdef USE_HOBJ / #endif sections. At the top of the global include file are all NEWHANDLE() declarations and function prototypes to commonly used system level code. Within each #ifdef USE_HOBJ / #endif section are the prototypes for the HOBJ class, where HOBJ is a place holder for the name of the class.
At the top of every module is a list of #define's enumerating what that module uses. Following this is a #include of the global include file.
The combination of the class mechanism and the new USE_* model caused the compilation times to improve dramatically and reduced the number of include files to just one.
6.5.1 Precompiled Headers Surprise
By now, many of you are probably wondering why precompiled headers were not used instead. Well, I tried them. They increased the build time of my large project by 50 percent. I was surprised, but the reasons make sense.
My project is huge with all modules being specialized. Because of this, each module is really not including that much because of the USE_* include model. However, the precompiled header that was being used contained the include information that was needed by all modules. The precompiled header was huge, but the compiler was able to load it fast and start compiling the source file right away. So the extra build time was not due to the huge precompiled header size. Then why was the build time of the project 50 percent longer?
My best guess is that the huge precompiled header was causing longer symbol table search times within the compiler. Without the precompiled header, only a few sections of the include file were being included, which left the compiler's symbol table practically empty and kept symbol table search times short.
6.6 Coding the Module
Now that the module interface has been designed, it is time to start coding the module. It is not enough that a module be coded bug-free. It must also look good and be documented well. The true test of a well-written module is if your coworkers can take that module and read it, understanding everything that is going on without any help from you.
What looks good is highly subjective. However, I highly recommend that you pick some documenting style that works for you and subject the style to a review by your coworkers. After all, you have to read and modify their code and they have to read and modify your code!
6.6.1 The Copyright Header
At the top of every module (source file), there should be a copyright notice. Where I work, it looks something like the following.
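A sketch of such a header; the box layout and the wording are assumptions, and only the three placeholders are taken from the text.

```c
/****************************************************************/
/*                                                              */
/*                        (project name)                        */
/*                                                              */
/*    Copyright (date) (Company Name). All rights reserved.     */
/*                                                              */
/****************************************************************/
```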
(Company Name), (project name) and (date) are placeholders to be filled in by you.
6.6.2 Module Documentation
Following the copyright header is a comment section that describes the module as a whole.
The OUTLINE section. Use this section to describe why the module needs to exist. What is it doing? Pretend that a coworker walked up to you and asked what you are working on.
The IMPLEMENTATION section. Use this section to spell out the major algorithms that you are going to use to implement the module. Again, pretend that a coworker asked you how you are going to implement the module that you just described to him or her.
The NOTES section. This section is a catchall section where you can put anything you want. I use this section for notes that would be helpful to anyone who has to modify the code months down the road. Another use is to document special situations that must be tested before the modified code can be checked back into a version control system.
The pm marker in the comment header is used by an automatic documentation tool.
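The three sections described above might be laid out as in the following sketch. The exact borders and the position of pm immediately after the opening comment characters are assumptions.

```c
/*pm--------------------------------------------------------------
 *
 *  OUTLINE:
 *
 *    (why the module exists and what it does)
 *
 *  IMPLEMENTATION:
 *
 *    (the major algorithms used to implement the module)
 *
 *  NOTES:
 *
 *    (anything helpful to whoever modifies the code later)
 *
 *--------------------------------------------------------------*/
```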
6.6.3 Include File Section and USEWINASSERT
Following the module documentation is a series of #define USE_'s followed by the #include of the global include file and USEWINASSERT.
A module always has at least one #define USE_*, because you always want to #include the section of the include file for the module you are working on.
6.6.4 The Class Declaration
What follows next is usually a class declaration for the object that is being implemented by the module. The class declaration for the random number generator looks like the following.
6.6.5 Prototypes of LOCAL Functions
Following the class declaration are the prototypes for functions that are used and defined only in this module. It is important for proper error checking by the compiler that every function be prototyped before being used or called.
The method functions of the module are prototyped in the global include file and the proper #define USE_* causes them to be included. The functions that are local to this module must not be prototyped in the global include file because they are not part of the module interface that is called by other modules. Instead, they are private to the module and prototyped in the module.
6.6.6 APIENTRY (Method) Functions
I usually organize my module files so that all the functions that are entry points into the module appear at the top of the source file and all local functions appear after the entry point functions.
It is important to provide a comment header for every single function in the module that properly documents the functions. Where I work, a comment header that looks like the following is used.
The DESCRIPTION section. The section starts off with a terse description of the function in parentheses. Following this are the initials of the programmer(s) who worked on this function. The body of this section contains a sentence or two that describe what the function does.
The ARGUMENTS section. This section spells out the arguments that the function takes. There is one line per argument. The name of the argument is listed, followed by a dash and a short description of the argument.
The RETURNS section. This section describes the value that is returned by the function. If there is none, place (void) here.
The NOTES section. The notes section is optional and does not appear in all function comment headers. When present, it serves the same purpose as the notes section in the module comment header. I use this section to leave notes that would be helpful to anyone who has to modify the code months down the road.
The pf marker in the comment header is used by an automatic documentation tool.
Following the comment header is the entry point function itself. The template for an entry point function looks like the following.
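A sketch of a comment header and entry point function in this style. UtilClamp(), the initials ABC, the header borders and the empty definition of APIENTRY are hypothetical and chosen only so the example compiles by itself.

```c
#include <assert.h>

#define APIENTRY   /* assumed empty here, as in a 32-bit flat model */

/*pf--------------------------------------------------------------
 *
 *  DESCRIPTION: (Clamp Value)  ABC
 *
 *    Limits a value to the inclusive range lMin..lMax.
 *
 *  ARGUMENTS:
 *
 *    lValue - The value to clamp
 *    lMin   - Lowest value that may be returned
 *    lMax   - Highest value that may be returned
 *
 *  RETURNS:
 *
 *    lValue, adjusted to lie within lMin..lMax
 *
 *--------------------------------------------------------------*/

long APIENTRY UtilClamp(long lValue, long lMin, long lMax)
{
    assert(lMin <= lMax);
    if (lValue < lMin) {
        lValue = lMin;
    }
    else if (lValue > lMax) {
        lValue = lMax;
    }
    return lValue;

} /* UtilClamp */
```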
Notice that the new-style standard C form of declaring the argument list is used. Also, following the ending brace of the function is a comment containing the name of the function the end brace belongs to. The only thing remaining is the usage of APIENTRY.
APIENTRY is a macro that is used to assign attributes to entry point functions. The important thing to remember is that APIENTRY is used to present a logical view to the programmer. The actual implementation of APIENTRY varies from environment to environment.
For example, if you are programming under the Intel segmented architecture and using Microsoft's C8 compiler, FAR and PASCAL are defined to expand to actual compiler keywords (see §3.2.1). Because of the segmented architecture and the possibility of near or far code, APIENTRY functions must be accessible to other modules and hence are declared as FAR. Using PASCAL is optional but provides a savings in code size due to how arguments are pushed and popped (see §2.1.11).
If you are programming in a 32-bit flat model environment, FAR and PASCAL are defined to be nothing, so the function is public and accessible to other modules, which is the desired behavior.
Finally, if you are programming under Microsoft Windows and writing a DLL, you want to ensure that your API functions are exported. This is done with the EXPORT keyword.
APIENTRY is specifying what to do, not how to do it. The how is left to a macro that can be easily changed at any time. This also allows for a single code base that can be targeted to multiple platforms without any code changes.
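One way the macro could be set up is sketched below. TARGET_SEG16 and BUILD_DLL are hypothetical build switches, PercentOf() is a hypothetical entry point, and the 16-bit branch (using the Microsoft keywords __far, __pascal and __export) is shown for illustration only; only the flat-model branch is exercised here.

```c
#include <assert.h>

/* TARGET_SEG16 and BUILD_DLL are hypothetical build switches. */
#if defined(TARGET_SEG16)              /* segmented 16-bit, Microsoft C8 */
#define FAR    __far
#define PASCAL __pascal
#else                                  /* 32-bit flat model */
#define FAR
#define PASCAL
#endif

#if defined(TARGET_SEG16) && defined(BUILD_DLL)
#define EXPORT __export                /* export API functions from a DLL */
#else
#define EXPORT
#endif

#define APIENTRY FAR PASCAL EXPORT

/* The same source line compiles for every target. */
int APIENTRY PercentOf(int nPart, int nWhole)
{
    assert(nWhole != 0);
    return (nPart * 100) / nWhole;
} /* PercentOf */
```

Retargeting the code base then means changing the macro definitions in one place, not editing every entry point.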
6.6.7 LOCAL Functions
In the process of implementing modules I believe you will quickly find out that it is not always possible or desirable to fully implement an APIENTRY function in one function. You will end up calling support or helper functions that are private to the module you are working on.
The template for a LOCAL function is almost identical to an APIENTRY function.
The comment header and body style are the same except that APIENTRY has been replaced by LOCAL.
Because LOCAL functions are intended to be private functions, callable only by the module they are contained in, you should try to ensure that they can be called only by the one module. Another potential problem is how to name the LOCAL functions. You do not want to have to worry about name conflicts with LOCAL functions in other modules.
Luckily, C provides a solution to these problems.
If a function is declared as static, it is guaranteed by C that the function is visible only to the source file that declared the static function. What this means is that the function cannot be called from other source files because the function is not even visible to them. You no longer have to worry about name conflicts of LOCAL functions between modules because the LOCAL function names are not even visible to the other modules.
If you are programming in a 32-bit flat model environment, LOCAL is defined to be static because NEAR, FASTCALL and PASCAL are all defined to be nothing (see §3.2.1). However, if you are using Microsoft C8, several optimizations can be applied to LOCAL functions.
The first optimization is the usage of NEAR. This tells the compiler that the function is in the same code segment as the caller of the function and that a near (16-bit) function call should be used instead of a far (32-bit) function call. There is a big performance hit when you compare the execution speed of a far call to that of a near call. On an Intel 80486 in protected mode, a far function call can be up to five times as slow as a near function call.
The second optimization is the usage of FASTCALL. This instructs the compiler to attempt to pass as many arguments as possible to the function through the CPU's registers instead of on the stack.
The LOCALASM define is used for local functions containing assembly code. This is needed for Microsoft C8 because assembly code and the register calling convention are incompatible. Using PASCAL provides a savings in code size due to how arguments are pushed and popped (see §2.1.11).
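The pieces of this section can be sketched together as follows. The macro expansions follow the description above, but TARGET_SEG16 is a hypothetical build switch, the function names are hypothetical, and the 16-bit branch is untested illustration.

```c
#include <assert.h>

/* TARGET_SEG16 is a hypothetical build switch for Microsoft C8. */
#if defined(TARGET_SEG16)
#define NEAR     __near
#define FASTCALL __fastcall
#define PASCAL   __pascal
#else                                  /* 32-bit flat model */
#define NEAR
#define FASTCALL
#define PASCAL
#endif

#define APIENTRY                       /* assumed empty for this sketch */
#define LOCAL    static NEAR FASTCALL  /* register calling convention */
#define LOCALASM static NEAR PASCAL    /* for functions with assembly code */

/* Prototypes of LOCAL functions: private to this source file. */
long LOCAL Square(long lValue);

/* Entry point, callable from other modules. */
long APIENTRY MathSumOfSquares(long lA, long lB)
{
    return Square(lA) + Square(lB);
} /* MathSumOfSquares */

/* Helper, invisible outside this source file because it is static. */
long LOCAL Square(long lValue)
{
    /* keep the product within a 32-bit long */
    assert(lValue > -46341L && lValue < 46341L);
    return lValue * lValue;
} /* Square */
```

Another module may define its own Square() without any conflict, since neither function is visible to the linker under that name.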
6.8 Chapter Summary
This book was previously published by Pearson Education, Inc.,
formerly known as Prentice Hall. ISBN: 0-13-183898-9