4.1 Make the Compiler Find the Bugs
===================================
Finding bugs is tedious. If a filesystem contains two million files,
and a find command line should print one million of them but in fact
misses 1%, you can tell that the program is printing the wrong result
only if you know the right answer for that filesystem at that time. If
you don't know this, you may never find out about the bug. For this
reason it is important to have a comprehensive test suite.
The test suite is of course not the only way to find the bugs. The
findutils source code makes liberal use of the 'assert' macro. Although
these assertions might cost some performance, the impact of most of
them is negligible compared to the time taken to fetch even one sector
from a disk drive.
Assertions should not be used to check the results of operations
which may be affected by the program's external environment. For
example, never assert that a file could be opened successfully. Errors
relating to problems with the program's execution environment should be
diagnosed with a user-oriented error message. An assertion failure
should always denote a bug in the program.
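
As a minimal sketch of this distinction, the function below asserts
only on a condition that a correct caller can never violate, and
reports a failure to open the file with an ordinary error message.
The name 'count_bytes' is hypothetical and is not taken from the
findutils sources.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count the bytes in the file PATH.
   Returns -1 after printing a message if the file cannot be opened.  */
long
count_bytes (const char *path)
{
  FILE *fp;
  long n = 0;

  /* A null PATH can only come from a bug in the caller, so assert.  */
  assert (path != NULL);

  fp = fopen (path, "rb");
  if (fp == NULL)
    {
      /* This depends on the environment, so diagnose it for the user
         rather than asserting.  */
      fprintf (stderr, "cannot open %s: %s\n", path, strerror (errno));
      return -1;
    }

  while (fgetc (fp) != EOF)
    ++n;
  fclose (fp);
  return n;
}
```

A missing file here produces a message and a failure return value; the
program aborts only if its own logic is wrong.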
Avoid using 'assert' to mark not-fully-implemented features of your
code as such. Finish the implementation, disable the code, or leave the
unfinished version on a local branch.
Several programs in the findutils suite perform self-checks. See for
example the function 'pred_sanity_check' in 'find/pred.c'. This is
generally desirable.
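
To illustrate the general idea of a self-check (this is not what
'pred_sanity_check' itself does), here is a small sketch in which a
program verifies an invariant of its own static data. The table and
the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table which other code assumes is sorted.  */
static const char *const keyword_table[] =
  { "mtime", "name", "print", "type" };
#define N_KEYWORDS (sizeof keyword_table / sizeof keyword_table[0])

/* Return 1 if the table is in strictly ascending order, else 0.  */
int
keyword_table_is_sorted (void)
{
  size_t i;

  for (i = 1; i < N_KEYWORDS; ++i)
    if (strcmp (keyword_table[i - 1], keyword_table[i]) >= 0)
      return 0;
  return 1;
}

/* Self-check: a failure here denotes a bug in the program itself,
   for example an entry added in the wrong place.  */
void
keyword_table_sanity_check (void)
{
  assert (keyword_table_is_sorted ());
}
```

Calling such a check once at startup costs very little, and it catches
a misplaced table entry the first time the program is run.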
There are also a number of small ways in which we can help the
compiler to find the bugs for us.
4.1.1 Constants in Equality Testing
-----------------------------------
It's a common error to write '=' when '==' is meant. Sometimes this
happens in new code and is simply due to finger trouble. Sometimes it
is the result of the inadvertent deletion of a character. In any case,
there is a subset of cases where we can persuade the compiler to
generate an error message when we make this mistake; this is where the
equality test is with a constant.
This is an example of a vulnerable piece of code:

     if (x == 2)
       ...

A simple typo converts the above into

     if (x = 2)
       ...

We've introduced a bug; the condition is always true, and the value
of 'x' has been changed. However, a simple change to our practice would
have made us immune to this problem:

     if (2 == x)
       ...

With the constant on the left, the same typo produces 'if (2 = x)',
which the compiler rejects because a constant cannot be assigned to.
Usually, the Emacs keystroke 'M-t' can be used to swap the operands.
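
Here is a complete function written in this style, as a small sketch.
The name 'is_two' is purely illustrative.

```c
/* Hypothetical example function, not taken from findutils.  */
int
is_two (int x)
{
  /* With the constant first, mistyping '==' as '=' would give
     'if (2 = x)', which does not compile because 2 is not an
     lvalue.  */
  if (2 == x)
    return 1;
  return 0;
}
```

The technique helps only when one operand is a constant; a comparison
between two variables gets no such protection.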
4.1.2 Spelling of ASCII NUL
---------------------------
Strings in C are just sequences of characters terminated by a NUL. The
ASCII NUL character has the numerical value zero. It is normally
represented in C code as '\0'. Here is a typical piece of C code:

     *p = '\0';

Consider what happens if there is an unfortunate typo:

     *p = '0';

We have changed the meaning of our program and the compiler cannot
diagnose this as an error. Our string is no longer terminated. Bad
things will probably happen. It would be better if the compiler could
help us diagnose this problem.
In C, the type of ''\0'' is in fact int, not char. This provides us
with a simple way to avoid this error. The plain integer constant 0,
written without quotes, has the same value and type as the constant
''\0''. However, it is not as vulnerable to typos. For this reason I
normally prefer to use this code:

     *p = 0;
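
The following sketch shows both points: a string terminated with a
plain 0, and a check that a character constant has the size of an int.
The function names are hypothetical, and the size check holds only
when the code is compiled as C, since C++ gives character constants
type char.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: shorten the string S to at most LEN
   characters.  */
void
truncate_string (char *s, size_t len)
{
  if (len < strlen (s))
    s[len] = 0;   /* Same effect as '\0', but harder to mistype.  */
}

/* Return nonzero if a character constant occupies as much storage
   as an int, as it does in C.  */
int
char_constant_is_int (void)
{
  return sizeof '\0' == sizeof (int);
}
```

Had the terminator been mistyped as the quoted digit, the string would
have kept its original length and gained a stray digit character.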