4. Names in General
Inaccurate names are one of the greatest sources of bugs in software, so much so that naming deserves attention on its own. Everything depends upon the programmer’s precise understanding of what he is working with: specifically, what the code does and the impact(s) it has inside the application. The bugs arise because a programmer THINKS he understands a code element by its name (e.g. what a variable contains, what a routine does, what a function returns, etc.), and programs accordingly. If, however, a name says something OTHER than what the element actually is (or is ambiguous), this kicks the door wide open for the programmer to proceed or make changes according to his MIS-understanding. And voilà: you have bugs. And just as bad: they HIDE and are especially difficult to find because of the erroneous or ambiguous name! You can be looking right at the bug and miss it because “the code looks correct!”
Furthermore, it is observed that:
1. Programmers often create names and then discover later that a different name would more accurately represent the value contained in a variable (or what a routine does or what a function returns). The other side of that coin is when the semantics (meaning) of a variable or function have changed over time, but the name didn’t change with them. Since both of these are common, we find no fault with them. However, this fact requires that we remain alert for them, and refactor when they are observed, to keep the code clean, maximally understandable, and therefore least likely to hide bugs.
2. In a hurry or not, the FIRST name conceived for a concept is often not the best or most accurate name. Again, as this is discovered, it is vital that we remain both prepared to find more perfect names AND set aside time to do so!
Hint
This can involve consulting a thesaurus.
3. Drawing from an historical reference: 90% of the cost of software is incurred AFTER its initial release! Thus, precise programmer understanding of code is required again and again in the life of every piece of source code that spends some of its life “in maintenance”. Reason: every time that code is fixed, modified or enhanced, it must be fully understood prior to modifying it. Thus, the actual long-term cost of an ambiguous name in source code is typically many, many times the cost of the time required to change the name so it is precise. Days, weeks, and even months saved (from not having to find and deal with bugs released into the field) are not uncommon, especially considering the cost of bugs in the field in terms of time and customer goodwill.
Thus, this senior policy is paramount in all parts of our programming:
Senior Policy
PRECISE NAMING must be a real fact within projects, and the time required to achieve this is worth its weight in gold and pays for itself repeatedly during the life of a project.
| Description | Term | What to Name It |
|---|---|---|
| variable | “variable” | what it contains (noun) |
| functions that return something | “function” | what it returns (noun) (not what it does) |
| functions that do not return something | “command” | what it does (imperative verb) |
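
The three rows above can be sketched in C as follows. This is only an illustration: every identifier in it (`pending_sample_count`, `sample_average`, `discard_pending_samples`) is hypothetical and invented for this sketch, and it does not apply any prefix scheme a project may also require.

```c
#include <stddef.h>
#include <stdint.h>

/* Variable: named for what it contains (noun). */
static uint32_t pending_sample_count = 0u;

/* "Function": it returns something, so it is named for what it returns (noun). */
uint32_t sample_average(const uint16_t *samples, size_t sample_count)
{
    uint32_t sample_sum = 0u;
    size_t   i;

    if (sample_count == 0u) {
        return 0u;  /* Assumed convention for this sketch: no samples averages to 0. */
    }
    for (i = 0u; i < sample_count; i++) {
        sample_sum += samples[i];
    }
    return sample_sum / (uint32_t)sample_count;
}

/* "Command": it returns nothing, so it is named for what it does (imperative verb). */
void discard_pending_samples(void)
{
    pending_sample_count = 0u;
}
```

A caller then reads almost like a sentence: `average = sample_average(samples, count);` assigns a noun to a noun, while `discard_pending_samples();` is plainly an order being given.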
Corollary
The vocabulary used within the code MUST be the vocabulary of the problem domain, and it must be PRECISELY defined within the code itself. (Reason: the programmer, and possibly other programmers who will maintain that code, are not always going to be experts in that problem domain, yet clarity and precise understanding are vital! And communication with domain experts must be precisely understood by all involved.) Failure to do this causes ambiguities within the code, and a MYRIAD of bugs, all due to the programmer not FULLY grasping what the code is doing, or the real meaning of the data and routines he is working with.
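
As a minimal sketch of what “precisely defined within the code itself” could look like, assume a made-up thermostat domain. The terms, the type `DeciCelsius` and the function `heating_needed` are all assumptions invented for this illustration; a real project would take its terms and definitions from its own domain experts.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Domain vocabulary (hypothetical thermostat domain), defined here so that
 * no maintainer has to guess:
 *
 *   setpoint   - the temperature the user has asked for.
 *   hysteresis - how far below the setpoint the measured temperature must
 *                fall before heating is requested.
 *
 * All three values use the same unit: tenths of a degree Celsius
 * (215 means 21.5 degrees C).
 */
typedef int16_t DeciCelsius;

/* Function: named for what it returns (true when heating is needed). */
bool heating_needed(DeciCelsius measured_temperature,
                    DeciCelsius setpoint,
                    DeciCelsius hysteresis)
{
    return measured_temperature < (setpoint - hysteresis);
}
```

The definitions sit next to the first use of the terms, so a programmer who is not a domain expert meets the exact meaning (including the unit) before meeting the logic.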
Given 1-3 above, it helps to have either an IDE or an editor that makes changing names (in both active code AND comments) safe and easy.
Also, while it might be obvious, I am spelling it out here so that there are no assumptions:
Avoid Name Conflicts
Names created by the programmer (macro, variable, function and module names) should not overlap in ANY way with C keywords or with the function, variable or macro names in the C Standard Library. Rationale: a keyword used as a name is rejected outright, but a clash with a library name will often compile without complaint, and doing so would invite subtle, hard-to-find bugs into the system.
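
A small sketch of the idea, assuming a hypothetical event-log module (all names below are invented for illustration). The tempting short name for the append routine would be `log`, but that is the name of the natural-logarithm function in `<math.h>`. The C standard reserves library names with external linkage, so defining your own is undefined behavior even when a particular build appears to work. A module prefix keeps the names clear of the library:

```c
#include <stddef.h>
#include <stdint.h>

#define EVENT_LOG_CAPACITY 8u

static uint16_t event_log_codes[EVENT_LOG_CAPACITY];
static size_t   event_log_count = 0u;

/* Command: imperative verb, with a module prefix that cannot collide with log(). */
void event_log_append(uint16_t event_code)
{
    if (event_log_count < EVENT_LOG_CAPACITY) {
        event_log_codes[event_log_count] = event_code;
        event_log_count++;
    }
    /* Assumption for this sketch: events beyond capacity are silently dropped. */
}

/* Function: named for what it returns. */
size_t event_log_length(void)
{
    return event_log_count;
}
```

The prefix costs a few characters and removes any question about which `log` a reader (or the linker) is looking at.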
Also, something that applies to all names:
Readability Is More Important than Long Names
It was once taught in the software industry that using brief names (such as 1 letter) for variables was a bad idea. However, this author finds that such brief names have their place when their scope is local or at least limited. Brevity in names can, at times, increase readability (and therefore understandability), and names that are too long can reduce it, so if a shorter name will be just as clear as a longer name, the shorter name is preferred. Note below that while global names (variable and function) are always spelled out, care should be taken not to make them so long that they become difficult to read.
Note carefully that the “just as clear” criterion above is not just important, but VITAL to this policy. Example:
    for (i = 0; i < liSomeArrayCount; i++) {
        if (liaSomeArray[i] == 1 << i) {
            break;
        }
    }
Note that this is ENORMOUSLY readable because there is NO AMBIGUITY about what `i` contains or what it means. Contrast this with:
    uint8_t ambiguous_func_name(int r) {
        if (L >= r / 12) {
            G = L + r;
        }
    }
which could be considered nothing less than a disaster in terms of understandability. It leaves the reader wondering:
- What is the meaning of `r`’s contents?
- Why is it being divided by 12?
- What is `L`?
- What is `G`?
- What impacts to the system does this have?

And yes, I have seen this kind of code and had to decipher what it was doing, and yes, `L` and `G` were global variables. Bugs love to lurk in code like this.
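
For contrast, here is a sketch of the same logic with names that answer those questions. The real meanings of `r`, `L`, `G` and the constant 12 are not known, so every meaning below is INVENTED purely for illustration: `r` is assumed to be a crate height in inches, `L` a shelf height limit in feet, `G` a running total, and 12 the number of inches per foot. The arithmetic itself is unchanged. Since nothing is returned, the return type is shown as `void` and the name is an imperative verb.

```c
#define INCHES_PER_FOOT 12  /* names the magic number 12 (assumed meaning) */

/* Globals, spelled out in full. Both meanings are invented for this sketch. */
static int shelf_height_limit_feet = 0;  /* was L */
static int stacked_height_total    = 0;  /* was G */

/* Command (returns nothing), so it is named with an imperative verb. */
void update_stacked_height_total(int crate_height_inches)  /* was r */
{
    if (shelf_height_limit_feet >= crate_height_inches / INCHES_PER_FOOT) {
        /* Same arithmetic as the original: feet + inches. Under the invented
         * meanings, the names make the mixed units visible at a glance. */
        stacked_height_total = shelf_height_limit_feet + crate_height_inches;
    }
}
```

Whether the original code really mixed units cannot be known from what is shown here. The point is only that, once names carry meaning and units, a question like “should feet be added to inches?” can be asked at all, whereas `G = L + r` gives the reader nothing to question.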