6. Variable Names

Variable names are almost always prefixed with SCOPE, TYPE, POINTER and ARRAY prefixes so that the code is significantly easier to understand without having to look elsewhere in the code to discover the type and size of a variable. As a side effect, this speeds up development of code that uses this standard, and it provides a buffer against future bugs caused by the code not being correctly understood by the programmer using or modifying it.

There are valid exceptions to this where readability, and thus understandability, are both increased. See Prefix Exceptions for details.

6.1. Variable Name Parts

The pattern is:

 _____________Prefix____________  ____Name____
/                               \/            \
<scope><type>[<pointer>][<array>]<precise_name>

where [...] indicates an optional part of the prefix.

6.1.1. Scope Prefixes

g  = global          (in global storage)
s  = static          (in global storage, but with module [C file] scope)
i  = instance        (instance or struct scope; "instance" is from O-O terminology)
l  = local           (on stack or in registers)
ls = local static    (in global storage, but only accessible locally)
a  = argument        (on stack or in registers)
k  = linker          (symbol is provided by linker script)
r  = row or record   (in a field name of a database table, a column name of a
                      .NET System.Data.DataTable or a column name of a
                      Windows.Forms.DataGridView column)

Note

Pointers that point to dynamically-allocated objects in memory are always going to be in one of the above scopes, and thus their names will be so prefixed, but they will point to objects that are on the heap instead of to fixed locations in RAM.
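As a minimal sketch of how the scope prefixes above might look together in one C module (rect_t, RectArea, RectCreate and every variable name here are hypothetical illustrations, not taken from any existing library):

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; they only illustrate the scope prefixes. */

uint32_t         gui32TickCount = 0;    /* g:  global storage, visible to all modules    */
static uint32_t  sui32CallCount = 0;    /* s:  global storage, visible in this file only */

typedef struct {
    uint16_t  iui16Width;               /* i:  instance (struct member) scope */
    uint16_t  iui16Height;
} rect_t;

/* 'a' marks arguments, 'l' marks locals, 'ls' marks local statics. */
uint32_t RectArea(const rect_t * apRect)
{
    static uint32_t  lsui32LargestSeen = 0;   /* ls: persists between calls    */
    uint32_t         lui32Area;               /* l:  on stack or in a register */

    sui32CallCount++;
    lui32Area = (uint32_t)apRect->iui16Width * apRect->iui16Height;
    if (lui32Area > lsui32LargestSeen) {
        lsui32LargestSeen = lui32Area;
    }
    return lui32Area;
}

/* The pointer itself has local scope (hence 'l'), even though the object
 * it points to lives on the heap.  The caller must free() the result. */
rect_t * RectCreate(uint16_t aui16Width, uint16_t aui16Height)
{
    rect_t * lpRect = malloc(sizeof *lpRect);

    if (lpRect != NULL) {
        lpRect->iui16Width  = aui16Width;
        lpRect->iui16Height = aui16Height;
    }
    return lpRect;
}
```

Note that rect_t carries no TYPE prefix in lpRect and apRect, since the struct name itself is the data type.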

6.1.2. Type Prefixes

c    = char               (character value or a single element of a string)
uc   = unsigned char      ('u' only used when unsigned-ness is important)
b    = unsigned char      (byte. i.e. uint8_t)
s    = string             (C string; NUL-terminated char array)
f    = float              (IEEE 754 32-bit single-precision floating point value)
f32  = float              (IEEE 754 32-bit single-precision floating point value)
d    = double             (caution:  see Ambiguous Meaning of double with XC Compilers below)
ld   = long double        (IEEE 754 64-bit double-precision floating point value)
d64  = long double        (IEEE 754 64-bit double-precision floating point value)
h    = handle
q15  = Q15 (_Q0_15) fixed-point (an int16_t under C30 & XC16 compilers)
q16  = Q16 (_Q15_16) fixed-point (an int32_t under C30 & XC16 compilers)
v    = void
v2   = 2D vector
v3   = 3D vector
que  = Queue              (e.g. in a .NET language)
re   = Regular Expression (e.g. in a .NET language or Python)
lst  = List(Of ...)       (e.g. in a .NET language or Python)
dec  = Decimal            (a .NET and SQL Server data type adept at
                           handling money values and is the data type
                           used by Numeric Up-Down Controls)
hash = hash table or .NET / Python dictionary (which is a hash table internally)
tpl  = tuple (.NET / Python)
e    = enum(eration)      (an int under most compilers)
u    = Union (Note there is no type prefix for a struct, since the name
              itself IS THE DATA TYPE.  A lone 'u' prefix alerts the programmer
              to the caveats associated with unions.)
bv   = bit-vector types   (e.g. arrays of unsigned integral types like u32).
dma  = variables defined as __attribute__((coherent)) or CACHE_ALIGN
       (or otherwise) that are special because they let the CPU
       coherently share RAM with DMA, such that both entities are
       seeing "the same" data (CPU bypasses L1 cache to access this RAM).
4b   = 4-bit (used in e.g. i4baUsedModesMap where this field is the
       base of a nibble array).

6.1.2.1. Portable Integer Type Prefixes

These are heavily used for clarity and portability:

Prefix  GenericTypeDefs.h  stdint.h
------  -----------------  --------
i8      INT8               int8_t
i16     INT16              int16_t
i32     INT32              int32_t
i64     INT64              int64_t
iptr    n/a                intptr_t
ui8     UINT8    BYTE      uint8_t
ui16    UINT16   WORD      uint16_t
ui32    UINT32   DWORD     uint32_t
ui64    UINT64   QWORD     uint64_t
uiptr   n/a                uintptr_t

Circa 2013, before the release of the Microchip XC family of compilers, Microchip favored the use of the capitalized versions of the above (center column) for portability across multiple processors, from either Generic.h or GenericTypeDefs.h (one of which shipped in the ./include/ directory with each compiler). However, with the release of the XC family of compilers, Microchip now favors the integer types in stdint.h for portability, which ships in the ./include/ directory with each of the XC compilers.

Texas Instruments appears to have favored the stdint.h version for portability across multiple processors for a long time.

Therefore, all new HM code and libraries being written will use the stdint.h version starting in May 2016.
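The following sketch shows how the stdint.h column of the table might appear in practice. The struct packetHeader_t, its members and the function HeaderAdjustedOffset are hypothetical and exist only to show each prefix next to the type it labels:

```c
#include <stdint.h>

/* Hypothetical names; each prefix mirrors the stdint.h type it labels. */
typedef struct {
    int8_t    ii8Trim;          /* i (instance) + i8   */
    uint8_t   iui8Flags;        /* i (instance) + ui8  */
    int16_t   ii16Offset;       /* i (instance) + i16  */
    uint16_t  iui16Length;      /* i (instance) + ui16 */
    uint32_t  iui32Crc;         /* i (instance) + ui32 */
} packetHeader_t;

/* Add the 8-bit trim to the 16-bit offset in a 32-bit result, so the
 * sizes involved are visible from the names alone. */
int32_t HeaderAdjustedOffset(const packetHeader_t * apHeader)
{
    int32_t  li32Result;

    li32Result = (int32_t)apHeader->ii16Offset + apHeader->ii8Trim;
    return li32Result;
}
```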

6.1.2.2. Non-Portable Integer Type Prefixes

Portable integer types are normally preferred over non-portable types, but sometimes non-portable types are used intentionally for the sake of speed across different platforms when the size of the integer does not matter. For example, where a loop counts from 0 to 99, you can use int or unsigned int so that the loop is equally efficient on an 8-, 16-, 32- or 64-bit platform.

In most cases, int and unsigned int are the size of the CPU’s registers as well as the size of (number of bits in) the data bus between the CPU and RAM, which (in most cases) causes writes to int and unsigned int values in RAM to also be an atomic operation since they are carried out with a single CPU instruction.

When these types are used, these are their type prefixes:

i    = int                (CAUTION:  not portable except where number of bits is not important!)
ui   = unsigned int       (CAUTION:  not portable except where number of bits is not important!)
l    = long               (CAUTION:  not portable across all C compilers)
ul   = unsigned long      (CAUTION:  not portable across all C compilers)
ll   = long long          (CAUTION:  not portable across all C compilers)
ull  = unsigned long long (CAUTION:  not portable across all C compilers)
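A small sketch of the 0-to-99 case described above: the counter is a plain int (prefix 'i') because its width does not matter, while the accumulator keeps a portable type because its width does. The function name SumFirstHundred is hypothetical:

```c
#include <stdint.h>

/* Hypothetical example.  The counter only runs 0..99, so a plain int is
 * enough; the sum is sized explicitly because 100 * 255 exceeds 8 bits. */
uint32_t SumFirstHundred(const uint8_t * aui8pValues)
{
    uint32_t  lui32Sum = 0;
    int       liIndex;

    for (liIndex = 0;  liIndex < 100;  liIndex++) {
        lui32Sum += aui8pValues[liIndex];
    }
    return lui32Sum;
}
```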

6.1.2.3. Enumeration Prefix

e:

enumeration

Initially I had a difference between naming enumerations in C vs VB.NET. Specifically, in VB.NET, it “reads nicely” to have something like this:

Public Enum eButtonState_t
    Open
    Closed
    Held
    ...
End Enum

Public Structure Buttons
    Public iiLeftHandlebarCenterButtonState As eButtonState_t
    ...
End Structure

Public gButtonStates As Buttons

gButtonStates.iiLeftHandlebarCenterButtonState = eButtonState_t.Open
gButtonStates.iiLeftHandlebarCenterButtonState = eButtonState_t.Closed
gButtonStates.iiLeftHandlebarCenterButtonState = eButtonState_t.Held
etc.

However, in C, including the eButtonState_t in naming the enumeration value is a syntax error, so the advantage of the ‘e’ type prefix would be lost:

typedef enum {
    Open,
    Closed,
    Held,
    ...
} eButtonState_t;

typedef struct {
    eButtonState_t  iiLeftHandlebarCenterButtonState;
    ...
} buttons_t;

buttons_t  gButtonStates;

gButtonStates.iiLeftHandlebarCenterButtonState = Open;   /* Type knowledge lost here! */
gButtonStates.iiLeftHandlebarCenterButtonState = Closed;
gButtonStates.iiLeftHandlebarCenterButtonState = Held;
etc.

...not to mention that we VERY MUCH DO NOT WANT a name like “Open” and “Closed” in the global namespace! (God forbid!)

Over the years, I have found that in many cases, especially when writing VB.NET or C# software to work with C firmware, it can be exceedingly helpful to use enumerations with identical value names, so that I can literally copy/paste potentially large enumerations between C and VB.NET code without having to change the names. I also find that I get the best of BOTH worlds (code readability) when I not only include the ‘e’ prefix on both the type and the enumeration names, but also include the type in the prefix of the enumeration name. Some of this might seem redundant until you see all cases and find that the code remains very readable (with no ambiguity) in all of them. Mere knowledge that it serves the C, C# and VB.NET worlds with identical enumeration value names is enough to eliminate the annoying feeling you might get when you see the redundancy in the VB.NET code.

So in VB.NET:

Public Enum eButtonState_t
    eButtonState_Open
    eButtonState_Closed
    eButtonState_Held
    ...
End Enum
...
gButtonStates.iiLeftHandlebarCenterButtonState = eButtonState_t.eButtonState_Open
gButtonStates.iiLeftHandlebarCenterButtonState = eButtonState_t.eButtonState_Closed
gButtonStates.iiLeftHandlebarCenterButtonState = eButtonState_t.eButtonState_Held

and in C:

typedef enum {
    eButtonState_Open,
    eButtonState_Closed,
    eButtonState_Held,
    ...
} eButtonState_t;
...
gButtonStates.iiLeftHandlebarCenterButtonState = eButtonState_Open;
gButtonStates.iiLeftHandlebarCenterButtonState = eButtonState_Closed;
gButtonStates.iiLeftHandlebarCenterButtonState = eButtonState_Held;

And so I have adopted that the ‘e’ prefix will go on BOTH type name AND enumeration value name. In my opinion, under extensive actual use, the benefit gained by being able to almost verbatim copy/paste enumerations between C and VB.NET source code files outweighs the reading redundancy in the VB.NET version, and the source code does not suffer from ambiguity in either case. So that is the policy.
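For reference, here is a compact, compilable C rendering of that policy. Only the three values named in this section are listed (the original enumeration elides the rest), and the helper ButtonIsPressed is a hypothetical function added purely to show the value names in use:

```c
#include <stdbool.h>

/* 'e' prefix on the type AND on every value; the type name is repeated in
 * each value name so nothing generic like "Open" lands in the global
 * namespace. */
typedef enum {
    eButtonState_Open,
    eButtonState_Closed,
    eButtonState_Held
} eButtonState_t;

/* Hypothetical helper: true for any state other than Open. */
bool ButtonIsPressed(eButtonState_t aeState)
{
    return (aeState != eButtonState_Open) ? true : false;
}
```

The body of the typedef can be pasted into a VB.NET Enum block almost unchanged, which is the point of the policy.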

See also Prefix Exceptions below for valuable and desirable exceptions to this.

6.1.2.4. Supplemental Type Prefixes (see GenericTypeDefs.h and stdbool.h)

  • bool = bool

    Caution re BOOL type (enum values: TRUE, FALSE): because BOOL is an enumeration type, it is an int and therefore does not have a consistent size across platforms. The XC compilers brought the bool type (from stdbool.h), which is consistently an 8-bit value, but legacy apps (written before the XC compilers) still use BOOL. Just be aware of the size issue.

  • bit = BIT (values: CLEAR, SET)

    Caution: BIT size is not consistent across platforms (BIT is an enum and therefore an int). Both of the above (BOOL and BIT) are used when the variable will only ever contain a 1 or 0. However, the concept of bool can extend further than this, such as:

    /* Don't do this. */
    lboolNandChipReady = PORTE & _PORTE_RE4_MASK;
    if (lboolNandChipReady) {
        ...
    }
    

    In such a case, lboolNandChipReady should be declared as uint32_t (or uint16_t or uint8_t, depending on the microcontroller) to ensure it has the capacity to receive the indicated bits or whatever value is being assigned. See also the note on bool below. Because of this, it can sometimes be ambiguous to use bool as a TYPE Prefix, and when it is, disambiguating (by using the appropriate integer TYPE Prefix) is more important than continuing to use bool. Thus, for clarity, the above might well have been coded instead as:

    /* Do this instead. */
    lui32NandChipReady = PORTE & _PORTE_RE4_MASK;
    if (lui32NandChipReady) {
        ...
    }
    

    The above code makes it clear to the reader that the bit will not be lost because the variable DOES have the capacity for it. On the other hand, sometimes it is desirable to maintain the bool prefix for clarity of meaning, that it is designed to carry a TRUE or FALSE value. In the end, the CLARITY of the code, that it is VISIBLE that there are no bugs, takes precedence in such choices.

6.1.2.4.1. Note and Warning About bool

On a PIC32 platform, as an enum, BOOL is technically an int and therefore occupies 32 bits of RAM. If memory is scarce, this can be wasteful. While some of Microchip’s compiler/linkers appear to have optimizations enabling them to use only 8 bits for this, if memory is precious, this author finds it clearer not to rely on those optimizations and simply to declare a bool variable as UINT8 or uint8_t. When the linker groups these together (which it does when they are declared together, or when certain optimizations are turned on), they indeed occupy only 8 bits of RAM.

Of note is the fact that Microchip’s new XC-series compiler/linkers now ship with <stdbool.h> which defines the bool type (not BOOL) as having 8 bits. Problem solved.

More important, however, is CORRECTNESS! Given the above-mentioned optimization, and now that it is confirmed that TI’s GNU tool chain also optimizes a BOOL into something smaller, the policy in low-level design is: never assume that something like:

lboolWasPreviouslyVisible = aHwnd->iui32Style & WS_VISIBLE;

is going to give you the intended results!

Case in point, on the TI GNU tool chain:

Given:

#define WS_VISIBLE          0x10000000ul
aHwnd->iui32Style == 0x10800000ul
BOOL  lboolWasPreviouslyVisible;  /* Local variable definition. */
lboolWasPreviouslyVisible = aHwnd->iui32Style & WS_VISIBLE;

produces a lboolWasPreviouslyVisible that is FALSE (0) on the TI GNU tool chain, whereas on the PIC32 tool chain it is TRUE! But if the variable is defined as bool with the XC32 compiler (1 byte), then the result is FALSE!

Instead, do it this way, and it will be both reliable and portable:

lboolWasPreviouslyVisible = (aHwnd->iui32Style & WS_VISIBLE) ? true : false;

or

lui32WasPreviouslyVisible = aHwnd->iui32Style & WS_VISIBLE;
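The difference between the three styles can be checked on any host compiler. In the sketch below, uint8_t stands in for any boolean type that is really an 8-bit integer, and the three function names are hypothetical; the mask and test value are the ones used above:

```c
#include <stdbool.h>
#include <stdint.h>

#define WS_VISIBLE  0x10000000ul

/* Narrowing store: only the low 8 bits survive, so bit 28 is lost.
 * uint8_t is a stand-in for an 8-bit integer used as a boolean. */
uint8_t VisibleTruncated(uint32_t aui32Style)
{
    uint8_t  lui8WasVisible = (uint8_t)(aui32Style & WS_VISIBLE);

    return lui8WasVisible;
}

/* Normalized store: the test result is reduced to true/false first, so
 * the width of the destination no longer matters. */
bool VisibleNormalized(uint32_t aui32Style)
{
    bool  lboolWasVisible = (aui32Style & WS_VISIBLE) ? true : false;

    return lboolWasVisible;
}

/* Wide store: the destination has room for the masked bit as-is. */
uint32_t VisibleWide(uint32_t aui32Style)
{
    uint32_t  lui32WasVisible = (uint32_t)(aui32Style & WS_VISIBLE);

    return lui32WasVisible;
}
```

With aui32Style equal to 0x10800000, the truncated version yields 0 while the other two yield a non-zero (true) result.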

See also Prefix Exceptions below for valuable and desirable exceptions to this.

6.1.3. Pointer Prefix

p:

Pointer. If the variable is a pointer, ‘p’ is always placed after the TYPE Prefix. Reason: even though this makes the ‘p’ slightly less visible in the context of a long TYPE Prefix, it follows English grammar sequence for “pronouncing” the meaning of the variable. Example: sui8paEccBytes is pronounced left-to-right as “static byte pointer array Ecc Bytes”, which is exactly what it means in correct grammar.

pe:

Pointer in Extended Data Space (EDS) RAM. This is applicable to a certain microcontroller that we are using (dsPIC33E/PIC24E) which has RAM above the 32KB boundary addressed using 2 additional registers (DSRPAG and DSWPAG) to contain part of the full address. In this case, passing an EDS pointer as an argument to a function is indeed passing a 32-bit value instead of a 16-bit value.

fxnp:

Function pointer; here we do not use a lone ‘f’ to mean “function” since ‘f’ is already used to mean float, and thus ‘fp’ prefix would mean “float pointer”.
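A brief sketch of the ‘p’ and ‘fxnp’ prefixes in use. The typedef byteFilter_t and the functions InvertByte and FilterBytes are hypothetical names chosen for illustration; the ‘pe’ prefix is omitted because it depends on a compiler-specific EDS qualifier:

```c
#include <stdint.h>

/* Hypothetical call-back type used by FilterBytes() below. */
typedef uint8_t (*byteFilter_t)(uint8_t aui8Value);

uint8_t InvertByte(uint8_t aui8Value)
{
    return (uint8_t)~aui8Value;
}

/* aui8pData:   argument, uint8_t, pointer
 * afxnpFilter: argument, function pointer */
void FilterBytes(uint8_t * aui8pData, uint32_t aui32Count, byteFilter_t afxnpFilter)
{
    uint8_t * lui8pCursor = aui8pData;               /* local, uint8_t, pointer */
    uint8_t * lui8pEnd    = aui8pData + aui32Count;  /* one past the last byte  */

    while (lui8pCursor < lui8pEnd) {
        *lui8pCursor = afxnpFilter(*lui8pCursor);
        lui8pCursor++;
    }
}
```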

See also Prefix Exceptions below for valuable and desirable exceptions to this.

6.1.4. Array Prefix

a:

array; the type of the array elements is contained in the TYPE Prefix.

Examples:

lcaVarName   = local character array
li16aVarName = local signed 16-bit array
sui32VarName = static uint32_t

Exception:

If this rule and/or capitalization (e.g. of a proper name that is all caps, or a well-understood acronym) ever makes the meaning of the name difficult to read, it is entirely acceptable to insert an underscore (‘_’) to separate words to make it more readable. Reason: understandability (the primary purpose of having a Coding Standard at all) is senior to Policy.

Example:

gui32SYSCLKHz    /* Difficult to read; slower to understand. */

changes to:

gui32_SYSCLK_Hz  /* Understanding is faster and clearer. */

See also Prefix Exceptions below for valuable and desirable exceptions to this.

6.2. Constants

Constants, defined with const in the C language, are named like any other variables. The gain is that the compiler will issue an error if any assignment other than an initializer is attempted. Further, for most Microchip microcontrollers, defining a variable with the const modifier causes it to be stored in program space. This is important to know, because it can free up RAM resources for other uses when a variable will have the same value throughout system execution. Example: the unchanging binary data used in screen fonts.
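As an illustration of the font example, the sketch below defines a small const table and reads from it. The glyph data, the array name sui8aGlyphLetterT and the function GlyphRowPixelCount are all hypothetical; whether the table actually lands in program space depends on the target and tool chain:

```c
#include <stdint.h>

/* Hypothetical 8x8 glyph.  Named like any other variable: static scope,
 * uint8_t type, array.  Only the initializer may ever write to it. */
static const uint8_t  sui8aGlyphLetterT[8] = {
    0xFF, 0x18, 0x18, 0x18, 0x18, 0x18, 0x18, 0x00
};

/* Count the set pixels in one row of the glyph (row index wraps at 8). */
uint8_t GlyphRowPixelCount(uint8_t aui8Row)
{
    uint8_t  lui8Bits  = sui8aGlyphLetterT[aui8Row & 7u];
    uint8_t  lui8Count = 0;

    while (lui8Bits != 0u) {
        lui8Count = (uint8_t)(lui8Count + (lui8Bits & 1u));
        lui8Bits >>= 1;
    }
    return lui8Count;
}

/* A statement such as  sui8aGlyphLetterT[0] = 0;  is rejected by the
 * compiler because the array is const. */
```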

6.3. DMA Buffers

There is another kind of memory, sort of. Actually it is an “address space” that is special: it is not the memory itself, but how the CPU treats it. On the PIC32M microcontrollers, RAM is read THROUGH an L1 cache. However, this does not work with DMA. When DMA has written to an area of RAM, the L1 cache does not “already know about it”, so if the CPU reads from that area, it will read zeros for about 40 bytes, until the L1 cache has to start fetching data from actual RAM again! We define this type of RAM as __attribute__((coherent)) or CACHE_ALIGN (a macro in toolchain_specifics.h for the same thing). This gets the CPU to not use the L1 cache for that area of RAM and instead read and write to it directly. While this is a tiny bit slower than accessing RAM through the L1 cache, it is a requirement when exchanging data with a peripheral (and usually an external chip or device) using DMA.

And I have missed this more times than I care to admit. Had we had a variable prefix for this type of RAM before, hours and possibly days could have been saved! Thus...

‘dma’ is the prefix we now use to label this kind of RAM.

Exception:

If this rule and/or capitalization (e.g. of a proper name that is all caps, or a well-understood acronym) ever makes the meaning of the name difficult to read, it is entirely acceptable to insert an underscore (‘_’) to separate prefixes to make it more readable. Reason: understandability (the primary purpose of having a Coding Standard at all) is senior to Policy.

Example:

CACHE_ALIGN
uint8_t      gdmaui8aPixelBuffer[mLLSM__OUTPUT_BUFFER_SIZE_IN_BYTES];

and

CACHE_ALIGN
uint8_t      gdmaui8aJEDEC_ID[4];

change to:

CACHE_ALIGN
uint8_t      gdma_ui8aPixelBuffer[mLLSM__OUTPUT_BUFFER_SIZE_IN_BYTES];

and

CACHE_ALIGN
uint8_t      gdma_ui8aJEDEC_ID[4];

whereas

CACHE_ALIGN
FlashImageAfterMatter_t      gdmaFlashImageAfterMatter;

is okay as it is.
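A sketch of how a ‘dma’ buffer might be consumed is shown below. It rests on several assumptions: on the target, CACHE_ALIGN would come from toolchain_specifics.h, so the empty fallback here exists only to let the sketch compile elsewhere and provides no coherency at all; the buffer name, the size macro and the function RxBufferCopyOut are hypothetical:

```c
#include <stdint.h>

/* ASSUMPTION: the real CACHE_ALIGN is supplied by toolchain_specifics.h.
 * This empty fallback only keeps the sketch compilable on other hosts. */
#ifndef CACHE_ALIGN
#define CACHE_ALIGN
#endif

#define mRX_BUFFER_SIZE_IN_BYTES  64u   /* hypothetical size */

CACHE_ALIGN
uint8_t  gdma_ui8aRxBuffer[mRX_BUFFER_SIZE_IN_BYTES];

/* Copy received bytes out of the DMA buffer into ordinary RAM.  The count
 * is clamped to the buffer size; the number of bytes copied is returned. */
uint32_t RxBufferCopyOut(uint8_t * aui8pDest, uint32_t aui32Count)
{
    uint32_t  i;

    if (aui32Count > mRX_BUFFER_SIZE_IN_BYTES) {
        aui32Count = mRX_BUFFER_SIZE_IN_BYTES;
    }
    for (i = 0;  i < aui32Count;  i++) {
        aui8pDest[i] = gdma_ui8aRxBuffer[i];
    }
    return aui32Count;
}
```

The ‘dma’ prefix makes it visible at every point of use that this buffer is the special kind of RAM, which is the reminder the prefix was introduced to provide.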

See also Prefix Exceptions below for valuable and desirable exceptions to this, although this author does not expect these to be useful for COHERENT RAM, as this is always stored in global RAM space, not local.

6.4. Prefix Exceptions

Sometimes very short local or argument variable names are used IF AND ONLY IF it makes the code easier to read and understand. Reason: Purpose is senior to Policy.

There are a few “customs” that go with this — some come from the mathematics world, others from the computing world. For example:

i,j,k,ii,jj - temporary integers
i8,i16,i32..- temporary signed int with specified number of bits.
u           - temporary unsigned int
u8,u16,u32..- temporary unsigned int with specified number of bits.
              Note carefully that these can also be handily used as bit vector variables.
bv          - Where actual bit-vector types are used (arrays of unsigned integral types like u32).
c           - temporary char (possibly unsigned char if the distinction is not important)
cp          - temporary char *
b           - temporary byte (i.e. unsigned char)
bp          - temporary byte pointer (i.e. unsigned char *)
a,b,c,d     - temporary types understood from the context (e.g. point2_t)
d           - delta
n,s,e,w     - another example understood from the context (type long double in
              struct aabb2_t meaning north, south, east and west respectively).
f,d,ld      - temporary float, double and long double types. (Note
              avoidance of use of double type described above.)
p,v         - temporary pointer
x,y         - temporary Cartesian coordinates (e.g. screen location)
w,h         - temporary width and height
x,y,z,w     - often understood as Cartesian coordinates in 'vector...' types
              ('w' is used in vector4_t and is pertinent to 3D spatial
              manipulations [transformations] via matrix multiplication).
v1,v2,vOut  - local or argument vectors
other       - Sometimes a math context which by convention uses certain letters or Greek
              letters suggests certain single letters or short abbreviations be used.
              This is condoned ONLY when it makes it easier to read and understand.
e           - temporary pointer to an event, often used as argument to
              event call-back functions.
f           - sometimes used as a FILE * where <stdio.h> is being
              used for file I/O.

Mathematics also uses several short standard abbreviations for types of variables, and this is also desirable to use IF AND ONLY IF it makes the code easier to read and understand for all who will be needing to do so.

6.4.1. Illustrative Example

Compare the readability of this

for (liIndexToNextItem = 1;  liIndexToNextItem < 5;  liIndexToNextItem++) {
    lui8CheckSum = lui8CheckSum + lpBuf->iui8aDataBytes[liIndexToNextItem];
}

to this:

for (i = 1;  i < 5;  i++) {
    lui8ChkSum = lui8ChkSum + lpBuf->iui8aDataBytes[i];
}

or even better:

for (i = 1;  i < 5;  i++) {
    lui8ChkSum += lpBuf->iui8aDataBytes[i];
}

Which one is more readable? Which one takes less time to fully understand?
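For completeness, here is the short-name version placed in a compilable function. The type buf_t, its five-byte layout and the function name BufChecksum are assumptions made for this sketch; only the loop itself comes from the example above:

```c
#include <stdint.h>

/* Hypothetical buffer type; the 5-byte size is assumed from the loop bounds. */
typedef struct {
    uint8_t  iui8aDataBytes[5];
} buf_t;

/* Sum bytes 1..4 of the buffer, wrapping modulo 256. */
uint8_t BufChecksum(const buf_t * apBuf)
{
    uint8_t  lui8ChkSum = 0;
    int      i;                 /* short name: a temporary loop index */

    for (i = 1;  i < 5;  i++) {
        lui8ChkSum += apBuf->iui8aDataBytes[i];
    }
    return lui8ChkSum;
}
```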

Exceptions: all of the above naming conventions may have exceptions and this is condoned when the exception without question increases the ease of reading and understanding the source code. Example: type aabb2_t has members n, s, e and w (not prefixed with the ‘i’ [instance member] prefix). Condoned justification: from the math context regarding AABB objects (axis-aligned bounding box), it is clear that these are type long double and mean north, south, east and west respectively, and

lMyAabb.n = 1.0;
lMyAabb.e = 0.0;

is clearly easier to read and understand than

lMyAabb.ildNorth = 1.0;
lMyAabb.ildEast  = 0.0;

or worse

lMyAabb.in = 1.0;
lMyAabb.ie = 0.0;
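A possible declaration of such a type is sketched below. The member names and their long double type come from the description above; the function Aabb2Area is a hypothetical addition that shows how the short member names read in an expression:

```c
/* Axis-aligned bounding box with unprefixed members, as condoned above. */
typedef struct {
    long double  n;     /* north */
    long double  s;     /* south */
    long double  e;     /* east  */
    long double  w;     /* west  */
} aabb2_t;

/* Hypothetical helper: area of the box, height times width. */
long double Aabb2Area(const aabb2_t * apBox)
{
    return (apBox->n - apBox->s) * (apBox->e - apBox->w);
}
```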