HOME BLOG ARCHIVE TAGS

The Best Optimizer Is between Your Ears”

April 07, 2016

When I started to study software optimization, Michael Abrash was the go-to reference, specially for low-level graphics programming. His “zen” series is priceless (one Abrash classic can be read online, highly recommended [1]). And his old adage is still very much valid: “the best optimizer is between your ears”.

Last month, I helped to optimize a task that took more than 10 minutes. After the usual heavy-lifting, it went down to ~7.5 minutes. Finally, the profiler narrowed one of the remaining resource hogs: atoi(), consuming 2+ minutes.

Typical libc atoi() implementations delegate to one of the strtol() variants (or to a common core, like in Microsoft CRT case).

This raised the issue of generality. These primitives were designed to handle a variety of input formats, but libc was called only to do controlled year conversions. Leveraging this fact, following “dumb” procedure replaced atoi():

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
//
// year == "[D][D]DD", len(year) == 2|3|4;
//
static inline int year_2_int(const char *year)
{
    char year_buf[] = { "0000" };

    const size_t YEAR_LEN = strlen(year);
    const size_t OFFSET   = strlen(year_buf) - YEAR_LEN;

    for ( size_t i = 0; i < YEAR_LEN; i++ ) {
        year_buf[OFFSET + i] = year[i];
    }

    year_buf[0] -= '0';
    year_buf[1] -= '0';
    year_buf[2] -= '0';
    year_buf[3] -= '0';

    return ( (year_buf[0] * 1000) +
             (year_buf[1] * 100) +
             (year_buf[2] * 10) +
             (year_buf[3] * 1) );
}

Without extra checks and unnecessary function calls, some “magic” kicked in. Optimizing compilers just inlined and unrolled resulting code, making the transformation cost almost vanish.

Coding optimization is a complex process. Know your data better than any software, measuring and understanding your execution context.


Notes:
[1] - another great reference is Inner Loops, by Rick Booth; it’s filled with good theory, practical examples, and tricky assembly language;