
Code Pinpointing - Zen Approach #4

June 14, 2013

This is part of a series of posts dedicated to code-pinpointing. If you missed Zen Approach #1, Zen Approach #2, or Zen Approach #3, please check them out. We may build upon them.

Today we’ll talk about special breakpoints (bpxs). As GNU/Linux techniques are always welcomed by readers, that’s the environment of choice for this post. Known debugging facilities will be discussed, and an (almost) undocumented one will be presented. As usual, let’s catch up with the basics first.

Bpxs are widely used by developers to support the “break, step, inspect” debugging cycle. They are usually underrated, and programmers fall into the trap of thinking there isn’t much more to say about them.

By now, loyal blog readers already know that I rarely use the aforementioned (boring) cycle to improve/fix the software I have to work with. For me, debuggers are reserved for the more noble tasks of reverse-engineering, aiding in post-mortem debugging, and/or digging into the internals of programs/operating-systems. With this perspective in mind, regular bpxs are normally less useful to pinpoint relevant code. The special ones are more productive and direct.

Common bpxs fire every time a program reaches a given place. But (good) debuggers offer plenty of more sophisticated options. Among them, two are of great importance: the so-called “data bpx”, and the “conditional bpx”.

Data bpxs fire when a memory address is read from and/or written to. They are very useful to let the hardware track data-structure/memory access or change for us (even if the CPU doesn’t provide specialized support for this, the functionality can be emulated with code tracing; it’s slower, but works, nonetheless).
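To make the “emulated with code tracing” remark concrete, here is a minimal sketch of the idea: run one step at a time and compare the watched value before and after. The names (step_fn, soft_watch, and the two sample steps) are hypothetical, and each “step” is only a stand-in for a single instruction or source line; this is not how any particular debugger is implemented.

```c
#include <stddef.h>

/* One "step" of the traced program: a hypothetical stand-in for a single
   instruction or source line executed under the debugger. */
typedef void (*step_fn)(int *data);

/* Runs the steps one at a time and compares the watched value before and
   after each one. Returns the index of the first step that changed
   *watched, or -1 if none did. */
int soft_watch(int *watched, const step_fn *steps, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        int before = *watched;
        steps[i](watched);
        if (*watched != before)
            return (int)i;
    }
    return -1;
}

/* Two sample steps: one only reads the data, one overwrites it. */
void step_read_only(int *data) { (void)*data; }
void step_write(int *data)     { *data = 42; }
```

Note that comparing values this way only catches changes. A read leaves no trace, and neither does a write that stores the same value again.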

Conditional bpxs fire when a predicate expression evaluates to true. In real-world scenarios, programs may crash (or start to misbehave) only after many iterations, which is why debuggers can attach conditions to such “pause” events. Each debugger has its own syntax for creating them, and our focus will be gdb.

These two classes of bpxs are really cool. They help us minimize the time spent stepping. In some cases, they’re the only way to break/pinpoint problematic code without resorting to special tools (e.g., with memory corruption issues). A small test program helps to illustrate their usage (compiled with “gcc -g -O0 bpx_tester.c -o bpx_tester.exe”):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>   /* time(), used to seed rand() */

int main(int argc, char *argv[])
{
    int x = -1;
    int i = -1;
    srand( (unsigned int)time(NULL) );

    for ( i = 0; i < 100000; i++ ) {
        x = rand();   /* line 12: target of the bpxs discussed below */
    }

    return 0;
}

The command “break bpx_tester.c:12 if ( (x % 2) == 0 )” does the job of creating a conditional bpx. It tells gdb to stop at line 12, right before ‘x’ is assigned, whenever the condition holds. Since gdb evaluates the condition before the line runs, the value tested is the one ‘x’ received in the previous loop round. For bpx removal, we use “del 1” (command “info b” lists bpxs and watchpoints).
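The timing of that condition check can be sketched in plain C. The helper names below (is_even, count_stops) are hypothetical, and the seed is passed in as a parameter instead of coming from time(), so the run is repeatable; it is an illustration of the predicate, not a replacement for the gdb command.

```c
#include <stdlib.h>

/* Same predicate as the gdb condition "(x % 2) == 0". */
int is_even(int x)
{
    return (x % 2) == 0;
}

/* Hypothetical helper: replays the tester's loop and counts how many times
   a bpx conditioned on is_even(x) would stop at the assignment. The check
   comes before the assignment, so it sees the previous round's value. */
int count_stops(unsigned int seed, int rounds)
{
    int x = -1;
    int stops = 0;

    srand(seed);
    for (int i = 0; i < rounds; i++) {
        if (is_even(x))
            stops++;   /* an instrumented build could raise(SIGTRAP) here */
        x = rand();
    }
    return stops;
}
```

In the first round ‘x’ still holds -1, and in C (-1 % 2) is -1, so the bpx cannot fire before the second round.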

To exercise a data bpx (a watchpoint, in gdb parlance), once again, we resort to the dumb tester above. As rand() is called to generate a pseudo-random integer, ‘x’ receives a non-deterministic value after each loop round. A regular data bpx would be overkill here; the effect of tracking ‘x’ accesses/changes would be the same as setting a regular bpx at line 12. How can we deal with situations like this? The answer lies in gdb’s rich command set. It allows conditions/predicates to be attached to watchpoints!

There are three types of data bpx commands supported by gdb: watch, rwatch, and awatch. In practice, their usage/syntax is identical. Only their meaning is different. When we want to track data access (reads, e.g.), rwatch is the way to go. For data change (memory writes, e.g.), watch is our guy. For either reads or writes to memory/data, awatch does the trick.

The command “awatch x if ( x > 10000 && x < 100000)” creates the watchpoint (the condition choice was arbitrary, intended just to exemplify the technique).

The coolness factor of these powerful bpxs doesn’t stop here. Consulting the online documentation, it’s possible to see that threads, variable scopes, emulation, hardware registers, strings, etc., are taken into account by gdb, whenever a bpx is created.

UNDOCUMENTED HAZARDS

It’s not uncommon for C/C++ programmers to deal with memory leaks. I believe code pinpointing techniques should also be useful to help track them down. Wouldn’t it be awesome if we could employ some special bpxs for this?

Knowing that glibc has some powerful memory manager hooks (see mtrace and mcheck), I went for its source code, looking for something to break into. Using glibc 2.17 mtrace.c, I found pretty interesting stuff, exactly the way I was expecting - a known memory location to put some special bpxs in, and be called whenever something important happened.

After some spelunking, I noticed that this infrastructure was almost undocumented (some hints from the code itself can be found on missing linux manual pages), and realized that the implementation is not supposed to be used with data bpxs.

A special glibc function, called tr_break(), should be used as a bpx target (a pinpointing location!), to break into the debugger every time memory management functions are called. It’s a mechanism similar to the way glibc and gdb “talk” to each other when doing dynamic linker event notification/interception.

These hooks are easy to trigger: in a debug session, before calling mtrace(), a special glibc variable (called mallwatch) should be set to a non-NULL value (see gdb’s doc section Assignment to Variables). This is one of the conditions that is supposed to activate the malloc hooks (and their tracing logic). After that, a regular bpx can be set on tr_break(). To signal our interest in a specific address (the one that would make the malloc hooks call tr_break), once again, we set mallwatch, this time to that address.
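The pattern described above can be sketched in a few lines of C. To be clear, this is not glibc’s code: watch_addr, watch_break, watch_check, traced_malloc and traced_free are hypothetical stand-ins for mallwatch, tr_break() and the malloc hooks, written only to show the shape of the mechanism. Unlike the “nop” tr_break(), the bpx target here counts its calls, so the sketch can be checked without a debugger.

```c
#include <stdlib.h>

/* Hypothetical stand-in for glibc's mallwatch: the address of interest. */
void *watch_addr = NULL;

/* Number of times the bpx target below was reached. */
int break_hits = 0;

/* Hypothetical stand-in for tr_break(): a known place to put a bpx on. */
void watch_break(void)
{
    break_hits++;
}

/* Called for every block allocated or released; reaches the bpx target
   only when the block is the watched one. */
void watch_check(const void *ptr)
{
    if (watch_addr != NULL && ptr == watch_addr)
        watch_break();
}

/* Wrappers playing the role of the malloc hooks. */
void *traced_malloc(size_t size)
{
    void *ptr = malloc(size);
    watch_check(ptr);
    return ptr;
}

void traced_free(void *ptr)
{
    watch_check(ptr);   /* check before the block is released */
    free(ptr);
}
```

With a program built this way, the debug session would boil down to “break watch_break” plus “set var watch_addr = <address>”, the same two moves described for tr_break() and mallwatch.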

After messing around with some tests, unfortunately, the bpxs on tr_break() never fired, no matter what I tried. A little frustrated, and with only an old CentOS 5.6 box at hand, I leveraged the debugging sessions to investigate the glibc release code. Apparently, the malloc hooks do not have the tracing code compiled in! It may be a gcc issue (optimizing away calls to “nop” functions like tr_break), or a problem with the way I was trying to approach this specific tracing functionality.

I asked the glibc malloc subsystem maintainer for help, and he replied promptly. He told me to file a bug report (which has more disasm details).

I’m still waiting to see the outcome, so I can conclude my research. I hope this is a misuse/misunderstanding on my part, instead of a real glibc bug (that would mean less work for the busy/competent community behind this major free-software component).