Instant backtraces in C/C++

June 15, 2012

In higher-level languages, when something goes wrong you get a nice backtrace pointing at the problem. In C or C++ you typically just get "Segmentation fault". How can we improve this?

To start with, you can catch SIGSEGV and try to do something useful. Call a function like the following as early as possible during startup:

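#include <signal.h>
#include <stdio.h>

void DumpBacktrace(int signum);  // Defined below.

// Installs DumpBacktrace as the SIGSEGV handler.
void BacktraceOnSegv() {
  struct sigaction action = {};
  action.sa_handler = DumpBacktrace;
  if (sigaction(SIGSEGV, &action, NULL) < 0) {
    perror("sigaction(SEGV)");
  }
}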

Then the problem is only as hard as dumping a backtrace. It turns out there isn't a straightforward way of doing this. glibc has a backtrace() family of functions but they don't tell you much: for example, symbol names come only from the dynamic symbol table (so for your own binary you generally need to link with -rdynamic), and there are no line numbers. Google's C++ utilities include code for mapping addresses to symbols that includes an ELF reader(!).
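
For reference, here is roughly what the glibc route looks like. This is a minimal sketch, assuming glibc's <execinfo.h> is available; DumpRawBacktrace is a made-up name for illustration. It writes one line per frame to a file descriptor, with whatever symbol names glibc can resolve and no line numbers.

```cpp
#include <execinfo.h>

// Hypothetical helper (not part of glibc): writes the current call stack to
// fd, one frame per line, and returns the number of frames captured.
int DumpRawBacktrace(int fd) {
  void* frames[64];
  // backtrace() fills the array with return addresses, innermost first.
  int count = backtrace(frames, 64);
  // backtrace_symbols_fd() formats each address and writes it straight to
  // fd; unlike backtrace_symbols() it does not malloc the result.
  backtrace_symbols_fd(frames, count, fd);
  return count;
}
```

Calling DumpRawBacktrace(2) prints lines roughly of the form binary(symbol+offset) [address] to stderr; frames without a resolvable symbol show only the address.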

Really, doing the full job, like getting symbols from any loaded shared objects, requires a lot of trawling through files, DWARF, and /usr/lib/debug, and gdb does it best. So why not just use gdb? Define DumpBacktrace as follows:

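#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

void DumpBacktrace(int) {
  pid_t dying_pid = getpid();
  pid_t child_pid = fork();
  if (child_pid < 0) {
    perror("fork() while collecting backtrace");
  } else if (child_pid == 0) {
    // Child: attach gdb to the dying parent, print its backtrace, and
    // strip the frames up to and including the signal handler.
    char buf[1024];
    snprintf(buf, sizeof(buf), "gdb -p %d -batch -ex bt 2>/dev/null | "
             "sed '0,/<signal handler/d'", dying_pid);
    const char* argv[] = {"sh", "-c", buf, NULL};
    execv("/bin/sh", (char**)argv);  // Inherits the parent's environment.
    _exit(1);
  } else {
    // Parent: wait for gdb to finish before exiting.
    waitpid(child_pid, NULL, 0);
  }
  _exit(1);
}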

And with that, here's an example of a program that passes NULL to getc(); note how the dumped backtrace includes line numbers as well as code inside glibc:

$ ./parse_run
#3  _IO_getc (fp=0x0) at getc.c:40
#4  0x0000000000401bbb in Tokenizer::Read (this=0x7fff90656dd8, token=0x7fff90656d90) at build/scan.cc:40
#5  0x0000000000402913 in Parser::Parse (this=0x7fff90656dd0, tokenizer=0x7fff90656dd8) at src/parse.cc:51
#6  0x0000000000404db5 in main (argc=1, argv=0x7fff90656ed8) at src/parse_run.cc:39
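
One caveat if you try this on a recent Linux: distributions that enable the Yama security module (Ubuntu among them) by default only let a process be traced by its ancestors, so a gdb that is a child of the dying process may be refused, and the 2>/dev/null above hides the error. A possible workaround, sketched here on the assumption of Linux with a <sys/prctl.h> that exposes PR_SET_PTRACER, is to opt in to being traced before forking. AllowDebuggerAttach is a hypothetical name.

```cpp
#include <errno.h>
#include <sys/prctl.h>

// Hypothetical helper: asks Yama to let any process attach to this one.
// Returns 0 on success, or the errno from prctl(). EINVAL typically means
// the kernel has no Yama support, in which case this restriction does not
// apply anyway.
int AllowDebuggerAttach() {
  if (prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY, 0, 0, 0) < 0) {
    return errno;
  }
  return 0;
}
```

Calling this at the top of DumpBacktrace, before fork(), is one option; a narrower alternative is to have the parent call prctl(PR_SET_PTRACER, child_pid, 0, 0, 0) right after the fork.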

Is doing all this work in a SEGV handler legal? From a glance online it does appear to be OK to fork from there, and most of the work is done in the separate exec'd process. But if you're in a SEGV handler, it's possible you don't have any stack left (after a stack overflow, say), which means the above may not work at all in practice. It at least seems to work for these toy programs.
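
One way to address the missing-stack case is to have the handler run on a separate, pre-allocated stack via sigaltstack() and the SA_ONSTACK flag. A minimal sketch, assuming a 64 KiB stack is enough for the handler; the size and the name InstallSegvHandlerOnAltStack are assumptions for illustration:

```cpp
#include <signal.h>
#include <stdio.h>

// Hypothetical helper: installs `handler` for SIGSEGV so that it runs on a
// dedicated stack instead of the (possibly exhausted) normal one.
// Returns true on success.
bool InstallSegvHandlerOnAltStack(void (*handler)(int)) {
  // Assumed size; a fixed constant is used because SIGSTKSZ is not a
  // compile-time constant on newer glibc.
  static char alt_stack[64 * 1024];

  stack_t ss = {};
  ss.ss_sp = alt_stack;
  ss.ss_size = sizeof(alt_stack);
  ss.ss_flags = 0;
  if (sigaltstack(&ss, NULL) < 0) {
    perror("sigaltstack");
    return false;
  }

  struct sigaction action = {};
  action.sa_handler = handler;
  action.sa_flags = SA_ONSTACK;  // Run the handler on the alternate stack.
  sigemptyset(&action.sa_mask);
  if (sigaction(SIGSEGV, &action, NULL) < 0) {
    perror("sigaction(SEGV)");
    return false;
  }
  return true;
}
```

With this, the call to BacktraceOnSegv() could be replaced by InstallSegvHandlerOnAltStack(DumpBacktrace). The alternate stack is per-thread, so this only covers the thread that calls it.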