This article was originally published on Medium.

GDB is a very popular debugger that supports many programming languages. Even if you are not using GDB directly, chances are that your IDE is using it under the hood. For example, VS Code extensions such as the Microsoft C++ extension use GDB (or LLDB, depending on configuration) under the hood. The next questions then become: what is a pretty printer, and why might you need to write your own?

To answer the first question: when you set a breakpoint and watch a variable, GDB first looks for the register that holds that value (at the particular position where you set the breakpoint). Then, based on the type information for that value (not to be confused with the type of the register), it checks whether a “pretty printer” has been registered to print values of that type. If it fails to find one, it prints the value to the best of its ability. (If you are interacting with GDB through a tool such as VS Code, the string returned by the pretty printer is what you see next to the variable name.)

To answer the second question, “why might you need to write your own?”, there are several reasons.

Since a pretty printer determines what you see as the value of a variable while debugging, you can create your own pretty printer whenever you want to change what you see.
Despite the long list of languages GDB supports natively, you may be using (or, like me, actually be involved in creating) a language that is not supported. In that case, writing a pretty printer that shows variables correctly is what makes debugging practical. This assumes your language can be compiled into a binary that can be debugged with GDB, which is often the case since many languages are compiled using LLVM or GCC. (One good example of a language that does this is Rust, with rust-gdb.)

Setting up a pretty printer

GDB expects us to write our pretty printer in Python, so let’s start with a skeletal pretty printer:

import gdb
import gdb.printing  # submodule that provides RegexpCollectionPrettyPrinter

class TaggedPrinter:
    def __init__(self, val):
        self.val = val  # the gdb.Value being printed

    def to_string(self):
        return str(self.val)

def build_pretty_printer():
    # "bal_pp" is the name of the printer collection
    pp = gdb.printing.RegexpCollectionPrettyPrinter("bal_pp")
    pp.add_printer('TaggedPtr', '^TaggedPtr$', TaggedPrinter)
    return pp

gdb.printing.register_pretty_printer(
    gdb.current_objfile(),
    build_pretty_printer())

First, we import the gdb module (it exists inside GDB, so you don’t need to install anything to use it), which gives us a few useful helper functions to hook into GDB. Next, we define our pretty printer as the TaggedPrinter class. At a minimum, a pretty printer needs a constructor that accepts a gdb.Value (the value you are printing) and a to_string method that returns a string (which GDB shows as the value of the variable). In the build_pretty_printer function we create a printer collection named bal_pp and add TaggedPrinter as the printer to use for any variable whose type name matches the regex ‘^TaggedPtr$’. Finally, we register the collection with GDB. To use this pretty printer, run the source command with the path to your script inside `gdb`. However, you may find it more convenient to add that command to your `gdb` config (`~/.gdbinit` for global config).

Some languages add prefixes to differentiate their types, so the “actual” type name as far as GDB is concerned may not be the same as the one you used in your source. The easiest way to figure out the correct type name is to look at the debug information generated by your compiler, if it has a human-readable intermediate representation such as LLVM IR.
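To get a feel for which type names a pattern will accept, you can try the regex in plain Python before loading it into GDB. The sketch below assumes the collection applies the pattern as a regular-expression search against the type name. The prefixed name "bal.TaggedPtr" and the PREFIXED pattern are made up for illustration and are not something a particular compiler is known to emit.

```python
import re

# Same pattern as the one passed to add_printer above.
EXACT = re.compile(r"^TaggedPtr$")

# Hypothetical variant that also accepts a compiler-added prefix such as "bal."
PREFIXED = re.compile(r"^(\w+\.)?TaggedPtr$")


def matches(pattern, type_name):
    """Return True if the pattern accepts the given type name."""
    return pattern.search(type_name) is not None


for name in ["TaggedPtr", "TaggedPtrList", "bal.TaggedPtr"]:
    print(name, matches(EXACT, name), matches(PREFIXED, name))
```

The anchors ^ and $ are what keep the exact pattern from also matching longer names such as TaggedPtrList.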

Performing pointer operations in the pretty printer

In most cases where you need to write your own pretty printer, the value held in the register is not the “actual” value you are interested in, but a pointer to it. And in many cases, the actual value may be some sort of structure as well.

If you want to manipulate your pointer directly (maybe it is a tagged pointer and you need to get rid of the tag before dereferencing), the easiest way is to create an int out of your pointer and then perform your manipulations on it.

ptr_val = int(self.val)
new_ptr_val = ptr_val & POINTER_MASK # your pointer manipulations here

To dereference new_ptr_val, you first need to create a new gdb.Value from it and cast it to a pointer to the destination type.

new_ptr = gdb.Value(new_ptr_val).cast(gdb.lookup_type("type_name").pointer())

Like the regex we used when adding our pretty printer, gdb.lookup_type checks the executable’s debug information to figure out the type. So the type name you need to use here may differ from the name you used in your source, depending on your compiler.

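Another interesting thing you can do is apply a pointer offset here. Keep in mind that the offset is added to the integer address, so it is measured in bytes: adding 4 moves the pointer forward by 4 bytes, which matches ptr[4] in C only when the elements are one byte wide. For ptr[4] in general you would add 4 times the size of the pointed-to type. For example, to add an offset of 4 bytes you can change the above code as follows.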

new_ptr = gdb.Value(new_ptr_val+4).cast(gdb.lookup_type("type_name").pointer())
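Because the offset is applied to a plain integer address, it counts bytes rather than elements. The following sketch shows the arithmetic for an element-wise offset. The helper name element_address and the example base address are assumptions made for illustration. Inside GDB, the element size could be taken from the sizeof attribute of the type returned by gdb.lookup_type.

```python
def element_address(base, index, elem_size):
    """Address of base[index] when each element is elem_size bytes wide."""
    return base + index * elem_size


base = 0x1000  # example address, chosen arbitrarily

# Adding 4 to the integer moves 4 bytes forward.
print(hex(base + 4))

# Element 4 of an array of 8-byte values is 32 bytes forward.
print(hex(element_address(base, 4, 8)))
```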

Finally, to obtain the actual value, dereference your pointer as follows.

new_ptr = gdb.Value(new_ptr_val).cast(gdb.lookup_type("double").pointer())
deref_val = new_ptr.dereference()
double_val = float(deref_val)

Keep in mind that what we get after dereferencing is still a gdb.Value, so we may still need to convert it into a corresponding Python value. Fortunately, for most basic types such as float, int and bool we can do that with the built-in functions, as shown above.
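Putting the steps of this section together, a printer for a tagged pointer to a double might look like the sketch below. The mask (a tag in the low 3 bits), the helper strip_tag and the class name TaggedDoublePrinter are all assumptions made for illustration, so adjust them to your own pointer layout. The import of gdb is guarded only so that the pure-Python helper can be tried outside GDB; the printer itself works only inside a GDB session.

```python
try:
    import gdb  # only importable when the script runs inside GDB
except ImportError:
    gdb = None  # lets the pure-Python helper below be tried outside GDB

# Hypothetical layout: the low 3 bits of the pointer carry the tag.
POINTER_MASK = ~0x7


def strip_tag(raw_address):
    """Clear the (assumed) tag bits from an integer address."""
    return raw_address & POINTER_MASK


class TaggedDoublePrinter:
    """Hypothetical printer for a tagged pointer that points at a double."""

    def __init__(self, val):
        self.val = val

    def to_string(self):
        address = strip_tag(int(self.val))
        double_ptr = gdb.Value(address).cast(gdb.lookup_type("double").pointer())
        try:
            return str(float(double_ptr.dereference()))
        except gdb.MemoryError:
            # GDB raises this when the address cannot be read.
            return "<unreadable address>"


print(hex(strip_tag(0x7FFE1235)))
```

A class like this can be passed to add_printer in place of TaggedPrinter.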

Dealing with structure types

Sometimes your variable does not hold a simple value that can be represented by a float or int, but some sort of structure. If your compiler emits debug information about your structure types, you can do the following.

new_ptr = gdb.Value(new_ptr_val).cast(gdb.lookup_type("myStruct").pointer())
deref_val = new_ptr.dereference()
b = int(deref_val["b"])

This assumes the myStruct type is defined as follows:

typedef struct {
    int64_t a;
    int64_t b;
} myStruct;

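If the debug information does not describe your struct, in languages like C you can also use the fact that structs are contiguous blocks of memory and calculate the field's offset directly. For myStruct above, b comes right after the 8-byte field a, so it sits 8 bytes past the start of the struct.

# b is a 64-bit integer located 8 bytes past the start of the struct
new_ptr = gdb.Value(new_ptr_val + 8).cast(gdb.lookup_type("long long").pointer())
deref_val = new_ptr.dereference()
b = int(deref_val)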
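If you are unsure of a field's byte offset, one way to double-check it outside GDB is Python's ctypes module, which lays out structures using the platform's C rules. The sketch below mirrors the myStruct definition shown earlier. The Python class name MyStruct is just an illustration, and the layout it reports only matches your binary if your compiler follows the same C layout rules.

```python
import ctypes


class MyStruct(ctypes.Structure):
    # Mirrors the C definition: two consecutive int64_t fields.
    _fields_ = [
        ("a", ctypes.c_int64),
        ("b", ctypes.c_int64),
    ]


# Byte offset of each field from the start of the struct, and the total size.
print(MyStruct.a.offset, MyStruct.b.offset, ctypes.sizeof(MyStruct))
```

The offset reported for b is the number of bytes to add to the struct's address before casting and dereferencing.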

[Optional] Integrating the pretty printer with VS Code

In most cases, you may prefer to debug your programs from an IDE or text editor instead of using GDB directly. How to do this is specific to the IDE or text editor in question, so I’ll limit myself to getting our pretty printer working with VS Code. The first thing we need is an extension that can communicate with GDB. For this, we will use the Microsoft C++ extension, which implements a debug adapter for GDB. All we have to do is modify the launch configuration (launch.json) as follows.

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "debug",
            "type": "cppdbg",
            "request": "launch",
            "program": "${workspaceFolder}/${fileBasenameNoExtension}.exe", // path to exe file
            "cwd": "${workspaceFolder}",
            "stopAtEntry": false,
            "setupCommands": [
                {
                    "text": "-enable-pretty-printing",
                    "description": "enable pretty printing",
                    "ignoreFailures": false
                }
            ],
            "linux": {
                "MIMode": "gdb",
                "miDebuggerPath": "/usr/bin/gdb"
            }
        }
    ]
}

Here the important part is the -enable-pretty-printing setup command. With it in place, if your GDB config sources your pretty printer, VS Code will use that printer to display variables.