Creating a GDB pretty printer from scratch
This article was originally published on Medium.
GDB is a very popular debugging tool that supports many programming languages. Even if you are not using GDB directly, chances are that your IDE is using it under the hood. For example, VS Code extensions such as Microsoft C++ use GDB (or lldb, depending on configuration) under the hood. The next questions are: what is a pretty printer, and why might you need to write your own? To answer the first question: when you set a breakpoint and watch a variable, GDB first looks up the location (for example, a register) that holds that value at the particular position where you set the breakpoint. Then, based on the type information for that value (not to be confused with the type of the register), it checks whether a “pretty printer” has been registered to print values of that type. If it fails to find such a pretty printer, the value is printed to the best of GDB's ability. (If you are interacting with GDB via a tool such as VS Code, the value returned by the pretty printer is what you see next to the variable name.)
To answer the second question, “why might you need to write your own?”, there are several reasons.
- Since a pretty printer determines what you see as the value of a variable when you are debugging, you can create your own pretty printer if you want to change what you see.
- Despite the long list of languages GDB supports natively, you may be using (or, like me, actually involved in creating) a language that is not supported. In that case, writing a pretty printer to show the variables correctly is useful to make debugging possible. This assumes your language can be compiled into a binary that can be debugged with GDB, which should be true for most languages since they are compiled using LLVM or GCC. (One good example of a language that does this is Rust with rust-gdb.)
Setting up a pretty printer
gdb expects us to write our pretty printer in Python, so let’s start by creating a skeletal pretty printer.
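import gdb
import gdb.printing  # the printing helpers live in a submodule that must be imported explicitly

class TaggedPrinter:
    def __init__(self, val):
        # val is the gdb.Value being printed
        self.val = val

    def to_string(self):
        # raw=True bypasses pretty printers; a plain str(self.val) would
        # look up this same printer again and recurse
        return self.val.format_string(raw=True)

def build_pretty_printer():
    pp = gdb.printing.RegexpCollectionPrettyPrinter("bal_pp")
    pp.add_printer('TaggedPtr', '^TaggedPtr$', TaggedPrinter)
    return pp

gdb.printing.register_pretty_printer(
    gdb.current_objfile(),
    build_pretty_printer())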
First, we import the gdb module (it exists inside gdb, so you don’t need to install anything to use it), which gives us a few useful helper functions to hook into gdb. Next, we define our pretty printer as the TaggedPrinter class. At the very least, a pretty printer needs a constructor that accepts a gdb.Value (the value you are printing) and a to_string function that returns a string (which will be shown as the value of the variable in gdb). In the build_pretty_printer function we create a collection of pretty printers named bal_pp and add TaggedPrinter as the printer to use for any variable whose type name matches the regex ‘^TaggedPtr$’. Finally, we register our pretty printer with gdb. To use this pretty printer, you can load the script with gdb’s source command.
Some languages add prefixes to their type names to differentiate them, so the “actual” type name as far as gdb is concerned may not be the same as the one you used. The easiest way to figure out the correct type name is to look at the debug information generated by your compiler, if it has a human-readable intermediate representation such as LLVM IR.
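To see how a prefixed type name interacts with the pattern given to add_printer, the matching can be tried in plain Python. This is only a sketch: the prefixed names below are hypothetical examples of what a compiler might emit, and the collection printer is assumed to apply the pattern with a regex search against the type name.

```python
import re

# The same pattern that was passed to add_printer above.
strict = re.compile(r"^TaggedPtr$")
# A looser, hypothetical pattern that also accepts a "prefix." in front.
prefixed = re.compile(r"(^|\.)TaggedPtr$")

# Hypothetical type names as they might appear in the debug information.
candidates = ["TaggedPtr", "bal.TaggedPtr", "struct.TaggedPtr", "TaggedPtr*"]

for name in candidates:
    print(name, bool(strict.search(name)), bool(prefixed.search(name)))
```

Only the first name satisfies the strict pattern, while the looser one also accepts the two prefixed names. Inside gdb, the whatis command on a variable shows the type name gdb actually sees, which is a quick way to pick the right pattern.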
Performing pointer operations in the pretty printer
In most cases where you need to write your own pretty printer, the value held in the register is not the “actual” value you are interested in but a pointer to it. In many cases, the actual value may also be some sort of structure.
If you want to manipulate your pointer directly (maybe it is a tagged pointer and you need to strip the tag before dereferencing), the easiest way is to create an int out of your pointer and then perform your manipulations on it.
ptr_val = int(self.val)
new_ptr_val = ptr_val & POINTER_MASK # your pointer manipulations here
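As an illustration of what such a manipulation might look like, here is a minimal sketch that assumes a 64-bit pointer whose low 3 bits carry the tag. The layout, the constant values and the helper name split_tagged_pointer are all assumptions for this example; your language's tagging scheme may differ.

```python
# Assumption: 64-bit pointers with a type tag stored in the low 3 bits.
TAG_BITS = 3
TAG_MASK = (1 << TAG_BITS) - 1
WORD_MASK = 0xFFFFFFFFFFFFFFFF
POINTER_MASK = ~TAG_MASK & WORD_MASK

def split_tagged_pointer(raw):
    """Return (address, tag) for a raw tagged-pointer integer."""
    raw &= WORD_MASK  # normalise negative Python ints to unsigned 64-bit
    return raw & POINTER_MASK, raw & TAG_MASK

address, tag = split_tagged_pointer(0x7FFFF7A01005)
print(hex(address), tag)  # prints: 0x7ffff7a01000 5
```

Keeping the tag around as well as the address is often useful, since the tag usually decides which destination type the pointer should be cast to.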
Now, in order to dereference this new_ptr_val, you first need to create a new gdb.Value from it and cast it to a pointer to the destination type.
new_ptr = gdb.Value(new_ptr_val).cast(gdb.lookup_type("type_name").pointer())
gdb.lookup_type, like the type-name matching used when we added our pretty printer, checks the executable’s debug information to figure out the type. So the type name you need to use here could be different from the name you used in your source, depending on your compiler.
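Another interesting thing you can do is to provide a pointer offset here. Because the arithmetic is done on the raw integer address, the offset is in bytes (unlike ptr[4] in C, which advances by four elements). For example, to add an offset of 4 bytes you can change the above code as follows.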
new_ptr = gdb.Value(new_ptr_val+4).cast(gdb.lookup_type("type_name").pointer())
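Adding a number to the raw integer address moves it by that many bytes, whereas indexing a pointer in C moves by whole elements. The sketch below shows the difference with a hypothetical helper, element_address, and an assumed 8-byte double; inside gdb the element size can be read from gdb.lookup_type("double").sizeof instead of being hard-coded.

```python
def element_address(base_address, index, element_size):
    """Address of ptr[index] when ptr points at base_address."""
    return base_address + index * element_size

# Assumption: a double occupies 8 bytes on the target.
DOUBLE_SIZE = 8

base = 0x1000
print(hex(base + 4))                                # 0x1004: 4 bytes further
print(hex(element_address(base, 4, DOUBLE_SIZE)))   # 0x1020: 4 doubles further
```

Alternatively, once the address has been cast to a pointer gdb.Value, adding an integer to that value follows C pointer arithmetic, so the element scaling is done for you.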
Finally, to obtain the actual value, you can dereference your pointer as follows.
new_ptr = gdb.Value(new_ptr_val).cast(gdb.lookup_type("double").pointer())
deref_val = new_ptr.dereference()
double_val = float(deref_val)
Please keep in mind that what we get after dereferencing is still a gdb.Value, so we may still need to create a Python value from it. Fortunately, for most basic types such as float, int and bool we can do that using the built-in functions, as shown above.
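Putting the steps of this section together, a complete printer might look roughly like the sketch below. It assumes the tagged value always points at a double and that the tag sits in the low 3 bits; TaggedDoublePrinter, untag and POINTER_MASK are hypothetical names and values chosen for illustration. The guarded import only exists so the file can also be loaded outside gdb, for example to unit test the pure helper.

```python
try:
    import gdb  # provided by gdb's embedded Python interpreter
except ImportError:
    gdb = None  # outside gdb; only the pure helpers below are usable

# Assumption: 64-bit pointers with a tag in the low 3 bits.
POINTER_MASK = 0xFFFFFFFFFFFFFFF8

def untag(raw):
    """Strip the assumed tag bits from a raw pointer integer."""
    return raw & POINTER_MASK

class TaggedDoublePrinter:
    def __init__(self, val):
        self.val = val

    def to_string(self):
        address = untag(int(self.val))
        double_ptr = gdb.Value(address).cast(
            gdb.lookup_type("double").pointer())
        try:
            # Memory is only read when the value is converted, so the
            # conversion has to sit inside the try block as well.
            return str(float(double_ptr.dereference()))
        except gdb.MemoryError:
            return "<invalid pointer 0x%x>" % address
```

Catching gdb.MemoryError keeps the variable view usable when the pointer is uninitialised or dangling, which is common at a breakpoint placed before the variable is assigned.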
Dealing with structure types
Sometimes your variable does not hold a simple value that can be represented by a float or int but some sort of structure. If your compiler includes debug information about your structure types, you can do the following.
new_ptr = gdb.Value(new_ptr_val).cast(gdb.lookup_type("myStruct").pointer())
deref_val = new_ptr.dereference()
b = int(deref_val["b"])
This assumes the myStruct type is defined as follows.
typedef struct {
    int64_t a;
    int64_t b;
} myStruct;
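Field access by name also works directly inside a printer. The following is a minimal sketch, assuming the value handed to the printer is already a myStruct (not a pointer to one) with the fields a and b from the definition above; MyStructPrinter is a hypothetical name.

```python
class MyStructPrinter:
    """Hypothetical printer for values of type myStruct."""

    def __init__(self, val):
        # Inside gdb, val is a gdb.Value of struct type; fields are read by name.
        self.val = val

    def to_string(self):
        a = int(self.val["a"])
        b = int(self.val["b"])
        return "myStruct(a=%d, b=%d)" % (a, b)
```

It would be registered the same way as the skeleton printer, with a pattern that matches the struct's type name as gdb sees it.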
If this is not the case, in languages like C you can also use the fact that structs are contiguous blocks of memory and calculate the offset directly.
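# b sits 8 bytes after the start of the struct, right after the 8-byte field a
new_ptr = gdb.Value(new_ptr_val + 8).cast(gdb.lookup_type("long long").pointer())
deref_val = new_ptr.dereference()
b = int(deref_val)  # the dereferenced value is a plain integer, so there is no field to index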
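One way to double-check a hand-calculated offset is to mirror the struct with Python's ctypes module and ask it for the field offsets. This is a sketch that assumes the debugged program uses the same C layout rules as the machine running the script; the MyStruct class below is a stand-in written for this check, not something gdb provides.

```python
import ctypes

class MyStruct(ctypes.Structure):
    # Mirrors the C definition: two consecutive int64_t fields.
    _fields_ = [("a", ctypes.c_int64), ("b", ctypes.c_int64)]

print(MyStruct.a.offset, MyStruct.b.offset, ctypes.sizeof(MyStruct))  # 0 8 16
```

For structs that mix field sizes, this also reveals any padding the compiler inserts, which is easy to miss when counting bytes by hand.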
[Optional] Integrating the pretty printer with VS Code
In most cases, you may prefer to use an IDE or text editor to debug your programs instead of using GDB directly. However, how to do this is specific to the IDE or text editor in question, so I’ll limit myself to discussing how to get our pretty printer working with VS Code. The first thing we need is an extension that can communicate with gdb. For this, we will use the Microsoft C++ extension, which implements a debug adapter for gdb. All we have to do is modify the launch configuration (launch.json) as follows.
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "debug",
            "type": "cppdbg",
            "request": "launch",
            "program": "${workspaceFolder}/${fileBasenameNoExtension}.exe", // path to exe file
            "cwd": "${workspaceFolder}",
            "stopAtEntry": false,
            "setupCommands": [
                {
                    "text": "-enable-pretty-printing",
                    "description": "enable pretty printing",
                    "ignoreFailures": false
                }
            ],
            "linux": {
                "MIMode": "gdb",
                "miDebuggerPath": "/usr/bin/gdb"
            }
        }
    ]
}
Here the important part is the -enable-pretty-printing setupCommand. With this, if you have configured your gdb config to source your pretty printer, VS Code will use it to show variables.