A Beginner's Inspection of a Program's Memory with GDB

It is interesting to see what programs actually do at a low level, especially after spending so much time developing software in high-level languages and learning the theoretical aspects of computers.

For instance, in The Rust Book, Chapter 15: Smart Pointer / Box, we have this simple program:

#[derive(Debug)]
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use crate::List::{Cons, Nil};

fn main() { 
    let _list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    dbg!(_list);
}

Let’s build it, which produces an executable binary:

❯ cargo build
   Compiling rsplayground v0.1.0 (/home/riosw/Developer/rsplayground)
    Finished dev [unoptimized + debuginfo] target(s) in 0.16s

We run the program with GDB:

> gdb /home/riosw/Developer/rsplayground/target/debug/rsplayground
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".

In gdb, we type in commands to try to dissect the program’s execution. Here, we would like gdb to stop executing after the _list variable is created, in order to try and inspect its values. We can check our source code to find where _list is created by using GDB’s list command:

(gdb) list
1       #[derive(Debug)]
2       enum List {
3           Cons(i32, Box<List>),
4           Nil,
5       }
6
7       use crate::List::{Cons, Nil};
8
9       fn main() { 
10          let _list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
(gdb) list
11          dbg!(_list);
12      }

We set a breakpoint with the command b 11 (short for break 11), which attaches a breakpoint to line 11 of the program. When we start the program with run, it stops just before executing line 11 but after executing line 10, which initializes the _list variable.

(gdb) run
Starting program: /home/riosw/Developer/rsplayground/target/debug/rsplayground 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, rsplayground::main () at src/main.rs:11
11          dbg!(_list);

We confirm that the variable _list exists, for example by using print followed by variable name:

(gdb) print _list
$1 = rsplayground::List::Cons(1, 0x5555555aebe0)

Alternatively, we can use info locals, which “displays the local variable values in the current frame”.

(gdb) info locals
_list = rsplayground::List::Cons(1, 0x5555555aebe0)

In any case, we see that _list is a Cons with value 1 and a pointer to memory address 0x5555555aebe0, which should point to another Cons with value 2, as defined on line 10 of our program:

10          let _list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));

Let’s see what is at 0x5555555aebe0 by using the x command, which displays the memory contents at a given address. We add /4x to display four units (32-bit words by default) starting at that address, where the x requests hexadecimal format:

(gdb) x /4x 0x5555555aebe0
0x5555555aebe0: 0x00000000      0x00000002      0x555aebc0      0x00005555

Here we see two interesting things. The second word, 0x00000002, is the value of our second Cons, that is, 2. The third and fourth words together form another memory address, 0x5555555aebc0. x86-64 is little-endian, so the lower half of the 64-bit pointer (0x555aebc0) comes first in memory and the upper half (0x00005555) follows. This address points to our third Cons object:

(gdb) x /4x 0x5555555aebc0
0x5555555aebc0: 0x00000000      0x00000003      0x555aeba0      0x00005555
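The step from the two 32-bit words in the first dump to the address 0x5555555aebc0 can be written out in Rust. This is only a sketch of the arithmetic: join_words is a hypothetical helper name, and the byte array is what those two words look like in memory on a little-endian machine such as the x86_64 target used here.

```rust
/// Joins two 32-bit words, in the order GDB printed them (low word first),
/// into one 64-bit address. `join_words` is a hypothetical helper.
fn join_words(low: u32, high: u32) -> u64 {
    (u64::from(high) << 32) | u64::from(low)
}

fn main() {
    // Third and fourth words from `x /4x 0x5555555aebe0`.
    let addr = join_words(0x555aebc0, 0x00005555);
    println!("{:#x}", addr); // prints 0x5555555aebc0

    // The same eight bytes as they sit in memory, lowest address first.
    let raw = [0xc0, 0xeb, 0x5a, 0x55, 0x55, 0x55, 0x00, 0x00];
    assert_eq!(u64::from_le_bytes(raw), addr);
}
```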

The dump of 0x5555555aebc0 contains our value of 3, along with a pointer to the final object at 0x5555555aeba0, which is the Nil that ends the list and carries no value:

(gdb) x /4x 0x5555555aeba0
0x5555555aeba0: 0x00000001      0x00000000      0x00000000      0x00000000
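The manual pointer-chasing above can also be mirrored from inside the program. The following is a minimal sketch, not part of the original code: walk is a hypothetical helper that follows each Box in turn, prints the heap address of the next node, and collects the values it passes.

```rust
// Same shape as the List enum in the post.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

/// Follows the Box pointers from node to node, the way the GDB session
/// does by hand, and returns the values found along the way.
/// `walk` is a hypothetical helper, not part of the original program.
fn walk(list: &List) -> Vec<i32> {
    let mut values = Vec::new();
    let mut node = list;
    while let List::Cons(value, next) = node {
        // Deref coercion turns &Box<List> into &List; its address is the
        // heap address stored inside the Box.
        let next_node: &List = next;
        println!("value {} -> next node at {:p}", value, next_node);
        values.push(*value);
        node = next_node;
    }
    values
}

fn main() {
    let list = List::Cons(
        1,
        Box::new(List::Cons(2, Box::new(List::Cons(3, Box::new(List::Nil))))),
    );
    println!("{:?}", walk(&list)); // prints [1, 2, 3]
}
```

The printed addresses will differ from run to run and from the ones in the GDB session, but each one should match what print and x report for that node when the program is stopped in the debugger.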

Now, we might wonder why the first 32-bit value at 0x5555555aeba0 is 0x00000001, while the Cons objects we inspected before it had a first 32-bit value of 0x00000000. To answer this, remember that our List enum is constructed like this:

enum List {
    Cons(i32, Box<List>),
    Nil,
}

Hence, a first value of 0x00000000 represents a Cons, while a first value of 0x00000001 represents a Nil. We can check this by reversing the order of the variants in the enum:

enum List {
    Nil,
    Cons(i32, Box<List>),
}

This swaps the 0x00000000 and 0x00000001 values, as shown here:

(gdb) info locals
_list = rsplayground::List::Cons(1, 0x5555555aebe0)
(gdb) x /4x 0x5555555aebe0
0x5555555aebe0: 0x00000001      0x00000002      0x555aebc0      0x00005555
(gdb) x /4x 0x5555555aebc0
0x5555555aebc0: 0x00000001      0x00000003      0x555aeba0      0x00005555
(gdb) x /4x 0x5555555aeba0
0x5555555aeba0: 0x00000000      0x00000000      0x00000000      0x00000000
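The same experiment can be sketched in Rust itself. One loud assumption: the enums below carry #[repr(u32)], which the original program does not. The default enum layout is unspecified and may differ between compiler versions, whereas #[repr(u32)] guarantees a leading 32-bit tag numbered in declaration order, which makes reading it through a pointer well defined. ConsFirst, NilFirst and the two tag functions are hypothetical names for illustration.

```rust
// ASSUMPTION: #[repr(u32)] is added to pin the layout. The post's List enum
// uses the default representation, whose layout is not guaranteed.
#[allow(dead_code)]
#[repr(u32)]
enum ConsFirst {
    Cons(i32, Box<ConsFirst>),
    Nil,
}

#[allow(dead_code)]
#[repr(u32)]
enum NilFirst {
    Nil,
    Cons(i32, Box<NilFirst>),
}

/// Reads the leading u32 tag of a ConsFirst value.
fn cons_first_tag(v: &ConsFirst) -> u32 {
    // SAFETY: a #[repr(u32)] enum always starts with its u32 tag.
    unsafe { *(v as *const ConsFirst as *const u32) }
}

/// Reads the leading u32 tag of a NilFirst value.
fn nil_first_tag(v: &NilFirst) -> u32 {
    // SAFETY: a #[repr(u32)] enum always starts with its u32 tag.
    unsafe { *(v as *const NilFirst as *const u32) }
}

fn main() {
    let a = ConsFirst::Cons(3, Box::new(ConsFirst::Nil));
    let b = NilFirst::Cons(3, Box::new(NilFirst::Nil));
    // prints "Cons first: Cons = 0, Nil = 1"
    println!("Cons first: Cons = {}, Nil = {}", cons_first_tag(&a), cons_first_tag(&ConsFirst::Nil));
    // prints "Nil first:  Cons = 1, Nil = 0"
    println!("Nil first:  Cons = {}, Nil = {}", nil_first_tag(&b), nil_first_tag(&NilFirst::Nil));
}
```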

I definitely was not expecting the first value to be used to determine the enum variant. In fact, I had completely forgotten that Cons and Nil are variants of an enum used to implement the recursive list. This shows the kind of insight we can gain when we delve deep into a program.

Note that this is probably not the best way to analyze and browse the memory used by a program; Valgrind seems like an interesting tool for this. In any case, I hope this post illustrates the basic idea of using GDB, and shows that a program can be dissected so that we can see what is actually happening inside.