Writing a debugger in Rust pt.2

This is the second article in a series that explores how to build a simple debugger in Rust. In this article, we add the ability to set breakpoints, and we create a way to interact dynamically with our debuggee.

In the previous post of the series, we saw how to get started with our debugger, and we were even able to track the syscalls made by our debuggee! Now, it would be great if we could interact dynamically with it. This would allow us to explore the process state in more depth while it is paused, to manually resume execution, to single-step through the process, and to add breakpoints along the way!

Interacting dynamically with our child process

First of all, let’s start by cleaning our code a bit. We will certainly need to create functions to query user input, parse it, and act accordingly. It may be a bit too much to keep everything in main.rs. Let’s go ahead and create a debugger.rs file.
For now, we will just store a very simple run function in it.

use nix::sys::ptrace;
use nix::sys::wait::waitpid;
use nix::unistd::Pid;

pub fn run(child: Pid) -> () {
    let _ = waitpid(child, None).expect("Failed to wait");
    println!("Process started");
    loop {
        
    }
}

Because we will use the run function outside of our module, we need to mark it as public with the pub keyword. Let’s use this function in main.rs.

mod debugger;

fn main() {
    let fork_result = unsafe { fork() }.expect("Failed to fork");
    match fork_result {
        ForkResult::Parent { child } => {
            debugger::run(child);
        }
        ForkResult::Child => {
            ptrace::traceme().expect("Failed to call traceme in child");
            let path: &CStr = &CString::new("../debugee/target/release/debugee").unwrap();
            nix::unistd::execve::<&CStr, &CStr>(path, &[], &[]).unwrap();
            unreachable!("Execve should have replaced the program");
        }
    }
}

Note that because we created a new module, we need to import it in main.rs. Also, since we do not use waitpid in main.rs anymore, we can remove it from the list of imports.

Now, let’s actually do something in our run function.

Asking for user input on pause

Every time the child is stopped, we will ask the user what they want to do, through a very simple text interface. The user will be able to issue any of the following instructions:

Continue the process until completion (or the next breakpoint)
Continue the process until the next start (or end) of a syscall
Make a single step in the process
Show the register states of the process
Show the content of memory at an address (one word)
Put a breakpoint (we will see later what this means)
Enter an instruction to get the list of available instructions.

We will represent these different instructions with a simple enum. We have not defined what a breakpoint is yet, so let’s forget about it.

enum UserInstruction {
    ContinueUntilBreakpoint,
    ContinueUntilSyscall,
    ShowHelp,
    ShowMemory { address: u64 },
    ShowRegisters,
    SingleStep,
}

Most instructions are very simple and do not require any additional data, except for ShowMemory, which needs the address of the memory to fetch. The user will input a String, and we will need to parse it to recover the instruction. If we cannot parse it, we need to return an error and warn the user that the command is incorrect. The Rust standard library provides a trait that fits this behavior: FromStr. Let’s implement it for our UserInstruction type.

impl FromStr for UserInstruction {
    type Err = ();

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        todo!()
    }
}

Let’s pause for a bit and analyse this. The trait consists of a single from_str function that takes a &str as argument and returns a Result, which is Self on success, or Self::Err on error. In this context, Self means UserInstruction, and Self::Err is a type that we need to specify. Let’s create an empty enum that will contain all the ways our parsing could go wrong.

enum UserInstructionParseError {

}

impl FromStr for UserInstruction {
    type Err = UserInstructionParseError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        todo!()
    }
}

Let’s now define our actual parsing logic:

n will mean Continue to next instruction (single-step)
h will mean Display help
m will mean Display memory at the address specified in hex
r will stand for Show registers
c will be Continue until next breakpoint
s will be Continue until next syscall


enum UserInstructionParseError {
    UnknownInstruction,
    AddressShouldStartWith0x,
    UnparseableAddress(ParseIntError),
}

impl FromStr for UserInstruction {
    type Err = UserInstructionParseError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim() {
            "c" => Ok(UserInstruction::ContinueUntilBreakpoint),
            "s" => Ok(UserInstruction::ContinueUntilSyscall),
            "h" => Ok(UserInstruction::ShowHelp),
            "r" => Ok(UserInstruction::ShowRegisters),
            "n" => Ok(UserInstruction::SingleStep),
            s if s.starts_with("m ") => {
                let hex_address = s.trim_start_matches("m ");
                if !hex_address.starts_with("0x") {
                    return Err(UserInstructionParseError::AddressShouldStartWith0x)
                }
                let hex_address = hex_address.trim_start_matches("0x");
                let address = u64::from_str_radix(hex_address, 16).map_err(UserInstructionParseError::UnparseableAddress)?;
                Ok(UserInstruction::ShowMemory { address }) 
            }
            _ => Err(UserInstructionParseError::UnknownInstruction)
        }
    }
}

Everything looks straightforward: we try to match the input against each of the strings that we specified earlier. For ShowMemory, this is a bit more complicated. We start by checking whether the command starts with m followed by a space. If it does, we remove that prefix from the string and should obtain a hex address. We check that it starts with 0x. If it does not, we return the AddressShouldStartWith0x error that we defined in our UserInstructionParseError enum. If it does, we try to parse the rest as a hexadecimal number. If it is not valid, we return UnparseableAddress. Note that UnparseableAddress contains a ParseIntError, so we can propagate what went wrong when parsing the address.
Finally, if none of the previous matches worked, we return an error UnknownInstruction.
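
One subtlety is worth a quick sketch: trim_start_matches removes every repeated occurrence of its pattern, so an input such as m 0x0x10 would have both 0x prefixes stripped and be accepted as address 0x10. The variant below is only an illustration, not the code used in the rest of this article. It is reduced to two instructions and uses str::strip_prefix, which removes the prefix at most once and reports whether it was present.

```rust
use std::num::ParseIntError;
use std::str::FromStr;

// Reduced version of the article's enum, kept small for the sketch.
#[derive(Debug, PartialEq)]
enum UserInstruction {
    ShowHelp,
    ShowMemory { address: u64 },
}

#[derive(Debug, PartialEq)]
enum UserInstructionParseError {
    UnknownInstruction,
    AddressShouldStartWith0x,
    UnparseableAddress(ParseIntError),
}

impl FromStr for UserInstruction {
    type Err = UserInstructionParseError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let s = s.trim();
        if s == "h" {
            return Ok(UserInstruction::ShowHelp);
        }
        // strip_prefix returns None when the prefix is absent,
        // and removes it only once when it is present.
        let argument = s
            .strip_prefix("m ")
            .ok_or(UserInstructionParseError::UnknownInstruction)?;
        let digits = argument
            .trim()
            .strip_prefix("0x")
            .ok_or(UserInstructionParseError::AddressShouldStartWith0x)?;
        let address = u64::from_str_radix(digits, 16)
            .map_err(UserInstructionParseError::UnparseableAddress)?;
        Ok(UserInstruction::ShowMemory { address })
    }
}
```

Because FromStr is implemented, the input can also be parsed with "m 0xff".parse::<UserInstruction>(), which is equivalent to calling UserInstruction::from_str.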

Cool, we can convert a string slice to a UserInstruction and handle parsing errors properly. Let’s read the user input and convert it to our UserInstruction.
Everything will go in a get_user_input function that takes no argument and returns a Result<UserInstruction, InputError>. Our InputError is an enum with two variants:

enum InputError {
    InvalidInput,
    UserInstructionParseError(UserInstructionParseError),
}

The first variant covers the case where we could not read the input at all. For example, strings in Rust must be valid UTF-8, so read_line fails on invalid bytes. The second variant covers input that we could read but not parse, as we have seen earlier.

const PREFIX: &'static str = "(drs)";

fn get_user_input() -> Result<UserInstruction, InputError> {
    // Write is needed for the flush() call below.
    use std::io::{stdin, stdout, Write};
    print!("{PREFIX} ");
    let _ = stdout().flush();
    let mut raw_input = String::new();
    stdin().read_line(&mut raw_input).map_err(|_| InputError::InvalidInput)?;
    UserInstruction::from_str(&raw_input).map_err(InputError::UserInstructionParseError)
}

We read the user input into a buffer raw_input, and try to parse it with the function that we defined previously. Now, let’s go back to the main loop of our program, that will now look like this:

pub fn run(child: Pid) -> () {
    let _ = waitpid(child, None).expect("Failed to wait");
    println!("Process started");
    loop {
        match get_user_input() {
            Ok(user_instruction) => todo!(),
            Err(err) => println!("{err}")
        }
    }
}

Unfortunately, this does not compile yet, because we haven’t told our program how to display our errors. To do so, let’s implement the Display and Error traits for our enums. We could do it manually, but the thiserror crate provides a derive macro that implements them for us. Let’s add it. The dependencies section of our Cargo.toml should look like this:

[dependencies]
thiserror = "1.0"
nix = {version = "0.26.2", features = ["ptrace", "process"]}

We’ll skip over implementation details because that’s not the topic of this blog post, but here they are:

use std::num::ParseIntError;
use thiserror::Error;

#[derive(Error, Debug)]
enum UserInstructionParseError {
    #[error("Unknown instruction")]
    UnknownInstruction,
    #[error("Address should start with `0x`")]
    AddressShouldStartWith0x,
    #[error("Could not parse address: {0}")]
    UnparseableAddress(#[from] ParseIntError),
}

#[derive(Error, Debug)]
enum InputError {
    #[error("Invalid input")]
    InvalidInput,
    #[error("Could not parse instruction: {0}")]
    UserInstructionParseError(#[from] UserInstructionParseError),
}

We can then test it briefly by trying some wrong commands and a successful one:

$ cargo run
Process started
(drs) j
Could not parse instruction: Unknown instruction
(drs) m 445
Could not parse instruction: Address should start with `0x`
(drs) m 0xkgf
Could not parse instruction: Could not parse address: invalid digit found in string
(drs) m 0xffffffff
thread 'main' panicked at 'not yet implemented', src/debugger.rs:35:37
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Hello, world!

Of course, our program crashes, because we still have our dear todo!(), but we will take care of that now.

Executing user instructions

We now need to check what instruction we received, and process it accordingly. To do so, we will create a simple process_user_instruction function that takes the instruction and the child PID, and processes the instruction. It returns nothing, because everything will be printed directly to stdout. However, it can fail, for example if our child has been killed by an external program, so we create an error type to handle that.

#[derive(Error, Debug)]
enum ProcessingError {
    #[error("Error using ptrace syscall: {0}")]
    Errno(#[from] Errno)
}

fn process_user_instruction(pid: Pid, user_instruction: &UserInstruction) -> Result<(), ProcessingError> {
    match user_instruction {
        UserInstruction::ContinueUntilBreakpoint => ptrace::cont(pid, None)?,
        UserInstruction::ContinueUntilSyscall => ptrace::syscall(pid, None)?,
        UserInstruction::ShowHelp => todo!(),
        UserInstruction::ShowMemory { address } => todo!(),
        UserInstruction::ShowRegisters => {
            let regs = ptrace::getregs(pid)?;
            println!("{regs}");
        },
        UserInstruction::SingleStep => ptrace::step(pid, None)?,
    };
    Ok(())
}

For ContinueUntilBreakpoint, ContinueUntilSyscall and SingleStep, the procedure is quite simple: we just need to call ptrace with the right request. We will do ShowHelp later, because that’s not the most interesting one.

Displaying the registers

Right now, let’s work on ShowRegisters, because we already know how to get the register values. However, we don’t know how to print them yet. If we tried to compile our program right now, the compiler would tell us that it doesn’t know how to display the type user_regs_struct, which is the type of our variable regs.

$ cargo run
error[E0277]: `user_regs_struct` doesn't implement `std::fmt::Display`
   --> src/debugger.rs:107:23
    |
107 |             println!("{regs}");
    |                       ^^^^^^ `user_regs_struct` cannot be formatted with the default formatter
    |
    = help: the trait `std::fmt::Display` is not implemented for `user_regs_struct`
    = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
    = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)

The help invites us to implement the std::fmt::Display trait for user_regs_struct. However, if we tried to do so, the compiler would gently tell us that we are trying to implement a trait from an external crate (std) for a struct from another external crate (libc). We cannot do that: otherwise, multiple crates could implement the same trait for the same data structure, and the compiler would not know which one to pick. Again, we are going to follow the compiler’s advice and use the newtype pattern. This consists of wrapping the external type in a new type defined in our crate, so that we can implement the Display trait for this newtype.

struct UserRegsStruct(user_regs_struct);

impl std::fmt::Display for UserRegsStruct {
   fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        let user_regs_struct {
            r15,
            r14,
            r13,
            r12,
            rbp,
            rbx,
            r11,
            r10,
            r9,
            r8,
            rax,
            rcx,
            rdx,
            rsi,
            rdi,
            orig_rax,
            rip,
            cs,
            eflags,
            rsp,
            ss,
            fs_base,
            gs_base,
            ds,
            es,
            fs,
            gs,
        } = self.0;
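        // Print in hexadecimal, which is easier to compare with addresses.
        write!(f, "rax: {rax:#x}\norig_rax: {orig_rax:#x}\nrcx: {rcx:#x}\nrdx: {rdx:#x}\nrsi: {rsi:#x}\nrdi: {rdi:#x}\nrip: {rip:#x}\nrsp: {rsp:#x}\nr15: {r15:#x}\nr14: {r14:#x}")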
    }
}

We use pattern destructuring to get all the registers as local variables, and we display some of them. In process_user_instruction, we then wrap the registers before printing them, with println!("{}", UserRegsStruct(regs)). I am not showing all registers, because this is an example, and looking at the full list, I have the feeling that we will need to change ShowRegisters. We will need a command that can take an argument to display only some registers, only one, or all of them. The Display trait does not allow this customization, so we will change this in the future. It was just a pretext to show you the newtype pattern.

Reading memory

Now, let’s tackle the last command: let’s read 8 bytes at a certain memory address. Back to our ptrace manpage, we find PTRACE_PEEKDATA.

use nix::sys::ptrace::AddressType;

        /** ... **/
        UserInstruction::ShowMemory { address } => {
            let value = ptrace::read(pid, *address as AddressType)?;
            println!("{value:#018x}")
        },

Cool, we have written all of our instructions! We just need to fix one last issue. Currently, our ContinueUntilBreakpoint, ContinueUntilSyscall and SingleStep instructions restart the stopped process, but they never call waitpid again. We should wait for the child and read its status when it stops. This way, we can know whether it exited or paused, and for what reason. Let’s modify our run function:

pub fn run(child: Pid) -> () {
    let _ = waitpid(child, None).expect("Failed to wait");
    println!("Process started");
    loop {
        match get_user_input() {
            Ok(user_instruction) => {
                process_user_instruction(child, &user_instruction)
                    .unwrap_or_else(|err| println!("Encountered error: {err}"));
                match user_instruction {
                    UserInstruction::ContinueUntilBreakpoint
                    | UserInstruction::SingleStep
                    | UserInstruction::ContinueUntilSyscall => {}
                    _ => continue,
                };
                let wait_result = waitpid(child, None).expect("Failed to wait");
                match wait_result {
                    WaitStatus::Exited(child, status) => {
                        println!("Child {child} exited with status {status}, quitting...");
                        break
                    }
                    _ => continue
                }
            }
            Err(err) => println!("{err}"),
        }
    }
}

Cool, it should be working. This does not look really nice, so we will come back to it later, when we know more about breakpoints.

Reading our machine code in memory

Let’s now play a bit with our fresh debugger full of capabilities! Let’s start by looking at the registers.

$ cargo run
Process started
(drs) r
rax: 0x0
orig_rax: 0x3b
rcx: 0x0
rdx: 0x0
rsi: 0x0
rdi: 0x0
rip: 0x7f303a07d200
rsp: 0x7ffe91a205d0
r15: 0x0
r14: 0x0

rip is the 64-bit instruction pointer register. It contains the address of the next instruction to execute. It could be interesting to look at what’s there.

(drs) m 0x7f303a07d200
0x00000c98e8e78948

This is the code that our processor will execute next. Because x86-64 is little-endian, the byte at the lowest address is the rightmost one in the printed word, so we need to read it backwards, byte by byte. The first byte is 0x48. If we look it up on x86asm.net, we can see that it is the REX.W prefix, which means that we will be working with 64-bit operands. Then, 0x89 is the mov opcode. According to this CS course, the next byte is the ModR/M byte. It tells us what we are moving, and to what destination. If we look at the bits of 0xe7, we find 11100111. The first two bits are the MOD field: 11 means register addressing mode, so we will be moving a value from register to register. The next 3 bits, 100, specify the source register, which is rsp in 64-bit mode. The last 3 bits specify the destination register: 111 is rdi. Our instruction is then mov rdi,rsp in Intel syntax. We should be able to check that easily by stepping to the next instruction, and displaying our registers again.
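
The manual decoding above can be written as a small sketch. It is only an illustration, under the assumption that we handle a single pattern (the REX.W prefix 0x48 followed by opcode 0x89 in register addressing mode); the names decode_modrm, memory_order and describe_mov are made up for this example and are not part of the debugger.

```rust
/// Fields of an x86 ModR/M byte.
#[derive(Debug, PartialEq)]
struct ModRm {
    mode: u8, // bits 7..6
    reg: u8,  // bits 5..3
    rm: u8,   // bits 2..0
}

fn decode_modrm(byte: u8) -> ModRm {
    ModRm {
        mode: byte >> 6,
        reg: (byte >> 3) & 0b111,
        rm: byte & 0b111,
    }
}

/// 64-bit registers selected by a 3-bit field when no REX extension bit is set.
const REGISTERS: [&str; 8] = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"];

/// Bytes of a word printed by our memory command, in memory order
/// (lowest address first), since x86-64 is little-endian.
fn memory_order(word: u64) -> [u8; 8] {
    word.to_le_bytes()
}

/// Decodes only `REX.W + 89 /r` with a register-to-register ModR/M byte.
fn describe_mov(word: u64) -> Option<String> {
    let bytes = memory_order(word);
    if bytes[0] != 0x48 || bytes[1] != 0x89 {
        return None;
    }
    let modrm = decode_modrm(bytes[2]);
    if modrm.mode != 0b11 {
        return None;
    }
    // For opcode 0x89, `reg` is the source and `rm` is the destination.
    Some(format!(
        "mov {},{}",
        REGISTERS[modrm.rm as usize], REGISTERS[modrm.reg as usize]
    ))
}
```

Calling describe_mov(0x00000c98e8e78948) on the word we just read yields mov rdi,rsp. A real disassembler handles many more prefixes, opcodes and addressing modes.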

(drs) n
(drs) r
rax: 0x0
orig_rax: 0xffffffffffffffff
rcx: 0x0
rdx: 0x0
rsi: 0x0
rdi: 0x7ffe91a205d0
rip: 0x7f303a07d203
rsp: 0x7ffe91a205d0
r15: 0x0
r14: 0x0

Nice, the content of our rsp register has been copied to our rdi register, so our debugger seems to be working! Let’s continue and check what the next instruction is.

(drs) m 0x7f303a07d203
0xc4894900000c98e8

This time, let’s get some help and use a disassembler, for example the one from defuse.ca.
Apparently our next instruction will be call 0xc9d.

(drs) n
(drs) r
rax: 0x0
orig_rax: 0xffffffffffffffff
rcx: 0x0
rdx: 0x0
rsi: 0x0
rdi: 0x7ffe91a205d0
rip: 0x7f303a07dea0
rsp: 0x7ffe91a205c8
r15: 0x0
r14: 0x0

Indeed, the stack pointer rsp has moved (it has been decremented by 8, because the stack grows downwards), and rip now points to the first instruction of the called function. The difference new_rip - old_rip is exactly 0xc9d.

>>> int("0x7f303a07dea0", 16) - int("0x7f303a07d203", 16)
3229
>>> hex(3229)
'0xc9d'

And if we look at the memory near the stack pointer rsp, we should be able to find our old rip + the size of the call instruction, on top of the stack.

(drs) m 0x7ffe91a205c8
0x00007f303a07d208

This is 5 bytes bigger than our previous rip, and the call instruction was encoded on 5 bytes, so everything adds up !
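
The arithmetic of this call can be checked with a short sketch. The E8 opcode is followed by a 32-bit little-endian displacement, here 0x00000c98, which is relative to the address of the next instruction. That is presumably why the disassembler, which places the instruction at offset 0, prints 0xc9d (5 + 0xc98). The function names below are hypothetical helpers written for this example.

```rust
/// Length in bytes of a `call rel32` instruction (opcode E8 + 4-byte displacement).
const CALL_REL32_LEN: u64 = 5;

/// Extracts the displacement if the word starts with the E8 opcode.
/// `word` is a value as printed by our memory command.
fn rel32_from_word(word: u64) -> Option<i32> {
    let bytes = word.to_le_bytes();
    if bytes[0] != 0xe8 {
        return None;
    }
    Some(i32::from_le_bytes([bytes[1], bytes[2], bytes[3], bytes[4]]))
}

/// Address pushed on the stack by the call: the instruction right after it.
fn return_address(call_address: u64) -> u64 {
    call_address.wrapping_add(CALL_REL32_LEN)
}

/// Address the call jumps to: return address plus the sign-extended displacement.
fn call_target(call_address: u64, rel32: i32) -> u64 {
    return_address(call_address).wrapping_add(rel32 as i64 as u64)
}
```

With the values from our session, the call at 0x7f303a07d203 has a displacement of 0xc98, a return address of 0x7f303a07d208 and a target of 0x7f303a07dea0, which matches the rip and the top of the stack that we observed.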

We had enough fun with our debugger for now, and now is time for us to look at breakpoints, so we can have even more fun later.

Adding breakpoints

Adding breakpoints means adding the ability to stop at any instruction in the machine code. The user will specify an address in memory, and when the CPU is about to execute the instruction at this address, we will stop the child process. The user will then be able to analyze register values and memory, and single-step through the process from this point.

What is a breakpoint?

There is an assembly instruction dedicated to interrupting software: the int instruction. It has the form int <x>, where x is a byte that selects which software interrupt should be generated. When the processor encounters this instruction, it looks up what to do in a table provided by the OS. We will not dive into the details of how this Interrupt Descriptor Table (IDT) is created, because it would take at least another blog post to analyze it thoroughly. You can think of it as a lookup table of callbacks that the CPU calls when it encounters an interrupt (an interrupt can come from the int instruction, but also from CPU exceptions, such as division by zero). We will take for granted that for int 3, most Linux-like systems send a SIGTRAP signal to the process responsible for the interrupt (provided the fourth IDT entry has not been modified).

The ptrace manual page describes what happens to a traced process when a signal arrives:

When a (possibly multithreaded) process receives any signal except SIGKILL, the kernel selects an arbitrary thread which handles the signal. (If the signal is generated with tgkill(2), the target thread can be explicitly selected by the caller.) If the selected thread is traced, it enters signal-delivery-stop. At this point, the signal is not yet delivered to the process, and can be suppressed by the tracer. If the tracer doesn’t suppress the signal, it passes the signal to the tracee in the next ptrace restart request. This second step of signal delivery is called signal injection in this manual page. Note that if the signal is blocked, signal-delivery-stop doesn’t happen until the signal is unblocked, with the usual exception that SIGSTOP can’t be blocked. Signal-delivery-stop is observed by the tracer as waitpid(2) returning with WIFSTOPPED(status) true, with the signal returned by WSTOPSIG(status).

Let’s skip over the multithreading part for now. What we see is that when a traced thread should receive a signal, it is first stopped in signal-delivery-stop. The tracer sees the signal first, through waitpid, and decides what to do with it. We can forward the signal by passing it to the child along with PTRACE_CONT, or we can pass nothing and continue as if nothing ever happened.

So we just need to find a way to add int 3 instructions in our child code, and it will be paused by the OS. Then, we can analyze its state in our parent, and resume it when we want, without forwarding the signal. Indeed, if the process did not define any particular way to handle the SIGTRAP signal, the default action is to Terminate (core dump). So we definitely do not want to forward the signal.

Actually breaking on points

Now it is time to allow the user to insert breakpoints in our code. We need a command that allows us to create breakpoints. We first create a new user instruction:

enum UserInstruction {
    AddBreakpoint { address: u64 },
    /** 
        old instructions 
    **/
}

Because we need the same address parsing as for the ShowMemory user instruction, we will quickly factor it out into a function.

fn parse_hex_address(hex: &str) -> Result<u64, UserInstructionParseError> {
    if !hex.starts_with("0x") {
        return Err(UserInstructionParseError::AddressShouldStartWith0x);
    }
    let hex_address = hex.trim_start_matches("0x");
    u64::from_str_radix(hex_address, 16).map_err(UserInstructionParseError::UnparseableAddress)
}

fn from_str(s: &str) -> Result<Self, Self::Err> {
    match s.trim() {
        /* other instructions */
        s if s.starts_with("m ") => {
            let hex_address = s.trim_start_matches("m ");
            let address = parse_hex_address(hex_address)?;
            Ok(UserInstruction::ShowMemory { address })
        }
        s if s.starts_with("b ") => {
            let hex_address = s.trim_start_matches("b ");
            let address = parse_hex_address(hex_address)?;
            Ok(UserInstruction::AddBreakpoint { address })
        }
        _ => Err(UserInstructionParseError::UnknownInstruction),
    }
}

Now it is time to handle AddBreakpoint in our process_user_instruction function. What we need to do is replace the instruction at the provided address with int 3. Conveniently, the x86 instruction set provides a dedicated int3 instruction that fits in one byte, opcode 0xCC. Otherwise, we would need to write two bytes (one for the int opcode, and one for 3). To write something in the tracee, we can use PTRACE_POKEDATA, which writes a word of data at the given address. Because we can only write a whole word, and we want to change a single byte, we need to read the word first, apply a mask to modify only its lowest byte (the one stored at the given address, since x86 is little-endian), then write it back.

UserInstruction::AddBreakpoint { address } => {
    let previous_word = ptrace::read(pid, *address as AddressType)?;
    let word_to_write = (previous_word & !0xff) | 0xcc;
    unsafe { ptrace::write(pid, *address as AddressType, word_to_write as AddressType) }?;
}

Cool, let’s try this out !

$ cargo run
Process started
(drs) r
rax: 0x0
orig_rax: 0x3b
rcx: 0x0
rdx: 0x0
rsi: 0x0
rdi: 0x0
rip: 0x7fb275749200
rsp: 0x7ffcc473e780
r15: 0x0
r14: 0x0

Ok, we remember from the previous chapter that our first instruction was a mov, encoded on 3 bytes. The following instruction was a call instruction. Let’s try to break just before that.

(drs) b 0x7fb275749203
(drs) m 0x7fb275749203
0xc4894900000c98cc

We add a breakpoint 3 bytes after our current rip. We can check that memory was indeed written at the right place with our ShowMemory user instruction. The cc at the end tells us that this is the case, and that we only modified the first byte of the word. Sweet!
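
The masking step can be checked in isolation, without any tracee. The sketch below applies the same bit arithmetic to a plain integer; patch_lowest_byte and lowest_byte are names introduced here for illustration only.

```rust
/// Opcode of the one-byte int3 instruction.
const INT3: u8 = 0xcc;

/// Replaces the least significant byte of `word` and keeps the other seven.
/// On little-endian x86-64, that byte is the one stored at the word's address.
fn patch_lowest_byte(word: i64, byte: u8) -> i64 {
    (word & !0xff) | i64::from(byte)
}

/// Returns the byte stored at the word's address.
fn lowest_byte(word: i64) -> u8 {
    (word & 0xff) as u8
}
```

Applied to the word that held our call instruction, 0xc4894900000c98e8, patching with INT3 gives 0xc4894900000c98cc, which is exactly what the memory command printed after we set the breakpoint.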

If we enter c, we can see that our program indeed stops, instead of continuing until the end. We can then check the registers.

(drs) c
(drs) r
rax: 0x0
orig_rax: 0xffffffffffffffff
rcx: 0x0
rdx: 0x0
rsi: 0x0
rdi: 0x7ffcc473e780
rip: 0x7fb275749204
rsp: 0x7ffcc473e780
r15: 0x0
r14: 0x0

We are stopped just after our int3 instruction has been executed. The content of the rdi register shows that our first mov instruction has been executed, as we could expect. Well, there is not much to say, but we can rejoice, our breakpoint worked ! Let’s finish the execution of our program.

(drs) c

Wait, nothing happened, let’s try again ?

(drs) c
(drs) c
(drs) c
(drs) c
(drs) r
rax: 0x0
orig_rax: 0xffffffffffffffff
rcx: 0x0
rdx: 0x0
rsi: 0x0
rdi: 0x7ffcc473e780
rip: 0x7fb275749207
rsp: 0x7ffcc473e780
r15: 0x0
r14: 0x0
(drs) ^C

Instead of running until the end, our rip has barely moved, even though we pressed c several times. It looks like our child stops again almost immediately every time we resume it. One capability of waitpid that we have not exploited yet is that it provides us with the status of the child. Let’s print this status to get a clearer picture.

let wait_result = waitpid(child, None).expect("Failed to wait");
match wait_result {
    WaitStatus::Exited(child, status) => {
        println!("Child {child} exited with status {status}, quitting...");
        break;
    }
    wait_status => {
        println!("{wait_status:?}");
        continue
    },
}

Now, if we frantically press c on our keyboard, we can see the following.

(drs) c
Stopped(Pid(310341), SIGSEGV)
(drs) c
Stopped(Pid(310341), SIGSEGV)
(drs) c
Stopped(Pid(310341), SIGSEGV)
(drs) c
Stopped(Pid(310341), SIGSEGV)

SIGSEGV signals a segmentation violation: our process is trying to access memory that it does not have access to, or write to read-only memory, or other weird shenanigans. If we proceed more carefully by using our SingleStep instruction, we reach the following.

(drs) c
Stopped(Pid(313507), SIGTRAP)
(drs) n
Stopped(Pid(313507), SIGTRAP)
(drs) n
Stopped(Pid(313507), SIGTRAP)
(drs) r
rax: 0x0
orig_rax: 0xffffffffffffffff
rcx: 0x0
rdx: 0x0
rsi: 0x0
rdi: 0x7ffc7a1144d0
rip: 0x7fdcc8def207
rsp: 0x7ffc7a1144d0
r15: 0x0
r14: 0x0
(drs) m 0x7fdcc8def207
0x24148b48c4894900

We have not encountered any SIGSEGV yet, but if we decode the next instruction with our favorite disassembler, we see that it will be add BYTE PTR [ecx-0x77],cl. What is in ecx?

(drs) r
rax: 0x0
orig_rax: 0xffffffffffffffff
rcx: 0x0
rdx: 0x0
rsi: 0x0
rdi: 0x7ffc7a1144d0
rip: 0x7fdcc8def207
rsp: 0x7ffc7a1144d0
r15: 0x0
r14: 0x0

rcx is null, and so is ecx, because it is the lower half of rcx. Note that the disassembler shows ecx, presumably because it decoded the bytes as 32-bit code; in 64-bit mode the base register is the full rcx. So yes, we will try to add the content of cl to the byte in memory at [rcx-0x77], and it will certainly not go well. The subtraction wraps around, so the address will be 0xffffffffffffff89. If we look at the memory mappings of our process, we can see that nothing is mapped there.

$ cat /proc/313507/maps
<some memory mappings>
7fdcc8dd4000-7fdcc8dd5000 r--p 00000000 fd:00 3421756                    /usr/lib64/ld-linux-x86-64.so.2
7fdcc8dd5000-7fdcc8dfb000 r-xp 00001000 fd:00 3421756                    /usr/lib64/ld-linux-x86-64.so.2
7fdcc8dfb000-7fdcc8e05000 r--p 00027000 fd:00 3421756                    /usr/lib64/ld-linux-x86-64.so.2
7fdcc8e05000-7fdcc8e09000 rw-p 00030000 fd:00 3421756                    /usr/lib64/ld-linux-x86-64.so.2
7ffc7a0f4000-7ffc7a115000 rw-p 00000000 00:00 0                          [stack]
7ffc7a172000-7ffc7a176000 r--p 00000000 00:00 0                          [vvar]
7ffc7a176000-7ffc7a178000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]

We can also see that the code we were executing around 0x7fdcc8def207 is part of the ld-linux-x86-64.so.2 shared object, the dynamic loader. So, why is our program crashing? The problem is certainly not in the loader.

I am sure you knew the answer from the beginning, but it would not have been fun if we had fixed it immediately, right? Here, we had a pretext to explore disassembly, segmentation faults, and memory mappings.

Of course, we remember from our previous chapter that the instruction following our first mov instruction was supposed to be a call instruction. So we should never reach the code just after this call instruction, we should be somewhere else in memory, as we have seen in the previous chapter. The ptrace::write function is not marked as unsafe without reason !
Here, two things happened:

After the int3 instruction, our rip points one byte after where our call opcode was supposed to be.
Our call instruction does not even exist anymore ! We replaced its opcode with our int3.

Restoring the process state

To fix this, we need to decrement the value of the rip register after hitting our breakpoint, and we should also restore the instruction that was there before we replaced it with int3. To do this, we will store a list of breakpoints along with the byte that each one replaced. To make our code a bit more flexible for the future, we will define a Context struct, even if it currently just contains a list of breakpoints.

struct Breakpoint {
    address: u64,
    /// Byte that was here before we replaced it with 0xcc
    previous_byte: i8,
}

struct Context {
    breakpoints: Vec<Breakpoint>,
}

impl Context {
    fn new() -> Self {
        Context {
            breakpoints: Vec::new(),
        }
    }

    /// Registers a breakpoint, unless one already exists at the same address.
    fn add_breakpoint(&mut self, breakpoint: Breakpoint) {
        let already_set = self
            .breakpoints
            .iter()
            .any(|b| b.address == breakpoint.address);
        if !already_set {
            self.breakpoints.push(breakpoint)
        }
    }
}

We define this add_breakpoint method for convenience. It prevents us from adding multiple breakpoints at the same place, which could lead to unwanted behavior with our current approach. If we added a breakpoint at address a, we would create a Breakpoint with previous_byte equal to the value of the byte at address a, but we would also replace this byte with 0xCC. So if we added another breakpoint at the same address a, this breakpoint would have previous_byte set to 0xCC, which could lead to issues if we forget about this. We will fix this issue later, but for now, we will use add_breakpoint as a safeguard.
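
To make the pitfall concrete, here is a small simulation. It does not use ptrace: FakeWord is a hypothetical stand-in for one word of the child’s memory, and its two methods mimic the save-then-patch and restore steps with the same masks as the debugger.

```rust
/// Opcode of int3, as a word-sized value for the mask arithmetic.
const INT3: i64 = 0xcc;

/// Hypothetical stand-in for one word of tracee memory.
struct FakeWord(i64);

impl FakeWord {
    /// Saves the lowest byte, then overwrites it with int3.
    /// Returns the saved byte, like `previous_byte` in `Breakpoint`.
    fn set_breakpoint(&mut self) -> i8 {
        let previous_byte = (self.0 & 0xff) as i8;
        self.0 = (self.0 & !0xff) | INT3;
        previous_byte
    }

    /// Writes a saved byte back into the lowest byte of the word.
    fn restore(&mut self, previous_byte: i8) {
        // The cast to i64 sign-extends, so we mask to keep one byte only.
        self.0 = (self.0 & !0xff) | (0xff & previous_byte as i64);
    }
}
```

If set_breakpoint is called twice on the same word, the second call saves 0xcc instead of the original opcode. Restoring with that second value leaves the int3 in place, and only the first saved byte brings the original word back.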

We can also define the methods that will be used to insert and remove breakpoints.

impl Breakpoint {
    fn insert(&self, pid: Pid) -> Result<(), Errno> {
        let Self {
            address,
            ..
        } = *self;
        let current_word = ptrace::read(pid, address as AddressType)?;
        let word_to_write = (current_word & !0xff) | 0xcc;
        unsafe { ptrace::write(pid, address as AddressType, word_to_write as AddressType) }?;
        Ok(())
    }

    fn remove(&self, pid: Pid) -> Result<(), Errno> {
        let Self {
            address,
            previous_byte,
        } = *self;
        let current_word = ptrace::read(pid, address as AddressType)?;
        let word_to_write = (current_word & !0xff) | (0xff & previous_byte as i64);
        unsafe { ptrace::write(pid, address as AddressType, word_to_write as AddressType) }?;
        Ok(())
    }
}

fn process_user_instruction(
    pid: Pid,
    user_instruction: &UserInstruction,
    context: &mut Context,
) -> Result<(), ProcessingError> {
    match user_instruction {
    /* ... */
        UserInstruction::AddBreakpoint { address } => {
            let previous_word = ptrace::read(pid, *address as AddressType)?;
            let breakpoint = Breakpoint {
                address: *address,
                previous_byte: (previous_word & 0xff) as i8,
            };
            //breakpoint.insert(pid)?;
            context.add_breakpoint(breakpoint);
        }
    }
    /* ... */
}
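
The bit masking used in insert and remove can be tried in isolation. The sketch below assumes a 64-bit word, as ptrace::read returns on x86-64 Linux, and uses two hypothetical helper names, with_low_byte and low_byte. It does not touch any process.

```rust
/// Replace the least significant byte of `word` with `byte`, leaving the
/// other seven bytes untouched. On little-endian x86-64, that byte is the
/// one stored at the address the word was read from.
fn with_low_byte(word: i64, byte: u8) -> i64 {
    (word & !0xff) | i64::from(byte)
}

/// Extract the byte stored at the address the word was read from.
fn low_byte(word: i64) -> u8 {
    (word & 0xff) as u8
}
```

Patching a word with 0xcc and then patching it again with the byte returned by low_byte on the original word gives back the original word, which is exactly what insert followed by remove relies on.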

Here, we commented out the breakpoint insertion, because it has two issues:

The one we mentioned before: setting two breakpoints at the same place.
The memory is modified as soon as we ask the debugger to create a breakpoint, so if we want to inspect memory with ShowMemory, the old opcode will be gone.

So, instead of patching the process memory right away, we will keep the breakpoint in our Context, and insert it on the fly when we ask the process to resume.

impl Context {
    /* ... */
    fn apply_breakpoints(&self, pid: Pid) {
        self.breakpoints
            .iter()
            .for_each(|breakpoint| breakpoint.insert(pid).unwrap())
    }
}

fn process_user_instruction(
    pid: Pid,
    user_instruction: &UserInstruction,
    context: &mut Context,
) -> Result<(), ProcessingError> {
    match user_instruction {
        UserInstruction::ContinueUntilBreakpoint => {
            context.apply_breakpoints(pid);
            ptrace::cont(pid, None)?;
        }
        UserInstruction::ContinueUntilSyscall => {
            context.apply_breakpoints(pid);
            ptrace::syscall(pid, None)?;
        }
        /* ... */
    }
}

When the process stops, we need to remove all breakpoints so we can inspect the memory peacefully.

impl Context {
    /* ... */
    fn remove_breakpoints(&self, pid: Pid) {
        self.breakpoints
            .iter()
            .for_each(|breakpoint| breakpoint.remove(pid).unwrap())
    }
}

pub fn run(child: Pid) -> () {
    /* ... */
    let wait_result = waitpid(child, None).expect("Failed to wait");
    match wait_result {
        WaitStatus::Exited(child, status) => {
            println!("Child {child} exited with status {status}, quitting...");
            break;
        }
        WaitStatus::Stopped(_child, Signal::SIGTRAP) => {
            context.remove_breakpoints(child); // Remove breakpoints so we can inspect memory without seeing them
            // We need to check if we stopped on a breakpoint, and restore it in this case
            let restored_breakpoint = restore_breakpoint_if_needed(child, &context)
                .expect("Failed to check for breakpoints");
            if let Some(breakpoint) = restored_breakpoint {
                println!("Hit breakpoint at 0x{:x}", breakpoint.address)
            }
            continue;
        }
        wait_status => {
            context.remove_breakpoints(child); // Remove breakpoints so we can inspect memory without seeing them
            println!("{wait_status:?}");
            continue;
        }
    }
}

Now, we can create a function that restores the state of the process after a SIGTRAP. It first detects whether we hit a breakpoint: int3 is a one-byte instruction, so after it executes, rip points one byte past the breakpoint address. In that case, we decrement rip and restore the instruction to what it should be.

fn restore_breakpoint_if_needed(pid: Pid, context: &Context) -> Result<Option<&Breakpoint>, Errno> {
    let mut regs = ptrace::getregs(pid)?;
    let previous_rip = regs.rip - 1;
    match context.breakpoints.iter().find(|breakpoint| breakpoint.address == previous_rip) {
        Some(breakpoint) => {
            breakpoint.remove(pid)?;
            regs.rip = previous_rip;
            ptrace::setregs(pid, regs)?; // Restore rip as it was
            Ok(Some(breakpoint))
        }
        None => Ok(None)
    }
}

/** some code **/
let wait_result = waitpid(child, None).expect("Failed to wait");
match wait_result {
    WaitStatus::Exited(child, status) => {
        println!("Child {child} exited with status {status}, quitting...");
        break;
    }
    WaitStatus::Stopped(_child, Signal::SIGTRAP) => {
        // We need to check if we stopped on a breakpoint, and restore it in this case
        let restored_breakpoint = restore_breakpoint_if_needed(child, &context).expect("Failed to check for breakpoints");
        if let Some(breakpoint) = restored_breakpoint {
            println!("Hit breakpoint at 0x{:x}", breakpoint.address)
        }
        continue;
    }
    wait_status => {
        println!("{wait_status:?}");
        continue;
    }
}
/** more code **/
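
The detection step boils down to a lookup on rip - 1. Here is a minimal sketch of that lookup on plain integers, with no ptrace involved. The names breakpoint_hit and INT3_LEN are assumptions made for this example.

```rust
/// Length in bytes of the int3 instruction (opcode 0xcc).
const INT3_LEN: u64 = 1;

/// Given the value of rip reported after a SIGTRAP and the addresses of the
/// known breakpoints, return the address of the breakpoint that was hit,
/// or None if the trap did not come from one of our breakpoints.
fn breakpoint_hit(rip_after_trap: u64, breakpoint_addresses: &[u64]) -> Option<u64> {
    // After int3 executes, rip points just past it, so step back one byte.
    let candidate = rip_after_trap.checked_sub(INT3_LEN)?;
    breakpoint_addresses
        .iter()
        .copied()
        .find(|&address| address == candidate)
}
```

Using checked_sub instead of a plain subtraction is a small design choice: a rip of 0 is not expected in practice, but this way the lookup returns None instead of panicking on underflow in a debug build.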

That’s cool, but after hitting the breakpoint, we restored the original opcode. So if we had placed our breakpoint within a loop or a function, we would only stop the first time we encountered it. We also need to set the breakpoint again once execution has moved past the instruction. So, once the breakpoint is removed, we single-step the program to execute the original instruction, put the breakpoint back, and then continue.

fn process_user_instruction(
    pid: Pid,
    user_instruction: &UserInstruction,
    context: &mut Context,
) -> Result<(), ProcessingError> {
    match user_instruction {
        UserInstruction::ContinueUntilBreakpoint => {
            ptrace::step(pid, None)?;
            let _ = waitpid(pid, None).expect("Failed to wait");
            context.apply_breakpoints(pid);
            ptrace::cont(pid, None)?;
        }
        UserInstruction::ContinueUntilSyscall => {
            ptrace::step(pid, None)?;
            let _ = waitpid(pid, None).expect("Failed to wait");
            context.apply_breakpoints(pid);
            ptrace::syscall(pid, None)?;
        }
        /* ... */
    }
}
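
To see why the single step matters, here is a toy model rather than real ptrace code. ToyTracee, count_hits, and the machine itself (every instruction is one byte, and execution wraps around like an endless loop) are all assumptions made for illustration.

```rust
const INT3: u8 = 0xcc;
const NOP: u8 = 0x90; // stand-in for "any one-byte instruction"

/// Toy tracee: every instruction is one byte, and execution wraps around at
/// the end of `memory`, so the whole program behaves like an endless loop.
struct ToyTracee {
    memory: Vec<u8>,
    rip: usize,
}

impl ToyTracee {
    /// Execute one instruction. Returns true if it was an int3 (a trap).
    fn step(&mut self) -> bool {
        let opcode = self.memory[self.rip];
        self.rip = (self.rip + 1) % self.memory.len();
        opcode == INT3
    }

    /// Run until a trap, giving up after `max_steps` instructions.
    fn cont(&mut self, max_steps: usize) -> bool {
        (0..max_steps).any(|_| self.step())
    }
}

/// Resume the toy tracee `resumes` times and count how often the breakpoint
/// at `address` is hit. With `rearm` set, each resume first steps over the
/// original instruction and then puts the int3 back.
fn count_hits(address: usize, resumes: usize, rearm: bool) -> usize {
    let mut tracee = ToyTracee { memory: vec![NOP; 4], rip: 0 };
    let previous_byte = tracee.memory[address];
    tracee.memory[address] = INT3; // arm once before the first resume
    let mut hits = 0;
    for _ in 0..resumes {
        if rearm {
            tracee.memory[address] = previous_byte; // make sure the opcode is restored
            tracee.step(); // execute one original instruction
            tracee.memory[address] = INT3; // re-arm the breakpoint
        }
        if tracee.cont(100) {
            hits += 1;
            tracee.memory[address] = previous_byte; // restore the opcode
            tracee.rip = address; // rewind rip onto the restored instruction
        }
    }
    hits
}
```

In this model, count_hits(2, 3, true) reports 3 hits, one per resume, while count_hits(2, 3, false) reports only 1: without the step and re-insert, the breakpoint disappears after its first hit.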

Trying our debugger

We can now try this out.

$ cargo run
Process started
(drs) r
rax: 0x0
orig_rax: 0x3b
rcx: 0x0
rdx: 0x0
rsi: 0x0
rdi: 0x0
rip: 0x7f68afaa4200
rsp: 0x7ffd7133f410
r15: 0x0
r14: 0x0
(drs) b 0x7f68afaa4203
(drs) c
Hit breakpoint at 0x7f68afaa4203
(drs) r
rax: 0x0
orig_rax: 0xffffffffffffffff
rcx: 0x0
rdx: 0x0
rsi: 0x0
rdi: 0x7ffd7133f410
rip: 0x7f68afaa4203
rsp: 0x7ffd7133f410
r15: 0x0
r14: 0x0
(drs) m 0x7f68afaa4203
0xc4894900000c98e8
(drs) c
Hello, world!
Child 51604 exited with status 0, quitting...

Here we set a breakpoint and inspect the memory after hitting it. The breakpoint does not show up in memory, so its usage is transparent. When inspecting registers after hitting the breakpoint, we see that rip has been decremented back to the breakpoint address, and if we resume execution, the program runs successfully.
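
The word printed by the m command is easier to read once it is split into bytes. x86-64 is little-endian, so the least significant byte of the word is the one stored at the address we asked for. A minimal sketch, where bytes_in_memory_order is a hypothetical helper name:

```rust
/// Split a 64-bit word, as printed by the memory command, into bytes in the
/// order they sit in memory on little-endian x86-64 (lowest address first).
fn bytes_in_memory_order(word: u64) -> [u8; 8] {
    word.to_le_bytes()
}
```

For 0xc4894900000c98e8, the first byte is 0xe8, the opcode of a near relative call on x86-64. If the breakpoint were still in memory, we would see 0xcc there instead.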

Cool, we have a somewhat-working debugger! There is one small issue remaining, but we will just mention it without acting on it.

Currently, we trust the user to put the breakpoint at the beginning of an instruction (that is, the user must provide the address of an opcode when adding a breakpoint). If we had a disassembler, we could disassemble the program and snap the breakpoint to the closest instruction start. But let’s keep it as it is for now, and trust the user.
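
As a rough idea of what that check could look like, the sketch below assumes we already had a sorted list of instruction start addresses from some disassembler. Our debugger does not have one, so snap_to_instruction_start and its input are purely hypothetical.

```rust
/// Return the nearest instruction start at or below `address`, given a
/// sorted list of instruction start addresses (hypothetical disassembler
/// output). Returns None if `address` lies before the first known
/// instruction. The caller is expected to pass an address inside the
/// disassembled range.
fn snap_to_instruction_start(sorted_starts: &[u64], address: u64) -> Option<u64> {
    match sorted_starts.binary_search(&address) {
        // `address` is already the start of an instruction
        Ok(index) => Some(sorted_starts[index]),
        // `address` is below every known instruction
        Err(0) => None,
        // `address` falls after the instruction starting at index - 1
        Err(index) => Some(sorted_starts[index - 1]),
    }
}
```

With illustrative starts at 0x1000, 0x1003 and 0x1008, an address of 0x1005 snaps back to 0x1003, while 0x1003 is returned unchanged.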

Now, it’s sometimes cool to be able to debug your assembly, especially if you are working on low-level code, but most software developers want to place breakpoints at certain lines of the source code, not at certain instructions. They want to read the value of a variable, not some place in memory.
In the next article of the series, we will look at how to do that. Until then, take care and see you soon!

Code for this series is available on GitHub.

2023-10-04

https://www.0xatticus.com/posts/debugger/debugger_pt2/ 0xAtticus