Blacksmith (HackTheBox)
Hey all. Today we're going to discuss the retired Blacksmith challenge on HackTheBox. The description on HackTheBox is as follows:
You are the only one who is capable of saving this town and bringing peace upon this land! You found a blacksmith who can create the most powerful weapon in the world! You can find him under the label "./flag.txt".
In this write-up, we will learn about seccomp, writing assembly, and performing syscalls.
Summary
- First looks
- Finding vulnerability primitives
- Developing AMD64 (x86_64) assembly
- Retrieving the flag
First looks
We are given the blacksmith executable binary. Upon running the binary, we are presented with a menu to trade items:
Usually, I start by checking the binary's security using pwntools' checksec. In this case, the security of the blacksmith binary is:
The fields in checksec mean the following:
- Arch: the CPU architecture and instruction set (x86, ARM, MIPS, ...)
- RELRO: Relocation Read-Only - marks relocation data such as the GOT read-only to harden the dynamic linking process
- Stack Canaries: protects against stack buffer overflow attacks
- NX: No eXecute - writable memory cannot be executed
- PIE: Position Independent Executable - the binary can be loaded at a randomized base address
- RWX: Read Write Execute - the binary has memory that is readable, writable, and executable at the same time
The logical conclusion is that we need to write shellcode to the RWX memory to read out flag.txt (based on the challenge description).
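To make two of the checksec fields less abstract, here is a minimal sketch of how PIE and NX can be read straight from the ELF headers. This is not pwntools' implementation; `quick_checksec` is a hypothetical helper name, it only handles little-endian 64-bit ELF files, and it assumes that a missing `PT_GNU_STACK` header means an executable stack.

```python
import struct

ET_DYN = 3                 # e_type used by position-independent executables
PT_GNU_STACK = 0x6474E551  # program header that carries the stack permissions
PF_X = 0x1                 # "executable" permission bit


def quick_checksec(data: bytes) -> dict:
    """Approximate the PIE and NX fields for a little-endian ELF64 image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    (e_type,) = struct.unpack_from("<H", data, 16)
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    nx = False  # assumption: no PT_GNU_STACK header means an executable stack
    for i in range(e_phnum):
        p_type, p_flags = struct.unpack_from("<II", data, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            nx = not (p_flags & PF_X)
    return {"pie": e_type == ET_DYN, "nx": nx}
```

It could be called as `quick_checksec(open("./blacksmith", "rb").read())`. The real checksec inspects more than this (RELRO, canaries, RWX segments), so treat the sketch as an illustration of where the information lives rather than a replacement.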
Finding vulnerability primitives
To start, a vulnerability primitive is a building block of an exploit. A primitive can be bundled with other primitives to achieve a higher impact, like teamwork. An example of primitives working together is as follows:
- an information leak primitive to leak an address
- an arbitrary write primitive to control the execution flow
... which can work together: the write primitive redirects the execution flow to an address obtained through the leak.
Main analysis
When I want to find vulnerability primitives, I open the binary in Ghidra, a reverse engineering tool developed by the NSA (yes, that NSA). I start off analyzing a binary at the main function. In this case, it looked like the following:
So, the main function does the following:
- setup()
- sec()
- shield(), bow() or sword()
In addition to that, the main function uses a stack canary, stored in the variable __can_token. As you can see, if __can_token is not equal to its original value, stack corruption has been detected and hence __stack_chk_fail is called, which aborts the program.
The setup function disables buffering for stdout and stdin, which is standard and hence not interesting. In contrast, the sec function is interesting.
Sec function
We can see that the sec function primarily uses seccomp to create an allow list of the syscalls sys_read, sys_write, sys_open, and sys_exit. (Note that the naming convention for internal syscall functions is a sys_ prefix. When we say sys_read, we mean the syscall read.) By doing this, the developer of the program prevents us from spawning a shell on the server, since we would need sys_execve("/bin/sh", NULL, NULL) for that. Because sys_execve is not on the allow list, we cannot use it. Remember this for later.
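As a rough mental model of the allow list, the sketch below checks a planned sequence of syscalls against the four permitted ones. The syscall numbers are the standard Linux x86_64 ones; `first_blocked` is a hypothetical helper, and the code only models the list itself, not what the real filter does on a violation (that depends on the filter's default action, which is not covered here).

```python
# Linux x86_64 syscall numbers for the calls discussed in this write-up.
SYSCALL_NUMBERS = {"read": 0, "write": 1, "open": 2, "execve": 59, "exit": 60}

# The allow list described above: read, write, open and exit.
ALLOWED = {SYSCALL_NUMBERS[name] for name in ("read", "write", "open", "exit")}


def first_blocked(plan):
    """Return the first syscall name in `plan` that is not allowed, or None."""
    for name in plan:
        if SYSCALL_NUMBERS[name] not in ALLOWED:
            return name
    return None


print(first_blocked(["execve"]))                 # execve
print(first_blocked(["open", "read", "write"]))  # None
```

A shell-spawning payload fails at its very first step, while an open, read, write sequence passes the filter untouched.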
Shield analysis
Furthermore, main() calls shield(), bow() or sword(). The bow() and sword() functions crash the program before a user can give input, which makes them irrelevant. So the vulnerability must be in shield().
What sticks out to me in this function is that we have user input and that the program calls a variable like a function using (*(code *)buf)();. The code (*(code *)buf)(); is equivalent to the ASM below:
The (*(code *)buf)(); function call executes the contents of the buf variable on the stack as machine code. This means we can inject our own assembled code into the program.
Developing AMD64 (x86_64) assembly
We have an arbitrary execution primitive so we need to write an assembly payload. The difficulty with this is that:
- We have 63 bytes to work with
- We can only use sys_read, sys_write, sys_open and sys_exit
- We do not have a stack address (ASLR)
However, the challenge description told us that we need to read the flag.txt file. Hence, the strategy for this payload is opening flag.txt, reading flag.txt into a buffer, and writing the buffer to stdout.
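Before dropping down to assembly, the open, read, write plan can be sketched with Python's thin `os` wrappers. This is only an illustration of the order of operations: `dump_file` is a hypothetical helper, the 64-byte read size is an arbitrary choice, and the wrappers may map to slightly different syscalls under the hood (glibc usually implements `open` via `openat`).

```python
import os


def dump_file(path: str, out_fd: int = 1, size: int = 64) -> int:
    """Open `path`, read up to `size` bytes and write them to `out_fd`."""
    fd = os.open(path, os.O_RDONLY)   # open(path, O_RDONLY) returns a file descriptor
    try:
        data = os.read(fd, size)      # read(fd, buf, size) returns the bytes read
    finally:
        os.close(fd)                  # close is not on the allow list; shellcode can skip it
    return os.write(out_fd, data)     # write(out_fd, buf, n) returns the bytes written
```

Calling `dump_file("./flag.txt")` prints the start of the file to stdout. The shellcode has to do the same three steps, with the file descriptor and byte count passed along in registers instead of Python variables.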
To interact with those files, we need to utilize system calls ("syscalls"). Syscalls are essentially the ABI (binary API) of the Linux kernel, which is like the god of the operating system. The kernel provides memory management, CPU scheduling, driver management, hardware IO, et cetera. If you want to learn more about the kernel, the book "Linux Kernel Development" by Robert Love is an excellent resource (I've read it).
I used a Linux x64 syscall table as a reference for using the syscalls. Essentially the code should do the following:
I came up with the following ASM:
Since we have only 63 bytes to work with, I had to be creative. In assembly, many bytes are spent on constant values: an instruction like mov rax, 2 embeds the constant as a zero-padded immediate (typically the 4-byte 0x00000002, sign-extended to 64 bits) instead of a single byte. That means we can save a lot of bytes by reusing register values.
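To get a feel for the savings, the sketch below compares a few ways of getting a small value into a register. The byte strings are hand-encoded and assumed to match what GNU as (the assembler behind pwntools' `asm`) emits; they are illustrations and not taken from the original payload.

```python
# Three ways to set rax to 2, with their hand-encoded machine code.
SET_RAX_TO_2 = {
    "mov rax, 2":      bytes.fromhex("48c7c002000000"),  # REX.W prefix + 4-byte immediate
    "mov eax, 2":      bytes.fromhex("b802000000"),      # 4-byte immediate, upper half zeroed
    "push 2; pop rax": bytes.fromhex("6a0258"),          # 1-byte immediate via the stack
}

# Reusing a value that already sits in a register is cheaper still.
REUSE_REGISTER = {
    "mov edi, eax": bytes.fromhex("89c7"),  # e.g. pass a returned fd on as first argument
    "xor eax, eax": bytes.fromhex("31c0"),  # zero rax (syscall number 0 is read)
}

for table in (SET_RAX_TO_2, REUSE_REGISTER):
    for text, code in table.items():
        print(f"{text:<16} {len(code)} bytes")
```

Going from 7 bytes to 3 for a single constant adds up quickly when the whole payload has to stay under 63 bytes.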
I eventually refactored the payload into 46 bytes:
Retrieving the flag
Now that we have a working payload, we need to send it to the application. I made the following script using pwntools:
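The general shape of such a script can be sketched as below. This is not the original script: `check_payload` and `send_payload` are hypothetical helpers, and the menu prompts and choices that lead to shield() are left as a parameter because they are not reproduced here. Only `process`, `sendlineafter`, `send` and `recvall` are real pwntools calls.

```python
MAX_PAYLOAD = 63  # size limit stated in this write-up


def check_payload(shellcode: bytes, limit: int = MAX_PAYLOAD) -> bytes:
    """Reject payloads that would not fit in the input buffer."""
    if len(shellcode) > limit:
        raise ValueError(f"payload is {len(shellcode)} bytes, limit is {limit}")
    return shellcode


def send_payload(shellcode: bytes, menu_steps, path: str = "./blacksmith") -> bytes:
    """Start the binary, walk the menu, send the shellcode and return the output.

    `menu_steps` is a list of (prompt, choice) byte-string pairs supplied by
    the caller; the actual prompts and choices are an assumption left open.
    """
    from pwn import process  # third-party: pip install pwntools

    io = process(path)
    for prompt, choice in menu_steps:
        io.sendlineafter(prompt, choice)  # wait for the prompt, then answer it
    io.send(check_payload(shellcode))     # raw bytes, no trailing newline
    return io.recvall(timeout=2)          # whatever the payload wrote to stdout
```

Checking the length before sending catches an oversized payload locally instead of having the binary silently truncate it.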