As with the QuickScan challenge, my approach on FollowthePath also differs from the original authors solution. Thus another writeup had to be done!
The challenge presents itself as a relatively simple binary. The main function (starting at 140001960
) is shown in the screenshot below.
I have named some of the easily identifiable functions already. First we are promted to enter a flag which gets read by the appropriately named get_input
function at 14000199f
. After that, some variables are set up. Most notably the functions printing “Correct” and “Nope” are put into registers r11
and r10
respectively. So any jump to r10
indicates a wrong input. Furthermore the input is stored in the register r12
(this in not immediately obvious from the disassembly, so just trust me on this one).
Now lets investigate the assembly block starting at 140001000
. A byte from our input is read and then a xor
and cmp
are performed. To calculate the necessary input we can just do 0x8c ^ 0xc4
which yields 0x48
and thus an H
in ASCII1.
If this check fails a jump to r10
is taken (remember, r10
stores the pointer to the function printing “Nope”) thus this is not a desireable path to follow. If the check is passed, a jump to 14000101e
is taken. This assembly block increases rcx by one (which used to index the input array), loads a pointer to 140001039
into r8 and sets rdx to zero.
The next block starting at 14000102b
loops 0x39 times and xors the data pointed to by r8 with 0xde. The next assembly block starting at 140001039
looks like gargabe instruction and makes little sense. But the address coincides with the pointer previously loaded into r8 (before the xor happened). Thus we are dealing with self modifying code.
from unicorn import *
from unicorn.x86_const import *
import capstone
# memory address where emulation starts
ADDRESS = 0x140001000
STACK_ADDR = 0x2000000
STACK_SIZE = 0x10000
# i snipped the payload, its basically an escaped ascii string from
# everything between 140001000 and 140001960
payload = b"M1[...]x01"
# small macro to disassemble opcodes
def disassemble(code, addr):
cs = capstone.Cs(capstone.CS_ARCH_X86, capstone.CS_MODE_64 + capstone.CS_MODE_LITTLE_ENDIAN)
for i in cs.disasm(code, addr):
return i
# custom code hook
def hook_code(emu, address, size, user_data):
code = emu.mem_read(address, size)
insn = disassemble(code, address)
print(">>> {:#x}: {:s} {:s}".format(insn.address, insn.mnemonic, insn.op_str))
print(f"r8 = {hex(emu.reg_read(UC_X86_REG_R8))}, rdx = {hex(emu.reg_read(UC_X86_REG_RDX))}")
if insn.mnemonic == "xor" and "r8, r8" in insn.op_str:
user_data.trace_instructions = True
if user_data.trace_instructions == True:
print(">>> {:#x}: {:s} {:s}".format(insn.address, insn.mnemonic, insn.op_str))
if insn.mnemonic == "xor" and "r8, r8" not in insn.op_str:
user_data.xor_key = int(insn.op_str.split(", ")[1],16)
if insn.mnemonic == "cmp":
cmp_val = int(insn.op_str.split(", ")[1],16)
emu.reg_write(UC_X86_REG_R8, cmp_val)
user_data.flag[user_data.pos] = user_data.xor_key ^ cmp_val
user_data.pos += 1
if insn.mnemonic == "je":
user_data.trace_instructions = False
print(b"".join(c.to_bytes(1,"little") for c in user_data.flag))
return
class SharedData:
def __init__(self):
self.flag = [0] * 0x40
self.trace_instructions = False
self.xor_key = 0
self.pos = 0
mu = Uc(UC_ARCH_X86, UC_MODE_64)
# map memory
mu.mem_map(ADDRESS, 2 * 1024 * 1024)
mu.mem_map(STACK_ADDR, STACK_SIZE)
# store payload in allocated region
mu.mem_write(ADDRESS, payload)
# Write a fake flag to the "stack"
mu.mem_write(STACK_ADDR, b"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA")
# initialize registers. r12 contains flag location, rcs is zero and RIP is set to the start of the payload
mu.reg_write(UC_X86_REG_R12, STACK_ADDR)
mu.reg_write(UC_X86_REG_RCX, 0)
mu.reg_write(UC_X86_REG_RIP, 0x140001000)
# instantiate the shared data struct
user_data = SharedData()
# add our code hook
mu.hook_add(UC_HOOK_CODE, hook_code, user_data)
# this try is kinda hacky, but it works
try:
mu.emu_start(ADDRESS, len(payload))
except UcError as e:
print("".join(chr(c) for c in user_data.flag))
For anyone confused why this is the case, the math is quite simple. I will denote the input as a char array, thus the formula in pseudo c is input[0] ^ 0xc4 = 0x8c
. This formula can be rearranged (due to the XOR operation being associative and commutative) to input[0] = 0x8c ^ 0xc4
. ↩︎