Table of Contents
Initial Analyisis
Challenge handout files: BIGbadEASYvm
Challenge source files: BIGbadEASYvm
Following files are given to us:
$ ls
BIGbadEASYvm crackme.i
$ file BIGbadEASYvm
BIGbadEASYvm: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/l, for GNU/Linux 3.2.0, stripped
$ file crackme.i
crackme.i: data
On running the binary by itself, it exits warning us of Incorrect usage
.
It can be easily gathered that the binary is meant executed with the crackme.i
(to be given as the first argument) and that it is a VM executing the custom bytecode contained in the crackme.i
file.
$ ./BIGbadEASYvm crackme.i
[*] Setting up
[*] Welcome:
test
When executed properly we are greeted with a welcome message, evidently asking us for the flag. On giving an input the binary seems to hang.
On further analysis of the binary, It can be seen that the binary has been heavily obfuscated using the LLVM Obfuscator.
A deobfuscator is available as a plugin for Binary Ninja, but that is not really required to solve the challenge. The challenge can be solved in the traditional manner by reversing the the VM and writing a parser for the bytecode. This approach can be a bit painful and tiresome if one does not deobfuscate. The challenge source has been made available for anyone who was approaching the challenge this way and wants to have a peek at the source code.
One other aproach is to analyse the bytecode independently and retrive the flag. Although this may seem really difficult at the onset, if the bytecode is simple enough it can be and has been done for other similar CTF challenges.
This writeup on the other hand will illustrate an alternative method of solving the challenge by instrumenting the binary using FRIDA. A common pattern that can be noticed in VM challenges such as this is that it is quite difficult to implement a complex crypto algorithm in custom bytecode. So an instruction count based approach will most likely work as it is very likely that the bytecode is just a series of comparisons.
Before that can be done, we need to have a rough idea of what is happening. A good look at the main function will reveal that the function at 0x00401010
gets called for every opcode parsed from crackme.i
and the global value at the address 0x006ca5b0
gets updated each time, we can safely consider this value the to be the program counter register (lets call it RIP) and the function as a parser for individual instructions.
/* WARNING: Globals starting with '_' overlap smaller symbols at the same address */
void FUN_00401010(undefined8 param_1)
{
uint uVar1;
uVar1 = (uint)((ulong)param_1 >> 0x20);
DAT_006ca584 = ((uint)((ulong)param_1 >> 0x18) ^ 0xffffffff | 0xfffff000) ^ 0xffffffff;
DAT_006ca580 = (int)DAT_006ca584 / 5;
DAT_006ca588 = ((uVar1 ^ 0xffff0fff) & uVar1) >> 0xc;
DAT_006ca58c = ((uVar1 ^ 0xfffff0ff) & uVar1) >> 8;
DAT_006ca590 = ((uVar1 ^ 0xffffffff | 0xffffff0f) ^ 0xffffffff) >> 4;
DAT_006ca594 = ((uint)param_1 ^ 0xffffffff | 0xff000000) ^ 0xffffffff;
DAT_006ca59c = ((uVar1 ^ 0xffffffff | 0xf000ffff) ^ 0xffffffff) >> 0x14;
_DAT_006ca5a0 = ((uVar1 ^ 0xfff0ffff) & uVar1) >> 0x10;
_DAT_006ca5d8 = param_1;
return;
}
The function seems to parse the instruction and store the parts at global locations to be used later while processing the instruction.
This information is enough for us to start writing a script for initial analysis:
import frida
import sys
def on_message(message, data):
print(message['payload'])
session = frida.attach("BIGbadEASYvm")
contents = '''
Interceptor.attach(ptr("0x00401010"), {
onLeave: function(retval) {
send("RIP: "+ptr("0x006ca5b0").readPointer())
}
});
'''
script = session.create_script(contents)
script.on('message', on_message)
script.load()
sys.stdin.read()
This is a simple FRIDA script that intercepts the aforementioned function and prints the RIP. By running this script a few times and analysing the output for a failing case, it becomes obvious that the program goes into an infinite loop because of a failing check that returns RIP to 0x100
from RIP > 0x100
. Now at this point we can try with an input in the flag format inctf{
to see if an instruction count based approach will work, and you will notice that it does!
Scripting
All that remains to be done is to write a script to solve the challenge, this can be done with a gdb-python/angr/IDA script or with any other binary analysis framework of your choice, I decided to go with FRIDA as I had not seen anything like this done in FRIDA before. As evident by the hackish code that follows, there is a reason why, but it was something new and fun so why the hell not!
// process.js
var inscount = 0;
var prev_rip = -1;
var flag = -1;
var g_chr = '-';
Interceptor.attach(ptr("0x00401010"), {
onLeave: function(retval) {
inscount = inscount + 1;
var rip = ptr("0x006ca5b0").readPointer().toInt32();
if(prev_rip > 0x100 && rip == 0x100 && flag == -1) {
send(g_chr + " " + inscount);
flag = 1;
}
prev_rip = rip;
}
});
rpc.exports = {
setchar: function (chr) {
g_chr=chr;
}
};
# instrument.py
from __future__ import print_function
from pwn import *
from time import sleep
import operator
import string
import frida
import sys
context.log_level = "error"
password = ""
charset = string.letters + string.digits + "{}_&!-"
inscount = {}
def on_message(message, data):
context.log_level = "error"
io.close()
char, count = message['payload'].split()
inscount[char] = int(count)
print("Iterating ...")
print("Password: ",end="")
while(1):
for i in charset:
io = process(["./BIGbadEASYvm", "crackme.i"])
io.recvuntil(":\n")
session = frida.attach(io.pid)
contents = open("process.js").read()
script = session.create_script(contents)
script.on('message', on_message)
script.load()
api = script.exports
api.setchar(i)
io.sendline(password + i)
sleep(0.1)
password += max(inscount.iteritems(), key=operator.itemgetter(1))[0]
print(password[-1:],end="")
By fiddling a bit for the final character, you get the flag:
inctf{1_kN0w_1t5_R3411y_3z_&_fuNNy_but_1ts_h0n3st_w0rk!}