# miasm

From cea-sec github

Miasm is a free and open source (GPLv2) reverse engineering framework. Miasm aims to analyze / modify / generate binary programs.

It has a bunch of cool features ranging like static binary analysis, IR representation, unpacking, de-obfuscation etc

I usually like to try out a new tool and have a look at it features if they are later needed in a project or CTFs.

A couple of points to note if you’re new to Symbolic Execution

• Symbolize - mark memory/register as “symbol” - like a mathematical variable. Symbolized memory doesn’t have a value. The solvers use this variable to calculate constraints and solve to get values.
• Concretize - set “actual” value to memory like 0x41 for a byte, -22 for an int, etc.

The general idea of a symbolic execution engine is to let the user select parts of a program that needs to be evaluated, then symbolize parts of the memory and let the engine evaluate expressions for memory to a final state. On this final state, query the constraints solver to solve for selected memory to set condition/s true/false. In this challenge there were some equations that need to be verified for bytes of the flag. The symbolic execution engine should automatically calculate these equations and solve them to get the flag

# Problem

A friend participated in a CTF and got a reverse task which was a trivial solve for angr. While solving that task I thought why not give `miasm` a try.

There was only 1 function of interest - `main` which looked like Each smaller basic block connected with a red line is a classic “Bad Boy”, final one at the bottom right is the “Good Boy”. `flag` is read as `argv` and its length is compared to 9 in the first basic block. Each basic block then had a small equation for consecutive pair of bytes. Solving with `angr` was trivial.

``````p = angr.Project('/tmp/ctf-bin')

base = 0x400000

state.regs.rbp = 0x1200

for i in range(9):
state.mem[flag_addr + i].byte = state.solver.BVS('C_%d' %i, 8)

ex = p.factory.simulation_manager(state)
ex.explore(find=base+0x1454)
if ex.found:
s=""
for i in range(9):
print(s)
``````

yielded the flag

Since I already have knowledge on how the challenge works and have a solution I can give `miasm` a try. The documentation is quite lacking and there are very little number of blogs out including their own. The source is somewhat documented and can be worked with.

My script was heavily based on this

If you want to learn about a tool you should first start by solving small challenges with it.

### Sandbox

To execute a binary within miasm’s env a `sandbox` is needed. It has a `jitter` that jits the elf that can be then “executed”.

``````from miasm.analysis.sandbox import Sandbox_Linux_x86_64
parser = Sandbox_Linux_x86_64.parser()
parser.add_argument("--f", default="/tmp/ctf-bin", required=False) # set a default elf path
options = parser.parse_args()

solution = "........."
options.mimic_env = True
options.command_line = [solution] # set argv for the elf
sb = Sandbox_Linux_x86_64(options.f, options, globals())

sb.jitter.init_run(sb.entry_point)
``````

Here we setup the `sandbox` sb and let it setup the `entry_point`. I have also setup a dummy solution as it goes into `argv`

### Dynamic Symbolic Execution(DSE) with miasm

Based on the emulation we can set up constraints on the memory by hooking functions or setting breakpoints in the code. If nothing is done, concrete values can be used by `miasm`

We need a `DSEPathConstraint` hooked up with our `sandbox` sb. According to its documentation

DSEPathConstraint is Dynamic Symbolic Execution Engine keeping the path constraint. Possible new “solutions” are produced along the path, by inversing concrete path constraint. Thus, a “solution” is a potential initial context leading to a new path.

While evaluating a path we can set constraints that can be solved by the backend `z3` solver to produce new paths. Its similar to `angr`’s simulation manager but we need to symbolize and concretize the memory. It has 3 evaluating techniques - PRODUCE_SOLUTION_CODE_COV, PRODUCE_SOLUTION_BRANCH_COV and PRODUCE_SOLUTION_PATH_COV For this simple and small CFG all 3 performed same.

So we setup the necessary DSE variables and attach the `jitter` with the `dse` instance. A `Machine` is a generic wrapper over all architectures supported by `miasm` - x86/64, arm, ppc etc.

``````from miasm.analysis.dse import DSEPathConstraint
from miasm.analysis.machine import Machine

machine = Machine("x86_64")
dse = DSEPathConstraint(
machine, produce_solution=DSEPathConstraint.PRODUCE_SOLUTION_BRANCH_COV)

dse.attach(sb.jitter)
dse.update_state_from_concrete()
sb.run()
``````

`update_state_from_concrete` updates the values from the CPU, so the symbolic execution will be completely concrete from that point.

If the script fails upto running here change `dse` with an instance of `DSEEngine`

``````dse = DSEEngine(
machine)
``````

### Hooking Functions

if we run the script we have now we might have an error like

``````RuntimeError: Symbolic stub 'b'xxx___libc_start_main_symb'' not found
``````

This means that `__libc_start_main` which is a part of the libc will have to be implemented in the miasm env. This can be copied from a similar blog All the functions which are dynamically linked have to be implemented in the script as `xxx_<function_name>_symb` for symbolic execution.

``````def xxx___libc_start_main_symb(dse):
# ['RDI', 'RSI', 'RDX', 'RCX', 'R8', 'R9']
regs = dse.ir_arch.arch.regs
top_stack = dse.eval_expr(regs.RSP) # read registers to replicate while returning
argc = dse.eval_expr(regs.RSI)
argv = dse.eval_expr(regs.RDX)

dse.update_state({
regs.RDI: argc,
regs.RSI: argv,
})
``````

This updates the dse as if `__libc_start_main` has been executed. This needs to be implemented before

``````dse.add_lib_handler(sb.libs, globals())
``````

We also know that the paths have a “Bad Boy” path which has termination with a call to `puts` followed by `exit`. We will hook them too and setup such that we can find what paths to not finish on.

``````class FinishOn(Exception):

def __init__(self, string):
self.string = string
super(FinishOn, self).__init__()

def xxx_exit_symb(dse):
raise FinishOn("Fail")

from miasm.os_dep.win_api_x86_32 import get_win_str_a

def xxx_puts_symb(dse):
string = get_win_str_a(dse.jitter, dse.jitter.cpu.RDI)
raise FinishOn(string)
``````

Next error that we hit is

``````RuntimeError: Symbolic stub 'b'xxx_strlen_symb'' not found
``````

`strlen` is called to verify if the length of the string is 9. This will be an apt location to set the string as symbolic and setup constraints that can be later solved.

``````from miasm.expression.expression import *

str_ptr = 0
curr = b""
expr_map = dict()

strln = ExprId("strln", 64) # symbolize string length
z3_strln = dse.z3_trans.from_expr(strln)

def xxx_strlen_symb(dse):
global str_ptr
regs = dse.ir_arch.arch.regs
ptr = dse.eval_expr(regs.RDI) # read string pointer
str_ptr = int(ptr) # save string pointer to inspect later
ret_value = strln
update = {}

# symbolize the string buffer as bytes
for i, content in enumerate(curr):
addr = dse.symb.expr_simp(ptr + ExprInt(i, ptr.size))
expr_map[i] = ExprId("C_%d" % (i), 8) # save expressions so that we can query the solver for these

dse.cur_solver.add(0x2f < dse.z3_trans.from_expr(expr_map[i])) # should be printable ascii alphanumeric
# update the state and emulate strlen call
dse.update_state({
regs.RSP: dse.eval_expr(regs.RSP + ExprInt(8, regs.RSP.size)),
regs.RAX: ret_value,
})
dse.update_state(update)
``````

Here we do a couple of things

• Symbolize the string length as the return values in `rax` when the function returns
• Set limits on the length as we already know
• Symbolize the bytes of the string so that constraints can be calculated by the z3 solver in `DSEPathConstraint`
• Add constraints on bytes to be alpha numeric(optional)

This serves as the init logic for our solver. `update_state` takes a map of `expression` to `expression` instead of integer values.

### Path exploration

With all the needed functions hooked, we can proceed to evaluate the paths. For quicker emulation we can take a snapshot of the `dse` and restore it when needed from init. We then solve constraints from the last path, concretize the values from the solution and add it to a queue which can be then precessed.

``````dse.add_lib_handler(sb.libs, globals())
snapshot = dse.take_snapshot()
found = False
curr = b""
todo = set([b""])

while todo:
flag = todo.pop() # take 1 input to process
curr = flag # set curr so that xxx_strlen_symb can iterate over it
dse.restore_snapshot(snapshot, keep_known_solutions=True) # restore to "init" snapshot
# Concretize the known bytes from curr
if str_ptr:
for idx, content in enumerate(curr):
currb = sb.jitter.vm.get_mem(int(str_ptr) + idx, 1)
# . was the original value set in argv
if currb == b'.' and content > 0x30:
# update if a better solution was found
sb.jitter.vm.set_mem(int(str_ptr) + idx, bytes([content]))
# read the updated bytes of the string
print("Done {}".format(sb.jitter.vm.get_mem(int(str_ptr), 9)))
try:
sb.run()
# puts and exit trigger a FinishOn which also has good boy/bad boy string
except FinishOn as finish_info:
print(finish_info.string)
if "Level Unlocked" in finish_info.string:
found = True
break
# iterate over all solutions and query the expr_map expressions for new candidate string bytes
for sol_ident, model in viewitems(dse.new_solutions):
candidate = []
ln = max(model.eval(dse.z3_trans.from_expr(strln)).as_long(), 9)
for i in range(ln):
try:
candidate.append(int_to_byte(model.eval(
dse.z3_trans.from_expr(expr_map[i])).as_long()))
except (KeyError, AttributeError) as _:
candidate.append(b"\x00")
# add the new candidate to queue
Once a solution is found for the 2 paths from a basic block, constraints on expressions from `expr_map` - bytes of “flag” string can be solved and concretized. Since the checks are progressive, this will eventually solve all the constraints upto the good boy block - Level Unlocked