Reverse Engineering Go, Part II



This post is on how the Ghidra decompiler works, and how to make it work for Go. Part 1

At the time of writing, the most recent major version of Go is Go 1.13 golang.org

The most recent major version of Ghidra is Ghidra 9.11 github.com/NationalSecurityAgency/ghidra/releases

Ghidra

Ghidra is a decompiler recently open sourced by the NSA, and comes with support for C-style and Java binaries. Our goal is to add support to Ghidra for binaries written and compiled with the Go compiler.

This includes supporting Go’s calling convention, its type system, channels, maps, slices, defer and goroutines.

Calling Convention

See the previous article for how the Go calling convention works.

We are looking at how Ghidra decides how many arguments a function has, and when it is called, what local variables to assign as parameters to that function call.

To understand how to get Ghidra to change calling conventions, we need to dive into Ghidra’s decompiler and take a look at how they find parameters.

An interesting tidbit about how Ghidra does decompilation: since the main Ghidra application is a Swing applet written in Java, and the decompiler itself is written in C++, Ghidra starts up a separate process for the decompiler, and then talks to it using XMLRPC. This model means that whenever we want to decompile a program, Ghidra first will RPC registerProgram to the decompiler, and then call decompileFunction(address) to get the decompiler’s output for a specific function.

Ghidra also has a preprocessing step before it decompiles, where it lifts the underlying raw binary instructions into PCode. This allows it to translate it into an internal language (IL), and gives it more portability (and extensibility).

Arguments

Ghidra attempts to detect parameters for a function when you decompile by going down the call-tree and attempting to discover calls through what it knows about calling conventions and different compilers.

Architectures and Compilers

Ghidra also holds some specific information about widely used compilers and architectures to help it decompile.

This is a file found at ghidra/Ghidra/Processors/x86/data/languages/x86gcc.cspec


<?xml version="1.0" encoding="UTF-8"?>
<compiler_spec>
  <data_organization>
     <absolute_max_alignment value="0" />
     <machine_alignment value="2" />
     <default_alignment value="1" />
     <default_pointer_alignment value="4" />
     <pointer_size value="4" />
     <wchar_size value="4" />
     <short_size value="2" />
     <integer_size value="4" />
     <long_size value="4" />
     <long_long_size value="8" />
     <float_size value="4" />
     <double_size value="8" />
     <long_double_size value="12" />
     <size_alignment_map>
          <entry size="1" alignment="1" />
          <entry size="2" alignment="2" />
          <entry size="4" alignment="4" />
          <entry size="8" alignment="4" />
     </size_alignment_map>
  </data_organization>
  <global>
    <range space="ram"/>
    <range space="OTHER"/>
  </global>
  <stackpointer register="ESP" space="ram"/>
  <returnaddress>
    <varnode space="stack" offset="0" size="4"/>
  </returnaddress>
  <default_proto>
    <prototype name="__cdecl" extrapop="4" stackshift="4">
      <input>
        <pentry minsize="1" maxsize="500" align="4">
          <addr offset="4" space="stack"/>
        </pentry>
      </input>
      <output killedbycall="true">
        <pentry minsize="4" maxsize="10" metatype="float" extension="float">
          <register name="ST0"/>
        </pentry>
        <pentry minsize="1" maxsize="4">
          <register name="EAX"/>
        </pentry>
        <pentry minsize="5" maxsize="8">
          <addr space="join" piece1="EDX" piece2="EAX"/>
        </pentry>
      </output>
      <unaffected>
        <register name="ESP"/>
        <register name="EBP"/>
        <register name="ESI"/>
        <register name="EDI"/>
        <register name="EBX"/>
      </unaffected>
      <killedbycall>
        <register name="ECX"/>
        <register name="EDX"/>
        <register name="ST0"/>
        <register name="ST1"/>
      </killedbycall>
      <likelytrash>
        <register name="EAX"/>
      </likelytrash>
    </prototype>
  </default_proto>
  <prototype name="__cdeclf" extrapop="4" stackshift="4">
    <input>
      <pentry minsize="1" maxsize="500" align="4">
        <addr offset="4" space="stack"/>
      </pentry>
    </input>
    <output killedbycall="true">
      <pentry minsize="1" maxsize="10">
        <register name="ST0"/>
      </pentry>
    </output>
    <unaffected>
      <register name="ESP"/>
      <register name="EBP"/>
      <register name="ESI"/>
      <register name="EDI"/>
      <register name="EBX"/>
    </unaffected>
    <killedbycall>
      <register name="ECX"/>
      <register name="EDX"/>
    </killedbycall>
    <likelytrash>
      <register name="EAX"/>
    </likelytrash>
  </prototype>
  <prototype name="__thiscall" extrapop="4" stackshift="4">
    <input>
      <pentry minsize="1" maxsize="500" align="4">
        <addr offset="4" space="stack"/>
      </pentry>
    </input>
    <output killedbycall="true">
      <pentry minsize="4" maxsize="10" metatype="float" extension="float">
        <register name="ST0"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="5" maxsize="8">
        <addr space="join" piece1="EDX" piece2="EAX"/>
      </pentry>
    </output>
    <unaffected>
      <register name="ESP"/>
      <register name="EBP"/>
      <register name="ESI"/>
      <register name="EDI"/>
      <register name="EBX"/>
    </unaffected>
    <killedbycall>
      <register name="ECX"/>
      <register name="EDX"/>
      <register name="ST0"/>
      <register name="ST1"/>
    </killedbycall>
    <likelytrash>
      <register name="EAX"/>
    </likelytrash>
  </prototype>
  <prototype name="__regparm3" extrapop="4" stackshift="4">   <!-- Used particularly by linux kernel -->
    <input>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="EDX"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="ECX"/>
      </pentry>
      <pentry minsize="1" maxsize="500" align="4">
        <addr offset="4" space="stack"/>
      </pentry>
    </input>
    <output killedbycall="true">
      <pentry minsize="4" maxsize="10" metatype="float" extension="float">
        <register name="ST0"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="5" maxsize="8">
        <addr space="join" piece1="EDX" piece2="EAX"/>
      </pentry>
    </output>
    <unaffected>
      <register name="ESP"/>
      <register name="EBP"/>
      <register name="ESI"/>
      <register name="EDI"/>
      <register name="EBX"/>
    </unaffected>
    <killedbycall>
      <register name="ECX"/>
      <register name="EDX"/>
      <register name="ST0"/>
      <register name="ST1"/>
    </killedbycall>
    <likelytrash>
      <register name="EAX"/>
    </likelytrash>
  </prototype>
  <prototype name="__regparm2" extrapop="4" stackshift="4">
    <input>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="EDX"/>
      </pentry>
      <pentry minsize="1" maxsize="500" align="4">
        <addr offset="4" space="stack"/>
      </pentry>
    </input>
    <output killedbycall="true">
      <pentry minsize="4" maxsize="10" metatype="float" extension="float">
        <register name="ST0"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="5" maxsize="8">
        <addr space="join" piece1="EDX" piece2="EAX"/>
      </pentry>
    </output>
    <unaffected>
      <register name="ESP"/>
      <register name="EBP"/>
      <register name="ESI"/>
      <register name="EDI"/>
      <register name="EBX"/>
    </unaffected>
    <killedbycall>
      <register name="ECX"/>
      <register name="EDX"/>
      <register name="ST0"/>
      <register name="ST1"/>
    </killedbycall>
    <likelytrash>
      <register name="EAX"/>
    </likelytrash>
  </prototype>
  <prototype name="__regparm1" extrapop="4" stackshift="4">
    <input>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="1" maxsize="500" align="4">
        <addr offset="4" space="stack"/>
      </pentry>
    </input>
    <output killedbycall="true">
      <pentry minsize="4" maxsize="10" metatype="float" extension="float">
        <register name="ST0"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="5" maxsize="8">
        <addr space="join" piece1="EDX" piece2="EAX"/>
      </pentry>
    </output>
    <unaffected>
      <register name="ESP"/>
      <register name="EBP"/>
      <register name="ESI"/>
      <register name="EDI"/>
      <register name="EBX"/>
    </unaffected>
    <killedbycall>
      <register name="ECX"/>
      <register name="EDX"/>
      <register name="ST0"/>
      <register name="ST1"/>
    </killedbycall>
    <likelytrash>
      <register name="EAX"/>
    </likelytrash>
  </prototype>
  <prototype name="syscall" extrapop="4" stackshift="4">
    <input>
      <pentry minsize="1" maxsize="4">
        <register name="EBX"/>
      </pentry>
       <pentry minsize="1" maxsize="4">
        <register name="ECX"/>
      </pentry>
       <pentry minsize="1" maxsize="4">
        <register name="EDX"/>
      </pentry>
       <pentry minsize="1" maxsize="4">
        <register name="ESI"/>
      </pentry>
       <pentry minsize="1" maxsize="4">
        <register name="EDI"/>
      </pentry>
       <pentry minsize="1" maxsize="4">
        <register name="EBP"/>
      </pentry>
    </input>
    <output killedbycall="true">
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
    </output>
    <unaffected>
      <register name="EBX"/>
      <register name="ECX"/>
      <register name="EDX"/>
      <register name="EBP"/>
      <register name="EDI"/>
      <register name="ESI"/>
      <register name="ESP"/>
      <register name="DF"/>
    </unaffected>
    <killedbycall>
      <register name="EAX"/>
    </killedbycall>
  </prototype>

  <resolveprototype name="__cdecl/__regparm">
    <model name="__cdecl"/>        <!-- The default case -->
    <model name="__regparm3"/>
    <model name="__regparm2"/>
    <model name="__regparm1"/>
  </resolveprototype>
  <eval_current_prototype name="__cdecl/__regparm"/>

  <callfixup name="get_pc_thunk_bx">
    <target name="__i686.get_pc_thunk.bx"/>
    <target name="__x86.get_pc_thunk.bx"/>
    <pcode>
      <body><![CDATA[
      EBX = * ESP;
      ESP = ESP + 4;
      ]]></body>
    </pcode>
  </callfixup>

  <callfixup name="get_pc_thunk_cx">
    <target name="__i686.get_pc_thunk.cx"/>
    <target name="__x86.get_pc_thunk.cx"/>
    <pcode>
      <body><![CDATA[
      ECX = * ESP;
      ESP = ESP + 4;
      ]]></body>
    </pcode>
  </callfixup>

  <callfixup name="get_pc_thunk_dx">
    <target name="__i686.get_pc_thunk.dx"/>
    <target name="__x86.get_pc_thunk.dx"/>
    <pcode>
      <body><![CDATA[
      EDX = * ESP;
      ESP = ESP + 4;
      ]]></body>
    </pcode>
  </callfixup>
</compiler_spec>

Go Decompilation

Before

A basic Go program looks like this:

Compiled will be around 1.9Mb

After

Decompiled using Ghidra, using their pseudo C-like decompiler:

A few issues we can see right off the bat:

main.main() calls main.main() before it returns runtime.morestack_noctxt() is called fmt.Fprintln() has no arguments And there are several offsets and constants being used that are unrecognizable.

This leads us to a couple of well defined goals:

  1. Detecting and restoring arguments

Clearly Ghidra has no idea what the arguments to these function are, so we need to help Ghidra figure that out.

  1. Get rid off Go’s stack fluff

We can tell Go to get rid of the preamble where it checks if each function has enough stack before running the function

The most useful of these seems to be argument detection, so we will proceed with that.

Small Easy Wins

During Auto-Analysis, if you check Ghidra’s Decompiler Parameter ID option, it will attempt to restore parameters for you. As an example:

Go Binary:

And decompiled with the option turned on:

You will notice that there are quite a few more params detected, but it DOES detect 1, 2, 3 being passed in the first function, and 2, 3, 4 passed to the second. This small fix is good enough to detect all of the actual parameters being passed, but it still generates a lot of noise.

Detecting Arguments

Looking at samples of code, we can see that functions pull arguments off the stack using the pattern MOV RAX, [RSP + offset].

We can get the stack offset using the initial SUB RSP, offset.

Once that is calculated, we then run through the rest of the function programmatically using Ghidra’s API, and fetch all of the other arguments.

Ghidra allows us to define stack offset parameters using custom storage and their internal class ParameterImpl, which we can then put into a List, sort by stack offset, and replace the normal parameters with.

This gives us something more reasonable:

The specific script we are going to be walking through can be found here.

RestorePrototype is the critical function:

Listing listing = currentProgram.getListing();
ReferenceManager refMan = currentProgram.getReferenceManager();
AddressSetView addressSetView = function.getBody();

InstructionIterator iter = listing.getInstructions(addressSetView, true);
// Find first SUB RSP
Scalar stackOffset = new Scalar(0, 0L, true);
while(iter.hasNext()) {
    Instruction curr = iter.next();

    if (curr.getMnemonicString().equals("SUB")) {
        Object[] objs = curr.getOpObjects(1);
        if (!(objs[0] instanceof Scalar)) return;
        stackOffset = (Scalar) objs[0];
        break;
    }
}

We initialize our managers at the top, and then use an InstructionIterator to go through each instruction in the function. Once we reach the first SUB, we stop, parse an offset (and quit if we find none).

List<Scalar> arguments = new ArrayList<>();
while(iter.hasNext()) {
    if (monitor.isCancelled()) break;
    Instruction curr = iter.next();

    try {

        Object[] opObjects = curr.getOpObjects(1);
        if (opObjects.length == 0) continue;

        if (opObjects.length < 2) continue; // we need something with RSP+offset

        Object reg = opObjects[0];
        if (!(reg instanceof Register) || !reg.toString().equals("RSP")) continue;

        if (!(opObjects[1] instanceof Scalar)) continue; // RSP + register is hard
        Scalar offset = (Scalar) opObjects[1];

        if (offset.compareTo(stackOffset) > 0) {
//                    printf("%s is reference to argument %d\n", curr.toString(), (offset.getValue() - stackOffset.getValue()) / 8);
            arguments.add(offset);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}

We then parse arguments by iterating over the rest of the instructions and grabbing instructions that reference RSP, and then verify the offset from RSP is greater than the size of our stack.

Then we can add those offenders to our list of arguments to add.

if (arguments.size() > 0) {
  function.setCustomVariableStorage(true);

  List<Variable> parameters = new ArrayList<>();
  for (Scalar off : arguments) {
      try {
          Variable v = new ParameterImpl(String.format("arg%d", (off.getValue() - stackOffset.getValue()) / 8), DataType.DEFAULT, (int)(off.getValue() - stackOffset.getValue()), currentProgram);
          parameters.add(v);
      } catch (InvalidInputException e) {
          e.printStackTrace();
      }
  }

  try {
      function.replaceParameters(
              parameters,
              Function.FunctionUpdateType.CUSTOM_STORAGE,
              true,
              SourceType.ANALYSIS
      );
  } catch (InvalidInputException e) {
      e.printStackTrace();
  } catch (DuplicateNameException e) {
      e.printStackTrace();
  }
}

Ghidra allows us to repace entire lists of arguments, so we use ParameterImpl to insert “stack offset” variables into the currentProgram, and this allows us to mark this function as processed and move on.

Some more internal information

In an attempt to also modify Ghidra’s source to support Go, we took a look at the internals.

Ghidra will decompile each function lazily, so only when you request a function to be decompiled. Each function also has a datatype to keep track of the current state of analysis for that function; FuncData.

Ghidra source has a very long description of FuncData:

The important thing to note here is “recover parameters”.

We want each function to be able to recover a sane prototype from the binary, so we need to figure out how to get Ghidra to detect the called function popping arguments off of the stack.

We also want parent function to be able to tell when arguments are being put onto the stack for a function that is being called.

The FuncData struct also holds these fields (for reference):

The interesting method in Funcdata is FuncData::startProcessing:

This function looks like the entrypoint to actual decompilation (and lifting).

Since its only called in one place, we can jump back and look at the function(or class) that is calling it:

This class lives in a file named Ghidra/Features/Decompiler/src/decompile/cpp/coreaction.hh, and lives with a bunch of other classes that all inherit from action. This file defines a variety of actions that can be taken on Funcdata, and allows each action to be defined and turned on or off at will.

This file defines many different entrypoints for Ghidra to request actions to be taken on the functions. Most of these actions are either setups for Pcode, or optimizations to make the resulting C-like psuedo code look “better”.

Going through each individual action reveals quite a bit about Ghidra internal functionality, but we are interesting in recovering arguments, so a very interesting class is ActionFuncLink:

This gives us access to function subcalls before Ghidra makes any real decisions about arguments or return value (which is also worthy of mention).

Checking out its apply(Funcdata&) definition reveals more about how Ghidra decides to deal with functions:

References

@remco_verhoef wrote a nice Medium artiicle describing how Ghidra interacts with the decompiler https://medium.com/@remco_verhoef/ghidra-decompiler-wireformat-ed9b11a793ec

Ghidra source code at github.com/NationalSecurityAgency/ghidra