CommWarrior.B Thorough IDB (ARM/C++)

This was originally posted on January 3rd, 2008 on OpenRCE.

This is the IDB for a nasty little SymbianOS worm that I reverse engineered in February of 2006.  The project was more difficult than most in several respects.  I'd only ever done one ARM project before this, and so I found myself referencing the ARM documentation.  I had no familiarity with the SymbianOS API, which turns out to be object-oriented from start to finish.  Apart from that, the author made extensive use of the object-oriented features of C++ in his non-API-related code; the project was the most intensely object-oriented one that I had done up until that time.  Plus, this excellent document on SymbianOS reversing had not been released yet.  I also did not have access to hardware upon which to run the worm, and so the project had to be conducted purely statically.  Finally, I had never used a mobile phone before and was unfamiliar with all of this fancy SMS and Bluetooth stuff -- yeah, I'm a Luddite.

I also did a decompilation for this, but I think that releasing it would do more harm than good.  Mobile phone worms are lame, and the world does not need more of them.

Make sure to check out the database notepad.  Enjoy!

ProcDump 1.62 Thorough IDB

Originally published October 7, 2007 on OpenRCE.

After some deliberation, I have decided to release my thorough IDB for ProcDump 1.62 Final, which is substantially more detailed than the original ASM source code itself.  If you care to study it, you can learn a great deal about coding dynamic reversing tools and static reversing.

At the time I analyzed this, in late 2003, it was the largest binary that I'd attempted.  My analysis style was somewhat immature and sporadic, and so you shouldn't try to emulate anything you see inside of it.  (It took another six months after this to perfect my static technique.)

I hope that the ProcDump authors aren't upset about this; after all, ProcDump is nine years old and has since been succeeded by ImpRec, OllyDump, NTICEDUMP, etc.  Greets to the ProcDump team, and thanks for their valuable contribution (which ultimately shaped the direction of dynamic tools for years to come).

IDA's IDS Files

Originally published June 7, 2007 on OpenRCE.

This topic comes up occasionally, so it's worth a quick investigation.  Your IDA directory has a subdirectory called 'ids' that contains more directories, which in turn contain .IDS files.  .IDS files do two things:  they define a mapping between ordinal numbers and symbol names (which may be mangled, and may contain the number of function arguments and their types), and secondly they allow (optional) comments for those functions.

The IDSUTIL package from Hex-Rays' website (only available to customers) provides tools to create .IDT files from statically-linked libraries and then to convert those into .IDS files.  .IDT files are flat text files whose syntax is described in the readme.txt inside of the IDSUTIL package.

The 'ar2idt' tool produces an .IDT file from a .LIB.  Its command-line syntax is "ar2idt [filename].[lib/obj/o/etc.]" to produce [filename].IDT.  This tool supports several different object-file formats, as different compiler vendors use different ones.

Here's a sample from an .IDT file:

0 Name=MSGS.DLL
1 Name=??0CBaseMtm@@IAE@AAVCRegisteredMtmDll@@AAVCMsvSession@@@Z
2 Name=??0CBaseServerMtm@@IAE@AAVCRegisteredMtmDll@@PAVCMsvServerEntry@@@Z
3 Name=??0CMsgActive@@IAE@H@Z
4 Name=??0CMsvDefaultServices@@QAE@XZ
5 Name=??0CMsvEntrySelection@@QAE@XZ
313 Name=??0CMsvFindOperation@@IAE@AAVCMsvSession@@ABVTDesC16@@IAAVTRequestStatus@@@Z
314 Name=??0CMsvFindResultSelection@@QAE@XZ
6 Name=??0CMsvOperation@@QAE@AAVCMsvSession@@HAAVTRequestStatus@@@Z
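
For sanity-checking a generated .IDT file before handing it to zipids, a few lines of Python are enough to read the simple "ordinal Name=symbol" lines shown above back into a table.  This is only a sketch:  the function name parse_idt_names is made up for illustration, treating ordinal 0 as the library name is inferred from the sample, and any other syntax described in the IDSUTIL readme is simply skipped.

```python
def parse_idt_names(idt_text):
    """Read 'ordinal Name=symbol' lines into (library_name, {ordinal: symbol}).

    Hypothetical helper: only the simple line form shown in the sample is
    understood; every other line is ignored.
    """
    library = None
    names = {}
    for line in idt_text.splitlines():
        ordinal, sep, rest = line.strip().partition(" ")
        if not sep or not ordinal.isdigit() or not rest.startswith("Name="):
            continue  # not a plain 'N Name=...' line
        symbol = rest[len("Name="):]
        if int(ordinal) == 0:
            library = symbol  # assumption: ordinal 0 carries the module name
        else:
            names[int(ordinal)] = symbol
    return library, names


sample = """0 Name=MSGS.DLL
3 Name=??0CMsgActive@@IAE@H@Z
313 Name=??0CMsvFindOperation@@IAE@AAVCMsvSession@@ABVTDesC16@@IAAVTRequestStatus@@@Z
"""
library, names = parse_idt_names(sample)
print(library, sorted(names))
```

Keying the table by ordinal makes it easy to spot a missing or duplicated ordinal before the file is compiled.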

After you have an .IDT file, the zipids.exe tool is used to turn an .IDT file into an .IDS file.  Its command-line is simply "zipids [filename].IDT" to create [filename].IDS.

A SymbianOS Example

While reverse engineering a SymbianOS worm in February 2006, I noticed that IDA wouldn't convert some by-ordinal imports from SymbianOS DLLs into their real names:

.idata:00405678 ;
.idata:00405678 ; Imports from PBKENG[101f4cce].DLL
.idata:00405678 ;
.idata:00405678 IMPORT __imp_PBKENG_18 ; DATA XREF: .text:off_404568
.idata:0040567C IMPORT __imp_PBKENG_21 ; DATA XREF: .text:off_4045A8
.idata:00405680 IMPORT __imp_PBKENG_43 ; DATA XREF: .text:off_404518
.idata:00405684 IMPORT __imp_PBKENG_72 ; DATA XREF: .text:off_404588
.idata:00405688 IMPORT __imp_PBKENG_73 ; DATA XREF: .text:off_404578
.idata:0040568C IMPORT __imp_PBKENG_101 ; DATA XREF: .text:off_404528
.idata:00405690 IMPORT __imp_PBKENG_110 ; DATA XREF: .text:off_404538
.idata:00405694 IMPORT __imp_PBKENG_173 ; DATA XREF: .text:off_404508
.idata:00405698 IMPORT __imp_PBKENG_180 ; DATA XREF: .text:off_404548
.idata:0040569C IMPORT __imp_PBKENG_185 ; DATA XREF: .text:off_404558
.idata:004056A0 IMPORT __imp_PBKENG_254 ; DATA XREF: .text:off_404598

I installed the SymbianOS SDK and then came up with a convoluted series of scripts wrapped around the GNU tool suite that would extract the function names and their ordinals from the relevant .LIB, and then create an IDC script that would rename any import-by-ordinal to its real name.  A friend chuckled at this Rube Goldberg-esque contraption and suggested that I use the IDSUTIL package instead.

It couldn't be easier:  just type "ar2idt pbkeng.lib && zipids pbkeng.idt" to produce an .IDS file for the pbkeng.lib static library.  Now inside of IDA, go to File->Load File->IDS File, and select the .IDS file that was created.  Alternatively, you can put this in the %IDA%\ids\epoc6\arm directory to have IDA load it automatically (after a restart).  Here are the results of applying it:

.idata:00405678 ;
.idata:00405678 ; Imports from PBKENG[101f4cce].DLL
.idata:00405678 ;
.idata:00405678 ; CPbkContactItem::CardFields(void)const
.idata:00405678 IMPORT CardFields__C15CPbkContactItem
.idata:00405678 ; DATA XREF: .text:off_404568
.idata:0040567C ; CPbkContactEngine::CloseContactL(long)
.idata:0040567C IMPORT CloseContactL__17CPbkContactEnginel
.idata:0040567C ; DATA XREF: .text:off_4045A8
.idata:00405680 ; CPbkContactEngine::CreateContactIteratorLC(int)
.idata:00405680 IMPORT CreateContactIteratorLC__17CPbkContactEnginei
.idata:00405680 ; DATA XREF: .text:off_404518
.idata:00405684 ; CPbkFieldInfo::FieldId(void)const
.idata:00405684 IMPORT FieldId__C13CPbkFieldInfo
.idata:00405684 ; DATA XREF: .text:off_404588

MFC Example

Let's see how to convert the MFC .DEF file into an .IDS file.  First, here's a snippet from the .DEF file:

; This is a part of the Microsoft Foundation Classes C++ library.
; Copyright (C) 1992-1998 Microsoft Corporation
; All rights reserved.

LIBRARY MFC42 

EXPORTS
DllGetClassObject @ 1 PRIVATE
DllCanUnloadNow @ 2 PRIVATE
DllRegisterServer @ 3 PRIVATE
DllUnregisterServer @ 4 PRIVATE
?classCCachedDataPathProperty@CCachedDataPathProperty@@2UCRuntimeClass@@B @ 5 DATA
?classCDataPathProperty@CDataPathProperty@@2UCRuntimeClass@@B @ 6 DATA
; MFC 4.2(final release)
??0_AFX_CHECKLIST_STATE@@QAE@XZ @ 256 NONAME

We can see that lines starting with a ";" are comments, any line containing the string " @ " is an actual export declaration, and everything else is part of the DEF file structure.  We only want the export declarations.  Let's run a quick sed/awk script on the .DEF file:

sed -e '/^ *;/d' MFC42.def | sed -n -e '/ @ /p' | gawk '{ print $3 " Name="$1 }' > MFC42.idt

The first part of that command erases any comment-lines (those that begin with any number of spaces and then a semi-colon); the second part accepts any line that contains the string " @ "; and the third part converts the results into the .IDT file format.

To complete the job, we need to manually add a line that says "0 Name=MFC42.dll" to the top of the file.  Also, be sure to name the .IDT file the same as the DLL/LIB base name, e.g. mfc42.idt.  As before, we then run zipids on it to produce an .IDS file, which can be loaded into IDA and/or put into the %IDA%\ids directory to have it loaded automatically when appropriate.
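
If sed and gawk aren't handy, the same conversion can be sketched in Python, with the "0 Name=..." header line written out in the same pass.  The function name def_to_idt and its arguments are my own invention for illustration; the filtering rules are just the ones described above (drop comment lines, keep lines containing " @ ", emit the third field as the ordinal and the first as the name).

```python
def def_to_idt(def_text, dll_name):
    """Convert the export lines of a .DEF file into .IDT text.

    Hypothetical helper mirroring the sed/gawk pipeline: comment lines are
    dropped, only lines containing ' @ ' are kept, and each is rewritten as
    '<ordinal> Name=<symbol>'.
    """
    out = ["0 Name=%s" % dll_name]          # header line naming the module
    for line in def_text.splitlines():
        if line.lstrip(" ").startswith(";"):
            continue                        # comment line
        if " @ " not in line:
            continue                        # LIBRARY, EXPORTS, blank lines, etc.
        fields = line.split()
        if len(fields) < 3:
            continue                        # malformed export line
        out.append("%s Name=%s" % (fields[2], fields[0]))
    return "\n".join(out) + "\n"


sample = """; This is a part of the Microsoft Foundation Classes C++ library.
LIBRARY MFC42

EXPORTS
DllGetClassObject @ 1 PRIVATE
??0_AFX_CHECKLIST_STATE@@QAE@XZ @ 256 NONAME
"""
print(def_to_idt(sample, "MFC42.dll"), end="")
```

The resulting text would be saved as mfc42.idt and passed to zipids exactly as before.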

Before applying the .IDS file:

.idata:4BB710DC extrn __imp_MFC42_6467:dword ; DATA XREF: MFC42_6467

Afterwards:

.idata:4BB710DC ; public: __thiscall AFX_MAINTAIN_STATE2::AFX_MAINTAIN_STATE2(class AFX_MODULE_STATE *)
.idata:4BB710DC extrn ??0AFX_MAINTAIN_STATE2@@QAE@PAVAFX_MODULE_STATE@@@Z:dword

Shellcode Analysis

Originally published April 3, 2007 on OpenRCE.

Here is a simple IDA trick that I use for shellcode analysis.  API functions in shellcode are typically looked up dynamically based upon the DLL's base address and a 32-bit hash of the function's name (GetProcAddress via hashing), like so:

seg000:00000268    push    0EC0E4E8Eh ; is actually LoadLibraryA
seg000:0000026D    push    eax
seg000:0000026E    call    sub_29D

The most commonly-seen API hashing function is the following one:

seg000:000002C6 loc_2C6:
seg000:000002C6    lodsb
seg000:000002C7    test    al, al
seg000:000002C9    jz      short loc_2D2
seg000:000002CB    ror     edi, 0Dh
seg000:000002CE    add     edi, eax
seg000:000002D0    jmp     short loc_2C6
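
For reference, the loop above can be transcribed into Python roughly as follows.  This is a sketch that assumes edi starts at zero and that the upper bits of eax are clear when the loop is entered (neither is visible in the excerpt); the names ror32 and ror13_hash are mine.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by `count` bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name):
    """Hash an export name the way the loop above does."""
    h = 0                                   # assumption: edi is zeroed beforehand
    for byte in name.encode("ascii"):       # lodsb; the NUL terminator ends the loop unhashed
        h = ror32(h, 13)                    # ror edi, 0Dh
        h = (h + byte) & 0xFFFFFFFF         # add edi, eax
    return h


print(hex(ror13_hash("LoadLibraryA")))      # prints 0xec0e4e8e
```

The result matches the constant pushed in the first listing, which is a quick way to confirm that a given sample uses this particular hash.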

Since shellcode often makes use of well-known subsets of the Windows API (such as WinExec, CreateProcess, CreateFile, MapViewOfFile, the sockets API, the wininet API, etc), it is often obvious from the context which API functions are being used.  Nevertheless, occasionally you'll have to reverse a hash into an API name, and that can quickly become annoying.

My solution to this is a small Python script, based upon Ero's pefile, that creates an IDC declaration of an IDA enumeration for each DLL.  The enum serves as a mapping between each exported name and its hash.  Since the API hashing function may change, the Python function to do this is extensible via a function pointer which defaults to the standard hash presented above.
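
A minimal sketch of such a generator might look like the following.  It is not the original script:  to stay self-contained it takes a plain list of export names instead of calling pefile, the names make_idc_enum and ror13_hash are hypothetical, and the IDC calls it emits (AddEnum, AddConstEx, the FF_0NUMH flag) are my assumption about the IDC API of that era and should be checked against your idc.idc.

```python
def ror13_hash(name):
    """Default hash: rotate right by 13, then add each byte of the name."""
    h = 0
    for byte in name.encode("ascii"):
        h = ((h >> 13) | (h << 19)) & 0xFFFFFFFF
        h = (h + byte) & 0xFFFFFFFF
    return h


def make_idc_enum(dll_name, export_names, hash_fn=ror13_hash):
    """Return IDC source declaring one enum that maps export names to hashes.

    `hash_fn` is the pluggable part: pass any callable taking a name and
    returning a 32-bit integer to support a different hashing scheme.
    """
    enum_name = "%s_apihashes" % dll_name
    lines = [
        "#include <idc.idc>",
        "static main() {",
        "    auto id;",
        # Assumed IDC call: append a new enum displayed in hex.
        '    id = AddEnum(-1, "%s", FF_0NUMH);' % enum_name,
    ]
    for name in export_names:
        # Assumed IDC call: add one named constant to the enum.
        lines.append('    AddConstEx(id, "%s_%s", 0x%08X, -1);'
                     % (enum_name, name, hash_fn(name)))
    lines.append("}")
    return "\n".join(lines) + "\n"


# In practice the names would come from the DLL's export table (e.g. via pefile).
print(make_idc_enum("kernel32", ["LoadLibraryA", "WinExec"]), end="")
```

Prefixing every member with the enum name keeps the constants unique across DLLs, and it is what yields names of the form kernel32_apihashes_LoadLibraryA.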

After creating the IDC script and loading it into IDA, simply press 'm' with your cursor over the hash value.  IDA will either find the hash in one of the enumerations or tell you that it can't find it, in which case either your hash function implementation is buggy, or the function lies inside of a DLL whose hashed export names are not yet loaded as an enum.  A successful result is:

seg000:00000268    push    kernel32_apihashes_LoadLibraryA
seg000:0000026D    push    eax
seg000:0000026E    call    sub_29D

One nice thing about this method is that IDA will search all loaded enumerations for the hash value.  That is, you don't need to tell IDA which DLL base address is being passed as arg_0:  if it knows the hash, it will tell you both the name of the export and the name of the enumeration that it came from.  Using a named enumeration element eliminates the need for a comment at each API call site.  Another nice thing is that, since the hash presented above is so common, you can create the enumerations once and put them into a big 'shellcode.idc' file and then immediately apply them to any shellcode using this hash (such as some recent in-the-wild ANI exploits, or the HyperUnpackMe2 VM from my most recent OpenRCE article) without missing a beat.

Porting everything to IDAPython and thereby removing the dependency upon the IDC script is left as a simple exercise for the reader.