workings of disassembler

Started by FlySky, May 04, 2013, 05:15:39 PM

FlySky

Hey guys,

I am thinking about writing a disassembler to play more with assembly.
Here is what I was wondering, using IDA output as an example:

.text:00CD28B0 55                                                           push    ebp
.text:00CD28B1 8B EC                                                      mov     ebp, esp

If I use CreateFileMapping to map a file into memory, is it possible to find references to the function start?

The above portion is called from:
.text:0062CA6B E8 40 5E 6A 00                                               call    sub_CD28B0

But what if I only have the function start? How would I go about finding the caller, without breakpointing the code of course?
It's something that has kept me busy for a while now, and I am wondering how it works.


qWord

Quote from: FlySky on May 04, 2013, 05:15:39 PM
But what if I only have the function start? How would I go about finding the caller, without breakpointing the code of course?
Simple in theory: you take the entry point (and the export directory, if available) and start an analysis in which you follow all jumps and calls where possible. The problem is code that gets or calculates its addresses dynamically.

dedndave

as qWord says, you cannot depend on finding all of them
the vectors for routines might be in a table, for example

also, just because the numbers add up, doesn't mean it's a real call
there might be other code or data that appears to call a certain address
in practice, you might find it rare, though

having said that, you can calculate the near displacement for each address and see if there is a call instruction
near relative calls in 32-bit code are 5 bytes long:

00402B00:
E8DE000000        call    sub_00402BE3
NextEip:
00402B05:


E8h is the call opcode, and the operand is 000000DEh (stored little-endian as DE 00 00 00)
DEh is calculated as the address of the called routine, minus the address of the return
00402BE3 - 00402B05 = 000000DE
notice that the operand is signed: it is negative if the routine's address is lower than the return address   :P
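
A small sketch of the displacement arithmetic described above, written in Python rather than MASM for brevity. The helper names `call_target` and `encode_call` are made up for illustration; the addresses and bytes are the two calls quoted in this thread, and the sketch assumes 32-bit code with 5-byte E8 calls only.

```python
import struct

def call_target(call_addr: int, insn: bytes) -> int:
    """Destination of a 5-byte E8 rel32 call located at call_addr."""
    if len(insn) != 5 or insn[0] != 0xE8:
        raise ValueError("not a 5-byte near relative call")
    # The operand is a signed 32-bit displacement, stored little-endian.
    (disp,) = struct.unpack("<i", insn[1:])
    next_eip = call_addr + 5              # return address
    return (next_eip + disp) & 0xFFFFFFFF

def encode_call(call_addr: int, target: int) -> bytes:
    """Bytes of an E8 call placed at call_addr that reaches target."""
    disp = (target - (call_addr + 5)) & 0xFFFFFFFF   # two's complement wrap
    return b"\xE8" + struct.pack("<I", disp)

# Both calls quoted in the thread:
print(hex(call_target(0x00402B00, bytes.fromhex("E8DE000000"))))  # 0x402be3
print(hex(call_target(0x0062CA6B, bytes.fromhex("E8405E6A00"))))  # 0xcd28b0
# A backward call: the target lies below the return address,
# so the operand comes out negative (18 FF FF FF = -E8h).
print(encode_call(0x00402BE3, 0x00402B00).hex())                  # e818ffffff
```

Going the other way (checking whether a given 5-byte sequence calls a known function start) is just a comparison of `call_target(...)` against that start address.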

Tedd

You're thinking about it backwards. You don't want to find all the callers of a given function; you want to find all calls to all functions. Some of those will be to the same function, so then you can group them together.
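
A rough sketch of the "collect every call, then group by target" idea, in Python for brevity. It is only a byte scan for E8 opcodes, not a real disassembly, so it can report false hits when an E8 byte is really data or part of another instruction. The function name `collect_call_xrefs`, the base address 00401000h and the sample bytes are all made up for illustration.

```python
import struct
from collections import defaultdict

def collect_call_xrefs(code: bytes, base: int) -> dict:
    """Map each candidate call target to the addresses of the E8 bytes pointing at it."""
    end = base + len(code)
    xrefs = defaultdict(list)
    for off in range(len(code) - 4):          # need 5 bytes for E8 + rel32
        if code[off] != 0xE8:
            continue
        (disp,) = struct.unpack_from("<i", code, off + 1)
        target = base + off + 5 + disp        # return address + displacement
        # Keep only targets inside this block of code; anything else is
        # more likely a stray E8 byte than a real call.
        if base <= target < end:
            xrefs[target].append(base + off)
    return dict(xrefs)

# Hypothetical code block: two calls to the same small function.
sample = bytes.fromhex(
    "E80B000000"      # 00401000: call 00401010
    "E806000000"      # 00401005: call 00401010
    "C3"              # 0040100A: ret
    "9090909090"      # padding
    "558BEC5DC3"      # 00401010: push ebp / mov ebp, esp / pop ebp / ret
)
for target, callers in sorted(collect_call_xrefs(sample, 0x00401000).items()):
    print(f"sub_{target:08X} called from", ", ".join(f"{c:08X}" for c in callers))
```

Once the calls are grouped this way, looking up the callers of one function start is a single dictionary lookup.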

Anyway, before you get to that point, you have a lot more to do :P

Before you even start the disassembly, you'll need to decode the PE format. Map the exe's sections, calculate their addresses, etc. By then you'll have the entry point, so that's where you start decoding instructions.
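
A minimal sketch of that first step, in Python for brevity: it reads the headers of a 32-bit PE image held in memory, lists the sections, and finds the entry point. The field offsets follow the published PE/COFF layout; the names `parse_pe32` and `rva_to_offset` are made up for this example, and PE32+ (64-bit) images, imports and exports are deliberately left out.

```python
import struct

def parse_pe32(data: bytes) -> dict:
    """Return image base, entry point and section table of a PE32 image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)   # offset of the PE header
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff = e_lfanew + 4
    # COFF file header: 20 bytes.
    (_machine, num_sections, _stamp, _symtab, _nsyms,
     opt_size, _flags) = struct.unpack_from("<HHIIIHH", data, coff)
    opt = coff + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic != 0x10B:
        raise ValueError("only PE32 (magic 10Bh) is handled in this sketch")
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)   # AddressOfEntryPoint
    (image_base,) = struct.unpack_from("<I", data, opt + 28)  # ImageBase
    sections = []
    first = opt + opt_size                     # section table follows the optional header
    for i in range(num_sections):
        name, vsize, rva, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", data, first + 40 * i)   # each section header is 40 bytes
        sections.append({
            "name": name.rstrip(b"\0").decode("ascii", "replace"),
            "rva": rva, "vsize": vsize,
            "raw_ptr": raw_ptr, "raw_size": raw_size,
        })
    return {"image_base": image_base, "entry_rva": entry_rva,
            "entry_va": image_base + entry_rva, "sections": sections}

def rva_to_offset(pe: dict, rva: int):
    """File offset for an RVA, or None if it has no bytes in the file."""
    for s in pe["sections"]:
        if s["rva"] <= rva < s["rva"] + s["raw_size"]:
            return s["raw_ptr"] + (rva - s["rva"])
    return None
```

Given the bytes of a 32-bit executable in `data`, `pe = parse_pe32(data)` followed by `rva_to_offset(pe, pe["entry_rva"])` gives the file offset at which instruction decoding would start.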

You can do that quite linearly until you get to a jump or call. Branch off and follow those paths, but still keep note of the path you were following. Eventually, you will have followed all reasonable code paths and have the information to define the bounds of functions and their callers, and similarly with jumps.
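
The path-following described above can be sketched with a worklist, again in Python for brevity. The decoder here is a toy and purely an assumption for the example: it only knows push ebp, pop ebp, nop, ret, register-to-register mov (8B with mod = 11), call/jmp rel32 and short jumps, and it stops a path at anything else. A real disassembler needs a full x86 length decoder in its place. The names `decode` and `trace` and the sample bytes are made up.

```python
import struct

ONE_BYTE = {0x55, 0x5D, 0x90}      # push ebp, pop ebp, nop

def decode(code: bytes, off: int, addr: int):
    """Toy decoder: (length, branch_target, falls_through, is_call) or None."""
    op = code[off]
    if op in ONE_BYTE:
        return 1, None, True, False
    if op == 0xC3:                                    # ret ends the path
        return 1, None, False, False
    if op == 0x8B and off + 2 <= len(code) and code[off + 1] >= 0xC0:
        return 2, None, True, False                   # mov reg, reg
    if op in (0xE8, 0xE9) and off + 5 <= len(code):   # call / jmp rel32
        (disp,) = struct.unpack_from("<i", code, off + 1)
        return 5, addr + 5 + disp, op == 0xE8, op == 0xE8
    if (op == 0xEB or 0x70 <= op <= 0x7F) and off + 2 <= len(code):
        (disp,) = struct.unpack_from("<b", code, off + 1)
        return 2, addr + 2 + disp, op != 0xEB, False  # jmp short / jcc short
    return None                                       # unknown: give up on this path

def trace(code: bytes, base: int, entry: int):
    """Follow code paths from entry; return (visited addresses, call xrefs)."""
    visited, calls, work = set(), {}, [entry]
    while work:
        addr = work.pop()
        # Decode linearly until the path ends or rejoins code already seen.
        while base <= addr < base + len(code) and addr not in visited:
            insn = decode(code, addr - base, addr)
            if insn is None:
                break
            size, target, falls_through, is_call = insn
            visited.add(addr)
            if target is not None:
                work.append(target)                   # branch off later
                if is_call:
                    calls.setdefault(target, []).append(addr)
            if not falls_through:
                break
            addr += size
    return visited, calls

# Hypothetical code block: two calls to one function, with padding in between.
sample = bytes.fromhex("E80B000000E806000000C39090909090558BEC5DC3")
visited, calls = trace(sample, 0x00401000, 0x00401000)
print(sorted(hex(a) for a in visited))
print({hex(t): [hex(c) for c in cs] for t, cs in calls.items()})
```

Bytes that no path reaches (the padding in the sample) are never decoded, which is the point of following paths instead of sweeping the whole section.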

After that, you can start doing fancy stuff; but it should take you a while to get there first :badgrin:

I said reasonable because some code will be purposely obfuscated to make disassembly difficult, but don't worry about that for now - 90% of programs are 'reasonable.' You can try to deal with other tricks as part of the "fancy stuff."

ragdog

Hi

Use an open source disassembler library and look into how it works:

http://www.beaengine.org/

TouEnMasm


There are two examples of it.
objconv comes with its source code in C.
The dumpbin that comes with Visual C++ has no source code.

A useful comparison can be made between them.



FlySky

Thanks for the useful information.
I will bang my head against this stuff; I was just wondering how IDA / Olly find everything.
This will keep me busy for a while.

Stan

Quote from: ragdog on May 05, 2013, 04:29:03 PM
Hi

Use an open source disassembler library and look into how it works:

http://www.beaengine.org/

Thank you for the reference.

hfheatherfox07

Hello,
Sorry for being away for a while, but life has been screwed up lately... nothing but stuff keeps happening. With that said, I have a few minutes to check on the forum, and I saw this thread. I have some stuff I can contribute that I think might help. It is in C++ (the targets are in MASM); someone might have time right now to translate it.
I hope this is of some help.

;)