New program

Started by Grincheux, December 21, 2015, 08:45:12 AM

Grincheux

I am writing a kind of disassembler.
I have done the first part, which consists of decoding PE files.
I did not use the formulas explained by Iczelion, only Windows API functions (imagehlp and dbghelp).
The program creates a subfolder "Work" and stores the resulting files in it.
The program does not display anything.
To view the results you must dump the different *.bin files.
If a program is packed, I don't know how to find the import directory.
I am looking for some people to run tests.


My Pgm


Mikl__


ragdog

Quote: I am writing a kind of disassembler.

Writing a disassembler is not easy; the best way is to use a finished disassembler SDK
like BeaEngine or OllyDisasm.

Grincheux

That's what I told myself.

TouEnMasm

Already written: objconv, with source code in C++.
http://www.agner.org/optimize/
Fa is a musical note to play with CL

Grincheux

I don't like C++. It's spaghetti.

TWell

C++ is good for big programs, but those who get micro-orgasms from it try to use it everywhere :P
The right tool in the right place :t
I can fetch my beers by bicycle; I don't need a truck ;)

Grincheux

I made this program to find strange code such as viruses. Rather than searching for many byte patterns, I thought that if I found a particular Windows call, it would mean there is something strange.


The first approach was to load the file and get the imported functions, then run the program and read the IAT to find changes.
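
A minimal sketch of that "read the IAT and look for changes" idea, in Python for brevity. It assumes the two snapshots have already been collected by some other means and are plain dictionaries mapping (DLL, function) to a resolved address; diff_iat, fmt and all addresses below are hypothetical and only illustrate the comparison step.

```python
def diff_iat(before, after):
    """Return IAT slots whose target differs between two snapshots.

    Each snapshot maps (dll_name, function_name) -> resolved address.
    The result maps the slot to an (old, new) pair; None means "absent".
    """
    changes = {}
    for slot in sorted(set(before) | set(after)):
        old, new = before.get(slot), after.get(slot)
        if old != new:
            changes[slot] = (old, new)
    return changes


def fmt(addr):
    """Format an address, or mark the slot as absent."""
    return "absent" if addr is None else f"{addr:#010x}"


if __name__ == "__main__":
    # Made-up snapshots: one taken from the file, one from the running image.
    before = {
        ("kernel32.dll", "CreateFileA"): 0x7C801A28,
        ("kernel32.dll", "WriteFile"): 0x7C810E17,
    }
    after = {
        ("kernel32.dll", "CreateFileA"): 0x7C801A28,
        ("kernel32.dll", "WriteFile"): 0x00403000,       # redirected into the image
        ("kernel32.dll", "VirtualProtect"): 0x7C801AD4,  # appeared at run time
    }
    for (dll, name), (old, new) in diff_iat(before, after).items():
        print(f"{dll}!{name}: {fmt(old)} -> {fmt(new)}")
```

A changed or newly appearing slot is only a hint that something deserves a closer look, not proof of malicious code.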

ragdog

Quote: run the program

Running the program to find malware in it? Ohhaa.

For a disassembler, I think the best way is to use BeaEngine.

guga

Take a look at the RosAsm disassembler code.

So far, I have made it pretty stable and fast. I'm working on it to improve the accuracy vs. speed balance.

As for packed data, symbol recognition, type names and so on, leave those for the very last part of the development. Those things are not that important at the initial stage, IMHO. E.g. IDA displays a nice and "pretty" result, but that is mainly due to the recognition of the symbols and libraries used, which "masks" the real disassembly process.

Comparing the RosAsm disassembler with the IDA Pro disassembler in "naked" mode, I can safely say that RosAsm is better than IDA, in the sense that it can correctly interpret what is code and what is data in 97-98% of the cases in large, "normally made" apps (not packed, I mean). Of course it has flaws; for one, if you feed it a packed app, it won't recognize which packer was used at the current stage of development.

But anyway... if you want the code of a more stable disassembler, take a look at the RosAsm source. But... get ready, it is a tedious task, especially when working on real-life applications (I mean, disassembling and reassembling "normal apps").

For better performance you can use maps to hold the different kinds of information about the code you are analyzing. Basically each map is a copy of the app in size, but it contains flags for you to use during disassembly. For example, you create a section map and on each byte you set flags to represent Code, Data, Virtual Data, Import Section, Resources and so on. Another map holds the size of the data (Byte, Word, Dword, Float, Real, etc.), another can be used for strings (although this can be included in the size-type map), and another for the routing type (I mean, how the data is reached: whether it is an instruction, whether it is accessed, where labels are inserted, whether it is an indirect or direct pointer, and so on).
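
To make the map idea concrete, here is a minimal sketch in Python. It is not RosAsm's actual layout: the flag names, their bit values and the FlagMaps class are assumptions chosen for illustration; the only point is that every map has one flag byte per byte of the image.

```python
from enum import IntFlag


class Section(IntFlag):      # what kind of region a byte belongs to
    CODE = 0x01
    DATA = 0x02
    VIRTUAL_DATA = 0x04
    IMPORT = 0x08
    RESOURCE = 0x10


class Size(IntFlag):         # how the byte is grouped as data
    BYTE = 0x01
    WORD = 0x02
    DWORD = 0x04
    FLOAT = 0x08
    STRING = 0x10


class Routing(IntFlag):      # how the byte is reached
    INSTRUCTION = 0x01
    ACCESSED = 0x02
    LABEL = 0x04
    POINTER = 0x08


class FlagMaps:
    """Three parallel maps, each with one flag byte per image byte."""

    def __init__(self, image_size):
        self.section = bytearray(image_size)
        self.size = bytearray(image_size)
        self.routing = bytearray(image_size)

    @staticmethod
    def mark(flag_map, start, length, flag):
        """OR a flag into every byte of [start, start + length)."""
        for i in range(start, start + length):
            flag_map[i] |= int(flag)


if __name__ == "__main__":
    maps = FlagMaps(0x3000)                               # hypothetical image size
    maps.mark(maps.section, 0x1000, 0x800, Section.CODE)
    maps.mark(maps.size, 0x2000, 4, Size.DWORD)
    maps.mark(maps.routing, 0x2000, 4, Routing.POINTER | Routing.ACCESSED)
    print(Section(maps.section[0x1000]), Routing(maps.routing[0x2000]))
```

Because the maps are indexed by the same offset as the image, a lookup during disassembly is a single array access.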

Also, to speed up disassembly, you need to identify whatever you can of what is code and what is data/virtual data before the real disassembly job. This can be done by analysing the characteristics of the PE sections. For example, when the function identifies data belonging to the PE/MZ header, you don't need to disassemble it as code; it is data. The same goes for a true data section (like the IAT, the export table, or resources): all of it is data. So it is better to first identify the type of data in each section and set the flags accordingly (in the maps, e.g. the section map).
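
A rough sketch of that pre-classification pass, in Python. The IMAGE_SCN_* values are the section characteristic flags from the PE format; everything else (the CODE/DATA/VIRTUAL_DATA map flags, the function names, and the section list, which is assumed to have been read from the section headers already) is hypothetical.

```python
# Section characteristic flags as defined by the PE format.
IMAGE_SCN_CNT_CODE = 0x00000020
IMAGE_SCN_CNT_INITIALIZED_DATA = 0x00000040
IMAGE_SCN_CNT_UNINITIALIZED_DATA = 0x00000080
IMAGE_SCN_MEM_EXECUTE = 0x20000000

# Hypothetical flags stored in the section map.
CODE, DATA, VIRTUAL_DATA = 0x01, 0x02, 0x04


def classify(characteristics):
    """Pick a coarse map flag from a section's characteristics."""
    if characteristics & (IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE):
        return CODE            # may still hide data; needs the real analysis
    if characteristics & IMAGE_SCN_CNT_UNINITIALIZED_DATA:
        return VIRTUAL_DATA
    return DATA


def prefill_section_map(image_size, header_size, sections):
    """Flag the headers as data, then flag each section as a whole.

    sections is a list of (rva, size, characteristics) tuples.
    """
    section_map = bytearray(image_size)
    section_map[:header_size] = bytes([DATA]) * header_size
    for rva, size, characteristics in sections:
        end = min(rva + size, image_size)      # never grow the map
        if end > rva:
            section_map[rva:end] = bytes([classify(characteristics)]) * (end - rva)
    return section_map


if __name__ == "__main__":
    sections = [                               # made-up layout
        (0x1000, 0x2000, 0x60000020),          # .text: code, execute, read
        (0x3000, 0x1000, 0xC0000040),          # .data: initialized, read, write
        (0x4000, 0x1000, 0xC0000080),          # .bss:  uninitialized, read, write
    ]
    section_map = prefill_section_map(0x5000, 0x400, sections)
    undecided = section_map.count(CODE)
    print(f"{undecided:#x} of {len(section_map):#x} bytes still need code/data analysis")
```

Only the bytes flagged CODE are left for the expensive pass; everything else is settled before the disassembler core runs.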

Once this is finished, the maps already hold a lot of quickly identified data, and all that remains to be analyzed is exactly the part that needs real code/data interpretation. For some files, identifying the section contents beforehand may represent 70% of the job, done without having "touched" the core disassembler tasks.

You need to be careful with: "self-modifying" apps; obfuscated code; functions with hundreds of nested loops; functions containing data in the middle of them (I mean, a procedure with data inside it); false pointers; errors in the program (i.e. some apps may be built with broken libraries, leading the analysis to incorrect results. For example: you identify a function that contains a pointer to data but, in fact, it is not a pointer, just a simple value: 0401256 may be mistaken for a real address inside the app); data that can be a string, a pointer or a float all at once, etc.
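
The false-pointer problem can be illustrated with a small Python sketch. The range check and the known_targets set are assumptions made for this example, not the RosAsm heuristic: a dword that falls inside the image is only a candidate, and it is promoted to "likely" only when something else already identified its target.

```python
import struct


def candidate_pointers(blob, image_base, image_size):
    """Yield (offset, value) for each aligned dword that lands inside the image."""
    for offset in range(0, len(blob) - 3, 4):
        (value,) = struct.unpack_from("<I", blob, offset)
        if image_base <= value < image_base + image_size:
            yield offset, value


def rate(value, known_targets):
    """'likely' if the target was already identified elsewhere, else 'unproven'."""
    return "likely" if value in known_targets else "unproven"


if __name__ == "__main__":
    image_base, image_size = 0x00400000, 0x3000          # hypothetical image
    blob = struct.pack("<III", 0x00401256, 0x12345678, 0x00402000)
    known_targets = {0x00402000}                          # e.g. start of a known string
    for offset, value in candidate_pointers(blob, image_base, image_size):
        print(f"+{offset:#x}: {value:#010x} {rate(value, known_targets)}")
```

Here 0x00401256 passes the range check yet stays "unproven": it could equally be a plain constant, which is exactly the trap described above.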

Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

Grincheux

I already wrote one, a long time ago, when there was only the 8086 instruction set.
Back then Mod Reg R/M was easy to decode...
Every day I kill one or two hundred viruses (it's my job...) and I ask myself why antivirus programs are so bad.
And why the cleaners are so good... when the antivirus does not kill the cleaner...
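
For readers who have not met the Mod Reg R/M byte mentioned above, here is a small Python sketch of the 8086 case. It assumes a word-sized operand (so register numbers map to the 16-bit names); decode_modrm and describe are names made up for the example.

```python
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
RM16 = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]


def decode_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11
    reg = (byte >> 3) & 0b111
    rm = byte & 0b111
    return mod, reg, rm


def describe(byte):
    """Return (register operand, r/m operand) for 16-bit addressing."""
    mod, reg, rm = decode_modrm(byte)
    if mod == 0b11:
        operand = REG16[rm]                       # register-to-register form
    elif mod == 0b00 and rm == 0b110:
        operand = "[disp16]"                      # direct address special case
    else:
        disp = ["", "+disp8", "+disp16"][mod]     # mod selects the displacement
        operand = f"[{RM16[rm]}{disp}]"
    return REG16[reg], operand


if __name__ == "__main__":
    for b in (0xC3, 0x06, 0x47, 0x9A):
        print(f"{b:#04x}:", *describe(b))
```

With 32-bit and 64-bit code the same byte can be followed by a SIB byte and is modified by prefixes, which is a large part of why a modern decoder is so much more work.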

guga

Indeed, many antivirus apps are really bad at the task. Maybe it is because of the way they identify viruses: perhaps heavy use of "heuristics" instead of a better database of the data chains that can be identified as a potential virus. Or... a better approach to identifying those data chains would be more appropriate.

Sometimes simplicity is better, IMHO. Unfortunately viruses are crap that we need to deal with every day. I'm amazed how people waste their time creating apps that can potentially harm users' PCs. Worse is when the virus is used to commit actual crimes, like stealing your bank account data, credit card info and so on.

Of course, creating a "virus" or app to identify flaws in security applications is one thing, but using it to intentionally harm other people's computers is another.

guga

I forgot about this: "If a program is packed I don't know how to find the import directory."

I tried that a long time ago. One way I found is to dump the file to disk and analyse the contents for known packers. For example, if you dump a UPX-packed app, the dump contains the addresses of the APIs used from the original IAT, along with the names of the DLLs used by the packed app (and, if I remember well, older versions of UPX kept the original section inside the dumped data). One of its sections contained things like 0100AFBCD, 06FBDFEFE and so on. Once I found those long chains of data, I loaded the libraries (previously found) and compared the addresses to search for the proper API function. For example: the unpacked data contains gdi32.dll and one of the addresses belongs to that module (say, starting with 0600000); I then checked the export table of the module (gdi32) and looked for the name related to the address 06FBDFEFE.
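
The address-to-name matching described above could look roughly like this in Python. The module table is assumed to have been built beforehand (load base, image size, and a mapping from export RVA to name); resolve, the base address and the export name are all made up for illustration.

```python
def resolve(address, modules):
    """Map a dumped address back to (module, export name).

    modules maps a DLL name to (base, size, exports), where exports
    maps an RVA to an exported function name. Returns None when the
    address is outside every module; the name is None when the address
    is inside a module but is not the start of a known export.
    """
    for dll, (base, size, exports) in modules.items():
        if base <= address < base + size:
            return dll, exports.get(address - base)
    return None


if __name__ == "__main__":
    modules = {
        # Hypothetical load address and export for the example only.
        "gdi32.dll": (0x6FBD0000, 0x00050000, {0xFEFE: "CreateCompatibleDC"}),
    }
    for addr in (0x6FBDFEFE, 0x100AFBCD):
        print(f"{addr:#010x} ->", resolve(addr, modules))
```

This only works while the dump still contains resolved addresses and the same DLL versions can be loaded for comparison.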

I don't actually remember the exact technique I used, since I didn't implement it in the official version of RosAsm at the time, but it shouldn't be hard to identify/reconstruct the IAT. Except, perhaps, if the unpacked code creates the IAT in memory and leaves no traces of it; then you may need to debug the application and analyze the results, I suppose.

Grincheux

Thank you Guga for all this info.
The last two viruses (2 PCs) I had to kill, I couldn't; I had to reinstall W8 on one and W10 on the other!
I am looking for a tool to read the registry database (NTUSER.DAT...) from a disk other than the one Windows is running on.
Do you know of such software?


ragdog

Quote: If a program is packed I don't know how to find the import directory
Quote: I tried that a long time ago. One way I found is to dump the file to disk and analyse the contents for known packers. For example, if you dump a UPX-packed app, the dump contains the addresses of the APIs used from the original IAT, along with the names of the DLLs used by the packed app

Look into an import rebuilder, or hook GetProcAddress.