The MASM Forum

General => The Campus => Topic started by: Grincheux on January 13, 2016, 03:13:01 PM

Title: The fool encoder
Post by: Grincheux on January 13, 2016, 03:13:01 PM
Quote:
0040102B: 67 81 06 11 01 78 56 34 12 add         dword ptr ds:[0000h],12345678h

I don't understand this form of encoding: I change bytes 3 and 4, and I always get the same result!

It looks like this form:

Quote:
00401001 81 05 00 00 00 00 78 56 34 12 add         dword ptr ds:[0],12345678h
0040100B 81 05 FE CA 00 00 78 56 34 12 add         dword ptr ds:[0CAFEh],12345678h

Look at : Here (http://www.phrio.biz/mediawiki/Strange_Codings)
Title: Re: The fool encoder
Post by: dedndave on January 14, 2016, 12:10:13 AM
perhaps you've found a bug in either the assembler or disassembler

Quote:
0040102B: 67 81 06 11 01 78 56 34 12 add         dword ptr ds:[0000h],12345678h

i believe that the 67h is the address-size override prefix - which is out of place
as though you are assembling 32-bit code in a 16-bit segment

if we throw that away, i get

81 06 11 01 78 56 add dword ptr ds:[esi],56780111h ;(DS segment override implied)
34 12             xor     al,12h
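
The two readings can be checked mechanically. The sketch below is a minimal Python decoder written for this note (it is not taken from any tool mentioned in the thread). It assumes 32-bit code and handles only opcode 81h /0 with mod=00; the function and table names are made up for the illustration. With the 67h prefix, ModRM byte 06h selects a bare 16-bit displacement, so bytes 3 and 4 (11 01, counting from 0) would be the address 0111h. A disassembler that prints [0000h] whatever those bytes contain is therefore probably mis-displaying the displacement, not decoding a different instruction. Without the prefix, the same bytes give the [esi] reading shown above.

```python
# Sketch: effect of the 67h address-size prefix on "81 /0" (ADD r/m32, imm32)
# in 32-bit code. Only the mod=00 forms needed for this thread are handled.

RM16 = ["bx+si", "bx+di", "bp+si", "bp+di", "si", "di", None, "bx"]
RM32 = ["eax", "ecx", "edx", "ebx", None, None, "esi", "edi"]


def decode_add_imm32(code):
    """Decode one ADD r/m32, imm32 and return (text, length in bytes)."""
    pos = 0
    addr16 = False
    if code[pos] == 0x67:              # address-size override prefix
        addr16 = True
        pos += 1
    if code[pos] != 0x81:
        raise ValueError("not opcode 81h")
    modrm = code[pos + 1]
    pos += 2
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or reg != 0:
        raise ValueError("only mod=00, /0 (ADD) is handled in this sketch")
    if addr16 and rm == 6:             # 16-bit addressing: disp16 only
        disp = int.from_bytes(code[pos:pos + 2], "little")
        pos += 2
        operand = f"[{disp:04X}h]"
    elif not addr16 and rm == 5:       # 32-bit addressing: disp32 only
        disp = int.from_bytes(code[pos:pos + 4], "little")
        pos += 4
        operand = f"[{disp:08X}h]"
    elif not addr16 and rm == 4:
        raise ValueError("SIB byte is not handled in this sketch")
    else:                              # plain register-indirect form
        operand = f"[{(RM16 if addr16 else RM32)[rm]}]"
    imm = int.from_bytes(code[pos:pos + 4], "little")
    pos += 4
    return f"add dword ptr {operand}, {imm:08X}h", pos
```

Calling decode_add_imm32(bytes.fromhex("678106110178563412")) consumes all 9 bytes as one instruction, while dropping the leading 67 leaves the trailing 34 12 over, which matches the xor al,12h in the post above.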
Title: Re: The fool encoder
Post by: guga on January 14, 2016, 01:17:24 AM
Exactly.

There is only 1 way for it to be a valid instruction (packuswb): when it is used after 0F (or with the other escape prefix, 066)
Title: Re: The fool encoder
Post by: Grincheux on January 14, 2016, 02:37:46 AM
http://www.phrio.biz/mediawiki/Strange_Codings

I list here all the codes that seem strange

That could be a way for a terrorist to pass a message
That could serve to install a protection...
Title: Re: The fool encoder
Post by: Grincheux on January 14, 2016, 04:59:26 AM
Strange codings updated for codes 81h and 82h (http://www.phrio.biz/mediawiki/Strange_Codings)
Title: Re: The fool encoder
Post by: Grincheux on January 14, 2016, 06:03:22 AM
Title: Re: The fool encoder
Post by: guga on January 14, 2016, 07:40:15 AM
VS and dumppe are far from being usable disassemblers. At most you can use them to get a basic idea of some small chunks of code, but they are not to be used on a regular daily basis.

Borg is extremely old. The last time I used it was more than 15 years ago :greensml: But why are you using such tools? There are much, much better ones for you to start with.

2 of them are free and open source (RosAsm and Olly - the disasm engine, I mean). The other is commercial and extremely expensive (IdaPro - but...you can find it ;) )

If you want to write your own disassembler, I strongly suggest you read the RosAsm source code. It is way easier than it seems  :icon_mrgreen:
Title: Re: The fool encoder
Post by: Grincheux on January 14, 2016, 08:10:35 AM
I don't know how to use the RosAsm Disassembler :eusa_snooty:
Title: Re: The fool encoder
Post by: sinsi on January 14, 2016, 08:11:47 AM
IDA 5 free (from the maker) (https://www.hex-rays.com/products/ida/support/download_freeware.shtml)
Title: Re: The fool encoder
Post by: guga on January 14, 2016, 08:23:46 AM
Use the disassembler or its source ???

To use it, all you have to do is open a PE file in it... It will disassemble it automatically.

About the source code: you need to study the syntax, but it is not hard to follow. Look at small examples 1st (Iczelion's, Test Department etc)
Title: Re: The fool encoder
Post by: Grincheux on January 14, 2016, 08:27:17 AM
First use the disassembler
Title: Re: The fool encoder
Post by: Grincheux on January 14, 2016, 08:45:45 AM
Quote:
[Data04265CC: D$ 01000000, 03000200, 0400, 0BBCCDD05, 0600AA, 080007, 0A0009
                 0C000B, 0CCDD0D00, 0E00AABB, 010000F00, 012001100, 014001300
                 0DD150000, 0AABBCC, 0170016, 0190018, 01B001A, 01D00001C
                 0AABBCCDD, 01F001E00]
Code0426620: A0:
    push esi
    push ebx
    fxsave X$Virtual0463020
    xchg eax ebx
    xor edx edx
    cmp ebx 0B | jne C8>  ; Code042663C
    dec ebx
   
Code0426632: B8:
    call Code042664B
    dec ebx | jns B8<  ; Code0426632
jmp D3>  ; Code042664

RosAsm DisAssembly!
Title: Re: The fool encoder
Post by: guga on January 14, 2016, 09:21:25 AM
 :t

To see how accurate it was, I would need to look at the file to see the rest of the code but...

[Data04265CC: D$ 01000000, 03000200, 0400, 0BBCCDD05, 0600AA, 080007, 0A0009
                 0C000B, 0CCDD0D00, 0E00AABB, 010000F00, 012001100, 014001300
                 0DD150000, 0AABBCC, 0170016, 0190018, 01B001A, 01D00001C
                 0AABBCCDD, 01F001E00]



This is decoded as data. Everything in between brackets "[" "]" belongs to the data section. In this case, the data is laid out as an array of DWORDs (D$).



Code0426620: A0:
    push esi
    push ebx
    fxsave X$Virtual0463020
    xchg eax ebx
    xor edx edx
    cmp ebx 0B | jne C8>  ; Code042663C
    dec ebx
   
Code0426632: B8:
    call Code042664B
    dec ebx | jns B8<  ; Code0426632
jmp D3>  ; Code042664


"X$" is a data type that can have any size. Since the target is a memory location and the opcode stores a larger block (512 bytes in this case), the data type used is "X$", meaning that the size is not a "conventional" one. By conventional I mean: dword, qword, word etc.

jne C8>  ... It performs a jump to below that line. Here, C8 is an address label in short form. The token ">" gives the direction of the jump: in this case the target is below that line of code (go down). If it were jumping to before that line, the sign would be "<". Same as forward/backward (or down/up).

Virtual0463020: an address in the virtual data section of the PE.


Code0426632: B8:  The disassembler uses readable labels. "Code" means that the address belongs to the code section and is, in fact, code and not data. The number after it is the address. The label "Data" (as in Data04265CC) means that the address is, in fact, data and belongs to the data section. "Virtual": the same concept, but for a virtual address. The goal of any disassembler is basically to distinguish what is code from what is data, so they are labeled according to what they are.

The next token, "B8", is just the short form of that address. It is useful for jumps to that location (more readable than having tons of "je CodeXXXX" and "jne CodeYYYYY" all over the source). Nevertheless, the full address is referenced in a comment on the same line, as in "jns B8<  ; Code0426632": it will jump to Code0426632, which is also labeled "B8". That address is written as "Code0426632: B8:".

The ":" sign means the address is a label (in this case, a code label).

Basically, all the values in the disassembly are in hexadecimal form (0 in front of the value, as in 01F001E00). There are a few cases where it disassembles as decimal, but in RosAsm the number syntax is trivial: 0 in front for hex (0A, 0B, 0FFFF etc.) and no leading zero for decimal (9, 1, 125256, 777 etc.). Binary uses a double zero followed by a "_" sign, "00_" (00__0001, 00__0000_0001__0000_0000 etc.).

Also, for hex the "h" char at the end is acceptable (but it still needs the 0 at the start): 0FFFFFEFFh, for example.
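
The number rules described above are simple enough to put into a few lines. This is only a sketch of those rules as stated in this post, written in Python for illustration. It is not RosAsm's actual parser, and the function name is hypothetical.

```python
def parse_rosasm_number(text):
    """Parse a numeric literal using the rules described in the post above."""
    if text.startswith("00_"):
        # Binary: "00_" prefix, further "_" characters are just separators.
        return int(text[3:].replace("_", ""), 2)
    if text.startswith("0"):
        # Hex: leading 0, with an optional trailing "h".
        digits = text[:-1] if text[-1] in "hH" else text
        return int(digits, 16)
    # No leading zero: decimal.
    return int(text, 10)
```

For instance, parse_rosasm_number("0B") and parse_rosasm_number("11") give the same value, which is why "cmp ebx 0B" in the listing shows up as "cmp ebx, 0Bh" in the IdaPro one.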

call Code042664B: a call to a function labeled "Code042664B", meaning that there is a function at address 042664B.
Title: Re: The fool encoder
Post by: Grincheux on January 14, 2016, 10:00:34 AM
Here is the whole project with the RosAsm source file
Use 7zip to decompress

http://www.7-zip.org/download.html
Title: Re: The fool encoder
Post by: guga on January 14, 2016, 10:18:07 AM
Like i said..it is data correctly interpreted as such:

Rosasm listing:

Main:
Code0426580: A0:
    push ebp
    mov ebp esp
    sub esp 08
    push 00
    push 080
    push 02
    push 00
    push 00
    push 040000000
    push Data0462260
    call 'kernel32.CreateFileA'
    mov D$ebp-04 eax
    push 00
    lea eax D$ebp-08
    push eax
    push 036260
    push Data042C000
    push D$ebp-04
    call 'kernel32.WriteFile'
    push D$ebp-04
    call 'kernel32.CloseHandle'
    push 00
    call 'kernel32.ExitProcess'

[Data04265CC: D$ 01000000, 03000200, 0400, 0BBCCDD05, 0600AA, 080007, 0A0009
                 0C000B, 0CCDD0D00, 0E00AABB, 010000F00, 012001100, 014001300
                 0DD150000, 0AABBCC, 0170016, 0190018, 01B001A, 01D00001C
                 0AABBCCDD, 01F001E00]

Code0426620: A0:
    push esi
    push ebx
    fxsave X$Virtual0463020
    xchg eax ebx
    xor edx edx
    cmp ebx 0B | jne C8>  ; Code042663C
    dec ebx
   
Code0426632: B8:
    call Code042664B
    dec ebx | jns B8<  ; Code0426632
jmp D3>  ; Code0426641
   
Code042663C: C8:
    call Code042664B
   
Code0426641: D3:
    fxrstor X$Virtual0463020
    pop ebx
    pop esi
    ret


IdaPro listing



; =============== S U B R O U T I N E =======================================

; Attributes: noreturn bp-based frame

public start
start proc near

NumberOfBytesWritten= dword ptr -8
hFile = dword ptr -4

push ebp
mov ebp, esp
sub esp, 8
push 0 ; hTemplateFile
push 80h ; dwFlagsAndAttributes
push 2 ; dwCreationDisposition
push 0 ; lpSecurityAttributes
push 0 ; dwShareMode
push 40000000h ; dwDesiredAccess
push offset asc_462260 ; "C:\\Users\\Grincheux\\Documents\\PrjJwA"...
call CreateFileA
mov [ebp+hFile], eax
push 0 ; lpOverlapped
lea eax, [ebp+NumberOfBytesWritten]
push eax ; lpNumberOfBytesWritten
push 36260h ; nNumberOfBytesToWrite
push offset unk_42C000 ; lpBuffer
push [ebp+hFile] ; hFile
call WriteFile
push [ebp+hFile] ; hObject
call CloseHandle
push 0 ; uExitCode
call ExitProcess
start endp

; ---------------------------------------------------------------------------
dd 1000000h, 3000200h, 400h, 0BBCCDD05h, 600AAh, 80007h
dd 0A0009h, 0C000Bh, 0CCDD0D00h, 0E00AABBh, 10000F00h
dd 12001100h, 14001300h, 0DD150000h, 0AABBCCh, 170016h
dd 190018h, 1B001Ah, 1D00001Ch, 0AABBCCDDh, 1F001E00h

; =============== S U B R O U T I N E =======================================


sub_426620 proc near ; CODE XREF: sub_426DC8+9AFp
; sub_427EB7+18Bp ...
push esi
push ebx
fxsave ds:dword_463020
xchg eax, ebx
xor edx, edx
cmp ebx, 0Bh
jnz short loc_42663C
dec ebx

loc_426632: ; CODE XREF: sub_426620+18j
call sub_42664B
dec ebx
jns short loc_426632
jmp short loc_426641
; ---------------------------------------------------------------------------

loc_42663C: ; CODE XREF: sub_426620+Fj
call sub_42664B

loc_426641: ; CODE XREF: sub_426620+1Aj
fxrstor ds:dword_463020
pop ebx
pop esi
retn
sub_426620 endp



The only difference is that I haven't yet implemented the macro and API recognition, or the DIS system (Digital DNA - similar to flair). But the raw interpretation of what is code/data is exactly the same.
Title: Re: The fool encoder
Post by: guga on January 14, 2016, 10:42:20 AM
It found only 3 errors. One is with an XMM1 data size (I'll fix that later - movd XMM1 D$esp+014 ; instead of movd XMM1 W$esp+014) and the others are with Jochen's library here:


Code04295F7: I7:
    test cl 05 | je L9>  ; Code0429617
    or B$ebp-034 04
    lea eax D$edx*4+Data0429914 <----- This address does not exist !!!! It is a simple value. Did something go wrong with the linker? Masm (or jwasm) should either be using that address or emit the proper error. Idapro produces the same result. Olly too.



Code0429E28: K4:
    lea eax D$edi*4+Data0429E6D <----- same as above. This address does not exist. How did the linker assemble it ?
    call eax


About the above 2 problems regarding the address: this is a difficult decision. If we simply forbid the disassembler to interpret any non-referenced address as data/code, we are easily led into errors in the rest of the code, because this value can be either an address or an immediate. So it is more a matter of choice of interpretation than an error per se. It can be enhanced, I guess, to overcome those problems caused by some linkers (I saw some of those things in Watcom files too; it rarely happens, fortunately), but I'll think about a solution later.


But for your project of writing a disassembler, keep in mind that the very 1st thing you need to do is analyze the contents of the PE sections. So, the better technique is to use maps (as I explained earlier in another post).

On the sections map you flag what is a resource (that is data), IAT (also data), data section (also data), virtual section (as the name says...virtual data), the PE header itself (data....but unused most of the time), the MZ header (idem), etc. For that, it is better to check the characteristics of the section, regardless of the name it was given (.text, .data, .idata, .potato, .orange, whatever  :icon_mrgreen: ). So, you must 1st of all flag everything that you know belongs to data.

All that is left can be either code or data, and this is where the disassembler works to try to separate the data in the middle of the code.
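
A map of this kind can be as simple as one flag byte per byte of the image. The following Python sketch only illustrates that idea; the names (make_map, flag, unknown_ranges) and the three flag values are assumptions made for the example, not RosAsm's internal layout.

```python
UNKNOWN, DATA, CODE = 0, 1, 2


def make_map(size):
    """One flag per image byte; everything starts as UNKNOWN."""
    return bytearray(size)


def flag(byte_map, start, length, kind):
    """Mark a range as DATA or CODE, clipped to the size of the map."""
    end = min(start + length, len(byte_map))
    for i in range(start, end):
        byte_map[i] = kind


def unknown_ranges(byte_map):
    """Return (start, length) runs that are still UNKNOWN."""
    runs, start = [], None
    for i, kind in enumerate(byte_map):
        if kind == UNKNOWN and start is None:
            start = i
        elif kind != UNKNOWN and start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(byte_map) - start))
    return runs
```

After the headers, resources, imports and data sections have been flagged as DATA, unknown_ranges() returns the only places where the disassembler still has to decide between code and data.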
Title: Re: The fool encoder
Post by: jj2007 on January 14, 2016, 02:44:46 PM
Quote from: guga on January 14, 2016, 10:42:20 AM
It found only 3 errors ... with Jochen´s library here:


Code04295F7: I7:
    test cl 05 | je L9>  ; Code0429617
    or B$ebp-034 04
    lea eax D$edx*4+Data0429914 <----- This address does not exist !!!! It is a simple value. Did something go wrong with the linker? Masm (or jwasm) should either be using that address or emit the proper error. Idapro produces the same result. Olly too.



Code0429E28: K4:
    lea eax D$edi*4+Data0429E6D <----- same as above. This address does not exist. How did the linker assemble it ?
    call eax

Nice find, Gustavo :t

(the code works like a charm, of course. This is Float2Asc, tested a thousand times...)

#1:
            test cl, 4+1      ; MbXmmR or MbXmmI
            .if !Zero?
                  or byte ptr f2sInt, 4      ; prevent %u correction below
                  lea eax, [MovXmmStr+4*edx-80]
                  lea edx, f2sTmp64
                  call eax
                  test cl, 1      ; odd or even?
                  .if Zero?
                        fld REAL8 ptr [edx]
                  .else
                        fild QWORD ptr [edx]
                  .endif
            .endif


Olly:
00406F48   │.  F6C1 05       │test cl, 05
00406F4B   │. 74 1B         │jz short 00406F68
00406F4D   │.  804D CC 04    │or byte ptr [ebp-34], 04
00406F51   │.  8D0495 647240 │lea eax, [edx*4+407264]
00406F58   │.  8D55 F8       │lea edx, [ebp-8]
00406F5B   │.  FFD0          │call eax


MovXmmStr:
      movlps qword ptr [edx], xmm0            ; 4 bytes incl. ret
      retn
      movlps qword ptr [edx], xmm1
      retn


Olly:
004072B4   ┌.  0F1302        movlps [edx], xmm0
004072B7   └.  C3            retn
004072B8   ┌.  0F130A        movlps [edx], xmm1
004072BB   └.  C3            retn


#2:
            test dl, 4+1      ; MbXmmR or MbXmmI
            .if !Zero?
                  test dl, 1
                  .if Zero?
                     fstp QWORD ptr [ebx]
                  .else
                     fistp REAL8 ptr [ebx]      ; use integer for r format
                  .endif
                  lea eax, [MovXmm+4*edi-80]      ; 7 bytes
                  call eax
            .endif


Olly:
0040776D   │.  F6C2 05       test dl, 05
00407770   │. 74 14         jz short 00407786
00407772   │.  F6C2 01       test dl, 01
00407775   │. 75 04         jnz short 0040777B
00407777   │.  DD1B          fstp qword ptr [ebx]
00407779   │. EB 02         jmp short 0040777D
0040777B   │>  DF3B          fistp qword ptr [ebx]
0040777D   │>  8D04BD C27740 lea eax, [edi*4+4077C2]
00407784   │.  FFD0          call eax


MovXmm:
      movlps xmm0, qword ptr [ebx]      ; 4 bytes each incl. ret
      retn


Olly:
00407812   ┌.  0F1203        movlps xmm0, [ebx]
00407815   └.  C3            retn
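
The arithmetic behind the "non-existent" address can be checked against the Olly listings above. In the sketch below (Python, names made up for the illustration), the constant that ends up in the lea is the label address minus 80, so it points 80 bytes before the first stub and need not fall on any label. That the index register starts at 20 for the first stub is an inference from 4*20 = 80, not something stated in the thread.

```python
BIAS = 80  # the "-80" in the source; 80 decimal is 50h


def lea_constant(label_va, bias=BIAS):
    """Constant folded into the lea: label address minus the bias."""
    return label_va - bias


def call_target(constant, index, stride=4):
    """Address reached by 'lea eax, [constant + index*stride]' / 'call eax'."""
    return constant + index * stride


MOVXMMSTR = 0x4072B4  # first movlps stub in the first Olly listing
MOVXMM = 0x407812     # first movlps stub in the second Olly listing

print(hex(lea_constant(MOVXMMSTR)), hex(lea_constant(MOVXMM)))
```

A disassembler that sees only the folded constant has no way to know it was written as "label - 80", which fits the point made earlier that the value can be read either as an address or as a plain immediate.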
Title: Re: The fool encoder
Post by: Grincheux on January 14, 2016, 03:31:42 PM
Thank you sinsi for the link. I downloaded the file but when I run it it crashes!
Title: Re: The fool encoder
Post by: Grincheux on January 14, 2016, 05:06:16 PM
What do you want to check here ?

Quote:
Directory Name                          VirtAddr  VirtSize
--------------------------------------  --------  --------
Export                                  00000000  00000000
Import                                  00002000  00000534
Resource                                00004000  00034D90
Exception                               00000000  00000000
Security                                00000000  00000000
Base Relocation                         00000000  00000000
Debug                                   00000000  00000000
Decription/Architecture                 00000000  00000000
Machine Value (MIPS GP)                 00000000  00000000
Thread Storage                          00000000  00000000
Load Configuration                      00000000  00000000
Bound Import                            00000000  00000000
Import Address Table                    00000000  00000000
Delay Import                            00000000  00000000
COM Runtime Descriptor                  00000000  00000000
(reserved)                              00000000  00000000

No IAT

Only

Quote:
Import                                  00002000  00000534

So I must look for the data sections (initialized and uninitialized) and the code section.

Quote:
Size of Code                            00000C00
Size of Initialized Data                00000A00
Size of Uninitialized Data              00000000
Address of Entry Point                  00001880
Base of Code                            00001000
Base of Data                            00002000
Image Base                              00400000

Here are the command lines that created the file analyzed by dumppe

Quote:
C:\JWasm\Bin\JWASM.EXE -9 -Fl -c -zlf -zlp -zls -W3 -coff -Cp -nologo /I"C:\JWasm\Include" "ASD.asm"
C:\JWasm\Bin\JWlink.EXE FORMAT WINDOWS PE LIBPATH C:\JWasm\Lib OPTION SHOWDEAD OPTION NXCOMPAT OPTION NORELOCS OPTION ELIMINATE OPTION CHECKSUM RESOURCE ASD.res RUNTIME WINDOWS NAME ASD.exe FILE ASD.obj
Title: Re: The fool encoder
Post by: guga on January 14, 2016, 05:48:07 PM
You must 1st analyze what is data and what is code. The easiest way to do that is to look at the contents of the data. For example, you start at the very 1st byte (MZ); this belongs to the IMAGE_DOS_HEADER structure. Since you know it is all data, you flag it as such. Then you look at the pointers in its members. In this case, the next pointer is to the PE header.

You go there and do the same: flag all of this structure as data.
Then do the same as before... check the pointers of the members.
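
As a rough illustration of these first steps, here is a Python sketch that follows e_lfanew from the MZ header to the PE header and returns the header ranges that can be flagged as data straight away. The offsets come from the documented layouts of IMAGE_DOS_HEADER and IMAGE_FILE_HEADER; the function names are hypothetical, and the sketch does no bounds checking on a damaged file.

```python
import struct


def find_pe_header(image):
    """Return the file offset of the 'PE\\0\\0' signature."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew is the DWORD at offset 3Ch of IMAGE_DOS_HEADER.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    return e_lfanew


def header_ranges(image):
    """(start, length, label) ranges that are known to be data."""
    pe = find_pe_header(image)
    # IMAGE_FILE_HEADER (20 bytes) follows the 4-byte signature:
    # NumberOfSections is the WORD at +2, SizeOfOptionalHeader at +16.
    (number_of_sections,) = struct.unpack_from("<H", image, pe + 4 + 2)
    (size_of_optional,) = struct.unpack_from("<H", image, pe + 4 + 16)
    section_table = pe + 4 + 20 + size_of_optional
    return [
        (0, 0x40, "IMAGE_DOS_HEADER"),
        (pe, 4 + 20 + size_of_optional, "IMAGE_NT_HEADERS"),
        (section_table, 40 * number_of_sections, "section table"),
    ]
```

Each returned range can then be written into the map as data before any disassembly starts.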

If they point to virtual data, you flag it as such in the previously created map file.
If they are data, you do the same.

How do you know which members point to data only? Check the contents of _IMAGE_DATA_DIRECTORY. All those members are pointers to data (in the form of structures).

The next thing is analyzing the contents of the IMAGE_SECTION_HEADER.

You start by seeing whether the EntryPoint is in that section or not. If it is there, it _may_ be a code section. To make sure, you check the characteristics of that section (IMAGE_SCN_MEM_EXECUTE or IMAGE_SCN_CNT_CODE).

If it does not contain them... then the section is formed by data only. You flag it as such.

Go to the next section. See if the EP is there. Ok... EP found there... then this section is the one the disassembler must target, since it may contain code + data.

Do the same for the remaining sections (perhaps you already flagged them when you checked IMAGE_DATA_DIRECTORY).

See? You need to follow the contents of the PE structures... discard everything that may be data 1st, and then what is left is all you need to analyze. Much much faster than doing a byte-by-byte scan of all the data starting from the very 1st one ('MZ').
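
The per-section decision can be sketched like this in Python. The two flag values are the documented IMAGE_SCN_* constants; the function name is hypothetical, and treating "contains the entry point" or "has an executable flag" as equally good reasons to disassemble is a simplification of the steps above, not the exact rule any particular tool uses.

```python
IMAGE_SCN_CNT_CODE = 0x00000020
IMAGE_SCN_MEM_EXECUTE = 0x20000000


def classify_section(virtual_address, virtual_size, characteristics, entry_rva):
    """Return 'code+data' if the section must be disassembled, else 'data'."""
    holds_entry = virtual_address <= entry_rva < virtual_address + virtual_size
    executable = bool(
        characteristics & (IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE)
    )
    if holds_entry or executable:
        return "code+data"
    return "data"
```

With the dumppe values posted earlier (Base of Code 00001000, Size of Code 00000C00, Address of Entry Point 00001880), the section at RVA 1000h is the one that contains the entry point. The characteristics of that file were not posted, so any value passed for them here is an assumption.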

Try to download Ida somewhere; it will really help you understand what needs to be done (since you are having difficulties with Olly and RosAsm).
Title: Re: The fool encoder
Post by: Grincheux on January 14, 2016, 06:07:19 PM
In the OptionalHeader I got the code and uninitialized data


mov eax,lpNtHeader

INVOKE ImageRvaToSection,lpNtHeader,NULL,[eax].IMAGE_NT_HEADERS.OptionalHeader.BaseOfCode
mov lpSectionCode,eax

mov eax,lpNtHeader

INVOKE ImageRvaToSection,lpNtHeader,NULL,[eax].IMAGE_NT_HEADERS.OptionalHeader.BaseOfData
mov lpSectionUData,eax


Quote:
0x0513C180  41 55 54 4f 00 00 00 00 12 5b 00 00 00 10 00 00 00 5c 00 00 00 02 00 00 00 00 00  AUTO.....[.......\.........
0x0513C19B  00 00 00 00 00 00 00 00 00 20 00 00 60 2e 72 64 61 74 61 00 00 bd 10 00 00 00 70  ......... ..`.rdata.......p
0x0513C1B6  00 00 00 12 00 00 00 5e 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 40 44  .......^..............@..@D
0x0513C1D1  47 52 4f 55 50 00 00 64 34 ab 00 00 90 00 00 00 06 00 00 00 70 00 00 00 00 00 00  GROUP..d4«..........p......
0x0513C1EC  00 00 00 00 00 00 00 00 40 00 00 c0 2e 72 73 72 63 00 00 00 c4 e1 17 00 00 d0 ab  ........@..À.rsrc...Äá...Ы
0x0513C207  00 00 e2 17 00 00 76 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 40 60 8b  ..â...v..............@..@`.
0x0513C222  54 24 2c 0f b6 0a 8b 7a 01 69 c9 01 01 01 01 66 0f 6e c1 66 0f 70 c0 00 8b c7 83  T$,.¶..z.iÉ....f.nÁf.pÀ..ǃ

I cannot rely on the EntryPoint because for a DLL it can be 0; it is not always DllMain.
The only address I still have not got is the one for the initialized data.

I will go and see what "@DGROUP" and "rdata" are.
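
The dump above looks like a run of 40-byte IMAGE_SECTION_HEADER entries, so it can be unpacked field by field. The Python sketch below does that for the first two entries, with the bytes copied from the dump at 0x0513C180; the function name and the dictionary keys are made up for the illustration. If this reading is right, the "@" in front of DGROUP is not part of the name: it would be the last byte (40h) of the Characteristics field of the preceding .rdata header.

```python
import struct

# IMAGE_SECTION_HEADER: 8-byte name, six DWORDs, two WORDs, one DWORD (40 bytes).
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")


def parse_section_header(raw):
    """Unpack one 40-byte section header into a small dict."""
    (name, virtual_size, virtual_address, raw_size, raw_pointer,
     _reloc_ptr, _line_ptr, _reloc_count, _line_count,
     characteristics) = SECTION_HEADER.unpack(raw[:SECTION_HEADER.size])
    return {
        "name": name.rstrip(b"\0").decode("ascii"),
        "virtual_size": virtual_size,
        "virtual_address": virtual_address,
        "raw_size": raw_size,
        "raw_pointer": raw_pointer,
        "characteristics": characteristics,
    }


# First two headers, copied from the dump above.
DUMP = bytes.fromhex(
    "4155544f00000000 125b0000 00100000 005c0000 00020000"
    "00000000 00000000 0000 0000 20000060"
    "2e72646174610000 bd100000 00700000 00120000 005e0000"
    "00000000 00000000 0000 0000 40000040"
)

for offset in (0, 40):
    header = parse_section_header(DUMP[offset:offset + 40])
    print(header["name"], hex(header["virtual_address"]),
          hex(header["characteristics"]))
```

The first entry, named AUTO, carries both IMAGE_SCN_CNT_CODE (20h) and IMAGE_SCN_MEM_EXECUTE (20000000h) in its Characteristics, so under the rules discussed earlier it would be the section to hand to the disassembler.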
Title: Re: The fool encoder
Post by: Grincheux on January 15, 2016, 04:31:09 AM
The pupils did good work.
Everything is loaded in memory.
The section addresses are known and flagged as DATA or CODE
I attach the part of the code that does this part of the job  :eusa_clap: :eusa_dance: :badgrin: