Working 64 bit demo of POASM 64 bit.

Started by hutch--, December 02, 2014, 06:58:00 PM


hutch--

Be warned that this example is extremely rudimentary: it may be technically wrong and it does nothing fancy, but it at least works on my Win7 64. With thanks to sinsi for having suffered ML64 for long enough to understand the stack correction required, this is the first working 64 bit POASM EXE I have seen. The technique for getting it going was akin to flying blind in the dark with a blindfold on and both hands tied behind your back.  :P


; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    includelib \pasm64\lib64\kernel32.lib
    includelib \pasm64\lib64\user32.lib

    MessageBoxA PROTO :QWORD,:QWORD,:QWORD,:QWORD
    MessageBox equ <MessageBoxA>
    ExitProcess PROTO :QWORD

    call_msgbox PROTO :QWORD,:QWORD,:QWORD,:QWORD

    MB_OK equ <0>

  .data
    tmsg db "POASM 64 bit MessageBox",0
    titl db "POASM 64 bit",0
    msg2 db "Called from a POASM 64 bit procedure",0
    ttl2 db "'call_msgbox' proc here",0

  .code

align 16
start:

    xor rax, rax
    sub rsp, 40     ; 28h
    invoke MessageBox,rax,ADDR tmsg,ADDR titl,MB_OK

    xor rax, rax
    sub rsp, 40     ; 28h
    invoke call_msgbox,rax,ADDR msg2,ADDR ttl2,MB_OK

    xor rax, rax
    sub rsp, 8
    invoke ExitProcess,rax

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

align 16

call_msgbox proc hndl:QWORD,txt:QWORD,ttl:QWORD,styl:QWORD

    invoke MessageBox,hndl,txt,ttl,styl

    ret

call_msgbox endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end start

sinsi

Within a proc, if you call API functions, it's usually enough to adjust the stack once on entry: 8 bytes to realign it to 16, the spill space (minimum 32 bytes), and room for any parameters beyond the first four, sized for the call with the most parameters.


start:

    sub rsp, 40     ; 28h

    xor eax,eax    ;changing the low 32-bits of a register zero-extends to 64-bit - xor rax, rax is one byte longer too.
    invoke MessageBox,rax,ADDR tmsg,ADDR titl,MB_OK

    xor rax, rax
    ;sub rsp, 40     ;not needed, just reuse the same stack
    invoke call_msgbox,rax,ADDR msg2,ADDR ttl2,MB_OK

    xor rax, rax
    ; sub rsp, 8    ;every API call needs a minimum of 40 bytes, even functions with no parameters.
    invoke ExitProcess,rax

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

align 16

call_msgbox proc hndl:QWORD,txt:QWORD,ttl:QWORD,styl:QWORD
    sub rsp,28h    ;this can be omitted if you don't call any API
    invoke MessageBox,hndl,txt,ttl,styl
    add rsp,28h    ;don't forget to balance!
    ret

call_msgbox endp


Here's my ML64 way

    sub rsp,28h

    sub ecx,ecx
    lea rdx,tmsg
    lea r8,titl
    mov r9d,MB_OK
    call MessageBoxA

As I've said plenty of times, I don't even like using invoke so ML64 is perfect  :biggrin:
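
To illustrate the "max number of parameters (less the 4 spill params)" part, here is a minimal sketch in ML64 syntax of a call with five parameters. The labels msg, msglen and written are made up for the example, and it assumes a console build along the lines of "ml64 /c" followed by "link /subsystem:console /entry:main kernel32.lib". It has not been tried under POASM.

```asm
; Sketch only: WriteFile takes 5 parameters, so the 5th goes on the stack
; directly above the 32 byte spill area.
extrn GetStdHandle:proc
extrn WriteFile:proc
extrn ExitProcess:proc

STD_OUTPUT_HANDLE equ -11

.data
msg     db "hello from x64",13,10
msglen  equ $ - msg             ; hypothetical name: byte count of msg
written dd 0                    ; hypothetical name: receives bytes written

.code
main proc
    ; On entry rsp = 16n+8 because of the pushed return address.
    ; 32 (spill) + 8 (5th parameter) = 40 = 28h, which also realigns to 16.
    sub rsp, 28h

    mov ecx, STD_OUTPUT_HANDLE
    call GetStdHandle               ; handle comes back in rax

    mov rcx, rax                    ; 1: hFile
    lea rdx, msg                    ; 2: lpBuffer
    mov r8d, msglen                 ; 3: nNumberOfBytesToWrite
    lea r9, written                 ; 4: lpNumberOfBytesWritten
    mov qword ptr [rsp+20h], 0      ; 5: lpOverlapped = NULL, above the spill space
    call WriteFile

    xor ecx, ecx
    call ExitProcess                ; does not return, so no add rsp needed here
main endp
end
```

The same 28h is reused for all three calls; a sixth parameter would go at [rsp+28h] and the frame would grow to 38h to stay aligned.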

Overview of x64 Calling Conventions

hutch--

This is the next try. It works, but I am still guessing on the details of the stack shadowing.


; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    includelib \pasm64\lib64\kernel32.lib
    includelib \pasm64\lib64\user32.lib

    MessageBoxA PROTO :QWORD,:QWORD,:QWORD,:QWORD
    MessageBox equ <MessageBoxA>
    ExitProcess PROTO :QWORD

    call_msgbox PROTO :QWORD,:QWORD,:QWORD,:QWORD
    testproc    PROTO :QWORD

    MB_OK equ <0>

  .data
    tmsg db "POASM 64 bit MessageBox",0
    titl db "POASM 64 bit",0
    msg2 db "Called from a POASM 64 bit procedure",0
    ttl2 db "'call_msgbox' proc here",0

  .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

align 16
start:

    sub rsp, 40
    call main

    xor rax, rax
    sub rsp, 40     ; 28h
    invoke ExitProcess,rax

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    xor rax, rax
    sub rsp, 40     ; 28h
    invoke MessageBox,rax,ADDR tmsg,ADDR titl,MB_OK

    xor rax, rax
    sub rsp, 40     ; 28h
    invoke call_msgbox,rax,ADDR msg2,ADDR ttl2,MB_OK

    add rsp, 80
    ret

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

align 16

call_msgbox proc hndl:QWORD,txt:QWORD,ttl:QWORD,styl:QWORD

    invoke MessageBox,hndl,txt,ttl,styl

    invoke testproc,16

    ret

call_msgbox endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

align 16

testproc proc arg1:QWORD

    mov rax, arg1
    add rax, rax

    ret

testproc endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end start


I have learnt that if you want anything vaguely meaningful from PODUMP you use it on the object module. I have fed this through a PODUMP formatter to get vaguely readable results.


    Dump of test.obj
    File type: OBJ

  start:
    sub rsp, 28
    call 0000000000000018
    xor rax, rax
    sub rsp, 28
    mov rcx, rax
    call ExitProcess

  main:
    xor rax, rax
    sub rsp, 28
    mov r9, 0
    mov r8, titl
    mov rdx, tmsg
    mov rcx, rax
    call MessageBoxA
    xor rax, rax
    sub rsp, 28
    mov r9, 0
    mov r8, ttl2
    mov rdx, msg2
    mov rcx, rax
    call 0000000000000080
    add rsp, 50
    ret

  call_msgbox:
    call MessageBoxA
    mov rcx, 10
    call 00000000000000A0
    ret

  testproc:
    mov rax, rcx
    add rax, rax
    ret
    SUMMARY
    62 .data
    59 .drectve
    A7 .text

anunitu

Here is a link to an Amazon page with two books on 64 bit assembly programming.

http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Dstripbooks&field-keywords=64%20bit%20assembly%20programming%20windows

Haven't yet checked out the details for these, but at least there is a book on the subject.

Seems this guy, Ray Seyfarth, has written a lot on 64 bit assembly.

http://www.amazon.com/s/ref=dp_byline_sr_book_1?ie=UTF8&field-author=Ray+Seyfarth&search-alias=books&text=Ray+Seyfarth&sort=relevancerank


And he has a site to check out.

http://www.rayseyfarth.com/


Might be an up and comer in the 64 bit world.

hutch--

Looks like the guy is doing some interesting work but it's in Linux/Mac OS. I don't have any problems with the mnemonics; the problem is working out in some consistent way what specifically 64 bit Windows is doing, and I am short of decent tools to find out. I have been using PODUMP.EXE and have written a test tool to format its output into readable mnemonic code.

I found a disassembler called "arkdasm" which works OK but only produces a disassembly about as good as PODUMP, and you cannot copy or save it, so it's not a lot of use at the moment. I have yet to fully digest Agner Fog's document on calling conventions for Win64. I know the register order but have yet to see how the shadow stack works, apart from knowing that you must make stack space available before you call an API function.

For all of the world, the innards of Win64 look like the arse end of a RISC C compiler and getting a clear image of what it is doing is no real joy at the moment.
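
On the shadow stack question: in the Win64 convention the 32 bytes the caller reserves sit directly above the return address and belong to the callee, which may store rcx, rdx, r8 and r9 there. A minimal sketch in ML64 syntax follows; add4 is a hypothetical procedure written only for illustration and has not been tried under POASM.

```asm
extrn ExitProcess:proc

.code
; Hypothetical procedure: returns a+b+c+d in rax.
; At entry [rsp] holds the return address and the caller's
; 32 byte shadow area starts at [rsp+8].
add4 proc
    mov [rsp+8],   rcx      ; home slot for parameter 1
    mov [rsp+10h], rdx      ; home slot for parameter 2
    mov [rsp+18h], r8       ; home slot for parameter 3
    mov [rsp+20h], r9       ; home slot for parameter 4
    mov rax, [rsp+8]        ; from here on they can be read like stack arguments
    add rax, [rsp+10h]
    add rax, [rsp+18h]
    add rax, [rsp+20h]
    ret
add4 endp

main proc
    sub rsp, 28h            ; 32 bytes shadow space + 8 to realign to 16
    mov ecx, 1
    mov edx, 2
    mov r8d, 3
    mov r9d, 4
    call add4               ; rax = 10
    mov ecx, eax            ; use the sum as the exit code
    call ExitProcess
main endp
end
```

If the caller skips the "sub rsp" the four stores in add4 land on whatever the caller had on its stack, which is one way the missing stack space turns into a crash.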

sinsi

Here's what I do when using ML64

1. On entry to a proc, align the stack to 16. This will ensure that if you call another local proc
you know in the second proc it is misaligned, so the first thing you do is align to 16...and so on.

2. Within the proc, find out which API call has the most parameters and adjust the stack for that call.
Even if an API has fewer than 4 parameters you still need the spill space for 4 parameters.

3. Adjust the stack on entry and reuse it for each API call, only clean up when you exit the proc.
This is why 64-bit code is full of "mov [rsp+xxh],y" and fewer "push x". More code bloat :biggrin:

4. It gets tricky if you push reserved registers and/or have locals. The registers are OK, push them first.
If it's an even number then the stack is still unaligned, you can either push one again or use sub rsp,8.
Then you allocate space for any locals but you need to calculate the offset from rsp which can be messy.
Finally you allocate the spill space to align the stack and reuse it for any API calls.
Do it in one hit - "sub rsp,odd uses+sizeof locals+sizeof spill"

5. If your proc uses no API calls there is no need to align the stack, but if you call another proc be
aware that it more than likely does its own alignment and assumes a normal entry (see 1)

You can probably write a few macros to take care of USES and LOCAL but API parameters are a bit harder.


This is all from observing what my crashing programs do, it might not be the "official" way but so far, so good for me.
You really have to go back to basics, think of ML64 version 12 as MASM version 1.25  :P
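
Points 1 to 4 above can be sketched roughly as follows in ML64 syntax. The proc name show, the labels txt and cap, and the choice of two saved registers and two QWORD locals are all made up for the example; it is one possible layout under those assumptions, not an official one.

```asm
extrn MessageBoxA:proc
extrn ExitProcess:proc

.data
txt db "frame sketch",0
cap db "x64",0

.code
; Hypothetical proc: saves rbx and rsi, keeps two QWORD locals, calls one API.
show proc
    push rbx                        ; entry rsp = 16n+8
    push rsi                        ; even number of pushes: still 16n+8
    ; one hit: 8 (realign) + 16 (locals) + 32 (spill) = 56 = 38h
    sub rsp, 38h

    mov qword ptr [rsp+20h], 0      ; local 1, just above the spill space
    mov qword ptr [rsp+28h], 0      ; local 2; [rsp+30h] is the alignment pad

    xor ecx, ecx                    ; hWnd = NULL
    lea rdx, txt
    lea r8, cap
    xor r9d, r9d                    ; MB_OK
    call MessageBoxA                ; reuses the spill space at [rsp]..[rsp+1Fh]

    add rsp, 38h                    ; clean up once, on exit
    pop rsi
    pop rbx
    ret
show endp

main proc
    sub rsp, 28h
    call show
    xor ecx, ecx
    call ExitProcess
main endp
end
```

With an odd number of pushes the extra 8 drops out, so the same proc with only rbx saved would use 16 + 32 = 30h.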