X64 Proc

Started by mabdelouahab, August 03, 2016, 08:22:01 AM


mabdelouahab

I just want to learn the general rule for creating a procedure (Fastcall) without using the "Proc" keyword, and I do not want to use rbp to point to the stack.

MyProc:
Arg01 EQU <QWORD PTR [RSP+??]>
...
Arg10 EQU <QWORD PTR [RSP+??]>
Local01 EQU <QWORD PTR [RSP+??]>
...
Local10 EQU <QWORD PTR [RSP+??]>

SUB RSP,N
MOV Arg01,RCX ; /XMM0
MOV Arg02,RDX; /XMM1
MOV Arg03,R8;  /XMM2
MOV Arg04,R9;  /XMM3
...
ADD RSP,N
RET

How large should N be (args + local vars + ???)?
What about RBP? Do I need to save it?
What about RET, LEAVE ...?
Is there anything else?
I want a general rule that anyone can use,
for example:
void MyFunction(int p1, int p2, .., int p10)
{
  int a, b, c;
  a = p7;
  b = p2;
  return;
}

hutch--

It's not as easy as that; the FASTCALL convention in x64 is a mix of registers and then stack locations.

The first four args are passed, in order, in rcx, rdx, r8 and r9; any further arguments are passed on the stack. Without the PROC keyword you have to both create your own stack frame AND align RSP to 16 bytes or the function call will not work.
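
As a hedged sketch of how these rules combine for the MyFunction example from the question: it assumes ML64 syntax, one QWORD slot per argument and per local, and that the routine only calls functions taking four or fewer arguments. FRAME and the equate names are made up for this illustration. On entry [rsp] holds the return address, [rsp+8] to [rsp+32] are the four home slots reserved by the caller, and the fifth argument is at [rsp+40]; after sub rsp, FRAME every one of those offsets grows by FRAME.

```asm
; Illustrative sketch only, not a tested drop-in. FRAME and all equate
; names below are assumptions made for this example.
; FRAME = 32 (home space for callees) + 3 locals * 8 = 56.
; RSP is 8 mod 16 on entry, so FRAME must also be 8 mod 16; 56 qualifies.
FRAME   EQU 56

loc_a   EQU <QWORD PTR [rsp+32]>            ; locals sit above the callee home space
loc_b   EQU <QWORD PTR [rsp+40]>
loc_c   EQU <QWORD PTR [rsp+48]>
arg_1   EQU <QWORD PTR [rsp+FRAME+8]>       ; home slot for RCX
arg_2   EQU <QWORD PTR [rsp+FRAME+16]>      ; home slot for RDX
arg_7   EQU <QWORD PTR [rsp+FRAME+56]>      ; 7th argument, passed on the stack

    .code

MyFunction:
    sub rsp, FRAME          ; RBP is never touched, so it needs no saving
    mov arg_1, rcx          ; spill register arguments to their home slots
    mov arg_2, rdx

    mov rax, arg_7          ; a = p7
    mov loc_a, rax
    mov rax, arg_2          ; b = p2
    mov loc_b, rax

    add rsp, FRAME          ; undo the sub; no LEAVE because no RBP frame exists
    ret
```

If the count of locals makes the total a multiple of 16, add 8 bytes of padding to FRAME. Note that a routine written this way carries no unwind data, so exceptions cannot be unwound through it.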

TWell

The proc keyword is good for readability and for debug symbols.
Rather use:
option epilogue:none
option prologue:none
...
MyProc proc
  ...
  ret
MyProc endp

jj2007

Quote from: hutch-- on August 03, 2016, 01:27:37 PM
... AND align RSP to 16 bytes or the function call will not work.

Or rather, it will work most of the time, and the rest of the time mysterious bugs will drive you mad :P
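
The failures are intermittent because a callee only faults on paths that execute an aligned SSE access (movaps, for example) to its own stack locals. One way to catch the problem early is a debug-only check placed just before a call. The fragment below is a minimal sketch in ML64 syntax, not something taken from the SDK:

```asm
; Hypothetical debug aid: break into the debugger if RSP is misaligned.
    test rsp, 15            ; low four bits must be zero right before a CALL
    jz @F
    int 3                   ; stack is not 16-byte aligned here
@@:
```

The check changes the flags, so it should not sit between a compare and the jump that depends on it.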

hutch--

Here are the basics of a manual procedure call with more than 4 arguments.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include64\masm64rt.inc

    STACKFRAME

    .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    mov QWORD PTR [rsp+56], 8
    mov QWORD PTR [rsp+48], 7
    mov QWORD PTR [rsp+40], 6
    mov QWORD PTR [rsp+32], 5
    mov r9, 4
    mov r8, 3
    mov rdx, 2
    mov rcx, 1
    call testme

    waitkey

    void(ExitProcess,0)

    ret

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

testme proc a1:QWORD,a2:QWORD,a3:QWORD,a4:QWORD,a5:QWORD,a6:QWORD,a7:QWORD,a8:QWORD

    mov [rbp+10h], rcx      ; a1: store the register args in their home slots
    mov [rbp+18h], rdx      ; a2
    mov [rbp+20h], r8       ; a3
    mov [rbp+28h], r9       ; a4

    conout str$(a4),lf,lf

    ret

testme endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    end

hutch--

Here is a variation on the one before.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include64\masm64rt.inc

    STACKFRAME

    .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    mov rcx, 1
    mov rdx, 2
    mov r8, 3
    mov r9, 4
    mov QWORD PTR [rsp+32], 5
    mov QWORD PTR [rsp+40], 6
    mov QWORD PTR [rsp+48], 7
    mov QWORD PTR [rsp+56], 8

    call testme

    waitkey

    void(ExitProcess,0)

    ret

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

testme proc a1:QWORD,a2:QWORD,a3:QWORD,a4:QWORD,a5:QWORD,a6:QWORD,a7:QWORD,a8:QWORD

    mov [rbp+16], rcx       ; a1: same home slots as before, decimal offsets
    mov [rbp+24], rdx       ; a2
    mov [rbp+32], r8        ; a3
    mov [rbp+40], r9        ; a4

    conout lf,"    "

    conout str$(a1)," ",str$(a2)," ",str$(a3)," ",str$(a4)," ", \
           str$(a5)," ",str$(a6)," ",str$(a7)," ",str$(a8),lf,lf
    ret

testme endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    end


This is the output.


    1 2 3 4 5 6 7 8

Press any key to continue...

mineiro

This is hard to do by hand, mabdelouahab; it would be better to write a preprocessor instead.
You have 10 args, arg1 to arg10. At the start of your proc you need stack slots for the first four, which arrive in rcx, rdx, r8 and r9. Some code called your function, which means the stack holds the return address (the offset of the instruction after the call) and the other 6 args.
Then you lay out your locals, local1 to local10; that is just 10*8 bytes. But suppose you later remove a local variable, say local10. Now all the local addresses have to be updated: removing local10 is like removing local1 and shifting every address, like an avalanche. So if you remove local10, you have to go through all your code again, because the variables now refer to the wrong positions. There is a pattern here: an even count of locals (10, 8) behaves one way, and an odd count behaves another.
And note that I haven't even mentioned the arguments of functions called inside this proc.
That said, consider the other case: if your procedure does not call any other procedure, the stack can be at an odd or even multiple of 8 and it will work fine. No need to align the stack.
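
A minimal sketch of that leaf case, assuming ML64 syntax and a hypothetical routine name. It never changes RSP, so the incoming arguments are read at fixed offsets from the entry value of RSP:

```asm
; Hypothetical leaf routine: returns arg1 + arg5 in RAX.
; It makes no calls and never changes RSP, so alignment does not matter.
LeafAdd:
    mov rax, rcx                    ; arg1 arrives in RCX
    add rax, QWORD PTR [rsp+40]     ; arg5: 8 (return address) + 32 (home space)
    ret
```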

Hard job, sir. I can't do this using macros, which is why I have suggested a 'modular macro'.
Something like:
You create a procedure (macro) that does the hard job. The assembler then assembles your procedure. When the assembler finds a 'proc'/'endp' pair, it allocates memory for our procedure, sets execute permission, copies our assembled code to that memory block, and sets the rsi register to point to the source text between proc and endp. It sets rdi to a zeroed memory block and rcx to the size of the string. So our code has to work with strings. On return, the assembler checks the rax register for any message we want to send, and overwrites the proc-to-endp code with the contents at rdi.
This way you can create macros using any assembler. Something like, 'let's put an end to this macro war'.
I'd rather be this ambulant metamorphosis than to have that old opinion about everything

hutch--

With ML64 you write a prologue and epilogue to do this. It should provide the starting location for another proc to call, align RSP to 16, and provide the entry and exit code if more than 4 arguments are used. With 4 or fewer arguments you don't need a stack frame, as they are passed in the first 4 registers.

mineiro

Yes, but how do you deal with the arguments of functions called inside this function?
If there are 10 invokes inside this proc, we look for the function that needs the most arguments, say 8, and functions that take 5 arguments then use the same stack space reserved for the biggest one. This should be part of the prologue and epilogue so that we can use only the rsp register.
Well, instead of pushing things we can move them to rsp-relative slots.
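
A sketch of that idea, assuming ML64 syntax. Func8 and Func5 are hypothetical external routines taking 8 and 5 QWORD arguments, and OUTAREA and FRAME are names invented for this example:

```asm
; Sketch under assumptions: Func8 and Func5 are hypothetical external
; routines; the argument values are placeholders.
EXTERN Func8:PROC
EXTERN Func5:PROC

OUTAREA EQU 64                      ; 8 slots * 8 bytes, sized for the largest call
FRAME   EQU OUTAREA+8               ; +8 padding keeps RSP 16-byte aligned

    .code

Caller:
    sub rsp, FRAME                  ; reserved once, in the "prologue"

    mov rcx, 1
    mov rdx, 2
    mov r8, 3
    mov r9, 4
    mov QWORD PTR [rsp+32], 5       ; arguments 5..8 are moved, not pushed
    mov QWORD PTR [rsp+40], 6
    mov QWORD PTR [rsp+48], 7
    mov QWORD PTR [rsp+56], 8
    call Func8

    mov rcx, 1                      ; volatile registers must be reloaded
    mov rdx, 2
    mov r8, 3
    mov r9, 4
    mov QWORD PTR [rsp+32], 5       ; the 5-argument call reuses the same area
    call Func5

    add rsp, FRAME                  ; released once, in the "epilogue"
    ret
```

Because RSP never moves between the sub and the add, every rsp-relative offset stays valid for the whole body, which is what makes an rbp-free frame workable.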

---edited---
Add to this the registers that must be preserved across function calls and we have a nice cake.  :dazzled:

If things are done this way, we lose at most one qword of stack space per procedure.

jj2007

Quote from: mineiro on August 04, 2016, 12:50:47 AM
Yes, but how do you deal with the arguments of functions called inside this function?

Three macros in \Masm32\MasmBasic\Res\JBasic.inc:
JBasicProlog
JBasicEpilog
jinvoke

Plus the flag jbCompStyle, which decides if the stack is built by pushing args or by moving args.

mabdelouahab

jj2007, mineiro, hutch, I thank you all. Really valuable information; I will try to put it to use.
Thank you so much

mineiro

Quote from: jj2007 on August 04, 2016, 03:01:22 AM
Three macros in \Masm32\MasmBasic\Res\JBasic.inc:
JBasicProlog
JBasicEpilog
jinvoke
Plus the flag jbCompStyle, which decides if the stack is built by pushing args or by moving args.
:shock: These guys make me feel like an ant among Titans. It's not funny.
When I have learned about macros I'll be back  8) . When I come back, people will run as fast as they can, shouting:
- Run, run everybody, it's mineirozzilla, the grandmaster of masm macros.

Hmm, time to return to reality; let me go back to my own insignificance in the world of macros.

jj2007

Quote from: mineiro on August 04, 2016, 07:19:47 AM
It's not funny.

Well, your post is funny :P

Biggest problem with this stuff is that prolog + epilog are badly documented, so that was a lot of trial and error.

mineiro

For me, the hardest thing is the macro debugging facilities. Sometimes they work, sometimes not. It acts like a black box.

jj2007

Quote from: mineiro on August 04, 2016, 10:13:31 PM
For me, the hardest thing is the macro debugging facilities. Sometimes they work, sometimes not. It acts like a black box.

Can you post an example that doesn't work? I haven't had problems for a long time, and I use it a lot.

Today I've posted a new MB version with the dual 64-/32-bit version of the deb macro. Not yet as powerful as the original, but it is easy to use and may help to debug code in the console without launching a debugger:
call LoadVals ; roll your own - put some nice values into the regs
.DATA
MyR4 REAL4 12345678.12345678
MyR8 REAL8 12345678.12345678
MyDw DWORD 1234567890
MyQw QWORD 1234567890123456789
txMy$ db "This is a string in a global variable. Below is the current program counter:", 0
My$ DefSize txMy$
.CODE

movlps xmm4, MyR8
mov rdx, Chr$("This is a string. You can display strings in the macro by putting a '$' before the variable.")
mov rdi, 123
  deb 4, "Testing the debug macro:", x:rax, rcx, $rdx, rsi, $rdi, $My$, x:rip, xmm1, f:xmm4, xmm5, ST(0), e:ST(1), ST(2), ST(3), MyR4, MyR8, MyDw, MyQw
  deb 4, "Testing the debug macro:", x:rax, rcx, $rdx, rsi, $rdi, $My$, x:rip, xmm1, f:xmm4, xmm5, ST(0), e:ST(1), ST(2), ST(3), MyR4, MyR8, MyDw, MyQw
deb 4, "Single line A:", x:rax
deb 4, "Single line B:", rcx
deb 4, "Single line C:", $rdx


Output:
Testing the debug macro:
x:rax   1111111111111111h
rcx     2222222222222222
$rdx    This is a string. You can display strings in the macro by putting a '$' before the variable.
rsi     5555555555555555
$rdi    ** not a pointer: 7bh
$My$    This is a string in a global variable. Below is the current program counter:
x:rip   1400010aah
xmm1    1111111111111111111
f:xmm4  12345678.123456780
xmm5    5555555555555555555
ST(0)   3.141592654
e:ST(1) 9.876543210e+004
ST(2)   12345.678900000
ST(3)   3.141592654
MyR4    12345678.000000000
MyR8    12345678.123456780
MyDw    1234567890
MyQw    1234567890123456789

Testing the debug macro:
x:rax   1111111111111111h
rcx     2222222222222222
$rdx    This is a string. You can display strings in the macro by putting a '$' before the variable.
rsi     5555555555555555
$rdi    ** not a pointer: 7bh
$My$    This is a string in a global variable. Below is the current program counter:
x:rip   14000226dh
xmm1    1111111111111111111
f:xmm4  12345678.123456780
xmm5    5555555555555555555
ST(0)   3.141592654
e:ST(1) 9.876543210e+004
ST(2)   12345.678900000
ST(3)   3.141592654
MyR4    12345678.000000000
MyR8    12345678.123456780
MyDw    1234567890
MyQw    1234567890123456789

Single line A:  x:rax   1111111111111111h
Single line B:  rcx     2222222222222222
Single line C:  $rdx    This is a string. You can display strings in the macro by putting a '$' before the variable.


Such code assembles with ML, AsmC, JWasm and HJWasm in 64- and 32-bit mode. Results are identical except where the register size matters.

"Project" attached. Note that deb is used twice above, in order to show that it doesn't trash any registers. It does trash the flags, though, unlike the original MasmBasic deb macro.

EDIT: Updated, I had forgotten rip/eip in the example.