Author Topic: PROC and prolog/epilog  (Read 5521 times)

markallyn

  • Member
  • **
  • Posts: 192
PROC and prolog/epilog
« on: November 28, 2017, 08:04:12 AM »
Hello everyone,

I would like to verify (or not) how the PROC directive works as far as creating prologs and epilogs.  Let us suppose that I have two PROCs, one of which calls the other.  Let us further assume that I do not use the OPTION PROLOGUE:NONE and OPTION EPILOGUE:NONE options.  Then using the PROC keyword in the first function will automatically generate a PROLOG and an EPILOG too.  But using the PROC keyword in the callee will NOT generate a PROLOG/EPILOG pair if the function is a leaf function.

Is this true?

Thanks,
Mark Allyn

nidud

  • Member
  • *****
  • Posts: 1606
    • https://github.com/nidud/asmc
Re: PROC and prolog/epilog
« Reply #1 on: November 28, 2017, 09:29:12 AM »
Stack frame generation depends on allocation of local memory and the caller creates the frame for arguments:

Code: [Select]
.code

a   proc
    ret
a   endp

b   proc
    call a
    ret
b   endp

    end

Code: [Select]
a       PROC
        ret             ; 0000 _ C3
a       ENDP
b       PROC
        call    a       ; 0001 _ E8, FFFFFFFA
        ret             ; 0006 _ C3
b       ENDP

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 5756
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: PROC and prolog/epilog
« Reply #2 on: November 28, 2017, 09:38:42 AM »
Mark,

The way I would test that with ML64 is by creating a default proc and then having a look at it with a disassembler. The problem is this: you need to have at least the entry point of the executable correctly set up and aligned, or the app will not start. I bothered to write the macros necessary to do this. You can turn it off AFTER the app has started properly for simple leaf procedures and turn it back on after the procedure has exited, but you must get the initial alignment right or the app exits before it even runs.
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :biggrin:

markallyn

Re: PROC and prolog/epilog
« Reply #3 on: November 28, 2017, 10:47:26 AM »
Hutch and Nidud, good evening.

Nidud-  Using your code, what I'm saying is that in your "a" function, using the PROC directive does NOT cause prologue and epilogue code to be emitted in "a".  But, using PROC in "b" DOES cause prologue and epilogue code to be emitted, at least if "a" is a simple leaf function.  I have not found anything in msft documentation that indicates that PROC doesn't cause leaf function prologues and epilogues, but it ALWAYS causes prologues and epilogues to be generated if the function is a frame function, i.e. it calls another function.

Hutch-  Yes I actually created two functions as test functions, the one calling the other and looked them over with x64dbg.  That is how I "discovered" this apparently undocumented behavior of PROC when used with a leaf function.  I haven't yet tested what happens when I make the callee a frame function, but will do so very soon.  I'm guessing that in that case, PROC in the callee will emit prologue and epilogue code too.  If you think it's helpful I will post the test code here.

One other, related question, albeit trivial.  When PROC is used with a frame function and prologue code gets emitted, ml64 automatically uses the instruction "enter 80,0".  Why 80?

Thanks to both of you.

Mark

hutch--

Re: PROC and prolog/epilog
« Reply #4 on: November 28, 2017, 06:29:31 PM »
Mark,

It is adjustable, just look at the STACKFRAME macro and how it uses the UseStackFrame and EndStackFrame macros.

STACKFRAME MACRO dflt:=<96>,dynm:=<128>,algn:=<16>

The alignment equate has a default of 16. You can change it to align procedures so that the locals match the size of the largest data type, so aligning at 64, 1024, or a 4096-byte page is easy to do. That means you can use SSE, AVX and AVX2 locals if you place them first, before smaller data types.

Rather than repeat it here, the reference material for how the stackframe is built and used is in the MASM64.chm help file, and the actual pre-processor code is in the main macro file. As you are probably aware by now, external documentation is appalling, incomplete and often wrong. I built this system by exhaustive testing of code against the Microsoft ABI using API functions and local functions, both with and without stackframes.

Now with the value 80 used with the ENTER mnemonic, you will do better by reading the Intel instruction manual. MASM has used LEAVE over many years while ENTER makes the proc simple,  clean and very reliable and the method is free of the messy and unreliable RSP twiddling that many have messed around with. It is not a fast mnemonic but on procedures that need a stack frame so they can call other high level procedures and API or external library functions, stack entry speed does not matter.
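As a rough illustration of what the Intel manual describes for ENTER with a nesting level of 0, the two procedures below build the same frame. This is only a sketch: the names frame_enter and frame_manual are made up for the comparison, and 80 is simply the value quoted in the question above, not an explanation of where it comes from.

Code: [Select]
.code

frame_enter proc            ; hypothetical name
    enter 80,0              ; push rbp / mov rbp,rsp / sub rsp,80 in one instruction
    ; ... body ...
    leave                   ; mov rsp,rbp / pop rbp
    ret
frame_enter endp

frame_manual proc           ; hypothetical name, same frame built by hand
    push rbp                ; entry rsp is 16n+8, after the push it is 16n
    mov  rbp,rsp            ; frame pointer
    sub  rsp,80             ; 80 = 5*16, so rsp stays 16-byte aligned
    ; ... body ...
    mov  rsp,rbp            ; long form of leave
    pop  rbp
    ret
frame_manual endp

    end

Because 80 is a multiple of 16, the stack is still aligned for any call made from the body, provided it was aligned at the call that entered the procedure.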

Now with the entry point being correctly set up the stack is aligned correctly so when you call a leaf procedure with no stack frame, the stack is aligned and you can avoid the tiny overhead as long as you don't mess the stack up and use a simple RET to exit the proc. Generally you don't use PUSH / POP like you did in win32, you make locals and use MOV to load the registers that need to be protected and restore them before proc exit.

LOCAL myreg :QWORD

mov myreg, rsi

; write the source code

mov rsi, myreg

ret
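Put together as a whole procedure, the pattern above might look like the following minimal sketch. The name SaveRsiDemo and the body are invented for illustration, and it assumes the assembler builds a frame for the LOCAL (via its default prologue or via the stackframe macros described above).

Code: [Select]
.code

SaveRsiDemo proc                ; hypothetical name
    LOCAL myreg :QWORD          ; the LOCAL is what requires a stack frame

    mov myreg, rsi              ; preserve the non-volatile register in the local

    mov rsi, 1234h              ; the body is now free to use rsi
    mov rax, rsi                ; return value in rax

    mov rsi, myreg              ; restore rsi before leaving
    ret                         ; the generated epilogue runs before the actual return
SaveRsiDemo endp

    end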


markallyn

Re: PROC and prolog/epilog
« Reply #5 on: November 29, 2017, 02:37:00 AM »
Good morning/evening Hutch,

I am still reading and re-reading your last post.  As always, it's compact and loaded with information and takes more than a single pass to digest.

Meantime, I wanted to report on the results of inserting a simple invoke command into what was a leaf function to see what ml64 would do with the PROC directive under these circumstances.  You will recall previously that I had found that using PROC in a leaf function resulted in NO prologue or epilogue code being emitted.  Well, when I converted the callee into a "frame function" with the invoke macro (all it did was to call printf), then PROC puts the prologue and epilogue code in--including a gratuitous allocation of 96 bytes for locals that I'm not using.  Very annoying.  Annoying because it messes up the frame pointer to 6 parameters I had passed in from the calling function.

The one conclusive take-away from this escapade is that novices like me cannot write even the simplest masm code without a debugger, and a good one at that!

Thanks for your help and counseling.

Mark

nidud

Re: PROC and prolog/epilog
« Reply #6 on: November 29, 2017, 03:11:06 AM »
Quote
Nidud-  Using your code, what I'm saying is that in your "a" function, using the PROC directive does NOT cause prologue and epilogue code to be emitted in "a".  But, using PROC in "b" DOES cause prologue and epilogue code to be emitted, at least if "a" is a simple leaf function.

There is no stack frame added in procedure b, so this assumption is wrong.

Quote
I have not found anything in msft documentation that indicates that PROC doesn't cause leaf function prologues and epilogues, but it ALWAYS causes prologues and epilogues to be generated if the function is a frame function, i.e. it calls another function.

The PROC directive does not cause prologues and epilogues to be generated based on these conditions. When you call another function you have to manually add a stack frame for the call, regardless of whether the call is inside a PROC or not.

It may be simpler to use a disassembler instead of a debugger for testing this.
 
http://www.agner.org/optimize/#objconv

AW

  • Member
  • *****
  • Posts: 1476
  • Let's Make ASM Great Again!
Re: PROC and prolog/epilog
« Reply #7 on: November 29, 2017, 03:25:51 AM »
There are only 3 cases to consider when we are not talking about exception handling and are using the default ML64 prolog/epilog.

1- Leaf functions
Align the stack on entry, restore the stack on exit

2- Functions that take parameters (in registers, or in registers and on the stack), have LOCAL variables, or have a USES clause, in any combination, and that may also call other functions
MASM automatically builds an rbp based stack frame and uses leave on the epilog.
ALL you have to do is ALIGN the stack after providing shadow space + space for the parameters after the 4th of the function(s) to be called, if there are functions to be called!

3- Functions that simply call other functions, but have no LOCALS, or parameters or USES.
Subtract from rsp the amount of shadow space plus parameters after the 4th and align the stack. Restore the stack on exit because there is no leave.


markallyn

Re: PROC and prolog/epilog
« Reply #8 on: November 29, 2017, 03:48:28 AM »
Good morning, Nidud,

I tested the code you wrote and you're right--no prologue/epilogue pair is emitted even though procedure b has a call to function a.   So, I am very puzzled.

More experimentation is in order ....

BTW, I'm using x64dbg, which has a nice disassembler with it.

Regards,
Mark

nidud

Re: PROC and prolog/epilog
« Reply #9 on: November 29, 2017, 04:34:57 AM »
Quote
BTW, I'm using x64dbg, which has a nice disassembler with it.

Yes, I'm using that too.

In 32-bit, the stack frame for a call is created basically by pushing the arguments and the return address onto the stack and jumping:
Code: [Select]
a proto :ptr, :ptr, :ptr, :ptr, :ptr

.code
    push 5
    push 4
    push 3
    push 2
    push 1
    push @F
    jmp a
@@:

In 64-bit the principle is the same, with some modifications.
Code: [Select]
    mov rcx,1
    mov rdx,2
    mov r8,3
    mov r9,4

    push 5
    push r9
    push r8
    push rdx
    push rcx
    push @F
    jmp a
@@:

However, you skip the pushing: the first four arguments stay in registers and only the fifth argument is stored in the frame:
Code: [Select]
    sub rsp,5*8 ; create a frame for 5 args
    mov rcx,1   ; first 4
    mov rdx,2
    mov r8,3
    mov r9,4
    mov qword ptr [rsp+4*8],5
    call a
    add rsp,5*8 ; restore stack

In addition to this (and the very reason it's done this way), the stack has to be aligned to 16 bytes at the call. This means the stack is aligned 16 - 8 (return address) on proc-entry.

Vortex

  • Member
  • *****
  • Posts: 1840
Re: PROC and prolog/epilog
« Reply #10 on: November 29, 2017, 04:40:08 AM »
Hi markallyn,

I use Agner Fog's objconv to disassemble object files. Studying the output is useful for me :

http://agner.org/optimize/

markallyn

Re: PROC and prolog/epilog
« Reply #11 on: November 29, 2017, 06:34:01 AM »
Nidud, aw27, and Vortex,

Nidud:  Could you explain a bit more what you mean by:

"This means the stack is aligned 16 - 8 (return address) on proc-entry."

I follow everything else.  Very clear.

aw27:
I'm missing something very basic in what you wrote.  Namely, are you saying that these three conditions REQUIRE prologues and epilogues (whether built-in by ml or hand-written) OR are you saying the opposite, that the three conditions DO NOT require prologues and epilogues? 

Vortex:
Yes, I'm familiar with objconv by A. Fog.  It's been a couple of years since I played with it.  I'll try it again.  Thanks for reminding me about its existence.

Regards,
Mark

nidud

Re: PROC and prolog/epilog
« Reply #12 on: November 29, 2017, 07:06:30 AM »
Quote
Nidud:  Could you explain a bit more what you mean by:

"This means the stack is aligned 16 - 8 (return address) on proc-entry."

I follow everything else.  Very clear.

Well, the stack is (or should always be) aligned on call:
Code: [Select]
    ; the stack is aligned 16 here..
    call a

However, the call itself pushes the return address onto the stack, so the stack is (always) 8 bytes off on entry:
Code: [Select]
a   proc
    ; the stack is aligned 8 here..
    push rbp   
    mov rbp,rsp
    ; stack now aligned 16

This means that the frame created for arguments must hold an even number of 8-byte slots if RBP is pushed as above, or an odd number if it is not, to ensure alignment on call:
Code: [Select]
    ;sub rsp,5*8 ; 5 slots only: would leave the stack 8 bytes off after push rbp
    sub rsp,6*8 ; create a frame for 5 args + 1 padding slot
    ; stack aligned 16
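
The even/odd rule can be sketched side by side as follows. This is an illustration only: with_rbp and no_rbp are made-up names, and the five-argument function a is assumed to be defined elsewhere and declared here with extern.

Code: [Select]
extern a:proc               ; assumption: 5-argument function defined in another module

.code

; RBP is pushed, so the frame needs an EVEN number of 8-byte slots
with_rbp proc               ; hypothetical name
    push rbp                ; entry rsp = 16n+8, now 16n
    mov rbp,rsp
    sub rsp,6*8             ; 5 arg slots + 1 padding slot, still 16-aligned
    mov rcx,1
    mov rdx,2
    mov r8,3
    mov r9,4
    mov qword ptr [rsp+4*8],5
    call a
    mov rsp,rbp             ; release the frame
    pop rbp
    ret
with_rbp endp

; nothing is pushed, so the frame needs an ODD number of 8-byte slots
no_rbp proc                 ; hypothetical name
    sub rsp,5*8             ; entry rsp = 16n+8, minus 40 gives a 16-aligned rsp
    mov rcx,1
    mov rdx,2
    mov r8,3
    mov r9,4
    mov qword ptr [rsp+4*8],5
    call a
    add rsp,5*8             ; restore stack
    ret
no_rbp endp

    end

In both variants RSP is a multiple of 16 at the call instruction, which is the only point where the alignment actually has to hold.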

AW

Re: PROC and prolog/epilog
« Reply #13 on: November 29, 2017, 08:44:22 AM »
Please refer to the 3 cases above:

Code: [Select]
includelib \masm32\lib64\kernel32.lib
ExitProcess PROTO :dword

.code

;CASE 1
p1   proc ; leaf
sub rsp, 8 ; align stack

; ... do our things

add rsp, 8
ret
p1   endp

;Case 2
p2 proc parm1:dword
    ;and rsp, -16 ;no need here, but will not hurt if used, because push rbp will align

; ... do our things
ret
p2 endp

;Case 2
p3 proc
LOCAL myvar:dword
and rsp, -16 ; align

; do our things
ret
p3 endp

;Case 2
p4_1 proc uses rbx rdi rsi par1:qword, par2:qword, par3:qword, par4:qword, par5:qword
and rsp, -16 ; align

; do our things
ret
p4_1 endp

; Case 3
p4 proc
sub rsp, 28h ; shadow space+space for 5th parameter.
       ;and rsp, -16 ;no need here, but will not hurt if used, because already aligned
mov rcx,1
mov rdx,2
mov r8,3
mov r9,4
mov rax, 5
mov [rsp+20h],rax
call p4_1
add rsp, 28h
ret
p4 endp

; Case 3, but without need for epilog because ExitProcess fixes everything
main   proc
sub rsp, 28h ; shadow space + align

call p1
mov rcx, 1
call p2
call p3
call p4


;add rsp, 28h
;ret
mov ecx,0
call ExitProcess

main   endp

end

hutch--

Re: PROC and prolog/epilog
« Reply #14 on: November 29, 2017, 01:24:06 PM »
I did not want to double post this example of recursion so I put it here,

http://masm32.com/board/index.php?topic=6720.0

No stack frame twiddling trying to get it to work, just a prologue/epilogue that constructs a minimum stack frame and automates stack frame creation in a context where it is the only way to do it. (Iteration procs are not recursive.)