PROC and prolog/epilog

Started by markallyn, November 28, 2017, 08:04:12 AM

markallyn

Hello everyone,

I would like to verify (or not) how the PROC directive works as far as creating prologues and epilogues.  Let us suppose that I have two PROCs, one of which calls the other.  Let us further assume that I do not use the OPTION PROLOGUE:NONE and OPTION EPILOGUE:NONE options.  Then using the PROC keyword in the first function will automatically generate a prologue and an epilogue too.  But using the PROC keyword in the callee will NOT generate a prologue/epilogue pair if the function is a leaf function.

Is this true?

Thanks,
Mark Allyn

nidud

#1
deleted

hutch--

Mark,

The way I would test that with ML64 is by creating a default proc and then having a look at it with a disassembler. The problem is this: you need at least the entry point of the executable correctly set up and aligned or the app will not start. I bothered to write the macros necessary to do this. You can turn it off AFTER the app has started properly for simple leaf procedures, then turn it back on after the procedure has exited, but you must get the initial alignment right or the app exits before it even runs.

markallyn

Hutch and Nidud, good evening.

Nidud-  Using your code, what I'm saying is that in your "a" function, using the PROC directive does NOT cause prologue and epilogue code to be emitted in "a".  But using PROC in "b" DOES cause prologue and epilogue code to be emitted, at least if "a" is a simple leaf function.  I have not found anything in Microsoft's documentation indicating that PROC skips the prologue and epilogue for leaf functions, but it ALWAYS generates them if the function is a frame function, i.e. it calls another function.

Hutch-  Yes I actually created two functions as test functions, the one calling the other and looked them over with x64dbg.  That is how I "discovered" this apparently undocumented behavior of PROC when used with a leaf function.  I haven't yet tested what happens when I make the callee a frame function, but will do so very soon.  I'm guessing that in that case, PROC in the callee will emit prologue and epilogue code too.  If you think it's helpful I will post the test code here.

One other, related question, albeit trivial.  When PROC is used with a frame function and prologue code gets emitted, ml64 automatically uses the instruction "enter 80,0".  Why 80?

Thanks to both of you.

Mark

hutch--

Mark,

It is adjustable, just look at the STACKFRAME macro and how it uses the UseStackFrame and EndStackFrame macros.

STACKFRAME MACRO dflt:=<96>,dynm:=<128>,algn:=<16>

The alignment equate has a default of 16, which you can change. This allows you to align procedures so that the locals match the largest of the data types in use. If you want to align at 64 or 1024, or page align at 4096, it is easy to do, which means you can use SSE, AVX and AVX2 locals if you place them first, before smaller sized data types.

Rather than repeat it, the reference material for how the stackframe is built and used is in the MASM64.chm help file and the actual pre-processor code is in the main macro file. As you are probably aware by now, external documentation is appalling, incomplete and often wrong. I built this system by exhaustive testing of code against the Microsoft ABI using API functions and local functions, both with and without stackframes.

Now with the value 80 used with the ENTER mnemonic, you will do better by reading the Intel instruction manual. MASM has used LEAVE over many years, while ENTER makes the proc simple, clean and very reliable, and the method is free of the messy and unreliable RSP twiddling that many have messed around with. It is not a fast mnemonic, but on procedures that need a stack frame so they can call other high level procedures and API or external library functions, stack entry speed does not matter.
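
For readers who do not have the Intel manual to hand, here is a minimal sketch of what the two mnemonics do when the nesting level is 0. It assumes plain ml64 with no prologue macros active, and the procedure names with_enter and without_enter are hypothetical, chosen only for this illustration.

```asm
; Sketch only: both procedures build and tear down the same 80-byte frame.
.code

with_enter proc
    enter 80, 0             ; first operand = bytes reserved, second = nesting level
    ; ... body ...
    leave
    ret
with_enter endp

without_enter proc
    push rbp                ; what ENTER 80,0 does, step by step
    mov  rbp, rsp
    sub  rsp, 80
    ; ... body ...
    mov  rsp, rbp           ; what LEAVE does, step by step
    pop  rbp
    ret
without_enter endp

end
```

The first operand is simply the number of bytes reserved below the saved RBP. Since RSP is 16-byte aligned once RBP has been pushed, any multiple of 16 (80 included) keeps it aligned for the calls made in the body.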

Now with the entry point correctly set up, the stack is aligned correctly, so when you call a leaf procedure with no stack frame the stack is aligned and you can avoid the tiny overhead, as long as you don't mess the stack up and use a simple RET to exit the proc. Generally you don't use PUSH / POP like you did in win32; you make locals and use MOV to save the registers that need to be protected and restore them before proc exit.

LOCAL myreg :QWORD

mov myreg, rsi

; write the source code

mov rsi, myreg

ret


markallyn

Good morning/evening Hutch,

I am still reading and re-reading your last post.  As always, it's compact and loaded with information and takes more than a single pass to digest.

Meantime, I wanted to report on the results of inserting a simple invoke command into what was a leaf function to see what ml64 would do with the PROC directive under these circumstances.  You will recall previously that I had found that using PROC in a leaf function resulted in NO prologue or epilogue code being emitted.  Well, when I converted the callee into a "frame function" with the invoke macro (all it did was to call printf), then PROC puts the prologue and epilogue code in--including a gratuitous allocation of 96 bytes for locals that I'm not using.  Very annoying.  Annoying because it messes up the frame pointer to 6 parameters I had passed in from the calling function.
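
As a point of reference for the six-parameter case, the sketch below shows where the incoming arguments sit relative to RBP once a frame has been built. It assumes plain ml64 and the Microsoft x64 calling convention, builds the frame by hand so the offsets are visible, and uses a hypothetical procedure name, six_args.

```asm
; Sketch only: six_args is hypothetical; the frame is built by hand.
.code

six_args proc
    push rbp                ; entry RSP pointed at the return address
    mov  rbp, rsp           ; [rbp] = saved rbp, [rbp+8] = return address
    sub  rsp, 96            ; same size as the allocation described above

    mov  [rbp+10h], rcx     ; home slots in the caller's shadow space
    mov  [rbp+18h], rdx
    mov  [rbp+20h], r8
    mov  [rbp+28h], r9

    mov  rax, [rbp+30h]     ; argument 5, the first stack-passed argument
    add  rax, [rbp+38h]     ; argument 6

    leave
    ret
six_args endp

end
```

The RBP-relative offsets do not depend on the size of the local area. RSP-relative offsets do: an argument at [rsp+28h] on entry is at [rsp+90h] after this prologue, because RSP has moved down by 8 + 96 bytes.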

The one conclusive take-away from this escapade is that novices like me cannot write even the simplest masm code without a debugger, and a good one at that!

Thanks for your help and counseling.

Mark

nidud

#6
deleted

aw27

There are only 3 cases to consider when we are not talking about exception handling and are using the default ML64 prolog/epilog:

1- Leaf functions
Align the stack on entry, restore the stack on exit

2- Functions that take parameters (in registers, or in registers and on the stack), declare LOCAL variables, or have a USES clause, in any combination, and that may also call other functions
MASM automatically builds an rbp-based stack frame and uses leave in the epilog.
ALL you have to do is ALIGN the stack after providing shadow space + space for the parameters after the 4th of the function(s) to be called, if there are functions to be called!

3- Functions that simply call other functions, but have no LOCALS, or parameters or USES.
Subtract from rsp the amount of shadow space plus parameters after the 4th and align the stack. Restore the stack on exit because there is no leave.
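
The arithmetic behind case 3 can be sketched with a few equates. The caller must have RSP 16-byte aligned at the point of the CALL, so on entry to any procedure RSP mod 16 is 8 (the return address). The adjustment therefore has to be 8 more than a multiple of 16. The names SHADOW_SPACE, STACK_ARGS, RAW_FRAME, FRAME, callee5 and caller below are hypothetical, and plain ml64 is assumed.

```asm
; Sketch only: all names here are illustrative, not from any SDK.
SHADOW_SPACE equ 32                                    ; 4 register home slots
STACK_ARGS   equ 1                                     ; arguments after the 4th
RAW_FRAME    equ SHADOW_SPACE + STACK_ARGS * 8         ; 40 bytes needed
FRAME        equ ((RAW_FRAME + 8 + 15) / 16) * 16 - 8  ; rounded so FRAME + 8 is a multiple of 16

.code

callee5 proc                        ; leaf, no frame
    mov  rax, [rsp+28h]             ; 5th argument: above return address and shadow space
    ret
callee5 endp

caller proc                         ; case 3: no parameters, LOCALs or USES
    sub  rsp, FRAME                 ; 28h here, RSP is now 16-byte aligned
    mov  rcx, 1
    mov  rdx, 2
    mov  r8, 3
    mov  r9, 4
    mov  qword ptr [rsp+SHADOW_SPACE], 5   ; 5th argument goes just above the shadow space
    call callee5
    add  rsp, FRAME                 ; restore by hand, there is no leave
    ret
caller endp

end
```

With one stack argument FRAME works out to 40 (28h). With none it is also 40, because 32 bytes of shadow space alone would leave RSP misaligned by 8.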


markallyn

Good morning, Nidud,

I tested the code you wrote and you're right--no prologue/epilogue pair is emitted even though procedure b has a call to function a.   So, I am very puzzled.

More experimentation is in order ....

BTW, I'm using x64dbg which has a nice disassembler with it.

Regards,
Mark

nidud

#9
deleted

Vortex

Hi markallyn,

I use Agner Fog's objconv to disassemble object files. Studying the output is useful for me :

http://agner.org/optimize/

markallyn

Nidud, aw27, and Vortex,

Nidud:  Could you explain a bit more what you mean by:

"This means the stack is aligned 16 - 8 (return address) on proc-entry."

I follow everything else.  Very clear.

aw27:
I'm missing something very basic in what you wrote.  Namely, are you saying that these three conditions REQUIRE prologues and epilogues (whether built-in by ml or hand-written) OR are you saying the opposite, that the three conditions DO NOT require prologues and epilogues? 

Vortex:
Yes, I'm familiar with objconv by A. Fog.  It's been a couple of years since I played with it.  I'll try it again.  Thanks for reminding me about its existence.

Regards,
Mark

nidud

#12
deleted

aw27

Please refer to the 3 cases above:


includelib \masm32\lib64\kernel32.lib
ExitProcess PROTO :dword

.code

;CASE 1
p1   proc ; leaf
sub rsp, 8 ; align stack

; ... do our things

add rsp, 8
ret
p1   endp

;Case 2
p2 proc parm1:dword
;and rsp, -16 ; no need here, but will not hurt if used, because push rbp will align

; ... do our things
ret
p2 endp

;Case 2
p3 proc
LOCAL myvar:dword
and rsp, -16 ; align

; do our things
ret
p3 endp

;Case 2
p4_1 proc uses rbx rdi rsi par1:qword, par2:qword, par3:qword, par4:qword, par5:qword
and rsp, -16 ; align

; do our things
ret
p4_1 endp

; Case 3
p4 proc
sub rsp, 28h ; shadow space+space for 5th parameter.
;and rsp, -16 ; no need here, but will not hurt if used, because already aligned
mov rcx,1
mov rdx,2
mov r8,3
mov r9,4
mov rax, 5
mov [rsp+20h],rax
call p4_1
add rsp, 28h
ret
p4 endp

; Case 3, but without need for epilog because ExitProcess fixes everything
main   proc
sub rsp, 28h ; shadow space + align

call p1
mov rcx, 1
call p2
call p3
call p4


;add rsp, 28h
;ret
mov ecx,0
call ExitProcess

main   endp

end

hutch--

I did not want to double post this example of recursion so I put it here,

http://masm32.com/board/index.php?topic=6720.0

There is no stack frame twiddling to get it to work, just a prologue/epilogue that constructs a minimum stack frame, automating stack frame creation in a context where it is the only way to do it. (Iteration procs are not recursive.)