PROC and prolog/epilog

Started by markallyn, November 28, 2017, 08:04:12 AM


markallyn

Hutch, Nidud, and aw27,

Thanks very much indeed for your contributions. 

Nidud-  Very clear explanation.  Using x64dbg I was able to verify what you wrote.

aw27:  Wow!  Very detailed and coherent.  I need to study the options more, but looking them over I think they are pretty self-explanatory.  I will certainly get back to you after completing the necessary study. 

Hutch:  I haven't yet been able to look over your attached file, but will get to it later today.  As with aw27, I will respond to what you sent.

Now, as for the issue that kicked off this series, I went back to the original Programmer's Guide for version 6.1 of masm--way back to the 1992 edition.  Here is a direct quotation from the Guide on page 198 under the heading "Generating Prologue and Epilogue Code":

Quote: When you use the PROC directive with its extended syntax and argument list the assembler automatically generates the prologue and epilogue code in your procedure.

This is true.  I tested it using a slightly expanded version of Nidud's (see above) streamlined code.  If I add to his "a PROC" a couple of arguments of the form :QWORD, :QWORD, etc.  then sure enough ml64 will emit prologue and epilogue code.  If there is no parameter list (as there isn't in Nidud's example) then indeed no prologue/epilogue pair shows up.  Of course, it matters because--as Nidud points out clearly--one must adjust the stack search by 8 bytes to allow for the return address or one will not recover a fifth or greater passed parameter.
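To make the 8-byte adjustment concrete, here is a minimal sketch of where a fifth argument sits in the two cases. The procedure names are hypothetical, the frame in the second procedure is written out by hand instead of being generated by PROC, and the offsets assume the standard Win64 layout (return address, then 20h bytes of shadow space, then the fifth argument).

```asm
; Sketch only: assemble with "ml64 /c fifth.asm". Procedure names are made up.
.code

; No argument list on PROC, so ml64 emits no prologue.
; On entry [rsp] holds the return address.
fifth_noframe PROC
    mov rax, [rsp+28h]      ; 8 (return address) + 20h (shadow space)
    ret
fifth_noframe ENDP

; Same access with a hand-written frame: the saved RBP adds another 8 bytes.
fifth_frame PROC
    push rbp
    mov  rbp, rsp
    mov  rax, [rbp+30h]     ; 8 (saved rbp) + 8 (return address) + 20h (shadow)
    pop  rbp
    ret
fifth_frame ENDP

END
```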

Thanks again to all of you, and I will reply shortly to aw27 and Hutch after doing justice to their efforts.

Mark

markallyn

Hello aw27,

Since you sent your very detailed examples of the three situations I have copied, assembled, and linked all of them.  I am still studying the results with x64dbg.  But I do have one preliminary question.

Namely, in those cases where you have declared LOCAL variables I had predicted that the assembler would use the ENTER XX,0 instruction, but that never happened.  Do you have any explanation as to why ml64 persisted in using the usual individual components of the prologue?

I'll have more questions shortly.  But, thank you for your exhaustive demonstration.

Mark

aw27

Quote from: markallyn on December 01, 2017, 05:44:57 AM
Namely, in those cases where you have declared LOCAL variables I had predicted that the assembler would use the ENTER XX,0 instruction, but that never happened.  Do you have any explanation as to why ml64 persisted in using the usual individual components of the prologue?
There is no obligation to use ENTER with LEAVE. ENTER is considered a slow instruction and is not very popular these days.
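For reference, a minimal sketch of the equivalence being discussed: with a nesting level of 0, ENTER does the same work as the three-instruction prologue, and LEAVE does the same work as the two-instruction epilogue. The procedure names and the 80h frame size are only illustrative.

```asm
; Sketch only: assemble with "ml64 /c enter.asm". Procedure names are made up.
.code

with_enter PROC
    enter 80h, 0            ; push rbp / mov rbp, rsp / sub rsp, 80h
    ; ... body ...
    leave                   ; mov rsp, rbp / pop rbp
    ret
with_enter ENDP

by_hand PROC
    push rbp                ; save the caller's frame pointer
    mov  rbp, rsp           ; establish this procedure's frame
    sub  rsp, 80h           ; reserve 80h bytes of stack
    ; ... body ...
    mov  rsp, rbp           ; release the reserved bytes
    pop  rbp                ; restore the caller's frame pointer
    ret
by_hand ENDP

END
```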

hutch--

 :biggrin:

The catch here is that you don't put ENTER in a loop but then the instruction is not designed to be used in a loop so the reference to speed in this context is irrelevant. Outside of that the only gain of constructing a stack frame in an unreliable manner is that you may have the fastest MessageBoxA on the planet by saving a few picoseconds. LEAVE was always fast enough.

markallyn

aw27,

Quote: There is no obligation to use ENTER with LEAVE. ENTER is considered a slow instruction and is not very popular these days.

I was aware of this.  But, nevertheless, for whatever reason ml64 has been ignoring this in some of the code I write and uses ENTER 80,0.  If ml64 consistently avoided this usage, I would understand why, but it doesn't.  In all of the cases you created ENTER never appears, but I can show you more than one instance in my stuff where it does.  In fact, if you look at the link that Hutch sent in connection with recursion, you will see that his code has ENTER in it too.

I'm still plowing through your 3 cases of code.  I have another question I will reserve for a follow-on post regarding stack aligning. 

Thanks for your assistance.  It is warmly welcomed.

Mark

hutch--

Mark,

Have a look at the 64 bit MACRO file to see how the stackframe is constructed. Search for "UseStackFrame" to find it. There are 3 arguments you can pass to the STACKFRAME argument, the third being alignment which must be a power of 2. I have done a number of extras to handle a stack aligned for AVX and AVX2 locals. You can adjust the 3 arguments if you want to reduce the stack overhead with nested procedure calls, the recursion test piece can be used to test this. In most instances trimming the stack overhead does not matter as it is pre-allocated memory but if you are writing recursive code that has a very large count of recursion depth, you can trim it down carefully or increase the linker settings or both. You will know if you have trimmed off too much as the app will not start OR it will stop once the stack memory is exhausted.

aw27

There is no great damage in using ENTER, but no advantage either.  :biggrin:
ENTER was intended for languages that use nested procedures, like Pascal and Delphi. However, they don't use it.  :biggrin:

markallyn

Good evening Hutch, aw27:

Thanks.  I'll check out the macro.  By the way, I'm perfectly content to use ENTER just about always (except for recursions--which I don't write in any case), I just don't understand what circumstances cause ml to generate it, and when it does, why it picks the size of stack frame that it picks.  It seems to default to 80h--why this value is a mystery.  For a bit I thought that ml would do ENTERs whenever there were LOCALs defined.  But, testing this with one or two of aw27's small programs indicates that this isn't the case.  It stays with the conventional push rbp;  mov rbp, rsp.

Mark

hutch--

Mark,

The values associated with the stackframe are those in the macro I have referred you to. The default values are aimed at safety but they are also modifiable, which allows you to change alignment and tweak the arguments passed to the stackframe macro to optimise the memory usage if you are running recursion to any large depth.

markallyn

Good morning/evening Hutch:

Ah, mystery solved!  I know there will be at least one more question from me winging its way more or less in your direction, but this is for my small brain a major breakthrough.

Mark

markallyn

Good afternoon/evening Hutch,

I'm looking at the usestackframe macro.  I'll try invoking it, but I'm unclear what the "flag" parameter is about.  I can't see it in any of the comments.

Mark

markallyn

Hello Hutch,

Actually, if you could direct me to an example invocation of UseStackFrame it would be most helpful.  I tried a number of times to get the parameters right but haven't yet figured out how.  I googled on UseStackFrame and found two references to it, but no code, just discussion.

Thanks,
Mark

hutch--

 :biggrin:

Mark,

Look in the "macros64" directory for the file "macros64.inc" and you will find the source of all the macros I wrote for 64 bit MASM. There you will find the macro "UseStackFrame" and its matching "EndStackFrame" and the two macros are designed to be called by a number of wrapper macros.

STACKFRAME = the default stackframe with a 16 byte alignment.
NOSTACKFRAME turns the stackframe off.

Then there are a number of alternative forms that only differ in their alignment.

YMMSTACK = 32 byte alignment for AVX instructions.
ZMMSTACK = 64 byte alignment for AVX-512 (ZMM register) instructions.
CUSTOMSTACK = roll your own.

They differ only in the equates passed to the "UseStackFrame" macro.

      stackframe_default equ <dflt>     ;; set default stack
      stackframe_dynamic equ <dynm>     ;; set byte count for ENTER mnemonic
      stackframe_align   equ <algn>     ;; align the stack by an interval of 16

The documentation for the first 2 arguments (dflt and dynm) is available in the Intel manuals under the ENTER mnemonic, the third argument "algn" has to be a power of 2 byte alignment.
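The power-of-2 requirement follows from how a stack pointer is usually rounded down to a boundary: AND it with the negated alignment, which only clears the low bits cleanly when the alignment is a power of 2. The sketch below illustrates that general technique under stated assumptions; it is not taken from macros64.inc, and the procedure name and the 80h frame size are hypothetical.

```asm
; Sketch only: assemble with "ml64 /c align.asm". Not the macro's actual code.
.code

aligned_frame PROC
    push rbp
    mov  rbp, rsp
    sub  rsp, 80h           ; reserve the frame first
    and  rsp, -32           ; clear the low 5 bits: rsp is now 32-byte aligned
    ; ... body: [rsp] is suitable for 32-byte aligned loads and stores ...
    mov  rsp, rbp           ; restoring from rbp also undoes the realignment
    pop  rbp
    ret
aligned_frame ENDP

END
```

Because the AND can only move RSP downward, the reserved area never shrinks below the 80h bytes requested. Locals are best addressed relative to RSP here, since the distance between RBP and the aligned RSP is not known at assembly time.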

The "masm64" help file is incomplete on much of the library and macro code, so you need to read the material in these categories.

Simplified Introduction
A basic explanation of the stackframe and invoke notations.

Design Criteria

Calling Convention
How the Win 64 calling convention works.

Stack frame reference
How the MASM64 stackframe works.


In particular, you need to understand how the Win64 ABI works and why you must fully comply with it or your application will not start. Get it wrong and the app will just exit, telling you nothing. The Win64 FASTCALL calling convention is a lot more complex than how Win32 worked. Rather than LIFO stack arguments, all arguments must be aligned according to the ABI, with the first 4 arguments in RCX, RDX, R8 and R9, and the stack has what is called "shadow space" that allows the register contents to be written to the stack for code designs that require repeated access to the arguments (recursion being one example).
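As a rough illustration of those rules without any of the macros, here is a minimal hand-written caller in plain ml64. The labels msg, ttl and main are arbitrary names; the build line in the comment is one common way to link it and may need adjusting for your setup.

```asm
; Sketch only. One possible build line:
;   ml64 hello.asm /link /subsystem:windows /entry:main user32.lib kernel32.lib
extern MessageBoxA:PROC
extern ExitProcess:PROC

.data
msg db "Hello from ml64", 0
ttl db "Win64 ABI", 0

.code
main PROC
    sub  rsp, 28h           ; 20h shadow space + 8 to restore 16-byte alignment
    xor  ecx, ecx           ; arg 1 (RCX): hWnd = NULL
    lea  rdx, msg           ; arg 2 (RDX): text
    lea  r8,  ttl           ; arg 3 (R8):  caption
    xor  r9d, r9d           ; arg 4 (R9):  MB_OK
    call MessageBoxA
    xor  ecx, ecx           ; exit code 0
    call ExitProcess        ; does not return
main ENDP

END
```

On entry RSP is 8 bytes off a 16-byte boundary because of the pushed return address, so subtracting 28h both reserves the shadow space and realigns the stack before the calls.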

The system I have built is designed to look much like 32 bit code so you don't have to keep twiddling the stack to write reliable code. If you need to know how it works, you need to read the documentation AND the main macro file. I warn anyone who wants to write 64 bit MASM that it is an advanced topic with little useful data available, lousy and often inaccurate documentation and very few people who understand how it works.





markallyn

Good evening/morning Hutch,

Thank you so much for your very generous assistance on this thorny business.  I spent my afternoon here in southeastern Pennsylvania, USA wrestling with the ABI, which, as you can easily see and have seen, is what I'm attempting to grasp.  Still a long way to go.

What I completely missed is that the program defaults automatically to your macro "dynamic version" without any active invocation on my part.  That much I finally comprehended, although it took longer than it should have.  I tumbled to your NOSTACKFRAME macro and that was the clue that finally drove it through my thick skull.  What was decisive was that when I didn't include the masm64rt.inc file the prologue and epilogue no longer showed up in the disassembly.

So, where I am now as far as the UseStackFrame macro is concerned is that I DON'T invoke it, but I can use an alternative form as you show in the macros64.inc file and also in this post.

Quote: I warn anyone who wants to write 64 bit MASM that it is an advanced topic with little useful data available, lousy and often inaccurate documentation and very few people who understand how it works.

Yes, I have been fairly warned, but I am like one of Dante's poor souls condemned to enter a 64 bit labyrinth from which no return is possible.  It's what happens when you turn 75!

Regards as always,
Mark

hutch--

Mark,

> It's what happens when you turn 75!

You can't use that as an excuse, I am not far behind you as I turn 70 in the middle of next year. You know the old rule, use it or lose it.