The MASM Forum

General => The Campus => Topic started by: JK on July 01, 2020, 06:25:13 AM

Title: push/pop in 64 bit
Post by: JK on July 01, 2020, 06:25:13 AM
I realize that I cannot use push and pop for 8 byte registers (rax, ... , r15) in 64 bit mode the way I could in 32 bit mode. E.g. saving a register value temporarily on the stack and retrieving it later on, which works flawlessly in 32 bit, doesn't always work in 64 bit. The same code (replacing every 4 byte register with an 8 byte register, EAX -> RAX, etc.) doesn't work or even crashes.

In 64 bit mode the stack must be correctly aligned (16 bit, is this correct?) - ok, but how could I misalign the stack by pushing/popping 8 byte (64 bit) registers, 64 bit being a multiple of 16 bit? Please note, I don't build my own stack frames, I let the assembler do this work, I just want to save some registers inside a routine. If I use a local or a global for temporarily saving register values, it works. But why can't I simply use the stack for this in 64 bit like I can in 32 bit? What are the rules?

Looking at a disassembly I see that there is no rbp based stack frame in 64 bit (assembler UASM). Did I miss something here?

Any help and explanation appreciated - thanks


JK
Title: Re: push/pop in 64 bit
Post by: jj2007 on July 01, 2020, 07:47:43 AM
You can use pushes and pops in pairs, thus maintaining the 16-byte alignment. However, check carefully the concept of shadow space, e.g. by googling for "shadow space x64".
Title: Re: push/pop in 64 bit
Post by: Mikl__ on July 01, 2020, 10:26:13 AM
Hi, JK!
When you call an API function, the value in RSP must be a multiple of 10h. PUSH reg/mem/imm decreases RSP by 8, POP reg/mem increases RSP by 8. If your PUSH/POP sequence is complete before you call the API function, you can use as many PUSH and POP instructions as you want. But if the API call sits between the PUSH and the POP, you have to use an even number of PUSH instructions (with the matching POPs), or use local variables instead:
    mov old_eax,eax
      . . . .
    mov eax,old_eax
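
The rule above is plain arithmetic on RSP, so it can be checked with a small model. The sketch below is Python rather than assembler and only tracks the value of RSP; the function name `misaligned_calls` and the starting value 0x1000 are made up for illustration, and RSP is assumed to be 16-byte aligned before the first step.

```python
# Model of RSP bookkeeping only; no real stack memory is touched.
ALIGNMENT = 16   # bytes required at the point of an API call
SLOT = 8         # bytes moved by one 64-bit PUSH or POP


def misaligned_calls(ops, rsp=0x1000):
    """Return the indices of "call" steps reached with RSP not a multiple of 16.

    ops is a list of "push", "pop" and "call" strings. rsp is a hypothetical
    starting value, assumed to be aligned before the first step.
    """
    bad = []
    for i, op in enumerate(ops):
        if op == "push":
            rsp -= SLOT
        elif op == "pop":
            rsp += SLOT
        elif op == "call":
            if rsp % ALIGNMENT != 0:
                bad.append(i)
        else:
            raise ValueError(f"unknown step: {op}")
    return bad


# PUSH/POP finished before the call: nothing is reported.
print(misaligned_calls(["push", "pop", "call"]))
# One PUSH still outstanding at the call: step 1 is reported.
print(misaligned_calls(["push", "call", "pop"]))
# Two PUSHes outstanding at the call: aligned again.
print(misaligned_calls(["push", "push", "call", "pop", "pop"]))
```

Only the number of pushes still outstanding at the moment of the call matters: an odd count leaves RSP 8 bytes off, an even count does not.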
Title: Re: push/pop in 64 bit
Post by: hutch-- on July 01, 2020, 10:32:51 AM
Hi JK,

The stack system in Win64 is not designed to be used like the Win32 stack, and it uses a different calling convention. The 16 byte alignment is mandatory; the first 4 arguments are passed in registers and any further arguments at specific locations on the stack.

I have attached a zipped chm help file that has this type of data and it is technically correct.
Title: Re: push/pop in 64 bit
Post by: JK on July 02, 2020, 06:43:51 AM
Maybe I should post an example. This is UASM code for 32 and 64 bit:

;*************************************************************************************
;assemble console 32           ;assemble 32 bit
assemble console 64            ;assemble 64 bit
;*************************************************************************************


ifdef _WIN64
  option win64:15
 
else
  .386
  .model flat, stdcall
endif


include <windows.inc>

includelib kernel32.lib
includelib user32.lib


.code


start proc
;*************************************************************************************
;
;*************************************************************************************


ifdef _WIN64
  push rax                                            ;why must it be two pushes ?
;  push rax
else
  push eax
endif 


  invoke MessageBoxA, 0, CSTR("works"), CSTR("test"), 0


ifdef _WIN64
  pop rax                                             ;and two pops in order to work in 64 bit
;  pop rax
else
  pop eax
endif 


  invoke ExitProcess, 0
  ret

   
start endp


end start


If I have two pushes and pops it works in 64 bit; if I have only one push and one pop it crashes in 64 bit. Pushing a 64 bit register (64 is a multiple of 16) keeps the stack aligned to 16 bit (10h), so it shouldn't cause problems, but as the code demonstrates - it does! Pushing two 64 bit registers (thus pushing 128 bit) works in 64 bit - why is that?


JK
Title: Re: push/pop in 64 bit
Post by: hutch-- on July 02, 2020, 07:00:00 AM
The answer is: don't use Win32 techniques in Win64. You write to stack addresses without messing up the alignment of the stack. When you use the CALL/RET pair you are changing the stack by 8 bytes both ways, so you must ensure that your start address is aligned correctly. I have had a quick play with UASM and got it to work, but you would need one of the UASM guys to decipher how to set it up so it works OK.
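
To put numbers on the "8 bytes both ways" point: if RSP is 16-byte aligned just before a CALL, the pushed return address leaves it 8 bytes off on entry to the procedure, so the procedure has to move RSP by a suitable amount before making calls of its own. The following is a rough sketch in Python (used purely for the arithmetic); `frame_size` and its parameters are hypothetical names, and it does not describe what UASM or any MASM macro set actually emits.

```python
RET_ADDR = 8   # bytes pushed by the CALL that entered the procedure
SHADOW = 32    # bytes reserved for the calls this procedure makes itself


def frame_size(local_bytes=0, pushed_regs=0):
    """Smallest N for `sub rsp, N` that leaves RSP 16-byte aligned.

    Assumes RSP was 16-byte aligned just before the CALL that entered the
    procedure, and that pushed_regs 8-byte registers were pushed in the
    prologue before the SUB.
    """
    used = RET_ADDR + 8 * pushed_regs      # already on the stack
    total = used + SHADOW + local_bytes    # what the frame must cover
    padded = (total + 15) // 16 * 16       # round up to a multiple of 16
    return padded - used


print(hex(frame_size()))                 # 0x28
print(hex(frame_size(pushed_regs=1)))    # 0x20
print(hex(frame_size(local_bytes=16)))   # 0x38
```

The first result is the familiar `sub rsp, 28h` seen in many hand-written Win64 procedures: 20h of shadow space plus 8 bytes to make up for the return address.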

I work with MASM and have multiple stack techniques available to deal with a number of different stack requirements.
Title: Re: push/pop in 64 bit
Post by: jj2007 on July 02, 2020, 07:47:11 AM
Quote from: JK on July 02, 2020, 06:43:51 AM
Pushing a 64 bit register (64 is a multiple of 16) keeps the stack aligned to 16 bit (10h)

Nope. Eax is 4 bytes, rax is 8 bytes, not 16 as you seem to believe. The alignment requirement is 16 bytes (128 bits), not 16 bits. So, as explained above, you can use pairs of push & pop, thus maintaining the 16-byte alignment.
Title: Re: push/pop in 64 bit
Post by: JK on July 03, 2020, 04:14:21 AM
A-ha, I see. Thanks for your explanations!

So the rule in 64 bit is: RSP must be 16 byte aligned before calling a procedure.
In 64 bit I can do all the things with the stack I can do in 32 bit as long as this rule is met - is this correct?


JK
Title: Re: push/pop in 64 bit
Post by: nidud on July 03, 2020, 05:24:16 AM
deleted
Title: Re: push/pop in 64 bit
Post by: jj2007 on July 03, 2020, 05:44:11 AM
Quote from: JK on July 03, 2020, 04:14:21 AM
In 64 bit i can do all the things with the stack i can do in 32 bit as long as this rule is met - is this correct?

Many but not all the things :cool:

Quote from: jj2007 on July 01, 2020, 07:47:43 AM
check carefully the concept of shadow space, e.g. googling for shadow space x64
Title: Re: push/pop in 64 bit
Post by: hutch-- on July 03, 2020, 07:37:52 AM
In Win64 FASTCALL, each argument occupies a 64 bit slot. The first 4 arguments are passed in 4 registers, rcx, rdx, r8 and r9, and the 4 stack slots reserved for them are what is called "shadow space". Any additional arguments get written to the higher addresses on the stack, so you have a layout something like this.

reg | reg | reg | reg | mem | mem | mem | etc ......

If you pass arguments of different sizes from BYTE to QWORD, they are all written to 64 bit addresses and the important thing here is the stack remains at the same alignment. By writing arguments according to the 64 bit Windows calling convention, you don't have to balance the stack on exit with a "ret number", you just use a "ret".
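
As a rough illustration of that layout, the sketch below (Python, used only to do the bookkeeping) lists where each integer or pointer argument goes at the call site and how many bytes the caller reserves. `arg_locations` and `call_area_bytes` are hypothetical helper names; floating point arguments, which travel in xmm registers, are left out.

```python
REGS = ["rcx", "rdx", "r8", "r9"]   # integer/pointer argument registers


def arg_locations(n_args):
    """Where each of n_args integer arguments is placed at the call site.

    Stack offsets are relative to RSP immediately before the CALL
    instruction; the first four 8-byte slots are the shadow space.
    """
    locs = []
    for i in range(n_args):
        if i < len(REGS):
            locs.append(REGS[i])
        else:
            locs.append(f"[rsp+{8 * i:02X}h]")
    return locs


def call_area_bytes(n_args):
    """Bytes the caller reserves: at least 4 slots, rounded up to 16."""
    slots = max(n_args, 4)
    return (slots * 8 + 15) // 16 * 16


print(arg_locations(6))
# ['rcx', 'rdx', 'r8', 'r9', '[rsp+20h]', '[rsp+28h]']
print(call_area_bytes(4), call_area_bytes(5), call_area_bytes(6))
# 32 48 48
```

Rounding the reserved area up to a multiple of 16 is what keeps the alignment intact however many arguments a call takes.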

I know for certain that UASM uses the convention correctly and I have no doubt that nidud does so as well. With MASM I had to write the macros that set up the stack for procedure calls and the "invoke" style technique for calling procedures.

It can be done by writing the procedure and the calling technique manually, but that is messy and very error-prone, whereas having it automated makes Win64 as easy to use as Win32. With all of the extra registers and FASTCALL, Win64 is more efficient and has less overhead than the 32 bit STDCALL, where you push args and balance the stack on exit.
Title: Re: push/pop in 64 bit
Post by: jj2007 on July 03, 2020, 09:44:34 AM
Three years ago CMalcheski wrote a nice article titled Nightmare on (Overwh)Elm Street: The 64-bit Calling Convention (https://www.codeproject.com/Articles/1187064/Nightmare-on-Overwh-Elm-Street-The-bit-Calling-Con). It's fun to read :tongue:
Title: Re: push/pop in 64 bit
Post by: Biterider on July 03, 2020, 03:36:53 PM
Hi JJ

Really fun article. Fortunately, I did NOT find it before ...
Otherwise I would not have followed the 64bit path so quickly. :biggrin:


Biterider
Title: Re: push/pop in 64 bit
Post by: jj2007 on July 03, 2020, 05:48:32 PM
Hi Biterider,

I've invested my fair share in the jinvoke macro (356 lines), plus prolog+epilog (200), it works fine with Masm and the Watcom clones alike, it even counts and checks parameters so that Hutch doesn't have to hold my hot little hand, but... I just don't see any compelling reason to abandon 32-bit coding  :bgrin:
Title: Re: push/pop in 64 bit
Post by: hutch-- on July 03, 2020, 06:32:12 PM
> I just don't see any compelling reason to abandon 32-bit coding  :bgrin:

Except speed, power, memory, twice as many registers etc etc .... You can hide in a small world of 32 bit but it will never get bigger, whereas 64 bit is the future.  :biggrin:
Title: Re: push/pop in 64 bit
Post by: jj2007 on July 03, 2020, 08:51:40 PM
Going from 16-bit to 32-bit was a huge improvement, absolutely. More memory, no selectors, more speed - fantastic.
Going from 32-bit to 64-bit is a tiny improvement, if and only if you regularly need more than 2GB of memory :cool:
Title: Re: push/pop in 64 bit
Post by: hutch-- on July 03, 2020, 09:47:37 PM
 :biggrin:

That's what the MS-DOS guys used to say. The extra registers are another story: 16 integer regs and almost never having to touch RBP and RSP. You can live in a small world of 32 bit with 3 free registers, but 64 bit hits the big time. 2 gig, plus somewhat more than 2 gig with the right linker options, is the limit for 32 bit; with 64 bit it's how much memory you install.

Much greater freedom of design, much greater power and the capacity to properly use SSE, AVX and AVX2 + AVX 512.
Title: Re: push/pop in 64 bit
Post by: jj2007 on July 03, 2020, 11:33:50 PM
Speed gain from 16- to 32-bit was roughly a factor 5. From 32- to 64-bit it's +-5% :tongue:

And I've never, ever run out of registers in an innermost loop (of course, using the whole range of SSE instructions) :cool:
Title: Re: push/pop in 64 bit
Post by: hutch-- on July 03, 2020, 11:35:10 PM
 :biggrin:

That's because you are playing with small stuff, allocate 32 gig and let 'er rip.  :tongue:
Title: Re: push/pop in 64 bit
Post by: JK on July 04, 2020, 05:22:10 AM
Thanks for all your input! I think I got it now.