Author Topic: RBP vs RSP stack frames  (Read 7313 times)

AW

Re: RBP vs RSP stack frames
« Reply #15 on: March 25, 2017, 03:11:24 AM »
> the invokes will assume they can use [RSP+0] -> [RSP+x] to fill in the parameters.
> So if you SUB rsp,Y somewhere in the proc, invokes would overwrite your dynamic stack allocation?

No, you are simply "rebasing" the stack pointer. If the function is not a leaf, before calling a subroutine you will have to:
1) subtract the usual 32 bytes
2) align the stack.
On return:
add the usual 32 bytes + bytes used for stack alignment if any.
After that you will be as you were before the call :)
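
The steps above can be sketched in ml64 syntax roughly as follows. This is only an illustration under assumptions: the procedure keeps an RBP frame, SomeFunc is a hypothetical callee, RDX holds a hypothetical (small) dynamic allocation size, and R12 is used as scratch after being preserved. Instead of tracking the alignment bytes, the sketch restores RSP from a saved copy, which has the same effect as adding back the 32 bytes plus any alignment padding.

```asm
; Sketch only: SomeFunc, the size in rdx and the use of r12 are assumptions.
; Allocations larger than a page would also need stack probing (not shown).
extern SomeFunc:proc

.code
caller proc
    push rbp
    mov  rbp, rsp            ; fixed frame pointer for the whole procedure
    push r12                 ; r12 is non-volatile, so preserve it

    sub  rsp, rdx            ; dynamic allocation of rdx bytes
    mov  rcx, rsp            ; rcx = start of the dynamic block (1st argument)

    mov  r12, rsp            ; remember rsp before the call set-up
    sub  rsp, 32             ; 1) the usual 32 bytes (shadow space)
    and  rsp, -16            ; 2) align the stack to 16 bytes
    call SomeFunc            ; callee may use [rsp+0]..[rsp+31], below our block
    mov  rsp, r12            ; undo the 32 bytes plus any alignment padding

    lea  rsp, [rbp-8]        ; drop the dynamic block, point at saved r12
    pop  r12
    pop  rbp
    ret
caller endp
    end
```

Because the shadow space is carved out below the freshly "rebased" RSP, the callee's parameter area never overlaps the dynamic block.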

hutch--

Re: RBP vs RSP stack frames
« Reply #16 on: March 25, 2017, 03:15:26 AM »
This is what I get with 64 bit MASM using a custom prologue/epilogue. The entry/exit code is small and for high level code, it's easily fast enough. For low level code you don't use a stack frame.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    LOCAL a1    :QWORD
    LOCAL a2    :QWORD
    LOCAL a3    :QWORD
    LOCAL a4    :QWORD

    mov a1, 1
    mov a2, 2
    mov a3, 3
    mov a4, 4

    xor rcx, rcx
    call ExitProcess

    ret

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    end

comment * +++++++++++++++++++++++++++

segment .text
 enter 0x80, 0x0
 sub rsp, 0x80
 mov qword ptr [rbp-0x68], 0x1
 mov qword ptr [rbp-0x70], 2
 mov qword ptr [rbp-0x78], 3
 mov qword ptr [rbp-0x80], 4
 xor rcx, rcx
 call qword ptr [ExitProcess]
 leave
 ret
* +++++++++++++++++++++++++++++++++++


The disassembly in detail.

.text:0000000140001000 C8800000                   enter 0x80, 0x0
.text:0000000140001004 4881EC80000000             sub rsp, 0x80
.text:000000014000100b 48C7459801000000           mov qword ptr [rbp-0x68], 0x1
.text:0000000140001013 48C7459002000000           mov qword ptr [rbp-0x70], 2
.text:000000014000101b 48C7458803000000           mov qword ptr [rbp-0x78], 3
.text:0000000140001023 48C7458004000000           mov qword ptr [rbp-0x80], 4
.text:000000014000102b 4833C9                     xor rcx, rcx
.text:000000014000102e FF1560100000               call qword ptr [ExitProcess]
.text:0000000140001034 C9                         leave
.text:0000000140001035 C3                         ret
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :skrewy:

jj2007

Re: RBP vs RSP stack frames
« Reply #17 on: March 25, 2017, 03:21:02 AM »
> Do you have an ASM based example of how you'd handle dynamically allocating the stack?

StackBuffer()

johnsa

Re: RBP vs RSP stack frames
« Reply #18 on: March 25, 2017, 03:26:36 AM »
Interesting, as I've always avoided ENTER/LEAVE because I had assumed they weren't that quick.
http://stackoverflow.com/questions/5959890/enter-vs-push-ebp-mov-ebp-esp-sub-esp-imm-and-leave-vs-mov-esp-ebp

johnsa

Re: RBP vs RSP stack frames
« Reply #19 on: March 25, 2017, 03:28:42 AM »
> > Do you have an ASM based example of how you'd handle dynamically allocating the stack?
>
> StackBuffer()

That looks interesting! And I guess that's x64 as well as x86?

hutch--

Re: RBP vs RSP stack frames
« Reply #20 on: March 25, 2017, 03:50:51 AM »
There has never been a problem with LEAVE; it was the normal cleanup in 32 bit MASM, whereas ENTER was known to be slow in 32 bit. With 64 bit instructions generally being larger than the 32 bit versions, using ENTER does not seem to be a problem, as any high level code is many times slower than direct mnemonic code. With pure mnemonic code you would go for not using a stack frame, as your total call overhead is then a simple CALL/RET.
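
As a rough illustration of the frameless case, here is a minimal leaf procedure in ml64 syntax. The name add3 and its three integer arguments are hypothetical, and the sketch assumes no prologue/epilogue macro is in effect for this procedure.

```asm
; Sketch only: add3 is a hypothetical leaf procedure.
.code
add3 proc
    ; Windows x64 convention: the first three integer arguments
    ; arrive in rcx, rdx and r8, the result is returned in rax
    lea  rax, [rcx+rdx]      ; rax = first + second
    add  rax, r8             ; rax = first + second + third
    ret                      ; no prologue or epilogue at all
add3 endp
    end
```

It touches no stack memory and calls nothing, so it needs neither RBP nor shadow space, and the caller pays only for the CALL and the RET.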

jj2007

Re: RBP vs RSP stack frames
« Reply #21 on: March 25, 2017, 04:16:41 AM »
> That looks interesting! And I guess that's x64 as well as x86?

No, StackBuffer() is 32-bit only so far.

Re enter+leave:
> EDIT: How about testing in x64 enter/leave and rsp sub/add?

Saw your edit only now, sorry. If I remember well, we tested that for 32-bit code in the Lab; enter was slow, leave was fast.

P.S.: Made a few tests, and for a naked procedure, enter is about 15% slower than push rbp + mov rbp, rsp.

Which means a cycle or so. As Hutch wrote above, if it's really speed critical, you would use only registers + CALL + RET.
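
For reference, the two variants being compared look like this side by side in ml64 syntax. The procedure names and the 20h frame size are made up for illustration; with a nesting level of 0, enter does the same work as the explicit three-instruction sequence, and leave the same as mov rsp, rbp + pop rbp.

```asm
; Sketch only: with_enter / with_push and the 20h size are hypothetical.
.code
with_enter proc
    enter 20h, 0             ; push rbp / mov rbp, rsp / sub rsp, 20h
    ; ... body would go here ...
    leave                    ; mov rsp, rbp / pop rbp
    ret
with_enter endp

with_push proc
    push rbp                 ; save the caller's frame pointer
    mov  rbp, rsp            ; establish the new frame
    sub  rsp, 20h            ; reserve 20h bytes
    ; ... body would go here ...
    mov  rsp, rbp            ; release everything below rbp
    pop  rbp                 ; restore the caller's frame pointer
    ret
with_push endp
    end
```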

And if you want it really fast, i.e. the extra cycle for enter slows your algo down, then your design is wrong. Short procedures in speed critical loops are nonsense: drop the call and the ret and use a macro, or "inline" it by hand.
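
A minimal sketch of the macro approach in ml64 syntax, assuming a hypothetical helper clamp255 that would otherwise be a tiny procedure called from inside the loop. The names, the array layout and the register choices are all made up for illustration.

```asm
; Sketch only: clamp255 and sum_clamped are hypothetical.
clamp255 macro reg
    local done               ; unique label for every expansion
    cmp  reg, 255
    jbe  done                ; already <= 255 (unsigned), nothing to do
    mov  reg, 255
done:
endm

.code
sum_clamped proc
    ; rcx = pointer to a QWORD array, rdx = element count (assumed > 0)
    xor  rax, rax            ; running total
again:
    mov  r8, [rcx]           ; load the next element
    clamp255 r8              ; expanded in place: no call, no ret
    add  rax, r8
    add  rcx, 8
    dec  rdx
    jnz  again
    ret
sum_clamped endp
    end
```

The loop body stays readable, but the assembler pastes the three instructions in at each use, so nothing is pushed or popped per iteration.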