Register preservation macros for no stack frame procs.

sinsi · July 17, 2018, 03:50:45 PM

Can't you slot "reglist" in there somewhere whilst you are adjusting the stack?
The good thing about reglist is that in the epilogue the registers are reversed i.e. in the correct order for popping.

hutch-- · July 17, 2018, 06:37:22 PM

> The good thing about reglist is that in the epilogue the registers are reversed i.e. in the correct order for popping.

The bad thing about reglist is that in the epilogue the registers are reversed i.e. in the correct order for popping.

Stack manipulation with PUSH and POP messes up the stack alignment, which in turn wrecks aligned data larger than a QWORD.
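A minimal sketch of the alignment problem described above, assuming ml64 syntax. The procedure name aligndemo is hypothetical and not from the thread. It relies only on the Win64 rule that RSP is 16-byte aligned at the point of a CALL, so it sits at 16n + 8 on entry to the callee.

```asm
; Sketch only: hypothetical leaf procedure with no stack frame (ml64 syntax).
.code

aligndemo proc
    ; on entry RSP = 16n + 8, because CALL pushed the return address
    push rbx                            ; RSP = 16n       (16-byte aligned)
    push rsi                            ; RSP = 16n - 8   (misaligned again)
    sub rsp, 16                         ; RSP = 16n - 24  (still misaligned)

    ; movdqa xmmword ptr [rsp], xmm0    ; aligned store: would fault at this address
    movdqu xmmword ptr [rsp], xmm0      ; unaligned store works, but only hides the issue

    add rsp, 16
    pop rsi                             ; pops in reverse order of the pushes
    pop rbx
    ret
aligndemo endp

end
```

With a single push (or three) in this frameless layout, [rsp] after the sub would be 16-byte aligned and the movdqa form would be safe, which is why the push count matters.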

jj2007 · July 17, 2018, 07:34:18 PM

Some time ago I remember Sinsi writing that push+pop are a no-no in 64-bit code, not because of alignment issues (just make sure you use an even number of pushes...) but because of the shadow space. Unfortunately I can't find that post, and I can't find a crisp example showing the spill/shadow space problem. Maybe one of you can help out.

But I found this instead (a very long but interesting post by Peter Cordes):

Quote: Modern code generators avoid using PUSH. It is inefficient on today's processors because it modifies the stack pointer, that gums-up a super-scalar core. (Hans Passant)

This was true 15 years ago, but compilers are once again using push when optimizing for speed, not just code-size. Compilers already use push/pop for saving/restoring call-preserved registers they want to use, like rbx, and for pushing stack args
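One possible illustration of the shadow space problem mentioned above, as a sketch in ml64 syntax. The names callee and caller are hypothetical and not from the thread. It assumes only the Win64 rule that a callee owns the 32 bytes just above its return address and may spill RCX, RDX, R8 and R9 there.

```asm
; Sketch only: hypothetical procedures showing a push landing in shadow space.
.code

callee proc
    ; a callee is allowed to spill its register args into the caller's shadow space
    mov [rsp+8], rcx            ; home slot for arg 1
    mov [rsp+16], rdx           ; home slot for arg 2
    ret
callee endp

caller proc
    sub rsp, 28h                ; 32 bytes of shadow space + 8 to align RSP
    push rbx                    ; BUG: pushed after the shadow space was reserved
    push rsi                    ; even number of pushes, so RSP is still aligned
    mov rcx, 1
    mov rdx, 2
    call callee                 ; callee's [rsp+8] and [rsp+16] are the saved RSI and RBX
    pop rsi                     ; RSI = 1, not the original value
    pop rbx                     ; RBX = 2, not the original value
    add rsp, 28h
    ret
caller endp

end
```

The alignment is fine here, yet both saved registers are overwritten. Doing the pushes before the sub rsp (or storing the registers with mov above the shadow space) keeps them out of the callee's reach.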

hutch-- · July 17, 2018, 11:05:27 PM

This seems to work OK as well. I have always been keen to make use of the MMX registers when they are just about useless for much else.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

NOSTACKFRAME

testme proc

    ; save the eight non-volatile integer registers in MM0-MM7
    movq mm0, rbx
    movq mm1, r12
    movq mm2, r13
    movq mm3, r14
    movq mm4, r15
    movq mm5, rsi
    movq mm6, rdi
    movq mm7, rbp

    ; overwrite them to show that the restore works
    mov rbx, 1
    mov r12, 2
    mov r13, 3
    mov r14, 4
    mov r15, 5
    mov rsi, 6
    mov rdi, 7
    mov rbp, 8

    ; restore the original values from the MMX registers
    movq rbx, mm0
    movq r12, mm1
    movq r13, mm2
    movq r14, mm3
    movq r15, mm4
    movq rsi, mm5
    movq rdi, mm6
    movq rbp, mm7

    ret

testme endp

STACKFRAME

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

HSE · July 17, 2018, 11:13:35 PM

Quote from: hutch-- on July 17, 2018, 11:05:27 PM
I have always been keen to make use of the MMX registers when they are just about useless for much else.

Obviously you never use the FPU!

hutch-- · July 18, 2018, 12:12:19 AM

You would be surprised. :P

If you are going to use the same registers for floating point, you use the right instruction for clearing them, but in win64 the FP/MMX registers are not defined and you can do what you like with them.

HSE · July 18, 2018, 01:21:52 AM

Quote from: hutch-- on July 18, 2018, 12:12:19 AM
... but in win64 the FP/MMX registers are not defined and you can do what you like with them.

Even if they were defined... I would not put much trust in foreign functions being called in the middle of a calculation.

Just in case, try not to develop something amazing that trashes those registers.

hutch-- · July 18, 2018, 02:05:12 AM

The simple answer is to use "emms" (Empty MMX Technology State). MMX is old stuff and the registers are generally not used any longer, as there are better SSE and later instructions, but if you then want to use floating point you just clear the MMX state with "emms". What I have been looking for is a way to use more 64-bit registers than you can get with the volatile registers alone, and with the MMX registers you have 8 that can be used to preserve normal 64-bit integer registers.
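A minimal sketch of where "emms" would go, assuming ml64 syntax and the same movq form used earlier in the thread. The procedure name mmxsave is hypothetical. Any write to an MMX register marks the whole x87 register stack as in use, so x87 code that runs afterwards needs the state emptied first.

```asm
; Sketch only: hypothetical procedure that parks RBX in MM0, then uses the x87 FPU.
.code

mmxsave proc
    movq mm0, rbx               ; save RBX in MM0 (marks the x87 tags as valid)
    mov rbx, 1234h              ; use RBX as a scratch register
    movq rbx, mm0               ; restore RBX

    emms                        ; empty the MMX state so the x87 stack is usable again

    fld1                        ; x87 load succeeds now that the stack is empty
    fstp st(0)                  ; pop it again, leaving the x87 stack clean
    ret
mmxsave endp

end
```

Without the emms, the fld1 would hit a full register stack and load an invalid result instead of 1.0.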

The MASM Forum
