Register preservation macros for no stack frame procs.

Started by hutch--, July 13, 2018, 08:40:49 PM

hutch--

I sneaked in a new macro when no-one was looking: a technique for preserving registers in procedures with no stack frame. It can be used in either stackframe or nostackframe procedures, but a normal stackframe proc can allocate its own locals, so it's not really needed there. With testing so far it seems to nest OK, but I would like to get more testing done. Importantly, the two macros must be used in pairs, and the second macro tests whether the first is there, but I don't want the method to get too clunky trying to make it idiot proof, as the quality of idiot exceeds any safety measure.

The latest macro file is attached to the last update of the new help file.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

msgloop proc

    preserve_regs r14,r15

    xor r14, r14
    mov r15, ptr$(msg)
    jmp gmsg

  mloop:
    rcall TranslateMessage,r15
    rcall DispatchMessage, r15
  gmsg:
    test rax, rvcall(GetMessage,r15,r14,r14,r14)
    jnz mloop

    restore_regs r14,r15

    ret

msgloop endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤


sub_1400012a2   proc
.text:00000001400012a2 C8800000                   enter 0x80, 0x0
.text:00000001400012a6 4883EC60                   sub rsp, 0x60
.text:00000001400012aa 4C8935B7200000             mov qword ptr [0x140003368], r14
.text:00000001400012b1 4C893DB8200000             mov qword ptr [0x140003370], r15
.text:00000001400012b8 4D33F6                     xor r14, r14
.text:00000001400012bb 488D0576200000             lea rax, [0x140003338]
.text:00000001400012c2 4C8BF8                     mov r15, rax
.text:00000001400012c5 EB12                       jmp 0x1400012d9
.text:00000001400012c5
.text:00000001400012c7
.text:00000001400012c7 0x1400012c7:
.text:00000001400012c7 498BCF                     mov rcx, r15
.text:00000001400012ca FF15A80D0000               call qword ptr [TranslateMessage]
.text:00000001400012ca
.text:00000001400012d0 498BCF                     mov rcx, r15
.text:00000001400012d3 FF15B70D0000               call qword ptr [DispatchMessageA]
.text:00000001400012d3
.text:00000001400012d9
.text:00000001400012d9 0x1400012d9:
.text:00000001400012d9 4D8BCE                     mov r9, r14
.text:00000001400012dc 4D8BC6                     mov r8, r14
.text:00000001400012df 498BD6                     mov rdx, r14
.text:00000001400012e2 498BCF                     mov rcx, r15
.text:00000001400012e5 FF159D0D0000               call qword ptr [GetMessageA]
.text:00000001400012e5
.text:00000001400012eb 4885C0                     test rax, rax
.text:00000001400012ee 75D7                       jne 0x1400012c7
.text:00000001400012ee
.text:00000001400012f0 4C8B3571200000             mov r14, qword ptr [0x140003368]
.text:00000001400012f7 4C8B3D72200000             mov r15, qword ptr [0x140003370]
.text:00000001400012fe C9                         leave
.text:00000001400012ff C3                         ret
sub_1400012a2   endp

sinsi


hutch--

While I would like to get more testing done with it, I think it is safe from duplication because a counter is incremented each time the macro pair is used. It is a macro that writes data to the uninitialised data section, so if there are no duplicates, there should not be any problems.

sinsi

But the macro isn't called by the actual code, so two threads calling the proc will clash when the proc saves the register to the same address twice?

hutch--

The catch is that the preserved registers are not stored dynamically. The register contents are written to the uninitialised data section, and each macro instance writes to unique addresses that are fixed at build time, not run time. It means that for every macro pair, there are locations already reserved in the uninitialised data section for them to write to.

Now if I understand what you have said, the risk arises when a single procedure is used to start multiple threads that run in parallel. There is probably a problem here, in that multiple threads would be using the same set of data locations. The solution is to manually allocate local addresses for the registers, which would require a stack frame. Have I understood what you have pointed out?

sinsi

Thread1 calls your proc, which executes this code

.text:00000001400012aa 4C8935B7200000             mov qword ptr [0x140003368], r14
.text:00000001400012b1 4C893DB8200000             mov qword ptr [0x140003370], r15


Thread2 calls your proc which also executes this code

.text:00000001400012aa 4C8935B7200000             mov qword ptr [0x140003368], r14
.text:00000001400012b1 4C893DB8200000             mov qword ptr [0x140003370], r15


Assuming the code takes the same time
- thread1 enters the proc and has its r14/r15 saved
- thread2 enters the proc and has its r14/r15 saved to the same addresses, clobbering thread1's registers
- thread1 exits the proc, but with thread2's r14/r15

Can't you tweak your prologue/epilogue to just push the registers?
You really need to use local (stack) memory for multithreaded variables.
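
One way to read the stack-memory suggestion above, as a minimal sketch in plain ml64 without the SDK macros. The name work_proc and the 30h frame size are assumptions for illustration, and no unwind data is declared, so this is not production code. Because the save slots live in the frame of the current call, each thread writes to its own stack and nothing is shared.

```masm
; Sketch only: plain ml64, no SDK macros. work_proc is a hypothetical name.
    .code

work_proc proc
    push rbp                    ; entry rsp = 16n+8, so rsp = 16n after this
    mov  rbp, rsp
    sub  rsp, 30h               ; 20h shadow space + 10h for two save slots

    mov  [rbp-8],   r14         ; slots belong to this call's frame,
    mov  [rbp-10h], r15         ; so every thread has its own copy

    ; ... body may use r14 and r15 and make calls ...

    mov  r14, [rbp-8]           ; restore from the same slots
    mov  r15, [rbp-10h]
    leave                       ; mov rsp, rbp / pop rbp
    ret
work_proc endp

    end
```

The shadow space for any callee occupies [rbp-30h, rbp-10h), so it does not overlap the two save slots at [rbp-10h, rbp).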

hutch--

The problem is that using PUSH/POP will mess up the procedure alignment, as the stackframe macro can be aligned to greater than QWORD for larger data types.
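
The alignment concern can be illustrated with a minimal sketch in plain ml64. It assumes only the general Win64 rule that rsp must be a multiple of 16 at every CALL, not the specifics of the stackframe macro, and push_demo is a hypothetical name.

```masm
; Sketch only: plain ml64, no SDK macros. push_demo is a hypothetical name.
    .code

push_demo proc
    push rbp                    ; entry rsp = 16n+8, now rsp = 16n
    mov  rbp, rsp
    sub  rsp, 20h               ; shadow space, rsp = 16n-20h, still aligned

    push r14                    ; rsp = 16n-28h, no longer a multiple of 16
    ; A CALL made at this point would run on a misaligned stack, and the
    ; callee's shadow space would start at the slot holding the saved r14.
    pop  r14                    ; rsp = 16n-20h again

    leave
    ret
push_demo endp

    end
```

A single 8-byte PUSH after the frame is set up shifts rsp by half of the required 16-byte unit, and a frame aligned to 32 bytes for YMMWORD data is disturbed in the same way.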

The alternative for multiple threads is to create LOCAL variables declared in descending size order, starting with the largest, which matches the procedure alignment, and working down in data size. This does require a stack frame.

LOCAL var :XMMWORD
LOCAL reg1 :QWORD
LOCAL reg2 :QWORD
; etc ....




hutch--

sinsi,

Have a look at this one. The problem is that it will only work with a stack frame, but for thread-safe procedures it should do the job.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include64\masm64rt.inc

    MLOCAL equ LOCAL    ; the word LOCAL is ambiguous in a MACRO

  ; -----------------------------------------------------
  ; acnt (arg count) must match the number of 64 bit regs
  ; -----------------------------------------------------
    REGSPACE MACRO acnt
      MLOCAL r64[acnt] :QWORD
    ENDM

  ; -----------------------------------------------------------
  ; arglist for both save and restore must be in the same order
  ; -----------------------------------------------------------
    saveregs MACRO arglist:VARARG
      cntr = 0
      FOR var, <arglist>
        mov r64[cntr], var
        cntr = cntr + 8
      ENDM
    ENDM

    restregs MACRO arglist:VARARG
      cntr = 0
      FOR var, <arglist>
        mov var, r64[cntr]
        cntr = cntr + 8
      ENDM
    ENDM
  ; -----------------------------------------------------------

    .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    rcall regtest

    waitkey
    .exit

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

regtest proc

    REGSPACE 8                          ; allocate stack space for 64 bit registers

    saveregs r12,r13,r14,r15,rsi,rdi,rbp,rbx

    ; do it all here !
    ; note: rbp is the frame pointer here and is already saved and restored by enter/leave

    restregs r12,r13,r14,r15,rsi,rdi,rbp,rbx

    ret

regtest endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    end

comment #

    sub_140001037   proc
    .text:0000000140001037 C8800000                   enter 0x80, 0x0
    .text:000000014000103b 4881ECA0000000             sub rsp, 0xa0
    .text:0000000140001042 4C89A560FFFFFF             mov qword ptr [rbp-0xa0], r12
    .text:0000000140001049 4C89AD68FFFFFF             mov qword ptr [rbp-0x98], r13
    .text:0000000140001050 4C89B570FFFFFF             mov qword ptr [rbp-0x90], r14
    .text:0000000140001057 4C89BD78FFFFFF             mov qword ptr [rbp-0x88], r15
    .text:000000014000105e 48897580                   mov qword ptr [rbp-0x80], rsi
    .text:0000000140001062 48897D88                   mov qword ptr [rbp-0x78], rdi
    .text:0000000140001066 48896D90                   mov qword ptr [rbp-0x70], rbp
    .text:000000014000106a 48895D98                   mov qword ptr [rbp-0x68], rbx
    .text:000000014000106e 4C8BA560FFFFFF             mov r12, qword ptr [rbp-0xa0]
    .text:0000000140001075 4C8BAD68FFFFFF             mov r13, qword ptr [rbp-0x98]
    .text:000000014000107c 4C8BB570FFFFFF             mov r14, qword ptr [rbp-0x90]
    .text:0000000140001083 4C8BBD78FFFFFF             mov r15, qword ptr [rbp-0x88]
    .text:000000014000108a 488B7580                   mov rsi, qword ptr [rbp-0x80]
    .text:000000014000108e 488B7D88                   mov rdi, qword ptr [rbp-0x78]
    .text:0000000140001092 488B6D90                   mov rbp, qword ptr [rbp-0x70]
    .text:0000000140001096 488B5D98                   mov rbx, qword ptr [rbp-0x68]
    .text:000000014000109a C9                         leave
    .text:000000014000109b C3                         ret
    sub_140001037   endp

#

sinsi

Rather than having two places to list (and change) registers


    saveregs MACRO arglist:VARARG
      reglist TEXTEQU <>
      cntr = 0
      FOR var, <arglist>
        mov r64[cntr], var
        cntr = cntr + 8
        IFNB reglist
          reglist CATSTR reglist,<,>,<var>
        ELSE
          reglist TEXTEQU <var>
        ENDIF
      ENDM
    ENDM

    restregs MACRO
      cntr = 0
      %FOR var, <reglist>
        mov var, r64[cntr]
        cntr = cntr + 8
      ENDM
    ENDM


HSE

I can't test on this 32-bit machine, but doesn't this work?

reglist TEXTEQU <arglist>
Equations in Assembly: SmplMath

hutch--

I have simplified the save macro and the design works well. What I am not sure about is the restore macro not taking the same arglist. I tend to prefer an identical arglist for clearer code.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include64\masm64rt.inc

    MLOCAL equ LOCAL    ; the word LOCAL is ambiguous in a MACRO

  ; -----------------------------------------------------
  ; acnt (arg count) must match the number of 64 bit regs
  ; -----------------------------------------------------
    REGSPACE MACRO acnt
      MLOCAL r64@@_@@[acnt] :QWORD
    ENDM

    SaveRegs MACRO arglist:VARARG
      cntr = 0
      reg@___list___@ equ arglist
      FOR arg,<arglist>
        ;; %echo arg
        mov r64@@_@@[cntr], arg
        cntr = cntr + 8
      ENDM
    ENDM

    RestoreRegs MACRO
      cntr = 0
      %FOR arg,<reg@___list___@>
        ;; %echo arg
        mov arg, r64@@_@@[cntr]
        cntr = cntr + 8
      ENDM
    ENDM

    .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    REGSPACE 4                      ; allocate local space for registers

    SaveRegs r12,r13,r14,r15        ; save register list

    call tstproc

    waitkey
    RestoreRegs                     ; restore register list

    .exit

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

tstproc proc

    REGSPACE 4

    SaveRegs rax,rbx,rcx,rdx

    nop
    nop
    nop
    nop

    RestoreRegs

    ret

tstproc endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    end

HSE

A troll?:
    REGSPACE MACRO arglist:VARARG
      cntr = 0
      reg@___list___@ equ arglist
      FOR arg,<arglist>
        cntr = cntr +1
      ENDM
      MLOCAL r64@@_@@[cntr] :QWORD
    ENDM

    SaveRegs MACRO
      cntr = 0
      %FOR arg,<reg@___list___@>
        mov r64@@_@@[cntr], arg
        cntr = cntr + 8
      ENDM
    ENDM

hutch--

Combining the two seems to work OK, and the result in the user code section is clear and easy enough to understand. The only problem I can see is that it will always have to be the last LOCAL, as the dynamic code that follows it prevents any further LOCAL variables. I might play with it a little longer, as the previous suggestion does not have this problem. What I am trying for is a clean and simple to understand technique.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include64\masm64rt.inc

    MLOCAL equ LOCAL    ; the word LOCAL is ambiguous in a MACRO

    SaveRegs MACRO arglist:VARARG
      MLOCAL r64@@_@@[argcount(arglist)] :QWORD
      cntr = 0
      reg@___list___@ equ arglist
      FOR arg,<arglist>
        ;; %echo arg
        mov r64@@_@@[cntr], arg
        cntr = cntr + 8
      ENDM
    ENDM

    RestoreRegs MACRO
      cntr = 0
      %FOR arg,<reg@___list___@>
        ;; %echo arg
        mov arg, r64@@_@@[cntr]
        cntr = cntr + 8
      ENDM
    ENDM

    .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    SaveRegs r12,r13,r14,r15        ; save register list

    call tstproc

    waitkey

    RestoreRegs                     ; restore register list

    .exit

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

tstproc proc

    SaveRegs rax,rbx,rcx,rdx

    nop
    nop
    nop
    nop

    RestoreRegs

    ret

tstproc endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    end

sinsi


hutch--

Yep, it's in the macro file. It's been very reliable, and the default alignment can be adjusted to handle different data sizes. This means that if someone wants to use AVX or larger, they can align the procedure so that the stack on entry is aligned, then declare locals from the largest size down so that each data size is correctly aligned.

LOCAL avxvar :YMMWORD
LOCAL ssevar :XMMWORD
LOCAL qwdvar :QWORD
LOCAL wrdvar :WORD
LOCAL bytvar :BYTE
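
The effect of the descending order can be checked with assembly-time arithmetic. This is a sketch that assumes the locals are packed in declaration order directly below a 32-byte aligned frame base. The *_off names are hypothetical, and the real offsets depend on the stackframe macro in use.

```masm
; Sketch only: each *_off is the distance of a local below an assumed
; 32-byte aligned frame base, with locals packed in declaration order.
avx_off = 32                    ; YMMWORD, 32 bytes
sse_off = avx_off + 16          ; XMMWORD, 16 bytes -> 48
qwd_off = sse_off + 8           ; QWORD,    8 bytes -> 56
wrd_off = qwd_off + 2           ; WORD,     2 bytes -> 58
byt_off = wrd_off + 1           ; BYTE,     1 byte  -> 59

; Every offset is a multiple of its own data size, so no padding is needed.
IF (avx_off MOD 32) OR (sse_off MOD 16) OR (qwd_off MOD 8) OR (wrd_off MOD 2)
    .err <a local is not naturally aligned>
ENDIF

    end
```

With the order reversed and the BYTE declared first, the same packing would put the QWORD at a distance of 11 bytes from the base, which is not a multiple of 8.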