How would y'all write this assembly?

Started by lemonjumps, September 07, 2024, 01:31:23 AM

lemonjumps

Hi, so I have this code.

What it does is call another function using the Microsoft x64 calling convention.

Its input is a list of variables, each aligned to 8 bytes, pointed to by rdx (it's assumed that the variables are already formatted correctly for the CPU).
Then there's r8, which points to a list of types for special cases like float and double, also aligned to 8 bytes.
rcx holds the address of the function to call, and r9 holds the argument count.

I've chosen to stage the values destined for rcx, rdx, r8 and r9 in r11 to r14. r10 is the array offset, and r15 stores the size of the additional stack space where the remaining values are passed.

What I'd like to ask is: how could this be better, and what should I improve?

This is literally the first actual x64 assembly I ever wrote. (TwT)
I've also noticed that the variables I'm pushing to the stack end up in reverse order, so I'll have to fix that too. I think it could be written so that I reserve the stack space first and then write into it backwards.

pinADcallWIN:
    push rbp
    mov rbp, rsp
   
    xor r15, r15
    xor r10, r10

    dec r9d
    jz __winCall

    mov rax, qword ptr [r8 + r10]
    cmp rax, 1
    je _rcxSWfloat
    cmp rax, 2
    je _rcxSWdouble
    mov r11, qword ptr [rdx + r10]
    jmp _rcxSWend
_rcxSWfloat:
    movss xmm0, dword ptr [rdx + r10]
    jmp _rcxSWend
_rcxSWdouble:
    movsd xmm0, qword ptr [rdx + r10]
_rcxSWend:
    add r10, 8

    dec r9d
    jz __winCall

    mov rax, qword ptr [r8 + r10]
    cmp rax, 1
    je _rdxSWfloat
    cmp rax, 2
    je _rdxSWdouble
    mov r12, qword ptr [rdx + r10]
    jmp _rdxSWend
_rdxSWfloat:
    movss xmm1, dword ptr [rdx + r10]
    jmp _rdxSWend
_rdxSWdouble:
    movsd xmm1, qword ptr [rdx + r10]
_rdxSWend:
    add r10, 8

    dec r9d
    jz __winCall

    mov rax, qword ptr [r8 + r10]
    cmp rax, 1
    je _r8SWfloat
    cmp rax, 2
    je _r8SWdouble
    mov r13, qword ptr [rdx + r10]
    jmp _r8SWend
_r8SWfloat:
    movss xmm2, dword ptr [rdx + r10]
    jmp _r8SWend
_r8SWdouble:
    movsd xmm2, qword ptr [rdx + r10]
_r8SWend:
    add r10, 8

    dec r9d
    jz __winCall

    mov rax, qword ptr [r8 + r10]
    cmp rax, 1
    je _r9SWfloat
    cmp rax, 2
    je _r9SWdouble
    mov r14, qword ptr [rdx + r10]
    jmp _r9SWend
_r9SWfloat:
    movss xmm3, dword ptr [rdx + r10]
    jmp _r9SWend
_r9SWdouble:
    movsd xmm3, qword ptr [rdx + r10]
_r9SWend:
    add r10, 8

    cmp r9d, 0
    je __winCall

__winCallLoop:
    push qword ptr [rdx + r10]
    add r15, 8
    add r10, 8
    dec r9d
    jnz __winCallLoop
__winCall:
    mov rax, rcx
    mov rcx, r11
    mov rdx, r12
    mov r8, r13
    mov r9, r14

    mov qword ptr [rbp + 16], r15

    sub rsp, 32

    call rax

    mov r15, qword ptr [rbp + 16]
    add rsp, r15
    add rsp, 32
   
    pop rbp
    ret


Vortex

Hi lemonjumps,

xor r15, r15
Your function does not call other functions or API functions, but it's important to note that:

Quote: The x64 ABI considers registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, and XMM6-XMM15 nonvolatile. They must be saved and restored by a function that uses them.

https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170

NoCforMe

Let me second that emotion; very important point here. The ABI expects certain registers to be un-trashed, so be sure to save and restore them in your code if you need to use them. In x64:
  • RBX
  • RBP
  • RDI
  • RSI
  • RSP
  • R12
  • R13
  • R14
  • R15
  • XMM6-XMM15

are what I call "sacred" registers. Don't be sacrilegious!

(All others, RAX, RCX, etc., are volatile and can be trashed.)
Assembly language programming should be fun. That's why I do it.
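
To make the save-and-restore rule concrete, here is a minimal sketch, assuming MASM (ml64) syntax. The procedure name sumTwo and its behaviour (return rcx + rdx) are made up for illustration and are not from the thread. It uses r12 and r15 as scratch registers, so it pushes them on entry and pops them in reverse order before returning.

```asm
.code

; Hypothetical example: returns rcx + rdx in rax.
; r12 and r15 are nonvolatile, so they are saved before use and
; restored afterwards. The routine makes no calls, so it needs no
; shadow space and no stack alignment work.
sumTwo PROC
    push  r12                     ; save caller's r12
    push  r15                     ; save caller's r15

    mov   r12, rcx                ; first argument
    mov   r15, rdx                ; second argument
    lea   rax, [r12 + r15]        ; result goes in rax (volatile)

    pop   r15                     ; restore in reverse push order
    pop   r12
    ret
sumTwo ENDP

END
```

Using only volatile registers (rax, rcx, rdx, r8 to r11) for scratch work would avoid the pushes and pops entirely.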

lemonjumps

Quote from: Vortex on September 07, 2024, 03:46:18 AM
The x64 ABI considers registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, and XMM6-XMM15 nonvolatile. They must be saved and restored by a function that uses them.

Oooh OK, I didn't know that. And yeah, I'm actually calling RAX at the end :D
I'll be refactoring it tonight, so I'll look into it.

NoCforMe

Quote from: lemonjumps on September 07, 2024, 07:55:23 AM
I didn't know that. and yeah I'm actually calling RAX at the end

That's fine, so long as you know what's in that register when you issue the call ...

Just so you know, the flip side of having to save and restore non-volatile registers is that you should assume all the volatile ones (RAX, RCX, etc.) contain garbage when entering your code (unless, of course, you yourself put something in them before making the call, which is perfectly legitimate).

lemonjumps

OK, so I'm working on optimizing my code, and I realized two things.

1. The list of variables has basically the same layout as the stack, so treating it as one shouldn't be a problem.
2. Since the contents of volatile registers only matter when they carry a parameter, I can just write each value into both the XMM and the general-purpose register, and the called function will pick whichever it wants.

The code is still lengthy since I have separate versions for 1, 2, 3 and 4+ parameters, as that looks to be the simplest and fastest solution :D
Also, I wonder whether it's faster to execute both pop and movsd, or to use cmp with pointer math.
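
The "write it into both registers" idea can be shown for a single argument. This is a minimal sketch, assuming MASM (ml64) syntax; the name callOneArg is hypothetical. It also assumes a float is stored in the low 4 bytes of its 8-byte slot. The same 8 bytes are loaded twice, once into rcx and once into xmm0, so no type check is needed and rsp never has to point at the argument list.

```asm
.code

; Hypothetical helper (not from the thread).
; In:  rcx = address of the function to call
;      rdx = pointer to one 8-byte argument slot
callOneArg PROC
    sub   rsp, 40                 ; 32 bytes of shadow space + 8 so rsp is 16-byte aligned
    mov   rax, rcx                ; rax = call target
    mov   rcx, qword ptr [rdx]    ; integer/pointer view of the slot
    movsd xmm0, qword ptr [rdx]   ; floating-point view of the same 8 bytes
    call  rax                     ; callee reads rcx or xmm0, whichever it expects
    add   rsp, 40
    ret                           ; callee's rax/xmm0 result is passed straight through
callOneArg ENDP

END
```

Microsoft's convention already requires this duplication for floating-point arguments passed to variadic functions, so doing it for every argument should be harmless.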

Here's what my code looks like now OwO
pinADcallWIN:
    push rbp
    push r11
    push r12
    push r13
    push r14
    push r15
    mov rbp, rsp

    cmp r9, 1
    je _winCall1
    cmp r9, 2
    je _winCall2
    cmp r9, 3
    je _winCall3
   
    jmp _winCall4p

_winCall1:
    mov rsp, rdx

    movsd xmm0, qword ptr [rsp]
    pop r11

    mov rsp, rbp
    jmp __winCall

_winCall2:
    mov rsp, rdx

    movsd xmm0, qword ptr [rsp]
    pop r11
    movsd xmm1, qword ptr [rsp]
    pop r12

    mov rsp, rbp
    jmp __winCall

_winCall3:
    mov rsp, rdx

    movsd xmm0, qword ptr [rsp]
    pop r11
    movsd xmm1, qword ptr [rsp]
    pop r12
    movsd xmm2, qword ptr [rsp]
    pop r13

    mov rsp, rbp
    jmp __winCall

_winCall4p:
    mov rsp, rdx

    movsd xmm0, qword ptr [rsp]
    pop r11
    movsd xmm1, qword ptr [rsp]
    pop r12
    movsd xmm2, qword ptr [rsp]
    pop r13
    movsd xmm3, qword ptr [rsp]
    pop r14

    mov rsp, rbp

    sub r9d, 4
    jz __winCall

    add rdx, 24
    imul r9, 8
    mov r15, r9

__winCallLoop:
    push qword ptr [rdx + r9]
    sub r9d, 8
    jnz __winCallLoop

__winCall:
    mov rax, rcx
    mov rcx, r11
    mov rdx, r12
    mov r8, r13
    mov r9, r14

    sub rsp, 32

    call rax

    add rsp, r15
    add rsp, 32

    pop r15
    pop r14
    pop r13
    pop r12
    pop r11
    pop rbp
    ret
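
For comparison, here is one possible shape for the whole routine that avoids the per-count branches. It is a sketch under stated assumptions, not a definitive implementation: MASM (ml64) syntax, the count arrives in the low 32 bits of r9, and each argument is already formatted in its 8-byte slot, so the type list in r8 is not used. It reserves the outgoing stack area first, copies every argument forward so that argument i lands at [rsp + 8*i], and then loads the first four slots into both the integer and the XMM registers. Only volatile registers are used besides rbp.

```asm
.code

; Sketch of an alternative pinADcallWIN (assumptions in the text above).
; In:  rcx = address of the function to call
;      rdx = pointer to the argument array, one 8-byte slot per argument
;      r9d = argument count
pinADcallWIN PROC
    push  rbp                     ; nonvolatile, so save it
    mov   rbp, rsp                ; rsp is 16-byte aligned after this push

    mov   rax, rcx                ; rax = call target
    mov   r10, rdx                ; r10 = argument array
    mov   r11d, r9d               ; r11 = count, zero-extended to 64 bits

    ; Reserve max(count, 4) slots, rounded up to a multiple of 16 bytes.
    ; Slots 0..3 double as the 32-byte shadow space for the callee.
    mov   ecx, 4
    cmp   r11, rcx
    cmova rcx, r11                ; rcx = max(count, 4)
    lea   rcx, [rcx*8 + 15]
    and   rcx, -16
    sub   rsp, rcx                ; rsp stays 16-byte aligned for the call

    ; Copy all arguments forward: argument i goes to [rsp + 8*i].
    xor   ecx, ecx
copy_next:
    cmp   rcx, r11
    jae   copy_done
    mov   rdx, qword ptr [r10 + rcx*8]
    mov   qword ptr [rsp + rcx*8], rdx
    inc   rcx
    jmp   copy_next
copy_done:

    ; Load the first four slots into both register files. Slots beyond
    ; the count hold leftover stack data, which the callee never reads.
    mov   rcx, qword ptr [rsp]
    mov   rdx, qword ptr [rsp + 8]
    mov   r8,  qword ptr [rsp + 16]
    mov   r9,  qword ptr [rsp + 24]
    movsd xmm0, qword ptr [rsp]
    movsd xmm1, qword ptr [rsp + 8]
    movsd xmm2, qword ptr [rsp + 16]
    movsd xmm3, qword ptr [rsp + 24]

    call  rax

    mov   rsp, rbp                ; drop the argument area, whatever its size
    pop   rbp
    ret                           ; callee's rax/xmm0 result is passed through
pinADcallWIN ENDP

END
```

Restoring rsp from rbp means the routine does not need to remember how many bytes it reserved. The sketch does not emit unwind data, so exceptions raised in the callee could not be unwound through it, and very large argument counts would also need stack probing.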