X64 ABI: how does a C compiler pass a REAL8?

Started by jj2007, April 08, 2021, 08:39:27 PM

jj2007

Hi everybody,

I'd like to find out how a C/C++ compiler deals with mixed int/real arguments, so I picked gluPartialDisk(), see below.
Unfortunately I can't use my Visual C compiler, because it has the infamous Command line error D8037 bug - despite a complete reinstallation.

So I tried with GCC and stumbled over several compile errors, which I could eliminate by creating a private gluJJ.h and removing the offending (unrelated) parts (apparently, C compilers are not compatible with each other).

Now it does compile, but the linker throws an error: it complains about an "undefined reference to `gluPartialDisk@44", which is pretty weird because
a) the byte count is 48 - see below
b) the lib has indeed a gluPartialDisk@44 :rolleyes:

Most likely this is a 32-bit library (strange that the linker does not notice that...), so I checked for another glu32.lib and found one at C:\Program Files (x86)\Windows Kits\8.1\Lib\winv6.3\um\x64\GlU32.Lib, with a header file at C:\Program Files (x86)\Windows Kits\8.1\Include\um\gl\GLU.h. The latter throws an error saying it can't find winapifamily.h :sad:

Any experts around who can make this work? I attach the relevant files, including files used to build a Masm64 SDK version (glutest.asm) that assembles fine but does not use xmm regs as prescribed by the X64 ABI.

#include <stdio.h>
// C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\Include\gl\GLU.h throws errors
#include <\Masm32\examples64\Glu32\GLUJJ.h>
#pragma comment( lib, "GlU32.Lib" )

/*
void APIENTRY gluPartialDisk (
    GLUquadric          *qobj, 8+
    GLdouble            innerRadius, 8+
    GLdouble            outerRadius, 8+
    GLint               slices, 4+
    GLint               loops, 4+
    GLdouble            startAngle, 8+
    GLdouble            sweepAngle); 8=48
*/

int main(int argc, char* argv[]) {
  GLUquadric *qobj;
  double innerRadius, outerRadius, startAngle, sweepAngle;
  int slices, loops;
  gluPartialDisk(qobj, innerRadius, outerRadius, slices, loops, startAngle, sweepAngle);
  printf("running - all is fine");
}


P.S.: I'm pretty sure that UAsm and AsmC can handle this correctly, but I would really like to see how exactly an industry standard C or C++ compiler passes the arguments to gluPartialDisk.

nidud

#1
deleted

HSE

Hi Nidud!

Is using edi and esi instead of ecx and edx. What convention is that?

Thanks.
Equations in Assembly: SmplMath

jj2007

Quote from: HSE on April 08, 2021, 09:51:57 PM
Is using edi and esi instead of ecx and edx. What convention is that?

I noticed that, too. Otherwise, it's a very useful suggestion - thanks, Nidud :thup:

Test it with a full example:
#include <stdio.h>
double foo(int arg1, double arg2, int arg3, double arg4, int arg5, float arg6){
    printf("You passed 6 args: %i %f %i %f %i %f", arg1, arg2, arg3, arg4, arg5, arg6);
    return arg4*2;
}

double bar(int a) {
int pass1=123;
double pass2=123.456;
int pass3=456;
double pass4=456.789;
int pass5=555;
float pass6=666.666;
    return foo(pass1, pass2, pass3, pass4, pass5, pass6);
}

nidud

#4
deleted

johnsa

If you're getting esi/edi as the regs used in the calling convention that sounds like you're targeting SYSTEM-V (Linux ABI).

REAL8 should be passed in XMMn register as is REAL4, unless it goes into the stack space parameters.

jj2007

Ok, I tried with a more detailed example:
#include <stdio.h>
double foo(int arg1, double arg2, int arg3, double arg4, int arg5, float arg6){
    printf("You passed 6 args: %i %f %i %f %i %f", arg1, arg2, arg3, arg4, arg5, arg6);
    return arg4*2;
}

double bar(int a) {
int pass1=123;
double pass2=123.456;
int pass3=456;
double pass4=456.789;
int pass5=555;
float pass6=666.666;
    return foo(pass1, pass2, pass3, pass4, pass5, pass6);
}


Godbolt replies as follows, using x64 msvc v19.latest:
$SG5204 DB 'You passed 6 args: %i %f %i %f %i %f', 00H
unsigned __int64 `__local_stdio_printf_options'::`2'::_OptionsStorage DQ 01H DUP (?) ; `__local_stdio_printf_options'::`2'::_OptionsStorage
__real@4426aaa0 DD 04426aaa0r ; 666.666
__real@407c8c9fbe76c8b4 DQ 0407c8c9fbe76c8b4r ; 456.789
__real@405edd2f1a9fbe77 DQ 0405edd2f1a9fbe77r ; 123.456
__real@4000000000000000 DQ 04000000000000000r ; 2

arg1$ = 80
arg2$ = 88
arg3$ = 96
arg4$ = 104
arg5$ = 112
arg6$ = 120
double foo(int,double,int,double,int,float) PROC ; foo
$LN3:
  movsd QWORD PTR [rsp+32], xmm3
  mov DWORD PTR [rsp+24], r8d
  movsd QWORD PTR [rsp+16], xmm1
  mov DWORD PTR [rsp+8], ecx
  sub rsp, 72 ; 00000048H
  cvtss2sd xmm0, DWORD PTR arg6$[rsp]
  movsd QWORD PTR [rsp+48], xmm0
  mov eax, DWORD PTR arg5$[rsp]
  mov DWORD PTR [rsp+40], eax
  movsd xmm0, QWORD PTR arg4$[rsp]
  movsd QWORD PTR [rsp+32], xmm0
  mov r9d, DWORD PTR arg3$[rsp]
  movsd xmm2, QWORD PTR arg2$[rsp]
  movq r8, xmm2
  mov edx, DWORD PTR arg1$[rsp]
  lea rcx, OFFSET FLAT:$SG5204
  call printf
  movsd xmm0, QWORD PTR arg4$[rsp]
  mulsd xmm0, QWORD PTR __real@4000000000000000
  add rsp, 72 ; 00000048H
  ret 0
double foo(int,double,int,double,int,float) ENDP ; foo

pass6$ = 48
pass5$ = 52
pass3$ = 56
pass1$ = 60
pass4$ = 64
pass2$ = 72
a$ = 96
double bar(int) PROC ; bar
$LN3:
  mov DWORD PTR [rsp+8], ecx
  sub rsp, 88 ; 00000058H
  mov DWORD PTR pass1$[rsp], 123 ; 0000007bH
  movsd xmm0, QWORD PTR __real@405edd2f1a9fbe77
  movsd QWORD PTR pass2$[rsp], xmm0
  mov DWORD PTR pass3$[rsp], 456 ; 000001c8H
  movsd xmm0, QWORD PTR __real@407c8c9fbe76c8b4
  movsd QWORD PTR pass4$[rsp], xmm0
  mov DWORD PTR pass5$[rsp], 555 ; 0000022bH
  movss xmm0, DWORD PTR __real@4426aaa0
  movss DWORD PTR pass6$[rsp], xmm0
  movss xmm0, DWORD PTR pass6$[rsp]
  movss DWORD PTR [rsp+40], xmm0
  mov eax, DWORD PTR pass5$[rsp]
  mov DWORD PTR [rsp+32], eax
  movsd xmm3, QWORD PTR pass4$[rsp]
  mov r8d, DWORD PTR pass3$[rsp]
  movsd xmm1, QWORD PTR pass2$[rsp]
  mov ecx, DWORD PTR pass1$[rsp]
  call double foo(int,double,int,double,int,float) ; foo
  add rsp, 88 ; 00000058H
  ret 0
double bar(int) ENDP ; bar


It looks ok, but the button to the right of x64 msvc v19.latest remains orange instead of green :sad:

hutch--

I have seen it in the past that various compilers including C compilers do not use the 32 or 64 bit ABI but a scheme of their own that allows them to perform a number of compiler optimisation tricks. I imagine they must still comply with the ABI when interacting with the OS.

jj2007

Quote from: hutch-- on April 08, 2021, 10:55:38 PM
I imagine they must still comply with the ABI when interacting with the OS.

I added a GetTickCount():

#include <stdio.h>
#include <Windows.h>
double foo(int arg1, double arg2, int arg3, double arg4, int arg5, float arg6){
    printf("You passed 6 args: %i %f %i %f %i %f", arg1, arg2, arg3, arg4, arg5, arg6);
    return arg4*GetTickCount();  //###### force interacting with the OS ########
}

double bar(int a) {
int pass1=123;
double pass2=123.456;
int pass3=456;
double pass4=456.789;
int pass5=555;
float pass6=666.666;
    return foo(pass1, pass2, pass3, pass4, pass5, pass6);
}


The process of passing args looks oddly complicated:
$LN3: ; foo(int arg1, double arg2, int arg3, double arg4, int arg5, float arg6)
  mov DWORD PTR [rsp+8], ecx
  sub rsp, 88 ; 00000058H
  mov DWORD PTR pass1$[rsp], 123 ; 0000007bH
  movsd xmm0, QWORD PTR __real@405edd2f1a9fbe77
  movsd QWORD PTR pass2$[rsp], xmm0
  mov DWORD PTR pass3$[rsp], 456 ; 000001c8H
  movsd xmm0, QWORD PTR __real@407c8c9fbe76c8b4
  movsd QWORD PTR pass4$[rsp], xmm0
  mov DWORD PTR pass5$[rsp], 555 ; 0000022bH
  movss xmm0, DWORD PTR __real@4426aaa0
  movss DWORD PTR pass6$[rsp], xmm0
  movss xmm0, DWORD PTR pass6$[rsp] ; REAL4
  movss DWORD PTR [rsp+40], xmm0
  mov eax, DWORD PTR pass5$[rsp] ; DWORD
  mov DWORD PTR [rsp+32], eax
  movsd xmm3, QWORD PTR pass4$[rsp] ; REAL8
  mov r8d, DWORD PTR pass3$[rsp] ; DWORD
  movsd xmm1, QWORD PTR pass2$[rsp] ; REAL8
  mov ecx, DWORD PTR pass1$[rsp] ; DWORD
  call double foo(int,double,int,double,int,float) ; foo

KradMoonRa


#include <stdio.h>
#include <intrin.h>

__pragma(optimize("g", on))
__pragma(pack(push, 16) )

// Exclude rarely-used stuff from Windows headers
#ifndef VC_EXTRALEAN
#define VC_EXTRALEAN
#endif
#ifndef WIN32_LEAN_AND_MEAN
#define WIN32_LEAN_AND_MEAN
#endif
#ifndef NOMINMAX
#define NOMINMAX
#endif
#ifndef STRICT
#define STRICT
#endif

#include <Windows.h>

extern int const pass1=123;
extern double const pass2=123.456;
extern int const pass3=456;
extern double const pass4=456.789;
extern int const pass5=555;
extern float const pass6=666.666f;
extern const char passedrags[] ="You passed 6 args: %i %f %i %f %i %f";

double foo(int arg1, double arg2, int arg3, double arg4, int arg5, float arg6){
    printf(passedrags, arg1, arg2, arg3, arg4, arg5, arg6);
    return arg4*GetTickCount();  //###### force interacting with the OS ########
}

double bar(int a) {
    return foo(pass1, pass2, pass3, pass4, pass5, pass6);
}



char const * const passedrags DB 'You passed 6 args: %i %f %i %f %i %f', 00H ; passedrags
unsigned __int64 `__local_stdio_printf_options'::`2'::_OptionsStorage DQ 01H DUP (?) ; `__local_stdio_printf_options'::`2'::_OptionsStorage
__real@4426aaa0 DD 04426aaa0r             ; 666.666
__real@407c8c9fbe76c8b4 DQ 0407c8c9fbe76c8b4r   ; 456.789
__real@405edd2f1a9fbe77 DQ 0405edd2f1a9fbe77r   ; 123.456

arg1$ = 96
arg2$ = 104
arg3$ = 112
arg4$ = 120
arg5$ = 128
arg6$ = 136
double foo(int,double,int,double,int,float) PROC                              ; foo
$LN4:
        sub     rsp, 88                             ; 00000058H
        movss   xmm0, DWORD PTR arg6$[rsp]
        movaps  xmm2, xmm1
        mov     eax, DWORD PTR arg5$[rsp]
        mov     r9d, r8d
        cvtps2pd xmm0, xmm0
        mov     edx, ecx
        lea     rcx, OFFSET FLAT:char const * const passedrags       ; passedrags
        movaps  XMMWORD PTR [rsp+64], xmm6
        movq    r8, xmm2
        movsd   QWORD PTR [rsp+48], xmm0
        movaps  xmm6, xmm3
        mov     DWORD PTR [rsp+40], eax
        movsd   QWORD PTR [rsp+32], xmm3
        call    printf
        call    QWORD PTR __imp_GetTickCount
        xorps   xmm0, xmm0
        mov     eax, eax
        cvtsi2sd xmm0, rax
        mulsd   xmm0, xmm6
        movaps  xmm6, XMMWORD PTR [rsp+64]
        add     rsp, 88                             ; 00000058H
        ret     0
double foo(int,double,int,double,int,float) ENDP                              ; foo

a$ = 64
double bar(int) PROC                                 ; bar
$LN4:
        sub     rsp, 56                             ; 00000038H
        movss   xmm0, DWORD PTR __real@4426aaa0
        mov     r8d, 456                      ; 000001c8H
        movsd   xmm3, QWORD PTR __real@407c8c9fbe76c8b4
        mov     ecx, 123                      ; 0000007bH
        movsd   xmm1, QWORD PTR __real@405edd2f1a9fbe77
        movss   DWORD PTR [rsp+40], xmm0
        mov     DWORD PTR [rsp+32], 555             ; 0000022bH
        call    double foo(int,double,int,double,int,float)                 ; foo
        add     rsp, 56                             ; 00000038H
        ret     0
double bar(int) ENDP                                 ; bar
The uasmlib

jj2007

Thanks, KradMoonRa. Can you see any real difference to the disassembly I posted above (under The process of passing args looks oddly complicated)?

tenkey

int is 32-bit in MSVC and is not converted to 64-bit. float is REAL4 and is not converted to REAL8. I don't see anything else that seems complicated.

Quote from: jj2007 on April 08, 2021, 11:05:19 PM
The process of passing args looks oddly complicated:
$LN3: ; foo(int arg1, double arg2, int arg3, double arg4, int arg5, float arg6)
  mov DWORD PTR [rsp+8], ecx
  sub rsp, 88 ; 00000058H
  mov DWORD PTR pass1$[rsp], 123 ; 0000007bH
  movsd xmm0, QWORD PTR __real@405edd2f1a9fbe77
  movsd QWORD PTR pass2$[rsp], xmm0
  mov DWORD PTR pass3$[rsp], 456 ; 000001c8H
  movsd xmm0, QWORD PTR __real@407c8c9fbe76c8b4
  movsd QWORD PTR pass4$[rsp], xmm0
  mov DWORD PTR pass5$[rsp], 555 ; 0000022bH
  movss xmm0, DWORD PTR __real@4426aaa0
  movss DWORD PTR pass6$[rsp], xmm0
  movss xmm0, DWORD PTR pass6$[rsp] ; REAL4, in mem
  movss DWORD PTR [rsp+40], xmm0
  mov eax, DWORD PTR pass5$[rsp] ; DWORD, in mem
  mov DWORD PTR [rsp+32], eax
  movsd xmm3, QWORD PTR pass4$[rsp] ; REAL8, not in r9, in xmm3
  mov r8d, DWORD PTR pass3$[rsp] ; DWORD, in r8 (32-bit half), not in xmm2
  movsd xmm1, QWORD PTR pass2$[rsp] ; REAL8, not in rdx, in xmm1
  mov ecx, DWORD PTR pass1$[rsp] ; DWORD, in rcx (32-bit half), not in xmm0
  call double foo(int,double,int,double,int,float) ; foo


TimoVJL

With Pelles C 1031:
double foo(int arg1, double arg2, int arg3, double arg4, int arg5, float arg6){

foo:
  [0000000000000000] 4883EC58                     sub               rsp,58
  [0000000000000004] 660F7F742440                 movdqa            xmmword ptr [rsp+40],xmm6
  [000000000000000A] 89CA                         mov               edx,ecx
  [000000000000000C] 660F28D1                     movapd            xmm2,xmm1
  [0000000000000010] 4589C1                       mov               r9d,r8d
  [0000000000000013] 660F28F3                     movapd            xmm6,xmm3
  [0000000000000017] 8B842480000000               mov               eax,dword ptr [rsp+80]
32:     printf(passedrags, arg1, arg2, arg3, arg4, arg5, arg6);
  [000000000000001E] F30F10842488000000           movss             xmm0,dword ptr [rsp+88]
  [0000000000000027] F30F5AC0                     cvtss2sd          xmm0,xmm0
  [000000000000002B] F20F11442430                 movsd             qword ptr [rsp+30],xmm0
  [0000000000000031] 89442428                     mov               dword ptr [rsp+28],eax
  [0000000000000035] F20F11742420                 movsd             qword ptr [rsp+20],xmm6
  [000000000000003B] 488D0D00000000               lea               rcx,[passedrags]
  [0000000000000042] 66490F7ED0                   movq              r8,xmm2
  [0000000000000047] E800000000                   call              printf
33:     return arg4*GetTickCount();  //###### force interacting with the OS ########
  [000000000000004C] FF1500000000                 call              qword ptr [__imp_GetTickCount]
  [0000000000000052] F2480F2AC0                   cvtsi2sd          xmm0,rax
  [0000000000000057] F20F59C6                     mulsd             xmm0,xmm6
  [000000000000005B] 660F6F742440                 movdqa            xmm6,xmmword ptr [rsp+40]
  [0000000000000061] 4883C458                     add               rsp,58
  [0000000000000065] C3                           ret               
  [0000000000000066] 90                           nop               
  [0000000000000067] 660F1F840000000000           nop               [rax+rax+0]
34: }
clang
foo:
00000000  4883EC58                 sub rsp, 58h
00000004  0F29742440               movaps xmmword ptr [rsp+40h], xmm6
00000009  660F28F3                 movapd xmm6, xmm3
0000000D  4589C1                   mov r9d, r8d
00000010  89CA                     mov edx, ecx
00000012  8B842480000000           mov eax, dword ptr [rsp+80h]
00000019  F30F10842488000000       movss xmm0, dword ptr [rsp+88h]
00000022  F30F5AC0                 cvtss2sd xmm0, xmm0
00000026  F20F11442430             movsd qword ptr [rsp+30h], xmm0
0000002C  89442428                 mov dword ptr [rsp+28h], eax
00000030  F20F115C2420             movsd qword ptr [rsp+20h], xmm3
00000036  488D0D00000000           lea rcx, [passedrags]
0000003D  660F6FD1                 movdqa xmm2, xmm1
00000041  66490F7EC8               movq r8, xmm1
00000046  E800000000               call printf
0000004B  FF1500000000             call qword ptr [__imp_GetTickCount]
00000051  89C0                     mov eax, eax
00000053  0F57C0                   xorps xmm0, xmm0
00000056  F2480F2AC0               cvtsi2sd xmm0, rax
0000005B  F20F59C6                 mulsd xmm0, xmm6
0000005F  0F28742440               movaps xmm6, xmmword ptr [rsp+40h]
00000064  4883C458                 add rsp, 58h
00000068  C3                       ret
00000069  0F1F8000000000           nop dword ptr [rax], eax
bar:
00000070  4883EC38                 sub rsp, 38h
00000074  48B80000000054D58440     mov rax, 4084D55400000000h
0000007E  4889442430               mov qword ptr [rsp+30h], rax
00000083  48B8B4C876BE9F8C7C40     mov rax, 407C8C9FBE76C8B4h
0000008D  4889442420               mov qword ptr [rsp+20h], rax
00000092  C74424282B020000         mov dword ptr [rsp+28h], 22Bh
0000009A  488D0D00000000           lea rcx, [passedrags]
000000A1  F30F7E1500000000         movq xmm2, qword ptr [__real@405edd2f1a9fbe77]
000000A9  BA7B000000               mov edx, 7Bh
000000AE  66490F7ED0               movq r8, xmm2
000000B3  41B9C8010000             mov r9d, 1C8h
000000B9  E800000000               call printf
000000BE  FF1500000000             call qword ptr [__imp_GetTickCount]
000000C4  89C0                     mov eax, eax
000000C6  F2480F2AC0               cvtsi2sd xmm0, rax
000000CB  F20F590500000000         mulsd xmm0, qword ptr [__real@407c8c9fbe76c8b4]
000000D3  4883C438                 add rsp, 38h
000000D7  C3                       ret
May the source be with you

LiaoMi

Quote from: tenkey on April 14, 2021, 01:45:36 PM
int is 32-bit in MSVC and is not converted to 64-bit. float is REAL4 and is not converted to REAL8. I don't see anything else that seems complicated.

Hi tenkey,

"Plain ints have the natural size suggested by the architecture of the execution environment; the other signed integer types are provided to meet special needs." - https://timsong-cpp.github.io/cppwp/n3337/basic.fundamental

This is a typical case of abstraction: you need to decide what type a function should take on systems of different bitness.