Topic: fastcall

jimg

  • Member
  • ***
  • Posts: 462
fastcall
« on: March 29, 2020, 01:53:32 AM »
I don't think I'll ever understand fastcall or the rationale behind it. There should be a special place reserved for its inventor next to those who invented C.

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 7212
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: fastcall
« Reply #1 on: March 29, 2020, 02:39:16 AM »
In 32 bit, it's design-your-own; in 64 bit it is the normal calling convention.

In 32 bit you can use eax, ecx and edx as your first three arguments and then use the stack for the rest. In 64 bit you MUST pass the first four arguments in the rcx, rdx, r8 and r9 registers, and the following arguments must be written to the correct locations on the stack. In 64 bit it's best to use a macro designed to do that; doing it manually is a nightmare.
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :skrewy:

jimg

Re: fastcall
« Reply #2 on: March 29, 2020, 03:23:22 AM »
Quote
in 64 bit you MUST use the first four arguments in the rcx, rdx, r8 and r9 registers and the following arguments must be written to the correct locations on the stack. In 64 bit its best to use a macro designed to do that, doing it manually is a nightmare.

Yes, I meant for 64 bit. I'll need a macro for sure as I don't get it. But I'd kind of like to understand it.
It says you have to pass the parameters in rcx, rdx, r8 and r9, but you also have to make space for those 4 on the stack. Why? Either I want to preserve them and I'll put them somewhere, or I don't want to preserve them and won't. Why the extra space on the stack?

hutch--

Re: fastcall
« Reply #3 on: March 29, 2020, 05:09:44 AM »
Jim,

It's called shadow space, and it's for higher-level functions where you are mixing API and similar calls with mnemonic code. For 4 or fewer QWORD-sized arguments, I use a macro, "rcall", as it drops the overhead slightly without the shadow space, but for procedure calls with more than 4 arguments the "invoke" notation does the job, and it's more tolerant of different data sizes as they all get dumped into the same 64-bit slot. Win64 FASTCALL is a bit complicated, but it is more efficient than the 32-bit STDCALL.
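
As a rough illustration of what the shadow space is for, here is a minimal hand-written sketch in ML64 syntax. The names Sum4 and main are made up for the example, and it assumes a plain ml64 build with main as the entry point. The caller reserves 32 bytes above the return address, and the callee is free to spill rcx, rdx, r8 and r9 into those four "home" slots, which is roughly what a compiler does in unoptimised or vararg code.

```asm
; Hypothetical sketch, not taken from any SDK or macro set.
; Assumed build: ml64 demo.asm /link /subsystem:console /entry:main

.code

; Sum4(a, b, c, d): arguments arrive in rcx, rdx, r8, r9.
; On entry: [rsp]      = return address
;           [rsp+8]    = home slot for rcx
;           [rsp+10h]  = home slot for rdx
;           [rsp+18h]  = home slot for r8
;           [rsp+20h]  = home slot for r9
Sum4 PROC
    mov     [rsp+8],   rcx      ; spill the register arguments into the
    mov     [rsp+10h], rdx      ; shadow space the caller reserved
    mov     [rsp+18h], r8
    mov     [rsp+20h], r9

    mov     rax, [rsp+8]        ; from here on they can be treated like
    add     rax, [rsp+10h]      ; ordinary stack arguments
    add     rax, [rsp+18h]
    add     rax, [rsp+20h]
    ret
Sum4 ENDP

main PROC
    sub     rsp, 28h            ; 20h shadow space + 8 to keep rsp 16-byte aligned
    mov     rcx, 10
    mov     rdx, 20
    mov     r8,  30
    mov     r9,  40
    call    Sum4                ; rax = 10 + 20 + 30 + 40
    add     rsp, 28h
    ret
main ENDP

END
```

A callee that does not need the home slots simply ignores them; the cost to the caller is a single stack adjustment, which can be done once per procedure rather than once per call.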

Vortex

  • Member
  • *****
  • Posts: 2206
Re: fastcall
« Reply #4 on: March 29, 2020, 05:21:47 AM »
Quote
The described feature leads to a significant growth of the stack consumption speed. Even if the function does not have parameters, 32 bytes will be "bit off" the stack anyway and they will not be used anyhow then. I failed to find the reason for such a wasteful mechanism. There were some explanations concerning unification and simplification of debugging but this information was too vague.

https://software.intel.com/en-us/blogs/2010/07/01/the-reasons-why-64-bit-programs-require-more-stack-memory


jimg

Re: fastcall
« Reply #5 on: March 29, 2020, 05:42:55 AM »
I can't see how it could be more efficient. It's extra overhead on every call. This overhead should have been handled once in the proc being called, not every time it's called. Plus the caller has to move the parms into the registers rather than just pushing them. E.g. if what you need to pass is already in edx but the proc wants it in ecx, and you already have in ecx what the proc wants in edx, etc., it gets real messy real fast.

hutch--

Re: fastcall
« Reply #6 on: March 29, 2020, 09:13:37 AM »
One of the things you do in Win64 is design your algorithms in a different way: as you know which registers will be used for arguments, you set up the algo to use them as passed from the caller. With all of the extra registers you have much more to work with and rarely ever have to use LOCAL variables in low-level code. Once you adapt to it you will see why it is efficient, but you have no other choice than to use the Win64 Microsoft ABI.
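
A small sketch of that style, in ML64 syntax: a leaf procedure that works directly on the register it was handed, with no LOCALs and no stack frame. The names StrLen, msg and main are invented for the example, and the build line is an assumption.

```asm
; Hypothetical sketch of a leaf procedure that uses its argument
; straight from the register it arrives in.
; Assumed build: ml64 demo.asm /link /subsystem:console /entry:main

.data
msg     db "fastcall", 0

.code

; StrLen(p): rcx = address of a zero-terminated byte string
; returns the length in rax; touches only rax (volatile) and reads rcx
StrLen PROC
    mov     rax, rcx            ; start scanning at the incoming pointer
@@:
    cmp     byte ptr [rax], 0   ; terminator reached?
    je      @F
    inc     rax
    jmp     @B
@@:
    sub     rax, rcx            ; length = end address - start address
    ret
StrLen ENDP

main PROC
    sub     rsp, 28h            ; shadow space + alignment for the call
    lea     rcx, msg            ; argument 1 goes straight into rcx
    call    StrLen              ; rax = 8 for "fastcall"
    add     rsp, 28h
    ret
main ENDP

END
```

Nothing is written to memory on either side of the call, which is where the saving over a push-based convention comes from.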

daydreamer

  • Member
  • *****
  • Posts: 1215
  • I also want a stargate
Re: fastcall
« Reply #7 on: March 29, 2020, 09:36:00 PM »
But isn't fastcall as old as C, as an alternative to stdcall?
It should make loops with a PROC call inside them easier and faster, because registers are faster than memory (the stack included). Maybe the 64-bit convention was designed as fastcall because registers are twice the size in 64 bit, so pushing 4 registers to the stack (memory) with stdcall and then reading those 4 values back from the stack inside the PROC would be slower.
Shadow space: was it named by a Babylon 5 fan?

I think it's more natural for an assembler programmer to use registers for calls, both for direct use of integer values and for indirect use (pointers in C).

I lost the address at first but then found a page on many kinds of calling conventions. In some old calling conventions you leave floating-point data in FPU registers, while with 64-bit fastcall you can also use four of the XMM registers for floats and doubles.
That's good for those of us who like to use the versatile FPs instead of integers.

Here's the page about the different kinds of calling conventions:
https://en.wikipedia.org/wiki/X86_calling_conventions
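
To make the XMM part concrete, here is a minimal ML64-style sketch. The names Scale, valB and main are made up, and the build line is an assumption. The point it tries to show is that the Win64 slots are positional: a double passed as the second argument travels in xmm1 (not xmm0), and a floating-point result comes back in xmm0.

```asm
; Hypothetical sketch: mixing an integer and a double argument.
; Assumed build: ml64 demo.asm /link /subsystem:console /entry:main

.data
valB    REAL8 2.5

.code

; Scale(n, x): n is an integer in rcx (slot 1),
;              x is a double in xmm1 (slot 2),
;              result n * x is returned in xmm0
Scale PROC
    cvtsi2sd xmm0, rcx          ; convert the integer argument to double
    mulsd    xmm0, xmm1         ; multiply by the double argument
    ret
Scale ENDP

main PROC
    sub      rsp, 28h           ; shadow space + alignment
    mov      rcx, 4             ; argument 1: integer, slot 1 -> rcx
    movsd    xmm1, qword ptr [valB] ; argument 2: double, slot 2 -> xmm1
    call     Scale              ; xmm0 = 4 * 2.5
    cvttsd2si eax, xmm0         ; truncate the double result to an integer
    add      rsp, 28h
    ret
main ENDP

END
```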
Quote from Flashdance
Nick  :  When you give up your dream, you die
*wears a flameproof asbestos suit*
Gone serverside programming p:  :D
I love assembly,because its legal to write
princess:lea eax,luke
:)

hutch--

Re: fastcall
« Reply #8 on: March 29, 2020, 09:51:15 PM »
magnus,

You are talking nonsense here. Win32 had either STDCALL or the C calling convention; FASTCALL was a custom version that was not standard. In Win64 it is an exact convention and it does not respond to guesswork.

Vortex

Re: fastcall
« Reply #9 on: March 30, 2020, 01:03:46 AM »
Here is a quick 32-bit PoAsm example. The first two parameters of the fastcall function are passed in the ecx and edx pair and the rest on the stack:

Code:
.386
.model flat,stdcall
option casemap:none

include     fcalldemo.inc

.data
string1     db '10 + 20 + 30 + 40 = %d',0

.code

fcallproc PROC FASTCALL a:DWORD,b:DWORD,c:DWORD,d:DWORD

    mov     eax,a
    add     eax,b
    add     eax,c
    add     eax,d
    ret

fcallproc ENDP

start:

    invoke  fcallproc,10,20,30,40
    invoke  printf,ADDR string1,eax
    invoke  ExitProcess,0
   
END start

Disassembling the object module:
Code:
@fcallproc@16 PROC NEAR
        push    ebp
        mov     ebp, esp
        mov     eax, ecx
        add     eax, edx
        add     eax, dword ptr [ebp+8H]
        add     eax, dword ptr [ebp+0CH]
        leave
        ret     8
@fcallproc@16 ENDP

_start  PROC NEAR
        push    40
        push    30
        mov     edx, 20
        mov     ecx, 10
        call    @fcallproc@16
        push    eax
        push    offset string1
        call    _printf
        add     esp, 8
        push    0
        call    _ExitProcess@4
_start  ENDP

jimg

Re: fastcall
« Reply #10 on: March 30, 2020, 04:59:57 AM »
Vortex-

I thought the 3rd and 4th were supposed to be in r8 and r9, not pushed?  Is PoAsm just different?

Vortex

Re: fastcall
« Reply #11 on: March 30, 2020, 05:12:11 AM »
Hi Jim,

It's just a 32-bit PoAsm example, as I mentioned in my previous message. In 64-bit programming, r8 and r9 hold the 3rd and 4th parameters.

hutch--

Re: fastcall
« Reply #12 on: March 30, 2020, 06:03:07 AM »
Jim,

Here is some simple code that shows how Win64 FASTCALL works. The first is a 4-register version; the second has more arguments, so it requires stack space as well.

    rcall SendMessage,HWND_BROADCAST,WM_COMMAND,1,0

    invoke LoadImage,hInstance,10,IMAGE_ICON,64,64,LR_DEFAULTCOLOR

This is what it disassembles to.

.text:0000000140001008 4D33C9                     xor r9, r9
.text:000000014000100b 49C7C001000000             mov r8, 0x1
.text:0000000140001012 48C7C211010000             mov rdx, 0x111
.text:0000000140001019 48C7C1FFFF0000             mov rcx, 0xffff
.text:0000000140001020 FF15A2100000               call qword ptr [SendMessageA]

.text:0000000140001026 488B0D0B110000             mov rcx, qword ptr [0x140002138]
.text:000000014000102d 48C7C20A000000             mov rdx, 0xa
.text:0000000140001034 49C7C001000000             mov r8, 0x1
.text:000000014000103b 49C7C140000000             mov r9, 0x40
.text:0000000140001042 48C744242040000000         mov qword ptr [rsp+0x20], 0x40
.text:000000014000104b 48C744242800000000         mov qword ptr [rsp+0x28], 0x0
.text:0000000140001054 FF157E100000               call qword ptr [LoadImageA]
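
For comparison, a hand-written sketch of the same pattern without any macros, in ML64 syntax. Sum6 and main are invented names standing in for a six-argument API such as the one above, and the build line is an assumption. Arguments 5 and 6 are stored at [rsp+20h] and [rsp+28h] before the call, exactly as in the disassembly; inside the callee they sit 8 bytes higher because the call pushed the return address.

```asm
; Hypothetical sketch of a six-argument Win64 call done by hand.
; Assumed build: ml64 demo.asm /link /subsystem:console /entry:main

.code

; Sum6(a, b, c, d, e, f)
; a..d arrive in rcx, rdx, r8, r9
; On entry: [rsp]          = return address
;           [rsp+8..27h]   = shadow space for the four register arguments
;           [rsp+28h]      = argument 5
;           [rsp+30h]      = argument 6
Sum6 PROC
    lea     rax, [rcx+rdx]      ; a + b
    add     rax, r8             ; + c
    add     rax, r9             ; + d
    add     rax, [rsp+28h]      ; + e (first stack argument)
    add     rax, [rsp+30h]      ; + f (second stack argument)
    ret
Sum6 ENDP

main PROC
    ; 20h shadow space + 10h for arguments 5 and 6 + 8 for 16-byte alignment
    sub     rsp, 38h
    mov     rcx, 1
    mov     rdx, 2
    mov     r8,  3
    mov     r9,  4
    mov     qword ptr [rsp+20h], 5   ; argument 5 sits just above the shadow space
    mov     qword ptr [rsp+28h], 6   ; argument 6
    call    Sum6                     ; rax = 1 + 2 + 3 + 4 + 5 + 6
    add     rsp, 38h
    ret
main ENDP

END
```

In real code the sub rsp / add rsp pair is usually done once in the procedure's prologue and epilogue, sized for the largest call it makes, so individual calls reduce to register loads and stores like the ones shown.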