Re: 64-bit: Why Can't I get "CreateFileA" to Access a File or Device?

Started by nidud, March 04, 2021, 06:02:33 AM


hutch--

> The point is that RCX and RDX has other qualities (or privileges) than R10 and R11.

The catch here is you have to go back a very long way with your instruction choice to find these things. With the exception of MOVS, STOS and SCAS, which have special case circuitry due to their popularity over a long period, these fixed register instructions are mainly old junk that is best avoided in 64 bit Windows as they are often very slow.

Now as far as FASTCALL goes, in this context of the Microsoft x64 ABI for Windows 10, its specification is known by most programmers who write 64 bit Windows code, and it is not some open sauce set of binding rules that bridges multiple OS versions. The UNIX guys have their own specs and I have no doubt that other platforms have theirs as well, but there is little in common between them.

I mean seriously, who cares what you need for MAC on MIPS ?

I have no doubt that you well understand the Win64 x64 ABI but your description of how it works shows the cross influence of working across different OS types and platforms.

Here is a simple byte copy procedure using REP MOVSB.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

NOSTACKFRAME

bcopy proc

  ; rcx = src
  ; rdx = dst
  ; r8  = count

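    mov r11, rsi        ; save non-volatile RSI in volatile R11
    mov r10, rdi        ; save non-volatile RDI in volatile R10

    mov rsi, rcx        ; source pointer for MOVSB
    mov rdi, rdx        ; destination pointer for MOVSB
    mov rcx, r8         ; byte count for REP

    rep movsb           ; copy RCX bytes from [RSI] to [RDI]

    mov rsi, r11        ; restore RSI
    mov rdi, r10        ; restore RDI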

    ret

bcopy endp

STACKFRAME

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

There is no stack frame, and it uses two of the extra registers (R10 and R11) rather than LOCALs to preserve RSI and RDI. The arguments are read at the proc end directly from registers, in accordance with the Microsoft 64 bit FASTCALL ABI, without the need to read shadow space.
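For anyone who wants to try it, a caller for the procedure above might look roughly like this. It is only a sketch in plain ml64 syntax without the STACKFRAME macros, it assumes bcopy is assembled in the same module, and the names src_buf, dst_buf and copy_demo are made up for the illustration.

```asm
; Hypothetical caller for bcopy (plain ml64, no prologue macros).
; src_buf, dst_buf and copy_demo are illustrative names only.

.data
src_buf db "hello", 0
dst_buf db 6 dup (0)

.code
copy_demo proc
    sub rsp, 28h            ; 32 bytes shadow space + 8 to realign RSP to 16

    lea rcx, src_buf        ; arg 1 = source address
    lea rdx, dst_buf        ; arg 2 = destination address
    mov r8, 6               ; arg 3 = byte count, terminator included
    call bcopy

    add rsp, 28h            ; release the space reserved above
    ret
copy_demo endp
```

bcopy never touches the shadow space, but the caller still reserves it because the ABI requires it for every call.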

HSE

Hi Hutch!

Quote from: hutch-- on March 06, 2021, 08:28:11 AM
Microsoft 64 bit FASTCALL ABI without the need to read shadow space.

I read an article by Microsoft from twelve years ago that says that in 64 bits there is only one calling convention, the (Microsoft) "x64 calling convention". Fastcall is not part of the name. They only say it is a fastcall-like calling convention. That is a lot clearer for me since, you know, I know very little :biggrin:


hutch--

Hi Hector,

I think it's a case of a rose by any other name still has thorns.  :tongue:

nidud

deleted

daydreamer

Quote from: nidud on March 06, 2021, 10:50:54 AM
Quote from: hutch-- on March 06, 2021, 08:28:11 AM
> The point is that RCX and RDX has other qualities (or privileges) than R10 and R11.

The catch here is you have to go back a very long way with your instruction choice to find these things. With the exception of MOVS, STOS and SCAS, which have special case circuitry due to their popularity over a long period, these fixed register instructions are mainly old junk that is best avoided in 64 bit Windows as they are often very slow.

True, newer CPUs actually favor the MOVS instructions over XMM, but the usage and functionality of the shift/rotate instructions are the same, so nothing has changed in that regard. MOVSX is a relatively new instruction that didn't exist back in the DOS era. An immediate operand was also added to most instructions using CL as count.

    SHL, SHR, ROR, ROL, SAL, SAR, RCL, RCR, SHLD, SHRD, ...


Aren't the MOVS and STOS instructions the smallest ones in 16-bit DOS code, when you want minimum size?
They wrap a MOV from [esi] to [edi] together with the INC (or alternatively DEC) instructions.
My theory: when CPUs went from old hardware wired with good old opcode->circuitry to a modern microcode architecture, longer snippets of microcode were needed to perform all those instructions, compared to the simpler modern MOV instructions.
With 64-bit opcodes, many of them carrying 8 bytes of immediate or address, it is pointless to use them as smaller-size instructions.

hutch--

> However, the arguments here was right-to-left versus left-to-right and the consequences of choosing one over the other. Or more to the point, the origin of the idea of the choice made.

The argument was about your use of PUSH ORDER, done by association with a collection of technologies that belong in the past. I make the same point: the Windows x64 ABI only requires the WRITE order to be correct, there is no PUSH ORDER. For the obvious reason, I don't see the association between DOS C and x64 Windows, and that is because there is none. You may also note that under DOS, in a non-re-entrant environment, there were no formal calling conventions as it did not matter, they all did their own thing.

Win32 had a PUSH ORDER using push/call notation via STDCALL, Win x64 writes addresses to locations on the stack WITHOUT CHANGING THE STACK ADDRESS.

jj2007

Quote from: hutch-- on March 07, 2021, 08:45:11 AM
Win32 had a PUSH ORDER using push/call notation via STDCALL, Win x64 writes addresses to locations on the stack WITHOUT CHANGING THE STACK ADDRESS.

Hutch, I think we all know that we are not talking about a "push order" in the old Win32 sense...

Quote from: jj2007 on March 05, 2021, 11:29:49 PM
Quote from: hutch-- on March 05, 2021, 10:52:05 PM
as long as you don't over write registers with other registers.

Yep, and that's the only reason why the "push order" matters a little bit.

hutch--

I have no doubt that most DO understand how the win32 push order works, but I complained about it being applied to Windows x64 because it's simply wrong. Anyone who can read the Win64 ABI knows that values are written to the 1st 4 registers and, depending on how the call is done, the same 4 registers are written to shadow space, then all other arguments are written after that.

The difference is MOV rather than PUSH, PUSH changes the stack, MOV does not and that is the spec for Windows x64.
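Since the topic is CreateFileA, here is a rough sketch of what that MOV based argument passing could look like for its seven arguments. It is plain ml64 syntax under stated assumptions: the file name and the label open_demo are made up, and the constants are the documented Win32 values for GENERIC_READ, FILE_SHARE_READ, OPEN_EXISTING and FILE_ATTRIBUTE_NORMAL.

```asm
; Sketch only: calling CreateFileA with MOV writes, RSP never moves
; between the SUB and the ADD. file_name and open_demo are hypothetical.

extern CreateFileA:proc

.data
file_name db "test.txt", 0

.code
open_demo proc
    sub rsp, 38h                    ; 32 bytes shadow + 3 stack args, keeps RSP 16-aligned

    lea rcx, file_name              ; arg 1: lpFileName
    mov edx, 80000000h              ; arg 2: dwDesiredAccess = GENERIC_READ
    mov r8d, 1                      ; arg 3: dwShareMode = FILE_SHARE_READ
    xor r9d, r9d                    ; arg 4: lpSecurityAttributes = NULL
    mov dword ptr [rsp+20h], 3      ; arg 5: dwCreationDisposition = OPEN_EXISTING
    mov dword ptr [rsp+28h], 80h    ; arg 6: dwFlagsAndAttributes = FILE_ATTRIBUTE_NORMAL
    mov qword ptr [rsp+30h], 0      ; arg 7: hTemplateFile = NULL
    call CreateFileA                ; RAX = handle, or INVALID_HANDLE_VALUE on failure

    add rsp, 38h
    ret
open_demo endp
```

The Win32 STDCALL version of the same call would instead be seven PUSH instructions, last argument first, each one changing ESP, with the callee cleaning up the stack.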

jj2007

Quote from: hutch-- on March 07, 2021, 12:16:13 PM
values are written to the 1st 4 registers and depending on how the call is done, the same 4 registers are written to shadow space then all other arguments are written after that.

My understanding is that the 4 registers rcx rdx r8 r9 are not written to shadow space before the call; it is the called proc's task to write them to shadow space if needed.

IMHO it is better to deal first with the higher arguments, in order to be able to use rcx rdx r8 r9 also as arguments. Afterwards, the first 4 arguments can be assigned to rcx rdx r8 r9 without the risk of overwriting arguments.
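A small sketch of the case I mean, in plain ml64 syntax. Everything here is hypothetical: five_args stands for any callee taking five 64-bit arguments, and forward_demo(x, y) is a wrapper that calls five_args(y, 1, 2, 3, x).

```asm
; Hypothetical wrapper: forward_demo(x in rcx, y in rdx)
; calls five_args(y, 1, 2, 3, x). five_args is an assumed external.

extern five_args:proc

.code
forward_demo proc
    sub rsp, 28h            ; 32 bytes shadow + 1 stack arg, keeps RSP 16-aligned

    mov [rsp+20h], rcx      ; arg 5 = x, stored while rcx still holds x
    mov rcx, rdx            ; arg 1 = y, only now is rcx free to reuse
    mov edx, 1              ; arg 2, rdx is free after the line above
    mov r8d, 2              ; arg 3
    mov r9d, 3              ; arg 4
    call five_args

    add rsp, 28h
    ret
forward_demo endp
```

Loading rcx first would destroy x before it reaches [rsp+20h]. This is a convenience of ordering the writes, not a rule of the ABI.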

hutch--

Interesting view but it's not part of the x64 ABI. The ABI only specifies the first 4 registers for the first 4 arguments; any following arguments are written to the 64 bit locations after the first 4 argument shadow space locations. It's a mistake that many make, trying to append their own rules to an existing specification.

nidud

deleted

hutch--

You are mixing the creation of a stack frame with passing arguments to a procedure. From a calling procedure, the stack must be aligned for another procedure call due to the CALL mnemonic and the RET mnemonic. This is why you provide multiple stack frame designs for different tasks.

jj2007

Quote from: hutch-- on March 07, 2021, 12:16:13 PM
Anyone who can read the Win64 ABI know that values are written to the 1st 4 registers and depending on how the call is done, the same 4 registers are written to shadow space then all other arguments are written after that.

The caller does not write the regs to shadow space. The callee may write them to shadow space.

https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-160
Quote
The caller is responsible for allocating space for the callee's parameters. The caller must always allocate sufficient space to store four register parameters, even if the callee doesn't take that many parameters.
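To illustrate the callee side, here is a minimal sketch in plain ml64 syntax of a procedure that spills its four register arguments into the space the caller allocated. The name sum4 is made up, and the spill is done only to show where the locations are; a real procedure this small would just add the registers.

```asm
; Hypothetical leaf procedure: sum4(a, b, c, d) returns a + b + c + d.
; On entry [rsp] holds the return address and the caller's shadow space
; starts at [rsp+8].

.code
sum4 proc
    mov [rsp+8],   rcx      ; home location for arg 1
    mov [rsp+10h], rdx      ; home location for arg 2
    mov [rsp+18h], r8       ; home location for arg 3
    mov [rsp+20h], r9       ; home location for arg 4

    mov rax, [rsp+8]        ; reload from shadow space and sum
    add rax, [rsp+10h]
    add rax, [rsp+18h]
    add rax, [rsp+20h]
    ret
sum4 endp
```

The caller only reserves those 32 bytes; whether anything is ever written there is up to the callee.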

nidud

deleted

hutch--

Are you saying you cannot produce varieties of stack frames? I can routinely do it in MASM.