The MASM Forum

64 bit assembler => 64 bit assembler. Conceptual Issues => Topic started by: hutch-- on June 23, 2016, 01:52:21 PM

Title: Playtime with ML64 and a question on spill space.
Post by: hutch-- on June 23, 2016, 01:52:21 PM
I have put together a test framework using the 64-bit binaries from VC2010 and the include files and libraries that Mikl_ has been using. Below is a bare-bones example so that my question will be understood.

The batch file to build the example.

@echo off

\masm64\bin\ml64.exe /c test1.asm

\masm64\bin\link.exe /SUBSYSTEM:WINDOWS /ENTRY:main test1.obj

pause


The bare minimum source code to demonstrate what I need to know.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    OPTION DOTNAME
   
    option casemap:none

    include \masm64\include\win64.inc
    include \masm64\include\temphls.inc

    include \masm64\include\kernel32.inc
    include \masm64\include\user32.inc

    includelib \masm64\lib\user32.lib   
    includelib \masm64\lib\kernel32.lib

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

  .data
    pmsg db "This example is written in ML64.EXE",0
    pttl db "Howdy Folks",0

  .data?

  .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    sub rsp, 40

    invoke MessageBox,0,ADDR pmsg,ADDR pttl,0

    invoke ExitProcess,0

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

  end


What I need to know is why the spill space has to be set to a specific size. I looked up Mikl_'s example, which reserves 40 bytes; if I reserve more it will not work, and if I reserve less it will not work. I need to know why, and if anyone has some definitive reference material on how and why spill space is configured, it will be most appreciated.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: habran on June 23, 2016, 02:14:20 PM
Hi hutch :biggrin:
If you use HJWasm you don't have to worry about these intricacies; it will take care of everything :t
However, everything is clearly explained with examples here (http://www.codemachine.com/article_x64deepdive.html). Only HJWasm is able to do the things you will find there.
I appreciate your willingness to step into a "Brave New World" :t

Cheers!

Title: Re: Playtime with ML64 and a question on spill space.
Post by: rrr314159 on June 23, 2016, 02:34:56 PM
Have you read the ABI? And, there are many tutorials on the topic. Google is your friend.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: rrr314159 on June 23, 2016, 03:07:56 PM
Still I must admit that many of the tutorials have mistakes, are hard to follow, and don't really answer your question. The trick is to include these keywords in your Google search: "good", "correct", "relevant", and "understandable". That filters out all the bad, wrong, irrelevant ones that are impossible to understand. In your case you should probably also use "written_in_Australian". You're welcome in advance!
Title: Re: Playtime with ML64 and a question on spill space.
Post by: hutch-- on June 23, 2016, 04:40:57 PM
I confess answers of this type are about as useful as a hip pocket in a singlet.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: Mikl__ on June 23, 2016, 05:09:58 PM
Hi, hutch--!
I can't explain it well in English, but this trace should make it clear:
Quote
;immediately after entry into the program <-- rsp=2CFC58
13F921000: sub rsp,28h <-- rsp=2CFC30 <-- align 10h
13F921004: xor ecx,ecx
13F921006: xor r9d,r9d
13F921009: lea rdx,[140003000];"This example is written in ML64.EXE",0
13F921010: lea r8,[140003024];"Howdy Folks",0
13F921017: call MessageBoxA
;immediately after the call instruction
;          RSP = 2CFC28 [RSP]=13FFA101D<-- Address of Return
;                2CFC30 [RSP+8]=0 <-- RCX_Home
;                2CFC38 [RSP+10]=0 <-- RDX_Home
;                2CFC40 [RSP+18]=0 <-- R8_Home
;                2CFC48 [RSP+20]=0 <-- R9_Home
13FFA101D: xor ecx,ecx     
13F92101F: call ExitProcess
Now try this variant; it runs on 64-bit Windows 7
    OPTION DOTNAME
    option casemap:none

    include \masm64\include\win64.inc
    include \masm64\include\temphls.inc

    include \masm64\include\kernel32.inc
    include \masm64\include\user32.inc

    includelib \masm64\lib\user32.lib
    includelib \masm64\lib\kernel32.lib

  .data
    pmsg db "This example is written in ML64.EXE",0
    pttl db "Howdy Folks",0
  .code
main proc
    push rbp
    invoke MessageBox,0,ADDR pmsg,ADDR pttl,0
    pop rbp
    retn
main endp
end
Title: Re: Playtime with ML64 and a question on spill space.
Post by: hutch-- on June 23, 2016, 06:20:03 PM
Thanks Mikl_, that worked fine but I am none the wiser as to why. I am testing this on Win10 64 bit Professional.


; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    OPTION DOTNAME
   
    option casemap:none

    include \masm64\include\win64.inc
    include \masm64\include\temphls.inc

    include \masm64\include\kernel32.inc
    include \masm64\include\user32.inc
    include \masm64\include\msvcrt.inc

    includelib \masm64\lib\user32.lib   
    includelib \masm64\lib\kernel32.lib
    includelib \masm64\lib\msvcrt.lib

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

  .data?
    msize db 32 dup (?)

  .data
    ptrm dq msize
    pttl db "Memory Address",0

  .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    LOCAL pMem  :QWORD

    ;;; sub rsp, 40
    push rbp

    invoke GlobalAlloc,GMEM_FIXED   ,1024*1024*1024*8         ; 8 gig
    mov pMem, rax

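; char *_itoa(
;    int value,
;    char *str,
;    int radix
; );
; note: _itoa takes a 32-bit int, so a 64-bit address is truncated;
; _i64toa is the 64-bit counterpart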

  ; ----------------------------------
  ; convert memory address into string
  ; ----------------------------------
    invoke _itoa,pMem,ptrm,10

    invoke MessageBox,0,ptrm,ADDR pttl,0

    invoke GlobalFree,pMem

    invoke ExitProcess,0

    pop rbp

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
Title: Re: Playtime with ML64 and a question on spill space.
Post by: hutch-- on June 23, 2016, 06:58:46 PM
After a quick search on Google: is there any better reference material for the 64 bit calling convention than the following URL?

https://msdn.microsoft.com/en-us/library/ms235286.aspx
Title: Re: Playtime with ML64 and a question on spill space.
Post by: Mikl__ on June 23, 2016, 07:30:22 PM
OPTION DOTNAME
   
    option casemap:none

    include \masm64\include\win64.inc
    include \masm64\include\temphls.inc

    include \masm64\include\kernel32.inc
    include \masm64\include\user32.inc
    include \masm64\include\msvcrt.inc

    includelib \masm64\lib\user32.lib   
    includelib \masm64\lib\kernel32.lib
    includelib \masm64\lib\msvcrt.lib
    OPTION PROLOGUE:rbpFramePrologue

  .data?
    msize db 32 dup (?)

  .data
    ptrm dq msize
    pttl db "Memory Address",0

  .code
main proc

    LOCAL pMem  :QWORD

    invoke GlobalAlloc,GMEM_FIXED   ,1024*1024*1024*2*2;8         4 gig
    mov pMem, rax

; char *_itoa(
;    int value,
;    char *str,
;    int radix

  ; ----------------------------------
  ; convert memory address into string
  ; ----------------------------------
    invoke _itoa,rax,ptrm,10
    invoke MessageBox,NULL,ptrm,&pttl,MB_OK
    invoke GlobalFree,pMem
    invoke ExitProcess,0
main endp
end
Title: Re: Playtime with ML64 and a question on spill space.
Post by: habran on June 23, 2016, 07:52:03 PM
hutch, if you are looking for the pill, try to contact Bradley Cooper; otherwise you will have to roll up your sleeves and sweat blood :biggrin:
I've given you the address where to go in the post above 8)
Title: Re: Playtime with ML64 and a question on spill space.
Post by: mineiro on June 23, 2016, 08:31:23 PM
I think you're talking about shadow space. In the fastcall calling convention the arguments are passed through registers (rcx, rdx, r8, r9), so you should store these registers on the stack at the start of your procedure; if anything goes wrong, the system can then recover that info. Rsp should be a multiple of 10h before a call instruction. I'm not sure, but it appears this only works out by itself with an even number of stack arguments; otherwise you need a dummy slot on the stack to keep it aligned. At your program's entry point rsp ends in 8h.
You have a couple of options. While coding your procedure, work out the largest number of parameters any call in it uses, then do a single sub rsp,?? at the start and a single add rsp,?? at the end (the largest reservation is reused by the smaller calls); or you can adjust the stack around each function call.
I don't have 64-bit Windows to try this on, so I'm speaking from memory, but VEH or SEH works with a start_address and an end_address to be monitored: you set up a range of addresses, and with the arguments on the stack you can see what happened before a possible error.
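A minimal sketch of the "reserve once" option, assuming plain ML64 with no include files and no invoke macro. The extern declarations, the labels and the console build line are assumptions for illustration, not taken from the posts above. The frame is sized for the largest call in the procedure (WriteFile, five arguments) and the smaller calls reuse it:

```asm
; Assumed build: ml64 /c once.asm
;                link /SUBSYSTEM:CONSOLE /ENTRY:main once.obj kernel32.lib
extern GetStdHandle:proc
extern WriteFile:proc
extern ExitProcess:proc

.data
msg     db "one frame, three calls",13,10
msglen  equ $ - msg

.code
main proc
    ; 20h shadow space + 8h for the 5th argument + 8h for a local DWORD
    ; + 8h padding = 38h; with the return address that makes 40h.
    sub rsp, 38h

    mov ecx, -11                    ; STD_OUTPUT_HANDLE
    call GetStdHandle               ; 1 argument, reuses the same frame

    mov rcx, rax                    ; hFile
    lea rdx, msg                    ; lpBuffer
    mov r8d, msglen                 ; nNumberOfBytesToWrite
    lea r9, [rsp+28h]               ; lpNumberOfBytesWritten (the local)
    mov qword ptr [rsp+20h], 0      ; 5th argument: lpOverlapped = NULL
    call WriteFile

    xor ecx, ecx
    call ExitProcess                ; never returns, so no add rsp,38h here
main endp
end
```

The alternative, adjusting rsp around every call, costs two extra instructions per call and makes rsp-relative locals harder to track.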
Title: Re: Playtime with ML64 and a question on spill space.
Post by: Mikl__ on June 23, 2016, 08:40:29 PM
Hello, mineiro!
Sorry, but it isn't clear who you are addressing. Me, habran or hutch--?
Title: Re: Playtime with ML64 and a question on spill space.
Post by: mineiro on June 23, 2016, 08:44:25 PM
To Mr. hutch, Mr. Mikl__, answering the question about spill space. A big hug, brother. The examples you post here on the forum are very useful and of great value.

I'm talking about spill space, answering the topic author.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: Mikl__ on June 23, 2016, 08:51:39 PM
My apologies again, Mr. mineiro!
(http://www.cyberforum.ru/images/smilies/senor.gif)
Title: Re: Playtime with ML64 and a question on spill space.
Post by: mineiro on June 23, 2016, 09:01:25 PM
It would be interesting if you posted an example of VEH (structured error handling) for Win64. It's not as complicated as it looks, and from what I can tell you know how to use WinDbg. Regards, Mr. Mikl__. No need to apologize, brother, we're in the same boat.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: Mikl__ on June 23, 2016, 09:07:15 PM
I'll see what can be done about a VEH example...
Title: Re: Playtime with ML64 and a question on spill space.
Post by: mineiro on June 23, 2016, 09:11:58 PM
Great. I remember doing a division by zero to cause an intentional error back when I was experimenting with Win64; I inserted a few nops before and after the instruction to get an address range to work with.
Regards.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: rrr314159 on June 24, 2016, 12:21:10 AM
hutch, I thought you'd appreciate a little humor to lighten your day! But the main reason I didn't answer is that I couldn't find my previous posts from long ago that went into this, and I don't want to get into a long discussion about a trivial error I might make recalling how it goes. Anyway, mineiro is right, but here's my take on it (with probably a trivial error).

The ABI fastcall passes the first four parameters in registers rcx, rdx, r8, r9. After that they go on the stack. But the strange thing is you must still reserve four 8-byte slots (20h bytes) on the stack even though you don't send any data in them. This is called spill or shadow space. The called routine can use that space to store the four registers if it wants. It's hokey but that's MS for you.
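A minimal sketch of a callee actually using those four slots, assuming plain ML64 without include files. Sum4, the labels and the build line are hypothetical names for illustration:

```asm
; Assumed build: ml64 /c home.asm
;                link /SUBSYSTEM:CONSOLE /ENTRY:main home.obj kernel32.lib
extern ExitProcess:proc

.code
; Sum4 is a hypothetical callee. On entry [rsp] holds the return address
; and [rsp+8h]..[rsp+27h] are the four slots the caller reserved.
Sum4 proc
    mov [rsp+8h],  rcx      ; home slot for argument 1
    mov [rsp+10h], rdx      ; argument 2
    mov [rsp+18h], r8       ; argument 3
    mov [rsp+20h], r9       ; argument 4
    mov rax, [rsp+8h]       ; read them back from the stack
    add rax, [rsp+10h]
    add rax, [rsp+18h]
    add rax, [rsp+20h]
    ret
Sum4 endp

main proc
    sub rsp, 28h            ; the caller provides the 20h bytes (plus 8h padding)
    mov ecx, 1
    mov edx, 2
    mov r8d, 3
    mov r9d, 4
    call Sum4
    mov ecx, eax            ; use the sum as the process exit code
    call ExitProcess
main endp
end
```

The callee never reserved anything itself; it only wrote into space the caller had already set aside, which is the whole point of the convention.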

The other requirement is that when you call, the stack must be on a 16-byte boundary, ???????0h. The call will put the return address on the stack and jump to the called routine. So when that routine starts, rsp will end in 8h. As long as everyone follows the rules it will always be that way. So the same thing has occurred in your own routine: when it started, you were on an 8-boundary. Therefore you have to add one more qword (8 bytes) to get to 0h. That's why you need five 8's in all: 40, or 28h. One of them is to round it up to 0h, then four (20h) for the actual spill space.
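The arithmetic can be sketched as a complete program, assuming plain ML64 with hand-written extern declarations instead of the include files used earlier in the thread (the labels and build line are assumptions):

```asm
; Assumed build: ml64 /c align.asm
;                link /SUBSYSTEM:WINDOWS /ENTRY:main align.obj user32.lib kernel32.lib
extern MessageBoxA:proc
extern ExitProcess:proc

.data
msg db "Aligned call",0
ttl db "Demo",0

.code
main proc
    ; On entry rsp ends in 8h: the caller was 16-byte aligned,
    ; then its CALL pushed an 8-byte return address.
    sub rsp, 28h        ; 20h shadow space + 8h padding -> rsp ends in 0h again
    xor ecx, ecx        ; hWnd = NULL
    lea rdx, msg        ; lpText
    lea r8, ttl         ; lpCaption
    xor r9d, r9d        ; uType = MB_OK
    call MessageBoxA
    xor ecx, ecx
    call ExitProcess    ; does not return, so no add rsp / ret is needed
main endp
end
```

With the values from Mikl__'s trace: entry rsp = 2CFC58h, minus 28h gives 2CFC30h, which is a multiple of 10h.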

You mentioned it works only with exactly that number; no, it's ok with 38h, 48h, etc; but you have to adjust stack afterwards, before returning from your routine.
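A minimal sketch of a larger frame that is released again before returning, assuming plain ML64 without include files. ShowBox and the labels are hypothetical; the frame lives in a subroutine so that the ret has a well-defined caller:

```asm
; Assumed build: ml64 /c frame38.asm
;                link /SUBSYSTEM:WINDOWS /ENTRY:main frame38.obj user32.lib kernel32.lib
extern MessageBoxA:proc
extern ExitProcess:proc

.data
msg db "Frame of 38h",0
ttl db "Demo",0

.code
ShowBox proc
    sub rsp, 38h        ; any size of the form 8 + 16*n that is at least 28h works
    xor ecx, ecx        ; hWnd = NULL
    lea rdx, msg        ; lpText
    lea r8, ttl         ; lpCaption
    xor r9d, r9d        ; uType = MB_OK
    call MessageBoxA
    add rsp, 38h        ; give back exactly what was reserved
    ret                 ; rsp points at the return address again
ShowBox endp

main proc
    sub rsp, 28h        ; main must be aligned too before it calls ShowBox
    call ShowBox
    xor ecx, ecx
    call ExitProcess
main endp
end
```

If the add is missing or uses a different value, ret pops the wrong qword as the return address, which is one way a "wrong size" shows up as a crash.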

The reason for insisting on this standard alignment is that XMM registers must go on the stack at 16-byte boundaries; aligned instructions such as movaps and movdqa fault on a misaligned address.

It's important to note the following fact, which has tripped up many people. When I was learning I found long threads on StackExchange (or whatever) that never did get this point straight. MessageBox is one of the few simple functions that really does insist on this alignment! printf, for instance, does not. So if you experiment on many other simple calls you wind up thinking you have more latitude. But then MessageBox will get you; and, some others. Best to follow the rules at all times; although, for convenience and speed, my code breaks this rule often - when I know all subsequent calls will be "safe".

Why does MessageBox behave like that? I don't doubt it's because they make a call to a window routine to put up that box. Whereas most other basic functions don't, and their code just never uses XMM registers.

There are other mistakes in all tutorials you'll see, which I'll mention briefly. They say all floating points are passed in XMM's. No, they're often passed in the GPRs. For instance printf gets floats from GPRs and will ignore any data you send in XMM. Also VARARG is handled specially. I found one ref somewhere on MS that explained that correctly. Other MS pages, and (iirc) all others, got it wrong. I actually don't remember the details. See the way I did it in my nvk Macro, "Yet Another Macro" post, it's about 40 posts ago in 64-bit forum. There was also a post a year ago, or so, where I answered all this in detail. It's not on 64-bit forum though, because OP (I think it was fearless?) asked the q. somewhere else. Generally, you could do a lot worse than simply review all my 64-bit work from that period.
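As far as I understand the Microsoft x64 convention, the varargs rule is that a floating-point argument among the first four must be placed in both the XMM register and the matching integer register, because a variadic callee reads its arguments from the integer home slots. A minimal sketch, assuming plain ML64 and an msvcrt.lib import library that exports printf (as the \masm64\lib\msvcrt.lib used in this thread appears to); the labels and build line are assumptions:

```asm
; Assumed build: ml64 /c vararg.asm
;                link /SUBSYSTEM:CONSOLE /ENTRY:main vararg.obj msvcrt.lib kernel32.lib
extern printf:proc
extern ExitProcess:proc

.data
fmt   db "value = %f",10,0
val   REAL8 3.25

.code
main proc
    sub rsp, 28h                ; shadow space + alignment
    lea rcx, fmt                ; argument 1: format string
    movsd xmm1, val             ; argument 2 in its XMM slot ...
    mov rdx, qword ptr val      ; ... and the same bits in its integer slot
    call printf
    xor ecx, ecx
    call ExitProcess
main endp
end
```

Loading rdx straight from memory sidesteps the movd/movq spelling differences between assembler versions for the xmm-to-GPR move.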
Title: Re: Playtime with ML64 and a question on spill space.
Post by: jj2007 on June 24, 2016, 01:11:22 AM
Quote from: rrr314159 on June 24, 2016, 12:21:10 AM
"Yet Another Macro"

It's here: http://masm32.com/board/index.php?topic=3988.msg42003#msg42003

It might be helpful to have a sticky post in the 64-bit forum with a "Hello World" archive containing
- basic includes (kernel, user, msvcrt, ...), or at least exact links to them
- basic libs, or at least exact links to them
- a batch file that takes an asm file as argument
- a link to the version of ml/jwasm/hjwasm/asmc that works with the hello world
- a link to a free 64-bit debugger

So far, I find it far too confusing to even start playing with 64-bit code 8)
Title: Re: Playtime with ML64 and a question on spill space.
Post by: mineiro on June 24, 2016, 01:32:14 AM
I really don't get the point of the fastcall calling convention. They say it's for speed, and OK, I agree that passing parameters in registers is really quick. But you then have to move the registers to the stack, so where's the gain? Is it only so rbp can be used as a normal register, thanks to rip-relative addressing?
You're not forced to move parameters to the stack, but skipping it becomes a bad habit.
So why not code it like stdcall, where you push things on the stack and adjust it after the function is called if there are more than 4 parameters? On Linux it's the same thing I suppose; the difference is that it has 6 registers instead of 4. So why bother with rsp alignment? I really don't get the point of fastcall.
Try a wsprintf with 7 parameters and an error can happen; this way we lose precious memory aligning the stack, for nothing.
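A minimal sketch of such a 7-parameter wsprintf call, assuming plain ML64 without include files (the labels, buffer size and build line are assumptions). Parameters 5 to 7 go on the stack directly above the 20h bytes of shadow space:

```asm
; Assumed build: ml64 /c seven.asm
;                link /SUBSYSTEM:WINDOWS /ENTRY:main seven.obj user32.lib kernel32.lib
extern wsprintfA:proc
extern MessageBoxA:proc
extern ExitProcess:proc

.data
fmt  db "%d %d %d %d %d",0
ttl  db "wsprintf, 7 parameters",0

.data?
buf  db 64 dup (?)

.code
main proc
    ; 20h shadow space + 3 stack arguments (18h) = 38h.
    ; 38h + 8h return address = 40h, so no extra padding is needed here.
    sub rsp, 38h

    lea rcx, buf                    ; 1: output buffer
    lea rdx, fmt                    ; 2: format string
    mov r8d, 1                      ; 3
    mov r9d, 2                      ; 4
    mov qword ptr [rsp+20h], 3      ; 5: first stack argument, above the shadow space
    mov qword ptr [rsp+28h], 4      ; 6
    mov qword ptr [rsp+30h], 5      ; 7
    call wsprintfA

    xor ecx, ecx                    ; hWnd = NULL
    lea rdx, buf                    ; lpText: the formatted string
    lea r8, ttl                     ; lpCaption
    xor r9d, r9d                    ; uType = MB_OK
    call MessageBoxA                ; 4 arguments, reuses the same frame

    xor ecx, ecx
    call ExitProcess
main endp
end
```

With 6 or 8 parameters the stack arguments would take an even number of qwords and one extra 8-byte pad would be needed; that pad is the "lost" memory being complained about.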

---edited----
C calling convention instead of stdcall, as I said before.
What's more, reading that topic about 32 bits versus 64 bits, I think everybody agrees there is no real gain from one to the other, only in specific types of code (overhead removed). So my conclusion is that 64-bit programs eat more memory and do not have a real gain.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: hutch-- on June 24, 2016, 02:59:08 AM
This is just playing with ML64 macros.


; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    OPTION DOTNAME
   
    option casemap:none

    include \masm64\include\win64.inc
    include \masm64\include\temphls.inc

    include \masm64\include\kernel32.inc
    include \masm64\include\user32.inc
    include \masm64\include\msvcrt.inc

    includelib \masm64\lib\user32.lib   
    includelib \masm64\lib\kernel32.lib
    includelib \masm64\lib\msvcrt.lib

; char *_itoa(
;    int value,
;    char *str,
;    int radix

    buff$ MACRO valu
      LOCAL buffer,pbuf
      .data?
        buffer db 32 dup (?)
      .data
        pbuf dq buffer
      .code
      invoke _itoa,valu,pbuf,10
      EXITM <pbuf>
    ENDM

    falloc MACRO bsize
      invoke GlobalAlloc,GMEM_FIXED,bsize
      EXITM <rax>
    ENDM

    fxfree MACRO hndl
      invoke GlobalFree,hndl
    ENDM

    appexit MACRO valu
      invoke ExitProcess,valu
    ENDM

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

  .data
    pttl db "Memory Address",0

  .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    LOCAL pMem  :QWORD

    push rbp

    mov pMem, falloc(1024*1024*1024*8)              ; allocate fixed memory
    invoke MessageBox,0,buff$(pMem),ADDR pttl,0     ; display string of memory value
    fxfree pMem                                     ; release memory
    appexit 0                                       ; exit the process

    pop rbp

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

comment #

    https://msdn.microsoft.com/en-us/library/9z1stfyw.aspx

    Volatile
    rax rcx rdx r8 r9 r10 r11

    Non Volatile
    r12 r13 r14 r15 rdi rsi rbx rbp rsp

    Volatile
    xmm0 ymm0
    xmm1 ymm1
    xmm2 ymm2
    xmm3 ymm3
    xmm4 ymm4
    xmm5 ymm5

    Nonvolatile (XMM), Volatile (upper half of YMM)
    xmm6-15
    ymm6-15

#

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

  end


The register data is à la Microsoft.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: mineiro on June 24, 2016, 03:33:33 AM
Structures should be padded to 8 (on each member? If so, a lot of them would have to be done by hand; I think assemblers use the total size of the structure rather than each member's size). Pointers are all 8 bytes, but handles are not, I suppose. If speed is the argument, as they say, making all types qwords would make more sense, but it's not done that way. Then again I think, though I'm not sure, that the lea instruction (addr) deals with dword addressing in long mode (x86-64) while offset deals with qwords. Procedures should be aligned to 16 to favor use of xmm/ymm.
One doubt: what's the minimum machine that can be used for win64? I was reading about SSE2 as the minimum; not all machines have ymm registers and that instruction set.
Too much headache. Good adventures.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: rrr314159 on June 24, 2016, 04:21:41 AM
mineiro, SSE2 is all you need.

As for fastcall, if you just remember that MS did a lousy job with the ABI it all makes sense. As you say advantage of passing values in rcx, rdx, r8, r9 is somewhat negated by necessity of stack manipulation, and so forth. But remember, with your own code you don't have to follow any conventions at all. Only when interfacing with MS or other outside entities. In your own code, you can take advantage of those extra registers to almost eliminate passing anything on the stack. And of course ignore alignment except when using instructions (like some SSE XMM instructions) that demand it.
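A minimal sketch of that idea, assuming plain ML64 without include files. AddPair and its private convention (arguments in r10 and r11, result in rax, no shadow space) are hypothetical, made up for illustration:

```asm
; Assumed build: ml64 /c private.asm
;                link /SUBSYSTEM:CONSOLE /ENTRY:main private.obj kernel32.lib
extern ExitProcess:proc

.code
; Private convention: in r10, r11; out rax.
; It touches no stack beyond its return address and expects no alignment.
AddPair proc
    lea rax, [r10+r11]
    ret
AddPair endp

main proc
    sub rsp, 28h        ; still required, but only for the ExitProcess call below
    mov r10, 40
    mov r11, 2
    call AddPair        ; internal call, no ABI rules apply
    mov ecx, eax        ; pass the sum on as the exit code
    call ExitProcess    ; external call, full Win64 rules apply
main endp
end
```

r10 and r11 are volatile under the Windows convention, so any value kept in them has to be reloaded after each API call.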
Title: Re: Playtime with ML64 and a question on spill space.
Post by: rrr314159 on June 24, 2016, 04:41:07 AM
Quote from: jj2007 on June 24, 2016, 01:11:22 AM
It might be helpful to have a sticky post in the 64-bit forum with a "Hello World" archive containing ...

Only problem with that idea, it sounds suspiciously similar to work

Quote from: jj2007
So far, I find it far too confusing to even start playing with 64-bit code

For the type of thing that MasmBasic does 64-bit is a lot of work for essentially no gain. You "join the 64-bit world", and someday when everybody has mega-gigs of RAM it might be necessary, but apart from that there is no added functionality. It just bloats the code and slows it down a tiny bit.

There are two main reasons why it's bad, neither of them having to do with Intel, or the basic concept of 64-bit. First, MS did a lousy job with the ABI and much of the rest of their implementation. Second there's no "masm64", so 64-bit asm is not standardized. For instance MikL and I use different libraries and a couple other less important differences so are not immediately compatible.

All these bad points disappear when writing code for yourself (more or less). All my code obeys MS interface where it must, maybe 10%. In the rest I can freely use the extra registers and qword-manipulation capability. It all works great, easy to learn, big advantage especially for math routines, but also graphics and, lesser extent, any other code.

So for non-production code, for your own use, or distributed only to (more or less) friends, I strongly recommend getting into 64-bit. But for the type of thing most people here do, like you and hutch, not. It's still worth learning about but only as a dull chore, so you can keep up with the times. For MS-production coding it's pretty much an unalloyed negative.

If you, hutch and similar want to make something good out of it, well worth considering is working with Habran to make HJWasm a de facto standard. It could be developed to be the core of a "masm64.com" site.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: habran on June 24, 2016, 08:18:14 AM
Thanks rrr314159 :t
You are downright rational thinker 8)

hutch and jj2007, if you find FASTCALL too complex, brace yourself for VECTORCALL, which is coming in the next release of HJWasm.
VECTORCALL will also work in x86.
I am sure that rrr314159 and qWORD will embrace it ;)
Here is the MSDN introduction to what it is about:
Quote
In addition to SIMD data types, Vector Calling Convention can also be
used for Homogeneous Vector Aggregate data-type (HVA) and Homogeneous Float Aggregate data-type (HFA).
An HVA/HFA data-type is a composite type where all of fundamental data types of members that compose
the type are the same and are of Vector or Floating Point data type. (__m128, __m256, __m512, float/double).
An HVA/HFA data type can have at most four members.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: mineiro on June 24, 2016, 09:29:47 AM
Hmm, yes sir rrr314159, you're right. Inside the scope of our own program we can do anything, and when we need to talk to other calling conventions/ABIs we follow their rules. That way there can be a real gain to justify fastcall.
I'd like to read your opinion: what do you think is the best way to release a library (like the masm32 lib)? Maybe one (or several) prologue and epilogue function(s) to deal with other languages/ABIs, and an internal calling convention for assembly programmers?

Sir habran, do you have any plans to continue Linux support?
Title: Re: Playtime with ML64 and a question on spill space.
Post by: fearless on June 24, 2016, 09:51:55 AM
Quote from: jj2007 on June 24, 2016, 01:11:22 AM
- basic includes (kernel, user, msvcrt, ...), or at least exact links to them
- basic libs, or at least exact links to them

Some information related to these might be found in this post: JWasm64 with RadASM - http://masm32.com/board/index.php?topic=4162.msg44176#msg44176

Quote from: jj2007 on June 24, 2016, 01:11:22 AM
- a link to a free 64-bit debugger
http://x64dbg.com/#start
Latest snapshots are available from here: https://github.com/x64dbg/x64dbg/releases

Some additional bits and pieces I played around with are uploaded to Bitbucket: https://bitbucket.org/mrfearless/jwasm64-with-radasm and https://bitbucket.org/mrfearless/debug64-for-jwasm64 - related post (http://masm32.com/board/index.php?topic=4203.msg44670#msg44670)

I also started a port of some of the functions from the masm32.lib for x64 a while ago: https://bitbucket.org/LetTheLightIn/masm64-library

Any and all can be downloaded, modified etc - they are a work in progress, or a starting point for some other enterprising fella to continue on with.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: habran on June 24, 2016, 09:56:43 AM
Sir mineiro ;)
I can assure you that my co-developer sir Johnsa will make it work for linux 8)
Title: Re: Playtime with ML64 and a question on spill space.
Post by: habran on June 24, 2016, 10:04:22 AM
Excellent job mr. fearless :t
However, when I tried to download RadAsm from your link this is what I got:
Quote
The site ahead contains harmful programs

Attackers on www.assembly.com.br might attempt to trick you into installing programs that harm your browsing experience (for example, by changing your homepage or showing extra ads on sites you visit).
Title: Re: Playtime with ML64 and a question on spill space.
Post by: rrr314159 on June 24, 2016, 10:28:55 AM
Quote from: mineiro on June 24, 2016, 09:29:47 AM
I like to read your opinion, so, to you, whats the best way to release a library (like masm32 lib)? Maybe a (or many) prologue and epilogue function(s) to deal with other languages/abi and an internal calling convention to assembly programmers?

Hi mineiro, since you ask,

Avoid complexity. I gather you want to support both Windows and Linux - that's already a fair amount of complexity in the interface. Don't forget, not only do you have to program it, but also produce documentation; and your users have to understand it.

If you want one function to serve for both, you should just use the simplest approach. My guess is, that would mean doing it the Linux way and providing prologue / epilogue to translate to Windows. But if the best way is to develop your own internal methods and translate to both OS's, then fine do that. But in that case I wouldn't publish your internal ABI so others can use it. Then you're locked in to that definition, and also must provide documentation and support. Every error they find can be a big headache: minimize the ways they can access the code to produce errors.

Perhaps you've already done this sort of thing and know all about it, in which case my opinion is superfluous. I've directed many software interfaces on Navy projects but have almost no professional experience in commercial projects. FWIW, when planning a project I always emphasize one thing: simplicity. Not speed or anything else. Everything tends to be a lot more complex than you thought at first, don't add any "extras". That can be done in version 2.

So bottom line - decide what interfaces you MUST support, decide the simplest way to do that, don't publish any more interfaces (like your internal ABI) than necessary.

An alternative might be only publish your own, "improved", ABI, and leave translation for Windows and Linux to external routines.

The "KISS" principle: "Keep it simple, sailor!"
Title: Re: Playtime with ML64 and a question on spill space.
Post by: hutch-- on June 24, 2016, 10:45:17 AM
 :biggrin:

Habran,

Quote
hutch and jj2007, if you find the FASTCALL too complex, brace yourself for the VECTORCALL which is coming in the next release of HJWasm

I was using FASTCALL in MS-DOS in 1990; passing data in registers without a stack is not new technology. As for the later AVX instructions, you must first have the hardware, then read the Intel manuals. In Win32 you can do your own FASTCALL with EAX, ECX and EDX and use GLOBALS for any further arguments OR use structures passed in 1 register.

Now Win64, apart from having a really crappy ABI, has many advantages for an assembler language programmer: roll your own VERYFASTCALL with the extra integer registers (rax rcx rdx r8 r9 r10 r11) and GLOBALS for a stack free method of calling procedures while remaining compatible with the Windows version of the ABI. What you don't need is to cripple 64 bit x86 assembler with the assumptions of a C compiler. If you do, you may as well use a C compiler and write modules in an assembler when you need extra speed.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: mineiro on June 24, 2016, 10:52:54 AM
Thanks for answering.
Good sir rrr314159, the first thing is to get the code doing something useful, and only afterwards optimize. I like simplicity too, and I start from the point where nothing works; that way I get fewer headaches. Thanks a lot, very helpful answer.

sir habran, so I suppose the answer is no.

I met the owner of that .br site years ago on a now-defunct board. He translated RadASM into Brazilian Portuguese and doesn't appear to be a bad person, but I wouldn't put my hand in the fire for him.

Good job sir fearless.

Title: Re: Playtime with ML64 and a question on spill space.
Post by: fearless on June 24, 2016, 10:59:34 AM
Yeh, it seems that site gives a warning about RadASM v2. I'm not sure where else it is hosted anymore, if at all. Hopefully some of the other info helps anyhow.

Maybe softpedia: http://www.softpedia.com/get/Programming/File-Editors/RadASM.shtml - seemed to work

Plus a minor update I compiled for v2.2.2.1 (only worth getting this exe if you are likely to have more than 256 resources compiled into your final project; it raises the limit to 512 resource files in total): http://masm32.com/board/index.php?topic=4884.msg53620#msg53620
Title: Re: Playtime with ML64 and a question on spill space.
Post by: habran on June 24, 2016, 03:24:28 PM
Sir mineiro, that means that it will be taken care of :biggrin:
Mr. fearless, softpedia is fine :t
Maestro hutch ;)
I don't doubt your programming skills, and I agree with you about crappy ABI.
What I love the most about x64 is having plenty of registers to work with (my apologies to rrr314159).
Those who program maths or graphics will be able to take advantage of VECTORCALL.
The difference between C and assembler is that in assembler we can optimize the code for a specific purpose, while C makes it more portable. So, my goal is to create assembly source that is easy to understand (HLL features) and at the same time optimized machine code 8)

Title: Re: Playtime with ML64 and a question on spill space.
Post by: Mikl__ on June 24, 2016, 04:04:08 PM
Hi, rrr314159!
You wrote
Quote
There are other mistakes in all tutorials you'll see, which I'll mention briefly. They say all floating points are passed in XMM's. No, they're often passed in the GPRs. For instance printf gets floats from GPRs and will ignore any data you send in XMM
Tell me please how to display the contents of the XMM-register or a real number (float, double, 80- or 128-bits) using the printf function?
Thank you!
Title: Re: Playtime with ML64 and a question on spill space.
Post by: jj2007 on June 24, 2016, 06:26:29 PM
Quote from: Mikl__ on June 24, 2016, 04:04:08 PM
Tell me please how to display the contents of the XMM-register or a real number (float, double, 80- or 128-bits) using the printf function?

Here is the 32-bit version:

include \masm32\MasmBasic\MasmBasic.inc      ; download (http://masm32.com/board/index.php?topic=94.0)
.data
SomeInt        dq    1234567890123456789
SomeFloat      REAL8 1234567890.1234567890

  Init
  movlps xmm0, SomeInt
  movlps xmm1, SomeFloat
  Print Str$("X0 (int)=\t%i\n", xmm0), Str$("X1 (R8)=\t%If\n", f:xmm1)      ; for comparison

  sub esp, 8            ; create qword slot
  mov esi, esp          ; assign a reg that points to the slot
  movlps qword ptr [esi], xmm0      ; move a qword from xmm reg to slot

  sub esp, 8            ; repeat for second value
  mov edi, esp
  movlps real8 ptr [edi], xmm1

  printf("X0 (int)=\t%lld\n", qword ptr [esi])
  printf("X1 (float)=\t%.8f\n", REAL8 ptr [edi])
  Inkey "that was cute, right?"
EndOfCode


Output:
X0 (int)=       1234567890123456789
X1 (R8)=        1234567890.12345672
X0 (int)=       1234567890123456789
X1 (float)=     1234567890.12345670


Note that CRT and WinAPI both have the bad habit ("ABI") of trashing the lower xmm regs :(
Title: Re: Playtime with ML64 and a question on spill space.
Post by: Mikl__ on June 24, 2016, 06:39:19 PM
Thank you, jj2007!
(http://www.en.kolobok.us/smiles/icq/good.gif)
Title: Re: Playtime with ML64 and a question on spill space.
Post by: rrr314159 on June 25, 2016, 12:23:05 AM
Quote from: Mikl__ on June 24, 2016, 04:04:08 PM
Hi, rrr314159!
You wrote
Quote: There are other mistakes in all tutorials you'll see, which I'll mention briefly. They say all floating points are passed in XMM's. No, they're often passed in the GPRs. For instance printf gets floats from GPRs and will ignore any data you send in XMM
Tell me please how to display the contents of the XMM-register or a real number (float, double, 80- or 128-bits) using the printf function?
Thank you!

See this post http://masm32.com/board/index.php?topic=3988.msg42123#msg42123 in my old thread "Yet Another Invoke Macro". jj2007, GoneFishing and I discussed this at some length, in the pages around this post. Basic idea: put XMM contents into memory, pass pointer to printf, and use %llx format command (twice). AFAIK that's the only way; printf doesn't accept XMM registers as input.

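include \myinc\inc64.inc
.data
    o1 OWORD 12335678aacdff0112344678abbeef02h
.code

start:
    movups xmm0, o1                 ; load the 128-bit value into XMM0
    movups [rsp-16], xmm0           ; store it just below the stack pointer
    mov r15, [rsp-8]                ; high qword
    mov r14, [rsp-16]               ; low qword
    prnt "%llx ", QWORD PTR r14     ; print the low qword in hex
    prnt "%llx \n", QWORD PTR r15   ; then the high qword
ret
end start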


This code puts XMM0 on the stack, because that's what GoneFishing (a.k.a. Vertograd) wanted to do; you could also simply point at o1. If you want to use this code as it is, you have to get my nvk macro, along with the prnt macro and the "inc64.inc" includes. But "prnt" is just a simple wrapper that calls printf. You should be able to adapt this technique for use with printf, without needing the nvk macro(s).

[EDIT] Forgot you asked about real numbers also. For a 64-bit real, put it in a register such as r8 or r9. For anything larger, you would have to use the above technique with the right format specifier, maybe %llf; I don't know about that.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: jj2007 on June 25, 2016, 01:52:54 AM
Quote from: rrr314159 on June 25, 2016, 12:23:05 AM
use %llx format command

%llx doesn't work for me - do you have code that produces valid output? Or is there a difference between 32-bit and 64-bit crt printf()??

See above, they both work fine:
  printf("X0 (int)=\t%lld\n", qword ptr [esi])
  printf("X1 (float)=\t%.8f\n", REAL8 ptr [edi])
Title: Re: Playtime with ML64 and a question on spill space.
Post by: rrr314159 on June 25, 2016, 04:03:32 AM
@jj2007, I don't remember. But I know %llx (long hex) worked with that code. Undoubtedly if you download the nvk macros, which include everything except the libraries, it will work. And there were other %ll type formats that worked; maybe %llf, %llu, etc. These days I'm only doing 32-bit, because I got tired of the lack of standardization with 64-bit. It often happened that something worked for me, with my idiosyncratic macros, but not for others. To get benefit from my stuff you should probably read the code to see what goes on under the hood, and adapt what's useful for your own code.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: jj2007 on June 25, 2016, 06:48:10 AM
Quote from: rrr314159 on June 25, 2016, 04:03:32 AM
@jj2007, I don't remember. But I know %llx (long hex) worked with that code.

Yes it does: llx produces hex output. I had fed it with the decimal stuff posted above, and couldn't make sense of the result...
My fault :P
Title: Re: Playtime with ML64 and a question on spill space.
Post by: rrr314159 on June 25, 2016, 07:03:12 AM
Partly my fault also. MikL asked about floating point, I didn't read carefully, and responded with a hex format code.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: MichaelW on June 28, 2016, 05:26:09 PM
For the original question, Raymond Chen provides a compact answer here (https://blogs.msdn.microsoft.com/oldnewthing/20040114-00/?p=41053):

"The stack must be kept 16-byte aligned. Since the "call" instruction pushes an 8-byte return address, this means that every non-leaf function is going to adjust the stack by a value of the form 16n+8 in order to restore 16-byte alignment."

Title: Re: Playtime with ML64 and a question on spill space.
Post by: hutch-- on June 28, 2016, 06:11:38 PM
Thanks Michael.  :t
Title: Re: Playtime with ML64 and a question on spill space.
Post by: mineiro on June 30, 2016, 08:36:03 AM
Quote from: habran on June 24, 2016, 03:24:28 PM
Sir mineiro, that means that it will be taken care of :biggrin:
Ohh sir habran, really sorry, only now do I understand what you wrote, my fault. Only today did I visit your page and understand about the name John.
I downloaded hjwasm and now I'm playing with it; I have successfully coded a simple asm sample.
Thanks a lot.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: habran on June 30, 2016, 08:56:02 AM
No worries mate :biggrin:
Title: Re: Playtime with ML64 and a question on spill space.
Post by: HSE on November 13, 2023, 09:28:24 AM
Hi fearless!

Quote from: fearless on June 24, 2016, 09:51:55 AM
I also started a port of some of the functions from the masm32.lib for x64 a while ago: https://bitbucket.org/LetTheLightIn/masm64-library

Is that a dead project?

Thanks, HSE.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: fearless on November 13, 2023, 11:07:57 AM
I moved it over to github: https://github.com/mrfearless/libraries/tree/master/Masm64

But I don't think I have added anything to it since then, or maybe just one or two extra functions. Of course anyone can contribute on github, suggest other inclusions or functions that should be added or updated, or post code here for inclusion/addition/corrections to the library.

I mainly created it for myself when doing some of the x64 stuff, as it was easier to port code without having to worry about missing functions etc. I could just change a few things (registers, params etc.), rename '32' to '64' when including the masm inc and lib, and hit compile. For the most part the params and returns are the same (probably 95% of the functions that have been ported over), but there might be a few that take extra params etc.

Title: Re: Playtime with ML64 and a question on spill space.
Post by: HSE on November 13, 2023, 12:56:37 PM
Quote from: fearless on November 13, 2023, 11:07:57 AM
I moved it over to github: https://github.com/mrfearless/libraries/tree/master/Masm64

I saw that.

But Masm32Lib has around 238 functions, and you have around 35.

My hope was that I was searching in the wrong place  :biggrin:

It's not clear to me why Hutch only translated half of the library. Perhaps in 32 bits there was no plan either.

Thanks.
Title: Re: Playtime with ML64 and a question on spill space.
Post by: fearless on November 13, 2023, 10:22:12 PM
Quote from: HSE on November 13, 2023, 12:56:37 PM
But Masm32Lib has around 238 functions, and you have around 35.

Yes, I only ported the functions I needed at the time.

I can move the project so that it is in its own repository, so that anyone can contribute to it. Also it could be renamed: Asm64.lib, A64.lib, U64.lib or whatever, suggestions are welcome.

Were there any particular functions you were looking for?
Title: Re: Playtime with ML64 and a question on spill space.
Post by: HSE on November 13, 2023, 11:04:29 PM
Quote from: fearless on November 13, 2023, 10:22:12 PM
Yes, I only ported the functions I needed at the time.

:thumbsup:

Quote from: fearless on November 13, 2023, 10:22:12 PM
I can move the project so that it is in its own repository, so that anyone can contribute to it. Also it could be renamed: Asm64.lib, A64.lib, U64.lib or whatever, suggestions are welcome.

Fantastic. That could be "Masm64.lib: A fearless' curated repository"  :thumbsup:

Quote from: fearless on November 13, 2023, 10:22:12 PM
Were there any particular functions you were looking for?

Just a few days ago I translated a couple: ClearScreen and locate, so elementary but missing. Over time there were a couple more.