I have put together a test framework using the 64-bit binaries from VC2010 and the include files and libraries that Mikl_ has been using. Below is a bare-bones example so that my question will be understood.
The batch file to build the example.
@echo off
\masm64\bin\ml64.exe /c test1.asm
\masm64\bin\link.exe /SUBSYSTEM:WINDOWS /ENTRY:main test1.obj
pause
The bare minimum source code to demonstrate what I need to know.
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
OPTION DOTNAME
option casemap:none
include \masm64\include\win64.inc
include \masm64\include\temphls.inc
include \masm64\include\kernel32.inc
include \masm64\include\user32.inc
includelib \masm64\lib\user32.lib
includelib \masm64\lib\kernel32.lib
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
.data
pmsg db "This example is written in ML64.EXE",0
pttl db "Howdy Folks",0
.data?
.code
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
main proc
sub rsp, 40
invoke MessageBox,0,ADDR pmsg,ADDR pttl,0
invoke ExitProcess,0
main endp
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
end
What I need to know is why the spill space needs to be set to a specific size. I looked up Mikl_'s example, which showed 40 bytes; if I add more it will not work, and if I set it to less it will not work. I need to know why, and if anyone has some definitive reference material on how and why spill space is configured, it will be most appreciated.
Hi hutch :biggrin:
If you use HJWasm you don't have to worry about these intricacies, it will take care of everything :t
However, here (http://www.codemachine.com/article_x64deepdive.html) is everything clearly explained with examples. Only HJWasm is able to do those things which you can find there.
I appreciate your willingness to step into a "Brave New World" :t
Cheers!
Have you read the ABI? And, there are many tutorials on the topic. Google is your friend.
Still I must admit that many of the tutorials have mistakes, are hard to follow, and don't really answer your question. The trick is to include these keywords in your Google search: "good", "correct", "relevant", and "understandable". That filters out all the bad, wrong, irrelevant ones that are impossible to understand. In your case you should probably also use "written_in_Australian". You're welcome in advance!
I confess answers of this type are about as useful as a hip pocket in a singlet.
Hi,
hutch--!
I can't explain it well in English, but this may make it clearer:
Quote
;immediately after entry into the program <-- rsp=2CFC58
13F921000: sub rsp,28h <-- rsp=2CFC30 <-- align 10h
13F921004: xor ecx,ecx
13F921006: xor r9d,r9d
13F921009: lea rdx,[140003000];"This example is written in ML64.EXE",0
13F921010: lea r8,[140003024];"Howdy Folks",0
13F921017: call MessageBoxA
;immediately after the call instruction
; RSP = 2CFC28 [RSP]=13FFA101D<-- Address of Return
; 2CFC30 [RSP+8]=0 <-- RCX_Home
; 2CFC38 [RSP+10]=0 <-- RDX_Home
; 2CFC40 [RSP+18]=0 <-- R8_Home
; 2CFC48 [RSP+20]=0 <-- R9_Home
13FFA101D: xor ecx,ecx
13F92101F: call ExitProcess
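The arithmetic in the trace above can be replayed in a few lines of C. This is only a sketch: the function names are made up for illustration, and the only inputs are the RSP values quoted in the trace.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers that replay the stack arithmetic from the trace. */

/* RSP after "sub rsp, bytes". */
uint64_t rsp_after_sub(uint64_t rsp, uint64_t bytes) {
    return rsp - bytes;
}

/* RSP seen by the callee: "call" pushes an 8-byte return address. */
uint64_t rsp_after_call(uint64_t rsp) {
    return rsp - 8;
}

/* Address of the home slot for register argument `index`
   (0 = RCX, 1 = RDX, 2 = R8, 3 = R9), measured from the callee's RSP. */
uint64_t home_slot(uint64_t callee_rsp, unsigned index) {
    assert(index < 4);
    return callee_rsp + 8 + 8u * index;
}
```

Feeding in 2CFC58h and 28h gives 2CFC30h after the sub (a multiple of 10h), 2CFC28h inside MessageBoxA, and home slots running from 2CFC30h to 2CFC48h, which matches the trace.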
Now try this variant; it runs on 64-bit Windows 7.
OPTION DOTNAME
option casemap:none
include \masm64\include\win64.inc
include \masm64\include\temphls.inc
include \masm64\include\kernel32.inc
include \masm64\include\user32.inc
includelib \masm64\lib\user32.lib
includelib \masm64\lib\kernel32.lib
.data
pmsg db "This example is written in ML64.EXE",0
pttl db "Howdy Folks",0
.code
main proc
push rbp
invoke MessageBox,0,ADDR pmsg,ADDR pttl,0
pop rbp
retn
main endp
end
Thanks Mikl_, that worked fine but I am none the wiser as to why. I am testing this on Win10 64 bit Professional.
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
OPTION DOTNAME
option casemap:none
include \masm64\include\win64.inc
include \masm64\include\temphls.inc
include \masm64\include\kernel32.inc
include \masm64\include\user32.inc
include \masm64\include\msvcrt.inc
includelib \masm64\lib\user32.lib
includelib \masm64\lib\kernel32.lib
includelib \masm64\lib\msvcrt.lib
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
.data?
msize db 32 dup (?)
.data
ptrm dq msize
pttl db "Memory Address",0
.code
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
main proc
LOCAL pMem :QWORD
;;; sub rsp, 40
push rbp
invoke GlobalAlloc,GMEM_FIXED ,1024*1024*1024*8 ; 8 gig
mov pMem, rax
; char *_itoa(
; int value,
; char *str,
; int radix
; ----------------------------------
; convert memory address into string
; ----------------------------------
invoke _itoa,pMem,ptrm,10
invoke MessageBox,0,ptrm,ADDR pttl,0
invoke GlobalFree,pMem
invoke ExitProcess,0
pop rbp
main endp
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
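One thing to watch in the listing above: the CRT prototype quoted in its comments declares `value` as `int`, which is 32 bits, so a 64-bit address handed to `_itoa` keeps only its low 32 bits. The Microsoft CRT documents `_i64toa` and `_ui64toa` for 64-bit values; whether the msvcrt.inc used here declares them is not something this thread shows. As a minimal C sketch of a 64-bit-safe conversion, using only standard `snprintf` with `PRIu64` (`format_address` and `low_32_bits` are made-up names):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a 64-bit value as decimal text into buf.
   Returns the number of characters written, or -1 if buf is too small. */
int format_address(uint64_t value, char *buf, size_t size) {
    assert(buf != NULL);
    int n = snprintf(buf, size, "%" PRIu64, value);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* What a 32-bit int parameter would keep of the same value. */
uint32_t low_32_bits(uint64_t value) {
    return (uint32_t)value;
}
```

A 32-byte buffer, as in the listing, is more than enough: the largest 64-bit unsigned value has 20 decimal digits.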
Doing a quick search on Google, is there anything better in terms of reference material for the 64 bit calling convention than the following URL ?
https://msdn.microsoft.com/en-us/library/ms235286.aspx
OPTION DOTNAME
option casemap:none
include \masm64\include\win64.inc
include \masm64\include\temphls.inc
include \masm64\include\kernel32.inc
include \masm64\include\user32.inc
include \masm64\include\msvcrt.inc
includelib \masm64\lib\user32.lib
includelib \masm64\lib\kernel32.lib
includelib \masm64\lib\msvcrt.lib
OPTION PROLOGUE:rbpFramePrologue
.data?
msize db 32 dup (?)
.data
ptrm dq msize
pttl db "Memory Address",0
.code
main proc
LOCAL pMem :QWORD
invoke GlobalAlloc,GMEM_FIXED ,1024*1024*1024*2*2 ; 4 gig
mov pMem, rax
; char *_itoa(
; int value,
; char *str,
; int radix
; ----------------------------------
; convert memory address into string
; ----------------------------------
invoke _itoa,rax,ptrm,10
invoke MessageBox,NULL,ptrm,&pttl,MB_OK
invoke GlobalFree,pMem
invoke ExitProcess,0
main endp
end
hutch, if you are looking for the pill, try to contact Bradley Cooper; otherwise you will have to roll up your sleeves and sweat blood :biggrin:
I've given you the address where to go in the post above 8)
I think you're talking about shadow space. In the fastcall calling convention the arguments are passed through registers (rcx, rdx, r8, r9), so you should store these registers on the stack at the start of your procedure; if anything goes wrong, the system can recover that information. rsp should be aligned to a multiple of 10h before a call instruction. I'm not sure, but it appears that this only works out with an even number of arguments; otherwise you have to put a dummy value on the stack to align it. At your program's entry point rsp ends in 8h.
You have a couple of options. You can wait until the procedure is written, see which call takes the most parameters, and then do a single sub rsp,?? at the start and a single add rsp,?? at the end (space sized for the biggest call can be reused by the smaller ones); or you can adjust the stack around each function call.
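The "reserve once for the biggest call" option can be sketched in C as a size calculation. `frame_bytes` is a hypothetical name, and the sketch assumes rsp ends in 8h at entry and that nothing else is pushed before the sub.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes to subtract from RSP once, at procedure
   entry, so that every call in the body has room for its arguments.
   arg_counts[i] is the number of arguments of the i-th call.
   Assumes RSP ends in 8h at entry (return address already pushed)
   and that nothing else is pushed before the sub. */
size_t frame_bytes(const unsigned *arg_counts, size_t n_calls) {
    assert(arg_counts != NULL || n_calls == 0);
    unsigned max_args = 4;               /* always reserve 4 home slots */
    for (size_t i = 0; i < n_calls; i++) {
        if (arg_counts[i] > max_args) {
            max_args = arg_counts[i];
        }
    }
    size_t bytes = (size_t)max_args * 8; /* one qword per argument */
    if (bytes % 16 == 0) {
        bytes += 8;                      /* pad so RSP ends in 0h */
    }
    return bytes;
}
```

With only four-argument calls this gives 40 (28h). A procedure that also makes a seven-argument call gets 56 (38h), and the smaller calls simply reuse the bottom of that space.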
I don't have 64-bit Windows to try this on, so I'm going from memory, but VEH or SEH works with a start_address and end_address to be monitored: you set up a range of addresses, and with the arguments on the stack you can see what happened before a possible error.
Hello, mineiro!
Sorry, but it's not clear who you are referring to. Me, habran or hutch--?
To Mr. hutch, Mr. Mikl___, answering the question about spill space. A big hug, brother. The examples you post here on the forum are very useful and of great value.
I'm talking about spill space, answering the topic author.
Excuse me again, Mr. mineiro!
It would be interesting if you posted an example of VEH (structured error handling) for Win64. It's not as complicated as it seems, and from what I could tell you know how to use WinDbg. Regards, Mr. Mikl___. No need to apologize, brother, we're in the same boat.
I'll see what can be done with a VEH example...
Great. I remember doing a division by zero to cause an intentional error back when I was venturing into Win64; I inserted some nops before and after the instruction to have a range of addresses to work with.
Regards.
hutch, I thought you'd appreciate a little humor to lighten your day! But the main reason I didn't answer is that I couldn't find my previous posts from long ago that went into this, and I don't want to get into a long discussion about a trivial error I might make recalling how it goes. Anyway - mineiro is right, but here's my take on it (with probably a trivial error).
The ABI fastcall allows up to four parameters passed in registers rcx, rdx, r8, r9. After that they go on the stack. But the strange thing is you must allow four spaces on the stack even though you don't send any data in them. Called spill or shadow space. The called routine can use that space to store the four registers if they want. It's hokey but that's MS for you.
The other requirement is that when you call, the stack must be on a 16-byte boundary, ???????0h. The call will put the return address on the stack and jump to the called routine. So when that routine starts, it will be on 8h. As long as everyone follows the rules it will always be that way. So the same thing has occurred in your own routine: when it started, you were on an 8-boundary. Therefore you have to add one more qword to get to 0h. That's why you need 5 8's in all: 40, or 28h. One of them is to round it up to 0h, then 4 (20h) for the actual spill space.
You mentioned it works only with exactly that number; no, it's ok with 38h, 48h, etc; but you have to adjust stack afterwards, before returning from your routine.
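To make the "5 8's" rule concrete, here is a small C check of which values can follow `sub rsp,` at the top of a routine. It is a sketch under the same assumptions as the explanation (rsp ends in 8h on entry, nothing pushed before the sub); `adjust_is_valid` is a made-up name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SHADOW_BYTES = 4 * 8 };  /* home slots for rcx, rdx, r8, r9 */
static_assert(SHADOW_BYTES == 32, "four 8-byte home slots");

/* Hypothetical check: is `adjust` usable as the "sub rsp, adjust" at
   the top of a routine that calls other functions? Assumes RSP ends
   in 8h on entry and nothing is pushed before the sub. */
bool adjust_is_valid(size_t adjust) {
    bool room_for_shadow = adjust >= SHADOW_BYTES;
    bool aligned_after   = (adjust + 8) % 16 == 0;  /* 8 = return address */
    return room_for_shadow && aligned_after;
}
```

28h, 38h and 48h pass. 20h and 30h fail on alignment, which would explain why adding or removing a single qword from 40 breaks the MessageBox example. 18h is aligned but leaves no room for the four home slots.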
The reason for insisting on this standard alignment is that XMM registers must be stored on the stack at 16-byte boundaries; some of the instructions (the aligned moves such as movaps) need that.
It's important to note the following fact, which has tripped up many people. When I was learning I found long threads on StackExchange (or whatever) that never did get this point straight. MessageBox is one of the few simple functions that really does insist on this alignment! printf, for instance, does not. So if you experiment on many other simple calls you wind up thinking you have more latitude. But then MessageBox will get you; and, some others. Best to follow the rules at all times; although, for convenience and speed, my code breaks this rule often - when I know all subsequent calls will be "safe".
Why does MessageBox behave like that? I don't doubt it's because they make a call to a window routine to put up that box. Whereas most other basic functions don't, and their code just never uses XMM registers.
There are other mistakes in all tutorials you'll see, which I'll mention briefly. They say all floating points are passed in XMM's. No, they're often passed in the GPRs. For instance printf gets floats from GPRs and will ignore any data you send in XMM. Also VARARG is handled specially. I found one ref somewhere on MS that explained that correctly. Other MS pages, and (iirc) all others, got it wrong. I actually don't remember the details. See the way I did it in my nvk Macro, "Yet Another Macro" post, it's about 40 posts ago in 64-bit forum. There was also a post a year ago, or so, where I answered all this in detail. It's not on 64-bit forum though, because OP (I think it was fearless?) asked the q. somewhere else. Generally, you could do a lot worse than simply review all my 64-bit work from that period.
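For reference, Microsoft's x64 calling convention documentation states the vararg rule like this: for a variadic function such as printf, a floating-point argument in one of the first four positions has to be present in both the XMM register and the matching integer register. What goes into the integer register is the raw bit pattern, not a converted integer. A minimal C sketch of that bit copy (`double_bits` is an illustrative name, not part of any library):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Hypothetical helper: the 64-bit pattern that has to be placed in the
   integer register (rdx, r8 or r9) when a double is passed to a
   variadic function in that argument position. It is a raw bit copy,
   not a numeric conversion. */
uint64_t double_bits(double value) {
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

So a REAL8 holding 1.0 travels in the general-purpose register as 3FF0000000000000h, the same qword a plain `mov` from the variable would load.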
Quote from: rrr314159 on June 24, 2016, 12:21:10 AM
"Yet Another Macro"
It's here: http://masm32.com/board/index.php?topic=3988.msg42003#msg42003
It might be helpful to have a sticky post in the 64-bit forum with a "Hello World" archive containing
- basic includes (kernel, user, msvcrt, ...), or at least exact links to them
- basic libs, or at least exact links to them
- a batch file that takes an asm file as argument
- a link to the version of ml/jwasm/hjwasm/asmc that works with the hello world
- a link to a free 64-bit debugger
So far, I find it far too confusing to even start playing with 64-bit code 8)
I really don't get the point of the fastcall calling convention. They say it's for speed; OK, I agree, passing parameters in registers is really quick. But you are supposed to move the registers to the stack, so where's the gain? Only being able to use rbp as a normal register, because of RIP-relative addressing?
You're not forced to move the parameters to the stack, but skipping it becomes a bad habit.
So why not code it like stdcall, where you push things on the stack and adjust it after the function is called if there are more than 4 parameters? On Linux it's the same thing, I suppose; the difference is that it has 6 registers instead of 4. So why bother about rsp alignment? I really don't get the point of fastcall.
Try a wsprintf with 7 parameters and an error can happen; this way we lose precious memory aligning the stack, for nothing.
---edited----
I meant the C calling convention, not stdcall as I said before.
What's more, reading that topic about 32 bits versus 64 bits, I think everybody agrees that there is no real gain from one to the other, only in specific types of code (overhead removed). So my conclusion is that 64-bit programs eat more memory and have no real gain.
This is just playing with ML64 macros.
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
OPTION DOTNAME
option casemap:none
include \masm64\include\win64.inc
include \masm64\include\temphls.inc
include \masm64\include\kernel32.inc
include \masm64\include\user32.inc
include \masm64\include\msvcrt.inc
includelib \masm64\lib\user32.lib
includelib \masm64\lib\kernel32.lib
includelib \masm64\lib\msvcrt.lib
; char *_itoa(
; int value,
; char *str,
; int radix
buff$ MACRO valu
LOCAL buffer,pbuf
.data?
buffer db 32 dup (?)
.data
pbuf dq buffer
.code
invoke _itoa,valu,pbuf,10
EXITM <pbuf>
ENDM
falloc MACRO bsize
invoke GlobalAlloc,GMEM_FIXED,bsize
EXITM <rax>
ENDM
fxfree MACRO hndl
invoke GlobalFree,hndl
ENDM
appexit MACRO valu
invoke ExitProcess,valu
ENDM
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
.data
pttl db "Memory Address",0
.code
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
main proc
LOCAL pMem :QWORD
push rbp
mov pMem, falloc(1024*1024*1024*8) ; allocate fixed memory
invoke MessageBox,0,buff$(pMem),ADDR pttl,0 ; display string of memory value
fxfree pMem ; release memory
appexit 0 ; exit the process
pop rbp
main endp
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
comment #
https://msdn.microsoft.com/en-us/library/9z1stfyw.aspx
Volatile
rax rcx rdx r8 r9 r10 r11
Non Volatile
r12 r13 r14 r15 rdi rsi rbx rbp rsp
Volatile
xmm0 ymm0
xmm1 ymm1
xmm2 ymm2
xmm3 ymm3
xmm4 ymm4
xmm5 ymm5
Nonvolatile (XMM), Volatile (upper half of YMM)
xmm6-15
ymm6-15
#
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
end
The register data is à la Microsoft.
Structures should be padded to 8 (on each member? If so, a lot of them would have to be done by hand; I think assemblers look at the total size of the structure rather than each member's size). Pointers are all 8 bytes, but handles are not, I suppose. If speed is the argument, as they say, making all types qwords would make more sense, but it's not done that way. Then again I think (not sure) that the lea instruction (addr) deals with dword-sized addressing in long mode (x86-64) and offset deals with qwords. Procedures should be aligned to 16 to favor the use of xmm/ymm.
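On the per-member question, a C compiler gives a quick way to see what layout the headers expect: each member is placed on its own natural boundary, so padding appears only where a larger member follows a smaller one. The structure below is a made-up example, not one taken from the Windows headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static_assert(sizeof(uint32_t) == 4, "dword is 4 bytes");

/* Hypothetical record with a dword followed by a qword. The compiler
   inserts padding after `count` so that `address` sits on its natural
   boundary; an assembler STRUCT needs the same padding written out
   (or an alignment option) to match. */
struct sample {
    uint32_t count;
    uint64_t address;
};

/* Offset of the qword member, including any padding before it. */
size_t address_offset(void) {
    return offsetof(struct sample, address);
}
```

On x86-64 compilers the offset comes out as 8 and the whole structure as 16 bytes, so the dword is followed by 4 bytes of padding rather than being widened to a qword.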
One question: what's the minimum machine that can run Win64? I was reading that SSE2 is the minimum; not all machines have the ymm registers and instruction set.
Too much headache; good adventures.
mineiro, SSE2 is all you need.
As for fastcall, if you just remember that MS did a lousy job with the ABI it all makes sense. As you say advantage of passing values in rcx, rdx, r8, r9 is somewhat negated by necessity of stack manipulation, and so forth. But remember, with your own code you don't have to follow any conventions at all. Only when interfacing with MS or other outside entities. In your own code, you can take advantage of those extra registers to almost eliminate passing anything on the stack. And of course ignore alignment except when using instructions (like some SSE XMM instructions) that demand it.
Quote from: jj2007 on June 24, 2016, 01:11:22 AM
It might be helpful to have a sticky post in the 64-bit forum with a "Hello World" archive containing ...
Only problem with that idea: it sounds suspiciously similar to work.
Quote from: jj2007
So far, I find it far too confusing to even start playing with 64-bit code
For the type of thing that MasmBasic does 64-bit is a lot of work for essentially no gain. You "join the 64-bit world", and someday when everybody has mega-gigs of RAM it might be necessary, but apart from that there is no added functionality. It just bloats the code and slows it down a tiny bit.
There are two main reasons why it's bad, neither of them having to do with Intel, or the basic concept of 64-bit. First, MS did a lousy job with the ABI and much of the rest of their implementation. Second there's no "masm64", so 64-bit asm is not standardized. For instance MikL and I use different libraries and a couple other less important differences so are not immediately compatible.
All these bad points disappear when writing code for yourself (more or less). All my code obeys MS interface where it must, maybe 10%. In the rest I can freely use the extra registers and qword-manipulation capability. It all works great, easy to learn, big advantage especially for math routines, but also graphics and, lesser extent, any other code.
So for non-production code, for your own use, or distributed only to (more or less) friends, I strongly recommend getting into 64-bit. But for the type of thing most people here do, like you and hutch, not. It's still worth learning about but only as a dull chore, so you can keep up with the times. For MS-production coding it's pretty much an unalloyed negative.
If you, hutch and similar want to make something good out of it, well worth considering is working with Habran to make HJWasm a de facto standard. It could be developed to be the core of a "masm64.com" site.
Thanks rrr314159 :t
You are a downright rational thinker 8)
hutch and jj2007, if you find FASTCALL too complex, brace yourself for VECTORCALL, which is coming in the next release of HJWasm.
The VECTORCALL will also work in x86.
I am sure that rrr314159 and qWORD will embrace it ;)
Here is the MSDN introduction to what it is about:
Quote
In addition to SIMD data types, Vector Calling Convention can also be
used for Homogeneous Vector Aggregate data-type (HVA) and Homogeneous Float Aggregate data-type (HFA).
An HVA/HFA data-type is a composite type where all of fundamental data types of members that compose
the type are the same and are of Vector or Floating Point data type. (__m128, __m256,__512 float/double).
An HVA/HFA data type can have at most four members.
Hmm, yes sir rrr314159, you're right. Inside the scope of our own program we can do anything, and if we need to talk to other calling conventions/ABIs we follow their rules. That way there can be a real gain to justify fastcall.
I'd like to read your opinion: to you, what's the best way to release a library (like the masm32 lib)? Maybe one (or many) prologue and epilogue function(s) to deal with other languages/ABIs, and an internal calling convention for assembly programmers?
Sir habran, do you have any plans to continue Linux support?
Quote from: jj2007 on June 24, 2016, 01:11:22 AM
- basic includes (kernel, user, msvcrt, ...), or at least exact links to them
- basic libs, or at least exact links to them
Some information related to these might be found in this post: JWasm64 with RadASM - http://masm32.com/board/index.php?topic=4162.msg44176#msg44176
Quote from: jj2007 on June 24, 2016, 01:11:22 AM
- a link to a free 64-bit debugger
http://x64dbg.com/#start
Latest snapshots are available from here: https://github.com/x64dbg/x64dbg/releases
Some additional bits and pieces I played around with I've uploaded to bitbucket: https://bitbucket.org/mrfearless/jwasm64-with-radasm and https://bitbucket.org/mrfearless/debug64-for-jwasm64 - related post (http://masm32.com/board/index.php?topic=4203.msg44670#msg44670)
I also started a port of some of the functions from the masm32.lib for x64 a while ago: https://bitbucket.org/LetTheLightIn/masm64-library
Any and all can be downloaded, modified etc - they are a work in progress, or a starting point for some other enterprising fella to continue on with.
Sir mineiro ;)
I can assure you that my co-developer sir Johnsa will make it work for linux 8)
Excellent job mr. fearless :t
However, when I tried to download RadAsm from your link this is what I got:
Quote
The site ahead contains harmful programs
Attackers on www.assembly.com.br might attempt to trick you into installing programs that harm your browsing experience (for example, by changing your homepage or showing extra ads on sites you visit).
Quote from: mineiro on June 24, 2016, 09:29:47 AM
I'd like to read your opinion: to you, what's the best way to release a library (like the masm32 lib)? Maybe one (or many) prologue and epilogue function(s) to deal with other languages/ABIs, and an internal calling convention for assembly programmers?
Hi mineiro, since you ask,
Avoid complexity. I gather you want to support both Windows and Linux - that's already a fair amount of complexity in the interface. Don't forget, not only do you have to program it, but also produce documentation; and your users have to understand it.
If you want one function to serve for both, you should just use the simplest approach. My guess is, that would mean doing it the Linux way and providing prologue / epilogue to translate to Windows. But if the best way is to develop your own internal methods and translate to both OS's, then fine do that. But in that case I wouldn't publish your internal ABI so others can use it. Then you're locked in to that definition, and also must provide documentation and support. Every error they find can be a big headache: minimize the ways they can access the code to produce errors.
Perhaps you've already done this sort of thing and know all about it, in which case my opinion is superfluous. I've directed many software interfaces on Navy projects but have almost no professional experience in commercial projects. FWIW, when planning a project I always emphasize one thing: simplicity. Not speed or anything else. Everything tends to be a lot more complex than you thought at first, don't add any "extras". That can be done in version 2.
So bottom line - decide what interfaces you MUST support, decide the simplest way to do that, don't publish any more interfaces (like your internal ABI) than necessary.
An alternative might be: only publish your own, "improved", ABI, and leave translation for Windows and Linux to external routines.
The "KISS" principle: "Keep it simple, sailor!"
:biggrin:
Habran,
Quote
hutch and jj2007, if you find FASTCALL too complex, brace yourself for VECTORCALL, which is coming in the next release of HJWasm
I was using FASTCALL in MS-DOS in 1990; passing data in registers without a stack is not new technology. As for the later AVX instructions, you must first have the hardware, then read the Intel manuals. In Win32 you can do your own FASTCALL with EAX, ECX and EDX and use GLOBALS for any further arguments OR use structures passed in 1 register.
Now Win64 apart from having a really crappy ABI has many advantages for an assembler language programmer, roll your own VERYFASTCALL with the extra integer registers (rax rcx rdx r8 r9 r10 r11) and GLOBALS for a stack free method of calling procedures while remaining compatible with the Windows version of the ABI. What you don't need to cripple 64 bit x86 assembler with is the assumptions of a C compiler. If you do you may as well use a C compiler and write modules in an assembler when you need extra speed.
Thanks for answering.
Good sir rrr314159, the first thing is to get the code to do something useful; after that, yes, optimize. I like simplicity too, and I start from the point where nothing works; that way I get fewer headaches. Thanks a lot, very helpful answer.
sir habran, so I suppose the answer is no.
I met the owner of that .br site years ago on a now-defunct board. He translated the RadASM language files to Brazilian Portuguese and doesn't appear to be a bad person, but I won't put my hand in the fire for him.
Good job sir fearless.
Yeh seems that site gives a warning about radasm v2, not sure where else it is hosted anymore, if at all - hopefully some of the other info helps anyhow.
Maybe softpedia: http://www.softpedia.com/get/Programming/File-Editors/RadASM.shtml - seemed to work
plus a minor update I compiled for v2.2.2.1 (only worth getting this exe if you are likely to have more than 256 resources compiled into your final project - it raises the limit to 512 resource files in total): http://masm32.com/board/index.php?topic=4884.msg53620#msg53620
Sir mineiro, that means that it will be taken care of :biggrin:
Mr. fearless, softpedia is fine :t
Maestro hutch ;)
I don't doubt your programming skills, and I agree with you about crappy ABI.
What I love the most about x64 is having plenty of registers to work with (my apology to rrr314159).
Those who program for maths or graphics will be able to take advantage of VECTORCALL.
The difference between C and assembler is that in assembler we can optimize the code for the specific purpose, while C makes it more portable. So, my goal is to create assembly source that is easy to understand (HLL features) and at the same time optimized for the machine code 8)
Hi,
rrr314159!
You wrote
Quote
There are other mistakes in all tutorials you'll see, which I'll mention briefly. They say all floating points are passed in XMM's. No, they're often passed in the GPRs. For instance printf gets floats from GPRs and will ignore any data you send in XMM
Tell me please how to display the contents of the XMM-register or a real number (float, double, 80- or 128-bits) using the printf function?
Thank you!
Quote from: Mikl__ on June 24, 2016, 04:04:08 PM
Tell me please how to display the contents of the XMM-register or a real number (float, double, 80- or 128-bits) using the printf function?
Here is the 32-bit version:
include \masm32\MasmBasic\MasmBasic.inc ; download (http://masm32.com/board/index.php?topic=94.0)
.data
SomeInt dq 1234567890123456789
SomeFloat REAL8 1234567890.1234567890
Init
movlps xmm0, SomeInt
movlps xmm1, SomeFloat
Print Str$("X0 (int)=\t%i\n", xmm0), Str$("X1 (R8)=\t%If\n", f:xmm1) ; for comparison
sub esp, 8 ; create qword slot
mov esi, esp ; assign a reg that points to the slot
movlps qword ptr [esi], xmm0 ; move a qword from xmm reg to slot
sub esp, 8 ; repeat for second value
mov edi, esp
movlps real8 ptr [edi], xmm1
printf("X0 (int)=\t%lld\n", qword ptr [esi])
printf("X1 (float)=\t%.8f\n", REAL8 ptr [edi])
Inkey "that was cute, right?"
EndOfCode
Output:
X0 (int)= 1234567890123456789
X1 (R8)= 1234567890.12345672
X0 (int)= 1234567890123456789
X1 (float)= 1234567890.12345670
Note that CRT and WinAPI both have the bad habit ("ABI") to trash the lower xmm regs :(
Thank you, jj2007!
Quote from: Mikl__ on June 24, 2016, 04:04:08 PM
Hi, rrr314159!
You wrote
Quote
There are other mistakes in all tutorials you'll see, which I'll mention briefly. They say all floating points are passed in XMM's. No, they're often passed in the GPRs. For instance printf gets floats from GPRs and will ignore any data you send in XMM
Tell me please how to display the contents of the XMM-register or a real number (float, double, 80- or 128-bits) using the printf function?
Thank you!
See this post http://masm32.com/board/index.php?topic=3988.msg42123#msg42123 in my old thread "Yet Another Invoke Macro". jj2007, GoneFishing and I discussed this at some length, in the pages around this post. Basic idea: put XMM contents into memory, pass pointer to printf, and use %llx format command (twice). AFAIK that's the only way; printf doesn't accept XMM registers as input.
include \myinc\inc64.inc
.data
o1 OWORD 12335678aacdff0112344678abbeef02h
.code
start:
movups xmm0, o1
movups [rsp-16], xmm0
mov r15, [rsp-8]
mov r14, [rsp-16]
prnt "%llx ", QWORD PTR r14
prnt "%llx \n", QWORD PTR r15
ret
end start
This code puts XMM0 on the stack, because that's what GoneFishing (a.k.a. Vertograd) wanted to do; you could also simply point at o1. If you want to use this code as it is you have to get my nvk macro, along with prnt macro and "inc64.inc" includes. But "prnt" is just a simple wrapper that calls printf. You should be able to adapt this technique to use with printf, without needing the nvk macro (s).
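For anyone adapting this without the nvk and prnt macros, the same idea in plain C looks roughly like the sketch below: treat the 16-byte register image as two qwords and format each with `%llx`. `format_xmm` is a made-up name, and the low-half-first order simply mirrors the listing above, which prints r14 and then r15.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a 16-byte value (the memory image of an
   XMM register) as two hex qwords, low half first, the same order the
   listing above prints r14 and then r15.
   Assumes a little-endian machine such as x86-64. */
int format_xmm(const unsigned char image[16], char *out, size_t size) {
    assert(image != NULL && out != NULL);
    uint64_t low, high;
    memcpy(&low, image, 8);        /* bytes 0..7:  [rsp-16] in the listing */
    memcpy(&high, image + 8, 8);   /* bytes 8..15: [rsp-8] in the listing  */
    return snprintf(out, size, "%llx %llx",
                    (unsigned long long)low, (unsigned long long)high);
}
```

For the o1 constant in the listing the two halves are 12344678abbeef02 (low) and 12335678aacdff01 (high), so reading the pair right to left gives back the original 128-bit hex value.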
[EDIT] Forgot you asked about real numbers also. 64-bit, put it in a register, like r8 or r9. Larger, you would have to use the above technique with the right format statement, maybe %llf; I don't know about that.
Quote from: rrr314159 on June 25, 2016, 12:23:05 AM
use %llx format command
%llx doesn't work for me - do you have code that produces valid output? Or is there a difference between 32-bit and 64-bit crt printf()?
See above, they both work fine:
printf("X0 (int)=\t%lld\n", qword ptr [esi])
printf("X1 (float)=\t%.8f\n", REAL8 ptr [edi])
@jj2007, I don't remember. But I know %llx (long hex) worked with that code. Undoubtedly if you download nvk macros, which includes everything except the libraries, it will work. And, there were other %ll type formats that worked; maybe %llf, %llu, etc. These days I'm only doing 32-bit, because I got tired of the lack of standardization with 64-bit. Happened often that something worked for me, with my idiosyncratic macros, but not for others. To get benefit from my stuff you should probably read the code see what went on under the hood, and adapt what's useful for your own code.
Quote from: rrr314159 on June 25, 2016, 04:03:32 AM
@jj2007, I don't remember. But I know %llx (long hex) worked with that code.
Yes it does: llx produces hex output. I had fed it with the decimal stuff posted above, and couldn't make sense of the result...
My fault :P
Partly my fault also. MikL asked about floating point, I didn't read carefully, and responded with a hex format code.
For the original question, Raymond Chen provides a compact answer here (https://blogs.msdn.microsoft.com/oldnewthing/20040114-00/?p=41053):
"The stack must be kept 16-byte aligned. Since the "call" instruction pushes an 8-byte return address, this means that every non-leaf function is going to adjust the stack by a value of the form 16n+8 in order to restore 16-byte alignment."
Thanks Michael. :t
Quote from: habran on June 24, 2016, 03:24:28 PM
Sir mineiro, that means that it will be taken care of :biggrin:
Ohh sir habran, really sorry, only now do I understand what you wrote; my fault. Only today did I visit your page and understand about the name John.
I downloaded hjwasm and am now playing with it; I have successfully coded a simple asm sample.
Thanks a lot.
No worries mate :biggrin:
Hi fearless!
Quote from: fearless on June 24, 2016, 09:51:55 AM
I also started a port of some of the functions from the masm32.lib for x64 a while ago: https://bitbucket.org/LetTheLightIn/masm64-library
Is that a dead project?
Thanks, HSE.
I moved it over to github: https://github.com/mrfearless/libraries/tree/master/Masm64
But I have not added anything to it since then I think or maybe one or two extra functions. Of course anyone can contribute on github or suggest other inclusions or functions that should be added or updated or post code here for inclusion/addition/corrections to the library.
I mainly created it for myself when doing some of the x64 stuff, as it was easier to port code and not have to worry about missing functions etc. I could just change a few things (registers, params etc), rename '32' to '64' when including the masm inc and lib, and hit compile. For the most part the params and returns are the same (prob 95% of the functions that have been ported over), but there might be a few that take extra params etc.
Quote from: fearless on November 13, 2023, 11:07:57 AM
I moved it over to github: https://github.com/mrfearless/libraries/tree/master/Masm64
I saw that.
But Masm32Lib has around 238 functions, and you have around 35.
My hope was that I was searching in the wrong place :biggrin:
It's not clear to me why Hutch only translated half of the library. Perhaps there was no plan in 32 bits either.
Thanks.
Quote from: HSE on November 13, 2023, 12:56:37 PM
But Masm32Lib have around 238 functions, and you have around 35.
Yes, I only ported the functions I needed at the time.
I can move the project so that it is on its own repository, so that anyone can contribute to it. Also it could be renamed: Asm64.lib, A64.lib, U64.lib or whatever, suggestions are welcome.
Were there any particular functions you were looking for?
Quote from: fearless on November 13, 2023, 10:22:12 PM
Yes, I only ported the functions I needed at the time.
:thumbsup:
Quote from: fearless on November 13, 2023, 10:22:12 PM
I can move the project so that it is on its own repository, so that any one can contribute to it. Also it could be renamed: Asm64.lib, A64.lib, U64.lib or whatever, suggestions are welcome.
Fantastic. That could be "Masm64.lib: A fearless' curated repository" :thumbsup:
Quote from: fearless on November 13, 2023, 10:22:12 PM
Where there any particular functions you where looking for?
Just a few days ago I translated a couple: ClearScreen and locate, so elementary but missing. Over time there were a couple more.
Whilst in the process of doing some coding for some current upcoming projects: applications, libraries, controls etc, I recalled this post and took a break from those projects to spend some time on this. Initially I was just going to create a repo and upload what was already the Masm64.lib contents, but after some thought I decided to go over it and change a few things.
I decided to name it UASM64 Library as having it named as Masm64.lib gave the impression that it could be built with ML64. As it is targeted towards using UASM x64 it made sense to name it that. Other compilers and assemblers should be able to use the library though.
I also decided to rename some of the functions for greater readability, and the same with parameter names. There are equates for those coming from x86 using the masm32 library names, to help with porting or for those that prefer those names.
I also added some cpu functions whilst I was at it. And also created a readthedocs page that auto builds the documentation for the UASM64 Library, which will allow viewing the documentation online, or by downloading as a pdf, chm or epub for offline viewing.
I haven't got round to all the functions, just some. Also the conversion functions I haven't settled on a name yet. Provisionally I had thought to prefix them with Convert_ and then some info like StringDecToDWORD etc, but I am open to suggestions about any of the naming of functions, as it can be changed.
There is a basic test radasm project included to test some of the functions created so far - some, not all are currently in the test application.
https://github.com/mrfearless/UASM64-Library
I might post this somewhere else rather than have the discussions appended to this topic.