Hello developers,
This one does not work.
.686
.MODEL FLAT, C
option casemap:none
OPTION STACKBASE:ESP
.code
sub2 proc private dest, src
mov ecx, src
mov edx, dest
mov eax, dword ptr [ecx]
mov dword ptr [edx], eax
ret
sub2 endp
proc1 proc public dest : ptr, src : ptr
invoke sub2, dest, src
ret
proc1 endp
end
Try to initialise source and destination and see if it works.
That's what I did, I am calling proc1 from high level language.
It works like a charm if I do not use the OPTION STACKBASE:ESP :(
JWasm 2.12 produces exactly the same code. We can investigate, provide the proper source code please.
I know JWasm produces the same error, it is a common bug.
Follows attachment.
As I said before, you don't initialise parameters and it can be anything there.
Here is an example with initialised src and dest which works as expected:
.686
.MODEL FLAT, C
option casemap:none
OPTION STACKBASE:ESP
.data
src dd "ABCD"
dest dd 0
.code
sub2 proc private dest3, src3
mov ecx, src3
mov edx, dest3
mov eax, dword ptr [ecx]
mov dword ptr [edx], eax
ret
sub2 endp
proc1 proc public dest1 : ptr, src1 : ptr
local bob:DWORD
local dob:DWORD
lea eax,bob
mov dob,eax
invoke sub2, ADDR dest, ADDR src
ret
proc1 endp
end
it produces:
--- AW32.asm -------------------------------------------------------------------
14: mov ecx, src3
01331010 8B 4C 24 08 mov ecx,dword ptr [src3]
15: mov edx, dest3
01331014 8B 54 24 04 mov edx,dword ptr [dest3]
16: mov eax, dword ptr [ecx]
01331018 8B 01 mov eax,dword ptr [ecx]
17: mov dword ptr [edx], eax
0133101A 89 02 mov dword ptr [edx],eax
18:
19: ret
0133101C C3 ret
20: sub2 endp
21:
22: proc1 proc public dest1 : ptr, src1 : ptr
0133101D 83 EC 08 sub esp,8
23: local bob:DWORD
24: local dob:DWORD
25: lea eax,bob
01331020 8D 44 24 04 lea eax,[bob]
26: mov dob,eax
01331024 89 04 24 mov dword ptr [esp],eax
27: invoke sub2, ADDR dest, ADDR src
01331027 68 00 40 33 01 push 1334000h
0133102C 68 04 40 33 01 push 1334004h
01331031 E8 DA FF FF FF call sub2 (01331010h)
01331036 83 C4 08 add esp,8
28: ret
01331039 83 C4 08 add esp,8
0133103C C3 ret
--- No source file -------------------------------------------------------------
I'm not 100% convinced of this yet. The parameters (if initialized in the HLL) should be fine; asm code that is used as a library shouldn't need the params initialised, and the generated code should be the same either way.
; Disassembly of file: test86.obj
; Mon Mar 20 23:28:15 2017
; Mode: 32 bits
; Syntax: MASM/ML
; Instruction set: 80386
.386
.model flat
public _proc1
_text SEGMENT PARA PUBLIC 'CODE' ; section number 1
_sub2 LABEL NEAR
mov ecx, dword ptr [esp+8H] ; 0000 _ 8B. 4C 24, 08
mov edx, dword ptr [esp+4H] ; 0004 _ 8B. 54 24, 04
mov eax, dword ptr [ecx] ; 0008 _ 8B. 01
mov dword ptr [edx], eax ; 000A _ 89. 02
ret ; 000C _ C3
_proc1 PROC NEAR
push dword ptr [esp+8H] ; 000D _ FF. 74 24, 08
push dword ptr [esp+4H] ; 0011 _ FF. 74 24, 04
call _sub2 ; 0015 _ E8, FFFFFFE6
add esp, 8 ; 001A _ 83. C4, 08
ret ; 001D _ C3
_proc1 ENDP
_text ENDS
_data SEGMENT PARA PUBLIC 'DATA' ; section number 2
_data ENDS
END
I'm not sure why there is an add esp,8 at all in proc1... I guess the problem might be:
push dword ptr [esp+8H] ; 000D _ FF. 74 24, 08
push dword ptr [esp+4H] ; 0011 _ FF. 74 24, 04
push is going to move the stack pointer each time, so if the original arguments are at esp+8 and esp+4 on entry, it should be:
push dword ptr [esp+8H] ; 000D _ FF. 74 24, 08
push dword ptr [esp+8H] ; 0011 _ FF. 74 24, 08
.. just thinking out aloud :)
Quote from: habran on March 21, 2017, 08:21:32 AM
As I said before, you don't initialise parameters and it can be anything there.
Here is an example with initialised src and dest which works as expected:
The sad reality is that it is initialized, values are set before call (no need to be) and crashes. :(
Your example manipulates the reality, were the references carried over from sub2 to another sub it would crash as well. ::)
I can confirm the issue is the pushes
It's using esp to look up the current parameters on the stack, but then esp is being modified with each push (thus moving the position of the arguments on the stack relative to the esp register).
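To make the offset drift concrete, here is a minimal hand-written sketch of what a correct ESP-relative proc1 has to do. It is an illustration only, not the code the assembler emits after the fix: both procedures are declared without PROC parameters (an assumption, so that no prologue is generated) and every offset is tracked by hand in the comments.

```asm
.686
.model flat, c
option casemap:none
.code

; Hand-written variant: no PROC parameters, so no prologue is generated.
sub2 proc private
    mov ecx, [esp+8]            ; src  (pushed first by the caller)
    mov edx, [esp+4]            ; dest (pushed last by the caller)
    mov eax, dword ptr [ecx]
    mov dword ptr [edx], eax
    ret
sub2 endp

proc1 proc public
    ; On entry: [esp+4] = dest, [esp+8] = src
    push dword ptr [esp+8]      ; push src; ESP drops by 4, so dest is now at [esp+8]
    push dword ptr [esp+8]      ; push dest (was [esp+4] before the first push)
    call sub2
    add esp, 8                  ; C convention: the caller removes the two arguments
    ret
proc1 endp
end
```

The second push uses +8 rather than +4 because the first push has already moved ESP down by 4. The add esp, 8 after the call is expected with the C calling convention, where the caller cleans up the arguments it pushed.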
Quote from: johnsa on March 21, 2017, 08:39:24 PM
I can confirm the issue is the pushes
It's using esp to look up the current parameters on the stack, but then esp is being modified with each push (thus moving the position of the arguments on the stack relative to the esp register).
True :t
Quote from: johnsa on March 21, 2017, 08:39:24 PM
It's using esp to look up the current parameters on the stack, but then esp is being modified with each push (thus moving the position of the arguments on the stack relative to the esp register).
It's a bit tricky but I use it occasionally:
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
StreamOut proc cookie, pBuffer, NumBytes, pBytesWritten
invoke RtlMoveMemory, [esp+12], [esp+12], [esp+12] ; dest, source, count
mov eax, [esp+12]
ret 4*4
StreamOut endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
What I don't see yet is why this is a problem in the x64 world ::)
Are you referring to the other thread about the same stackbase:rsp issue?
My temporary suggestion there is to just use:
option frame:auto
option win64:11
option stackbase:rsp
then that code should be 100% (for now), that said there is no reason for the bug to be there so we will fix it. :)
this particular case here is just a plain old bug and we're fixing it now.
Hi,
OPTION STACKBASE:ESP has been removed.
There are many issues with it, it clearly wasn't a well thought-out addition.
1) No other assembler supports it.. hence we'd be producing non portable 32bit code
2) It totally violates the standard ABI..
3) The side-effects of how it works used in conjunction with the ABI using PUSH is just bad ..
4) There could be side-effects related to its use in win32 (x86) with SEH and VEH that I've not yet fully been able to dig through.
If anyone has any other thoughts on this, please share! :)
On that note, the test-piece as mentioned works perfectly without it.
STACKBASE:ESP will throw an assemble-time error now.
Quote from: johnsa on March 24, 2017, 01:11:31 AM
Hi,
OPTION STACKBASE:ESP has been removed.
I will be good with that but someone with a Godfather's Marlon Brando face wrote once:
"Unless you know exactly what are you doing, I would suggest you to use:
.....
option STACKBASE:RSP ; use rsp as a stack base instead of rbp"
lol
Well, with a pinch of salt: stackbase:rsp is a lot more useful because the x64 fastcall ABI is a lot more lenient in that regard and more powerful; it avoids push/pop, liberally aligns the stack, etc. So yes, for 64bit code I'd definitely say STACKBASE:RSP is the way to go.
The x64 stack unwind works with it, so it'll work with VEH etc., and it's nice and fast and frees up RBP.
STACKBASE:RSP remains and always will; it's just the 32bit ESP version that has gone to its grave :)
I also feel less bad about having the hjwasm 64bit stuff diverge from MASM, because masm + 64bit is crippled anyway and they've always been very different assemblers for x64. For x86 they were 99% compatible and I see no reason not to leave it like that. I never had any complaints with 32bit MASM; it was my assembler of choice in the pre-64bit days.
Quote from: johnsa on March 24, 2017, 01:42:58 AM
lol
well with a pinch of salt, stackbase:rsp is a lot more useful as the x64 fastcall ABI
Last time I checked there were serious issues while using stackbase:rsp; I don't know if they are already solved.
On the other hand, I am not yet convinced it will be advantageous, because the instructions are longer and execution appears slower. Of course, it will release the rbp register, which might be useful.
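The "instructions are longer" part can be checked against the disassemblies posted in this thread: an RSP-relative memory operand needs an extra SIB byte that an RBP-relative one does not. The small fragment below is only meant to be assembled and inspected in a listing, not called; the procedure name is made up for the illustration.

```asm
.code
; Hypothetical demo procedure: assemble it and compare the byte counts in the listing.
EncodingDemo proc
    mov qword ptr [rbp+10h], rcx    ; 48 89 4D 10     -> 4 bytes (RBP base)
    mov qword ptr [rsp+8], rcx      ; 48 89 4C 24 08  -> 5 bytes (RSP base needs SIB byte 24h)
    ret
EncodingDemo endp
end
```

Against that one byte per stack access, an RBP frame costs push rbp (55), mov rbp, rsp (48 8B EC) and pop rbp (5D), i.e. 5 bytes per procedure in the RBP-based listing shown in this thread. Which effect wins depends on how often a given procedure touches its arguments and locals, so size and speed are best measured on real code.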
I've not got any examples of stackbase:rsp being an issue, as I said I use it in +- 500k's worth of code.
The only issue I had found from your previous post after digging through and re-checking everything was just the omission of the FRAME attribute on the PROC.
And yes, you free up RBP and the prologue/epilogue are shorter (which == faster) and the stack isn't constantly fiddled with like other modes so that should improve cache use.
Quote from: johnsa on March 24, 2017, 02:13:28 AM
I've not got any examples of stackbase:rsp being an issue
You just forgot.
option casemap:none
option frame:auto
OPTION STACKBASE:RSP
option win64:11
.code
sub1 proc private dest:ptr, src:ptr, val1 : qword, val2:qword
mov dest, rcx
mov src, rdx
mov val1, r8
mov val2, r9
mov rax, qword ptr [rdx]
add rax, val1
add rax, val2
mov qword ptr [rcx], rax
ret
sub1 endp
getSum proc public dest:ptr, src:ptr, val1 : qword, val2:qword
INVOKE sub1, dest, rdx, r8, r9
ret
getSum endp
end
sub1 will be invoked with the stack not aligned (release HJWasm 2.21, not checked yet on latest).
you need to put FRAME on the proc decoration
sub1 proc private FRAME dest:ptr, src:ptr, val1 : qword, val2:qword
and
getSum proc public FRAME dest:ptr, src:ptr, val1 : qword, val2:qword
(I've attached the link to the C/C++ project with the updated asm in the other thread; it all runs through perfectly. Just for reference, you can simply add these two on your side.)
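Put together, the test piece with both FRAME decorations applied would look roughly like this. It only merges the two PROC lines above into the earlier listing and has not been re-assembled here, so treat it as a sketch rather than a verified build.

```asm
option casemap:none
option frame:auto
option stackbase:rsp
option win64:11
.code

; FRAME added so that unwind data and an aligned stack are generated.
sub1 proc private FRAME dest:ptr, src:ptr, val1:qword, val2:qword
    mov dest, rcx
    mov src, rdx
    mov val1, r8
    mov val2, r9
    mov rax, qword ptr [rdx]
    add rax, val1
    add rax, val2
    mov qword ptr [rcx], rax
    ret
sub1 endp

; FRAME added here as well; the body is unchanged from the original post.
getSum proc public FRAME dest:ptr, src:ptr, val1:qword, val2:qword
    INVOKE sub1, dest, rdx, r8, r9
    ret
getSum endp
end
```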
Quote from: johnsa on March 24, 2017, 02:53:23 AM
you need to put FRAME on the proc decoration
Now, imagine I want to use the good old standard RBP base frame. That possibility appears to have vanished completely.
option casemap:none
option frame:auto
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; OPTION STACKBASE:RSP
option win64:11
option ARCH:SSE
.code
sub1 proc private FRAME dest:ptr, src:ptr, val1 : qword, val2:qword
mov rax, qword ptr [rdx]
add rax, val1
add rax, val2
mov qword ptr [rcx], rax
ret
sub1 endp
getSum proc public FRAME uses xmm6 xmm7 dest:ptr, src:ptr, val1 : qword, val2:qword
LOCAL myVar1 : qword
mov rax, rdx
mov myVar1, rax
mov rdx, myVar1
INVOKE sub1, dest, rdx, val1, r9
ret
getSum endp
end
Disassembles to:
getSum:
000000013FDA181B mov qword ptr [rsp+8],rcx
000000013FDA1820 mov qword ptr [rsp+18h],r8
000000013FDA1825 sub rsp,50h
000000013FDA1829 movdqa xmmword ptr [rsp+20h],xmm6
000000013FDA182F movdqa xmmword ptr [rsp+30h],xmm7
000000013FDA1835 mov rax,rdx
000000013FDA1838 mov qword ptr [rsp+40h],rax
000000013FDA183D mov rdx,qword ptr [rsp+40h]
000000013FDA1842 mov rcx,qword ptr [rsp+58h]
000000013FDA1847 mov r8,qword ptr [rsp+68h]
000000013FDA184C call 000000013FDA1800
000000013FDA1851 movdqa xmm6,xmmword ptr [rsp+20h]
000000013FDA1857 movdqa xmm7,xmmword ptr [rsp+30h]
000000013FDA185D add rsp,50h
000000013FDA1861 ret
Where is my RBP base frame? Please help! :dazzled:
And we have the stack not aligned again! OMG! :dazzled:
aw27,
Spare us the wise cracks, these guys are doing a lot of work and don't need nonsense. If you can help with the BETA testing, well and good but no further nonsense.
remove option win64:11 AND stackbase:RSP
and it reverts to working the old-fashion JWASM and ML style way, which produces this (and works on that same test project I linked to) :
getSum proc public frame dest:ptr, src : ptr, val1 : qword, val2 : qword
000000013FEA1834 55 push rbp
000000013FEA1835 48 8B EC mov rbp,rsp
mov dest, rcx
000000013FEA1838 48 89 4D 10 mov qword ptr [dest],rcx
mov src, rdx
000000013FEA183C 48 89 55 18 mov qword ptr [src],rdx
mov val1, r8
000000013FEA1840 4C 89 45 20 mov qword ptr [val1],r8
mov val2, r9
000000013FEA1844 4C 89 4D 28 mov qword ptr [val2],r9
INVOKE sub1, dest, src, val1, val2
000000013FEA1848 48 83 EC 20 sub rsp,20h
000000013FEA184C 48 8B 4D 10 mov rcx,qword ptr [dest]
000000013FEA1850 48 8B 55 18 mov rdx,qword ptr [src]
000000013FEA1854 4C 8B 45 20 mov r8,qword ptr [val1]
000000013FEA1858 4C 8B 4D 28 mov r9,qword ptr [val2]
000000013FEA185C E8 AF FF FF FF call sub1 (013FEA1810h)
000000013FEA1861 48 83 C4 20 add rsp,20h
ret
000000013FEA1865 5D pop rbp
000000013FEA1866 C3 ret
Quote from: johnsa on March 24, 2017, 04:21:20 AM
remove option win64:11 AND stackbase:RSP
and it reverts to working the old-fashion JWASM and ML style way, which produces this (and works on that same test project I linked to) :
But why does the stack become misaligned in my previous example?
This comes back to what I was saying about all these modes and options, it all boils down to really several "supportable" and sensible options
Do pretty much nothing special, RBP frame pointer style / aligned.
option frame:auto
Or the "do everything optimally way"
option frame:auto
option stackbase:rsp
option win64:11
Or possibly the only OTHER option being
option frame:auto
option win64:1
So basically we have 3 modes:
totally dumb, smart, mostly dumb
So when I was talking earlier about removing all those modes, perhaps we just replace all of the complex combinatorial stuff above with 2 simple directives..
OPTION WIN64:SIMPLE
OPTION WIN64:AUTO
or something like that.
Here is an example of why I think a lot of these modes are irrelevant :
option casemap : none
option frame : auto
option win64 : 11
OPTION STACKBASE : RSP
OptimalProc PROTO aVar : QWORD, bVar : DWORD
AutoProc PROTO aVar : QWORD, bVar : DWORD
AutoProc2 PROTO aVar : QWORD, bVar : DWORD
.code
sub1 proc private frame dest : ptr, src : ptr, val1 : qword, val2 : qword
mov dest, rcx
mov src, rdx
mov val1, r8
mov val2, r9
mov rax, qword ptr[rdx]
add rax, val1
add rax, val2
mov qword ptr[rcx], rax
ret
sub1 endp
getSum proc public frame dest : ptr, src : ptr, val1 : qword, val2 : qword
mov dest, rcx
mov src, rdx
mov val1, r8
mov val2, r9
INVOKE sub1, dest, src, val1, val2
INVOKE AutoProc, 10, 20
INVOKE AutoProc2, 10, 20
INVOKE OptimalProc, 10, 20
ret
getSum endp
; Or using AUTO mode
AutoProc PROC FRAME aVar : QWORD, bVar : DWORD
mov eax, edx
mov rdx, rcx
ret
AutoProc ENDP
; Or using AUTO mode
AutoProc2 PROC FRAME aVar : QWORD, bVar : DWORD
mov eax, edx
mov rdx, aVar
ret
AutoProc2 ENDP
; You might find people doing this to create a "bare" zero - overhead procedure.
OPTION PROLOGUE : NONE
OPTION EPILOGUE : NONE
OptimalProc PROC aVar : QWORD, bVar : DWORD
mov eax, edx ; EAX = bVar
mov rdx, rcx ; RDX = aVar
ret
OptimalProc ENDP
OPTION PROLOGUE:DEFAULTPROLOGUE
OPTION EPILOGUE:DEFAULTEPILOGUE
end
We have 3 procs: OptimalProc, which is coded the way some might write it to ensure minimal overhead (ie: optimal call), and two versions of the same proc using the win64:11 / RSP combination.
AutoProc:
8B C2 mov eax,edx
48 8B D1 mov rdx,rcx
C3 ret
AutoProc2:
48 89 4C 24 08 mov qword ptr [aVar],rcx
8B C2 mov eax,edx
48 8B 54 24 08 mov rdx,qword ptr [aVar]
C3 ret
OptimalProc:
8B C2 mov eax,edx
48 8B D1 mov rdx,rcx
C3 ret
As you can see, there is no benefit.. the autoproc is just as efficient as the zero-overhead one, and in the case of AutoProc2 where we only reference ONE of the parameters by name, only that is copied to shadow space.
So without all the options, you still have full control inside the proc as to how efficient you want it to be.
Quote from: aw27 on March 24, 2017, 04:38:28 AM
Quote from: johnsa on March 24, 2017, 04:21:20 AM
remove option win64:11 AND stackbase:RSP
and it reverts to working the old-fashion JWASM and ML style way, which produces this (and works on that same test project I linked to) :
But why becomes the stack not aligned in my previous example?
I think it's really that win64:11 can only work with stackbase:rsp.. the two are "married" :)
All these options don't give you any flexibility, but they open up the potential for errors/problems when the wrong combinations are used.
Could we go through every single combination and make it work in some way?
Probably, but based on my previous example I'd be hard pushed to go to that amount of effort to fix things that shouldn't even be there in the first place :)
There are so many permutations to handle that it makes it a maintenance nightmare too.
My current vote is to remove them all and replace it with 2 choices, SIMPLE / AUTO (or something along those lines) and then you still have the option to remove the default prologue/epi. as per my example or use a raw label (old school style), and that should give you every combination you need and also doesn't make getting into 64bit asm coding a horrible prospect. It keeps it as simple as it was to get into x86.
John,
What you are suggesting here is the right move, automated stackframe for high level code, something inbetween if you can be bothered and no stack frame at all for people who know what they are doing. I think more than 4 arguments needs some form of automation but if you look at the stack overhead of loading the shadow space then writing to the stack in comparison to the duration of the vast majority of high level API and similar code, the lead and tail procedural code is trivial.
I am of the view that the pile of messy stack options you inherited from JWASM were mainly experiments that were useless, the faster you get rid of them, the more time you will have to write useful stuff.
Agreed.
Right now my biggest challenge is trying to decide on a name for the options
OPTION WIN64:COOL
OPTION WIN64:SUPERCOOL
;)
Quote from: johnsa on March 24, 2017, 06:02:17 AM
I think it's really that win64:11 can only work with stackbase:rsp.. the two are "married" :)
I am sure you did not even notice that stackbase:rsp makes the code bigger and slower. Just test to see. Before appearing in this forum, I had tested the OPTION STACKBASE:RSP on my library with JWASM and abandoned it for such reasons.
This is point 1.
The second point is that you are breaking backward compatibility with something that works. This is never a good idea, but I can't stop you.
The third point is: the most important feature I was looking for was the capability to use XMM registers with the INVOKE statement. I thought the feature was not present in JWASM, but it was there, although not documented. Instead of confirming that, you boasted to have included the feature in HJWASM. Please, give me a break, lol. :badgrin:
The 4th point is that I am never convinced by arguments such as: "There are so many permutations to handle that it makes it a maintenance nightmare too". For me, you simply lost control, OPTION WIN64 always worked fine with JWASM.
So, I am going to drop out before the administrator decides to expel me; he is infuriated and keeps repeating that you guys are doing a great job. Good luck then!
Quote from: aw27 on March 24, 2017, 01:50:15 PM
I am sure you did not even notice that stackbase:rsp makes the code bigger and slower.
José,
You are judging hard-working people here - that is not helpful. Automating the X64 ABI is pretty complex, and the HJWasm team and Hutch as well are doing their best to go beyond what the market offers. And they are doing it for free, as a hobby. Your contributions to solving the problem are certainly appreciated, but your judgments are unfair and unnecessary.
Quote from: jj2007 on March 24, 2017, 03:19:21 PM
Quote from: aw27 on March 24, 2017, 01:50:15 PM
I am sure you did not even notice that stackbase:rsp makes the code bigger and slower.
You are judging hard-working people here - that is not helpful.
It is not helpful to state that stackbase:rsp makes the code smaller and faster. It does not.
aw27,
There is not a point to win here. You are obviously experienced in writing assembler, and this is very useful to the folks who are creating the tool in the first place, but as I am sure you can imagine, the guts of an assembler are a nightmare to work on, and both authors inherited some of this mess from JWASM, which had a number of experimental techniques that have proven to be problematic.
Keep in mind that JWASM was supposed to have been MASM compatible yet in its experimental stage it added a range of un-necessary experimental code that was clearly NOT MASM compatible so the pursuit of some clean up and simplification is in fact a good idea. The MACRO capacity of writing your own prologue and epilogue is probably a far better method of producing custom stack usage and procedure calls than a plethora of unreliable experiments. This means the assembler can be cleaned up to produce predictable and reliable code and the brave/foolish/experimental or dedicated can cook their own if they can get it going. This keeps everybody happy.
Constantly hassling people doing complex work is counter productive and it can lead to them thinking "PHUKIT" and do something else for a while.
Quote from: hutch-- on March 24, 2017, 07:01:16 PM
aw27,
This keeps everybody happy.
I am not against tools for dumb people, my first Assembler was TASM in Ideal mode (sighs...).
Maybe we can make an Ideal mode for HJWASM ?!
Quote from: aw27 on March 24, 2017, 08:44:46 PM
Maybe we can make an Ideal mode for HJWASM ?!
The Ideal Mode is the one that allows you to port your sources from Masm32 (ML 32, ...) easily, without having to experiment with cryptic options. In phase II, you can pick the routines that you think can be made faster, and start playing with the options.
There is a huge Masm32 codebase. IMHO it would be unwise to tell its authors "we have a better assembler now, all you have to do is to read a 50-page manual".
deleted
Hi,
Thanks for the pointers!
Given that you've supplied working fixes for stackbase:esp, and after some soul searching over the weekend, I believe we should put it back in :)
There are definitely cases for optimisation where you'd want it.
With regards to your last point on calling convention I assume you meant fastcall for 32bit not 64bit ?
Habran and I are busy looking at building completely new versions of invoke and proc.c .. to be honest it's gotten very messy and there's just too much going on in the same functions trying to cater for every mode.
We want to refactor it all out into dedicated calls for:
SystemV ABI
Win64 Fastcall(RSP)
Win64 Fastcall(RBP)
Win64 VectorCall( using RSP )
We'd keep the existing code in place to handle all the 32bit options.
deleted
I tend to agree, I would like them all completely separated from language descriptor/type right through the actual source.
One of the biggest issues we've faced is that when you make a change to proc or invoke it's not side-effect free: a change to win64 fastcall can break vectorcall, and so on. That scares me long term; it would be so much more maintainable if it were separated (and also refactored) to account for removing all the needless / experimental options.
john,
There is an approach that should solve that problem, the OPTION CASEMAP style of directive put before and after a procedure could safely control differences between ABI style proc entry and exit and any custom designs you wanted to make available.
OPTION DO_IT_IN_AVX3
; write the code here
OPTION DEFAULT
This would allow you to have a sequence of options for procedure types with potentially different calling conventions.
OPTION SAFE
OPTION NONE
OPTION RSPCALL
OPTION RBPCALL
etc .....
This would allow you to modularise it and then you could safely add anything else you want to provide later.
Definitely in line with what I was thinking.
Over the weekend I'd tentatively started rolling up stackbase/frame:auto etc into a standard combination.. but decided to scrap it as it just had a "bad code smell" :)
I'm leaning more towards leaving those as is for now, and rather.. as you've suggested creating new options for each "mode" which may or may not imply the other settings automatically.
So for example
OPTION WIN64_FASTCALL_RSP
= (frame:auto, win64:11, stackbase:rsp, all procs are auto-decorated with FRAME) [local align 16 guaranteed, smart save to home-space, stack use optimized]
OPTION WIN64_FASTCALL_RBP
= (frame:auto, win64:11, stackbase:rbp, all procs are auto-decorated with FRAME) [local align 16 guaranteed, smart save to home-space, stack use optimized]
OPTION SYSTEM_V
= (64bit Nix* ABI)
OPTION VECTORCALL
etc
Quote from: johnsa on March 28, 2017, 07:12:42 PM
Definitely in line with what I was thinking.
Over the weekend I'd tentatively started rolling up stackbase/frame:auto etc into a standard combination.. but decided to scrap it as it just had a "bad code smell" :)
I'm leaning more towards leaving those as is for now, and rather.. as you've suggested creating new options for each "mode" which may or may not imply the other settings automatically.
So for example
OPTION WIN64_FASTCALL_RSP
= (frame:auto, win64:11, stackbase:rsp, all procs are auto-decorated with FRAME) [local align 16 guaranteed, smart save to home-space, stack use optimized]
OPTION WIN64_FASTCALL_RBP
= (frame:auto, win64:11, stackbase:rbp, all procs are auto-decorated with FRAME) [local align 16 guaranteed, smart save to home-space, stack use optimized]
OPTION SYSTEM_V
= (64bit Nix* ABI)
OPTION VECTORCALL
etc
Another possibility, allow the user to fabricate his own calling convention (which could be entered with the name CUSTOM either in the PROTO or PROC declarations).
incidentally, IDA Pro can deal with custom calling conventions.
And don't forget the PROLOG and EPILOG macros (https://technet.microsoft.com/en-us/library/4zc781yh(v=vs.80).aspx). Especially the last argument, userparms, is a very flexible tool. Not for n00bs, though.
Quote from: jj2007 on March 30, 2017, 04:50:53 AM
And don't forget the PROLOG and EPILOG macros (https://technet.microsoft.com/en-us/library/4zc781yh(v=vs.80).aspx). Especially the last argument, userparms, is a very flexible tool. Not for n00bs, though.
Looking more for a "super" INVOKE able to deal with CUSTOM calling conventions. For example, I deal frequently with the Borland Register calling convention and have to do all the work by hand which is a bit tiring and error prone.
Quote from: aw27 on March 30, 2017, 06:39:53 PM
Looking more for a "super" INVOKE able to deal with CUSTOM calling conventions.
Maybe you can get inspiration from line 829ff in \Masm32\MasmBasic\Res\JBasic.inc
jinvoke MACRO apiarg, args:VARARG
It's only about 350 lines to study :P
One of the changes we've made in 2.22+ is that we now have a built-in macro library. It's small for now, just a few macros to test the idea out, but over time we can extend it. The hope is that some assembler features are actually smarter to implement as macros rather than by modifying the raw code of hjwasm. This could be an example of that: we can create a custom invoke macro, but it's built-in, so you wouldn't notice the difference! (It relies on existing working logic as opposed to fiddling with core stuff that might introduce new regressions.)
So if we have some solid generic macros between us all that we'd like to see built in.. :) you know where to send them!
This is a very good idea john, get the core guts of the assembler stable so you don't have to keep messing with it trying to do the impossible. Once you have the stable core you can then at your leisure keep adding features without messing up the very complex stuff.
The suggestion I made before was to have a simple and safe option so that everything works reliably which would be the default. Then have the options to do things differently, no stack frame at all, SSE call, AVX/2 call, with or without ESP, once you have the core, adding flexibility comes at the risk of the user's coding ability getting it right.
Old 32 bit and earlier MASM had a default stack frame with the optional USES modifier which was safe and reliable but you could always turn it off for short fast code. I imagine there were folks who wrote their own prologue / epilogue code but there were not many. As long as you maintain the custom prologue / epilogue capacity, the brave or foolish can cook up their own if they really want to.
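As a reminder of what that looks like in 32 bit MASM syntax, here is a small sketch with the same copy routine written twice: once with the default stack frame plus USES, and once with the frame switched off. Both procedure names are invented for the illustration.

```asm
.686
.model flat, stdcall
option casemap:none
.code

; Default frame: the assembler emits push ebp / mov ebp, esp and
; saves and restores ESI and EDI because of the USES clause.
CopyFramed proc uses esi edi pDest:dword, pSrc:dword
    mov esi, pSrc
    mov edi, pDest
    mov eax, [esi]
    mov [edi], eax
    ret                         ; expands to the register pops, the frame teardown and ret 8
CopyFramed endp

; Frame turned off for short, fast code: arguments are addressed by hand.
option prologue:none
option epilogue:none
CopyBare proc pDest:dword, pSrc:dword
    mov ecx, [esp+8]            ; pSrc
    mov edx, [esp+4]            ; pDest
    mov eax, [ecx]
    mov [edx], eax
    ret 8                       ; stdcall: the callee removes both arguments
CopyBare endp
option prologue:PrologueDef
option epilogue:EpilogueDef
end
```

In the bare version nothing is pushed inside the procedure, so the ESP-relative offsets stay fixed for its whole body; that is what keeps this style safe as long as the code is short.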
What I did in 64 bit MASM was to write a custom prologue / epilogue and a matching "invoke" style notation macro that handles the routine API and procedure calls. I also wrote a simple reg-only call macro for the first 4 args, which does not accept more than 4 args and does not write to shadow space. One major difference, found after testing many things with the 64 bit ABI, is that passing arguments never alters the stack address: the calling technique does not use PUSH or POP at all. This means that you can pass any size of argument up to 64 bit (byte, word, dword and qword) and they all fit into the same stack addresses.
The macro that provides the "invoke" notation writes 64 bit arguments directly to the sequence of stack locations which with MASM turn up at the right locations in a written procedure from arg 5 upwards. The macro auto-fills up to the first four reg arguments into shadow space so with a written "proc" you just use the passed arguments in a normal high level way.
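The technique described above can be sketched by hand for a single call with five arguments. This is not the macro itself, only a hand-expanded guess at the kind of code such a macro would produce; FiveArgs is a hypothetical external function, and the fragment assumes plain ml64-style syntax with no OPTION WIN64 or STACKBASE settings active.

```asm
; Hypothetical callee taking five integer arguments; not part of the thread.
extern FiveArgs:proc

.code
CallFive proc frame
    sub rsp, 28h                    ; 20h shadow space + 8 for argument 5, keeps RSP 16-byte aligned
    .allocstack 28h
    .endprolog
    mov ecx, 1                      ; arguments 1 to 4 travel in RCX, RDX, R8, R9
    mov edx, 2
    mov r8d, 3
    mov r9d, 4
    mov qword ptr [rsp+20h], 5      ; argument 5 is written straight to its stack slot, no PUSH
    call FiveArgs
    add rsp, 28h
    ret
CallFive endp
end
```

Because RSP does not move between the prologue and the call, every RSP-relative offset stays valid throughout, which is exactly the property the 32 bit PUSH-based INVOKE lacked. Each argument occupies a full 8-byte slot regardless of its declared size, matching the remark above about byte, word, dword and qword arguments.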
Writing your own assembler would give you many more options than what can be done with macros and this in conjunction with your own inbuilt macro capacity should give you all of the extra calling conventions you are after.