Smallish update: 2.26 released.
Changes:
1) Reduced wasted stack-space for stackbase:RSP / win64:15 as identified while testing Nidud's alignment check examples.
2) Force stackbase:RBP as the default if nothing is specified (also based on testing Nidud's example: when no stackbase was specified, prologues were not generated as planned). This helps prevent unwanted side effects.
3) Added support for USES YMM registers on stackbase:RBP (RSP already featured support for YMM and ZMM).
Cheers,
John and Habran
I have something here for you guys.
option casemap:none
option frame:auto
OPTION WIN64:15
OPTION ARCH:SSE
OPTION STACKBASE:RSP
.code
proc1 proc public FRAME uses xmm6 xmm7 xmm8 xmm9 xmm10 rsi rdi r12 cols: qword, rows : qword
dec rows
.if rows>=1
invoke proc1, cols, rows
.endif
ret
proc1 endp
end
It disassembles to:
proc1:
000000013F611020 mov qword ptr [rsp+8],rcx
000000013F611025 mov qword ptr [rsp+10h],rdx
000000013F61102A push rsi
000000013F61102B push rdi
000000013F61102C push r12
000000013F61102E sub rsp,70h
000000013F611032 movdqa xmmword ptr [rsp+20h],xmm6
000000013F611038 movdqa xmmword ptr [rsp+30h],xmm7
000000013F61103E movdqa xmmword ptr [rsp+40h],xmm8
000000013F611045 movdqa xmmword ptr [rsp+50h],xmm9
000000013F61104C movdqa xmmword ptr [rsp+60h],xmm10
000000013F611053 dec qword ptr [rsp+98h]
000000013F61105B cmp qword ptr [rsp+98h],1
000000013F611064 jb proc1+5Bh (13F61107Bh)
000000013F611066 mov rcx,qword ptr [rsp+90h]
000000013F61106E mov rdx,qword ptr [rsp+98h]
000000013F611076 call proc1 (13F611020h)
000000013F61107B movdqa xmm6,xmmword ptr [rsp-30h]
000000013F611081 movdqa xmm7,xmmword ptr [rsp-20h]
000000013F611087 movdqa xmm8,xmmword ptr [rsp-10h]
000000013F61108E movdqa xmm9,xmmword ptr [rsp]
000000013F611094 movdqa xmm10,xmmword ptr [rsp+10h]
000000013F61109B add rsp,70h
000000013F61109F pop r12
000000013F6110A1 pop rdi
000000013F6110A2 pop rsi
000000013F6110A3 mov rsi,qword ptr [rsp+18h]
000000013F6110A8 mov rdi,qword ptr [rsp+20h]
000000013F6110AD ret
As you can see, the two mov instructions after the pops overwrite the just-restored rsi and rdi, so both registers end up corrupted.
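To make the offsets in the listing easier to check, here is a rough model of the frame the prologue sets up. It is only a sketch: `frame_layout` is a hypothetical helper, and the layout rules (8-byte pushes, a 20h-byte shadow area at the bottom of the frame, 16-byte XMM slots above it) are read off the disassembly above, not taken from HJWASM's source.

```python
# Rough model of the frame built by the prologue in the listing above.
# Assumptions (inferred from the disassembly, not from HJWASM itself):
# 8-byte GPR pushes, a 20h-byte shadow area at the bottom of the frame,
# and one 16-byte slot per saved XMM register directly above it.

SHADOW_SPACE = 0x20   # home space reserved for callees (Win64 ABI)
GPR_SIZE = 8
XMM_SIZE = 16


def frame_layout(num_gprs, num_xmms, num_params):
    """Return (sub_rsp, xmm_offsets, param_offsets).

    All offsets are relative to RSP once the prologue has finished.
    """
    pushed = num_gprs * GPR_SIZE
    alloc = SHADOW_SPACE + num_xmms * XMM_SIZE
    # At entry RSP is 8 mod 16 because the return address was just pushed.
    # Pad the allocation so RSP ends up 16-byte aligned for movdqa.
    if (8 + pushed + alloc) % 16 != 0:
        alloc += 8
    xmm_offsets = [SHADOW_SPACE + i * XMM_SIZE for i in range(num_xmms)]
    # The parameter home slots start just above the return address.
    first_param = alloc + pushed + 8
    param_offsets = [first_param + i * 8 for i in range(num_params)]
    return alloc, xmm_offsets, param_offsets


if __name__ == "__main__":
    # proc1: uses rsi rdi r12 (3 GPRs), xmm6..xmm10 (5 XMMs), 2 parameters
    alloc, xmm, params = frame_layout(3, 5, 2)
    print(hex(alloc), [hex(o) for o in xmm], [hex(o) for o in params])
```

For proc1 this gives sub rsp,70h, saves at rsp+20h through rsp+60h and parameters at rsp+90h and rsp+98h, which matches the prologue. The epilogue in the listing reads the XMM registers back from rsp-30h through rsp+10h, i.e. 50h below where they were saved, so the restores are off as well as the two stray mov instructions.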
Got it, working on it right now. Did you happen to check whether it was all fine with RBP? (I gave it a quick check with stackbase:rbp and that seems OK.)
OK, the packages on the site and git are updated. Please try again.
Hi johnsa,
Thanks for the new release. The rbp-based stack frame works fine:
option casemap:none
option win64:3
Please grab the packages again, there were a few problems with stackbase:rsp.
Thanks,
John
I warned you guys some time ago about the complexity of what you are trying to achieve. I've never actually tried HJWASM, but I can tell you that from a decision-table point of view you are going 6x6 deep, or even deeper, as you introduce more and more parameters into your calling conventions. Worse than that, you're going to need 2 to 4 different decision tables. By the time you manage to 'solve' it (hopefully), it will have become far too complicated and not at all portable.
And by the way, you guys haven't given enough credit to aw27. He may sound hostile, but he's the one pointing out all the right things to you, and he has saved you years of future bugs. If I were you, I would suggest aw27 be part of the HJWASM team.
Quote from: coder on April 13, 2017, 03:23:22 PM
If I were you, I would suggest aw27 be part of the HJWASM team.
No way, I have already enough cans of worms to deal with. :badgrin:
He's been a big help pointing out stuff! All credit due. :t
But really that's the whole point of something like this: if people don't use it and don't tell us when things need fixing, they won't get fixed. The difference in our approach, and hopefully it's a beneficial one, is that we're quite agile and can often release an update within an hour or less of spotting something. I don't think any other compiler/assembler has such a fast turnaround. That being said, the updates often fix things that shouldn't really have been issues in the first place, but that is the state of affairs with something this complex and without a massive set of automated regression tests. The ones that jwasm had are no longer fit for purpose, and it really requires a massive effort to create a set of all-encompassing tests.
We've refactored a lot of the code around these options and calling conventions into totally separate code paths (since 2.24) specifically for that reason. The functions were unmanageable and tried to handle all configurations in one place, which meant that a slight change to address an RSP issue broke RBP, and so on. Now that they're separated they won't cause regressions in each other. So I agree it's very complex, but I don't think it's in any way non-portable. We're not adding anything that isn't part of the fundamental requirements of the standards and calling conventions as they exist. Either you support them fully and properly, or not at all, in which case you might as well just use ML64.
Furthermore, the "complicated" that you refer to is par for the course in writing a compiler/assembler. The point is to move the "complicated" into the assembler so that the programmer/user doesn't have to deal with it when writing code, and that is a trade-off I'm happy to make. I for one just want to use PROCs, have things aligned, access locals/parameters and get on with my code, knowing that it "just works" and is optimal (with a few options that I can change if required, but kept to a minimum).
Quote from: coder on April 13, 2017, 03:23:22 PM
I've never actually tried HJWASM
I have been working with JWasm and HJWasm for more than seven years now (proof (http://www.masmforum.com/board/index.php?topic=10698.msg105231#msg105231)), on a daily basis. I am really grateful that Johnsa & Habran develop this tool further, especially since M$ ML has dropped some essential features in the 64-bit version. I am also grateful that aw27 keeps posting helpful advice on the "worms" in the can.
Since I have gone beyond the hello world & little snippets phase several years ago, I am grateful that my sources assemble in one or two seconds instead of 5 or 10 seconds, as with MASM. Kudos to the HJWasm team for their excellent work. And yes, try to keep it simple. Experienced coders can use their own prolog macros, if they believe they need exotic options.
Quote from: johnsa on April 13, 2017, 07:39:14 PM
He's been a big help pointing out stuff! All credit due. :t
Thanks. :biggrin:
There is another one to keep you busy then.
proc1 proc public FRAME ValuePtr: ptr
movq xmm0, qword ptr [rcx]
ret
proc1 endp
Disassembles to:
proc1:
000000013F8B1020 movd xmm0,dword ptr [rcx]
000000013F8B1024 ret
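For anyone comparing the two instructions at the byte level, here is a small sketch of the standard encodings involved. The helper names are made up for illustration, it only handles xmm0..xmm7 with a plain base register and no displacement, and it says nothing about which bytes HJWASM actually emitted, since the listing does not show them.

```python
# Sketch of the two encodings involved (helper names are hypothetical).
# Restricted to xmm0..xmm7 and a simple [base] operand with no REX prefix,
# no SIB byte and no displacement.

RAX, RCX, RDX, RBX, RSI, RDI = 0, 1, 2, 3, 6, 7  # rsp/rbp need extra bytes


def modrm(mod, reg, rm):
    """Pack the ModRM byte from its three fields."""
    return (mod << 6) | (reg << 3) | rm


def _check(xmm, base):
    if not 0 <= xmm <= 7:
        raise ValueError("only xmm0..xmm7 are handled in this sketch")
    if base not in (RAX, RCX, RDX, RBX, RSI, RDI):
        raise ValueError("base register not handled in this sketch")


def movq_xmm_m64(xmm, base):
    """MOVQ xmm, m64 is encoded as F3 0F 7E /r."""
    _check(xmm, base)
    return bytes([0xF3, 0x0F, 0x7E, modrm(0b00, xmm, base)])


def movd_xmm_m32(xmm, base):
    """MOVD xmm, m32 is encoded as 66 0F 6E /r."""
    _check(xmm, base)
    return bytes([0x66, 0x0F, 0x6E, modrm(0b00, xmm, base)])


if __name__ == "__main__":
    print(movq_xmm_m64(0, RCX).hex(" "))  # movq xmm0, qword ptr [rcx]
    print(movd_xmm_m32(0, RCX).hex(" "))  # movd xmm0, dword ptr [rcx]
```

Both forms are four bytes long for this operand, so the addresses in the listing (1020h to 1024h) are consistent with either one; only the opcode bytes tell them apart.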
Quote from: jj2007 on April 13, 2017, 07:45:33 PM
I am also grateful that aw27 keeps posting helpful advice on the "worms" in the can.
I am not specifically beta testing HJWASM. I still compile primarily with JWASM, then I try HJWASM. When the results diverge, I investigate. Simple.
Will fix the movq now and update to 2.26 r2 :)
Fixed, new packages up and git updated. Still 2.26 dated 13th April.
deleted
I don't consider it "that" strange. If this were a funded professional venture, I wouldn't expect users to be compiling our regression tests for us; I would expect us to release new versions less frequently, backed by a large set of generated test cases. Test cases created by users are definitely valuable, but in terms of test coverage it's pretty poor to wait for something to go wrong and for a case to be created for it.
(Note I did say to create a set of "ALL ENCOMPASSING" tests would be a massive effort.. as opposed to just a few specific cases).
Given that, it's all we have for now so I agree we should catalog all the ones we get in.
For example, what we should have is a test library of every imaginable combination of PROC, arguments, USES, stack frame, etc. (obviously within reason). The problem is that to verify its accuracy you need a baseline for comparison, which we won't have until this is all 100% stable.
I think we're reaching a point where I feel comfortable that we've added enough "new" stuff, so what I'd really like is to leave it alone to stabilize and only release bug fixes for a while. Once that is done, I will look to generate a full set of new regression tests that verify stack setup, alignments, locals, prologue/epilogue generation, uses, etc., as that is really where the source of all these issues has been.
For the proc-related stuff you really want to do the comparison at a binary level, and that is difficult if you don't have something you know to be correct to compare against. So I'm thinking that by letting it stabilise we can reach a point, albeit manually, where we're sufficiently satisfied to use what it generates at that point as the baseline for future comparison.
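The baseline idea could be sketched roughly like this. It is only an illustration: `compare_to_baseline`, the directory layout and the `*.bin` pattern are assumptions, and the script does not run the assembler itself; it just compares whatever output files are already on disk against a hand-verified set.

```python
# Minimal golden-file check (names and layout are hypothetical).
# Every file in the baseline directory must exist in the output directory
# with identical contents; anything else is reported as a failure.
import hashlib
from pathlib import Path


def digest(path):
    """SHA-256 of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def compare_to_baseline(output_dir, baseline_dir, pattern="*.bin"):
    """Return the sorted names of baseline files that are missing from
    output_dir or whose bytes differ from the baseline copy."""
    failures = []
    for expected in sorted(Path(baseline_dir).glob(pattern)):
        actual = Path(output_dir) / expected.name
        if not actual.exists() or digest(actual) != digest(expected):
            failures.append(expected.name)
    return failures


if __name__ == "__main__":
    # Hypothetical directory names; adjust to wherever the files live.
    for name in compare_to_baseline("out", "baseline"):
        print("MISMATCH:", name)
```

One design note: object files can embed a timestamp in their header, so a byte-for-byte comparison probably works best on flat binary output or on the extracted code section only.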
We went through a phase where there were a lot of AVX/AVX2 and EVEX encoding fixes, but those seem to have settled down now, and we've got a test suite for them.
deleted
Well, specifically for the issues we've been having lately, we need a test case that includes every single combination of proc type, stackbase and win64 setting, with procs that have a range of parameters, types, uses and locals,
and we need to cover things like passing structs by value and by reference, immediates, float literals, string literals and so on. So I would think we need a PROC test case with about 100 or so combinations for rsp, rbp and System V (64-bit only).
Then we need a totally different set for the vectorcall combinations, so another 50-100 there.
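A combination matrix like this is easy to enumerate with a script. The sketch below is only an assumption about how it might be organised: the value lists are placeholders (the win64 values are just the two that appear in this thread), `generate_cases` is a made-up name, and some combinations may not be valid for the assembler and would need filtering.

```python
# Enumerate PROC test configurations (all lists are illustrative placeholders).
from itertools import product

STACKBASES = ["rsp", "rbp"]
WIN64_FLAGS = [3, 15]  # only the values mentioned in this thread
USES_SETS = [
    [],
    ["rsi", "rdi"],
    ["rsi", "rdi", "r12", "xmm6", "xmm7"],
]
PARAM_COUNTS = [0, 2, 5]


def generate_cases():
    """Return one dict per combination, including the option lines and a
    PROC header written in the same style as the snippets in this thread."""
    cases = []
    combos = product(STACKBASES, WIN64_FLAGS, USES_SETS, PARAM_COUNTS)
    for n, (base, win64, uses, nparams) in enumerate(combos):
        params = ", ".join(f"p{i}:qword" for i in range(nparams))
        uses_clause = "uses " + " ".join(uses) if uses else ""
        parts = [f"case{n}", "proc", "FRAME", uses_clause, params]
        cases.append({
            "name": f"case{n}",
            "options": [f"option stackbase:{base}", f"option win64:{win64}"],
            "header": " ".join(p for p in parts if p),
        })
    return cases


if __name__ == "__main__":
    for case in generate_cases():
        print("; ".join(case["options"]), "|", case["header"])
```

With these placeholder lists the product is 2 x 2 x 3 x 3 = 36 cases; adding locals, parameter types and struct/literal arguments as further axes would quickly reach the 100 or so combinations mentioned above.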
It would probably take a few days to put together. BUT, back to my point: apart from hjwasm itself, what can produce the binary output of these combinations to test against? The product has to be at a stable point before we can generate the binary for comparison, and even then you'd need to go through each one and verify it by hand first to be sure the base is sound.
If someone feels like taking up the challenge to create the test-cases for this :bgrin:
I will get around to it soon if not.
deleted
I'd certainly like to include any regression tests that are in existence! :t
I've created a few on my side for some of the new features, but they've all been a bit sporadic to be honest. I'd rather put in the effort and make up a set now. Quite a few cases simply didn't exist before, so they need new tests anyway; now is as good a time as any!
deleted