HJWasm 2.26 release

Started by johnsa, April 12, 2017, 11:16:34 PM


johnsa

Smallish update: 2.26 released.

Changes:

1) Reduced wasted stack-space for stackbase:RSP / win64:15 as identified while testing Nidud's alignment check examples.
2) Force stackbase:RBP as the default if nothing is specified (also based on testing Nidud's example: when no stackbase was specified, prologues were not generated as planned). This helps prevent unwanted side effects.
3) Added support for USES YMM registers on stackbase:RBP (RSP already featured support for YMM and ZMM).

Cheers,
John and Habran


aw27

I have something here for you guys.

option casemap:none
option frame:auto
OPTION WIN64:15
OPTION ARCH:SSE
OPTION STACKBASE:RSP

.code

proc1 proc public FRAME uses xmm6 xmm7 xmm8 xmm9 xmm10 rsi rdi r12 cols: qword, rows : qword
   dec rows
   .if rows>=1
      invoke proc1, cols, rows
   .endif
   ret
proc1 endp

end

It disassembles to:
proc1:
000000013F611020  mov         qword ptr [rsp+8],rcx 
000000013F611025  mov         qword ptr [rsp+10h],rdx 
000000013F61102A  push        rsi 
000000013F61102B  push        rdi 
000000013F61102C  push        r12 
000000013F61102E  sub         rsp,70h 
000000013F611032  movdqa      xmmword ptr [rsp+20h],xmm6 
000000013F611038  movdqa      xmmword ptr [rsp+30h],xmm7 
000000013F61103E  movdqa      xmmword ptr [rsp+40h],xmm8 
000000013F611045  movdqa      xmmword ptr [rsp+50h],xmm9 
000000013F61104C  movdqa      xmmword ptr [rsp+60h],xmm10 
000000013F611053  dec         qword ptr [rsp+98h] 
000000013F61105B  cmp         qword ptr [rsp+98h],1 
000000013F611064  jb          proc1+5Bh (13F61107Bh) 
000000013F611066  mov         rcx,qword ptr [rsp+90h] 
000000013F61106E  mov         rdx,qword ptr [rsp+98h] 
000000013F611076  call        proc1 (13F611020h) 
000000013F61107B  movdqa      xmm6,xmmword ptr [rsp-30h] 
000000013F611081  movdqa      xmm7,xmmword ptr [rsp-20h] 
000000013F611087  movdqa      xmm8,xmmword ptr [rsp-10h] 
000000013F61108E  movdqa      xmm9,xmmword ptr [rsp] 
000000013F611094  movdqa      xmm10,xmmword ptr [rsp+10h] 
000000013F61109B  add         rsp,70h 
000000013F61109F  pop         r12 
000000013F6110A1  pop         rdi 
000000013F6110A2  pop         rsi 
000000013F6110A3  mov         rsi,qword ptr [rsp+18h] 
000000013F6110A8  mov         rdi,qword ptr [rsp+20h] 
000000013F6110AD  ret 

As you can see, the generated epilogue corrupts the rsi and rdi registers: after popping them, it overwrites both from [rsp+18h] and [rsp+20h].
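
For comparison, here is a sketch of what an epilogue mirroring the prologue above would look like. It is hand-written from the listing, not actual HJWasm output: the XMM registers are reloaded from the same offsets they were saved to (20h to 60h, where the listing reads from -30h to +10h), and nothing touches rsi or rdi after they are popped.

```asm
; Hypothetical epilogue, derived by reversing the prologue in the listing above.
; Not taken from any HJWasm build.
   movdqa xmm6,  xmmword ptr [rsp+20h]   ; same slots the prologue stored to
   movdqa xmm7,  xmmword ptr [rsp+30h]
   movdqa xmm8,  xmmword ptr [rsp+40h]
   movdqa xmm9,  xmmword ptr [rsp+50h]
   movdqa xmm10, xmmword ptr [rsp+60h]
   add    rsp, 70h                       ; undo "sub rsp,70h"
   pop    r12                            ; pops in reverse push order
   pop    rdi
   pop    rsi
   ret                                   ; no further writes to rsi/rdi
```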


johnsa

Got it, working on it right now. Did you happen to check whether it was all fine with RBP? (I gave it a quick check with stackbase:rbp and that seems OK.)

johnsa

OK, the packages on the site and git are updated. Please try again.

Vortex

Hi johnsa,

Thanks for the new release. The rbp-based stack frame works fine:

option casemap:none
option win64:3


johnsa

Please grab the packages again, there were a few problems with stackbase:rsp.

Thanks,
John

coder

I warned you guys some time ago about the complexity of what you are trying to achieve. I've never actually tried HJWASM, but I can tell you that from a decision-table point of view you are going 6x6 deep or even deeper as you introduce more and more parameters into your calling conventions. Worse than that, you're going to need 2 to 4 different decision tables. By the time you manage to 'solve' it (hopefully), it will have become way too complicated and extremely non-portable.

And btw, you guys haven't given enough credit to aw27. He may sound hostile, but he's the one pointing out all the right things and saving you years of future bugs. If I were you, I would suggest aw27 be part of the HJWASM team.

aw27

Quote from: coder on April 13, 2017, 03:23:22 PM
If I were you, I would suggest aw27 be part of the HJWASM team.
No way, I already have enough cans of worms to deal with.  :badgrin:

johnsa

He's been a big help pointing out stuff! All credit due.  :t

But really that's the whole point of something like this: if people don't use it and don't tell us when things need fixing, they won't get fixed. The difference in our approach, and hopefully it's beneficial, is that we're quite agile and can often release updates an hour or less after spotting something; I don't think any other compiler/assembler has such a fast turnaround. That being said, the updates often fix things that shouldn't really be issues in the first place, but that is the state of affairs with something this complex and without a massive set of automated regression tests. The ones that JWasm had are no longer fit for purpose, and it really requires a massive effort to create a set of all-encompassing tests.

We've refactored a lot of the code around these options and calling conventions into totally separate code paths (since 2.24) specifically for that reason. The functions were unmanageable and tried to handle all configurations in one, which meant a slight change to address an RSP issue broke RBP, and so on. Now that they're separated, they won't cause regressions in each other. So I agree it's very complex, but I don't think it's in any way non-portable. We're not adding anything which isn't part of the fundamental requirements of the standards and calling conventions as they exist. Either you support it fully and properly, or not at all, in which case you might as well just have ML64.

Furthermore, the "complicated" that you refer to is par for the course in writing a compiler/assembler. The point is to move the "complicated" into the assembler so that the programmer/user doesn't have to deal with it when writing code; that is a trade-off I'm happy to make. I for one just want to use PROCs, have things aligned, access locals/parameters and get on with my code knowing that it "just works" and is optimal (with a few options that I can change if required, but kept to a minimum).

jj2007

Quote from: coder on April 13, 2017, 03:23:22 PM
I've never actually tried HJWASM

I have been working with JWasm and HJWasm for more than seven years now (proof), on a daily basis. I am really grateful that Johnsa & Habran develop this tool further, especially since M$ ML has dropped some essential features in the 64-bit version. I am also grateful that aw27 keeps posting helpful advice on the "worms" in the can.

Since I went beyond the hello-world and little-snippets phase several years ago, I am grateful that my sources assemble in one or two seconds instead of 5 or 10 seconds, as with MASM. Kudos to the HJWasm team for their excellent work. And yes, try to keep it simple. Experienced coders can use their own prolog macros if they believe they need exotic options.

aw27

Quote from: johnsa on April 13, 2017, 07:39:14 PM
He's been a big help pointing out stuff! All credit due.  :t

Thanks.  :biggrin:

There is another one to keep you busy then.

proc1 proc public FRAME ValuePtr: ptr
   movq xmm0, qword ptr [rcx]
   ret
proc1 endp

It disassembles to:
proc1:
000000013F8B1020  movd        xmm0,dword ptr [rcx] 
000000013F8B1024  ret 
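
So the 64-bit load is silently assembled as a 32-bit one. Until it is fixed, a possible workaround is to emit the instruction bytes by hand. This is only a sketch and assumes the same options as the first example in this thread; F3 0F 7E 01 is the Intel encoding of movq xmm0, qword ptr [rcx], whereas the movd shown above is 66 0F 6E 01.

```asm
; Workaround sketch: hand-encoded MOVQ, assuming the assembler emits db bytes unchanged.
proc1 proc public FRAME ValuePtr: ptr
   db 0F3h, 00Fh, 07Eh, 001h   ; movq xmm0, qword ptr [rcx]  (F3 0F 7E /r, ModRM=01h)
   ret
proc1 endp
```

Both encodings are four bytes long, so the code size matches the listing above; only the operand width differs.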


aw27

Quote from: jj2007 on April 13, 2017, 07:45:33 PM
I am also grateful that aw27 keeps posting helpful advice on the "worms" in the can.

I am not specifically beta testing HJWASM; I still compile primarily with JWASM and then try HJWASM. When the results diverge, I investigate. Simple.

johnsa

Will fix the movq now and update to 2.26 r2 :)

johnsa

Fixed, new packages up and git updated. Still 2.26 dated 13th April.

nidud

deleted