Crashes in HJWASM but works well in JWASM

Started by aw27, March 01, 2017, 04:37:40 AM


jj2007

Quote from: aw27 on March 16, 2017, 07:22:50 PM
Quote from: jj2007 on March 16, 2017, 06:08:42 AM
Would MyFunc move xmm0 into the stack, or use it directly?

xmm0 will not go to the stack, is always passed as is.

My suggestion to use xmm registers in an INVOKE statement would involve 2 possibilities:
1) Just place them there in place of dummy placeholder parameters.
2) Load the xmm registers from whatever you put on the INVOKE command line. This is actually what the INVOKE already does for parameters that go into general purpose registers.

BTW, since you are a specialist in macros you know that macros can take xmm registers as parameters.
Well, thinking better, macros can take literally everything as parameters.  ::)

I've given option 1) a try:

include \Masm32\MasmBasic\Res\JBasic.inc ; ## console demo, builds in 32- or 64-bit mode with ML, AsmC, JWasm, HJWasm ##
.code
MyFunc proc <cb xm3> _xmm1, _xmm2, _xmm3, arg4, arg5, arg6 ; callback, 3 xmm regs passed
  nop
  usedeb=1
  deb 4, "MyFunc, original xmm regs", o:xmm1, o:xmm2, o:xmm3 ; o: means oword size
  deb 4, "MyFunc, normal args in stack", x:arg4, x:arg5, x:arg6 ; x: means hex
  usedeb=0
  ret
MyFunc endp

Init ; OPT_64 1 ; put 0 for 32 bit, 1 for 64 bit assembly
  movaps xmm1, Oword16(1A1111111B1111111C1111111D111111h)
  movaps xmm2, Oword16(2A2222222B2222222C2222222D222222h)
  movaps xmm3, Oword16(3A3333333B3333333C3333333D333333h)
;   int 3
  jinvoke MyFunc, xmm1, xmm2, xmm3, 44444444h, 55555555h, 66666666h
  Inkey Chr$("This code was assembled with ", @AsmUsed$(1), " in ", jbit$, "-bit format")
EndOfCode


Output:
MyFunc, original xmm regs
o:xmm1  1a111111 1b111111 1c111111 1d111111h
o:xmm2  2a222222 2b222222 2c222222 2d222222h
o:xmm3  3a333333 3b333333 3c333333 3d333333h
MyFunc, normal args in stack
x:arg4  44444444h
x:arg5  55555555h
x:arg6  66666666h
This code was assembled with ml64 in 64-bit format


Builds also as 32-bit code with HJWasm & friends. Under the hood (the 64-bit version):
00000001400018E2 | CC                            | int3                                          |
00000001400018E3 | 41 BA 66 66 66 66             | mov r10d, 66666666                            |
00000001400018E9 | 4C 89 54 24 28                | mov qword ptr ss:[rsp+28], r10                |
00000001400018EE | 41 BA 55 55 55 55             | mov r10d, 55555555                            |
00000001400018F4 | 4C 89 54 24 20                | mov qword ptr ss:[rsp+20], r10                |
00000001400018F9 | 41 B9 44 44 44 44             | mov r9d, 44444444                             | r9d:"@(\\w"
00000001400018FF | E8 FE F6 FF FF                | call 140001002                                |

...
0000000140001002 | 55                            | push rbp                                      |
0000000140001003 | 48 8B EC                      | mov rbp, rsp                                  |
0000000140001006 | 4C 89 4D 28                   | mov qword ptr ss:[rbp+28], r9                 |
000000014000100A | 48 81 EC 90 00 00 00          | sub rsp, 90                                   |
0000000140001011 | 90                            | nop                                           |
0000000140001012 | E8 49 0A 00 00                | call <jdebP>         (the deb macro)
...
00000001400018C2 | C9                            | leave                                         |
00000001400018C3 | C3                            | ret                                           |


The number of xmm regs passed can be 1-4, in this case 3: see <cb xm3>

This is what can be done with macros. Your option 2) is more difficult to realise, because the PROLOG macro doesn't give you the list of arguments. It could probably be done in HJWasm itself, though. My example assembles even with ML64, but at that point, one could drop support for that one.

Project attached, requires MasmBasic of today.

johnsa

Just to keep you all updated,

There is a new branch (v2.21) on our git repository.

I've already completed all the arch sse/avx stuff and that is in that branch and tested.

You can now use:

OPTION ARCH:SSE
OPTION ARCH:AVX

or the command line switches -archSSE and -archAVX

and ANY generated code that uses movss, movsd, movd, movq, movaps, movdqa, movdqu or movups (or their AVX counterparts) will now be emitted in the correct form for the selected architecture, AVX or SSE.

AVX is set as the default.
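
As a rough illustration of how the setting might appear in a source file (a sketch only: the instruction pairs named in the comments follow the description above and are not taken from an actual listing):

```asm
; Select the instruction family for assembler-generated moves
; (the command line switches -archSSE / -archAVX do the same job).
OPTION ARCH:SSE        ; generated code uses the SSE forms, e.g. movaps, movdqa
; OPTION ARCH:AVX      ; generated code uses the AVX forms, e.g. vmovaps, vmovdqa (the default)
```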

I'm now working on adding the XMM arguments to INVOKE, so that should be done soon while Habran is investigating the stack alignment issues.

Cheers
John


aw27

Quote from: jj2007 on March 16, 2017, 11:50:25 PM
This is what can be done with macros. Your option 2) is more difficult to realise, because the PROLOG macro doesn't give you the list of arguments. It could probably be done in HJWasm itself, though. My example assembles even with ML64, but at that point, one could drop support for that one.
Project attached, requires MasmBasic of today.

I was very curious to see the MasmBasic in action but after installing and running ml64, I just got errors. Am I missing something?

Microsoft (R) Macro Assembler (x64) Version 10.00.40219.01
Copyright (C) Microsoft Corporation.  All rights reserved.

Assembling: PassXmmRegs.asm
h:\Masm32\MasmBasic\Res\JBasic.inc(47) : error A2008:syntax error : .
h:\Masm32\MasmBasic\Res\JBasic.inc(48) : error A2008:syntax error : .
h:\Masm32\MasmBasic\Res\JBasic.inc(49) : error A2008:syntax error : .
*** using Res\JBasic32.lib ***
h:\Masm32\MasmBasic\Res\JBasic.inc(62) : error A2008:syntax error : rax
h:\Masm32\MasmBasic\Res\JBasic.inc(63) : error A2008:syntax error : rcx
h:\Masm32\MasmBasic\Res\JBasic.inc(64) : error A2008:syntax error : rdx
h:\Masm32\MasmBasic\Res\JBasic.inc(65) : error A2008:syntax error : rsi
h:\Masm32\MasmBasic\Res\JBasic.inc(66) : error A2008:syntax error : rdi
h:\Masm32\MasmBasic\Res\JBasic.inc(67) : error A2008:syntax error : rbx
h:\Masm32\MasmBasic\Res\JBasic.inc(68) : error A2008:syntax error : rbp
h:\Masm32\MasmBasic\Res\JBasic.inc(69) : error A2008:syntax error : rsp
h:\Masm32\MasmBasic\Res\JBasic.inc(558) : fatal error A1000:cannot open file : \Masm32\MasmBasic\Res\DualWin.inc


johnsa

update..

The rogue sub rsp,8 reported by aw27 is fixed.
Stack alignment to 16 is now working for the win64:6, 7 and 15 modes.

Busy adding xmm support to invoke now ;)

If you're all really lucky and I don't need to take a nap.. 2.21 might still be released today :)

jj2007

Quote from: aw27 on March 17, 2017, 01:54:25 AM
Am I missing something?
...
fatal error A1000:cannot open file : \Masm32\MasmBasic\Res\DualWin.inc

Open the PassXmmRegs.asc in \Masm32\MasmBasic\RichMasm.exe and hit F6; the editor should show you a MessageBox "JBasic installed" - did you see that one?

Afterwards, you should have two new files:
C:\Masm32\MasmBasic\Res\pt.inc
C:\Masm32\MasmBasic\Res\DualWin.inc

I just tested with a fresh installation, and on C: something blocks the download of HJWasm32.
However, \Masm32\bin\ml64.exe should work fine, too - just insert OPT_Assembler ML under the EndOfCode.

aw27

Quote from: jj2007 on March 17, 2017, 02:45:04 AM
Open the PassXmmRegs.asc in \Masm32\MasmBasic\RichMasm.exe and hit F6; the editor should show you a MessageBox "JBasic installed" - did you see that one?
Yes, I see.
Quote
Afterwards, you should have two new files:
C:\Masm32\MasmBasic\Res\pt.inc
C:\Masm32\MasmBasic\Res\DualWin.inc
I don't see those. I just noticed a batch file, bldallRM.bat.
I ran it and got:

**** 64-bit assembly ****


*** Assemble, link and run PassXmmRegs ***

*** Assemble using \masm32\bin\ml64 /c /Zp8  tmp_file.asm ***
Microsoft (R) Macro Assembler (x64) Version 10.00.40219.01
Copyright (C) Microsoft Corporation.  All rights reserved.

Assembling: tmp_file.asm
MASM : fatal error A1000:cannot open file : tmp_file.asm
*** Assembly Error ***

johnsa

xmm registers are now supported in invoke for real4/real8 types.



newproc proc FRAME arg1:qword, arg2:real4
movss xmm3,arg2
ret
newproc endp

invoke newproc, rax, xmm4                     ; xmm1 set to xmm4 via movaps
invoke newproc, rax, floatVar                   ; xmm1 set from mem via movd
invoke newproc, rax, xmm1 ; no op, as xmm1 == xmm1
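
For readers less familiar with the Win64 calling convention, here is a hand-written sketch of what the three invoke lines above amount to, each shown in isolation. It assumes the standard Microsoft x64 convention (first integer argument in rcx, a floating-point second argument in xmm1), ARCH:SSE, and that floatVar is a REAL4 variable. The exact instructions HJWasm emits may differ; with the default ARCH:AVX the moves would be the v-prefixed forms. Shadow-space and stack-alignment handling are left out.

```asm
; invoke newproc, rax, xmm4
    mov    rcx, rax          ; arg1 (qword) goes to the first integer register
    movaps xmm1, xmm4        ; arg2 (real4) goes to the second float register
    call   newproc

; invoke newproc, rax, floatVar
    mov    rcx, rax
    movd   xmm1, floatVar    ; arg2 loaded from memory (floatVar assumed REAL4)
    call   newproc

; invoke newproc, rax, xmm1
    mov    rcx, rax          ; arg2 is already in xmm1, so no move is needed
    call   newproc
```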


jj2007

Quote from: aw27 on March 17, 2017, 03:21:22 AM
MASM : fatal error A1000:cannot open file : tmp_file.asm

RichMasm deletes the tmp_file.asm after a successful build. When you hit F6 again in RichMasm, still no success?

Sorry for that - I sent you a PM.

Quote from: johnsa on March 17, 2017, 03:55:03 AM
xmm registers are now supported in invoke for real4/real8 types.

You should respect the speed limits, johnsa :eusa_naughty:
;)

johnsa

Speed limit is more of a "suggestion" to me ;)

All going to plan it should still be finished tonight.

aw27

Quote from: johnsa on March 17, 2017, 03:55:03 AM
xmm registers are now supported in invoke for real4/real8 types.



newproc proc FRAME arg1:qword, arg2:real4
movss xmm3,arg2
ret
newproc endp

invoke newproc, rax, xmm4                     ; xmm1 set to xmm4 via movaps
invoke newproc, rax, floatVar                   ; xmm1 set from mem via movd
invoke newproc, rax, xmm1 ; no op, as xmm1 == xmm1



Looking forward to testing it ASAP.  :icon14:

johnsa

Hi all,

We're so slow.. only ready now ;)

V2.21 is up on the site; we've run all our testing and regression scripts and all seems to be in order.
The list of changes is in the documentation and as discussed previously :

real4/real8 via xmm support on invoke
fixed align 16 on stack local
fixed rogue sub rsp,8
fixed rbp relative stack locations
added command line switches -archSSE, -archAVX and OPTION ARCH:AVX|SSE to switch between generating sse or avx instructions in generated code
refactored the entire code-base to make use of this arch setup instead of hardcoded vmovdqa/movdqu etc..

Give it a test :)

aw27

Quote from: johnsa on March 17, 2017, 08:20:30 PM
real4/real8 via xmm support on invoke
It works! Actually it always worked in JWASM as well and I never noticed. :greenclp:

I found no bugs so far, but will try harder. Compiles very fast and code is about 2% smaller.


jj2007

Quote from: aw27 on March 18, 2017, 12:02:15 AM
I found no bugs so far, but will try harder

That is the right spirit :icon_mrgreen:

So far everything fine with my big sources. MB assembles in
  2.2  AsmC
  2.8  HJWasm64
  3.2  HJWasm32
  7.0  Microsoft MASM 6.15 & 10.0

seconds. This is the first time the 64-bit version is faster. Did you test new compiler optimisations?

aw27

Quote from: jj2007 on March 18, 2017, 12:23:01 AM
Quote from: aw27 on March 18, 2017, 12:02:15 AM
I found no bugs so far, but will try harder

That is the right spirit :icon_mrgreen:

So far everything fine with my big sources. MB assembles in
  2.2  AsmC
  2.8  HJWasm64
  3.2  HJWasm32
  7.0  Microsoft MASM 6.15 & 10.0

seconds. This is the first time the 64-bit version is faster. Did you test new compiler optimisations?

DMath64.asm: 21545 lines, 8 passes, 411 ms, 0 warnings, 0 errors
I don't know what else to optimize  :icon_cool:

TWell

If DMath64.asm belongs to a library and has many functions, COMDAT is useful for it, and [H]JWasm supports it.
(ml64 doesn't have it, so it is not suited to that kind of one-file library; with ml64 the source must be split into several parts.)