Operand types with AVX instructions

Started by dstanfill, August 31, 2016, 09:40:45 AM

dstanfill

I am working on getting a quite large, quite popular project written in MASM working with HJWASM. I'm running into a discrepancy with accepted argument types.

MASM accepts both of the following as valid and works as expected.
(SUBPD xmm1, xmm2/m128)
subpd xmm1, QWORD PTR memaddr
(SUBSD xmm1, xmm2/m64)
subsd xmm2, QWORD PTR memaddr

In the actual program these are defined as

XPTR  EQU <QWORD PTR>
XMM_BIGVAL EQU XPTR [basemem+offset]

subsd xmm1, XMM_BIGVAL
...
subpd xmm1, XMM_BIGVAL

HJWASM rejects this with Error A2049: Invalid instruction operands. I can fix the subpd by changing XPTR to XMMWORD PTR, but then subsd throws the error instead.

Is there some flag or magic I can use to make HJWASM complain less and behave like MASM in this case (treat every memory operand as a pointer of the size the instruction needs, even though the declared type is QWORD PTR)? Currently I am using hjwasm -elf64 -c -Zm -Zf on Linux x64.

jj2007

You might try this option: -Zg (generated code is to exactly match Masm's).
Hope it helps - the HJWasm crew seems to be on holiday.

Other things to try:
OWORD ptr
movlps xmm0, qword ptr [mem]   ; with movhps, often faster than a movupd

rrr314159

Do you have the latest HJWASM? It had a problem like this about a year ago: it insisted on a prefix, XMMWORD PTR, that ML64 didn't require. Habran fixed it about 6 months ago, IIRC. I'm not quite sure it's the same problem you have, but it was similar. I don't know if the Linux version was also fixed. Try the latest version. If that doesn't help, I recommend you PM habran or the rest of the crew.
I am NaN ;)

hutch--

ML64 does require the full data size definition on YMM registers.


    vmovntdqa ymm0, YMMWORD PTR [rcx+r10]
    vmovntdq YMMWORD PTR [rdx+r10], ymm0


Is it only an issue of typing?

johnsa

I wish we were on holiday! :)

We're busy finalizing v2.15 of HJWASM, hence being so quiet.

We've fixed the eip/rip-related bug that was mentioned on the forums, but the bulk of the work has been around implementing the VECTORCALL calling convention and standardized SIMD types.
I'll have a look at this issue, and if something crops up that requires attention we'll squeeze it into 2.15.

johnsa

OK, I've gone through all the scenarios I can think of, testing with the latest HJWASM source, and this works perfectly:



.data
aDouble REAL8 1.0
bDouble REAL8 2.0
XPTR equ <QWORD PTR>
align 16
aVec dq 2 DUP (0)
bVec __m128i <0,0,0,0>

.code
vmovsd xmm0,aDouble
vmovsd xmm1,bDouble
vsubsd xmm1,xmm1,aDouble
vsubsd xmm1,xmm1,qword ptr bDouble
vsubsd xmm1,xmm1,XPTR bDouble
XMEM equ XPTR bDouble+0
vsubsd xmm1,xmm1,XMEM
vmovaps xmm0,aVec
vmovaps xmm1,bVec
XMEM2 equ XPTR aVec+8
vsubsd xmm1,xmm1,XMEM2
XMEM3 equ XPTR aVec+0
XMEM4 equ XPTR bVec+0
vmovaps xmm0,XMEM4
vsubpd xmm0,xmm0,XMEM3

movsd xmm0,aDouble
movsd xmm1,bDouble
subsd xmm1,aDouble
subsd xmm1,qword ptr bDouble
subsd xmm1,XPTR bDouble
XMEM5 equ XPTR bDouble+0
subsd xmm1,XMEM5
movaps xmm0,xmmword ptr aVec
movaps xmm1,bVec
XMEM6 equ XPTR aVec+8
subsd xmm1,XMEM6
XMEM7 equ xmmword ptr aVec+0
XMEM8 equ bVec+0
movaps xmm0,XMEM8
subpd xmm0,XMEM7



So this should be working for you with 2.14?

One note, however: this line of yours should give you a problem:
subpd xmm1, XMM_BIGVAL

subpd expects a full XMM-sized (128-bit) source, but

XPTR  EQU <QWORD PTR>
XMM_BIGVAL EQU XPTR [basemem+offset]

is going to end up giving you:

qword ptr [basemem+offset], which will work perfectly for subsd, but not for subpd.
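
Where the source can be edited, one way to sidestep the mismatch is to keep two overrides, one per operand width. This is only a minimal sketch of a fragment, not a complete program: the names SPTR and PPTR and the sample data are hypothetical and do not come from the project being discussed.

```asm
.data
align 16
aVec REAL8 4.0, 3.0          ; two packed doubles, 16-byte aligned (sample data)

SPTR EQU <QWORD PTR>         ; hypothetical name: scalar (m64) override
PPTR EQU <XMMWORD PTR>       ; hypothetical name: packed (m128) override

.code
subsd xmm1, SPTR aVec        ; m64 operand: only the low double is subtracted
subpd xmm1, PPTR aVec        ; m128 operand: both doubles are subtracted
```

The cost is that every use site has to pick the override matching its instruction, which is exactly the editing work a single shared XPTR avoids.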

dstanfill

I pulled down and built from latest git source, version says 2.14.

This is the version that works in ml64:

XPTR EQU <QWORD PTR>
XMM_BIGVAL EQU XPTR [mem+offset]

; This works
subpd xmm1, XMM_BIGVAL

; This works also
subsd xmm1, XMM_BIGVAL

However, in the latest HJWASM (I tried -Zg, with no effect):
; Try QWORD
XPTR EQU <QWORD PTR>
XMM_BIGVAL EQU XPTR [mem+offset]

; This fails
subpd xmm1, XMM_BIGVAL

; This works
subsd xmm1, XMM_BIGVAL

-------------------------------------------

; Try XMMWORD
XPTR EQU <XMMWORD PTR>
XMM_BIGVAL EQU XPTR [mem+offset]

; This works
subpd xmm1, XMM_BIGVAL

; This fails
subsd xmm1, XMM_BIGVAL



johnsa

That's completely correct!

You can't execute a packed vector subtraction against a qword; it has to be an xmmword, either by declaration/type or via a forced type reference.
In this case ML64 is wrong in that it doesn't warn about the incompatible type.

I have noted, however, that the VEX forms (prefixed with v) will allow it and override the type.
I'll see if I can apply this to the SSE forms as well, even though it's technically wrong and it makes me feel dirty doing it :)

dstanfill

Oh, I agree it makes logical sense. Unfortunately, the program I am working on is several hundred thousand lines long and makes quite frequent use of this quirk in ml64. The code definitely works, so I was wishing for a MASM compatibility mode.

I understand that it is completely correct to stop subpd from dereferencing a QWORD as m128; however, I could see allowing subsd and company to dereference an XMMWORD PTR and access only the lower m64.

For me the alternative is a lot of work creating cast QWORD PTRs in several thousand places.

Is there any interest in allowing this behavior to match ml64, perhaps with -Zg? Or should I get to work?

johnsa

As I mentioned, HJWASM appears to allow it when using the VEX-encoded forms:
vsubpd and vsubsd

So, given that it is already allowed, I'll have a look and see if we can make it consistent and do the same for the SSE forms.
(It's a bit of a fiddle because they're handled by different instruction tables.)

I'll let you know if it's doable, and if so it will be in 2.15 (which should be available very shortly, as in a day or so, once we've finalised the testing on our side).

johnsa

Just to let you know, we've got this in and working, set up to be applied only when the -Zg switch is on.
movaps, movups, addps, addpd, subps and subpd will now all automatically promote the memory reference to xmmword ptr regardless of the type.
This will be up shortly in 2.15.
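
Going by the description above, a fragment like the following would be expected to assemble under HJWASM 2.15 when -Zg is on. It is an unverified sketch of the described behavior, reusing the XPTR name from earlier in the thread with made-up sample data.

```asm
.data
align 16
aVec REAL8 4.0, 3.0          ; two packed doubles (sample data, not from the project)

XPTR EQU <QWORD PTR>         ; the single override the original code relies on

.code
subsd xmm1, XPTR aVec        ; m64 operand: accepted with or without -Zg
subpd xmm1, XPTR aVec        ; as described: with -Zg the qword reference is promoted to xmmword ptr
```

Without -Zg, the subpd line should still be rejected with Error A2049, as reported at the start of the thread.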

dstanfill

That is great news, thank you for getting that in there.

Hopefully that is the last issue of compatibility with this project.

rrr314159

For those misguided souls who still cling to ML64, please contrast this response from the HJWasm team with Microsoft's responses to the user community. If, that is, you can find any.
I am NaN ;)

FORTRANS

Hi,

Quote from: rrr314159 on September 02, 2016, 07:12:18 AM
For those misguided souls who still cling to ML64, please contrast this response from the HJWasm team with Microsoft's responses to the user community. If, that is, you can find any.

   Noted.  Kudos to the team.

Regards,

Steve N.

hutch--

 :biggrin:

> For those misguided souls who still cling to ML64, please contrast this response from the HJWasm team with Microsoft's responses to the user community. If, that is, you can find any.

With no deference to the HJWASM team, note that MASM has been around since 1982 and is still current, with the most recent version dating from 2015. Assemblers come and assemblers go, but MASM just keeps on going.  :P