avx syntax???

Started by daydreamer, November 14, 2021, 03:25:00 AM

daydreamer

I am not used to the AVX three-operand syntax (avxmnemonic ymm0, ymm1, ymm2) yet, which is annoying.

outc db 32 dup("0")     ; 32 ASCII "0" bytes, subtracted from the input
asciimul dw 1000,100,10,1,1000,100,10,1,1000,100,10,1,1000,100,10,1     ; place-value weights
; digits = ASCII input (32 bytes)

lea eax, outc

lea ebx, asciimul
vmovaps ymm0, digits
vmovups ymm7, [eax]
vmovups ymm6, [ebx]
vpsubb ymm1, ymm0, ymm7
VPUNPCKLBW ymm1, ymm2, ymm0
; VPUNPCKHBW ymm1, ymm2, ymm0
vpmullw ymm4,ymm1,ymm6
VPHADDW ymm1,ymm2,ymm3
vmovd eax,xmm3


my none asm creations
https://masm32.com/board/index.php?topic=6937.msg74303#msg74303
I am an Invoker
"An Invoker is a mage who specializes in the manipulation of raw and elemental energies."
Like SIMD coding

mineiro

Hello sir daydreamer,
I tested your AVX code and it's not giving the correct value.
I can try that for you, no problem. Can you write an xmm version? That would make it easier to translate to an AVX version.
Well, I did not understand that "add eax,1". I suppose the ymm2 (xmm2) register should be initialized with zeros?
Your idea is a strong candidate for a 64-bit version.

On the other hand, I was thinking of a lookup table of 128 KB. That is larger than most L1 data caches (typically 32 to 64 KB) but should fit in L2 on most processors, and it could give faster results over 8 million conversions.
I'd rather be this ambulant metamorphosis than to have that old opinion about everything
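
The lookup-table idea mentioned above can be sketched in C. The post does not say how the 128 KB table would be indexed, so the layout here is an assumption: one 16-bit entry for every possible pair of input bytes (65536 entries x 2 bytes = 128 KB). The names pair_table, build_pair_table and four_digits are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout: one uint16_t per possible pair of input bytes,
   65536 entries * 2 bytes = 128 KB. */
static uint16_t *pair_table;

/* Build the table; index = first_char * 256 + second_char.
   Entries for non-digit pairs stay 0 (no error reporting in this sketch). */
static int build_pair_table(void)
{
    pair_table = calloc(65536, sizeof *pair_table);
    if (!pair_table)
        return -1;
    for (int hi = '0'; hi <= '9'; ++hi)
        for (int lo = '0'; lo <= '9'; ++lo)
            pair_table[hi * 256 + lo] =
                (uint16_t)((hi - '0') * 10 + (lo - '0'));
    return 0;
}

/* Convert exactly four ASCII digits with two table lookups. */
static unsigned four_digits(const unsigned char *s)
{
    return pair_table[s[0] * 256 + s[1]] * 100u
         + pair_table[s[2] * 256 + s[3]];
}
```

Only 100 of the 65536 entries are ever hit for valid input, so the cache footprint in practice is much smaller than the full table. Whether this beats the SIMD route would have to be measured.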

daydreamer

I posted my SSE version earlier. In AVX, two source registers are used in an operation and the result goes to a third, so the previous value is preserved where, for example, mulps reg1,reg2 would overwrite reg1.
Corrected code:
vpsubb works.
The unpack gives wrong high bytes in the unpacked words.
I prefer the floating-point SIMD instructions; they don't lack divps, rcpps, sqrtps.



daydreamer

AVX syntax: xmm1 = xmm2 + xmm3 is written vaddps xmm1, xmm2, xmm3
Almost there:

lea eax, outc
lea ebx, asciimul
vmovaps ymm0, digits            ; load ASCII numbers
vmovups ymm7, [eax]             ; "0"'s
vmovups ymm6, [ebx]             ; 1000,100,10,1 ...
vpxor ymm5, ymm5, ymm5          ; zero
vpsubb ymm1, ymm0, ymm7         ; ASCII -> 0..9
VPUNPCKLBW ymm2, ymm1, ymm5     ; low 8 bytes of each 128-bit lane -> words
; VPUNPCKHBW ymm1, ymm2, ymm0
vpmullw ymm4, ymm2, ymm6        ; digit * place value
VPHADDW ymm3, ymm4, ymm4        ; add adjacent pairs
VPHADDW ymm3, ymm3, ymm3        ; add again: one 4-digit value per word
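
To check what the sequence above should produce, here is a scalar model in C of one 128-bit lane (8 digits in, two 4-digit values out). It is only a sketch for verifying results by hand, not the posted code; the name lane_model is hypothetical and the comments map each step to the instruction it imitates.

```c
#include <stdint.h>

/* Scalar model of one 128-bit lane: 8 ASCII digits in,
   two 4-digit values out. */
static void lane_model(const unsigned char digits[8], uint16_t out[2])
{
    static const uint16_t weights[8] =
        {1000, 100, 10, 1, 1000, 100, 10, 1};     /* asciimul */
    uint16_t w[8], h[4];

    for (int i = 0; i < 8; ++i) {
        uint8_t d = (uint8_t)(digits[i] - '0');   /* vpsubb */
        uint16_t wide = d;                        /* vpunpcklbw with zero */
        w[i] = (uint16_t)(wide * weights[i]);     /* vpmullw */
    }
    for (int i = 0; i < 4; ++i)                   /* first vphaddw */
        h[i] = (uint16_t)(w[2 * i] + w[2 * i + 1]);
    out[0] = (uint16_t)(h[0] + h[1]);             /* second vphaddw */
    out[1] = (uint16_t)(h[2] + h[3]);
}
```

The largest product is 9 * 1000 = 9000 and the largest sum is 9999, so everything fits in 16-bit words without overflow.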

daydreamer

Unrolled AVX2 version: loads 32 bytes and is unrolled twice.

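lea eax, outc
lea ebx, asciimul
vmovaps ymm0, digits            ; load 32 ASCII digits (needs 32-byte alignment)
vmovups ymm7, [eax]             ; "0"'s
vmovups ymm6, [ebx]             ; 1000,100,10,1 ...
vpxor ymm5, ymm5, ymm5          ; zero
vpsubb ymm1, ymm0, ymm7         ; ASCII -> 0..9
; The 256-bit unpacks work per 128-bit lane, so low + high unpack cover all
; 32 bytes. With VEXTRACTF128 and two low unpacks, bytes 8-15 and 24-31 were
; never converted and bytes 16-23 were converted twice.
VPUNPCKHBW ymm2, ymm1, ymm5     ; bytes 8-15 | 24-31 -> words (before ymm1 is overwritten)
VPUNPCKLBW ymm1, ymm1, ymm5     ; bytes 0-7 | 16-23 -> words
vpmullw ymm1, ymm1, ymm6        ; digit * place value
vpmullw ymm2, ymm2, ymm6
VPHADDW ymm3, ymm1, ymm1
VPHADDW ymm3, ymm3, ymm3        ; 4-digit values for digits 0-7 | 16-23
VPHADDW ymm4, ymm2, ymm2
VPHADDW ymm4, ymm4, ymm4        ; 4-digit values for digits 8-15 | 24-31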
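
The 256-bit byte unpacks work inside each 128-bit lane rather than across the whole register, which is easy to trip over in unrolled code like the above. A small scalar model in C, assuming the second source is an all-zero register, shows which input bytes end up in which output word. The name unpack_bytes_with_zero is hypothetical.

```c
#include <stdint.h>

/* Scalar model of 256-bit vpunpcklbw / vpunpckhbw against a zero register.
   high = 0 models vpunpcklbw, high = 1 models vpunpckhbw.
   Each output word is the source byte zero-extended to 16 bits. */
static void unpack_bytes_with_zero(const uint8_t src[32],
                                   uint16_t dst[16], int high)
{
    for (int lane = 0; lane < 2; ++lane)      /* two 128-bit lanes */
        for (int i = 0; i < 8; ++i)
            dst[lane * 8 + i] = src[lane * 16 + (high ? 8 : 0) + i];
}
```

With high = 0 the output words come from bytes 0-7 and 16-23; with high = 1 they come from bytes 8-15 and 24-31. One low and one high unpack of the same source therefore cover all 32 input bytes.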