Hi,
I just can't seem to find the right info on how to create a SIMD module, or even an elementary "Hello"-style function.
I get things like this:
movaps 0xffffffe8(%ebp),%xmm0
8f: 0f 58 45 d8 addps 0xffffffd8(%ebp),%xmm0
93: 0f 29 45 c8 movaps %xmm0,0xffffffc8(%ebp)
But could I just as well refer to a named variable here?
movaps xmm1,var
addps xmm1, xmm7
movaps var, xmm1
Also, what is the right way to create a module? Is a stack frame needed, or is it just a matter of preloading the registers?
Can I do this?
proc_name:
addps xmm1,xmm2
ret
movaps xmm2,var
call proc_name
Regards,
If using movaps, be sure the data is 16-byte aligned, else use movups.
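To make the alignment point concrete, here is a minimal sketch in the same MASM style used in this thread. The names buf and LoadBoth are made up for illustration, and the header lines assume a 32-bit flat-model build.

```asm
; Sketch only: buf and LoadBoth are hypothetical names.
.686
.xmm
.model flat, stdcall
.data
align 16
buf real4 1.0, 2.0, 3.0, 4.0, 5.0       ; buf starts on a 16-byte boundary
.code
LoadBoth proc
    movaps xmm0, oword ptr [buf]        ; fine: the address is a multiple of 16
    movups xmm1, oword ptr [buf+4]      ; buf+4 is not 16-byte aligned, so use movups
    ; movaps xmm1, oword ptr [buf+4]    ; this would raise a general-protection fault
    ret
LoadBoth endp
end
```

The align 16 only guarantees the alignment of the first item after it, so anything addressed at an odd offset from there needs movups.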
Not sure what you mean by SIMD module?
If you meant procedures that use SIMD instructions, you can do it like this.
.data
align 16
var real4 1.0,2.0,3.0,4.0
var2 real4 10.0,2.0,7.0,6.0
.code
Addfunction proc
movaps xmm1,var
movaps xmm7,var2
addps xmm1,xmm7
movaps var,xmm1
ret
Addfunction endp
call Addfunction
;or like this:
Addfunction2 proc
movaps xmm1,var
addps xmm1,var2
movaps var,xmm1
ret
Addfunction2 endp
call Addfunction2
; or this way
Addfunction3 proc uses esi edi Item1:DWORD,Item2:DWORD
mov esi,Item1
mov edi,Item2
movaps xmm0,oword ptr[esi]
addps xmm0,oword ptr[edi]
movaps oword ptr[esi],xmm0
ret
Addfunction3 endp
invoke Addfunction3,addr var,addr var2
So this creates the stack frame in the background for "Addfunction3"? And it passes pointers in the ordinary (non-SIMD) registers?
Addfunction3 proc uses esi edi Item1:DWORD,Item2:DWORD
mov esi,Item1
mov edi,Item2
movaps xmm0,oword ptr[esi]
addps xmm0,oword ptr[edi]
movaps oword ptr[esi],xmm0
ret
Addfunction3 endp
Or can I load the pointers into esi and edi at any time before the call?
mov esi,Item1
mov edi,Item2
Addfunction3 proc
movaps xmm0,oword ptr[esi]
addps xmm0,oword ptr[edi]
movaps oword ptr[esi],xmm0
ret
Addfunction3 endp
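For what it's worth, a sketch of the register-passing variant asked about here. It assumes the usual behaviour that MASM only generates the stack frame (push ebp / mov ebp,esp) when the PROC line declares parameters or locals, so a bare proc is just a label plus ret. AddRegs and Caller are hypothetical names, and the "esi = destination, edi = source" contract is something you would have to document and keep yourself.

```asm
; Sketch only: AddRegs, Caller and the register contract are assumptions.
.686
.xmm
.model flat, stdcall
.data
align 16
var  real4 1.0, 2.0, 3.0, 4.0
var2 real4 10.0, 2.0, 7.0, 6.0
.code
; No parameters, locals or USES clause, so no prologue is emitted here.
; Expects: esi -> 16-byte aligned destination, edi -> 16-byte aligned source.
AddRegs proc
    movaps xmm0, oword ptr [esi]
    addps  xmm0, oword ptr [edi]
    movaps oword ptr [esi], xmm0
    ret
AddRegs endp

Caller proc uses esi edi                ; save/restore esi and edi for our own caller
    mov  esi, offset var
    mov  edi, offset var2
    call AddRegs                        ; var becomes 11.0, 4.0, 10.0, 10.0
    ret
Caller endp
end
```

The pointers have to be loaded before the call instruction runs; where the mov lines sit in the source file relative to the proc does not matter.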
Great 8)
Back at it,
So it took me a while to notice that it was the integers getting divided into 4-byte elements, in conjunction with the align 16 directive.
"Dividing the xmm registers into 16 bytes"
Which made me wonder what the outcome of an integer movaps on data that is only 4-byte aligned will be?
Will it be: int32, int32, int32, int32,
Or will it be: int32, - , - , - ,
Regards,
The memory operand of movaps must be 16-byte aligned.
You can use movups if it is unaligned, or movss for a single int32/real4.
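A small sketch of the movss case, assuming a single 4-byte value that is not 16-byte aligned. The names one and LoadOne are made up. When loading from memory, movss fills only the low element and zeroes the other three, so the result is the "int32, - , - , -" layout with zeros in the empty slots.

```asm
; Sketch only: one and LoadOne are hypothetical names.
.686
.xmm
.model flat, stdcall
.data
one real4 5.0                           ; a single 4-byte value, no 16-byte alignment needed
.code
LoadOne proc
    movss  xmm0, one                    ; xmm0 = 5.0, 0, 0, 0 (low element first)
    shufps xmm0, xmm0, 0                ; optional: copy the low element everywhere: 5.0, 5.0, 5.0, 5.0
    ret
LoadOne endp
end
```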
For me it would be movss then..
I was browsing the forum to rediscover the topic that hosted this link, and had to dig down in my bookmarks to find it. I think it's brilliant for studying SIMD.
http://softpixel.com/~cwright/programming/simd/sse.php
Actually my problem is more complicated, I have to copy my value into the adjacent variables.
So, does having "Var real8 1, 2 " mean both values of Var end up in the register ("xmmx = 64,64")
just by using the movapd instruction?
Or do you have to use logic to do it like so,
movsd xmm2,int64 ;move int64 into lower quadrant
shufpd xmm1,xmm2,0 ;flip quadrants 0 to 1, move result into 1
movsd xmm1,xmm2 ;move int64 into lower quadrant
something like this?
paddb xmm0,xmm1 ; A+B 16*8bit
paddsb xmm0,xmm1 ; A+B 16*8bit with saturation {-128...127}
paddusb xmm0,xmm1 ; A+B 16*8bit with saturation {0...255}
I'm not exactly sure what you're calculating here.
Maybe if you explain in plain arithmetic what you want to achieve, we can try to solve it with SIMD.
I changed my question a bit, I hope it's more legible like this.
I have a little VB.NET console app compiler going, but I'm not really in a position to test this yet.
.data
align 16
Real8A real8 1.0,2.0
Real8B real8 0.0,0.0
.code
movapd xmm0,Real8A ; 1.0,2.0
shufpd xmm0,xmm0,01b ; exchange low 64bit with high 64bit
movapd Real8B,xmm0 ; 2.0,1.0
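If "copy my value into the adjacent variables" meant duplicating one real8 into both halves, something along these lines might do it. This is only a guess at the intent; Pair, Single and Broadcast are hypothetical names.

```asm
; Sketch only: Pair, Single and Broadcast are hypothetical names.
.686
.xmm
.model flat, stdcall
.data
align 16
Pair   real8 0.0, 0.0                   ; destination, 16-byte aligned for movapd
Single real8 3.0                        ; the one value to duplicate
.code
Broadcast proc
    movsd  xmm0, Single                 ; xmm0 = 3.0 (low), 0.0 (high)
    shufpd xmm0, xmm0, 00b              ; both halves take the low element: 3.0, 3.0
    movapd oword ptr [Pair], xmm0       ; Pair = 3.0, 3.0
    ret
Broadcast endp
end
```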
Top!