The MASM Forum

General => The Campus => Topic started by: jj2007 on February 11, 2018, 12:18:46 PM

Title: Move a number to an XMM register
Post by: jj2007 on February 11, 2018, 12:18:46 PM
Some special ways to move a number into an XMM register:
xorps xmm0, xmm0 ; set to zero, 3 bytes

pxor xmm0, xmm0 ; set to zero, 4 bytes

pcmpeqb xmm0, xmm0 ; set to -1 = FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFFh, 4 bytes

m2m edx, 127
movd xmm0, edx ; set to 1 ... 127, 7 bytes

movd xmm0, MyDword ; set to any DWORD range number, 8+4=12 bytes

movups xmm0, MyOword ; 7+16=23 bytes
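
For reference, the byte counts above can be traced to the instruction encodings. This is only a sketch for a 32-bit flat-model build. It assumes m2m is the MASM32 macro that expands to a push/pop pair, and it declares MyDword and MyOword purely for illustration. The opcode bytes in the comments are the standard encodings (disp32 is the 4-byte address of the variable); an assembler listing (ml /Fl) should confirm them.

```asm
; Sketch only: assumes an assembler with SSE2 and OWORD support
.686
.xmm
.model flat, stdcall
option casemap:none

.data
MyDword dd 12345678h                  ; 4 bytes of data (illustrative value)
MyOword dd 1, 2, 3, 4                 ; 16 bytes of data (illustrative values)

.code
start:
    xorps   xmm0, xmm0                ; 0F 57 C0             3 bytes
    pxor    xmm0, xmm0                ; 66 0F EF C0          4 bytes
    pcmpeqb xmm0, xmm0                ; 66 0F 74 C0          4 bytes
    push    127                       ; 6A 7F                2 bytes (m2m edx, 127, first half)
    pop     edx                       ; 5A                   1 byte  (m2m edx, 127, second half)
    movd    xmm0, edx                 ; 66 0F 6E C2          4 bytes, total 7
    movd    xmm0, MyDword             ; 66 0F 6E 05 disp32   8 bytes + 4 data
    movups  xmm0, oword ptr MyOword   ; 0F 10 05 disp32      7 bytes + 16 data
    ret
end start
```

The 127 limit comes from the short push form, which takes a sign-extended 8-bit immediate; larger constants need the 5-byte push imm32.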
Title: Re: Move a number to an XMM register
Post by: daydreamer on February 12, 2018, 12:27:06 AM
The PUNPCKLBW, PUNPCKLWD and PUNPCKLDQ instructions:
I found out that by using the same reg for both operands you can unpack the same dword/REAL4, word or byte value to a full XMM reg
movd xmm0,eax
PUNPCKLDQ xmm0,xmm0
Sorry, but on its own it fills the whole register only with MMX regs.
Maybe combined with the other unpacks it works for those who haven't got SSSE3 PSHUFB to fill an XMM reg with one byte.

wonder if I start a workerthread, if xmm regs are already zeroed or not?

Title: Re: Move a number to an XMM register
Post by: jj2007 on February 12, 2018, 01:06:32 AM
How can you fill a "full" xmm reg? In my tests it fills either the lower or upper half... but pshufd does the full job:

include \masm32\MasmBasic\MasmBasic.inc         ; download (http://masm32.com/board/index.php?topic=94.0)
  Init
  mov eax, 12345678h
  movd xmm0,eax
  PUNPCKLDQ xmm0,xmm0
  deb 4, "PUNPCKLDQ", x:xmm0
  movd xmm0,eax
  PSHUFD xmm0,xmm0, 0
  deb 4, "PSHUFD   ", x:xmm0
EndOfCode


Result:
PUNPCKLDQ       x:xmm0          00000000 00000000 12345678 12345678
PSHUFD          x:xmm0          12345678 12345678 12345678 12345678
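
The idea of combining unpacks to fill an XMM reg with one byte, without SSSE3 PSHUFB, could look like the sketch below. It uses only SSE2 instructions; the value 0ABh and the bare program skeleton are assumptions for illustration.

```asm
; Sketch only: broadcast the byte in al to all 16 bytes of xmm0 (SSE2)
.686
.xmm
.model flat, stdcall
option casemap:none

.code
start:
    mov       eax, 0ABh          ; byte to broadcast sits in al
    movd      xmm0, eax          ; xmm0 = 00000000 00000000 00000000 000000AB
    punpcklbw xmm0, xmm0         ; low word  = ABAB
    punpcklwd xmm0, xmm0         ; low dword = ABABABAB
    pshufd    xmm0, xmm0, 0      ; all four dwords = ABABABAB
    ; pure-unpack alternative to the pshufd line (also SSE2):
    ;   punpckldq  xmm0, xmm0    ; low qword = ABABABAB ABABABAB
    ;   punpcklqdq xmm0, xmm0    ; both qwords filled
    ret
end start
```

Only the lowest byte of eax ends up in the result, so the upper bytes of eax do not need to be cleared first.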


> wonder if I start a workerthread, if xmm regs are already zeroed or not?

It seems so, at least on Win7-64, but if it's not a documented feature, you'd better not rely on it 8)
Title: Re: Move a number to an XMM register
Post by: daydreamer on February 13, 2018, 05:27:22 AM
You can fill xmm regs with a LUT of the most useful mathematical functions, and when you don't need precision you can take two neighbouring points and use (x1+x2)/2 to approximate a point between them.
It's also useful to flip the signs of several values at once to get the negative part of a curve,
so your zeroing gets more useful:
xorps xmm0,xmm0
subps xmm0,xmm1 ;xmm1 contains positive constants,for example for a sine curve
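
Both ideas, the midpoint approximation and the negation via a zeroed register, can be sketched together as follows. The Table and Half data and their values are hypothetical, chosen only so that the results are easy to check by hand. Lanes are listed lowest first in the comments.

```asm
; Sketch only: four midpoints (x1+x2)/2 at once, then negate them with 0 - x
.686
.xmm
.model flat, stdcall
option casemap:none

.data
Table REAL4 0.0, 0.25, 0.5, 0.75, 1.0   ; hypothetical LUT samples
Half  REAL4 0.5, 0.5, 0.5, 0.5          ; multiplier for the division by 2

.code
start:
    movups xmm0, oword ptr Table        ; 0.0   0.25  0.5   0.75
    movups xmm1, oword ptr Table[4]     ; 0.25  0.5   0.75  1.0  (next neighbours)
    movups xmm2, oword ptr Half         ; movups, so no alignment is assumed
    addps  xmm0, xmm1                   ; x1 + x2 per lane
    mulps  xmm0, xmm2                   ; midpoints: 0.125 0.375 0.625 0.875
    xorps  xmm3, xmm3                   ; 0.0 in all four lanes
    subps  xmm3, xmm0                   ; 0 - x: -0.125 -0.375 -0.625 -0.875
    ret
end start
```

Half is loaded into a register first because mulps with a memory operand would require the data to be 16-byte aligned.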

Title: Re: Move a number to an XMM register
Post by: jj2007 on February 13, 2018, 05:37:48 AM
Quote from: daydreamer on February 13, 2018, 05:27:22 AM
You can fill xmm regs with a LUT of the most useful mathematical functions, and when you don't need precision you can take two neighbouring points and use (x1+x2)/2 to approximate a point between them.

Sinus() (http://www.webalice.it/jj2006/MasmBasicQuickReference.htm#Mb1334) uses a variant of this method. About 7 times faster than the FPU's fsin, and equally precise, i.e. REAL10 precision.
Title: Re: Move a number to an XMM register
Post by: daydreamer on February 13, 2018, 08:31:56 AM
Quote from: jj2007 on February 13, 2018, 05:37:48 AM
Sinus() (http://www.webalice.it/jj2006/MasmBasicQuickReference.htm#Mb1334) uses a variant of this method. About 7 times faster than the FPU's fsin, and equally precise, i.e. REAL10 precision.
MasmBasic looks nice. It would be nice to try it for a graphics interface: sprites? A tile engine?