What is the fastest way to shift an XMM register right or left by n bits?
PSLLDQ, PSRLDQ, PSRAD etc, depending on your task. Check the manuals.
jj2007
"the PS{L,R}LDQ take a 'byte shift' argument rather than a bit shift argument"
psllw, d, q
psraw, d
psrlw, d, q
are all bit shifts. However, there's no pslaw (which would be the same as psllw) or psraq; I don't know why.
pslldq and psrldq are, as you say, byte shifts.
BUT you're not done yet, because the bits shifted out of each element are lost. If you really want to shift the entire XMMx, as you say, you want the bits shifted out to go into the next packed data element to the right or left.
So, use psllq and psrlq, and figure out how to get the bits lost in the middle into the other half of the XMMx ...
I see a couple of ways to do it, but not cleanly; let us know when you figure out the fastest way.
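One way to carry the lost bits across the middle is sketched below. It is written in C with SSE2 intrinsics rather than MASM so that it can be compiled and checked, and each intrinsic corresponds to the instruction named in its comment. The names make128, lo64, hi64, cnt, shl128 and shr128 are made up for this illustration, and no claim is made that this is the fastest sequence.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helpers: build an XMM value from two qwords and read them back. */
static __m128i make128(uint64_t hi, uint64_t lo)
{
    uint64_t q[2] = { lo, hi };                 /* q[0] is the low qword */
    return _mm_loadu_si128((const __m128i *)q);
}
static uint64_t lo64(__m128i v) { uint64_t q[2]; _mm_storeu_si128((__m128i *)q, v); return q[0]; }
static uint64_t hi64(__m128i v) { uint64_t q[2]; _mm_storeu_si128((__m128i *)q, v); return q[1]; }

/* Shift count placed in the low qword, as psllq/psrlq xmm,xmm expect. */
static __m128i cnt(unsigned n) { return _mm_cvtsi32_si128((int)n); }

/* Shift all 128 bits left by n (0..127). */
static __m128i shl128(__m128i v, unsigned n)
{
    if (n >= 64)    /* pslldq 8 moves the low qword up, psllq does the rest */
        return _mm_sll_epi64(_mm_slli_si128(v, 8), cnt(n - 64));
    __m128i kept  = _mm_sll_epi64(v, cnt(n));       /* psllq: shift inside each qword */
    __m128i carry = _mm_srl_epi64(v, cnt(64 - n));  /* psrlq: the bits psllq dropped;
                                                       a count of 64 gives zero, so n = 0 works */
    carry = _mm_slli_si128(carry, 8);               /* pslldq 8: low qword's bits go up */
    return _mm_or_si128(kept, carry);               /* por */
}

/* Shift all 128 bits right (logical) by n (0..127). */
static __m128i shr128(__m128i v, unsigned n)
{
    if (n >= 64)    /* psrldq 8 moves the high qword down, psrlq does the rest */
        return _mm_srl_epi64(_mm_srli_si128(v, 8), cnt(n - 64));
    __m128i kept  = _mm_srl_epi64(v, cnt(n));       /* psrlq */
    __m128i carry = _mm_sll_epi64(v, cnt(64 - n));  /* psllq: the bits psrlq dropped */
    carry = _mm_srli_si128(carry, 8);               /* psrldq 8: high qword's bits go down */
    return _mm_or_si128(kept, carry);               /* por */
}
```

With an immediate count n below 64, the left-shift path corresponds to movdqa xmm1,xmm0 / psllq xmm0,n / psrlq xmm1,64-n / pslldq xmm1,8 / por xmm0,xmm1.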
I think I have a problem with the direction.
To see the change in the XMM register, I copy it to memory, read it byte by byte, then test each bit and print it:
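    lea edi,pr__data                ; edi -> 16-byte buffer
    lea esi,pr__str                 ; esi -> output string
    MOVDQU [edi],XMM0               ; copy XMM0 to memory, lowest byte first
    mov n__ ,0                      ; byte counter
St_Prn_:
    mov ecx ,n__
    cmp ecx,16
    je exPrn_                       ; all 16 bytes printed
    XOR EAX,EAX
    mov edx,8                       ; 8 bits per byte
    mov ah,byte ptr [edi]
shah_:
    shl ah,1                        ; top bit goes into the carry flag
    jc Is1_
    mov byte ptr [esi],"0"
    jmp OkNext_
Is1_:
    mov byte ptr [esi],"1"
OkNext_:
    dec edx
    add esi,2                       ; advance 2 bytes (string is printed with wprintf)
    cmp edx,0
    jne shah_
    invoke crt_wprintf, cfm$(".%s") ,offset pr__str
    lea esi,pr__str                 ; rewind the string for the next byte
    inc n__
    inc edi
    jmp St_Prn_
exPrn_:
    ret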
The result was as follows:
XMM0 in two test cases: C1 (every byte 11111111b) and C2 (every byte 10011001b)
{C1}.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111
{C2}.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001
psllw XMM0,1 :Word
{C1}.11111110.11111111.11111110.11111111.11111110.11111111.11111110.11111111.11111110.11111111.11111110.11111111.11111110.11111111.11111110.11111111
{C2}.00110010.00110011.00110010.00110011.00110010.00110011.00110010.00110011.00110010.00110011.00110010.00110011.00110010.00110011.00110010.00110011
pslld XMM0,1 :DWord
{C1}.11111110.11111111.11111111.11111111.11111110.11111111.11111111.11111111.11111110.11111111.11111111.11111111.11111110.11111111.11111111.11111111
{C2}.00110010.00110011.00110011.00110011.00110010.00110011.00110011.00110011.00110010.00110011.00110011.00110011.00110010.00110011.00110011.00110011
psllq XMM0,1 :QWord
{C1}.11111110.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111110.11111111.11111111.11111111.11111111.11111111.11111111.11111111
{C2}.00110010.00110011.00110011.00110011.00110011.00110011.00110011.00110011.00110010.00110011.00110011.00110011.00110011.00110011.00110011.00110011
psraw XMM0,1 :Word (arithmetic)
{C1}.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111
{C2}.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100
PSRAD XMM0,1 :DWord (arithmetic)
{C1}.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111
{C2}.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100.11001100
PSLLDQ XMM0,1 :DQWord (byte shift)
{C1}.00000000.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111.11111111
{C2}.00000000.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001.10011001
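The rows above can be cross-checked with a small sketch in C using SSE2 intrinsics (C rather than MASM so it can be compiled and tested). The enum shift_kind and the function shifted_byte are hypothetical names for this illustration. Byte index 0 is the lowest byte of the register, which is the first group in the dumps.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

enum shift_kind { K_PSLLW, K_PSLLD, K_PSLLQ, K_PSRAW, K_PSRAD, K_PSLLDQ };

/* Fill an XMM value with the byte `fill`, apply one shift by 1, and return
   byte `index` of the result in memory order (index 0 = lowest byte). */
static uint8_t shifted_byte(enum shift_kind k, uint8_t fill, int index)
{
    uint8_t out[16];
    __m128i v = _mm_set1_epi8((char)fill);
    switch (k) {
    case K_PSLLW:  v = _mm_slli_epi16(v, 1); break;  /* psllw  xmm0,1 */
    case K_PSLLD:  v = _mm_slli_epi32(v, 1); break;  /* pslld  xmm0,1 */
    case K_PSLLQ:  v = _mm_slli_epi64(v, 1); break;  /* psllq  xmm0,1 */
    case K_PSRAW:  v = _mm_srai_epi16(v, 1); break;  /* psraw  xmm0,1 */
    case K_PSRAD:  v = _mm_srai_epi32(v, 1); break;  /* psrad  xmm0,1 */
    case K_PSLLDQ: v = _mm_slli_si128(v, 1); break;  /* pslldq xmm0,1 (one byte) */
    }
    _mm_storeu_si128((__m128i *)out, v);             /* same store as MOVDQU */
    return out[index & 15];
}
```

For example, shifted_byte(K_PSLLW, 0x99, 0) gives 0x32 and shifted_byte(K_PSLLW, 0x99, 1) gives 0x33, matching the first two groups of the C2 row for psllw.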
I'm not sure, but I think PSRAD and PSRAW give the result (but it looks like a rotate, not a shift).
I'm waiting for confirmation from the experts.
xmm0=
.11111111.01111110.01111110.01111110.01111110.01111110.01111110.01111110.01111110.01111110.01111110.01111110.01111110.01111110.01111110.11111111
psraw XMM0,1=
.01111111.00111111.00111111.00111111.00111111.00111111.00111111.00111111.00111111.00111111.00111111.00111111.00111111.00111111.10111111.11111111
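The dump above is consistent with an arithmetic shift: each 16-bit word moves right by one bit and its top bit is copied, and nothing wraps around. A minimal sketch in C with SSE2 intrinsics to check this; sample, psraw1_byte, sar16 and ror16 are hypothetical names, and sar16/ror16 are scalar models of a single word added only for comparison.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* The xmm0 value dumped above, in memory order (lowest byte first). */
static const uint8_t sample[16] = {
    0xFF, 0x7E, 0x7E, 0x7E, 0x7E, 0x7E, 0x7E, 0x7E,
    0x7E, 0x7E, 0x7E, 0x7E, 0x7E, 0x7E, 0x7E, 0xFF
};

/* Byte `index` (memory order) of `in` after psraw by 1. */
static uint8_t psraw1_byte(const uint8_t in[16], int index)
{
    uint8_t out[16];
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    v = _mm_srai_epi16(v, 1);                        /* psraw xmm0,1 */
    _mm_storeu_si128((__m128i *)out, v);
    return out[index & 15];
}

/* Arithmetic shift right by 1 of one word: bit 15 is kept and copied down. */
static uint16_t sar16(uint16_t w) { return (uint16_t)((w >> 1) | (w & 0x8000u)); }

/* Rotate right by 1 of one word: bit 0 wraps around into bit 15. */
static uint16_t ror16(uint16_t w) { return (uint16_t)((w >> 1) | (w << 15)); }
```

The lowest word of the sample is 7EFFh (bytes FF, 7E). sar16 turns it into 3F7Fh, which is what psraw produced, while a rotate would give BF7Fh. The highest word FF7Eh becomes FFBFh because its sign bit is 1.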
*I'm not sure I used the right method to print the contents of XMM0; if anyone knows the correct way, please show me.
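The store-and-print approach works, but it prints the lowest byte first, so a left shift appears to move bits toward the right across byte groups. A sketch in C with SSE2 intrinsics that can print in either order; xmm_bits and its high_byte_first parameter are hypothetical names for this illustration.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Write the 16 bytes of v as ".bbbbbbbb" groups into out (at least 145 chars).
   high_byte_first = 0: memory order, like the dumps in this thread.
   high_byte_first = 1: byte 15 first, so the register reads as one binary
   number and a left shift visibly moves bits to the left. */
static char *xmm_bits(__m128i v, int high_byte_first, char out[145])
{
    uint8_t bytes[16];
    char *p = out;
    _mm_storeu_si128((__m128i *)bytes, v);           /* same store as MOVDQU [edi],XMM0 */
    for (int i = 0; i < 16; i++) {
        uint8_t b = bytes[high_byte_first ? 15 - i : i];
        *p++ = '.';
        for (int bit = 7; bit >= 0; bit--)           /* most significant bit first */
            *p++ = (char)('0' + ((b >> bit) & 1));
    }
    *p = '\0';
    return out;
}
```

A call such as puts(xmm_bits(v, 1, buf)) with a 145-byte buf (and <stdio.h> included) prints the register with the highest byte on the left.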
Quote from: mabdelouahab on March 07, 2015, 07:02:23 PM
jj2007
"the PS{L,R}LDQ take a 'byte shift' argument rather than a bit shift argument"
Good. So you checked the manuals.
PSRAD looks OK?