The MASM Forum

General => The Campus => Topic started by: Farabi on December 21, 2012, 07:58:19 PM

Title: [SSE2]Make all bytes positive
Post by: Farabi on December 21, 2012, 07:58:19 PM
I'm subtracting sixteen 8-bit integers and want all the results to be positive. How do I do that?
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 21, 2012, 08:26:06 PM
Something like this?

.code
whatever   db 123, -127, 99, 255, 128, 127, 0, 100, 123, -127, 99, 255, 128, 127, 0, 100
start:
   mov eax, 7f7f7f7fh
   movd xmm1, eax
   pshufd xmm1, xmm1, 0
   movups xmm0, oword ptr whatever
   andps xmm0, xmm1
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 21, 2012, 08:29:33 PM
Quote from: jj2007 on December 21, 2012, 08:26:06 PM
Something like this?

.code
whatever   db 123, -127, 99, 255, 128, 127, 0, 100, 123, -127, 99, 255, 128, 127, 0, 100
start:
   mov eax, 7f7f7f7fh
   movd xmm1, eax
   pshufd xmm1, xmm1, 0
   movups xmm0, oword ptr whatever
   pandn xmm0, xmm1


:t Well, I don't get how it works yet, but yes, that is what I want. I thought there was a single instruction for this, but that should be sufficient, thanks.
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 21, 2012, 08:32:04 PM
It seems andps is the one to choose, not pandn - see corrected code above.
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 21, 2012, 08:52:09 PM
Can I know what this code does?

pshufd xmm1, xmm1, 0

It does not seem to do anything.
Title: Re: [SSE2]Make all bytes positive
Post by: qWord on December 21, 2012, 09:07:03 PM
Farabi, do you want to have the absolute value of the difference? (abs(b-a))
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 21, 2012, 09:14:57 PM
Quote from: Farabi on December 21, 2012, 08:52:09 PM
Can I know what this code does?

pshufd xmm1, xmm1, 0

It doesnot seems did nothing.

It propagates the lowest dword to the other three dwords of the xmm reg, so that xmm1 contains
7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7fh
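
For readers who prefer C, the mov/movd/pshufd sequence can be sketched with SSE2 intrinsics. This is only an illustration: the function names `broadcast_dword` and `movd_only` are made up here and are not from the thread.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* movd + pshufd xmm, xmm, 0: copy a 32-bit value into all four dwords
   of an XMM register, then store the 16 bytes to out. */
void broadcast_dword(uint32_t value, uint8_t out[16])
{
    __m128i v = _mm_cvtsi32_si128((int)value);  /* movd: low dword = value, rest = 0 */
    v = _mm_shuffle_epi32(v, 0);                /* pshufd ..., 0: every dword = dword 0 */
    _mm_storeu_si128((__m128i *)out, v);
}

/* The same without the shuffle, to show what pshufd adds:
   after movd alone only the low four bytes are set. */
void movd_only(uint32_t value, uint8_t out[16])
{
    __m128i v = _mm_cvtsi32_si128((int)value);
    _mm_storeu_si128((__m128i *)out, v);
}
```

With 7f7f7f7fh as input, `broadcast_dword` fills all 16 bytes with 7Fh, while `movd_only` leaves bytes 4 to 15 at zero.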
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 22, 2012, 03:48:19 AM
Quote from: qWord on December 21, 2012, 09:07:03 PM
Farabi, do you want to have the absolute value of the difference? (abs(b-a))

I think integer will always be absolute.
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 22, 2012, 04:08:20 AM
Quote from: Farabi on December 22, 2012, 03:48:19 AM
Quote from: qWord on December 21, 2012, 09:07:03 PM
Farabi, do you want to have the absolute value of the difference? (abs(b-a))

I think integer will always be absolute.

That doesn't answer the question, Onan. Can you post a real example what you want to see?
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 22, 2012, 12:40:37 PM
Quote from: jj2007 on December 22, 2012, 04:08:20 AM
Quote from: Farabi on December 22, 2012, 03:48:19 AM
Quote from: qWord on December 21, 2012, 09:07:03 PM
Farabi, do you want to have the absolute value of the difference? (abs(b-a))

I think integer will always be absolute.

That doesn't answer the question, Onan. Can you post a real example what you want to see?

Sorry, maybe I misunderstood the term "absolute".

I need to subtract one pixel from another and check whether the result is less than ten. To check that, I need all the results to be positive; if a result can be negative, I need an extra pass to check whether it is negative or not.


mov edx,bm.bmBits
movd xmm1,[edx]
pxor xmm0,xmm0
psubb xmm0,xmm1
lea edx,buff
movd [edx],xmm0


You want me to post the whole project?
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 22, 2012, 12:46:31 PM
I can't use "pshufd". Why?
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 22, 2012, 01:06:53 PM
perhaps it's the addressing mode ?
it's SSE2 - i figure your CPU does that
you must be using .686/.MMX/.XMM or the other instructions wouldn't work  :P
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 22, 2012, 01:21:41 PM
Quote from: dedndave on December 22, 2012, 01:06:53 PM
perhaps it's the addressing mode ?
it's SSE2 - i figure your CPU does that
you must be using .686/.MMX/.XMM or the other instructions wouldn't work  :P

I can use "psubb", which is an SSE2 instruction, but not "pshufd". Why?
Title: Re: [SSE2]Make all bytes positive
Post by: qWord on December 22, 2012, 01:32:43 PM
Quote from: Farabi on December 22, 2012, 01:21:41 PM
Quote from: dedndave on December 22, 2012, 01:06:53 PM
perhaps it's the addressing mode ?
it's SSE2 - i figure your CPU does that
you must be using .686/.MMX/.XMM or the other instructions wouldn't work  :P

I can use "psubb" which is SSE2 instruction but not  "pshufd" why?
error message?

Also note the instructions pcmpGTb (and pcmpEQb)
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 22, 2012, 02:42:14 PM
Quote from: qWord on December 22, 2012, 01:32:43 PM
Quote from: Farabi on December 22, 2012, 01:21:41 PM
Quote from: dedndave on December 22, 2012, 01:06:53 PM
perhaps it's the addressing mode ?
it's SSE2 - i figure your CPU does that
you must be using .686/.MMX/.XMM or the other instructions wouldn't work  :P

I can use "psubb" which is SSE2 instruction but not  "pshufd" why?
error message?

Also remarks the instruction pcmpGTb (and pcmpEQb)

Here is the error message "error A2008: syntax error : xmm"
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 22, 2012, 02:52:29 PM
I used JWasm and it worked.
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 22, 2012, 03:00:56 PM
Quote from: jj2007 on December 21, 2012, 08:26:06 PM
Something like this?

.code
whatever   db 123, -127, 99, 255, 128, 127, 0, 100, 123, -127, 99, 255, 128, 127, 0, 100
start:
   mov eax, 7f7f7f7fh
   movd xmm1, eax
   pshufd xmm1, xmm1, 0
   movups xmm0, oword ptr whatever
   andps xmm0, xmm1


JJ I think you just remove the positive sign bit, not change the value to a correct positive value.
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 22, 2012, 04:35:53 PM
maybe you can make an SSE equiv based on this concept
mov eax,n
cdq
xor eax,edx
sub eax,edx

the byte version would be
mov al,n
cbw
xor al,ah
sub al,ah


Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 22, 2012, 05:24:57 PM
Quote from: Farabi on December 22, 2012, 03:00:56 PM
JJ I think you just remove the positive sign bit,

Yes

Quote
not change the value to a correct positive value.

Until this point, you have not explained what the "correct positive value" would be. If your negative input byte is -123, is the "correct positive value" +123, or zero, or what? GIVE US A RULE.

We need more context. For example, how often will the negative value happen? You can detect it with a member of the pcmpGTb family (as mentioned by qWord), and then manipulate the bytes accordingly, either with "normal" or SSE instructions.
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 22, 2012, 06:33:43 PM
i think what he's saying is...
if you remove the sign bit from a positive number, no adjustment is necessary (nothing happens)
if you remove the sign bit from a negative number, it needs adjustment

11111111 is = -1
01111111 is not = +1
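
A small scalar sketch of that difference, working on raw 8-bit patterns. The helper names `clear_sign_bit` and `negate_byte` are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

/* What "and with 7Fh" does to one byte: it only drops bit 7. */
uint8_t clear_sign_bit(uint8_t bits)
{
    return (uint8_t)(bits & 0x7Fu);
}

/* Two's-complement negation of one byte: invert all bits, add 1, keep 8 bits. */
uint8_t negate_byte(uint8_t bits)
{
    return (uint8_t)(~bits + 1u);
}
```

For 11111111b (-1) the mask gives 127 while the negation gives 1; for 85h (-123) the mask gives 5 while the negation gives 123.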
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 22, 2012, 06:41:28 PM
Dave, you can't remove the sign bit from a positive number :eusa_naughty:
Jokes apart, let's wait if Farabi can formulate a rule...
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 22, 2012, 08:40:46 PM
I want -123 to become 123, and -1 to become 1. That is it.

I can simply use neg, but it still needs another pass to check each byte for negative. I thought there was a single instruction for this.
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 22, 2012, 10:50:29 PM
Ok, now I understand. My idea was to go this road:

.data
whatever   db 123, -127, 99, 255, 128, 127, 0, 100, 123, -127, 99, 255, 128, 127, 0, 100
.code
        movups xmm0, oword ptr whatever
        movups xmm1, oword ptr whatever
        mov eax, 7f7f7f7fh
        movd xmm2, eax
        pshufd xmm2, xmm2, 0
        int 3
        pcmpgtb xmm1, xmm2
        pmovmskb eax, xmm1        ; set byte mask in eax

Status after pcmpgtb:
XMM0 64007F80 FF63817B 64007F80 FF63817B
XMM1 00000000 00000000 00000000 00000000
XMM2 7F7F7F7F 7F7F7F7F 7F7F7F7F 7F7F7F7F


Bad luck, I expected some bytes in Xmm1 set to FF. Right now I have no time to investigate further. Launch Olly and try your luck :icon14:
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 23, 2012, 12:33:40 AM
MMX and SSE are not exactly my forte'
but - i wouldn't mind getting my feet wet   :P

not sure what the best way is to set an XMM register to all 0's
but, let's say XMM1 is all 0's

pcmpgtb xmm1,oword ptr whatever

this does the CBW for us (equivalent in this case - doesn't actually make them words)
for each byte in "whatever", the corresponding byte in XMM1 is all 1's if the source byte is negative
from there, you can do the same thing as i showed you earlier

you get the "whatever" bytes into XMM2 (again, not sure what the best way is)

then XOR xmm2,xmm1 (not sure what the instruction is)

then SUB (bytes) xmm2,xmm1 (not sure what the instruction is)

mov al,n
cbw
xor al,ah
sub al,ah
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 23, 2012, 12:51:54 AM
I got it!

include \masm32\MasmBasic\MasmBasic.inc        ; download (http://masm32.com/board/index.php?topic=94.0)
.data
whatever   db 123, 99, 127, -127, 255, 128, 0, 100, 123, -127, 255, 128, 99, 127, 0, 100
        Init
        movups xmm0, oword ptr whatever
        movups xmm1, oword ptr whatever
        or eax, -1
        movd xmm2, eax
        pshufd xmm2, xmm2, 0
        pcmpgtb xmm1, xmm2
        pmovmskb eax, xmm1        ; set byte mask in eax
        Inkey Right$(Bin$(eax), 16)
        Exit
end start

Output (read from right to left):
1111000111000111
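
The reason the earlier attempt with 7F7F7F7Fh produced an all-zero XMM1 is that pcmpgtb compares signed bytes, and no signed byte is greater than +127. Comparing against -1 instead flags every non-negative byte. A sketch with SSE2 intrinsics, where the function name `mask_greater_than` is an assumption made for this example:

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* pcmpgtb + pmovmskb: bit i of the result is set when the signed byte
   data[i] is greater than the signed byte limit. */
int mask_greater_than(const int8_t data[16], int8_t limit)
{
    __m128i v   = _mm_loadu_si128((const __m128i *)data);  /* unaligned 16-byte load */
    __m128i lim = _mm_set1_epi8((char)limit);               /* limit in all 16 bytes */
    __m128i gt  = _mm_cmpgt_epi8(v, lim);                   /* FFh where v > lim (signed) */
    return _mm_movemask_epi8(gt);                           /* collect bit 7 of each byte */
}
```

With the 16 bytes from the listing above (255 and 128 read as -1 and -128), a limit of 127 returns 0 and a limit of -1 returns F1C7h, which is the bit string printed above.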
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 23, 2012, 12:58:19 AM
i was close   :P
my first whack at SSE
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 23, 2012, 02:27:42 AM
i don't know why this doesn't do anything - lol

;###############################################################################################

        .XCRef
        .NoList
        INCLUDE    \Masm32\Include\Masm32rt.inc
        .686p
        .MMX
        .XMM
        INCLUDE    \Masm32\Macros\Timers.asm
        .List

;###############################################################################################

        .DATA

oData   db 123,99,127,-127,-1,-128,0,100,123,-127,-1,-128,99,127,0,100

;###############################################################################################

        .CODE

;***********************************************************************************************

_main   PROC

        call    show16s
        movups  xmm0,oword ptr oData
        xorps   xmm1,xmm1
        pcmpgtb xmm1,xmm0
        xorps   xmm0,xmm1
        psubb   xmm0,xmm1
        movups  oword ptr oData,xmm0
        call    show16u

        inkey
        exit

_main   ENDP

;***********************************************************************************************

show16s PROC

        mov     esi,offset oData
        mov     ebx,16

sh16s0: movsx   eax,byte ptr [esi]
        print   str$(eax),44,32
        inc     esi
        dec     ebx
        jnz     sh16s0

        print   chr$(13,10)
        ret

show16s ENDP

;***********************************************************************************************

show16u PROC

        mov     esi,offset oData
        mov     ebx,16

sh16u0: movzx   eax,byte ptr [esi]
        print   ustr$(eax),44,32
        inc     esi
        dec     ebx
        jnz     sh16u0

        print   chr$(13,10)
        ret

show16u ENDP

;###############################################################################################

        END     _main
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 23, 2012, 02:48:47 AM
Try my version :biggrin:
-123 +99 +127 -127 -1 -128 +0 +100 +123 -127 -1 -128 +99 +127 +0 -100
+123 +99 +127 +127 +1 +127 +0 +100 +123 +127 +1 +127 +99 +127 +0 +100


Besides, your code seems to produce exactly the same result :t

123, 99, 127, -127, -1, -128, 0, 100, 123, -127, -1, -128, 99, 127, 0, 100,
123, 99, 127, 127, 1, 128, 0, 100, 123, 127, 1, 128, 99, 127, 0, 100,


Well, almost. Is byte 128 positive?
;-)
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 23, 2012, 02:57:45 AM
oops
i was using ML version 6.14
the PCMPGTB and PSUBB instructions were using MMX registers, not XMM registers   :biggrin:

that's my first SSE code   :eusa_dance:

what ? - you haven't compared timing ?
is that someone else using Jochen's ID ?????   :lol:
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 23, 2012, 03:15:58 AM
oh - didn't see the question

the bit pattern 10000000 is -128 as a signed byte, +128 as an unsigned byte
it is a special case (like 0) because, to negate it, you do nothing   :P
of course, +128 is not in the range of signed byte values

Quote from: jj2007 on December 23, 2012, 02:48:47 AM
Try my version :biggrin:
-123 +99 +127 -127 -1 -128 +0 +100 +123 -127 -1 -128 +99 +127 +0 -100
+123 +99 +127 +127 +1 +127 +0 +100 +123 +127 +1 +127 +99 +127 +0 +100
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 23, 2012, 03:21:37 AM
Quote from: dedndave on December 23, 2012, 03:15:58 AM
oh - didn't see the question

That was more a rhetorical question (and your answer is a lil' bit misleading, too - the whole thread is on signed bytes...)
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 23, 2012, 03:23:55 AM
anyways, this seems to work ok...

        movups  xmm0,oword ptr oData
        xorps   xmm1,xmm1
        pcmpgtb xmm1,xmm0
        xorps   xmm0,xmm1
        psubb   xmm0,xmm1
        movups  oword ptr oData,xmm0
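
The same six steps as a C sketch with SSE2 intrinsics. The function name `abs_bytes16` is made up for this example, and it uses the integer-domain XOR (`_mm_xor_si128`, i.e. pxor) rather than xorps.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* In-place absolute value of 16 signed bytes:
   zero a register, pcmpgtb, xor, psubb. */
void abs_bytes16(int8_t data[16])
{
    __m128i v    = _mm_loadu_si128((const __m128i *)data);
    __m128i sign = _mm_cmpgt_epi8(_mm_setzero_si128(), v);  /* FFh where v < 0 */
    v = _mm_xor_si128(v, sign);                              /* invert the negative bytes */
    v = _mm_sub_epi8(v, sign);                               /* subtract -1, i.e. add 1 */
    _mm_storeu_si128((__m128i *)data, v);
}
```

Note that -128 comes out as the bit pattern 80h, which reads as 128 only when the result is treated as unsigned.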
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 23, 2012, 03:26:28 AM
Quote from: jj2007 on December 23, 2012, 03:21:37 AM
and your answer is a lil' bit misleading, too - the whole thread is on signed bytes...

they are no longer signed if you take the absolute value, eh ?
besides - you cannot have +128 in the world of signed bytes
so, you must consider them to be unsigned
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 23, 2012, 03:30:21 AM
Yes, but Farabi wants positive bytes. 128 is not a positive byte, so my code converts it to +127...
(did you know that around Christmas people get nervous and stressed and start wars for virtually no reason?  :icon_mrgreen:)
;)
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 23, 2012, 03:39:18 AM
128 is positive if you regard it as unsigned

range for unsigned bytes: 0 to 255
range for signed bytes: -128 to +127
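
If the result has to stay inside the signed range, one possible approach is to clamp the absolute value with an unsigned minimum (pminub), so that 80h becomes 7Fh. This is only a sketch of one way to get +127 for -128; it is not necessarily what jj2007's code does, and the name `abs_bytes16_clamped` is hypothetical.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Absolute value of 16 signed bytes, with -128 mapped to +127
   so that every result fits in 7 bits. */
void abs_bytes16_clamped(int8_t data[16])
{
    __m128i v    = _mm_loadu_si128((const __m128i *)data);
    __m128i sign = _mm_cmpgt_epi8(_mm_setzero_si128(), v);   /* FFh where v < 0 */
    v = _mm_sub_epi8(_mm_xor_si128(v, sign), sign);          /* plain abs, -128 stays 80h */
    v = _mm_min_epu8(v, _mm_set1_epi8(0x7F));                /* pminub: 80h -> 7Fh */
    _mm_storeu_si128((__m128i *)data, v);
}
```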
Title: Re: [SSE2]Make all bytes positive
Post by: qWord on December 23, 2012, 05:08:38 AM
MOVUPS, XORPS ... and that with byte data   :eusa_naughty:
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 23, 2012, 05:42:37 AM
Quote from: dedndave on December 23, 2012, 03:39:18 AM
128 is positive if you regard it as unsigned

That's actually true! So we can help Farabi with much shorter code, since 129-255 are also positive when regarded as unsigned:
  nop

@qWord: XORPS--Bitwise Logical XOR
... but I would be grateful for a link to some Intel or AMD source that explains in more detail what are the risks of using movups/movaps for integers. Agner Fog's microarchitecture (http://www.agner.org/optimize/microarchitecture.pdf), page 88, offers a fascinating lecture in this respect - see the part on latency & throughput.
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 23, 2012, 05:58:28 AM
yah - but the difference is
we do not have -129 to -255 as possible input values
we DO have -128 as a possible input value

this is the nature of two's complement
i know you have been down this road before - lol
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 23, 2012, 06:04:11 AM
Quote from: dedndave on December 23, 2012, 05:58:28 AM
we do not have -129 to -255 as possible input values
we DO have -128 as a possible input value

And I thought the whole point of this thread was to turn negative signed bytes into positive signed bytes. Now, is 128 aka 80h a signed positive byte? What does
mov byte ptr [esi], 128
movsx eax, byte ptr [esi]
return?
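
A scalar C model of the two extensions, for anyone who wants to check the answer without an assembler. The function names are made up for this illustration.

```c
#include <stdint.h>

/* movzx eax, byte ptr [esi]: zero-extend, the result is 0..255. */
int32_t zero_extend_byte(uint8_t bits)
{
    return (int32_t)bits;
}

/* movsx eax, byte ptr [esi]: sign-extend, the result is -128..127. */
int32_t sign_extend_byte(uint8_t bits)
{
    return bits >= 0x80 ? (int32_t)bits - 256 : (int32_t)bits;
}
```

For the byte 80h, the sign-extended value is -128 and the zero-extended value is 128.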
Title: Re: [SSE2]Make all bytes positive
Post by: qWord on December 23, 2012, 07:52:03 AM
Quote from: jj2007 on December 23, 2012, 05:42:37 AM
@qWord: XORPS--Bitwise Logical XOR
... but I would be grateful for a link to some Intel or AMD source that explains in more detail what are the risks of using movups/movaps for integers.
Why do you think they introduced different instructions that seem to do the same thing? For fun?
Besides, your own tests have shown several times (sorry, I forgot the topics, but one was about your macros) that your habit of using wrongly typed instructions causes speed issues on recent processors.
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 23, 2012, 07:59:35 AM
How boring. Bring evidence.

Quote
The important conclusion here is that there is a penalty in terms of latency to using an XMM
instruction of the wrong type on the Nehalem. On previous Intel processors there is no
penalty for using move and shuffle instructions on other types of operands than they are
intended for. 

The bypass delay is important in long dependency chains where latency is a bottleneck, but 
not where it is throughput rather than latency that matters. In fact, the throughput may
actually be improved by using the integer vector versions of the move and Boolean
instructions
Title: Re: [SSE2]Make all bytes positive
Post by: qWord on December 23, 2012, 08:16:18 AM
Quote from: Intel® 64 and IA-32 Architectures Optimization Reference Manual
3.5.1.9 Mixing SIMD Data Types
Previous microarchitectures (before Intel Core microarchitecture) do not have
explicit restrictions on mixing integer and floating-point (FP) operations on XMM
registers. For Intel Core microarchitecture, mixing integer and floating-point
operations on the content of an XMM register can degrade performance. Software should
avoid mixed-use of integer/FP operation on XMM registers. Specifically,

  • Use SIMD integer operations to feed SIMD integer operations. Use PXOR for
    idiom.
  • Use SIMD floating point operations to feed SIMD floating point operations. Use
    XORPS for idiom.
  • When floating point operations are bitwise equivalent, use PS data type instead
    of PD data type. MOVAPS and MOVAPD do the same thing, but MOVAPS takes one
    less byte to encode the instruction.
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 23, 2012, 08:32:05 AM
Intel recommendations are one thing, evidence is a different one. The latter is a testbed showing whether using movups instead of movdqu does degrade performance (not "can" degrade performance). Go ahead, set up a testbed, and let's have some fun in the lab :icon14:
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 23, 2012, 08:42:28 AM
Thanks for the trouble.

I see that it is difficult to determine whether -1 and 255 are the same or not. So I don't think this can be decided after the subtraction has been done. But assuming that no subtraction will yield a result of 255, we can assume 255 never occurs and treat it as -1.
Title: Re: [SSE2]Make all bytes positive
Post by: qWord on December 23, 2012, 09:37:41 AM
Quote from: jj2007 on December 23, 2012, 08:32:05 AM
(not "can" degrade performance). Go ahead, set up a testbed, and let's have some fun in the lab :icon14:
that is boring   ;)
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 24, 2012, 06:39:45 AM
Quote from: Farabi on December 23, 2012, 08:42:28 AM
Thanks for the trouble.

I see that there is difficulty to determine wheter -1 and 255 is the same or not. So I dont think this should be done after the substraction and cannot be done after all substraction were done. But judgjing that there would be no substraction yield a result of 255 we can assume 255 is never exist and treat it as -1.

no, that's not the issue

the issue is: what to do with the value -128
and, are the resulting bytes signed or unsigned

range for unsigned bytes: 0 to 255
range for signed bytes: -128 to +127

so, when you convert -128 to a positive value, it exceeds the range of signed bytes

it boils down to: do you expect all resulting bytes to be representable with 7-bits

the other issue is: which sse mov instruction to use   :P
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 24, 2012, 02:40:13 PM
Quote from: dedndave on December 24, 2012, 06:39:45 AM
Quote from: Farabi on December 23, 2012, 08:42:28 AM
Thanks for the trouble.

I see that there is difficulty to determine wheter -1 and 255 is the same or not. So I dont think this should be done after the substraction and cannot be done after all substraction were done. But judgjing that there would be no substraction yield a result of 255 we can assume 255 is never exist and treat it as -1.

no, that's not the issue

the issue is: what to do with the value -128
and, are the resulting bytes signed or unsigned

range for unsigned bytes: 0 to 255
range for signed bytes: -128 to +127

so, when you convert -128 to a positive value, it exceeds the range of signed bytes

it boils down to: do you expect all resulting bytes to be representable with 7-bits

the other issue is: which sse mov instruction to use   :P

Yes, you're right. I had better convert the bytes to words and then do the subtraction so that the result fits. I just want to subtract a pixel from its neighbour and then check whether the result is below ten; if it is, then it is a different pixel, and if it is not, then it is the same pixel with just a different intensity.
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 24, 2012, 02:55:23 PM
then, this is good code - no need for words

        movups  xmm0,oword ptr oData
        xorps   xmm1,xmm1
        pcmpgtb xmm1,xmm0
        xorps   xmm0,xmm1
        psubb   xmm0,xmm1
        movups  oword ptr oData,xmm0


i am just not sure if i am using MOVUPS correctly - there may be a better instruction for that
qWord and Jochen were discussing it - then they went to discussing 64-bit moves   :dazzled:
i don't understand the outcome - lol

according to qWord, i should use PXOR instead of XORPS
the documents i read didn't seem to say anything about that
some testing may be needed
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 24, 2012, 03:00:58 PM
Hi

movups  xmm0,oword ptr oData
        xorps   xmm1,xmm1
        pcmpgtb xmm1,xmm0
        xorps   xmm0,xmm1
        psubb   xmm0,xmm1
        movups  oword ptr oData,xmm0


Can you please tell me what this code does? I don't understand the xorps part.
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 24, 2012, 03:05:36 PM
it does this, but SSE on 16 bytes at once
mov al,n
cbw
xor al,ah
sub al,ah


if n is negative, then CBW makes AH = 0FFh
if n is positive, then CBW makes AH = 0

if n is negative, then XOR inverts all the bits
if n is positive, then XOR does nothing to AL

if n is negative, then SUB AL,AH adds one to AL (subtracts -1)
if n is positive, then SUB AL,AH does nothing (AH = 0)

the idea is this:
one way to negate a value is to invert all the bits, then add 1
(you could also subtract 1, then invert all the bits)
for absolute value, if the initial value is positive, we do not want to negate it
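
The four-instruction sequence can also be followed step by step in a scalar C model. This is a sketch on raw 8-bit patterns; the function name `abs_byte` is made up for this example.

```c
#include <stdint.h>

/* Byte-sized model of: mov al,n / cbw / xor al,ah / sub al,ah.
   Takes and returns the raw 8-bit pattern. */
uint8_t abs_byte(uint8_t al)
{
    uint8_t ah = (al & 0x80u) ? 0xFFu : 0x00u;  /* cbw: AH = FFh if AL is negative, else 0 */
    al = (uint8_t)(al ^ ah);                    /* xor al,ah: invert the bits when negative */
    al = (uint8_t)(al - ah);                    /* sub al,ah: subtracting FFh (-1) adds 1 */
    return al;
}
```

For 85h (-123) this gives 7Ah after the XOR and 7Bh (123) after the SUB; 80h (-128) comes back unchanged as 80h.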
Title: Re: [SSE2]Make all bytes positive
Post by: frktons on December 24, 2012, 03:07:27 PM
Quote
XORPS—Bitwise Logical XOR for Single-Precision Floating-Point Values

Description
------------------------
Performs a bitwise logical exclusive-OR of the four packed single-precision floating-point values from the source
operand (second operand) and the destination operand (first operand), and stores the result in the destination
operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an
XMM register.

Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 24, 2012, 03:08:20 PM
Quote from: dedndave on December 24, 2012, 03:05:36 PM
it does this, but SSE on 16 bytes at once
mov al,n
cbw
xor al,ah
sub al,ah


if n is negative, then CBW makes AH = 0FFh
if n is positive, then CBW makes AH = 0

if n is negative, then XOR inverts all the bits
if n is positive, then XOR does nothing to AL

if n is negative, then SUB AL,AH adds one to AL (subtracts -1)
if n is positive, then SUB AL,AH does nothing (AH = 0)

the idea is this:
one way to negate a value is to invert all the bits, then add 1
for absolute value, if the initial value is positive, we do not want to negate it

Great idea. :U
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 24, 2012, 04:32:12 PM
Here is my code so far


SSECmpPixel proc uses esi edi pix1:dword,pix2:dword,tol:dword
LOCAL buff[256]:dword

lea esi,buff
movd xmm0,pix1
movd xmm1,pix2
psubb xmm0,xmm1
movd [esi],xmm0

test byte ptr[esi],10000000b
.if !ZERO?
neg byte ptr[esi]
.endif

test byte ptr[esi+1],10000000b
.if !ZERO?
neg byte ptr[esi+1]
.endif

test byte ptr[esi+2],10000000b
.if !ZERO?
neg byte ptr[esi+2]
.endif

test byte ptr[esi+3],10000000b
.if !ZERO?
neg byte ptr[esi+3]
.endif

movd xmm0,[esi]
movd xmm1,tol
pcmpgtb xmm0,xmm1
movd eax,xmm0


ret
SSECmpPixel endp
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 24, 2012, 04:47:03 PM
This code does the same, but without SSE, and it is faster.


CmpPix proc uses esi edi ecx  pix1:dword,pix2:dword,tol:dword
LOCAL r3,g3,b3,r4,g4,b4:dword
LOCAL rr,rg,rb:dword

mov eax,pix1
mov edx,pix2

movzx ecx,al
mov r3,ecx
shr eax,8
movzx ecx,al
mov g3,ecx
shr eax,8
movzx ecx,al
mov b3,ecx

movzx ecx,dl
mov r4,ecx
shr edx,8
movzx ecx,dl
mov g4,ecx
shr edx,8
movzx ecx,dl
mov b4,ecx

mov eax,r3
sub eax,r4
cmp eax,0
jg @f
neg eax
@@:
cmp eax,tol
jle @f
xor eax,eax
mov ecx,0
ret
@@:
mov rr,eax

mov eax,g3
sub eax,g4
cmp eax,0
jg @f
neg eax
@@:
cmp eax,tol
jle @f
xor eax,eax
mov ecx,1
ret
@@:
mov rg,eax

mov eax,b3
sub eax,b4
cmp eax,0
jg @f
neg eax
@@:
cmp eax,tol
jle @f
xor eax,eax
mov ecx,2
ret
@@:


xor eax,eax
inc eax
mov ecx,3

ret
CmpPix endp


I don't think SSE is that great.
Title: Re: [SSE2]Make all bytes positive
Post by: qWord on December 24, 2012, 06:18:54 PM
Quote from: Farabi on December 24, 2012, 04:47:03 PM
I don't think SSE is that great.
because you use the wrong approach.

.data
    align 16
msk1 LABEL OWORD
db 16 dup (1)
.code

movdqa xmm0, OWORD PTR [esi]   ; load the 16 input bytes (assumes esi points to 16-byte-aligned data)
pcmpeqb xmm3,xmm3
pxor xmm1,xmm1
pcmpgtb xmm1,xmm0
movdqa xmm2,xmm1
pandn xmm2,xmm0
pand xmm1,xmm0
pandn xmm1,xmm3
paddb xmm1,msk1
por xmm1,xmm2
; xmm1 = abs(xmm0)


EDIT: dave's solution is of course much better :t
Title: Re: [SSE2]Make all bytes positive
Post by: jj2007 on December 24, 2012, 06:25:34 PM
Quote from: dedndave on December 24, 2012, 02:55:23 PM
according to qWord, i should use PXOR instead of XORPS
the documents i read didn't seem to say anything about that
some testing my be needed

Xmas present for you :biggrin:

  m2m ecx, 7
  LoopAlign        ; same for all algos
.Repeat
        movups xmm0, OWORD PTR [esi]
        ?xor? xmm1, xmm1
        pcmpgtb xmm1, xmm0
        ?xor? xmm0, xmm1
        psubb xmm0, xmm1
        movups OWORD PTR [edi], xmm0
        dec ecx
  .Until Sign?

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
Testing with 10000000 loops
453 ms for psubb with pxor
453 ms for psubb with xorpd
454 ms for psubb with xorps

453 ms for psubb with pxor
454 ms for psubb with xorpd
453 ms for psubb with xorps

453 ms for psubb with pxor
452 ms for psubb with xorpd
454 ms for psubb with xorps

27      bytes for psubb with pxor
27      bytes for psubb with xorpd
25      bytes for psubb with xorps
Title: Re: [SSE2]Make all bytes positive
Post by: qWord on December 24, 2012, 06:38:20 PM
Quote from: dedndave on December 24, 2012, 02:55:23 PM
according to qWord, i should use PXOR instead of XORPS
the documents i read didn't seem to say anything about that
Intel's optimization manual says: "Use SIMD integer operations to feed SIMD integer operations. Use PXOR for
idiom"
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 25, 2012, 02:04:56 AM
 :t

thanks qWord

Jochen - didn't run the test
i know you don't want to see P4 results - lol
what you want is newer CPU results
Title: Re: [SSE2]Make all bytes positive
Post by: hool on December 25, 2012, 03:24:03 AM
should work

CmpPix proc uses esi edi ecx  pix1:dword,pix2:dword,tol:dword

        movd    xmm0, pix1
        movd    xmm1, pix2
        movd    xmm5, tol       ; for the sake of simplicity every byte of "tol" has same value

        ; difference between 2 values (not specifically between 0 and a value)
        pxor    xmm4, xmm4
        movdqa  xmm3, xmm1
        psubusb xmm1, xmm0
        pcmpeqb xmm4, xmm1
        pand    xmm0, xmm4
        psubusb xmm0, xmm3
        por     xmm0, xmm1      ; xmm0[7:0], xmm[15:8], xmm[23:16]   = diff betw colors

        pxor    xmm4, xmm4
        psubusb xmm0, xmm5
        pcmpeqb xmm0, xmm4
        pmovmskb eax, xmm0      ; low 3 bits indicate if color component was different or not

        ; optional
        not      eax
        and      eax, 0xffff
        bsf      ecx, eax       ; ecx = 1st color component that was different
        ;jz       all_identical
       
        ret
CmpPix endp   
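
For comparison, a C sketch of the same idea with SSE2 intrinsics: unsigned absolute difference, saturating subtraction of the tolerance, then a byte mask. It is not a translation of the proc above. The name `pixel_diff_mask` is made up, the tolerance is passed as a single byte and broadcast, and the result is reduced to the four bytes of one pixel.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Compare the four bytes of two packed 32-bit pixels.
   Returns a 4-bit mask: bit i is set when |pix1[i] - pix2[i]| > tol. */
int pixel_diff_mask(uint32_t pix1, uint32_t pix2, uint8_t tol)
{
    __m128i a = _mm_cvtsi32_si128((int)pix1);   /* movd */
    __m128i b = _mm_cvtsi32_si128((int)pix2);
    __m128i t = _mm_set1_epi8((char)tol);       /* tol in every byte */

    /* unsigned |a - b|: one of the two saturating differences is always 0 */
    __m128i d = _mm_or_si128(_mm_subs_epu8(a, b), _mm_subs_epu8(b, a));

    /* d - tol saturates to 0 exactly when d <= tol */
    __m128i within = _mm_cmpeq_epi8(_mm_subs_epu8(d, t), _mm_setzero_si128());

    return ~_mm_movemask_epi8(within) & 0xF;    /* keep only the 4 pixel bytes */
}
```

A return value of 0 means every component is within the tolerance; otherwise the lowest set bit identifies the first component that differs.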
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 25, 2012, 12:03:19 PM
Quote from: hool on December 25, 2012, 03:24:03 AM
should work

CmpPix proc uses esi edi ecx  pix1:dword,pix2:dword,tol:dword

        movd    xmm0, pix1
        movd    xmm1, pix2
        movd    xmm5, tol       ; for the sake of simplicity every byte of "tol" has same value

        ; difference between 2 values (not specifically between 0 and a value)
        pxor    xmm4, xmm4
        movdqa  xmm3, xmm1
        psubusb xmm1, xmm0
        pcmpeqb xmm4, xmm1
        pand    xmm0, xmm4
        psubusb xmm0, xmm3
        por     xmm0, xmm1      ; xmm0[7:0], xmm[15:8], xmm[23:16]   = diff betw colors

        pxor    xmm4, xmm4
        psubusb xmm0, xmm5
        pcmpeqb xmm0, xmm4
        pmovmskb eax, xmm0      ; low 3 bits indicate if color component was different or not

        ; optional
        not      eax
        and      eax, 0xffff
        bsf      ecx, eax       ; ecx = 1st color component that was different
        ;jz       all_identical
       
        ret
CmpPix endp   


:t Thanks, great job.
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 25, 2012, 12:10:57 PM
Hool:
The speed is impressive. 4 times faster than my fastest compare algo.
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 25, 2012, 07:52:32 PM
(http://www.iquilezles.org/prods/dem_41_p.jpg)

The white color is an example of the same color at different intensities, but in real life a color that should be the same is sometimes distorted and shows a different pattern of component values.

For example, a color consisting of (RGB) 22h-22h-22h is the same as 66h-66h-66h, but in real life the color sometimes becomes 66h-64h-65h, and that confuses my algo when it tries to determine whether the pixel is the same color or not. So I have now decided that if the subtraction gives an equal result for every component, the color is the same. For example, 33h-33h-33h minus 22h-22h-22h gives 11h for R, 11h for G and 11h for B, so it is the same color; only the intensity differs. But if R is 10, G is 9 and B is 10, it is a different color because the resulting components differ. To make things easier, I allow a tolerance (I used the word "Tol") of 10d for the difference to be acceptable.
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 26, 2012, 01:29:11 AM
color space is not a cube, either

R, G, and B each have different non-linear weights

this will approximate the "distance" between 2 colors
(http://img692.imageshack.us/img692/5482/colordiffapproxmod.png)

there are probably a number of good short-cuts   :P
this equation is a short-cut, already
the real formula would be extremely complex
Title: Re: [SSE2]Make all bytes positive
Post by: hool on December 26, 2012, 09:42:47 PM
faster version of absolute difference between 2 values

simply:
movdqa  xmm3, xmm1
psubusb xmm1, xmm0
psubusb xmm0, xmm3
por     xmm0, xmm1
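
The trick works because a saturating unsigned subtraction gives 0 whenever the subtrahend is larger, so one of the two differences is always 0 and the OR picks the other. A C sketch with SSE2 intrinsics on 16 bytes; the function name `absdiff_u8x16` is made up for this example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* out[i] = |a[i] - b[i]| for 16 unsigned bytes. */
void absdiff_u8x16(const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i ab = _mm_subs_epu8(va, vb);   /* psubusb: a - b, or 0 if b > a */
    __m128i ba = _mm_subs_epu8(vb, va);   /* psubusb: b - a, or 0 if a > b */
    _mm_storeu_si128((__m128i *)out, _mm_or_si128(ab, ba));  /* por */
}
```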
Title: Re: [SSE2]Make all bytes positive
Post by: Farabi on December 30, 2012, 02:36:35 AM
Quote from: dedndave on December 26, 2012, 01:29:11 AM
color space is not a cube, either

R, G, and B each have different non-linear weights

this will approximate the "distance" between 2 colors
(http://img692.imageshack.us/img692/5482/colordiffapproxmod.png)

there are probably a number of good short-cuts   :P
this equation is a short-cut, already
the real formula would be extremely complex

Anyway dave, where did you get that formula? It is exactly what I'm looking for.
Title: Re: [SSE2]Make all bytes positive
Post by: dedndave on December 30, 2012, 03:07:15 AM
sorry - i should have mentioned that
it is a slightly modified version of Thiadmer's equation found here...

http://www.compuphase.com/cmetric.htm

he based it on previous work by Charles Poynton, who is a guru for such things
then, he did some experimentation and testing to arrive at that equation
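
For orientation, the low-cost approximation published on that page weights the red and blue differences by the mean red level. The C sketch below follows that published form as best it can be recalled here, so it should be checked against the page; the modified version shown in the image earlier in the thread is not reproduced. The name `colour_distance_sq` is made up, and the function returns the squared distance so that it stays in integer arithmetic.

```c
/* Squared weighted distance between two RGB colours (0..255 per channel).
   The red and blue weights depend on the mean red level; take the square
   root of the result if the distance itself is needed. */
long colour_distance_sq(int r1, int g1, int b1, int r2, int g2, int b2)
{
    long rmean = ((long)r1 + r2) / 2;
    long dr = (long)r1 - r2;
    long dg = (long)g1 - g2;
    long db = (long)b1 - b2;
    return (((512 + rmean) * dr * dr) >> 8)
         + 4 * dg * dg
         + (((767 - rmean) * db * db) >> 8);
}
```

When only a threshold test is needed, the squared value can be compared against the squared tolerance and the square root skipped entirely.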