Comparing 128-bit numbers aka OWORDs

Started by jj2007, August 12, 2013, 08:25:24 PM

dedndave


nidud

#181
deleted

dedndave

interesting that qWord's "magic" number (that's a pun, really) is needed
that means that my set of data values is insufficient to perform a comprehensive test
it also means that some of our "thought-to-be-good" algos may not be

i may have to look at it a little closer to see what other values are needed
i thought i had it covered with the walking 1's and 0's
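For reference, a walking 1's and 0's set over 128 bits can be generated along the following lines. This is only a sketch in C, not the code used to build the test table in this thread; the names `oword_t`, `walking_one` and `walking_zero` are made up for the illustration. Patterns of this kind exercise one bit position at a time.

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit value as four 32-bit limbs, least significant first,
   matching the "dd a,b,c,d" layout of a test table. */
typedef struct { uint32_t d[4]; } oword_t;

/* Walking 1: only bit `bit` (0..127) is set. */
static oword_t walking_one(unsigned bit)
{
    oword_t v = {{0, 0, 0, 0}};
    assert(bit < 128);
    v.d[bit / 32] = (uint32_t)1 << (bit % 32);
    return v;
}

/* Walking 0: every bit is set except bit `bit`. */
static oword_t walking_zero(unsigned bit)
{
    oword_t v = walking_one(bit);
    for (int i = 0; i < 4; i++)
        v.d[i] = ~v.d[i];
    return v;
}
```

Calling both helpers for bit = 0..127 gives 256 values.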

Antariy

Quote from: nidud on August 23, 2013, 01:21:06 AM
Quote from: Antariy on August 23, 2013, 12:37:25 AM
*Looking suspiciously at the proc name and then at the source*
sorry about that  :lol:

Do not worry about that :biggrin:

Quote from: nidud on August 23, 2013, 05:19:52 AM
The xor test passes unsigned macros:
Cmp128NidudU
Cmp128DaveU

Quote
CheckIt <Cmp128NidudSEE [esi],[edi]>

I looked at the code, and the TestVal data is not aligned to 16 bytes

I aligned the table by removing the test DWORD in front of the OWORD
I then changed the CheckIt macro:
CheckIt MACRO howToInvoke:REQ
CheckIt2 <howToInvoke>, 0, 0
CheckIt2 <howToInvoke>, 16, 0
CheckIt2 <howToInvoke>, 0, 16
CheckIt2 <howToInvoke>, 16, 16
ENDM


Now the Cmp128NidudSEE macro passes, and:
  Cmp128NidudSEEU
  Cmp128NidudU
  Cmp128DaveU

Yes, it is not aligned, but there is a reason for that: our algos should work with unaligned data too, even the SSE ones (that's why we use MOVUPS/MOVUPD). JJAxCMP128bit is not currently aware of unaligned data, but that is a matter of one more instruction. Also, I shift the offset by 4 not only to change the numbers but also to make the "check DWORD" part of the numbers: it lands in a different position in each pass, so we effectively have, roughly speaking, 4 times more test numbers than Dave's original OWORD set.


Quote from: dedndave on August 23, 2013, 01:10:21 PM
interesting that qWord's "magic" number (that's a pun, really) is needed
that means that my set of data values is insufficient to perform a comprehensive test
it also means that some of our "thought-to-be-good" algos may not be

i may have to look at it a little closer to see what other values are needed
i thought i had it covered with the walking 1's and 0's

In my testbed I get a great many errors for the algos. Some of the failing numbers are repeated across them; others did not appear at all when I first used your test data in a single pass, taking the OWORDs just as they were prepared (i.e. skipping the DWORD and checking OWORDs). After I added the play with offsets and used the additional DWORD as part of the numbers, many new numbers were revealed.
So I think this is probably the way to go: we can craft the data as OWORDs, but then walk through it in steps of a DWORD, or even a byte, which will increase the chance of detection.
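The idea of walking through the table in DWORD steps can be sketched in C as below. This is an illustration only, not the testbed's code; `oword_t`, `window_count` and `window_at` are hypothetical names. A table of 8 DWORDs holds 2 aligned OWORDs, but yields 5 operands when the window advances one DWORD at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A 128-bit operand as four DWORDs, least significant first. */
typedef struct { uint32_t d[4]; } oword_t;

/* Number of 128-bit operands obtainable from a table of n_dwords
   DWORDs when the window advances one DWORD at a time. */
static size_t window_count(size_t n_dwords)
{
    return n_dwords >= 4 ? n_dwords - 3 : 0;
}

/* Copy the operand that starts at DWORD index `start`.
   memcpy makes no assumption about 16-byte alignment. */
static oword_t window_at(const uint32_t *table, size_t n_dwords, size_t start)
{
    oword_t v;
    assert(start + 4 <= n_dwords);
    memcpy(v.d, table + start, sizeof v.d);
    return v;
}
```

A test loop would then compare `window_at(table, n, i)` against `window_at(table, n, j)` for every pair of i and j below `window_count(n)`. Stepping by bytes instead of DWORDs works the same way with a byte pointer.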

nidud

#184
deleted

FORTRANS

Mobile Intel(R) Celeron(R) processor     600MHz (SSE2)
---------------------------------------------------------
1986385   cycles for Cmp128Dave
6307221   cycles for Cmp128Dave2
1832520   cycles for Cmp128Nidud
3162995   cycles for Cmp128NidudSEE (xor)
970403   cycles for Cmp128Alex (xor)
806938   cycles for Cmp128JJ (xor)
866598   cycles for Cmp128DaveU (unsigned)
754468   cycles for Cmp128NidudU (unsigned)
---------------------------------------------------------
1980527   cycles for Cmp128Dave
6309359   cycles for Cmp128Dave2
1853104   cycles for Cmp128Nidud
3159145   cycles for Cmp128NidudSEE (xor)
964008   cycles for Cmp128Alex (xor)
807636   cycles for Cmp128JJ (xor)
867367   cycles for Cmp128DaveU (unsigned)
749661   cycles for Cmp128NidudU (unsigned)
---------------------------------------------------------

--- ok ---

nidud

#186
deleted

dedndave

i like that method, nidud   :t
i was thinking of doing something like that, and adding parity - lol
but i don't have time to mess with it, right now

i did create a new set of values
but, i haven't had time to validate the standard flags proc

TestVal dd 0,0,0,0
        dd 1,0,0,0
        dd 0,1,0,0
        dd 0,0,1,0
        dd 0,0,0,1

        dd 7FFFFFFFh,0,0,0
        dd 0FFFFFFFFh,0,0,0
        dd 0FFFFFFFFh,1,0,0
        dd 0FFFFFFFFh,7FFFFFFFh,0,0
        dd 0FFFFFFFFh,0FFFFFFFFh,0,0
        dd 0FFFFFFFFh,0FFFFFFFFh,1,0
        dd 0FFFFFFFFh,0FFFFFFFFh,7FFFFFFFh,0
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,0
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,1

        dd 0,0,0,40000000h
        dd 1,0,0,40000000h
        dd 0,1,0,40000000h
        dd 0,0,1,40000000h
        dd 0,0,0,40000001h

        dd 7FFFFFFFh,0,0,40000000h
        dd 0FFFFFFFFh,0,0,40000000h
        dd 0FFFFFFFFh,1,0,40000000h
        dd 0FFFFFFFFh,7FFFFFFFh,0,40000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,0,40000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,1,40000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,7FFFFFFFh,40000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,40000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,40000001h

        dd 0,0,0,80000000h
        dd 1,0,0,80000000h
        dd 0,1,0,80000000h
        dd 0,0,1,80000000h
        dd 0,0,0,80000001h

        dd 7FFFFFFFh,0,0,80000000h
        dd 0FFFFFFFFh,0,0,80000000h
        dd 0FFFFFFFFh,1,0,80000000h
        dd 0FFFFFFFFh,7FFFFFFFh,0,80000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,0,80000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,1,80000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,7FFFFFFFh,80000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,80000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,80000001h

        dd 0,0,0,0C0000000h
        dd 1,0,0,0C0000000h
        dd 0,1,0,0C0000000h
        dd 0,0,1,0C0000000h
        dd 0,0,0,0C0000001h

        dd 7FFFFFFFh,0,0,0C0000000h
        dd 0FFFFFFFFh,0,0,0C0000000h
        dd 0FFFFFFFFh,1,0,0C0000000h
        dd 0FFFFFFFFh,7FFFFFFFh,0,0C0000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,0,0C0000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,1,0C0000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,7FFFFFFFh,0C0000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,0C0000000h
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,0C0000001h

        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,3FFFFFFFh
        dd 0FFFFFFFEh,0FFFFFFFFh,0FFFFFFFFh,3FFFFFFFh
        dd 0FFFFFFFFh,0FFFFFFFEh,0FFFFFFFFh,3FFFFFFFh
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFEh,3FFFFFFFh
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,3FFFFFFEh

        dd 80000000h,0FFFFFFFFh,0FFFFFFFFh,3FFFFFFFh
        dd 0,0FFFFFFFFh,0FFFFFFFFh,3FFFFFFFh
        dd 0,0FFFFFFFEh,0FFFFFFFFh,3FFFFFFFh
        dd 0,80000000h,0FFFFFFFFh,3FFFFFFFh
        dd 0,0,0FFFFFFFFh,3FFFFFFFh
        dd 0,0,0FFFFFFFEh,3FFFFFFFh
        dd 0,0,80000000h,3FFFFFFFh
        dd 0,0,0,3FFFFFFFh
        dd 0,0,0,3FFFFFFEh

        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,7FFFFFFFh
        dd 0FFFFFFFEh,0FFFFFFFFh,0FFFFFFFFh,7FFFFFFFh
        dd 0FFFFFFFFh,0FFFFFFFEh,0FFFFFFFFh,7FFFFFFFh
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFEh,7FFFFFFFh
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,7FFFFFFEh

        dd 80000000h,0FFFFFFFFh,0FFFFFFFFh,7FFFFFFFh
        dd 0,0FFFFFFFFh,0FFFFFFFFh,7FFFFFFFh
        dd 0,0FFFFFFFEh,0FFFFFFFFh,7FFFFFFFh
        dd 0,80000000h,0FFFFFFFFh,7FFFFFFFh
        dd 0,0,0FFFFFFFFh,7FFFFFFFh
        dd 0,0,0FFFFFFFEh,7FFFFFFFh
        dd 0,0,80000000h,7FFFFFFFh
        dd 0,0,0,7FFFFFFFh
        dd 0,0,0,7FFFFFFEh

        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,0BFFFFFFFh
        dd 0FFFFFFFEh,0FFFFFFFFh,0FFFFFFFFh,0BFFFFFFFh
        dd 0FFFFFFFFh,0FFFFFFFEh,0FFFFFFFFh,0BFFFFFFFh
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFEh,0BFFFFFFFh
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,0BFFFFFFEh

        dd 80000000h,0FFFFFFFFh,0FFFFFFFFh,0BFFFFFFFh
        dd 0,0FFFFFFFFh,0FFFFFFFFh,0BFFFFFFFh
        dd 0,0FFFFFFFEh,0FFFFFFFFh,0BFFFFFFFh
        dd 0,80000000h,0FFFFFFFFh,0BFFFFFFFh
        dd 0,0,0FFFFFFFFh,0BFFFFFFFh
        dd 0,0,0FFFFFFFEh,0BFFFFFFFh
        dd 0,0,80000000h,0BFFFFFFFh
        dd 0,0,0,0BFFFFFFFh
        dd 0,0,0,0BFFFFFFEh

        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh
        dd 0FFFFFFFEh,0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh
        dd 0FFFFFFFFh,0FFFFFFFEh,0FFFFFFFFh,0FFFFFFFFh
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFEh,0FFFFFFFFh
        dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFEh

        dd 80000000h,0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh
        dd 0,0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh
        dd 0,0FFFFFFFEh,0FFFFFFFFh,0FFFFFFFFh
        dd 0,80000000h,0FFFFFFFFh,0FFFFFFFFh
        dd 0,0,0FFFFFFFFh,0FFFFFFFFh
        dd 0,0,0FFFFFFFEh,0FFFFFFFFh
        dd 0,0,80000000h,0FFFFFFFFh
        dd 0,0,0,0FFFFFFFFh
        dd 0,0,0,0FFFFFFFEh
TestVal_end LABEL BYTE
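To validate a table like this, each algo's flags have to be compared against a trusted reference. The following C sketch is one way such a reference could look; it is not the "standard flags proc" mentioned above, and the names `oword_t`, `flags_t` and `ref_cmp128` are invented for the example. It computes the flags that a hypothetical 128-bit `cmp a, b` would give, by subtracting limb by limb with borrow.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t d[4]; } oword_t;    /* least significant DWORD first */
typedef struct { int cf, zf, sf, of; } flags_t;

/* Flags of a - b over 128 bits, result discarded (like CMP). */
static flags_t ref_cmp128(const oword_t *a, const oword_t *b)
{
    flags_t f;
    uint32_t r[4];
    uint64_t borrow = 0;

    assert(a != 0 && b != 0);
    for (int i = 0; i < 4; i++) {
        uint64_t t = (uint64_t)a->d[i] - b->d[i] - borrow;
        r[i] = (uint32_t)t;
        borrow = (t >> 32) & 1;               /* 1 if this limb wrapped */
    }
    f.cf = (int)borrow;                       /* unsigned a < b */
    f.zf = (r[0] | r[1] | r[2] | r[3]) == 0;  /* all 128 bits zero */
    f.sf = (int)(r[3] >> 31);                 /* top bit of the result */
    /* Signed overflow: operand signs differ and the result's sign
       differs from a's sign. */
    f.of = (int)(((a->d[3] ^ b->d[3]) & (a->d[3] ^ r[3])) >> 31);
    return f;
}
```

For example, comparing the entry `0,0,0,80000000h` with `1,0,0,0` gives CF=0, ZF=0, SF=0, OF=1, because the most negative signed value minus one overflows.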

nidud

#188
deleted

jj2007

I haven't followed this as closely as I should have, sorry. I have now run your test and arbitrarily picked one of the "failed" values, and I don't quite understand why you consider it a failure:

include \masm32\MasmBasic\MasmBasic.inc        ; download
ox0 oword 0FFFFFFFFFFFFFFFF00000001FFFFFFFFh
ox1 oword 0FFFFFFFF00000001FFFFFFFF00000001h
qx0 qword 0FFFFFFFF0001FFFFh
qx1 qword 0FFFF0001FFFF0001h
dx0 dd    0FFFF01FFh
dx1 dd    0FF01FF01h

  Init
  Ocmp ox0, ox1
  movups xmm0, ox0
  movups xmm1, ox1
  deb 4, "OWORD size", x:xmm0, x:xmm1, flags
  Qcmp qx0, qx1
  deb 4, "QWORD size", qx0, qx1, x:qx0, x:qx1, flags
  mov eax, dx0
  cmp eax, dx1
  deb 4, "DWORD size", dx0, dx1, x:dx0, x:dx1, flags
  Inkey CrLf$, "was: NO NS NZ CY should be: NO NS NZ NC"
  Exit
end start

Output:
OWORD size
x:xmm0          FFFFFFFF FFFFFFFF 00000001 FFFFFFFF
x:xmm1          FFFFFFFF 00000001 FFFFFFFF 00000001
flags:          czso

QWORD size
qx0             -4294836225
qx1             -281466386841599
x:qx0           FFFFFFFF 0001FFFF
x:qx1           FFFF0001 FFFF0001
flags:          czso

DWORD size
dx0             -65025
dx1             -16646399
x:dx0           FFFF01FF
x:dx1           FF01FF01
flags:          czso  <<<<<<<<< lowercase means "not set"

was: NO NS NZ CY should be: NO NS NZ NC


Or do I misunderstand something? Apologies if that is the case...

nidud

#190
deleted

nidud

#191
deleted

dedndave

DWORD size
dx0             -65025
dx1             -16646399
x:dx0           FFFF01FF
x:dx1           FF01FF01
flags:          czso  <<<<<<<<< lowercase means "not set"

was: NO NS NZ CY should be: NO NS NZ NC


the carry flag was set by the algorithm and should not have been
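The expected flags for the DWORD case can be checked with a small C model of a 32-bit `cmp`, printed in the same notation as the output above (lowercase means "not set"). This is a sketch for illustration; `flags_t`, `cmp32_flags` and `format_flags` are hypothetical names, not part of MasmBasic or the testbed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { int cf, zf, sf, of; } flags_t;

/* Flags of the 32-bit subtraction a - b, as set by "cmp a, b". */
static flags_t cmp32_flags(uint32_t a, uint32_t b)
{
    flags_t f;
    uint32_t r = a - b;
    f.cf = a < b;                              /* borrow out */
    f.zf = r == 0;
    f.sf = (int)(r >> 31);
    f.of = (int)(((a ^ b) & (a ^ r)) >> 31);   /* signed overflow */
    return f;
}

/* Write the flags as "czso", using an uppercase letter for each
   flag that is set. buf must hold at least 5 chars. */
static void format_flags(flags_t f, char *buf, size_t size)
{
    assert(size >= 5);
    buf[0] = f.cf ? 'C' : 'c';
    buf[1] = f.zf ? 'Z' : 'z';
    buf[2] = f.sf ? 'S' : 's';
    buf[3] = f.of ? 'O' : 'o';
    buf[4] = '\0';
}
```

With dx0 = 0FFFF01FFh and dx1 = 0FF01FF01h the model yields `czso`, i.e. NO NS NZ NC, which agrees with the "should be" line; an algo that reports carry here would show `Czso`.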

jj2007

Quote from: dedndave on August 24, 2013, 03:28:05 AM
flags:          czso  <<<<<<<<< lowercase means "not set"

was: NO NS NZ CY should be: NO NS NZ NC

the carry flag was set by the algorithm and should not have been

Well, not by my algo... ::)

nidud

#194
deleted