Comparing 128-bit numbers aka OWORDs

Started by jj2007, August 12, 2013, 08:25:24 PM

dedndave

Alex
if you find a mismatch in any of those 4 flags, i can find a corresponding branch that will not work properly
conversely, if the 4 flags match, all the conditional branches will work correctly (except parity)

there is also the parity flag, but we haven't tested for it because it's so rarely used
the parity flag only applies to the low-order byte   ::)
that actually makes it somewhat useless except for serial comm
i could add that to my test very easily, but all our OWORD algos would probably fail - lol

in addition to all those flags, there is an auxiliary carry flag
however, it is not used for any branches
it's really more or less a "CPU internal" flag

****************************************************************
Equality Branches (Used for Signed or Unsigned Comparisons)
----------------------------------------------------------------
Instruction  Description               Condition       Aliases
----------------------------------------------------------------
JZ           Jump if equal             ZF=1            JE
JNZ          Jump if not equal         ZF=0            JNE
****************************************************************


****************************************************************
Unsigned Branches
----------------------------------------------------------------
Instruction  Description               Condition       Aliases
----------------------------------------------------------------
JA           Jump if above             CF=0 and ZF=0   JNBE
JAE          Jump if above or equal    CF=0            JNC JNB
JB           Jump if below             CF=1            JC JNAE
JBE          Jump if below or equal    CF=1 or ZF=1    JNA
****************************************************************


****************************************************************
Signed Branches
----------------------------------------------------------------
Instruction  Description               Condition       Aliases
----------------------------------------------------------------
JG           Jump if greater           SF=OF and ZF=0  JNLE
JGE          Jump if greater or equal  SF=OF           JNL
JL           Jump if less              SF<>OF          JNGE
JLE          Jump if less or equal     SF<>OF or ZF=1  JNG
JO           Jump if overflow          OF=1
JNO          Jump if no overflow       OF=0
JS           Jump if sign              SF=1
JNS          Jump if no sign           SF=0
****************************************************************
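
To make the tables above concrete, here is a minimal C sketch that models the four flags left by a 32-bit `cmp` and evaluates the branch conditions on them. The `Flags` type and the function names are hypothetical helpers for illustration only; they are not part of any test program in this thread.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper type: the four status flags the tables refer to. */
typedef struct { bool cf, zf, sf, of; } Flags;

/* Flags that "cmp a,b" (which computes a - b) leaves for 32-bit operands. */
Flags cmp32(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    Flags f;
    f.cf = a < b;                              /* unsigned borrow            */
    f.zf = (r == 0);                           /* result is zero             */
    f.sf = (r >> 31) != 0;                     /* sign bit of the result     */
    f.of = (((a ^ b) & (a ^ r)) >> 31) != 0;   /* signed overflow of a - b   */
    return f;
}

/* Unsigned branches */
bool ja (Flags f) { return !f.cf && !f.zf; }
bool jae(Flags f) { return !f.cf; }
bool jb (Flags f) { return f.cf; }
bool jbe(Flags f) { return f.cf || f.zf; }

/* Signed branches */
bool jg (Flags f) { return f.sf == f.of && !f.zf; }
bool jge(Flags f) { return f.sf == f.of; }
bool jl (Flags f) { return f.sf != f.of; }
bool jle(Flags f) { return f.sf != f.of || f.zf; }
```

For example, `cmp32(0, 0x80000000)` sets CF, SF and OF, so `jb` is taken (0 is below 80000000h unsigned) and `jg` is taken too (0 is greater than the most negative signed value).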


you are right, though - the code could use XOR
    mov eax,ebx
    xor eax,ebp
    test eax,8C1h
    jnz fail

i don't really see the advantage, though
also, by keeping the flags in EBX and EBP intact, i can use them for the failure report   :P
    xor ebx,ebp
that would destroy one set of flags for the report
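
The `8C1h` mask in the snippet above is just the OF, SF, ZF and CF bits of EFLAGS put together. A small C sketch of the same strict check, with hypothetical names and the two flag images passed in as plain integers:

```c
#include <stdint.h>
#include <stdbool.h>

/* EFLAGS bit positions of the four flags under test. */
enum { CF = 0x001, ZF = 0x040, SF = 0x080, OF = 0x800 };

#define STATUS_MASK (OF | SF | ZF | CF)   /* = 0x8C1, the 8C1h in the asm */

/* Strict check: all four flags must be identical in both flag images.
   Neither input is modified, so both stay available for a failure report. */
bool flags_match_strict(uint32_t expected, uint32_t actual)
{
    return ((expected ^ actual) & STATUS_MASK) == 0;
}
```

Differences in bits outside the mask, such as the parity flag (`0x004`), are ignored by this check.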

dedndave

cmp128tm
Intel(R) Pentium(R) 4 CPU 3.00GHz (SSE3)
---------------------------------------------------------
5534615 cycles for Cmp128Dave
9100335 cycles for Cmp128Dave2
5937448 cycles for Cmp128Nidud
---------------------------------------------------------
5646211 cycles for Cmp128Dave
9094478 cycles for Cmp128Dave2
5587906 cycles for Cmp128Nidud
---------------------------------------------------------

FORTRANS

Hi Dave,

   A nit to pick.

JS           Jump if sign              SF=1
JNS          Jump if no sign           SF=1


Should be:

JS           Jump if sign              SF=1
JNS          Jump if no sign           SF=0


   As an aside, this thread has me looking at my fixed point arithmetic
program again.  Such a comparison might be useful.

Cheers,

Steve N.


Antariy

Quote from: dedndave on August 21, 2013, 09:19:08 PM
Alex
if you find a mismatch in any of those 4 flags, i can find a corresponding branch that will not work properly
conversely, if the 4 flags match, all the conditional branches will work correctly (except parity)

there is also the parity flag, but we haven't tested for it because it's so rarely used
the parity flag only applies to the low-order byte   ::)
that actually makes it somewhat useless except for serial comm
i could add that to my test very easily, but all our OWORD algos would probably fail - lol

in addition to all those flags, there is an auxiliary carry flag
however, it is not used for any branches
it's really more or less a "CPU internal" flag

****************************************************************
Equality Branches (Used for Signed or Unsigned Comparisons)
----------------------------------------------------------------
Instruction  Description               Condition       Aliases
----------------------------------------------------------------
JZ           Jump if equal             ZF=1            JE
JNZ          Jump if not equal         ZF=0            JNE
****************************************************************


****************************************************************
Unsigned Branches
----------------------------------------------------------------
Instruction  Description               Condition       Aliases
----------------------------------------------------------------
JA           Jump if above             CF=0 and ZF=0   JNBE
JAE          Jump if above or equal    CF=0            JNC JNB
JB           Jump if below             CF=1            JC JNAE
JBE          Jump if below or equal    CF=1 or ZF=1    JNA
****************************************************************


****************************************************************
Signed Branches
----------------------------------------------------------------
Instruction  Description               Condition       Aliases
----------------------------------------------------------------
JG           Jump if greater           SF=OF and ZF=0  JNLE
JGE          Jump if greater or equal  SF=OF           JNL
JL           Jump if less              SF<>OF          JNGE
JLE          Jump if less or equal     SF<>OF or ZF=1  JNG
JO           Jump if overflow          OF=1
JNO          Jump if no overflow       OF=0
JS           Jump if sign              SF=1
JNS          Jump if no sign           SF=0
****************************************************************


you are right, though - the code could use XOR
    mov eax,ebx
    xor eax,ebp
    test eax,8C1h
    jnz fail

i don't really see the advantage, though
also, by keeping the flags in EBX and EBP intact, i can use them for the failure report   :P
    xor ebx,ebp
that would destroy one set of flags for the report


Dave, there's no need for huge tables. Just look at the logic.

The signed jumps check SF and OF to decide whether a number is greater or less. (I didn't mention ZF only because I was talking about SF and OF; it was implied, and I didn't think I had to add a million reservations.) If you look at the tables you posted, you'll see that, as far as SF and OF are concerned (we are talking about them now, not about ZF), only one thing matters for the signed jumps: the mutual state of the two flags. There is no "agreement" about HOW exactly these flags should be set.

If number 1 is greater than number 2, then OF and SF must be equal to each other, i.e.:

OF = 1, SF = 1 : JG will jump, JB will NOT jump, JGE will jump, JBE will NOT jump
OF = 0, SF = 0 : JG the same as above


If number 1 is less than number 2, then OF and SF must NOT be equal to each other, i.e.:

OF = 0, SF = 1 : JG will NOT jump, JB will jump, JGE will NOT jump, JBE will jump
OF = 1, SF = 0 : JG the same as above


You don't get what I'm trying to suggest: not to ignore ZF and CF, but to make the code aware that SF and OF may be in more than one valid set of states, and so follow the docs and the CPU design at the same time.

The way I suggested works like this: if XORing the two flag sets gives zero, they are equal and the test passes. If it is not zero, you should check whether the mutual state of the OF and SF flags is the same in both flag sets. I.e., if in one flag set OF was 1 and SF was 0, then in the second flag set it may be OF=0 and SF=1, and this is still the RIGHT RESULT, because that is the CPU spec.

OF=1, SF=0  ==  OF=0, SF=1

OF=1, SF=1  ==  OF=0, SF=0

This is the spec; you can read it again and check it any way you want. For instance, the CPU treats JG for SF=1 + OF=1 the same as for SF=0 + OF=0.


So, if the two flag sets have the same mutual state of OF and SF and the values are identical, for example SF=1 and OF=0 in both flag sets, then after the XOR you'll get zero (if all other flags are equal); if the state of OF and SF is mutually equal, but swapped, for example SF=0 and OF=1, then after the XOR you'll get both bits set (and the other flag bits clear, if they were equal). So checking the XOR result for zero, or for exactly the SF and OF bits set, is the proper way to emulate the CPU's behaviour.


    xor ebp,ebx
    .if ebp!=0 && ebp!=100010000000Y


This code meets those requirements.
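
The relaxed check described in this post can be sketched in C as follows. The function name is hypothetical, and masking the XOR result down to the four tested flags (`8C1h`) is an assumption made here; the `.if` above compares EBP directly, which presumes the other bits already match.

```c
#include <stdint.h>
#include <stdbool.h>

#define STATUS_MASK 0x8C1u   /* OF | SF | ZF | CF            */
#define SF_OF_BITS  0x880u   /* 100010000000b = OF | SF      */

/* Relaxed check: pass if the four flags are identical, or if the only
   difference is that SF and OF are both flipped, which keeps the
   SF=OF / SF<>OF relation unchanged. */
bool flags_match_relaxed(uint32_t expected, uint32_t actual)
{
    uint32_t diff = (expected ^ actual) & STATUS_MASK;
    return diff == 0 || diff == SF_OF_BITS;
}
```

With this check, `0x881` (OV NG NZ CY) against `0x001` (NV PL NZ CY) passes, while a difference in SF alone does not.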


If you still don't agree, then please pick any two numbers you want from the dump of "failed" comparisons for my code, and we'll run a test: compare them, and then execute every conditional jump that can be used for number comparison. You won't find such numbers, because the CPU's behaviour doesn't depend on how your checking method assumes it behaves.

Besides that, Jochen's idea, which is used in his algo and in mine, is just perfect. It's indisputable.

Quote
i could add that to my test very easily, but all our OWORD algos would probably fail - lol

We're writing comparison code, not a full CPU emulation :lol:

Antariy

Quote
OF = 1, SF = 1 : JG will jump, JB will NOT jump, JGE will jump, JBE will NOT jump

I mistyped here: I meant JLE instead of JBE and JL instead of JB ::) (well, this isn't the usual "typo", rather haste plus tiredness from this dispute)

Quote
if the state of OF and SF is mutually equal, but swapped, for example SF=0 and OF=1

In full form it should be "... for example SF=0 and OF=1 in one flags set, and SF=1 and OF=0 in the other flags set ..."





Well, as an example, I inserted my algo into your testing code and grabbed the numbers from the very first "failed" test:

cmp 00000000_00000000_00000000_00000000 , 80000000_00000000_00000000_00000001
was: OV NG NZ CY should be: NV PL NZ CY



After your code compares the "checking DWORD", the flags are CF=1, ZF=0, SF=0, OF=0.
After my comparison code runs on those numbers, it returns the flags CF=1, ZF=0, SF=1, OF=1.

This is a proper result: CF and ZF are the same, and SF=OF.

Moreover, if you trace the code, you'll find that your code first checks the "checking DWORD": it loads the first DWORD into EAX (it is zero) and then compares it with 80000100. There the SF and OF flags are 0.
My code loads the high-order DWORD of the first OWORD (it's zero), then compares it with the high-order DWORD of the second OWORD, which is 80000000, and right after that it exits the algo. There the SF and OF flags are 1.

The CPU's internal logic set SF and OF to 0 when the second number was 80000100, and to 1 when the second number was 80000000. We cannot say why it does that, since we don't know the CPU's exact circuit, but it doesn't matter anyway: only the mutual state of the SF and OF flags is defined as important, not HOW exactly they should be set. In this case both may be 1 or both may be 0, and either case is right.
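
The two DWORD compares from the trace can be recomputed with a small C model of a 32-bit `cmp`. This is only a sketch: `cmp32_flags` is a hypothetical helper that returns the OF, SF, ZF and CF bits at their EFLAGS positions.

```c
#include <stdint.h>

/* Model of "cmp a,b" for 32-bit operands: returns the OF (0x800),
   SF (0x080), ZF (0x040) and CF (0x001) bits as they appear in EFLAGS. */
uint32_t cmp32_flags(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    uint32_t fl = 0;
    if (a < b)                      fl |= 0x001;  /* CF: unsigned borrow   */
    if (r == 0)                     fl |= 0x040;  /* ZF: zero result       */
    if (r >> 31)                    fl |= 0x080;  /* SF: result sign bit   */
    if (((a ^ b) & (a ^ r)) >> 31)  fl |= 0x800;  /* OF: signed overflow   */
    return fl;
}
```

`cmp32_flags(0, 0x80000000)` gives `0x881` (OV NG NZ CY), because 0 - 80000000h wraps to 80000000h with signed overflow. `cmp32_flags(0, 0x80000100)` gives `0x001` (NV PL NZ CY), because the result 7FFFFF00h is positive and does not overflow. SF=OF holds in both cases.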



Quote
Besides that, Jochen's idea, which is used in his algo and in mine, is just perfect. It's indisputable.

Probably one could say that my algo for checking the correctness of a two-number comparison is perfect :P

nidud

#156
deleted

dedndave

JP/JPO and JNP/JPE are practically never used
and in the rare cases when they are used, they are operating on byte data, not OWORD's

JO and JNO are used reasonably often in signed math, so are JS and JNS
if those flags have to be correct, i don't see what the argument is about - lol

you have to emulate the exact behaviour in OF, SF, ZF, and CF

Alex is missing an important point
it's not enough that SF=OF or SF<>OF
because YOU DON'T KNOW WHICH Jxx INSTRUCTION WILL BE USED
you have to set the flags so that any of the ones listed above will work
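
The point about not knowing which Jxx will follow can be illustrated with a short C sketch. The helper names are hypothetical; each takes a flag image with the bits at their EFLAGS positions.

```c
#include <stdint.h>
#include <stdbool.h>

#define SF_BIT 0x080u
#define OF_BIT 0x800u

/* Branches that look at a single flag. */
bool js(uint32_t fl) { return (fl & SF_BIT) != 0; }
bool jo(uint32_t fl) { return (fl & OF_BIT) != 0; }

/* Branches that only look at the SF/OF relation. */
bool jl (uint32_t fl) { return js(fl) != jo(fl); }
bool jge(uint32_t fl) { return js(fl) == jo(fl); }
```

Taking the two flag images from the disputed test, `0x881` (SF=1, OF=1) and `0x001` (SF=0, OF=0): `jl` and `jge` give the same answer for both, but `js` and `jo` are taken for the first image and not for the second.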

nidud

#158
deleted

dedndave

#159
i updated my version of the validation test program   :P

compare data is 16 aligned
no more control dword's - instead, i use a known-good routine to set the flags
at the end, it reports the fail count

EDIT: i also simplified usage:
_main   PROC

    INVOKE  AllTests,C128nidud

    inkey
    exit

_main   ENDP
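
The known-good routine itself is not shown in the thread. One way to model reference flags, purely as an assumption, is a full 128-bit subtraction (the equivalent of a sub/sbb/sbb/sbb chain) with ZF taken over the whole result. The `Oword` type and `cmp128_flags` below are hypothetical names for this sketch.

```c
#include <stdint.h>

/* Hypothetical OWORD layout: d[0] is the lowest dword, d[3] the highest. */
typedef struct { uint32_t d[4]; } Oword;

/* Reference flags for a signed/unsigned 128-bit "cmp a,b", returned as
   OF (0x800), SF (0x080), ZF (0x040) and CF (0x001) bits. */
uint32_t cmp128_flags(const Oword *a, const Oword *b)
{
    uint32_t borrow = 0, any = 0, r = 0;
    for (int i = 0; i < 4; i++) {
        /* dword subtraction with borrow-in, like sub/sbb */
        uint64_t t = (uint64_t)a->d[i] - b->d[i] - borrow;
        r = (uint32_t)t;
        borrow = (uint32_t)(t >> 63);   /* 1 if this step wrapped around */
        any |= r;                       /* collect bits for a 128-bit ZF */
    }
    uint32_t fl = 0;
    if (borrow)                                     fl |= 0x001;  /* CF */
    if (any == 0)                                   fl |= 0x040;  /* ZF */
    if (r >> 31)                                    fl |= 0x080;  /* SF */
    if (((a->d[3] ^ b->d[3]) & (a->d[3] ^ r)) >> 31) fl |= 0x800; /* OF */
    return fl;
}
```

For the first "failed" pair quoted earlier in the thread, 0 compared with 80000000_00000000_00000000_00000001, this model returns `0x001`, i.e. NV PL NZ CY.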

nidud

#160
deleted

dedndave

it doesn't seem to mean much that it passed the "xor test"   :badgrin:

mov edx,[esi+12]
and edx,80000000h
or eax,edx
mov edx,[edi+12]


ESI and EDI are never preserved or initialized - oops

dedndave

i set it up in my validation code and got 800 failures out of 3160 tests

nidud

#163
deleted

dedndave

Cmp128tm2
Intel(R) Pentium(R) 4 CPU 3.00GHz (SSE3)
---------------------------------------------------------
6121006 cycles for Cmp128Dave
9888967 cycles for Cmp128Dave2
6120274 cycles for Cmp128Nidud
3453129 cycles for Cmp128NidudSEE (xor)
1127987 cycles for Cmp128Axel (xor)
1121311 cycles for Cmp128DaveU (unsigned)
841821  cycles for Cmp128NidudU (unsigned)
---------------------------------------------------------
6297867 cycles for Cmp128Dave
9968284 cycles for Cmp128Dave2
6103287 cycles for Cmp128Nidud
3369442 cycles for Cmp128NidudSEE (xor)
1309918 cycles for Cmp128Axel (xor)
1181227 cycles for Cmp128DaveU (unsigned)
959974  cycles for Cmp128NidudU (unsigned)
---------------------------------------------------------