It's just like the CMP instruction, but for 128 bits instead of 32.

So you can use constructs like:
AxCMP128bit thenumber1,thenumber2
jge @Signed_jumpIfTheNumber1IsGreaterThanTheNumber2OrEqualToIt

The only difference is that the code does not (always) set the SF flag; in every other respect it behaves like the standard CMP. The sign can still be checked by testing the highest-order bit of the OWORD, though.
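
To make the intended semantics concrete, here is a minimal C sketch of what a 128-bit compare has to decide for the unsigned jumps (JB/JZ/JA) and the signed jumps (JL/JZ/JG), plus the sign check via the top bit of the OWORD. This is only a reference model and not the macro's actual code: the type u128_t, the dword layout (d[0] lowest, d[3] highest) and the function names are assumptions made up for illustration, and nothing here uses SSE.

```c
#include <stdint.h>

/* Hypothetical model of an OWORD: four dwords, d[0] = lowest, d[3] = highest. */
typedef struct { uint32_t d[4]; } u128_t;

/* Sign check: the highest-order bit of the OWORD (bit 31 of the top dword). */
static int is_negative128(const u128_t *a)
{
    return (int)((a->d[3] >> 31) & 1u);
}

/* Unsigned order, as the JB / JZ / JA family would see it.
   Returns -1 if a < b, 0 if a == b, 1 if a > b. */
static int cmp128_unsigned(const u128_t *a, const u128_t *b)
{
    /* Walk from the most significant dword down; the first difference decides. */
    for (int i = 3; i >= 0; --i) {
        if (a->d[i] != b->d[i])
            return (a->d[i] < b->d[i]) ? -1 : 1;
    }
    return 0;
}

/* Signed (two's complement) order, as the JL / JZ / JG family would see it.
   Returns -1 if a < b, 0 if a == b, 1 if a > b. */
static int cmp128_signed(const u128_t *a, const u128_t *b)
{
    int na = is_negative128(a);
    int nb = is_negative128(b);

    /* Different signs: the negative operand is the smaller one. */
    if (na != nb)
        return na ? -1 : 1;

    /* Same sign: two's complement order coincides with unsigned order. */
    return cmp128_unsigned(a, b);
}
```

For example, with all 128 bits set (minus one) against zero, cmp128_unsigned returns 1 while cmp128_signed returns -1, which is the same "above, but less" combination reported for a negative operand against zero in the test output.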
65 cycles for tn7F00-tn7FFF
42 cycles for short cut
Compare tn7F00 with tn7FFF
JBE passed
JB passed
JLE passed
JL passed
Compare tn7FFF with tn7F00
JAE passed
JA passed
JGE passed
JG passed
Compare tn7FFF with tn0
JAE passed
JA passed
JGE passed
JG passed
Compare tn0 with tn7F00
JBE passed
JB passed
JLE passed
JL passed
Compare tn7FFF with tnN1
JGE passed
JG passed
JBE passed
JB passed
Compare tnN1 with tn7F00
JAE passed
JA passed
JLE passed
JL passed
Compare tnN1 with tnN1
JZ passed
Compare tnN1 with tnN2
JAE passed
JA passed
JGE passed
JG passed
Compare tnN2 with tnN1
JBE passed
JB passed
JLE passed
JL passed
Compare tnNx1y1 with tn0
JAE passed
JA passed
JLE passed
JL passed
Compare tn0 with tnx1y1
JGE passed
JG passed
JBE passed
JB passed
Compare tnNx1y1 with tnNx1y2
JAE passed
JA passed
JGE passed
JG passed
Compare tnNx1y2 with tnNx1y1
JBE passed
JB passed
JLE passed
JL passed
Compare tnNx1y1 with tnNx2y1
JAE passed
JA passed
JGE passed
JG passed
Compare tnNx2y1 with tnNx1y1
JBE passed
JB passed
JLE passed
JL passed
Compare tnNx1y2 with tnNx2y1
JAE passed
JA passed
JGE passed
JG passed
Compare tnNx2y1 with tnNx1y2
JBE passed
JB passed
JLE passed
JL passed
Compare tnNx1y1 with tnNx1y2
JAE passed
JA passed
JGE passed
JG passed
Compare tnNx1y2 with tnNx1y2
JZ passed
Compare tnNx1y1 with tnNx1y2
JAE passed
JA passed
JGE passed
JG passed
Compare tn0 with tn0
JZ passed
Compare tn7F00 with tn7F00
JZ passed
Press any key to continue ...
The results are exact for signed, unsigned and mixed comparisons.
Since the code is SSE1-capable and uses an interesting, tricky replacement for PCMPEQD (SSE2) (my own, though surely someone on the planet has used it somewhere too), it would be interesting to see how the code works on a wide range of SSE1+ capable hardware, so I'm asking for tests in this new thread.