Assumed that they are aligned to 16 bytes boundaries on a 64 bit machine windows.
What would be faster way (performance wise) to compare two 128 bits integers?
Not only for equality but also for greater and less.
PCMPEQB/PCMPEQW/PCMPEQD--Packed Compare for Equal
and
PCMPGTB/PCMPGTW/PCMPGTD--Packed Compare for Greater Than
Depends on what you really want to achieve, though. One 128 bit number, or four dwords? If it's one number, use PCMPGTB, then check with bsf for the first bit set by e.g. PMOVMSKB eax, xmm0
For signed integers it is maybe more effective to use SUB+SBB for compare, because of the needed carry-logic.
Hi,
Using the compare strings instructions is easy.
REPNE CMPSD
Steve
the problem with using string instructions is that you want to compare the high dword's first
so, you'd have to use STD and CLD, which adds about 200 cycles
here is a brute-force compare using 2 dword registers :P
OwordA OWORD ?
OwordB OWORD ?
mov eax,dword ptr OwordA[12]
mov edx,dword ptr OwordA[8]
cmp eax,dword ptr OwordB[12]
jnz FlagsSet
cmp edx,dword ptr OwordB[8]
mov eax,dword ptr OwordA[4]
jnz FlagsSet
cmp eax,dword ptr OwordB[4]
mov edx,dword ptr OwordA[0]
jnz FlagsSet
cmp edx,dword ptr OwordB[0]
FlagsSet:
;flags are set as though you had executed CMP OwordA,OwordB
A stock 6 instruction method with 8byte general purpose regs and 2 jumps will be the fastest (first you compare high digits and then low).
PCMPGT(B/W/D) are completely useless since they treat their sub elements as signed numbers
For 4byte x86 regs SSE might have some advantage, however using general purpose regs offers premature ending of code execution with nowadays fast short conditional jumps
EDIT: See timings in the Lab (http://masm32.com/board/index.php?topic=2222.0)
Quote from: hool on August 12, 2013, 05:45:09 AM
A stock 6 instruction method with 8byte general purpose regs and 2 jumps will be the fastest (first you compare high digits and then low).
Depends on the nature of the data - unfortunately the OP has not been very specific. With PCMPEQB, you could quickly eliminate the equal cases, and if not equal, decide where to enter the loop.
include \masm32\MasmBasic\MasmBasic.inc ; download (http://masm32.com/board/index.php?topic=94.0)
.data
O0 OWORD 0FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFh
O1 OWORD 0FFFFFFFFCCFFFFFFFFFFCCFFFFFFFFFFh
Init
movups xmm0, O0
movups xmm1, O1
movaps xmm2, xmm0 ; copy for pcmpeqb
PCMPEQB xmm2, xmm1
Print CrLf$, "16 bits", Tb$, Tb$, Tb$, "5432109876543210", CrLf$
PMOVMSKB eax, xmm2 ; show in ax where xmm0 differs to xmm1
xor ecx, ecx
not ax ; we are interested in bits set
deb 4, "MSKB", b:ax
bsr ecx, eax
deb 4, "msb set", ecx
bsf ecx, eax
deb 4, "lsb set", ecx
Inkey
Exit
end startOutput:
16 bits 5432109876543210
MSKB b:ax 0000100000100000
msb set ecx 11
lsb set ecx 5