Author Topic: 128 bit Comparison on 32-bit CPU  (Read 617 times)

colinr

  • Regular Member
  • *
  • Posts: 15
128 bit Comparison on 32-bit CPU
« on: November 05, 2019, 10:27:21 AM »
Hello all,

Firstly, thank's for letting me join this forum.

I'm trying to figure out how to conduct comparison of 128-bit unsigned integers, I've been trying this for months without getting anywhere.

Essentially, I have a data set of IPv6 addresses, which comprises of a start address, end address and the two digit ISO country code to which this block of v6 addresses is assigned to. These are stored as 128-bit little endian integers, and I would like to iterate through this data set to establish which block a particular IP address sits.

So, I'm not testing for equality, but rather if the IP address that I'm looking for is greater or less than each of these blocks.

IPv4 was no problem, these 128-bit IPv6 addresses are killing me!

I'm using WinAsm Studio and MASM32 if that makes any difference.

Thank you in advance.


nidud

  • Member
  • *****
  • Posts: 1800
    • https://github.com/nidud/asmc
Re: 128 bit Comparison on 32-bit CPU
« Reply #1 on: November 05, 2019, 11:16:53 AM »

    mov eax,[ecx+12]
    .if eax == [edx+12]
        mov eax,[ecx+8]
        .if eax == [edx+8]
            mov eax,[ecx+4]
            .if eax == [edx+4]
                mov eax,[ecx]
                cmp eax,[edx]
            .endif
        .endif
    .endif
    je is_equal
    ja is_above
    jb is_below

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 6769
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: 128 bit Comparison on 32-bit CPU
« Reply #2 on: November 05, 2019, 12:16:18 PM »
Hi Colin,

If the task is what I think it is, I would create a structure to match the data layout and then just read the bits you want.
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :skrewy:

AW

  • Member
  • *****
  • Posts: 2442
  • Let's Make ASM Great Again!
Re: 128 bit Comparison on 32-bit CPU
« Reply #3 on: November 05, 2019, 03:25:33 PM »
Quote
I'm trying to figure out how to conduct comparison of 128-bit unsigned integers, I've been trying this for months without getting anywhere

The solution is quite easy but you have 2 cases, either the list of Start IPs is sorted or is unsorted.
Start by making a flowchart, this will make a difference.

colinr

  • Regular Member
  • *
  • Posts: 15
Re: 128 bit Comparison on 32-bit CPU
« Reply #4 on: November 05, 2019, 06:46:19 PM »
Thanks for your replies so far (I've not checked the sample above yet).

Essentially, the data set is numerically in order, low to high, each entry that I iterate through is 34 bytes in length (arbitrary example below).

20010df7 82000000 00000000 00000000 / 20010df7 8200ffff ffffffff ffffffff / ID

u_int128, u_int128, CHAR, CHAR

The IPv6 data is little endian, so I've flipped it around above for ease on the eyeball!

So, for example, it I was looking for this IP address: 20010df7 82000000 1234abcde 12345678, it would fit within the block above.

I've tried using XMM but it would appear that these are floating point registers and not supported as a GP register for direct comparison.

As it can be seen from the example from the data set above, it's not possible to simply compare the first DWORD as this value may also appear in the next entry (not shown) of the data set.


EDIT: The sample above does not seem to work


 
« Last Edit: November 05, 2019, 08:20:56 PM by colinr »

AW

  • Member
  • *****
  • Posts: 2442
  • Let's Make ASM Great Again!
Re: 128 bit Comparison on 32-bit CPU
« Reply #5 on: November 05, 2019, 09:14:09 PM »
This is a sample that is suitable when all ranges are contiguous (i.e no empty IPs zones between ranges). So, here you don't need to know about the End IP. Were the ranges not contiguous you might have multiple Start IPs above your test IP, so you would need to check the End IP to decide what is the country.

Here, this is done with vector instructions SSE 4.1. Everybody has that (except here, some people don't).

Code: [Select]
.model flat, stdcall

includelib \masm32\lib\msvcrt.lib
printf proto C :ptr, :vararg

IPV6Struct struct
country db 20 dup (0)
startIP OWORD (0)
endIP OWORD (0)
IPV6Struct ends

.data

ipv6_1 IPV6Struct <'country1',20010df7820000000000000000000000h, 20010df78200ffffffffffffffffffffh>
ipv6_2 IPV6Struct <'country2',20020df7820000000000000000000000h, 20020df78200ffffffffffffffffffffh>
ipv6_3 IPV6Struct <'country3',20030df7820000000000000000000000h, 20030df78200ffffffffffffffffffffh>
ipv6_4 IPV6Struct <'country4',20040df7820000000000000000000000h, 20040df78200ffffffffffffffffffffh>
myIP oword 20030df7820000000000000000000001h
msg db "Country is: %s",10,0

.code

main proc uses esi
lea esi, ipv6_1
@checkStart:
movups xmm0, (IPV6Struct ptr [esi]).startIP
movups xmm1, myIP
PCMPGTQ  xmm0, xmm1
or eax,-1 ; clear zero flag
ptest xmm0, xmm1
jnz @exit
add esi, sizeof IPV6Struct
jmp short @checkStart
@exit:
sub esi, sizeof IPV6Struct
invoke printf, offset msg, esi
ret
main endp

end


colinr

  • Regular Member
  • *
  • Posts: 15
Re: 128 bit Comparison on 32-bit CPU
« Reply #6 on: November 05, 2019, 09:57:34 PM »
Thanks,

When I try to assemble this code, the PCMPGTQ and ptest instructions are not recognised.

I'm setting this at the top of the code:

.686
.xmm

I'm using WinAsm Studio and MASM32 (unsure of which version).


AW

  • Member
  • *****
  • Posts: 2442
  • Let's Make ASM Great Again!
Re: 128 bit Comparison on 32-bit CPU
« Reply #7 on: November 05, 2019, 10:10:23 PM »
You don't need to change anything. Paste exactly what is above in Notepad and save as test.asm.
Use a recent version of MASM (they come free with Visual Studio community edition). The MASM that comes with MASM32 predates SSE.

Run a batch file containing:
ml -c -coff test.asm
link /entry:main /machine:x86 test.obj

colinr

  • Regular Member
  • *
  • Posts: 15
Re: 128 bit Comparison on 32-bit CPU
« Reply #8 on: November 06, 2019, 09:07:58 PM »
@AW - Thank you

I've got VS2017 installed on my workstation, and all that I needed to do was copy out ml.exe, link,exe, msobj140.dll and mspdb140.dll to my existing MASM32 folder, seems to assemble fine.

So my next steps are to scrutinse the code and attempt to integrate it into my current test code, I'll let you know how I get on.

colinr

  • Regular Member
  • *
  • Posts: 15
Re: 128 bit Comparison on 32-bit CPU
« Reply #9 on: November 06, 2019, 10:07:42 PM »
OK, I've implemented the code, but not getting the expected results.

It's probably the jump condition that I'm getting wrong, but in the sample posted below, I kind of thought that the first instance of the test should fail and and jump back into the loop as the sample IP address should fit within the second block (GB).

The code at the beginning is simply to swap the Endianness to allow the IP addresses to be viewed easier.

I replaced jnz with jb and it appeared to work, but will need to do further testing to make sure, I think it would also be prudent to test against the end address too.

Sorry if I'm being stupid with this, but my brain is mashed.

Code: [Select]
.data

ipv6 dd 02A002381h,0E8C00000h,000000000h,000000000h
dd 02A002381h,0E8C0FFFFh,0FFFFFFFFh,0FFFFFFFFh
db "CH" ;Switzerland
dd 02A002381h,0E8C10000h,000000000h,000000000h
dd 02A0023FFh,0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh
db "GB" ;Great Britain
dd 02A002400h,000000000h,0000000000h,000000000h
dd 02A003FFFh,0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh
db "XX" ;Unassigned
dd 02A004000h,000000000h,000000000h,000000000h
dd 02A004007h,0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh
db "IR" ;Iran
dd 02A004008h,000000000h,000000000h,000000000h
dd 02A00401Fh,0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh
db "XX" ;Unassigned
dd 02A004020h,000000000h,000000000h,000000000h
dd 02A004020h,0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh
db "DK" ;Denmark
dd 02A004021h,000000000h,000000000h,000000000h
dd 02A00403Fh,0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh
db "XX" ;Unassigned

myIP dd 02a0023a8h, 04825eda1h, 0ac528803h, 0001b9000h

.data?
hInstance HINSTANCE ?

V6Start DWORD 4 dup (?)
V6End DWORD 4 dup (?)
ToFind DWORD 4 dup (?)

.const

.code

start:
lea edi, myIP
mov eax, dword ptr[edi]
mov ebx, dword ptr[edi+4]
mov ecx, dword ptr[edi+8]
mov edx, dword ptr[edi+12]

lea edi, ToFind
mov dword ptr[edi], edx
mov dword ptr[edi+4], ecx
mov dword ptr[edi+8], ebx
mov dword ptr[edi+12], eax

lea esi, ipv6

checkStart:
mov eax, dword ptr[esi]
mov ebx, dword ptr[esi+4]
mov ecx, dword ptr[esi+8]
mov edx, dword ptr[esi+12]

mov dword ptr[V6Start], edx
mov dword ptr[V6Start+4], ecx
mov dword ptr[V6Start+8], ebx
mov dword ptr[V6Start+12], eax

mov eax, dword ptr[esi+16]
mov ebx, dword ptr[esi+20]
mov ecx, dword ptr[esi+24]
mov edx, dword ptr[esi+28]

mov dword ptr[V6End], edx
mov dword ptr[V6End+4], ecx
mov dword ptr[V6End+8], ebx
mov dword ptr[V6End+12], eax

movups xmm0, V6Start ;[esi]
movups xmm1, ToFind
PCMPGTQ xmm0, xmm1
or eax,-1 ; clear zero flag
ptest xmm0, xmm1
jnz exit
lea esi, [esi+34]
jmp short checkStart
exit:
sub esi, 34
ret

invoke ExitProcess,0h

End start
« Last Edit: November 06, 2019, 11:08:04 PM by colinr »

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 6769
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: 128 bit Comparison on 32-bit CPU
« Reply #10 on: November 06, 2019, 11:57:33 PM »
Colin,

If you begin at the starting offset plus the offset for the member you are after, then you just ADD that data length to get the next member offset in the following data set. If I was working on a data set of this type, I would construct an array of pointers to the start of each data set then just reference the data set plus offset You will find techniques of this type reasonably fast.
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :skrewy:

colinr

  • Regular Member
  • *
  • Posts: 15
Re: 128 bit Comparison on 32-bit CPU
« Reply #11 on: November 07, 2019, 12:10:37 AM »
Thanks Hutch

The dataset has over 527,000 entries, and changes on a monthly basis as allocations change.

I self generate these datasets from an ANSI encoded csv file which contains  the IP addresses in their decimal notation.

I think I might have figured it out, just testing it today, I'll report back.

AW

  • Member
  • *****
  • Posts: 2442
  • Let's Make ASM Great Again!
Re: 128 bit Comparison on 32-bit CPU
« Reply #12 on: November 07, 2019, 12:35:29 AM »
Well, my example is broken  :sad:
What happens is that PCMPGTQ makes a signed compare. So my example worked but  you choose an IP where this qword 0ac528803h0001b9000h is negative. I am not seeing another easy SSE solution at the moment because there is no equivalent packed unsigned compare. Probably you should try to do it using General Purpose Registers as suggested here http://masm32.com/board/index.php?topic=8148.msg89445#msg89445.

colinr

  • Regular Member
  • *
  • Posts: 15
Re: 128 bit Comparison on 32-bit CPU
« Reply #13 on: November 07, 2019, 12:46:20 AM »
@AW  - Don't be so disappointed! Actually you have been a great help.

I've implemented the code into my DLL that does the lookup, and it seems to work!

The code is still in that swaps the Endianness of the addresses, but that is just a necessary evil at the moment.

Take a look at the routine below, is there anything you can think of that will cause a lookup failure?

EDIT: I've tested against the IP address previously quoted and tested against the very last entry in the data-set by 'faking' an IP address to sit within this final range, both worked!

Code: [Select]
LookUpIPv6Country:

xor edx, edx
xor eax, eax
mov eax, dword ptr[IPv6CountryFileSize]
mov ebx, 34 ;Will be dividing by 34 bytes
div ebx

mov edx, eax
mov esi, dword ptr[IPv6CountryMem] ;Pointer to IPv6 data

lea edi, CountryFormedIpV6_1 ;The IP to find
FindV6CountryLoop:

push edx ;Save loop counter

mov eax, dword ptr[esi]
mov ebx, dword ptr[esi+4]
mov ecx, dword ptr[esi+8]
mov edx, dword ptr[esi+12]
bswap eax
bswap ebx
bswap ecx
bswap edx
mov dword ptr[TempIpV6StartBlock1], edx
mov dword ptr[TempIpV6StartBlock1+4], ecx
mov dword ptr[TempIpV6StartBlock1+8], ebx
mov dword ptr[TempIpV6StartBlock1+12], eax

movups xmm0, [TempIpV6StartBlock1]
movups xmm1, [edi]
PCMPGTQ xmm0, xmm1
or eax,-1 ; clear zero flag
ptest xmm0, xmm1
jb short V6CountryNext

pop edx
lea esi, [esi+34]
dec edx
jne FindV6CountryLoop
jmp NotAV6CountryAddress

V6CountryNext:
lea esi,[esi-34]
mov eax, dword ptr[esi+16]
mov ebx, dword ptr[esi+20]
mov ecx, dword ptr[esi+24]
mov edx, dword ptr[esi+28]
bswap eax
bswap ebx
bswap ecx
bswap edx
mov dword ptr[TempIpV6EndBlock1], edx
mov dword ptr[TempIpV6EndBlock1+4], ecx
mov dword ptr[TempIpV6EndBlock1+8], ebx
mov dword ptr[TempIpV6EndBlock1+12], eax

movups xmm0, [TempIpV6EndBlock1] ;[esi]
movups xmm1, [edi]
PCMPGTQ xmm0, xmm1
or eax,-1 ; clear zero flag
ptest xmm0, xmm1
jb short V6CountryFound

pop edx
lea esi, [esi+34]
dec edx
jne FindV6CountryLoop
jmp short NotAV6CountryAddress

V6CountryFound:
pop edx

lea eax, IPv6CountryISOCode
xor ecx, ecx
mov cl, byte ptr[esi+32]
mov byte ptr[eax], cl
mov cl, byte ptr[esi+33]
mov byte ptr[eax+1], cl
ret

;Routine unused - maybe use in proxy lookup
IPv6CountryIsSame:
ret

NotAV6CountryAddress:
mov byte ptr[IPv6CountryISOCode], "Z"
mov byte ptr[IPv6CountryISOCode+1], "Z"
lea eax, IPv6CountryISOCode
ret

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 6769
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: 128 bit Comparison on 32-bit CPU
« Reply #14 on: November 07, 2019, 01:12:46 AM »
Colin,

With a data layout of this type, what is the search criterion you need to use ?
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :skrewy: