Author Topic: byte ptr comparison  (Read 28026 times)

Ryan

  • Guest
byte ptr comparison
« on: June 08, 2012, 10:43:10 PM »
I'm still working on my search/find routine.  I want to do a byte by byte comparison.  If I can't do ".if byte ptr [eax]==byte ptr [edi]", what is the best alternative?  I already have a workaround by just moving [edi] to dl.  Is there an easy way to eliminate the step?  I'd also like to keep the edx register open for other things.  How does cmpsb do it?

BogdanOntanu

  • Global Moderator
  • Member
  • *****
  • Posts: 64
    • Solar_OS, Solar_Asm and HE RTS Game
Re: byte ptr comparison
« Reply #1 on: June 08, 2012, 11:08:05 PM »

mov al,[esi]
.if al == [edi]

.endif

cmpsb does essentially the same thing, but it is slow...

You need to explain better what exactly you want to compare.

Two strings? one substring inside a string? A char inside a string?

Ryan

  • Guest
Re: byte ptr comparison
« Reply #2 on: June 08, 2012, 11:17:13 PM »
If cmpsb moves [esi] to al, then I guess there's no way around it.  No big deal.  I was just curious.  Thanks.

jj2007

  • Member
  • *****
  • Posts: 13957
  • Assembly is fun ;-)
    • MasmBasic
Re: byte ptr comparison
« Reply #3 on: June 09, 2012, 12:22:36 AM »
No, cmpsb doesn't use al, but it needs both esi and edi, of course. As for "slow": that may depend on your CPU. Test it yourself 8)

Code: [Select]
AMD Athlon(tm) Dual Core Processor 4450B (SSE3)
235     cycles for cmpsb
518     cycles for cmp [esi+ecx]
313     cycles for cmp [esi]

235     cycles for cmpsb
517     cycles for cmp [esi+ecx]
314     cycles for cmp [esi]

xandaz

  • Member
  • ****
  • Posts: 529
  • I luv you babe
    • My asm examples
Re: byte ptr comparison
« Reply #4 on: June 09, 2012, 12:25:43 AM »
    I'm not the kind who counts clocks, and I wasn't aware it was slow. I've done some compare routines more or less along these lines:
Code: [Select]
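Compare PROC uses esi edi lpSource:DWORD,cbSource:DWORD,lpCompare:DWORD
mov esi,lpSource
mov edi,lpCompare
mov ecx,cbSource ; the size of the source string
jecxz match ; zero bytes to compare: treated here as a match
compare:
cmpsb ; compare byte [esi] with byte [edi], then advance both pointers
jnz no_match
dec ecx
jnz compare ; dec sets ZF itself; jcxz would test only the 16-bit CX
match:
mov eax,TRUE
ret
no_match:
mov eax,FALSE
ret
Compare ENDP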

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 10583
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: byte ptr comparison
« Reply #5 on: June 09, 2012, 12:28:12 AM »
Ryan,

Bogdan is right here: the old string instructions are generally slow, and they also have the irritation of being locked into specific registers, which may not fit in with the rest of the code you need to write. The mechanics of doing string comparisons vary with what you are doing. For a normal search, you load the start character into a register, then compare it to a memory operand addressed with BYTE PTR.


IF 0  ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
                      Build this template with "CONSOLE ASSEMBLE AND LINK"
ENDIF ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include\masm32rt.inc

    .data?
      value dd ?

    .data
      txt db "one two three four five six seven", 0

    .code

start:
   
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    call main
    inkey
    exit

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    mov eax, "f"            ; the start character to find

    mov edx, OFFSET txt     ; the address of the text to find it in
    sub edx, 1


  lbl0:
    add edx, 1
    cmp BYTE PTR [edx], 0   ; is it the zero terminator ?
    je outa_here
    cmp BYTE PTR [edx], al  ; compare against the BYTE sized part of EAX
    jne lbl0

  ; in a search algo you would branch here to compare the
  ; rest of the search word to the current text location
  ; then return to lbl0 if it does not match

  outa_here:

    ret

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end start

dedndave

  • Member
  • *****
  • Posts: 8828
  • Still using Abacus 2.0
    • DednDave
Re: byte ptr comparison
« Reply #6 on: June 09, 2012, 12:30:08 AM »
REPZ CMPSB might not be as bad as you think   :P
we should probably do some comparisons
at any rate, have a look at Hutch's szCmp and szCmpi routines in the \masm32\m32lib folder
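
For reference, a minimal sketch of what a REPZ (REPE) CMPSB compare could look like. MemEqual and its parameter names are hypothetical and are not part of the MASM32 library; the sketch assumes masm32rt.inc is included so that TRUE and FALSE are defined.

```asm
; Hypothetical helper, not from \masm32\m32lib: returns TRUE in eax
; if the first cbLen bytes at lpA and lpB are identical.
MemEqual PROC uses esi edi lpA:DWORD, lpB:DWORD, cbLen:DWORD
    mov esi, lpA
    mov edi, lpB
    mov ecx, cbLen
    cld                 ; make sure esi/edi count upwards
    cmp ecx, ecx        ; force ZF=1 so a zero length counts as equal
    repe cmpsb          ; compare while the bytes are equal and ecx != 0
    mov eax, FALSE      ; mov does not change the flags
    jne done            ; ZF=0: a mismatch stopped the loop
    mov eax, TRUE
done:
    ret
MemEqual ENDP
```

Whether this beats a plain cmp loop depends on the CPU and the string length, so it is worth timing on the target machine.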

jj2007

  • Member
  • *****
  • Posts: 13957
  • Assembly is fun ;-)
    • MasmBasic
Re: byte ptr comparison
« Reply #7 on: June 09, 2012, 12:42:22 AM »
See above, Reply #3 :biggrin:

dedndave

  • Member
  • *****
  • Posts: 8828
  • Still using Abacus 2.0
    • DednDave
Re: byte ptr comparison
« Reply #8 on: June 09, 2012, 12:48:53 AM »
yah - that is what i might expect
if we were doing REPZ CMPSD on aligned DWORD's, things might be different

prescott w/htt
Code: [Select]
Intel(R) Pentium(R) 4 CPU 3.00GHz (SSE3)
100     cmpsb
100     [esi+ecx]
100     [esi]

500     cycles for cmpsb
448     cycles for cmp [esi+ecx]
544     cycles for cmp [esi]

496     cycles for cmpsb
448     cycles for cmp [esi+ecx]
541     cycles for cmp [esi]

string length may play an important role
although, we rarely compare strings longer than, say, 100 bytes or so

Ryan

  • Guest
Re: byte ptr comparison
« Reply #9 on: June 09, 2012, 12:54:25 AM »
If cmpsb doesn't use al, how does it do the comparison?  I've tried "cmp byte ptr [eax], byte ptr [edx]", but it doesn't work.  I assume it must be using some other method to do the comparison.

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 10583
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: byte ptr comparison
« Reply #10 on: June 09, 2012, 01:01:38 AM »
Ryan,

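That will not work because CMP has no form that takes two memory operands. Apart from the string instructions (CMPSB and friends, which are tied to ESI and EDI), x86 cannot compare memory to memory directly, so you must load at least one value into a register, for example CMP BYTE PTR [ESI], AL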
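
Applied to the original question, a small sketch that keeps edx free might look like the following. It assumes eax and edi hold the two addresses, as in the first post, and that cl is free at that point; any spare byte register would do.

```asm
    mov cl, [edi]               ; load one operand into a free byte register
    .if BYTE PTR [eax] == cl    ; assembles to: cmp byte ptr [eax], cl
        ; the two bytes match
    .endif
```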

dedndave

  • Member
  • *****
  • Posts: 8828
  • Still using Abacus 2.0
    • DednDave
Re: byte ptr comparison
« Reply #11 on: June 09, 2012, 01:05:04 AM »
i suppose CMPSB loads a byte from [ESI] into a temporary register

Ryan

  • Guest
Re: byte ptr comparison
« Reply #12 on: June 09, 2012, 01:18:30 AM »
I'm not sure what to think.  jj says cmpsb doesn't use al, but Hutch and Dave allude to the possibility that it might?

dedndave

  • Member
  • *****
  • Posts: 8828
  • Still using Abacus 2.0
    • DednDave
Re: byte ptr comparison
« Reply #13 on: June 09, 2012, 01:47:01 AM »
no - the CMPS instructions do not use AL/AX/EAX   :biggrin:

they perform the equivalent of
Code: [Select]
        cmp     [esi],[edi]
in byte, word, or dword form, as applicable
then, they adjust the index registers ESI and EDI

i mentioned a temporary register
that is a register (or plural) that the CPU uses for certain operations

the reason i say that is - i don't think there is any way for the hardware DMA to compare values at two different addresses - it's not part of the design
i.e., the CPU has to get one of the values into an internal register to make the comparison
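
As a rough sketch of that description, the byte form could be written out with ordinary instructions as below. This assumes the direction flag is clear (after cld), and it uses al only as a stand-in for the internal temporary, since the real cmpsb leaves al untouched.

```asm
    ; roughly what cmpsb does when the direction flag is clear
    mov al, [esi]       ; cmpsb uses a hidden temporary here, not al
    cmp al, [edi]       ; sets the flags from [esi] - [edi]
    lea esi, [esi+1]    ; lea advances the pointers without touching the flags
    lea edi, [edi+1]
```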

jj2007

  • Member
  • *****
  • Posts: 13957
  • Assembly is fun ;-)
    • MasmBasic
Re: byte ptr comparison
« Reply #14 on: June 09, 2012, 02:22:49 AM »
One more test with extra long strings:
Code: [Select]
Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
10000   cmpsb
10000   [esi+ecx]
10000   [esi]

40124   cycles for cmpsb
30076   cycles for cmp [esi+ecx]
31215   cycles for cmp [esi]

40127   cycles for cmpsb
30055   cycles for cmp [esi+ecx]
31208   cycles for cmp [esi]

On the AMD, cmpsb was clearly faster, but on this Intel Celeron M it is about 33% slower than cmp [esi+ecx] (and roughly 29% slower than cmp [esi]).