
strcmp with wildcards


Queue:
A recent thread in here reminded me of a time I needed simple wildcard support (specifically * and ?) for a rough lstrcmpA equivalent. While this doesn't handle all sorts of special circumstances, it did its job, but it always bugged me that I had a jump and label when the rest of the function was able to use .if/.repeat/etc. Anyone have any suggestions? Or is it just screwed because MASM doesn't let you break out of multiple loops (or does it and I've just never figured out the syntax)?


--- Code: ---
include \masm32\include\masm32rt.inc

_lstrcmpA_wildcard PROTO :DWORD,:DWORD

; int 3 align
align_int3 macro _:REQ
    local $$, $$$
    $$ equ $
    align _
    $$$ equ $ - $$
    if $$$
        org $$
        db $$$ dup(0CCh)
    endif
endm

; strip prologue
prologue_none macro
    local $$
    $$ db 0CCh
    nop
    org $$
endm

.data

align_int3 16
sResult db "0 == match",0Ah,"1 == no match",0Ah,0Ah
sValue dd "0"

align_int3 16
sString1 db "This is the string to search.",0

align_int3 16 ; expect no match
sString2a db "*the string x*",0

align_int3 16 ; expect match
sString2b db "*the?string?t*",0

align_int3 16 ; expect no match (*is matches too early)
sString2c db "*is the strin*",0

align_int3 16 ; expect match
sString2d db "*is*the strin*",0

align_int3 16 ; expect no match (* escapes ?)
sString2e db "*?",0

align_int3 16 ; expect match
sString2f db "*.",0

align_int3 16

;~========================================================================================

.code

align_int3 16
EntryPoint proc <forceframe> uses ebx esi edi
    invoke _lstrcmpA_wildcard, offset sString1, offset sString2a
    call showmsg

    invoke _lstrcmpA_wildcard, offset sString1, offset sString2b
    call showmsg

    invoke _lstrcmpA_wildcard, offset sString1, offset sString2c
    call showmsg

    invoke _lstrcmpA_wildcard, offset sString1, offset sString2d
    call showmsg

    invoke _lstrcmpA_wildcard, offset sString1, offset sString2e
    call showmsg

    invoke _lstrcmpA_wildcard, offset sString1, offset sString2f
    call showmsg

    exit
    ret
EntryPoint endp

;~........................................................................................

align_int3 16
showmsg proc
    .if eax == 0
        mov sValue, "0"
    .else
        mov sValue, "1"
    .endif
    invoke MessageBoxA, NULL, offset sResult, NULL, MB_OK
    ret
showmsg endp

;~........................................................................................

align_int3 16
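_lstrcmpA_wildcard proc lpString1:DWORD, lpString2:DWORD
    prologue_none
    mov edx, [esp + DWORD * 2] ; lpString2 (pattern)
    mov ecx, [esp + DWORD * 1] ; lpString1 (text)
    .while TRUE
        mov al, byte ptr [edx]
        .if al == "*"
            .repeat ; collapse a run of "*"
                inc edx
                mov al, byte ptr [edx]
            .until al != "*"
            .break .if !(al & al) ; al == 0: trailing "*" matches the rest
            .repeat ; scan the text for the literal that follows "*"
                mov ah, byte ptr [ecx]
                ; .break .if !(ah & ah) ; ah == 0
                test ah, ah
                jz break ; text exhausted: leave both loops
                inc ecx
            .until ah == al
        .else
            mov ah, byte ptr [ecx]
            .break .if !(ah & ah) ; ah == 0: text ended, result is al
            .if !(al & al) ; al == 0: pattern ended before the text
                xchg al, ah ; return the leftover text byte (nonzero)
                .break
            .endif
            .break .if ah != al && al != "?" ; mismatch; "?" accepts any byte
            inc ecx
        .endif
        inc edx
    .endw
break: movzx eax, al ; 0 == match, nonzero == no match
    retn DWORD * 2
_lstrcmpA_wildcard endp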

;~........................................................................................

align_int3 16

end EntryPoint

--- End code ---

The example strings aren't comprehensive, but do cover quirks like "*?". I realize naming it _lstrcmpA_* isn't great since it doesn't match lstrcmpA behavior, but in my defense, when it was written, it matched behavior in the specific circumstance it was designed for.
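
For anyone who would rather not trace the MASM, here is a rough C model of the same matching rules. It is only a sketch that follows the listing's control flow as read from the code above, not something from the original post, and the names wildcard_cmp and check_thread_cases are made up for illustration. It shows why "*is the strin*" fails (the literal after "*" is taken at its first occurrence and never retried) and why "*?" fails (the byte after "*" is searched for literally).

```c
#include <assert.h>

/* C model of the MASM routine (illustrative, hypothetical names).
   Returns 0 on match, nonzero otherwise, like the listing. */
int wildcard_cmp(const char *str, const char *pat)
{
    for (;;) {
        unsigned char p = (unsigned char)*pat;
        if (p == '*') {
            /* Collapse a run of '*' and fetch the byte that follows it. */
            do {
                p = (unsigned char)*++pat;
            } while (p == '*');
            if (p == 0)
                return 0;               /* trailing '*' matches the rest */
            /* Scan forward to the first occurrence of that byte.
               It is compared literally, so '?' is not a wildcard here,
               and there is no backtracking if the match fails later. */
            for (;;) {
                unsigned char s = (unsigned char)*str;
                if (s == 0)
                    return p;           /* text exhausted: no match */
                str++;
                if (s == p)
                    break;
            }
        } else {
            unsigned char s = (unsigned char)*str;
            if (s == 0)
                return p;               /* 0 if the pattern ended too */
            if (p == 0)
                return s;               /* pattern ended before the text */
            if (s != p && p != '?')
                return p;               /* plain mismatch */
            str++;
        }
        pat++;
    }
}

/* The six patterns from the listing, with the results its comments expect. */
void check_thread_cases(void)
{
    const char *s = "This is the string to search.";
    assert(wildcard_cmp(s, "*the string x*") != 0);
    assert(wildcard_cmp(s, "*the?string?t*") == 0);
    assert(wildcard_cmp(s, "*is the strin*") != 0);
    assert(wildcard_cmp(s, "*is*the strin*") == 0);
    assert(wildcard_cmp(s, "*?") != 0);
    assert(wildcard_cmp(s, "*.") == 0);
}
```

Calling check_thread_cases() from a main() should pass silently if the model matches the expectations written next to the test strings.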

Queue

felipe:

--- Quote from: Queue on August 13, 2017, 08:35:33 AM ---
but it always bugged me that I had a jump and label when the rest of the function was able to use .if/.repeat/etc. Anyone have any suggestions?

--- End quote ---

You can use only jumps and labels.

 :biggrin:

jj2007:
There are no "multiple breaks", but you can always jump to a label, as you have done in your code. The stack should be balanced, of course.
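
One label-free shape that sometimes works is to re-test the inner loop's exit condition immediately after it. The sketch below shows the idea in C on a cut-down version of the "*" scan; find_in_order and check_find_in_order are hypothetical names, not anything from the thread. After the inner loop the text byte is zero only when the text ran out, so one more conditional break leaves the outer loop as well. In the MASM listing this would presumably amount to restoring the commented-out .break and adding a second .break .if !(ah & ah) after .until ah == al, but that has not been assembled or tested here.

```c
#include <assert.h>

/* Returns 0 if every byte of `wanted` occurs in `str` in that order,
   otherwise the first wanted byte that could not be found.
   Two nested loops, no goto, single exit point. */
int find_in_order(const char *str, const char *wanted)
{
    unsigned char w = 0, s = 0;
    while ((w = (unsigned char)*wanted) != 0) {
        do {                        /* inner loop: scan the text for w */
            s = (unsigned char)*str;
            if (s == 0)
                break;              /* leaves the inner loop only */
            str++;
        } while (s != w);
        if (s == 0)
            break;                  /* same test again: leaves the outer loop */
        wanted++;
    }
    return w;
}

void check_find_in_order(void)
{
    assert(find_in_order("xaybzc", "abc") == 0);  /* a, b, c found in order */
    assert(find_in_order("abc", "ca") == 'a');    /* no 'a' after the 'c' */
    assert(find_in_order("abc", "") == 0);        /* nothing to find */
}
```

The re-test is unambiguous because the inner loop can only end normally with s equal to w, and w is never zero inside the outer loop.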

You show some interesting programming techniques :t

The official way of skipping the stack frame is this one, though:

--- Code: ---
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
align_int3 16
_lstrcmpA_wildcard proc lpString1:DWORD, lpString2:DWORD
; prologue_none
...
retn DWORD * 2
_lstrcmpA_wildcard endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
--- End code ---

If you use UAsm, you don't even need that step: UAsm determines automatically that no stack frame is needed.

Btw can you give a short real life example where to use this wildcard comparison? I have never thought of it, just curious how it could be applied.

Queue:

--- Quote from: jj2007 ---
The official way of skipping the stack frame is...

--- End quote ---
Right, notice the macro is named to mimic the option. Sometimes declaring and then redeclaring the prologue option is a pain in the butt. Note that this macro causes issues if there's a relocatable address in an instruction in the proc.

--- Quote from: jj2007 ---
Btw can you give a short real life example where to use this wildcard comparison? I have never thought of it, just curious how it could be applied.

--- End quote ---
Any time you want to check if part of a string matches your search criteria without needing a complete match. I guess an example could be: use it with EnumWindows and GetWindowTextA to find a window whose title contains both the program's name and a filename. It's an imperfect example because there'd be better ways to find said window, and you'd be better off using the Unicode functions if you resorted to a text search, but that's beside the point. It'd arguably be better (more precise) than using "strstr" and way less expensive than a RegExp implementation.

Regarding UASM, neat, though I rely on some weirdness of MASM for an everyday macro I use that I'm fairly certain UASM doesn't (and likely wouldn't) support. I'll post about that in a new topic some time in the future (it involves option oldmacros).

Queue

nidud:
Asmc is an attempt to remove labels: in the code below, .break(1) and .continue(1) target the outer loop rather than the innermost one. I have a similar function with the following test.


--- Code: ---
cmpwarg PROC uses esi path:LPSTR, wild:LPSTR

    mov esi,path
    mov ecx,wild
    xor eax,eax

    .while 1

        lodsb
        mov ah,[ecx]
        inc ecx

        .if ah == '*'

            .while 1
                mov ah,[ecx]
                .if !ah
                    mov eax,1
                    .break(1)
                .endif
                inc ecx
                .continue .if ah != '.'
                xor edx,edx
                .while al
                    .if al == ah
                        mov edx,esi
                    .endif
                    lodsb
                .endw
                mov esi,edx
                .continue(1) .if edx
                mov ah,[ecx]
                inc ecx
                .continue .if ah == '*'
                test eax,eax
                mov  ah,0
                setz al
                .break(1)
            .endw

        .endif

        mov edx,eax
        xor eax,eax
        .if !dl
            .break .if edx
            inc eax
            .break
        .endif
        .break .if !dh
        .continue .if dh == '?'
        .if dh == '.'
            .continue .if dl == '.'
            .break
        .endif
        .break .if dl == '.'
        or edx,0x2020
        .break .if dl != dh
    .endw
    test eax,eax
    ret

cmpwarg ENDP

--- End code ---


--- Code: ---
main proc

    .assert( cmpwarg("file",     "*"        ) == 1 )
    .assert( cmpwarg("file",     "*.*"      ) == 1 )
    .assert( cmpwarg("file",     "f*.*"     ) == 1 )
    .assert( cmpwarg("file",     "file*"    ) == 1 )
    .assert( cmpwarg("file.c",   "file.?"   ) == 1 )
    .assert( cmpwarg("file.c",   "file.??"  ) == 0 )
    .assert( cmpwarg("file.c",   "???.?"    ) == 0 )
    .assert( cmpwarg("file.c",   "????.?"   ) == 1 )
    .assert( cmpwarg("file.c",   "file.c*"  ) == 1 )
    .assert( cmpwarg("file.c",   "file.c?"  ) == 0 )
    .assert( cmpwarg("file.x.c", "*.c"      ) == 1 )
    .assert( cmpwarg("file.x.c", "????.*.c" ) == 1 )
    .assert( cmpwarg("file.x.c", "????.*.b" ) == 0 )
    .assert( cmpwarg("file.x.c", "*.?.b"    ) == 0 )
    .assert( cmpwarg("file.x.c", "*.*.b"    ) == 0 )
    .assert( cmpwarg("file.x.c", "*.*.c"    ) == 0 )
    .assert( cmpwarg("file.x.c", "*.?.c"    ) == 0 )
    .assert( cmpwarg("file.x.c", "*?x.c"    ) == 1 )
    .assert( cmpwarg("file.ext", "*"        ) == 1 )
    .assert( cmpwarg("file.prj", "*.*"      ) == 1 )
    .assert( cmpwarg("file.ext", "x*.*"     ) == 0 )
    .assert( cmpwarg("ab39.ext", "?B39.??T" ) == 1 )
    .assert( cmpwarg("abcd.ext", "?b?.?x?"  ) == 0 )
    .assert( cmpwarg("abcd.ext", "?b*.?x?"  ) == 1 )
    .assert( cmpwarg("abcd.ext", "?b*.?z?"  ) == 0 )

    xor eax,eax
    ret

main endp

--- End code ---
