strcmp with wildcards

Started by Queue, August 13, 2017, 08:35:33 AM

Queue

A recent thread in here reminded me of a time I needed simple wildcard support (specifically * and ?) for a rough lstrcmpA equivalent. While this doesn't handle all sorts of special circumstances, it did its job, but it always bugged me that I had a jump and label when the rest of the function was able to use .if/.repeat/etc. Anyone have any suggestions? Or is it just screwed because MASM doesn't let you break out of multiple loops (or does it and I've just never figured out the syntax)?


include \masm32\include\masm32rt.inc

_lstrcmpA_wildcard PROTO :DWORD,:DWORD

; int 3 align
align_int3 macro _:REQ
local $$, $$$
$$ equ $
align _
$$$ equ $ - $$
if $$$
org $$
db $$$ dup(0CCh)
endif
endm

; strip prologue
prologue_none macro
local $$
$$ db 0CCh
nop
org $$
endm

.data

align_int3 16
sResult db "0 == match",0Ah,"1 == no match",0Ah,0Ah
sValue dd "0"

align_int3 16
sString1 db "This is the string to search.",0

align_int3 16 ; expect no match
sString2a db "*the string x*",0

align_int3 16 ; expect match
sString2b db "*the?string?t*",0

align_int3 16 ; expect no match (*is matches too early)
sString2c db "*is the strin*",0

align_int3 16 ; expect match
sString2d db "*is*the strin*",0

align_int3 16 ; expect no match (* escapes ?)
sString2e db "*?",0

align_int3 16 ; expect match
sString2f db "*.",0

align_int3 16

;~========================================================================================

.code

align_int3 16
EntryPoint proc <forceframe> uses ebx esi edi
invoke _lstrcmpA_wildcard, offset sString1, offset sString2a
call showmsg

invoke _lstrcmpA_wildcard, offset sString1, offset sString2b
call showmsg

invoke _lstrcmpA_wildcard, offset sString1, offset sString2c
call showmsg

invoke _lstrcmpA_wildcard, offset sString1, offset sString2d
call showmsg

invoke _lstrcmpA_wildcard, offset sString1, offset sString2e
call showmsg

invoke _lstrcmpA_wildcard, offset sString1, offset sString2f
call showmsg

exit
ret
EntryPoint endp

;~........................................................................................

align_int3 16
showmsg proc
.if eax == 0
mov sValue, "0"
.else
mov sValue, "1"
.endif
invoke MessageBoxA, NULL, offset sResult, NULL, MB_OK
ret
showmsg endp

;~........................................................................................

align_int3 16
_lstrcmpA_wildcard proc lpString1:DWORD, lpString2:DWORD
    prologue_none
    mov edx, [esp + DWORD * 2] ; lpString2
    mov ecx, [esp + DWORD * 1] ; lpString1
    .while TRUE
        mov al, byte ptr [edx]
        .if al == "*"
            .repeat
                inc edx
                mov al, byte ptr [edx]
            .until al != "*"
            .break .if !(al & al) ; al == 0
            .repeat
                mov ah, byte ptr [ecx]
                ; .break .if !(ah & ah) ; ah == 0
                test ah, ah
                jz break
                inc ecx
            .until ah == al
        .else
            mov ah, byte ptr [ecx]
            .break .if !(ah & ah) ; ah == 0
            .if !(al & al) ; al == 0
                xchg al, ah
                .break
            .endif
            .break .if ah != al && al != "?"
            inc ecx
        .endif
        inc edx
    .endw
break: movzx eax, al
    retn DWORD * 2
_lstrcmpA_wildcard endp

;~........................................................................................

align_int3 16

end EntryPoint


The example strings aren't comprehensive, but do cover quirks like "*?". I realize naming it _lstrcmpA_* isn't great since it doesn't match lstrcmpA behavior, but in my defense, when it was written, it matched behavior in the specific circumstance it was designed for.
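
For readers who find the control flow easier to follow in a high-level language, here is a rough C model of the routine above. It is only a sketch that mirrors the assembly branch by branch; the name wildcard_cmp is made up for illustration and is not part of the original code. Like the MASM version (movzx eax, al), it returns 0 on a match and a nonzero character value otherwise.

```c
/* Hypothetical C model of _lstrcmpA_wildcard (illustrative name only).
   str plays the role of lpString1 (ecx), pat of lpString2 (edx).
   Returns 0 on match, nonzero otherwise. */
static int wildcard_cmp(const char *str, const char *pat)
{
    unsigned char al, ah;

    for (;;) {
        al = (unsigned char)*pat;
        if (al == '*') {
            /* Skip a run of '*' and fetch the character that follows it. */
            do {
                al = (unsigned char)*++pat;
            } while (al == '*');
            if (al == 0)
                break;              /* trailing '*': the rest always matches */
            /* Scan the text for the first occurrence of that character.
               Note that '?' after '*' is searched for literally. */
            do {
                ah = (unsigned char)*str;
                if (ah == 0)
                    return al;      /* the "jz break" in the assembly */
                ++str;
            } while (ah != al);
        } else {
            ah = (unsigned char)*str;
            if (ah == 0)
                break;              /* text ended: result is al (0 if pattern ended too) */
            if (al == 0) {
                al = ah;            /* pattern ended first: report leftover text */
                break;
            }
            if (ah != al && al != '?')
                break;              /* literal mismatch */
            ++str;
        }
        ++pat;
    }
    return al;
}
```

Called with sString1 as the text, this model gives a nonzero result for "*the string x*", "*is the strin*" and "*?", and 0 for "*the?string?t*", "*is*the strin*" and "*.", which agrees with the expectations noted in the .data comments. Because there is no backtracking, the first candidate found after a "*" is the only one tried.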

Queue

felipe

Quote from: Queue on August 13, 2017, 08:35:33 AM
but it always bugged me that I had a jump and label when the rest of the function was able to use .if/.repeat/etc. Anyone have any suggestions?

You can use only jumps and labels.

:biggrin:

jj2007

There are no "multiple breaks", but you can always jump to a label, as you have done in your code. The stack should be balanced, of course.

You show some interesting programming techniques.

The official way of skipping the stack frame is this one, though:
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
align_int3 16
_lstrcmpA_wildcard proc lpString1:DWORD, lpString2:DWORD
; prologue_none
...
retn DWORD * 2
_lstrcmpA_wildcard endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef


If you use UAsm, you don't even need that step: UAsm determines automatically that no stack frame is needed.

Btw can you give a short real life example where to use this wildcard comparison? I have never thought of it, just curious how it could be applied.

Queue

Quote from: jj2007
The official way of skipping the stack frame is...
Right, notice the macro is named to mimic the option. Sometimes declaring and then redeclaring the prologue option is a pain in the butt. Note that this macro causes issues if there's a relocatable address in an instruction in the proc.
Quote from: jj2007
Btw can you give a short real life example where to use this wildcard comparison? I have never thought of it, just curious how it could be applied.
Any time you want to check whether part of a string matches your search criteria without needing a complete match. One example: use it with EnumWindows and GetWindowTextA to find a window whose title contains both the program's name and a filename. It's an imperfect example because there'd be better ways to find said window, and you'd be better off using Unicode functions if you resorted to a text search, but that's beside the point. It'd arguably be better (more precise) than using "strstr" and way less expensive than a RegExp implementation.
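
As a rough illustration of that use case, the C sketch below filters a list of window titles with a pattern. It assumes the titles have already been collected (for example by GetWindowTextA inside an EnumWindows callback, which is left out so the sketch stays portable). wildcard_cmp is a hypothetical C model of the routine from the first post, find_title is a made-up helper, and the titles in the usage note are invented sample data.

```c
#include <stddef.h>

/* Hypothetical C model of _lstrcmpA_wildcard: 0 == match, nonzero == no match. */
static int wildcard_cmp(const char *str, const char *pat)
{
    for (;;) {
        unsigned char p = (unsigned char)*pat;
        if (p == '*') {
            while ((p = (unsigned char)*++pat) == '*') {}  /* collapse runs of '*' */
            if (p == 0)
                return 0;                   /* trailing '*' matches anything */
            unsigned char s;
            do {                            /* scan for the next literal */
                s = (unsigned char)*str;
                if (s == 0)
                    return p;               /* ran out of text */
                ++str;
            } while (s != p);
        } else {
            unsigned char s = (unsigned char)*str;
            if (s == 0)
                return p;                   /* 0 only if the pattern ended too */
            if (p == 0)
                return s;                   /* text left over */
            if (s != p && p != '?')
                return p;                   /* literal mismatch */
            ++str;
        }
        ++pat;
    }
}

/* Made-up helper: index of the first title matching pat, or -1 if none does. */
static int find_title(const char *const *titles, size_t count, const char *pat)
{
    for (size_t i = 0; i < count; ++i)
        if (wildcard_cmp(titles[i], pat) == 0)
            return (int)i;
    return -1;
}
```

With the sample titles "Untitled - Notepad", "readme.txt - WordPad" and "readme.txt - Notepad", the pattern "*readme.txt*Notepad*" selects only the third entry. The lack of backtracking still applies: "*.txt*" does not match "a.tmp.txt", because the first "." found is the only one tried.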

Regarding UASM, neat, though I rely on some weirdness of MASM for an everyday macro I use that I'm fairly certain UASM doesn't (and likely wouldn't) support. I'll post about that in a new topic some time in the future (it involves option oldmacros).

Queue

nidud

#4
deleted

aw27

Quote from: Queue on August 13, 2017, 08:35:33 AM
it always bugged me that I had a jump and label when the rest of the function was able to use .if/.repeat/etc

In this case a function call will alleviate the pain of seeing a jump and label.
Something like this:


;~........................................................................................

proc1 proc private
    .repeat
        mov ah, byte ptr [ecx]
        test ah, ah
        .if ZERO?
            mov edx, 0
            ret
        .endif
        inc ecx
    .until ah == al
    ret
proc1 endp

align_int3 16
_lstrcmpA_wildcard proc lpString1:DWORD, lpString2:DWORD
    prologue_none
    mov edx, [esp + DWORD * 2] ; lpString2
    mov ecx, [esp + DWORD * 1] ; lpString1
    .while TRUE
        mov al, byte ptr [edx]
        .if al == "*"
            .repeat
                inc edx
                mov al, byte ptr [edx]
            .until al != "*"
            .break .if !(al & al) ; al == 0

            call proc1
            .break .if edx==0
        .else
            mov ah, byte ptr [ecx]
            .break .if !(ah & ah) ; ah == 0
            .if !(al & al) ; al == 0
                xchg al, ah
                .break
            .endif
            .break .if ah != al && al != "?"
            inc ecx
        .endif
        inc edx
    .endw
    movzx eax, al
    retn DWORD * 2
_lstrcmpA_wildcard endp

;~........................................................................................


aw27

Another (better) way:


;~........................................................................................

align_int3 16
_lstrcmpA_wildcard proc lpString1:DWORD, lpString2:DWORD
    prologue_none
    mov edx, [esp + DWORD * 2] ; lpString2
    mov ecx, [esp + DWORD * 1] ; lpString1
    .while TRUE
        mov al, byte ptr [edx]
        .if al == "*"
            .repeat
                inc edx
                mov al, byte ptr [edx]
            .until al != "*"
            .break .if !(al & al) ; al == 0
            .repeat
                mov ah, byte ptr [ecx]
                ; .break .if !(ah & ah) ; ah == 0
                test ah, ah
                .break .if ZERO?
                inc ecx
            .until (ah == al)
            .break .if (ah != al)
        .else
            mov ah, byte ptr [ecx]
            .break .if !(ah & ah) ; ah == 0
            .if !(al & al) ; al == 0
                xchg al, ah
                .break
            .endif
            .break .if ah != al && al != "?"
            inc ecx
        .endif
        inc edx
    .endw
    movzx eax, al
    retn DWORD * 2
_lstrcmpA_wildcard endp

;~........................................................................................

jj2007

Quote from: Queue on August 13, 2017, 11:24:35 AM
involves option oldmacros

To my knowledge, this is not implemented in UAsm. But most probably your macro needs only a minor modification to work without that option (I am curious to see the new thread). One reason why UAsm64 is my default nowadays is speed: It makes a real difference if you have to wait 3 (UAsm) or 10 (ML) seconds for your build to complete...

hutch--

Queue,

If you have the MASM32 distribution, have a look at one of the library routines called "partial.asm". It does not do exactly what you are doing, but it does something similar: it scans a text for partial matches.