The MASM Forum

General => The Laboratory => Topic started by: Queue on August 13, 2017, 08:35:33 AM

Title: strcmp with wildcards
Post by: Queue on August 13, 2017, 08:35:33 AM
A recent thread in here reminded me of a time I needed simple wildcard support (specifically * and ?) for a rough lstrcmpA equivalent. While this doesn't handle all sorts of special circumstances, it did its job, but it always bugged me that I had a jump and label when the rest of the function was able to use .if/.repeat/etc. Anyone have any suggestions? Or is it just screwed because MASM doesn't let you break out of multiple loops (or does it and I've just never figured out the syntax)?


include \masm32\include\masm32rt.inc

_lstrcmpA_wildcard PROTO :DWORD,:DWORD

; int 3 align: align to the given boundary, padding with int 3 (0CCh) bytes
align_int3 macro _:REQ
local $$, $$$
$$ equ $
align _
$$$ equ $ - $$
if $$$
org $$
db $$$ dup(0CCh)
endif
endm

; strip prologue: force MASM to emit the default prologue, then rewind with org so the proc body overwrites it
prologue_none macro
local $$
$$ db 0CCh
nop
org $$
endm

.data

align_int3 16
sResult db "0 == match",0Ah,"1 == no match",0Ah,0Ah
sValue dd "0" ; patched by showmsg; the upper zero bytes of the dword terminate sResult

align_int3 16
sString1 db "This is the string to search.",0

align_int3 16 ; expect no match
sString2a db "*the string x*",0

align_int3 16 ; expect match
sString2b db "*the?string?t*",0

align_int3 16 ; expect no match (*is matches too early)
sString2c db "*is the strin*",0

align_int3 16 ; expect match
sString2d db "*is*the strin*",0

align_int3 16 ; expect no match (* escapes ?)
sString2e db "*?",0

align_int3 16 ; expect match
sString2f db "*.",0

align_int3 16

;~========================================================================================

.code

align_int3 16
EntryPoint proc <forceframe> uses ebx esi edi
invoke _lstrcmpA_wildcard, offset sString1, offset sString2a
call showmsg

invoke _lstrcmpA_wildcard, offset sString1, offset sString2b
call showmsg

invoke _lstrcmpA_wildcard, offset sString1, offset sString2c
call showmsg

invoke _lstrcmpA_wildcard, offset sString1, offset sString2d
call showmsg

invoke _lstrcmpA_wildcard, offset sString1, offset sString2e
call showmsg

invoke _lstrcmpA_wildcard, offset sString1, offset sString2f
call showmsg

exit
ret
EntryPoint endp

;~........................................................................................

align_int3 16
showmsg proc
.if eax == 0
mov sValue, "0"
.else
mov sValue, "1"
.endif
invoke MessageBoxA, NULL, offset sResult, NULL, MB_OK
ret
showmsg endp

;~........................................................................................

align_int3 16
_lstrcmpA_wildcard proc lpString1:DWORD, lpString2:DWORD
prologue_none
mov edx, [esp + DWORD * 2] ; lpString2
mov ecx, [esp + DWORD * 1] ; lpString1
.while TRUE
mov al, byte ptr [edx]
.if al == "*"
.repeat
inc edx
mov al, byte ptr [edx]
.until al != "*"
.break .if !(al & al) ; al == 0
.repeat
mov ah, byte ptr [ecx]
; .break .if !(ah & ah) ; ah == 0
test ah, ah
jz break
inc ecx
.until ah == al
.else
mov ah, byte ptr [ecx]
.break .if !(ah & ah) ; ah == 0
.if !(al & al) ; al == 0
xchg al, ah
.break
.endif
.break .if ah != al && al != "?"
inc ecx
.endif
inc edx
.endw
break: movzx eax, al
retn DWORD * 2
_lstrcmpA_wildcard endp

;~........................................................................................

align_int3 16

end EntryPoint


The example strings aren't comprehensive, but do cover quirks like "*?". I realize naming it _lstrcmpA_* isn't great since it doesn't match lstrcmpA behavior, but in my defense, when it was written, it matched behavior in the specific circumstance it was designed for.
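
For anyone who would rather read the logic than the registers, the matching loop can be modelled in C roughly as follows. This is only a sketch and not part of the original program: the name wildcard_cmp is made up for illustration, and al/ah simply mirror the registers. Like the assembly, it returns 0 for a match and a nonzero character otherwise.

```c
/* C model of the assembly routine above (hypothetical name).
   str is lpString1, pat is lpString2. Returns 0 on match. */
int wildcard_cmp(const char *str, const char *pat)
{
    unsigned char al, ah;

    for (;;) {
        al = (unsigned char)*pat;
        if (al == '*') {
            /* collapse a run of '*' and fetch the character after it */
            do { al = (unsigned char)*++pat; } while (al == '*');
            if (al == 0)
                break;              /* trailing '*': match, al is 0 */
            /* scan the string for the first occurrence of that character */
            do {
                ah = (unsigned char)*str;
                if (ah == 0)
                    goto done;      /* string exhausted: al is nonzero */
                ++str;
            } while (ah != al);
        } else {
            ah = (unsigned char)*str;
            if (ah == 0)
                break;              /* string ended: match only if al is 0 too */
            if (al == 0) {
                al = ah;            /* pattern ended first: report the leftover char */
                break;
            }
            if (ah != al && al != '?')
                break;              /* literal mismatch */
            ++str;
        }
        ++pat;
    }
done:
    return al;
}
```

With the test strings above, wildcard_cmp("This is the string to search.", "*the?string?t*") returns 0, while the pattern "*?" returns nonzero, because the character right after a * is always searched for literally. The scan also never backtracks, which is why "*is the strin*" fails: it locks onto the "is" inside "This".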

Queue
Title: Re: strcmp with wildcards
Post by: felipe on August 13, 2017, 10:03:38 AM
Quote from: Queue on August 13, 2017, 08:35:33 AM
but it always bugged me that I had a jump and label when the rest of the function was able to use .if/.repeat/etc. Anyone have any suggestions?

You can use only jumps and labels.

:biggrin:
Title: Re: strcmp with wildcards
Post by: jj2007 on August 13, 2017, 10:23:55 AM
MASM has no multi-level ".break", but you can always jump to a label, as you have done in your code. The stack should be balanced at the jump target, of course.

You show some interesting programming techniques.

The official way of skipping the stack frame is this one, though:
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
align_int3 16
_lstrcmpA_wildcard proc lpString1:DWORD, lpString2:DWORD
; prologue_none
...
retn DWORD * 2
_lstrcmpA_wildcard endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef


If you use UAsm (http://www.terraspace.co.uk/uasm.html#p2), you don't even need that step: UAsm determines automatically that no stack frame is needed.

Btw can you give a short real life example where to use this wildcard comparison? I have never thought of it, just curious how it could be applied.
Title: Re: strcmp with wildcards
Post by: Queue on August 13, 2017, 11:24:35 AM
Quote from: jj2007
The official way of skipping the stack frame is...
Right, notice the macro is named to mimic the option. Sometimes declaring and then redeclaring the prologue option is a pain in the butt. Note that this macro causes issues if there's a relocatable address in an instruction in the proc.
Quote from: jj2007
Btw can you give a short real life example where to use this wildcard comparison? I have never thought of it, just curious how it could be applied.
Any time you want to check if part of a string matches your search criteria without needing a complete match. I guess an example could be: use with EnumWindows and GetWindowTextA to find a window whose title contains both the program's name and a filename. It's an imperfect example because there'd be better ways to find said window, and you'd be better off using Unicode functions if you resorted to a text search, but that's beside the point. It'd arguably be better (more precise) than using "strstr" and way less expensive than a RegExp implementation.
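
A rough sketch of that window-title use case, written in C so it can be tried anywhere. Everything here is an assumption for illustration: wildcard_cmp is a C model of the routine, find_title is a made-up helper, and a caller-supplied array of titles stands in for what an EnumWindows callback would collect with GetWindowTextA.

```c
#include <stddef.h>

/* C model of the wildcard routine (hypothetical name); 0 means match. */
int wildcard_cmp(const char *str, const char *pat)
{
    unsigned char al, ah;

    for (;;) {
        al = (unsigned char)*pat;
        if (al == '*') {
            do { al = (unsigned char)*++pat; } while (al == '*');
            if (al == 0)
                break;
            do {
                ah = (unsigned char)*str;
                if (ah == 0)
                    goto done;
                ++str;
            } while (ah != al);
        } else {
            ah = (unsigned char)*str;
            if (ah == 0)
                break;
            if (al == 0) {
                al = ah;
                break;
            }
            if (ah != al && al != '?')
                break;
            ++str;
        }
        ++pat;
    }
done:
    return al;
}

/* Hypothetical helper: index of the first title matching pat, or -1.
   In a real program the titles would come from GetWindowTextA. */
int find_title(const char *const *titles, size_t count, const char *pat)
{
    for (size_t i = 0; i < count; ++i)
        if (wildcard_cmp(titles[i], pat) == 0)
            return (int)i;
    return -1;
}
```

Given the titles "Untitled - Notepad", "report.txt - Notepad" and "report.txt - Paint", the pattern "*report.txt*Notepad*" picks the second one, which a plain substring search for either word alone could not do. The comparison is case-sensitive, so "*notepad*" would match none of them.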

Regarding UASM, neat, though I rely on some weirdness of MASM for an everyday macro I use that I'm fairly certain UASM doesn't (and likely wouldn't) support. I'll post about that in a new topic some time in the future (it involves option oldmacros).

Queue
Title: Re: strcmp with wildcards
Post by: nidud on August 13, 2017, 12:19:26 PM
deleted
Title: Re: strcmp with wildcards
Post by: aw27 on August 13, 2017, 05:47:11 PM
Quote from: Queue on August 13, 2017, 08:35:33 AM
it always bugged me that I had a jump and label when the rest of the function was able to use .if/.repeat/etc

In this case a function call will alleviate the pain of seeing a jump and label.
Something like this:


;~........................................................................................

proc1 proc private
.repeat
mov ah, byte ptr [ecx]
test ah, ah
.if ZERO?
mov edx, 0
ret
.endif
inc ecx
.until ah == al
ret
proc1 endp

align_int3 16
_lstrcmpA_wildcard proc lpString1:DWORD, lpString2:DWORD
prologue_none
mov edx, [esp + DWORD * 2] ; lpString2
mov ecx, [esp + DWORD * 1] ; lpString1
.while TRUE
mov al, byte ptr [edx]
.if al == "*"
.repeat
inc edx
mov al, byte ptr [edx]
.until al != "*"
.break .if !(al & al) ; al == 0

call proc1
.break .if edx==0
.else
mov ah, byte ptr [ecx]
.break .if !(ah & ah) ; ah == 0
.if !(al & al) ; al == 0
xchg al, ah
.break
.endif
.break .if ah != al && al != "?"
inc ecx
.endif
inc edx
.endw
movzx eax, al
retn DWORD * 2
_lstrcmpA_wildcard endp

;~........................................................................................

Title: Re: strcmp with wildcards
Post by: aw27 on August 13, 2017, 06:10:59 PM
Another (better) way:


;~........................................................................................

align_int3 16
_lstrcmpA_wildcard proc lpString1:DWORD, lpString2:DWORD
prologue_none
mov edx, [esp + DWORD * 2] ; lpString2
mov ecx, [esp + DWORD * 1] ; lpString1
.while TRUE
mov al, byte ptr [edx]
.if al == "*"
.repeat
inc edx
mov al, byte ptr [edx]
.until al != "*"
.break .if !(al & al) ; al == 0
.repeat
mov ah, byte ptr [ecx]
; .break .if !(ah & ah) ; ah == 0
test ah, ah
.break .if ZERO?
inc ecx
.until (ah == al)
.break .if (ah != al) ; inner loop ended on the terminator without a match, so leave the outer loop too
.else
mov ah, byte ptr [ecx]
.break .if !(ah & ah) ; ah == 0
.if !(al & al) ; al == 0
xchg al, ah
.break
.endif
.break .if ah != al && al != "?"
inc ecx
.endif
inc edx
.endw
movzx eax, al
retn DWORD * 2
_lstrcmpA_wildcard endp

;~........................................................................................
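
The idea in this version is that the inner loop can end for two reasons, and re-testing its exit condition afterwards tells them apart, so no label is needed. A rough C model of that control flow follows; the name wildcard_cmp_nogoto is made up for illustration, and al/ah stand in for the registers.

```c
/* C model of the label-free version (hypothetical name); 0 means match. */
int wildcard_cmp_nogoto(const char *str, const char *pat)
{
    unsigned char al, ah;

    for (;;) {
        al = (unsigned char)*pat;
        if (al == '*') {
            do { al = (unsigned char)*++pat; } while (al == '*');
            if (al == 0)
                break;
            do {
                ah = (unsigned char)*str;
                if (ah == 0)
                    break;          /* leaves the inner loop only */
                ++str;
            } while (ah != al);
            /* al is nonzero here, so ah != al can only mean the inner
               loop stopped on the terminator: leave the outer loop too */
            if (ah != al)
                break;
        } else {
            ah = (unsigned char)*str;
            if (ah == 0)
                break;
            if (al == 0) {
                al = ah;
                break;
            }
            if (ah != al && al != '?')
                break;
            ++str;
        }
        ++pat;
    }
    return al;
}
```

The re-test is safe only because the trailing-* case has already been handled, so al cannot be 0 when the scan starts. The cost is one extra compare per * segment in exchange for dropping the jump.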
Title: Re: strcmp with wildcards
Post by: jj2007 on August 13, 2017, 06:18:06 PM
Quote from: Queue on August 13, 2017, 11:24:35 AM
involves option oldmacros

To my knowledge, this is not implemented in UAsm. But most probably your macro needs only a minor modification to work without that option (I am curious to see the new thread). One reason why UAsm64 is my default nowadays is speed: It makes a real difference if you have to wait 3 (UAsm) or 10 (ML) seconds for your build to complete...
Title: Re: strcmp with wildcards
Post by: hutch-- on August 14, 2017, 12:10:36 AM
Queue,

If you have the MASM32 distribution, have a look at one of the library routines called "partial.asm". It does not do what you are doing, but it does something similar: scanning a text for partial matches.