Comparing strings from console input?

Started by JavaProphet, May 15, 2013, 05:17:35 AM

JavaProphet

Here's my code:
include \masm32\include\masm32rt.inc

.data
inp1 dd "none"
inpend dd "stop"

.code

start:
    mov inp1, cfm$(input("Type text here : "))
    printc "You Entered: "
    printc inp1
    printc "\n"
    cld
    mov ecx, 100
    mov esi, offset inp1
    mov edi, offset inpend
    repe cmpsb
    jne start
    inkey
    exit
end start


It works, but when I type stop in the console, it doesn't exit the program. If I change jne start to je start (so you get to type again), it crashes, although I assume it thinks the strings never match, and it crashes when it exits? This is my first attempt, and I've been browsing Google for a while, combining solutions to try to figure it out. I am following the tutorial here: http://www.oocities.org/codeteacher/x86asm/asml1001.html It is 16-bit TASM code, but I am writing this in MASM32. I appreciate any help I can get, thanks.

RuiLoureiro

Hi,
the buffer starts 4 bytes forward.
The problem seems to be this dd here:


include \masm32\include\masm32rt.inc

.data
inp1     dd "none"   ; <--- here
inpend dd "stop"

.code

start:
    mov inp1, cfm$(input("Type text here : "))
    print "You Entered: "
    print inp1
    print "\n"
    cld
    mov ecx, 100
    mov esi, offset inp1
    add esi, 4                           ; <----------- this trick works
    mov edi, offset inpend
    repe cmpsb
    jne start
   
    inkey
    exit
end start

jj2007

Minor changes make it work...

include \masm32\include\masm32rt.inc

.data
inp1 dd ?
inpend db "stop"

.code

start:
    mov inp1, cfm$(input("Type text here : "))
    printc "You Entered: "
    printc inp1
    printc "\n"
    cld
    mov ecx, 100
    mov esi, inp1
    mov edi, offset inpend
    repe cmpsb
    cmp byte ptr [esi-1], 0
    jne start
    inkey
    exit
end start


P.S.: \Masm32\macros\macros.asm
    input MACRO prompt:VARARG
...
        EXITM <OFFSET buffer>


buffer is 260 TCHARs long, so with mov ecx, 100 you are on the safe side.

Vortex

Hi JavaProphet,

You could also use the lstrcmp API or the Masm32 functions to compare NULL terminated strings.

qWord

You might try to avoid reading memory that isn't allocated by checking the size of the input. Using cfm$() also does not make much sense in this context, because it is commonly used to format string literals. Some variations:
include \masm32\include\masm32rt.inc

.data
    inp1    BYTE "none"
    inpend  BYTE "stop"
            BYTE 0 ; termination zero for lstrcmp()
.data?
    pszInput PCHAR ?
.code

start:
    mov pszInput,input("Type text here : ")
    printc "You Entered: "
    printc pszInput
    printc "\n"
   
    cld
    mov edi,pszInput
    xor eax,eax
    mov ecx,-1
    repne scasb
    sub edi,pszInput        ; edi = lengthof string + termination zero.
    lea ecx,[edi-1]         ; ecx = lengthof string
   
    cmp ecx,LENGTHOF inpend ; strings must have the same length!
    jne start
   
    mov esi,OFFSET inpend
    mov edi,pszInput
    repe cmpsb              ; compare strings
    jnz start
   
    print "method 2:",13,10

start2:
    mov pszInput,input("Type text here : ")
    invoke lstrcmp,OFFSET inpend,pszInput ; WinAPI
    test eax,eax
    jnz start2
   
    print "method 3:",13,10
   
start3:
    mov pszInput,input("Type text here : ")
    switch$ pszInput     ; MASM32 SDK
    case$ "stop"
        ; nothing to do here
    else$
        jmp start3
    endsw$
   
   
    inkey
    exit
end start
MREAL macros - when you need floating point arithmetic while assembling!

JavaProphet

Thanks, that works well, but can you explain these instructions to me?

cmp byte ptr [esi-1], 0

So you have a byte pointer, I assume that gets the memory value (not the address?) of the input, but what is the -1 for? Also, what is the 0 for? I'm also assuming that using the question mark as the value (i.e. an unknown value) makes it pull in the offset automatically, so there is no need to write the offset keyword? Thank you for helping.

JavaProphet

Quote from: Vortex on May 15, 2013, 06:15:01 AM
Hi JavaProphet,

You could also use the lstrcmp API or the Masm32 functions to compare NULL terminated strings.
I tried, but I couldn't figure out how lstrcmp or the other functions worked.

Quote from: RuiLoureiro on May 15, 2013, 05:58:58 AM
Hi,
the buffer starts 4 bytes forward.
The problem seems to be this dd here.


How exactly is it 4 bytes forward? I know that dd is a dword, so it's 4 bytes, but why would you add if it's forward? Wouldn't you need to subtract?

JavaProphet

Quote from: qWord on May 15, 2013, 06:16:17 AM
You might try to avoid reading memory that isn't allocated by checking the size of the input. Using cfm$() also does not make much sense in this context, because it is commonly used to format string literals. Some variations:
include \masm32\include\masm32rt.inc

.data
    inp1    BYTE "none"
    inpend  BYTE "stop"
            BYTE 0 ; termination zero for lstrcmp()
.data?
    pszInput PCHAR ? ; what is a pchar?
.code

start:
    mov pszInput,input("Type text here : ")
    printc "You Entered: "
    printc pszInput
    printc "\n"
   
    cld
    mov edi,pszInput
    xor eax,eax ; so set eax to 0?
    mov ecx,-1
    repne scasb ; so i know thats a string function, scanning, but i don't understand it
    sub edi,pszInput        ; edi = lengthof string + termination zero.  ; whaaaaa? you're subtracting strings? no idea what this does
    lea ecx,[edi-1]         ; ecx = lengthof string ; i don't exactly know what lea does, but i assume it's like mov, but like mov <to>, offset <from>
   
    cmp ecx,LENGTHOF inpend ; strings must have the same length! ; how did you get the length of input from ecx?
    jne start
   
    mov esi,OFFSET inpend
    mov edi,pszInput ; why no offset here?
    repe cmpsb              ; compare strings
    jnz start
   
    print "method 2:",13,10

start2:
    mov pszInput,input("Type text here : ")
    invoke lstrcmp,OFFSET inpend,pszInput ; WinAPI ; offset in one, not on the other?
    test eax,eax ; no idea what test is, although i assume it compares
    jnz start2
   
    print "method 3:",13,10
   
start3:
    mov pszInput,input("Type text here : ")
    switch$ pszInput     ; MASM32 SDK ; i like this, but this makes masm32 almost like a high-level language, so im not fond of it
    case$ "stop"
        ; nothing to do here
    else$
        jmp start3
    endsw$
   
   
    inkey
    exit
end start


There is so much I don't understand there. I added comments for my questions, thanks.

jj2007

Quote from: JavaProphet on May 15, 2013, 06:18:37 AM
Thanks, that works well, but can you explain these instructions to me?

cmp byte ptr [esi-1], 0

So you have a byte pointer, I assume that gets the memory value (not the address?) of the input, but what is the -1 for? Also, what is the 0 for?

cmpsb increments esi and edi. Your inpend is "stop", and there should be a zero byte afterwards - actually, you should have declared inpend db "stop", 0; since you forgot this, what follows is the "Type..." used for input.

cmpsb stops at the first difference, and there you must check if there is a zero terminator at position esi-1 (remember esi was incremented, so you have to go back one position).

If you typed stop, then cmpsb stops at a position where [esi-1] is the zero that terminates inp1 (input inserts the zero terminator for you). If you find that zero, it means the four bytes that came before were identical, i.e. the user typed "stop".
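
As a minimal sketch of that check in isolation: the names typed and filler below are hypothetical and not part of the original program. typed stands in for what input() would return, and filler is just some non-zero byte that happens to follow inpend in memory.

```asm
include \masm32\include\masm32rt.inc

.data
typed   db "stop", 0        ; hypothetical: stands in for the buffer filled by input()
inpend  db "stop"           ; no terminator, as in the code above
filler  db "X"              ; hypothetical: whatever byte follows inpend in memory

.code
start:
    cld
    mov ecx, 100
    mov esi, offset typed
    mov edi, offset inpend
    repe cmpsb                  ; stops after the first pair of bytes that differs
    ; the pair that differed here is typed[4] = 0 against filler = "X";
    ; esi has already been moved one byte past it
    cmp byte ptr [esi-1], 0
    jne differ
    print "stopped on the terminator: treated as a match", 13, 10
    jmp done
differ:
    print "stopped on an ordinary character: no match", 13, 10
done:
    inkey
    exit
end start
```

Note that with this check a shorter input such as "sto" also stops on its terminator (0 against "p"), so it is treated as a match too. That is one reason the length test in qWord's version is worth having.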

Quote
I'm also assuming that using the question mark as the value (i.e. an unknown value) makes it pull in the offset automatically, so there is no need to write the offset keyword? Thank you for helping.

As quoted from macros.asm, you are really doing a mov inp1, offset buffer. Therefore you just reserve a dword for inp1. No need to put a value, it will become offset buffer anyway.
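
A small sketch of the difference between the variable and the buffer it points to, assuming the input macro behaves as quoted from macros.asm. The program only prints the two addresses; the prompt text and the output labels are arbitrary.

```asm
include \masm32\include\masm32rt.inc

.data?
inp1 dd ?                       ; holds a pointer, not the text itself

.code
start:
    mov inp1, input("Type text here : ")
    mov esi, offset inp1        ; esi = address of the variable inp1
    mov edi, inp1               ; edi = content of inp1 = address of the typed text
    ; esi and edi are used because they survive the API calls made by print
    print "address of inp1   : "
    print str$(esi), 13, 10
    print "address of buffer : "
    print str$(edi), 13, 10
    print "text at buffer    : "
    print inp1, 13, 10
    inkey
    exit
end start
```

The two addresses should differ, which is why the comparison needs mov esi, inp1 rather than mov esi, offset inp1.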

P.S.: As qWord already wrote, cfm$ does nothing useful here. Actually, it really does nothing at all, see acfm$ in macros.asm

RuiLoureiro

Quote from: JavaProphet on May 15, 2013, 06:23:33 AM
Quote from: RuiLoureiro on May 15, 2013, 05:58:58 AM
Hi,
the buffer starts 4 bytes forward.
The problem seems to be this dd here.


How exactly is it 4 bytes forward? I know that dd is a dword, so it's 4 bytes, but why would you add if it's forward? Wouldn't you need to subtract?
No. When we type "stop", the buffer starts 4 bytes forward, with the word "stop", no doubt. I think it has something to do with this instruction: mov inp1, cfm$(input("Type text here : ")). It doesn't work if we declare inp1 db "none". So that is where the problem is, no doubt.
By "forward" I meant that we need to add 4.

qWord

Quote from: JavaProphet on May 15, 2013, 06:31:14 AM
There is so much I don't understand there. I added comments for my questions, thanks.

Quote
pszInput PCHAR ? ; what is a pchar?
PCHAR (case sensitive!) is a type definition. It actually resolves to a DWORD, because pointers are that size. An experienced reader will see from the name that PCHAR is a pointer. I prefer such definitions because they serve as a kind of documentation (even if the type LPCSTR would be more correct in this case... but that's another story).

Quote
repne scasb ; so i know thats a string function, scanning, but i don't understand it
The best sources of information about instructions are AMD's or Intel's manuals:
-> AMD's manuals (5 volumes)
-> Intel's manuals (3 volumes)

Quote
sub edi,pszInput        ; edi = lengthof string + termination zero.  ; whaaaaa? you're subtracting strings? no idea what this does
I'm subtracting pointers. After the scan, edi points just past the terminating zero and pszInput points to the start of the string, so the difference is the number of bytes scanned.

Quote
lea ecx,[edi-1]         ; ecx = lengthof string ; i don't exactly know what lea does, but i assume it's like mov, but like mov <to>, offset <from>
LEA = Load Effective Address. In this case it is used to calculate: ecx = edi - 1.
You might read up about memory addressing in the manuals, especially about SIB (scale index base).

Quote
cmp ecx,LENGTHOF inpend ; strings must have the same length! ; how did you get the length of input from ecx?
From the instruction REPNE SCASB, followed by the SUB and LEA explained above.

Quote
mov edi,pszInput ; why no offset here?
OFFSET is a MASM operator that resolves the relative address of a label while assembling. Using it here, you would load EDI with the address of the pointer variable pszInput (which means EDI would be a pointer to a pointer to a string).

Quote
test eax,eax ; no idea what test is, although i assume it compares
See the manuals.
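
As a rough sketch of the length calculation on its own: sample is a hypothetical test string, not part of the code above, and ebx is used only because ecx does not survive the API calls made by print.

```asm
include \masm32\include\masm32rt.inc

.data
sample db "stop", 0             ; hypothetical test string

.code
start:
    cld                         ; scan towards higher addresses
    mov edi, offset sample      ; edi = start of the string
    xor eax, eax                ; al = 0, the byte to search for
    mov ecx, -1                 ; no practical limit on the scan
    repne scasb                 ; advance edi until the zero byte has been scanned
    sub edi, offset sample      ; edi = bytes scanned, including the zero
    lea ecx, [edi-1]            ; ecx = edi - 1 = length without the zero
    mov ebx, ecx                ; keep the result in a register that print preserves
    print "length = "
    print str$(ebx), 13, 10
    inkey
    exit
end start
```

For "stop" the scan covers five bytes (four characters plus the zero), so this should print length = 4.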

Vortex

Hi JavaProphet,

Here is an example for you :

include     \masm32\include\masm32rt.inc

.data

str1        db 'Orange',0
str2        db 'ORANGE',0
msg1        db 'The strings are equal.',0
msg2        db 'The strings are not equal.',0

.code

start:

    invoke  lstrcmp,ADDR str1,ADDR str2
    cmp     eax,0       ;  If the strings are equal, return value = 0
    jne     @f
    invoke  StdOut,ADDR msg1
    jmp     finish
@@:   
    invoke  StdOut,ADDR msg2

finish:

    invoke  ExitProcess,0

END start
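
A possible variation on the example above, assuming a comparison that ignores case is wanted: the Windows API also provides lstrcmpi, which takes the same two arguments. The sketch below runs both functions on the same pair of strings; the output text is arbitrary.

```asm
include \masm32\include\masm32rt.inc

.data
str1 db 'Orange', 0
str2 db 'ORANGE', 0

.code
start:
    invoke lstrcmp, ADDR str1, ADDR str2    ; case-sensitive comparison
    .if eax == 0
        print "lstrcmp : equal", 13, 10
    .else
        print "lstrcmp : not equal", 13, 10
    .endif

    invoke lstrcmpi, ADDR str1, ADDR str2   ; same comparison, ignoring case
    .if eax == 0
        print "lstrcmpi: equal", 13, 10
    .else
        print "lstrcmpi: not equal", 13, 10
    .endif

    inkey
    exit
end start
```

Both functions return zero for equal strings and a negative or positive value otherwise, so testing eax against zero is enough when only equality matters.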