Need some help on figuring out basic syntax

Started by gelatine1, December 10, 2013, 08:31:04 AM

gelatine1

Hello,
I decided to first get familiar with this language and especially its syntax. It's quite different from what I was used to. I tried to write a program that should simply output ' Hello World! '.
Though my data section is defined like this:
.data
HelloWorld db "Ijmmp",21h,"Xpsme",22h, 0

As some may notice the value of each character is increased by one. And thus I tried to write following program to output as it should be outputted.
I don't really know where I can find a lot of documentation about this, so I just searched for some examples and googled around a bit, but I can't figure it out now. This is my code:


.386
.model flat, stdcall
option casemap :none

include \masm32\include\kernel32.inc
include \masm32\include\masm32.inc
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\masm32.lib

.data
HelloWorld db "Ijmmp",21h,"Xpsme",22h, 0

.code
start:
mov esi,offset HelloWorld
decChar:
mov al,[esi]
or al ;if al is zero then the zero flag will be set.
jz output
dec [esi] ;decrease value of the character
inc esi ;go to next character
jmp decChar

output:
invoke StdOut, addr HelloWorld
invoke ExitProcess, 0
end start


This code does not work. Can anyone tell me how to change it to make it work? I also have a few questions. (a link to an explanation would be fine too)
What is the difference between 'offset' and 'addr'?
Is 'or al' a valid instruction? Should it be 'or eax'?
What exactly is esi? is it just a pointer register or something like that?
What is the meaning of the brackets like '[esi]'? Does it simply mean the value at the address pointed to by esi?
The instruction 'or al' or something like that is faster than 'cmp al' right?

Thanks in advance,
Jannes

jj2007

- or al with what?
- dec [esi]: byte, word or dword?

        mov al,[esi]
        or al, al ;if al is zero then the zero flag will be set.
        jz output
        dec byte ptr [esi] ;decrease value of the character
        inc esi ;go to next character

Now make it Ifmmp, and your code will work as expected ;-)
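
Putting the two fixes together, the whole program might look like the sketch below. It is the original listing with only three changes: `or al, al`, `dec byte ptr [esi]`, and the data changed to "Ifmmp" so that it decodes to "Hello". It assumes the default MASM32 install paths from the original post.

```asm
.386
.model flat, stdcall
option casemap :none

include \masm32\include\kernel32.inc
include \masm32\include\masm32.inc
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\masm32.lib

.data
; every character is stored one higher than it should be
HelloWorld db "Ifmmp",21h,"Xpsme",22h, 0

.code
start:
        mov esi,offset HelloWorld   ;esi = address of the first character
decChar:
        mov al,[esi]                ;load the current character
        or al,al                    ;zero flag is set if al is zero
        jz output                   ;terminator reached, stop decoding
        dec byte ptr [esi]          ;decrease the character in memory
        inc esi                     ;go to next character
        jmp decChar

output:
        invoke StdOut, addr HelloWorld
        invoke ExitProcess, 0
end start
```

The terminating 0 is tested before the decrement, so it is never changed and StdOut still finds the end of the string.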

> What is the difference between 'offset' and 'addr'?
None for global variables. Local variables need addr

> What exactly is esi? is it just a pointer register or something like that?
A general purpose register.

> What is the meaning of the brackets like '[esi]'? Does it simply mean the value at the address pointed to by esi?
Yes.

> The instruction 'or al' or something like that is faster than 'cmp al' right?
Maybe. You would have to time it. For loops with more than ten million iterations the difference might be noticeable.
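
For reference, here are three common ways to test a byte register for zero, as a small sketch. All three leave the zero flag set when al is 0; `test` is not mentioned above and is included only for comparison. No claim is made here about which is faster.

```asm
        or   al,al      ;ZF = 1 if al is zero; al is written back unchanged
        test al,al      ;same flags as above, but al is only read
        cmp  al,0       ;ZF = 1 if al is zero; al is only read
        jz   output     ;any of the three can be followed by jz
```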

dedndave

first, you must realize what the StdOut function does
it will display a zero-terminated string - everything from the address passed to it until it sees a 0 byte

so -
szMessage db 'Hello World',13,10,0
;
;
    INVOKE  StdOut,offset szMessage


will display the string, "Hello World", followed by a carriage return and a line feed
the terminating 0 will not be displayed

gelatine1

Oh okay, I thought the or instruction always works with the eax register. (Old habits)

And dedndave, I already knew how null-terminated strings work.
In my example I even made use of that fact: as long as the byte is not zero, decrease it. Otherwise output the whole string.

dedndave

EAX, EBX, ECX, EDX, ESI, EDI, and EBP are referred to as "general registers"
they are dword registers (32 bits wide)

portions of EAX, EBX, ECX, EDX are also accessible as bytes (8 bits wide) or words (16 bits wide)
the lower 8 bits of EAX may be accessed as AL
the next 8 bits of EAX may be accessed as AH
the lower 16 bits of EAX may be accessed as AX
so, BL, BH, BX, CL, CH, CX, DL, DH, DX are also valid register names
the upper 16 bits of EAX, EBX, ECX, EDX may not be individually accessed this way
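
a short sketch of how those names overlap, using an arbitrary example value
the shift at the end is just one possible way to reach the upper 16 bits, it is an addition and not something stated above

```asm
        mov     eax,12345678h   ;AL = 78h, AH = 56h, AX = 5678h
        mov     al,0FFh         ;only the low byte changes: EAX = 123456FFh
        mov     ah,0            ;only bits 8-15 change:     EAX = 123400FFh
        shr     eax,16          ;move the upper half down:  EAX = 00001234h, AX = 1234h
```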

sometimes, we put the address of some data in one of the registers
        mov     esi,offset HelloWorld

ESI now holds a 32-bit address that the assembler has assigned to the HelloWorld define

when we reference [ESI], we are referring to "the value at the address contained in ESI"
but - the assembler doesn't always know whether that data is a byte, word, or dword
sometimes, it does know because the other operand specifies the size

for example...
        mov     al,[esi]
        mov     [esi],al


the assembler knows we are addressing a byte, because AL is a byte-register

however, in these examples...
        mov     [esi],10
        cmp     [esi],0


the assembler does not know whether we mean a byte, word, or dword
so, a "size override" operator is needed
        mov byte ptr [esi],10
        cmp dword ptr [esi],0

dedndave

OFFSET is generally used as an operator that means "address of" for global data
ADDR is similar, but is most often used for local data, and was designed for use with invoke

however, they designed a little flexibility into the ADDR operator
it may be used in places other than invoke
and - if the data is global, it will work the same as OFFSET

OFFSET, however, is not flexible - the operand must be fixed-address global data or fixed-address code

the meaning of these operands is more apparent if you look at the disassembled program code

dedndave

let's say you have a global data item
Global1 db 'xyz'
for the sake of discussion, we'll assume the assembler has assigned the address 00400008 to Global1
        mov     esi,offset Global1
        mov     esi,addr Global1


the assembler generates the same code for both of those lines
        mov     esi,00400008

now, let's say we create a local data item
MyFunc PROC

    LOCAL   Local1   :DWORD

        mov     esi,offset Local1     ;the assembler generates an error
        lea     esi,Local1          ;the stack address of Local1 is calculated

        INVOKE  SomeFunc,offset Local1     ;the assembler generates an error
        INVOKE  SomeFunc,addr Local1      ;the stack address of Local1 is calculated

MyFunc ENDP


for the last INVOKE line, the assembler generates the following code
        lea     eax,Local1              ;which has the form [EBP-x]
        push    eax
        call    SomeFunc