Wow, this assembly language programming is good stuff! Really getting into it now. It seems that absolutely everything is just either a 0 or a 1. How much simpler could it be than that! So all I need to work on now is making sure that all the 0's and 1's are arranged in the correct sequence. But to get to the point...
I could do with a bit of guidance on the use of offset, addr and lea under the flat memory model. As I currently understand it (so please correct me):
The offset operator returns (evaluates to) the address (i.e. offset) of its operand, which is an expression that can be resolved at assemble/link time to a constant memory location, such as a global variable. It can therefore be used as a direct memory operand.
The addr operator seems to do the same job in every respect as offset, so what's the difference?
The processor instruction lea loads an address into a register, so it can only be resolved at run time?
If I have declared a global variable called myvar, then can I use any of these instructions with exactly the same effect during execution?:
lea eax, myvar
mov eax, offset myvar
mov eax, addr myvar
I'm assuming that offset and addr are more efficient because they will resolve to constant values during assembly, so will not eat up ticks and bytes during execution as lea would. But is there more to it than that? Are there any simple guidelines as to which is best used for what, and in what circumstances?
Thanks in anticipation...
lea eax, myvar
mov eax, offset myvar
Both do the same for global variables, but lea is one byte longer.
You must use lea for local variables.
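To make the size difference concrete, here is a minimal sketch of the two forms for a global. It assumes the masm32rt.inc setup used elsewhere in this thread; myvar is just the name from the question, and the byte counts in the comments are for the 32-bit encodings with EAX as the destination.

```masm
include \masm32\include\masm32rt.inc

.data
myvar dd 0                  ; a global, so its address is a constant after linking

.code
start:
    mov eax, offset myvar   ; B8 imm32     -> 5 bytes, address is an immediate
    lea eax, myvar          ; 8D 05 disp32 -> 6 bytes, same value ends up in EAX
    exit
end start
```

Either way the address is baked into the instruction; lea just spends an extra ModR/M byte to get it there.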
mov eax, addr myvar ; syntax error
Addr is only valid in connection with the invoke directive.
For global variables,
invoke whatever, addr myvar
invoke whatever, offset myvar
are synonyms, i.e. they produce the same encoding.
Local variables cannot be addressed by offset myvar (as you probably guessed already).
It is therefore a good idea to always use
invoke whatever, offset myvar ; for global variables
invoke whatever, addr myvar ; for local variables
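Put together, that convention might look like the sketch below. GlobalBuf, LocalBuf and Demo are made-up names for illustration; dwtoa is the masm32 library routine used later in this thread, and the include line assumes a standard masm32 install.

```masm
include \masm32\include\masm32rt.inc

.data
GlobalBuf db 16 dup(0)                      ; hypothetical global buffer

.code
start:
    call Demo
    exit

Demo proc
    LOCAL LocalBuf[16]:BYTE                 ; hypothetical local buffer, lives on the stack
    invoke dwtoa, 1234, offset GlobalBuf    ; address is pushed as an immediate constant
    invoke dwtoa, 5678, addr LocalBuf       ; expands to lea eax, LocalBuf / push eax
    ret
Demo endp
end start
```

Note that the addr form on a local uses EAX as a scratch register, so whatever was in EAX before the invoke is gone afterwards.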
You learn fast :t
You could learn even more if you decide to look at all these examples through the eyes of Olly (http://www.ollydbg.de/version2.html). Easy to learn:
int 3 - insert where you want to make a pause
F7 - stepwise execution
F8 - stepwise but do not dive into procs
F9 - go until the program ends, or you hit an int 3 instruction.
And check my tips & traps (http://www.webalice.it/jj2006/Masm32_Tips_Tricks_and_Traps.htm) - very dense info but it should be ok for you.
ADDR and LEA are closely related
they are primarily used for getting the addresses of LOCAL variables
LEA can be used in other ways - it can even perform simple math for you
MyFunc PROC SomeParm:DWORD
LOCAL Local_1 :DWORD
INVOKE Something,addr Local_1
ret
MyFunc ENDP
the assembler cannot know the address of this local variable at assembly-time
instead, it will generate code that looks like this...
lea eax,Local_1
push eax
call Something
to be more accurate....
push ebp
mov ebp,esp
sub esp,4 ;make room for Local_1 at [EBP-4]
lea eax,[ebp-4]
push eax
call Something
leave ;same as mov esp,ebp then pop ebp
ret 4 ;discard the stack parameter
LEA calculates the address of Local_1 by taking the contents of EBP, and subtracting 4
Thanks guys, it's getting a little bit less foggy. But jj2207, you say "You must use lea for local variables" (which makes sense to me, because as I understand it local variables have to be declared (using local) on entry, are created on the stack, then destroyed on exit, so can't be resolved to constant offsets at assembly time). BUT a bit further down you then give an example:
invoke whatever, addr myvar ; for local variables
? or am I just misunderstanding what you meant ?
So addr and offset are merely different names for exactly the same operation? I can use whichever one I want and they will always do the same job under all circumstances? I have to say though that I suspect that there must be some kind of a difference between them somewhere along the line, otherwise why re-invent the wheel? Neither are processor instructions, so both are M$ assembler directives, but why the two for the same purpose?
Sorry to be such a nit-picker, but I'm still a bit unclear on addr vs. offset.
Quote from: hamper on February 04, 2013, 10:36:12 AM
So addr and offset are merely different names for exactly the same operation?
Yes and no:
Yes for global variables.
No for local ones: Offset is not valid syntax, and invoke abc, addr MyLocalWhatever produces
lea eax, MyLocalWhatever
push eax
under the hood.
Example:
include \masm32\include\masm32rt.inc
.code
start: call MyTest
exit
MyTest proc
LOCAL dw1, dw2, rc:RECT, buffer[100]:BYTE
int 3 ; Olly will stop here after pressing F9
mov dw2, 1234
invoke dwtoa, dw2, addr buffer
MsgBox 0, addr buffer, "Hi", MB_OK
ret
MyTest endp
end start
**** What you see in Olly: ****
CPU Disasm
Address Hex dump Command Comments
00401012 │. CC int3
00401013 │. C745 F8 D2040000 mov dword ptr [ebp-8], 4D2 <<<<<<<< mov dw2, 1234
0040101A │. 8D45 84 lea eax, [ebp-7C] <<<<<<< addr buffer
0040101D │. 50 push eax ; ┌Arg2 => offset LOCAL.31
0040101E │. FF75 F8 push dword ptr [ebp-8] ; │Arg1 => 4D2
00401021 │. E8 1A000000 call 00401040 <<<<<< dwordtoascii aka dwtoa ; └NewWin32.00401040
you had it very close in your original post
OFFSET is used when the address is known at assembly-time
ADDR or LEA is used when the address is not known until run-time
what many newbies find confusing, at first, is that they MAY be interchangeable - lol
but - only if the variable is global
.DATA
dwSomeStuff dd 12345678h
.CODE
INVOKE SomeProc,addr dwSomeStuff
INVOKE SomeProc,offset dwSomeStuff
to simplify it, let's just say it this way.....
the assembler knows the address of dwSomeStuff at assembly-time
so - it thinks you meant OFFSET
but - allows the use of ADDR
when the variable is LOCAL, however, ADDR must be used
the assembler isn't as smart :P
Ok guys, thanks for all the info. I think I've got it now, but I'm going to have to sleep on it and have a fresh look tomorrow at everything you've both said. No doubt it'll be obvious with a fresh brain, and I'll wonder what all the fuss was about.
Many thanks
now that you have that down, let's use them without invoke :biggrin:
sometimes, you may want to load the address of a globally allocated buffer into a register
you would use
mov edx,offset MyBuffer
very straightforward
the assembler knows the address of MyBuffer at assembly time
it generates code that actually loads that address as an "immediate" constant
mov edx,402000h
the assembler can also do some simple math for you, so long as it resolves to a constant at assembly time
these are valid
mov edx,offset MyBuffer+30
mov edx,offset MyBuffer+4*30
however, if you want to calculate an offset into the buffer (or an array) in a register.....
mov edx,SomeNumber
shl edx,4
there are a few ways to handle this
inside an address expression, you can scale a register by 2, 4, or 8, but not by larger powers of 2, and use it as an index
in the code above, we multiplied by 16, so the shl has to be done first
mov edx,SomeNumber
shl edx,4
mov eax,MyBuffer[edx] ;loads the content at that address into EAX
mov edx,SomeNumber
shl edx,4
lea eax,MyBuffer[edx] ;the CPU calculates the address at run-time and loads it into EAX
for smaller powers of 2
mov edx,SomeNumber
mov eax,MyBuffer[4*edx] ;loads the content at that address into EAX
mov edx,SomeNumber
lea eax,MyBuffer[4*edx]
so....
if the address needs to be calculated from a value in register, use LEA
LEA can also be used to do simple math
lea eax,[4*eax+eax] ;loads EAX with 5*EAX
lea eax,[edx+ecx] ;loads EAX with EDX + ECX
lea eax,[edx+ecx+4] ;loads EAX with EDX + ECX + 4
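A few more of the same kind, as a sketch to step through in a debugger. The starting value 7 is just an example number, and the only thing assumed beyond the posts above is the masm32rt.inc setup. One property worth knowing: lea does not change the flags, whereas add, sub and shl do.

```masm
include \masm32\include\masm32rt.inc

.code
start:
    mov ecx, 7
    lea eax, [ecx+2*ecx]    ; eax = 3*ecx = 21
    lea eax, [ecx+8*ecx]    ; eax = 9*ecx = 63
    lea eax, [eax+4*eax]    ; eax = 5*eax = 315
    lea edx, [ecx-1]        ; edx = ecx-1 = 6, flags are left untouched
    exit
end start
```

Put an int 3 before the first lea and single-step with F7 to watch EAX and EDX change while the flags stay as they were.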