Could do with some clarification on offset, addr and lea

hamper · February 04, 2013, 09:54:00 AM

Wow, this assembly language programming is good stuff! Really getting into it now. It seems that absolutely everything is just either a 0 or a 1. How much simpler could it be than that! So all I need to work on now is making sure that all the 0's and 1's are arranged in the correct sequence. But to get to the point...

I could do with a bit of guidance on the use of offset, addr and lea under the flat memory model. As I currently understand it (so please correct me):

The offset operator returns (evaluates to) the address (i.e. offset) of its operand, which is an expression that can be resolved at assemble/link time to a constant memory location, such as a global variable. It can therefore be used as a direct memory operand.

The addr operator seems to do the same job in every respect as offset, so what's the difference?

The processor instruction lea loads an address into a register, so it can only be resolved at run time?

If I have declared a global variable called myvar, then can I use any of these instructions with exactly the same effect during execution?:

lea eax, myvar
mov eax, offset myvar
mov eax, addr myvar

I'm assuming that offset and addr are more efficient because they will resolve to constant values during assembly, so will not eat up ticks and bytes during execution as lea would. But is there more to it than that? Are there any simple guidelines as to which is best used for what, and in what circumstances?

Thanks in anticipation...

jj2007 · February 04, 2013, 10:00:45 AM

lea eax, myvar
mov eax, offset myvar
Both do the same for global variables, but lea is one byte longer.
You must use lea for local variables.

mov eax, addr myvar ; syntax error

Addr is only valid in connection with the invoke directive.
For global variables,
invoke whatever, addr myvar
invoke whatever, offset myvar
are synonyms, i.e. they produce the same encoding.
Local variables cannot be addressed by offset myvar (as you probably guessed already).
It is therefore a good idea to always use
invoke whatever, offset myvar ; for global variables
invoke whatever, addr myvar ; for local variables
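
A minimal sketch that puts those rules side by side, assuming the MASM32 SDK is installed at \masm32. The names gText, Demo and localBuf are made up for this illustration and do not come from the thread:

```masm
include \masm32\include\masm32rt.inc

.data
gText db "global", 0                ; hypothetical global variable

.code
Demo proc
    LOCAL localBuf[16]:BYTE         ; hypothetical local, lives on the stack

    mov eax, offset gText           ; global: the address is a constant after linking
    lea edx, gText                  ; same value, but the encoding is one byte longer
    lea ecx, localBuf               ; local: ebp-relative address, so lea is required
    ; mov ecx, offset localBuf      ; would not assemble: a local has no fixed address

    ; addr expands to lea/push for the local, offset to a push of a constant
    invoke lstrcpy, addr localBuf, offset gText
    ret
Demo endp

start:
    call Demo
    exit
end start
```

Stepping through Demo in a debugger should show the two different encodings for the global and the lea/push pair that invoke generates for the local.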

You learn fast :t
You could learn even more if you decide to look at all these examples through the eyes of Olly. Easy to learn:
int 3 - insert where you want to make a pause
F7 - stepwise execution
F8 - stepwise but do not dive into procs
F9 - go until the program ends, or you hit an int 3 instruction.

And check my tips & traps - very dense info but it should be ok for you.

dedndave · February 04, 2013, 10:04:21 AM

ADDR and LEA are closely related
they are primarily used for getting the addresses of LOCAL variables
LEA can be used in other ways - it can even perform simple math for you

MyFunc  PROC    SomeParm:DWORD

    LOCAL   Local_1 :DWORD

    INVOKE  Something,addr Local_1

    ret

MyFunc  ENDP

the assembler cannot know the address of this local variable at assembly-time
instead, it will generate code that looks like this...

    lea     eax,Local_1
    push    eax
    call    Something

to be more accurate....

    push    ebp
    mov     ebp,esp
    sub     esp,4                ;make room for Local_1 at [EBP-4]

    lea     eax,[ebp-4]
    push    eax
    call    Something

    leave                        ;same as mov esp,ebp then pop ebp
    ret     4                    ;discard the stack parameter

LEA calculates the address of Local_1 by taking the contents of EBP, and subtracting 4

hamper · February 04, 2013, 10:36:12 AM

Thanks guys, it's getting a little bit less foggy. But jj2007, you say "You must use lea for local variables" (which makes sense to me, because as I understand it local variables have to be declared (using local) on entry, are created onto the stack, then destroyed on exit, so can't be resolved to constant offsets at assembly time). BUT a bit further down you then give an example:

invoke whatever, addr myvar ; for local variables

? or am I just misunderstanding what you meant ?

So addr and offset are merely different names for exactly the same operation? I can use whichever one I want and they will always do the same job under all circumstances? I have to say though that I suspect that there must be some kind of a difference between them somewhere along the line, otherwise why re-invent the wheel? Neither are processor instructions, so both are M$ assembler directives, but why the two for the same purpose?

Sorry to be such a nit-picker, but I'm still a bit unclear on addr vs. offset.

jj2007 · February 04, 2013, 10:40:18 AM

Quote from: hamper on February 04, 2013, 10:36:12 AM
So addr and offset are merely different names for exactly the same operation?

Yes and no:
Yes for global variables.
No for local ones: Offset is not valid syntax, and invoke abc, addr MyLocalWhatever produces
lea eax, MyLocalWhatever
push eax
under the hood.

Example:

include \masm32\include\masm32rt.inc

.code
start: call MyTest
exit

MyTest proc
LOCAL dw1, dw2, rc:RECT, buffer[100]:BYTE
int 3 ; Olly will stop here after pressing F9
mov dw2, 1234
invoke dwtoa, dw2, addr buffer
MsgBox 0, addr buffer, "Hi", MB_OK
ret
MyTest endp
end start

**** What you see in Olly: ****

CPU Disasm
Address Hex dump Command Comments
00401012 │. CC int3
00401013 │. C745 F8 D2040000 mov dword ptr [ebp-8], 4D2 <<<<<<<< mov dw2, 1234
0040101A │. 8D45 84 lea eax, [ebp-7C] <<<<<<< addr buffer
0040101D │. 50 push eax ; ┌Arg2 => offset LOCAL.31
0040101E │. FF75 F8 push dword ptr [ebp-8] ; │Arg1 => 4D2
00401021 │. E8 1A000000 call 00401040 <<<<<< dwordtoascii aka dwtoa ; └NewWin32.00401040

dedndave · February 04, 2013, 10:45:41 AM

you had it very close in your original post
OFFSET is used when the address is known at assembly-time
ADDR or LEA is used when the address is not known until run-time

what many newbies find confusing, at first, is that they MAY be interchangeable - lol
but - only if the variable is global

    .DATA

dwSomeStuff dd 12345678h

    .CODE

    INVOKE  SomeProc,addr dwSomeStuff
    INVOKE  SomeProc,offset dwSomeStuff

to simplify it, let's just say it this way.....

the assembler knows the address of dwSomeStuff at assembly-time
so - it thinks you meant OFFSET
but - allows the use of ADDR

when the variable is LOCAL, however, ADDR must be used
the assembler isn't as smart :P

hamper · February 04, 2013, 10:56:31 AM

Ok guys, thanks for all the info. I think I've got it now, but I'm going to have to sleep on it and have a fresh look tomorrow at everything you've both said. No doubt it'll be obvious with a fresh brain, and I'll wonder what all the fuss was about.

Many thanks

dedndave · February 04, 2013, 12:13:11 PM

now that you have that down, let's use them without invoke

sometimes, you may want to load the address of a globally allocated buffer into a register
you would use

mov edx,offset MyBuffer
very straightforward
the assembler knows the address of MyBuffer at assembly time
it generates code that actually loads that address as an "immediate" constant

mov edx,402000h

the assembler can also do some simple math for you, so long as it resolves to a constant at assembly time
these are valid

        mov     edx,offset MyBuffer+30
        mov     edx,offset MyBuffer+4*30
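
as a small worked case of that constant arithmetic, here is a sketch assuming the MASM32 SDK; MyTable is a hypothetical name used only for this illustration:

```masm
include \masm32\include\masm32rt.inc

.data
MyTable dd 10, 20, 30, 40           ; hypothetical table of four DWORDs

.code
start:
    mov edx, offset MyTable + 2*4   ; address of the third DWORD, a constant after linking
    mov eax, [edx]                  ; eax = 30
    mov ecx, MyTable[2*4]           ; same element loaded directly: ecx = 30
    exit
end start
```

OFFSET binds more tightly than +, so the expression is read as (offset MyTable) + 8.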

however, if you want to calculate an offset into the buffer (or an array) in a register.....

        mov     edx,SomeNumber
        shl     edx,4

there are a few ways to handle this
you can multiply a register by 2, 4, or 8, but not larger powers of 2, and use it as an index
in the code above, we multiplied by 16

        mov     edx,SomeNumber
        shl     edx,4
        mov     eax,MyBuffer[edx]  ;loads the content at that address into EAX

        mov     edx,SomeNumber
        shl     edx,4
        lea     eax,MyBuffer[edx]  ;the CPU calculates the address at run time and loads it into EAX

for smaller powers of 2

        mov     edx,SomeNumber
        mov     eax,MyBuffer[4*edx]  ;loads the content at MyBuffer+4*EDX into EAX

        mov     edx,SomeNumber
        lea     eax,MyBuffer[4*edx]

so....
if the address needs to be calculated from a value in a register, use LEA
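
a complete sketch of the multiply-by-16 case, assuming the MASM32 SDK; RecTable and RecNo are hypothetical names invented for this illustration:

```masm
include \masm32\include\masm32rt.inc

.data
RecTable db 4 dup (16 dup (0))  ; hypothetical: four records of 16 bytes each
RecNo    dd 2                   ; hypothetical record number, known only at run time

.code
start:
    mov edx, RecNo
    shl edx, 4                  ; edx = RecNo*16, a scale the addressing mode cannot express
    lea eax, RecTable[edx]      ; eax = address of record 2, computed at run time
    mov byte ptr [eax], 1       ; write to the first byte of that record
    exit
end start
```

offset cannot be used here because the final address depends on the run-time value of EDX.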

LEA can also be used to do simple math

        lea     eax,[4*eax+eax]    ;loads EAX with 5*EAX
        lea     eax,[edx+ecx]      ;loads EAX with EDX + ECX
        lea     eax,[edx+ecx+4]    ;loads EAX with EDX + ECX + 4
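
two of these can be chained for slightly bigger calculations. A sketch assuming the MASM32 SDK, with the starting value 7 picked arbitrarily:

```masm
include \masm32\include\masm32rt.inc

.code
start:
    mov eax, 7
    lea eax, [eax+4*eax]        ; eax = 7*5 = 35
    lea eax, [2*eax+3]          ; eax = 35*2 + 3 = 73
    exit
end start
```

unlike add, shl or imul, lea leaves the flags untouched, which is one reason it gets used for arithmetic like this.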

The MASM Forum
