Could do with some clarification on offset, addr and lea

Started by hamper, February 04, 2013, 09:54:00 AM

hamper

Wow, this assembly language programming is good stuff! Really getting into it now. It seems that absolutely everything is just either a 0 or a 1. How much simpler could it be than that! So all I need to work on now is making sure that all the 0's and 1's are arranged in the correct sequence. But to get to the point...

I could do with a bit of guidance on the use of offset, addr and lea under the flat memory model. As I currently understand it (so please correct me):

The offset operator returns (evaluates to) the address (i.e. offset) of its operand, which is an expression that can be resolved at assemble/link time to a constant memory location, such as a global variable. It can therefore be used as a direct memory operand.

The addr operator seems to do the same job in every respect as offset, so what's the difference?

The processor instruction lea loads an address into a register, so it can only be resolved at run time?

If I have declared a global variable called myvar, then can I use any of these instructions with exactly the same effect during execution?:

      lea eax, myvar
      mov eax, offset myvar
      mov eax, addr myvar

I'm assuming that offset and addr are more efficient because they will resolve to constant values during assembly, so will not eat up ticks and bytes during execution as lea would. But is there more to it than that? Are there any simple guidelines as to which is best used for what, and in what circumstances?

Thanks in anticipation...

jj2007

      lea eax, myvar
      mov eax, offset myvar
Both do the same for global variables, but lea is one byte longer.
You must use lea for local variables.

      mov eax, addr myvar  ; syntax error

Addr is only valid as an argument to the invoke directive.
For global variables,
      invoke whatever, addr myvar
      invoke whatever, offset myvar
are synonyms, i.e. they produce the same encoding.
Local variables cannot be addressed by offset myvar (as you probably guessed already).
It is therefore a good idea to always use
      invoke whatever, offset myvar   ; for global variables
      invoke whatever, addr myvar   ; for local variables
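
As a minimal sketch of the points above, assuming a standard MASM32 install (the names mylocal and Demo are made up for illustration; the encodings in the comments are the usual 32-bit forms):

```asm
include \masm32\include\masm32rt.inc

.data
myvar dd 0                      ; global variable, address fixed at link time

.code
start:
    mov eax, offset myvar       ; B8 imm32     (5 bytes): address is an immediate constant
    lea edx, myvar              ; 8D 15 disp32 (6 bytes): same address, one byte longer
    call Demo
    exit

Demo proc
    LOCAL mylocal:DWORD         ; hypothetical local, lives at [ebp-4]
    lea eax, mylocal            ; assembles as lea eax, [ebp-4]
    ; mov eax, offset mylocal   ; rejected: a stack local has no constant address
    ret
Demo endp

end start
```

After the two loads in start, eax and edx hold the same value, which is easy to confirm by stepping through in Olly.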

You learn fast :t
You could learn even more if you decide to look at all these examples through the eyes of Olly. Easy to learn:
int 3 - insert where you want to make a pause
F7 - stepwise execution
F8 - stepwise but do not dive into procs
F9 - go until the program ends, or you hit an int 3 instruction.

And check my tips & traps - very dense info but it should be ok for you.

dedndave

ADDR and LEA are closely related
they are primarily used for getting the addresses of LOCAL variables
LEA can be used in other ways - it can even perform simple math for you

MyFunc  PROC    SomeParm:DWORD

    LOCAL   Local_1 :DWORD

    INVOKE  Something,addr Local_1

    ret

MyFunc  ENDP

the assembler cannot know the address of this local variable at assembly-time
instead, it will generate code that looks like this...

    lea     eax,Local_1
    push    eax
    call    Something


to be more accurate....

    push    ebp
    mov     ebp,esp
    sub     esp,4                ;make room for Local_1 at [EBP-4]

    lea     eax,[ebp-4]
    push    eax
    call    Something

    leave                        ;same as mov esp,ebp then pop ebp
    ret     4                    ;discard the stack parameter


LEA calculates the address of Local_1 by taking the contents of EBP, and subtracting 4

hamper

Thanks guys, it's getting a little bit less foggy. But jj2007, you say "You must use lea for local variables" (which makes sense to me, because as I understand it local variables have to be declared (using local) on entry, are created onto the stack, then destroyed on exit, so can't be resolved to constant offsets at assembly time). BUT a bit further down you then give an example:

      invoke whatever, addr myvar      ; for local variables

? or am I just misunderstanding what you meant ?

So addr and offset are merely different names for exactly the same operation? I can use whichever one I want and they will always do the same job under all circumstances? I have to say though that I suspect that there must be some kind of a difference between them somewhere along the line, otherwise why re-invent the wheel? Neither are processor instructions, so both are M$ assembler directives, but why the two for the same purpose?

Sorry to be such a nit-picker, but I'm still a bit unclear on addr vs. offset.


jj2007

Quote from: hamper on February 04, 2013, 10:36:12 AM
So addr and offset are merely different names for exactly the same operation?

Yes and no:
Yes for global variables.
No for local ones: Offset is not valid syntax, and invoke abc, addr MyLocalWhatever produces
lea eax, MyLocalWhatever
push eax
under the hood.

Example:

include \masm32\include\masm32rt.inc

.code
start:   call MyTest
   exit

MyTest proc
LOCAL dw1, dw2, rc:RECT, buffer[100]:BYTE
  int 3  ; Olly will stop here after pressing F9
  mov dw2, 1234
  invoke dwtoa, dw2, addr buffer
  MsgBox 0, addr buffer, "Hi", MB_OK
  ret
MyTest endp
end start

**** What you see in Olly: ****

CPU Disasm
Address        Hex dump                   Command                               Comments
00401012       │.  CC                     int3
00401013       │.  C745 F8 D2040000       mov dword ptr [ebp-8], 4D2  <<<<<<<< mov dw2, 1234
0040101A       │.  8D45 84                lea eax, [ebp-7C]  <<<<<<< addr buffer
0040101D       │.  50                     push eax                              ; ┌Arg2 => offset LOCAL.31
0040101E       │.  FF75 F8                push dword ptr [ebp-8]                ; │Arg1 => 4D2
00401021       │.  E8 1A000000            call 00401040  <<<<<< dwordtoascii aka dwtoa                       ; └NewWin32.00401040

dedndave

you had it very close in your original post
OFFSET is used when the address is known at assembly-time
ADDR or LEA is used when the address is not known until run-time

what many newbies find confusing, at first, is that they MAY be interchangeable - lol
but - only if the variable is global

    .DATA

dwSomeStuff dd 12345678h

    .CODE

    INVOKE  SomeProc,addr dwSomeStuff
    INVOKE  SomeProc,offset dwSomeStuff


to simplify it, let's just say it this way.....

the assembler knows the address of dwSomeStuff at assembly-time
so - it thinks you meant OFFSET
but - allows the use of ADDR

when the variable is LOCAL, however, ADDR must be used
the assembler isn't as smart   :P

hamper

Ok guys, thanks for all the info. I think I've got it now, but I'm going to have to sleep on it and have a fresh look tomorrow at everything you've both said. No doubt it'll be obvious with a fresh brain, and I'll wonder what all the fuss was about.

Many thanks

dedndave

now that you have that down, let's use them without invoke   :biggrin:

sometimes, you may want to load the address of a globally allocated buffer into a register
you would use
        mov     edx,offset MyBuffer
very straightforward
the assembler knows the address of MyBuffer at assembly time
it generates code that actually loads that address as an "immediate" constant
        mov     edx,402000h

the assembler can also do some simple math for you, so long as it resolves to a constant at assembly time
these are valid
        mov     edx,offset MyBuffer+30
        mov     edx,offset MyBuffer+4*30


however, if you want to calculate an offset into the buffer (or an array) in a register.....
        mov     edx,SomeNumber
        shl     edx,4

there are a few ways to handle this
you can multiply a register by 2, 4, or 8, but not larger powers of 2, and use it as an index
in the code above, we multiplied by 16
        mov     edx,SomeNumber
        shl     edx,4
        mov     eax,MyBuffer[edx]  ;loads the content at that address into EAX

        mov     edx,SomeNumber
        shl     edx,4
        lea     eax,MyBuffer[edx]  ;the CPU calculates the address at run time and loads it into EAX

for smaller powers of 2
        mov     edx,SomeNumber
        mov     eax,MyBuffer[4*edx]

        mov     edx,SomeNumber
        lea     eax,MyBuffer[4*edx]


so....
if the address needs to be calculated from a value in a register, use LEA
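
a small complete program may make that rule easier to try out. this is only a sketch, assuming a standard MASM32 install built as a console app; MyTable and its contents are hypothetical names chosen for the example:

```asm
include \masm32\include\masm32rt.inc

.data
MyTable dd 10, 20, 30, 40       ; hypothetical table of four DWORDs

.code
start:
    mov edx, 2                  ; element index, known only at run time
    lea eax, MyTable[4*edx]     ; eax = address of MyTable + 8
    mov eax, [eax]              ; load the DWORD stored there
    print str$(eax), 13, 10     ; prints 30
    exit
end start
```

the scale factor of 4 matches the DWORD element size, so the index in EDX counts elements rather than bytes.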

LEA can also be used to do simple math
        lea     eax,[4*eax+eax]    ;loads EAX with 5*EAX
        lea     eax,[edx+ecx]      ;loads EAX with EDX + ECX
        lea     eax,[edx+ecx+4]    ;loads EAX with EDX + ECX + 4
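
as a rough illustration of chaining these, here is a sketch that multiplies by 10 with two LEA instructions (assuming a standard MASM32 console build; the starting value 7 is arbitrary). one side benefit is that LEA leaves the flags untouched, unlike ADD, SHL or IMUL:

```asm
include \masm32\include\masm32rt.inc

.code
start:
    mov eax, 7                  ; arbitrary starting value
    lea eax, [eax+4*eax]        ; eax = 5*7 = 35, flags not modified
    lea eax, [eax+eax]          ; eax = 2*35 = 70
    print str$(eax), 13, 10     ; prints 70
    exit
end start
```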