Help understanding stack frame

Started by jayanthd, February 21, 2013, 03:27:39 AM


jayanthd

I have some C/C++ code like the one below.


//Function prototype
int _sum(int _op1, int _op2);

//Main Function
int main() {

int op1, op2, sum;

op1 = 25;
op2 = 75;

//Calling function
_sum(op1, op2);

return (0);

}

//Function Definition
//Called function
int _sum(int _op1, int _op2) {

int result;

result = _op1 + _op2;
return result;

}


When the calling function is executed, first the value 75 is pushed onto the stack and then the value 25. Then the return address is pushed onto the stack. The return address will be the address of the next instruction after the call. Right? How is the return address calculated?

Then ebp, esi, edi are pushed on the stack and ebp is set to esp. So, ebp and esp will be pointing to the top of the stack which contains edi.

Then, when the called function is executed, a local variable result is created on the stack and the stack pointer will point to the result variable.

Then the values of _op1 and _op2 on the stack are referenced, and the value of result is computed and stored in the result variable on the stack.

How is the result returned to the calling function?
Is _sum(op1, op2) the calling function or is it main() the calling function?
Is the function definition of _sum() the called function or is it _sum() in the main() the called function?

See the asm code below and complete the process after executing the _sum() function


push 75
push 25
push return address
push ebp
mov ebp, esp
push esi
push edi
push result
mov ax, 25
add ax, 75
mov [result], ax
.
.
.
pop edi
pop esi
mov esp, ebp
pop edi
pop esi
ret


Where actually does the stack frame get created? Is it when mov ebp, esp is executed?
The old value of esp is stored in ebp, and then ebp is used to reference the variables on the stack. ebp never changes, but esp changes during stack operations. Finally, when returning from the function, esp is assigned its old value, which is in ebp. Right?

In the asm code, can you show how the value of result is returned to the main function?

Can mov ebp, esp be coded after pushing esi and edi onto the stack?

Greenhorn

Hi jayanthd,

this is the wrong subforum for your question ...  ;)

However, the return value is stored in (r/e)ax.
The "result" variable is not necessary in this case.


Cheers
Greenhorn

jayanthd

Quote from: Greenhorn on February 21, 2013, 04:37:12 AM
Hi jayanthd,

this is the wrong subforum for your question ...  ;)

However, the return value is stored in (r/e)ax.
The "result" variable is not necessary in this case.


Cheers
Greenhorn

Thanks for replying Greenhorn.  :biggrin:

Why can't mov ebp, esp be put after push esi and push edi?
How is the value in the result variable returned to the sum variable in the main()?
How is the return address calculated after pushing the function arguments on the stack?

dedndave

Quote from: jayanthd on February 21, 2013, 03:27:39 AM
When calling function is executed first the value 75 is pushed to the stack and then value 25 is pushed to the stack. Then return address is pushed on to the stack. Return address will be the address of the next instruction after the calling function. Right? How does the return address calculated?
that's a pretty good description
the CALL instruction calculates the return address and pushes it onto the stack before branching to the routine
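
A rough C sketch of that step, offered as an illustration and not as anything posted in this thread: a toy "machine" with a made-up Machine struct models what CALL and RET do to EIP and the stack. The addresses are invented; the length of 5 is the size of a near CALL with a 32-bit relative target in 32-bit code.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_DWORDS 16

/* Hypothetical toy machine: just enough state to model CALL and RET. */
typedef struct {
    uint32_t eip;                  /* address of the instruction being executed */
    uint32_t esp;                  /* index into stack[], counted in dwords */
    uint32_t stack[STACK_DWORDS];  /* the stack grows toward lower indexes */
} Machine;

void push32(Machine *m, uint32_t value) {
    assert(m->esp > 0);            /* guard against overflowing the toy stack */
    m->stack[--m->esp] = value;
}

uint32_t pop32(Machine *m) {
    assert(m->esp < STACK_DWORDS); /* guard against popping an empty stack */
    return m->stack[m->esp++];
}

/* CALL: the return address is the address of the CALL plus its own encoded
   length, i.e. the address of the instruction that follows it. */
void do_call(Machine *m, uint32_t target, uint32_t call_length) {
    push32(m, m->eip + call_length);
    m->eip = target;
}

/* RET: pop the saved address back into EIP. */
void do_ret(Machine *m) {
    m->eip = pop32(m);
}
```

With EIP at 00401000h, do_call(&m, 0x00401100, 5) leaves 00401005h on top of the stack, and do_ret() resumes there.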

Quote from: jayanthd on February 21, 2013, 03:27:39 AM
How is the result returned to the calling function?
in most high-level compilers, the result is returned in EAX
in assembly language, we may also use ECX and/or EDX to return values, as they need not be preserved
if more space is required, the address of a structure is generally passed and the routine fills it with values
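
Seen from the C side, a minimal sketch of both cases might look like this. The names sum_ints, call_sum, SumDiff and sum_and_diff are made up for illustration (the thread's _sum is renamed because identifiers starting with an underscore are reserved at file scope in C). Note that the original main() calls _sum(op1, op2); without assigning the result, so the value that comes back in EAX is simply discarded there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the thread's _sum(). */
int sum_ints(int op1, int op2) {
    int result = op1 + op2;
    return result;  /* with the usual 32-bit x86 conventions this ends up in EAX */
}

/* The caller has to store the returned value itself. */
int call_sum(void) {
    int op1 = 25, op2 = 75;
    int sum = sum_ints(op1, op2);  /* roughly: call, then mov [sum],eax */
    return sum;
}

/* When one register is not enough, the caller passes the address of a
   structure and the routine fills it in. */
typedef struct {
    int sum;
    int difference;
} SumDiff;

void sum_and_diff(int op1, int op2, SumDiff *out) {
    assert(out != NULL);
    out->sum = op1 + op2;
    out->difference = op1 - op2;
}
```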

Quote from: jayanthd on February 21, 2013, 03:27:39 AM
Is _sum(op1, op2) the calling function or is it main() the calling function?
Is the function definition of _sum() the called function or is it _sum() in the main() the called function?
i would say main is the calling function, _sum(op1, op2) is the actual call
the called function is defined here
//Function Definition
//Called function
int _sum(int _op1, int _op2) {

int result;

result = _op1 + _op2;
return result;

}


Quote from: jayanthd on February 21, 2013, 03:27:39 AM
Where actually the stack frame gets created. Is it when mov ebp, esp is executed?
old value of esp is stored in ebp and then ebp is used to reference the variables on the stack but ebp never changes but esp changes during stack operation. Finally when returning from the function esp is assigned its old value which is in ebp. Right?
that's pretty close
Quote
old value of esp is stored in ebp
not exactly worded right
the current value of ESP is copied into EBP
Quote
ebp is used to reference the variables on the stack but ebp never changes but esp changes during stack operation
very good   :t
many beginners have a hard time with that one
Quote
when returning from the function esp is assigned its old value which is in ebp
correct, this is often done with a LEAVE instruction, which is essentially the same as
    mov     esp,ebp
    pop     ebp
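
To see why that pair puts ESP back no matter what happened in between, here is a small C model. It is only a sketch under assumptions: the Frame struct and helper names are invented, and ESP and EBP are dword indexes into a toy array, not real addresses.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_DWORDS 64

typedef struct {
    uint32_t esp, ebp;             /* both are indexes into stack[], in dwords */
    uint32_t stack[STACK_DWORDS];  /* grows toward lower indexes */
} Frame;

void push32(Frame *f, uint32_t v) {
    assert(f->esp > 0);
    f->stack[--f->esp] = v;
}

uint32_t pop32(Frame *f) {
    assert(f->esp < STACK_DWORDS);
    return f->stack[f->esp++];
}

/* push ebp / mov ebp,esp / sub esp,N   (N is given in dwords here) */
void prologue(Frame *f, uint32_t local_dwords) {
    push32(f, f->ebp);
    f->ebp = f->esp;
    assert(f->esp >= local_dwords);
    f->esp -= local_dwords;
}

/* LEAVE  ==  mov esp,ebp / pop ebp */
void leave(Frame *f) {
    f->esp = f->ebp;
    f->ebp = pop32(f);
}

/* Returns ESP after a prologue/LEAVE pair with the given amount of locals. */
uint32_t esp_after_round_trip(uint32_t start_esp, uint32_t local_dwords) {
    assert(start_esp <= STACK_DWORDS);
    Frame f = { .esp = start_esp, .ebp = 0 };
    prologue(&f, local_dwords);
    push32(&f, 123);   /* an extra push, deliberately never popped */
    leave(&f);
    return f.esp;
}
```

Whatever local_dwords is, and despite the stray push, esp_after_round_trip() returns the ESP it started with, because LEAVE reloads ESP from EBP instead of counting pops.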


Quote from: jayanthd on February 21, 2013, 03:27:39 AM
In the asm code show how value of result is returned to main function?
again, return values are passed in EAX

Quote from: jayanthd on February 21, 2013, 03:27:39 AM
Can mov ebp, esp coded after pushing esi and edi onto the stack?
yes - i sometimes write my own stackframe code so i can do it that way

here is how equivalent code might look in assembler....

function prototype, typically near beginning of source
_sum    PROTO   :DWORD,:DWORD
we might like to type them as INT's, but INT is a reserved word in ASM - an instruction for INTerrupt
so - we just type them as DWORD's - assembly does not use strong typing like C

function definition
_sum    PROC    op1:DWORD,op2:DWORD

    mov     eax,op2
    add     eax,op1
    ret

_sum    ENDP


calling the function
    INVOKE  _sum,op1,op2
in this case, you are calling with constants, so...
    INVOKE  _sum,75,25

the actual code generated by the assembler looks like this
_sum    PROC    op1:DWORD,op2:DWORD

    push    ebp
    mov     ebp,esp
    mov     eax,[ebp+12]
    add     eax,[ebp+8]
    leave
    ret     8

_sum    ENDP


and, for the INVOKE...
    push    25
    push    75
    call    _sum

KeepingRealBusy

Quote from: dedndave on February 21, 2013, 05:36:47 AM
.
.
.
_sum    PROTO   :DWORD,:DWORD
we might like to type them as INT's, but INT is a reserved word in ASM - an instruction for INTerrupt
so - we just type them as DWORD's - assembly does not use strong typing like C

function definition
_sum    PROC    op1:DWORD,op2:DWORD

    mov     eax,op2
    add     eax,op1
    ret

_sum    ENDP


calling the function
    INVOKE  _sum,op1,op2
in this case, you are calling with constants, so...
    INVOKE  _sum,75,25


Actually, the assembler DOES support strong typing, at least for function parameters. if you define


PDWORD      TYPEDEF         PTR DWORD


and


_sum    PROTO   :PDWORD,:PDWORD


then you must call as


    INVOKE  _sum,ADDR op1,ADDR op2


You will get an error message if you skip the ADDR operator as in


    INVOKE  _sum,op1,op2


You do not need to endlessly use DWORDs as the only PROTO definers, whether or not you are passing values or pointers to values. The assembler will be checking on you.

Dave.

dedndave

that may be so for PTR's
but, you can prototype with DWORD's, then use UINT's on the PROC line
if i am not mistaken, the assembler only checks it for size

RuiLoureiro

Dave,
        Try to follow this. What is the answer?

ProcA       proc    x:DWORD,...
            push    ebp
            mov     ebp, esp        ; <-  suppose ESP=EBP = 12345678

            ; ...................................
            ; here we write a lot of correct code
            ;     If we push we pop also
            ; all procs we call exit correctly
            ; ...................................

            ; ....................
            ; Here we want to exit  -> question: what the value in ESP ?
            ; ....................

ProcA       endp

dedndave

hopefully, it will be 12345678   :P

but, what if you want to put a bunch of locals on the stack without keeping track of how big they are ?
then, the MOV ESP,EBP (or LEAVE) balances the stack for you automatically

Gunther

Hi RuiLoureiro,

Quote from: RuiLoureiro on February 21, 2013, 07:16:10 AM
ProcA       proc    x:DWORD,...
            push    ebp
            mov     ebp, esp        ; <-  suppose ESP=EBP = 12345678

            ; ...................................
            ; here we write a lot of correct code
            ;     If we push we pop also
            ; all procs we call exit correctly
            ; ...................................

            ; ....................
            ; Here we want to exit  -> question: what the value in ESP ?
            ; ....................

ProcA       endp

But the current value of ESP isn't interesting, because we're addressing via EBP. ESP is changing by every PUSH or POP or function call etc.

Gunther

MichaelW

Quote from: jayanthd on February 21, 2013, 03:27:39 AM
How is the result returned to the calling function?

See Agner Fog's calling_conventions.pdf available here.

dedndave

Quote from: Gunther on February 21, 2013, 08:14:51 AM
But the current value of ESP isn't interesting, because we're addressing via EBP. ESP is changing by every PUSH or POP or function call etc.

Gunther

it will be interesting for the next instruction where, presumably, they POP EBP and RET   :biggrin:

RuiLoureiro

Quote from: dedndave on February 21, 2013, 07:36:53 AM
hopefully, it will be 12345678   :P

but, what if you want to put a bunch of locals on the stack without keeping track of how big they are ?
then, the MOV ESP,EBP (or LEAVE) balances the stack for you automatically
That's right. Everything OK, Dave!  ;)

Gunther:  Dave gave the answer for me  :t

jayanthd

Thanks everybody. It was helpful.

@Dave

Quote
and, for the INVOKE...
   
    push    25
    push    75
    call    _sum


Why is op1 pushed first and op2 pushed next? I read somewhere that before calling a function, the arguments to the function are pushed in reverse order.

I have another question.

In different addressing modes we use instructions like below to load some value stored at some address into a register.


var1 dd ?

mov eax, [var1]
mov eax, offset var1
mov eax, [aabbccdd]

or

mov ebx, aabbccdd
mov eax, ds:[ebx]                   ;value from some address
mov eax, ds:[ebx + 2]             ;value from some effective address

or

mov si, aabbccdd
mov eax, ds:[si]

etc...




Actually the [address] points to the actual data at that address.

What is the difference between the above instructions for getting data and the instructions below?



mov ax, byte ptr ds:[var1]                  ;here var1 is a byte
mov eax, word ptr ds:[aabb]
mov eax, dword ptr ds:[aabbccdd]
mov eax, dword ptr ds:[ebx]              ;ebx is set to aabbccdd earlier



Where is the above code used?



dedndave

Quote from: jayanthd on February 21, 2013, 04:11:28 PM
and, for the INVOKE...
   
    push    25
    push    75
    call    _sum

Why op1 is pushed first and op2 is pushed next? I read somewhere that before calling a function the arguments to the function are pushed in reverse order.
my mistake - i simply swapped 25 and 75 by accident
the last parameter listed is pushed first
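
A C model of the whole round trip for _sum(25, 75) may help picture the order. It is a sketch only: the Cpu struct, the helper names and the return address 00401005h are invented, and ESP/EBP are byte offsets into a toy array so that the +8 and +12 offsets match the expanded _sum code earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64u

typedef struct {
    uint32_t esp, ebp, eax;      /* ESP and EBP are byte offsets into mem[] */
    uint8_t  mem[STACK_BYTES];   /* toy stack, grows toward offset 0 */
} Cpu;

uint32_t load32(const Cpu *c, uint32_t addr) {
    uint32_t v;
    assert(addr + 4 <= STACK_BYTES);
    memcpy(&v, &c->mem[addr], 4);
    return v;
}

void push32(Cpu *c, uint32_t v) {
    assert(c->esp >= 4);
    c->esp -= 4;
    memcpy(&c->mem[c->esp], &v, 4);
}

uint32_t pop32(Cpu *c) {
    uint32_t v = load32(c, c->esp);
    c->esp += 4;
    return v;
}

/* Body of _sum as expanded by the assembler (callee removes the arguments). */
void sum_body(Cpu *c) {
    push32(c, c->ebp);                 /* push ebp                 */
    c->ebp = c->esp;                   /* mov  ebp,esp             */
    c->eax  = load32(c, c->ebp + 12);  /* mov  eax,[ebp+12]  (op2) */
    c->eax += load32(c, c->ebp + 8);   /* add  eax,[ebp+8]   (op1) */
    c->esp = c->ebp;                   /* leave: mov esp,ebp ...   */
    c->ebp = pop32(c);                 /* ... pop ebp              */
    (void)pop32(c);                    /* ret 8: pop return address */
    c->esp += 8;                       /* ... then drop both arguments */
}

/* Caller side of _sum(op1, op2): the last argument is pushed first. */
uint32_t call_sum(Cpu *c, uint32_t op1, uint32_t op2) {
    push32(c, op2);                    /* push op2 */
    push32(c, op1);                    /* push op1 */
    push32(c, 0x00401005u);            /* CALL pushes a (made-up) return address */
    sum_body(c);
    return c->eax;
}
```

Because op2 goes on first, op1 ends up closest to the return address, which is why it is found at [ebp+8] and op2 at [ebp+12].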

Quote from: jayanthd on February 21, 2013, 04:11:28 PM
In different addressing modes we use instructions like below to load some value stored at some address into a register.

var1 dd ?

mov eax, [var1]
mov eax, offset var1
mov eax, [aabbccdd]

or

mov ebx, aabbccdd
mov eax, ds:[ebx]                   ;value from some address
mov eax, ds:[ebx + 2]             ;value from some effective address

or

mov si, aabbccdd
mov eax, ds:[si]

etc...




Actually the [address] points to the actual data at that address.

What is the difference between the above instructions to get data and with the below instructions?



mov ax, byte ptr ds:[var1]                  ;here var1 is a byte
mov eax, word ptr ds:[aabb]
mov eax, dword ptr ds:[aabbccdd]
mov eax, dword ptr ds:[ebx]              ;ebx is set to aabbccdd earlier



Where are the above code used?
there are a variety of addressing modes that may be used for different purposes
first, let's deal with the address issue....
var1 dd ?

    mov     eax, offset var1

the assembler creates space for the label "var1" at some address
the assembler knows what the address is at assembly-time
the "offset" operator tells the assembler to load the address of var1 into EAX, not the contents at that address
the actual code generated might look something like this
    mov     eax,00401012h   ;the address of var1 is loaded into EAX

if we were to reference a LOCAL variable this way, we would use
    LOCAL   var1    :DWORD

    lea     eax,var1

LEA stands for Load Effective Address
behind the scenes, LOCAL's are addressed by using EBP as a reference
the address isn't known at assembly-time, so the assembler can't use MOV,constant
the actual code might be something like
    lea     eax,[ebp-4]
LEA calculates the address of var1 by subtracting 4 from the address in EBP and placing that value in EAX

now, let's load the contents
i noticed you used [] brackets
with MASM, you don't need to use brackets unless you are using a register
for a variable name...
    mov     eax,var1   ;the contents at the address of var1 are loaded into EAX
for a GLOBAL variable, the actual code generated by the assembler might look something like this
    mov     eax,[00401012h]
for a LOCAL...
    mov     eax,[ebp-4]

when addressing arrays or strings, it is often convenient to access data by using a register to hold the address
    mov     edx,offset var1
    mov     eax,[edx]

this is done so that you may calculate steps in EDX to address the individual elements of an array
on the next pass of a loop, for example, we might...
    add     edx,4      ;adjust the address
    mov     eax,[edx]  ;get the next dword element
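
For readers coming from C, the same loop can be sketched with a byte pointer standing in for EDX. This is an illustration under assumptions: sum_dwords is a made-up name, and the pointer is deliberately a byte pointer so the step of 4 stays visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sum an array of dwords by holding its address in a "register" and
   stepping that address by 4 bytes per element. */
uint32_t sum_dwords(const uint32_t *array, size_t count) {
    assert(array != NULL || count == 0);
    const uint8_t *edx = (const uint8_t *)array;  /* mov edx,offset array */
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        uint32_t eax;
        memcpy(&eax, edx, sizeof eax);            /* mov eax,[edx] */
        total += eax;
        edx += 4;                                 /* add edx,4     */
    }
    return total;
}
```

In everyday C one would just write array[i]; the compiler generates the address arithmetic that the assembly spells out by hand.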


there are numerous combinations that are available
    xor     edx,edx        ;zero EDX
    mov     eax,var1[edx]  ;same as MOV EAX,[EDX+var1]


you can also use 2 registers, an index, or even a multiplier of 2, 4, or 8
    mov     eax,MyArray[4*edx+ebx+24]
that's about as complex as they allow   :P
notice that the assembler combines "MyArray" and "+24" to form a single constant
this form might be used to address a 3-dimensional array, where EDX is an X, EBX is a Y, and +24 is a Z
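
The address the CPU ends up using for such an operand is plain arithmetic, sketched here in C. The function name and the sample addresses are made up; the formula itself (displacement + base + index * scale) is how 32-bit x86 forms an effective address.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of  disp[base + index*scale]  on 32-bit x86.
   The arithmetic wraps modulo 2^32, as it does in the CPU. */
uint32_t effective_address(uint32_t disp, uint32_t base,
                           uint32_t index, uint32_t scale) {
    /* the instruction encoding only allows these four scale factors */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return disp + base + index * scale;
}
```

For example, if MyArray happened to sit at 00403000h, the assembler would fold MyArray+24 into the single displacement 00403018h; with EBX = 10h and EDX = 3, effective_address(0x00403018, 0x10, 3, 4) gives 00403034h. With every register zero and a scale of 1, the result is just the displacement, which is the var1[edx] case with EDX cleared.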

jayanthd

@Dave

Your answers cleared up most of my doubts, but I didn't get a clear picture of the things below.
Quote
there are numerous combinations that are available
    xor     edx,edx        ;zero EDX
    mov     eax,var1[edx]  ;same as MOV EAX,[EDX+var1]


Can you explain how mov eax, var1[edx] works? edx is cleared, so [edx + var1] = [var1], right? And var1[edx] = var1[0] = var1, right?

Quote
you can also use 2 registers, an index, or even a multiplier of 2, 4, or 8
    mov     eax,MyArray[4*edx+ebx+24]
that's about as complex as they allow   :P
notice that the assembler combines "MyArray" and "+24" to form a single constant
this form might be used to address a 3-dimensional array, where EDX is an X, EBX is a Y, and +24 is a Z

By using two registers, do you mean it will be based indexed addressing mode, where one base register and one index register are used to get the effective address, like
mov eax, [ebx + esi + 12] ?

can you explain more about this code     mov  eax,MyArray[4*edx+ebx+24]

It will be MyArray[some address related to some element of the array]. Right?

And I didn't get this. There are different addressing modes, OK, but what is the difference between the two instructions below?
mov eax, dword ptr ds:[aabbccdd]

mov eax, ds:[aabbccdd]