Do I understand assembly call steps?

gesho · April 18, 2020, 08:33:37 AM

Below I tried to write up process of calling subroutine (reading Irvine). If anyone gets a chance, do I get it right?
PS: I am aware of invoke, proc, enter, leave, local and other goodies. I want to make sure I understand boilerplate approach.

; when procedure is called, stack frame is created in following order
1. caller: passed argument (if any) pushed on stack in reverse order
2. caller: subroutine called, return address pushed (call does that, no code needed)
3. callee: ebp pushed (caller ebp backed up) ; i'd rather do this before step1 to keep params/ret closer aligned to ebp inside callee
4. callee: esp copied to ebp. ebp becomes stack base for callee. esp can now wander for push/pops inside callee.
5. callee: if local vars in callee, esp is decremented
6. callee: if registers saved, they are pushed on stack (individually or with pushad; pushfd saves the flags)

7. after this subroutine runs and it's time to return. before calling return need to make sure:
- registers, if backed up, are restored;
- local variables, if decremented, are incremented, manipulate esp
- pop ebp, to restore stack base for caller; this also adds 4 to esp
as a result of above, right before the ret is issued esp should be pointing to the return address
last task left is to remove the pushed parameters from the stack, which can be done in two ways:

; C-style, stack cleanup of passed parameters is handled in caller:
8. callee just has ret.
9. after the call line in caller, increment esp by the number of bytes of parameters pushed before the call.

; stdcall style, stack cleanup of passed parameters is handled in callee:
8. ret someConstant ; where someConstant is #bytes used for parameter push. ret pops the return address and then adds someConstant to esp.
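
A minimal sketch of the steps above, assuming 32-bit MASM with the flat model. The names AddTwo_c and AddTwo_std are hypothetical and only illustrate the two cleanup styles; the step numbers in the comments refer to the list above.

```asm
.386
.model flat, stdcall
option casemap:none

.code

; C-style callee (hypothetical): the caller removes the parameters
AddTwo_c:
    push ebp                ; step 3: back up caller's ebp
    mov  ebp, esp           ; step 4: ebp becomes the frame base
    sub  esp, 4             ; step 5: room for one local dword at [ebp-4]
    push ebx                ; step 6: save a register we are going to use
    mov  ebx, [ebp+8]       ; first parameter (it was pushed last)
    add  ebx, [ebp+12]      ; second parameter
    mov  [ebp-4], ebx       ; keep the sum in the local variable
    mov  eax, [ebp-4]       ; return value goes in eax
    pop  ebx                ; step 7: restore the saved register
    mov  esp, ebp           ; release the locals
    pop  ebp                ; restore caller's ebp, esp now points to the return address
    ret                     ; step 8, C style: plain ret

; stdcall-style callee (hypothetical): the callee removes the parameters
AddTwo_std:
    push ebp
    mov  ebp, esp
    mov  eax, [ebp+8]
    add  eax, [ebp+12]
    pop  ebp
    ret  8                  ; pop return address, then add 8 to esp (two dwords)

start:
    push 20                 ; step 1: arguments in reverse order
    push 10
    call AddTwo_c           ; step 2: call pushes the return address
    add  esp, 8             ; step 9: caller cleans up two dwords

    push 20
    push 10
    call AddTwo_std         ; no cleanup here, ret 8 already did it
    ret                     ; eax = 30 after either call
end start
```

Inside either callee, [ebp] holds the saved ebp, [ebp+4] the return address, and the parameters start at [ebp+8].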

mineiro · April 18, 2020, 10:34:56 PM

It sounds plausible to me.
In number 7:
- local variables, if decremented, are incremented, manipulate esp
it can be skipped, ignored, because copying ebp back into esp (mov esp, ebp, or leave) already does this. The contents of local variables are lost.

I would just double-check the order in which the parameters are pushed on the stack; not every calling convention pushes them in reverse order. C (cdecl) and stdcall push right to left, while the Pascal convention pushes left to right, so follow the documentation of the convention you are using.

A call with possible destruction of the return address follows and is not advised:
; pseudo call
push address_to_go
ret

It sounds like you are talking about 32-bit Windows. When you move to 64 bits, some function parameters are passed in registers and the others, if any, on the stack. So we can fill the registers in any order. And looking to the past, in MS-DOS the registers could be filled in any order too.

Maybe the keyword is ABI and calling convention. Good luck.

---edit----
When I said plausible, I meant that we can create our own calling convention; you are right to follow the conventions.
an example is:
lea eax, address_to_go
call eax

It is generally not used because an indirect call like this is slower, but I mention it just to show you other possibilities. Internally, in code for your own library, you can use a convention you created yourself, but when calling other people's functions you follow the rules.

--edit1--
I remembered something that might be relevant.
If a function parameter is of type byte, how do you proceed? Only one byte is needed. Will a byte, word or dword be pushed?
I say this so you pay attention to stack alignment. In 32-bit code the stack is generally kept aligned to a multiple of 4 (a default push moves esp by 4), so a byte parameter still takes a whole dword; there may be exceptions if you manipulate the stack manually.
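
A small fragment to illustrate the byte-parameter question, assuming 32-bit MASM code. bVal and TakesByte are hypothetical names; the point is only that the byte is widened to a dword before it is pushed.

```asm
.data
bVal db 7                           ; a one-byte variable (hypothetical)

.code
; caller side
    movzx eax, byte ptr [bVal]      ; widen the byte to a dword, upper bits zero
    push  eax                       ; the parameter occupies 4 bytes on the stack
    call  TakesByte

; callee side (hypothetical stdcall-style routine)
TakesByte:
    push  ebp
    mov   ebp, esp
    movzx eax, byte ptr [ebp+8]     ; only the low byte of the dword slot is used
    pop   ebp
    ret   4                         ; the callee removes one dword
```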

gesho · April 18, 2020, 11:39:30 PM

thx mineiro, I think I understand your points.

Quote from: mineiro on April 18, 2020, 10:34:56 PM
- local variables, if decremented, are incremented, manipulate esp
it can be skipped, ignored, because copying ebp back into esp (mov esp, ebp, or leave) already does this. The contents of local variables are lost.

I see. as long as esp also points to the return address before ret?

Quote
A call with possible destruction of the return address follows and is not advised:
; pseudo call
push address_to_go
ret

this is interesting. where is address_to_go coming from? saving eip before issuing call?

Quote
an example is:
lea eax, address_to_go
call eax

where would address_to_go come from in this case? in general, when writing source we don't know at which address our instructions / opcodes will be placed, do we? my understanding is that labels serve this purpose, but we can't know the address itself. but then, I've seen instructions in disassembly, they usually take 1 to 5 bytes: opcode + possible constant or memory address. I guess one could try to calculate the address of the targeted line by adding up the sizes of all instructions prior to it?

jj2007 · April 19, 2020, 12:06:53 AM

Quote from: gesho on April 18, 2020, 11:39:30 PM
this is interesting. where is address_to_go coming from? saving eip before issuing call?

call pushes the ret address on the stack

I strongly suggest Olly to understand these things. It's really easy to use.

gesho · April 19, 2020, 12:24:18 AM

Quote from: jj2007 on April 19, 2020, 12:06:53 AM
call pushes the ret address on the stack

sure, missed that option, thx jj2007

Quote
I strongly suggest Olly to understand these things. It's really easy to use.

I'm currently into WinDbg. Olly looks sort of similar?

mineiro · April 19, 2020, 04:25:37 AM

1- yes
2- when you call a function, you're calling a memory address. You said you have not played with 'proc' (procedure). A "proc" like "main", "start", ..., is a labeled memory address. It can be internal to your program or external.
A real call instruction saves the address of the next instruction after the call on the stack, adjusts the stack pointer and then jumps to a label (address, location). When a ret is executed, it jumps to the address stored on the stack and adjusts the stack pointer accordingly.
So, address_to_go is a label (or a procedure name), or better, a labeled memory address.
If you disassemble a call instruction and look at the opcodes (instruction bytes) you will see where the jump will go; for a direct near call this is stored as an offset relative to the next instruction, and the bytes appear in reversed order because x86 is little endian.
3- Yes, like that. Generally what gets calculated is the size of the "call" instruction, in bytes. Depending on the processor mode we have 2 call types, near and far. These instructions by default have a fixed size, so the address that will be stored on the stack is known (but this is done internally by the processor).
That example was meant to show you that we can create our own calling convention. For instance, we can create a structure and put the address of that structure in the eax register. The structure can be simple, like argc and argv: the first member tells the called function how many arguments are being passed, and the following members are the arguments themselves. This way we don't need to do all those pushes. Just an example.
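
A minimal sketch of the "argument block in eax" idea, assuming 32-bit MASM with the flat model. This is a home-made convention for illustration only, not a standard one; args, SumBlock and the block layout (a count followed by that many dwords) are hypothetical.

```asm
.386
.model flat, stdcall
option casemap:none

.data
args dd 3                       ; first member: how many arguments follow
     dd 10, 20, 30              ; the arguments themselves

.code

; in:  eax = address of the argument block
; out: eax = sum of the arguments
SumBlock:
    push esi                    ; preserve esi for the caller
    mov  esi, eax               ; esi = address of the block
    mov  ecx, [esi]             ; ecx = argument count
    xor  eax, eax               ; running sum
    jecxz sum_done              ; nothing to add if the count is 0
sum_loop:
    add  eax, [esi+ecx*4]       ; argument number ecx sits at offset ecx*4
    dec  ecx
    jnz  sum_loop
sum_done:
    pop  esi
    ret                         ; nothing was pushed, so nothing to clean up

start:
    mov  eax, OFFSET args       ; the label gives us the address of the block
    call SumBlock               ; eax = 60
    ret
end start
```

The assembler and linker resolve the labels args and SumBlock to addresses, so the source never needs to contain a numeric address.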

jj2007 · April 19, 2020, 05:48:13 AM

Quote from: gesho on April 19, 2020, 12:24:18 AM
I'm currently into WinDbg. Olly looks sort of similar?

gesho · April 19, 2020, 08:37:45 AM

mineiro: I think I understand what you mean. I believe I've seen those op-codes and destination memory addresses, which were in little endian.

jj2007 looks like WinDbg, I'll take a closer look.

Thanks folks, looks like I am good for now.

jj2007 · April 19, 2020, 10:37:17 AM

Quote from: gesho on April 19, 2020, 08:37:45 AM
jj2007 looks like WinDbg, I'll take a closer look.

In practice you need only
F7 one step please
F8 one step but don't dive into calls
F9 run until you hit an int 3

The MASM Forum
