Help understanding stack frame

Started by jayanthd, February 21, 2013, 03:27:39 AM


dedndave

now - back to forward and backward references
it is a matter of context
the direction vector only has meaning when the branch is relative - far branches are never relative

so - i said that far branches have no forward or backward - that isn't strictly correct
what i might have said is - for far branches, forward or backward direction has no significance

also - far branches cannot target addresses that are in a register - at least, not on x86
the address is in the code stream or in a memory location

mov ax, @data
mov ds, ax

mov ax, seg ds

"@data" is an assembler short-hand for "the data segment", usually _DATA is the real name
you cannot load a segment register from an immediate operand, so they put it in a general register, first
that last instruction makes no sense   :P

jmp far byte ptr FarTarget
no - in 16-bit code, far addresses consist of a segment (word) and an offset (word) - so "byte" is not good
intel usually stores the segment at the higher address (little-endian)

lpFarTarg  dw offset,segment
        jmp dword ptr lpFarTarg
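
A small sketch of how that dword sits in memory, written in Python rather than MASM so the bytes can be printed directly. The helper names and the segment/offset values are made up for illustration; the only thing taken from the post is the layout: offset word at the lower address, segment word at the higher address, each word little-endian.

```python
import struct

def far_pointer_bytes(segment, offset):
    """Return the 4 bytes of a 16:16 far pointer as stored in memory."""
    # "<HH" = two little-endian 16-bit words: offset first, segment second
    return struct.pack("<HH", offset, segment)

def read_far_pointer(data):
    """Inverse of far_pointer_bytes: return (segment, offset)."""
    offset, segment = struct.unpack("<HH", data)
    return segment, offset

# hypothetical target 5678:1234
raw = far_pointer_bytes(0x5678, 0x1234)
print(" ".join(f"{b:02X}" for b in raw))   # 34 12 78 56
```

This is why "dw offset,segment" lists the offset first: jmp dword ptr reads the first word into IP and the second into CS.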


Quote
Offset and segment should be initialized. Right? But how to know the address of segment and offset of the far destination? By subtracting far address from address of next instruction after jump instruction?

yes - they must be initialized, either by code, or by defining them as initialized variables
as for the segment, the operating system will adjust the segment as a relocatable when it loads the EXE
the offset can be a label
it might look like this
pFarTarget dw FarLabel,FAR_CODE
where the target is...
FAR_CODE SEGMENT

FarLabel:

FAR_CODE ENDS


pardon my "lp" - i am used to win32 code   :P

subtracting the address of the next instruction only applies to relative (near) branches
so - that part is wrong

jayanthd

Quote
the direction vector only has meaning when the branch is relative
What do you mean by relative? Does it mean the jump label is within a 64 KB offset in the same segment?

mov ax, seg ds makes sense to me. It loads ax with data segment address (not offset address) in emu8086.
Quote
lpFarTarg  dw offset,segment
        jmp dword ptr lpFarTarg

Where are the values for offset and segment defined? How can variable names be used as the contents of dw?

Should it be

lpFarTarg  dw aabbccddh,ddccbbaah
        jmp dword ptr lpFarTarg



In jmp far ptr FarTarget

is FarTarget a variable name or label? I am asking because FarTarget should contain the CS:IP address to where the code has to jump

Quote
pFarTarget dw FarLabel,FAR_CODE

I can't understand this... FAR_CODE is the label and it has some offset address in the code. But where is segment address? In the FarLabel?

Why is it written like pFarTarget dw FarLabel,FAR_CODE and not like
pFarTarget dw FarLabel:FAR_CODE

MichaelW

In this context "relative" means relative to the current value of the instruction pointer. A relative address is encoded as a "displacement", which is effectively a signed value that is added to the instruction pointer to set it to the destination address. Displacements can be SHORT or NEAR, with a SHORT displacement encoded as a BYTE, and a NEAR displacement encoded as a WORD for 16-bit code or as a DWORD for 32-bit code. MASM by default uses the shortest displacement encoding possible, but as shown below it can be forced to a larger encoding.

And in case this is not clear in the listing, the first byte of the encoded instructions is the opcode, and for the call and jump instructions the instruction operand is the displacement.

;==============================================================================
include \masm32\include\masm32rt.inc
;==============================================================================
.data
.code
;==============================================================================
  L0:
    ret
  L1:
    jmp L2
  start:
    call L0
    jmp L1
  L2:
    jmp L3
  L3:
    jmp NEAR PTR L4
    nop
    nop
  L4:
    inkey
    exit
;==============================================================================
end start

00000000   L0:
00000000     C3             ret
00000001   L1:
00000001     EB 07          jmp L2            ; displacement = +7
00000003   start:
00000003     E8 FFFFFFF8    call L0           ; displacement = -8
00000008     EB F7          jmp L1            ; displacement = -9
0000000A   L2:
0000000A     EB 00          jmp L3            ; displacement = 0
0000000C   L3:
0000000C     E9 00000002    jmp NEAR PTR L4   ; displacement = +2
00000011     90             nop
00000012     90             nop
00000013   L4:
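
The displacements in the listing above can be checked by hand: displacement = target address - address of the next instruction. Below is a minimal Python sketch of that arithmetic (not MASM). It assumes 32-bit code and only the three opcodes that appear in the listing (EB, E9, E8). The function names are hypothetical. Note that the listing prints a DWORD operand as a value (00000002), while the bytes in memory are little-endian (02 00 00 00).

```python
import struct

def encode_jmp(addr, target, force_near=False):
    """Encode a relative JMP located at addr (32-bit code)."""
    short_disp = target - (addr + 2)          # a SHORT jmp is 2 bytes long
    if not force_near and -128 <= short_disp <= 127:
        return bytes([0xEB]) + struct.pack("<b", short_disp)
    near_disp = target - (addr + 5)           # a NEAR jmp is 5 bytes long
    return bytes([0xE9]) + struct.pack("<i", near_disp)

def encode_call(addr, target):
    """Encode a relative near CALL located at addr (32-bit code)."""
    disp = target - (addr + 5)                # E8 + 4-byte displacement
    return bytes([0xE8]) + struct.pack("<i", disp)

def show(data):
    return " ".join(f"{b:02X}" for b in data)

# label addresses copied from the listing
L0, L1, L2, L3, L4 = 0x00, 0x01, 0x0A, 0x0C, 0x13

print(show(encode_jmp(0x01, L2)))                   # EB 07
print(show(encode_call(0x03, L0)))                  # E8 F8 FF FF FF
print(show(encode_jmp(0x08, L1)))                   # EB F7
print(show(encode_jmp(0x0A, L3)))                   # EB 00
print(show(encode_jmp(0x0C, L4, force_near=True)))  # E9 02 00 00 00
```

The force_near flag plays the role of NEAR PTR in the source: without it the sketch picks the SHORT form whenever the displacement fits in a signed byte, which is the default behaviour described for MASM above.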


Well Microsoft, here's another nice mess you've gotten us into.

jayanthd

That was somewhat clear.

Quote
the first byte of the encoded instructions is the opcode

What is an encoded instruction? Are you saying that for a 16-bit (2-byte) instruction, the first byte (upper byte or left byte) will be the opcode and the next byte will be the operand? "Encoded instruction" means machine code. Right?

The below code in emu8086 loads the segment value and offset value of a variable to cx and dx registers.

data segment
var1 dw 2030h, 4050h
ends

stack segment
    dw   128  dup(0)
ends

code segment
start:

    ; add your code here
    mov ax, data
    mov ds, ax
   
    mov bx, ds
    mov cx, seg var1
    mov dx, offset var1
   
    mov ax, 4c00h
    int 21h

ends

end start

dedndave

different instructions require different numbers of bytes
16-bit code refers to code that runs on 16-bit processors, in the case of intel, 8086/8088/80186/80188
it does not mean that each instruction is 16 bits

INC AX is a single byte
JMP 8000:0000 is 5 bytes
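
To make the two lengths concrete, here is a short Python sketch (not MASM) that builds the bytes for those two instructions. The function names are made up for illustration. The encodings assumed are the standard 16-bit ones: 40h for INC AX, and EAh followed by the offset word and then the segment word for a direct far JMP.

```python
import struct

def encode_inc_ax():
    """INC AX: the opcode byte is the whole instruction."""
    return bytes([0x40])

def encode_jmp_far(segment, offset):
    """Direct far JMP: opcode EA, then offset word, then segment word."""
    return bytes([0xEA]) + struct.pack("<HH", offset, segment)

for name, code in [("INC AX", encode_inc_ax()),
                   ("JMP 8000:0000", encode_jmp_far(0x8000, 0x0000))]:
    hex_bytes = " ".join(f"{b:02X}" for b in code)
    print(f"{name:<14} {len(code)} byte(s): {hex_bytes}")
# INC AX         1 byte(s): 40
# JMP 8000:0000  5 byte(s): EA 00 00 00 80
```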

when the processor interprets instructions, one of the things it must do is determine the number of bytes

the term "opcode" is thrown around a bit ambiguously
because part of an instruction might be the opcode and part might be an immediate operand
we often refer to the whole thing as an opcode - lol
i can see where that might be a little confusing

as for the stretch of code....
sure - you can load the segment and offset into registers
but, intel processors do not provide instructions that look like this
        jmp     cx:dx
        call    cx:dx


if i wanted to branch to a far address from values in register...
        push    cx    ;push the segment
        push    dx    ;push the offset
        retf          ;far return
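
Why the segment is pushed first: RETF pops the offset (IP) from the top of the stack and then the segment (CS), so the offset has to be pushed last. A toy Python model of those three instructions is below. It uses a plain list as the stack and ignores SP and memory entirely; the function name and the sample values are hypothetical.

```python
def far_branch_via_retf(cx, dx):
    """Model push cx / push dx / retf and return the resulting (CS, IP)."""
    stack = []
    stack.append(cx)    # push cx  (segment)
    stack.append(dx)    # push dx  (offset), now on top of the stack
    ip = stack.pop()    # retf pops IP first ...
    cs = stack.pop()    # ... then CS
    return cs, ip

cs, ip = far_branch_via_retf(0x8000, 0x0010)
print(f"{cs:04X}:{ip:04X}")   # 8000:0010
```

Swapping the two pushes would hand the offset to CS and the segment to IP, and the branch would land somewhere else.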


if var1 was a code label, you could just branch to the label
the assembler knows it is in a different segment, and makes it a far branch

however, var1 is not a code label - it is a data label
what you really want is
        jmp dword ptr var1
now, the assembler knows that var1 has 2 words
it knows that it is a far branch
the segment of var1 must be in a segment register
normally, the DS register holds the data segment

in everything i have discussed, i am referring to MASM syntax
we don't use emu8086 much in here

MichaelW

Quote from: jayanthd on February 25, 2013, 05:38:54 PM
Quote
the first byte of the encoded instructions is the opcode
What is an encoded instruction?

An encoded instruction is an instruction in its machine code format. The opcodes that I was referring to in the listing, in this case (but not in the general case) each a single byte, are:
C3
EB
E8
EB
EB
E9
90
90
Sorry for the confusion, I was trying to make it easy for you to identify the point of interest, the encoded displacements in the instruction operands.

jayanthd

@ Michael and Dave

Ok. If I have a 1 byte, 2 byte, and 5 byte machine code like below

F6
D58A
EBC14F8B2C90


Then F6, D5, and EB are the opcodes. Right?

I am using masm611 and emu8086. I will play with it for another 2 weeks and then I will start 32 bit assembly programming using MASM32. So, bear with me.

Can anybody give me simple MASM32 code for adding two numbers? It must have three variables: var1, var2, and sum. The result should be printed in the console window.

dedndave

correct on the opcode

you did say masm32   :P
we assume you have installed the masm32 package

;###############################################################################################

        INCLUDE    \Masm32\Include\Masm32rt.inc

;###############################################################################################

        .DATA

var1    dd 65
var2    dd 75

;***********************************************************************************************

        .DATA?

sum     dd ?

;###############################################################################################

        .CODE

;***********************************************************************************************

_main   PROC

        mov     eax,var1
        add     eax,var2
        mov     sum,eax

        print   str$(eax),13,10
        inkey
        exit

_main   ENDP

;###############################################################################################

        END     _main

dedndave

i guess, if you are using masm v 6.11, you may not have installed the masm32 package
it will be difficult to get started with 32-bit code without installing it

you can, however, assemble 16-bit code with it, provided you have a 16-bit linker
(the 32-bit linker will not link 16-bit modules)
notice that you can use some 32-bit registers in 16-bit code - that is probably a little confusing
the fact is, if you need 32-bit registers, you may as well write 32-bit code   :biggrin:

at any rate, here is a 16-bit equiv of the above program
i have omitted the display part, as you would need to write a routine for that and i didn't want to complicate it
you can watch the results in a debugger

        .MODEL  Small
        .STACK  1024
        OPTION  CaseMap:None

;####################################################################################

        .DATA

var1    dw 65
var2    dw 75

;************************************************************************************

        .DATA?

sum     dw ?

;####################################################################################

        .CODE

;************************************************************************************

_main   PROC    FAR

;----------------------------------

;DS = DGROUP

        mov     ax,@data
        mov     ds,ax

;----------------------------------

        mov     ax,var1
        add     ax,var2
        mov     sum,ax

;----------------------------------

;terminate

        mov     ax,4C00h
        int     21h

_main   ENDP

;####################################################################################

        END     _main


jayanthd

I have installed both MASM 615 and MASM32. I will start using MASM32 in another 2 days.  :P

The things that I didn't understand in the masm32 code and masm611 code are

masm32 code
print   str$(eax),13,10
        inkey


Why str$(eax)? Why not str$(sum)...
What is inkey?

masm611 code
What is OPTION  CaseMap:None


dedndave

well - i could use (sum), or i could use (eax)
it just happens that the value is in a register, at that time
so, it is more efficient to use the register

print, inkey, str$ are all macros provided by Hutch's masm32 package
they just save some typing - and make the code a little easier to read

inkey displays a "press any key" message and waits for a keypress
if you run the program by clicking on it in windows explorer, and you don't have some kind of wait,
the console opens, runs the program, and closes before you get to see the results

you will want to browse the files in the \masm32\help folder
the macros are described in hlhelp.chm
and are defined in \masm32\macros\macros.asm