What is an offset in assembly language?

Started by RedSkeleton007, June 25, 2015, 08:55:13 AM

RedSkeleton007

Hi everyone, this is my first post. I'm not bad at programming with C++, but I'm new to assembly language. I'm working my way through my assembly language textbook this summer, but so far it feels like I'm not fully grasping the concepts the book expects me to. That's a problem, because the word "offset" has come up quite often in the book so far. The following is a passage from chapter 3:

ListSize must follow immediately after list. The following code, for example, produces too large a value (24) for ListSize because the storage used by var2 affects the distance between the current location counter and the offset of list:

list BYTE 10,20,30,40 ;line 1
var2 BYTE 20 DUP(?) ;line 2
ListSize = ($ - list) ;line 3


Where the heck does 24 come from in those 3 lines of code?

Is the distance disrupted because line 3 needed to come right after line 1 instead of line 2?

Also, I know that the $ sign refers to the current location counter, but what is meant by "distance between the current location counter and the offset of list"? For this example, is $ supposed to point to the element at the end of the list? If so, I would assume that "($ - list)" would result in 0, but that can't be right.

Finally, what is meant by "the offset of list"?

rrr314159

The offset of the list is essentially the address of the list. There are some details involved with where list is offset from, but you don't need to care about that; although someone else will undoubtedly tell you all about it.

Yes, line 3 must come right after line 1.

24 is the size of list plus the size of var2, but you only want the size of list. $ is the current location counter, and "list" is the value the location counter had at the beginning of list (or, as far as you're concerned, its offset or address). So when the ListSize line is moved up to come directly after line 1 (before var2), ($ - list) will be 4, because there are 4 bytes in list.

jj2007

Quote from: RedSkeleton007 on June 25, 2015, 08:55:13 AM
list BYTE 10,20,30,40 ;line 1
var2 BYTE 20 DUP(?) ;line 2
ListSize = ($ - list) ;line 3

In addition to what rrr explained:
  $ is the position in the .data section after the var2 line
  list starts at "10"
  So $-list covers 10, 20, 30, 40 (4 bytes) plus the [20 bytes] of var2 ==> 24 bytes
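
To make the counting concrete, here is a rough sketch in Python (not MASM) of how a location counter behaves. The DataSegment class and its method names are made up for illustration and are not part of any assembler; the only assumption is that each BYTE directive advances $ by the number of bytes it reserves and that a label remembers the value $ had where the label was defined.

```python
# Toy model of an assembler's location counter for a data segment.
# Hypothetical helper, not part of MASM: it only mimics how BYTE
# directives advance "$" and how a label records where it starts.

class DataSegment:
    def __init__(self):
        self.counter = 0      # plays the role of "$"
        self.labels = {}      # label name -> offset at which it was defined

    def byte(self, name, count):
        """Reserve `count` bytes under `name`, like `name BYTE ...`."""
        self.labels[name] = self.counter
        self.counter += count

    def distance_from(self, name):
        """Evaluate `$ - name` at the current position."""
        return self.counter - self.labels[name]


def list_size_after_var2():
    """ListSize placed after var2, as in the original three lines."""
    seg = DataSegment()
    seg.byte("list", 4)                 # list BYTE 10,20,30,40
    seg.byte("var2", 20)                # var2 BYTE 20 DUP(?)
    return seg.distance_from("list")    # ListSize = ($ - list)


def list_size_after_list():
    """ListSize placed directly after list, before var2."""
    seg = DataSegment()
    seg.byte("list", 4)                 # list BYTE 10,20,30,40
    size = seg.distance_from("list")    # ListSize = ($ - list)
    seg.byte("var2", 20)                # var2 BYTE 20 DUP(?)
    return size


print(list_size_after_var2())   # 24
print(list_size_after_list())   # 4
```

The only difference between the two functions is where the ($ - list) line sits relative to var2, which is exactly the point of the textbook passage.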

redskull

To take a more general approach, every byte has an address in RAM, but that address need not be an 'absolute' address counting from zero.  In many cases it's more convenient to establish some other location as the 'base' and then calculate the correct address relative to that.  Looping over an array is an example: instead of saying 'element x' of the array, we can say "the item offset 'x' elements from the start".  Things are even more complex in assembly, where every byte can be referred to relative to any other byte; if, for example, you have three 4-byte integers sequentially in memory, then the 'middle' one can be referenced as offset +4 from the first, -4 from the third, or via its absolute address from zero (which the assembler figures out for you), plus many more combinations.  The CPU actually has addressing modes built in to help do this, and if you ever have the (dis)pleasure of using segmented memory, you will become more familiar with it than you ever wanted.
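
The three-integer example can be tried out with a small Python sketch. Everything in it is an assumption made for illustration: the buffer stands in for memory, the 8 bytes of padding just keep the first integer away from address zero, and the values 111, 222, 333 are arbitrary.

```python
import struct

# 8 bytes of padding, then three 4-byte little-endian integers back to back.
memory = bytes(8) + struct.pack("<3i", 111, 222, 333)

# Positions of the three integers inside the buffer (hypothetical names).
FIRST, MIDDLE, THIRD = 8, 12, 16


def read_int(base, offset):
    """Read the 4-byte integer located `offset` bytes away from `base`."""
    return struct.unpack_from("<i", memory, base + offset)[0]


print(read_int(FIRST, +4))    # 222: middle integer, relative to the first
print(read_int(THIRD, -4))    # 222: middle integer, relative to the third
print(read_int(0, MIDDLE))    # 222: middle integer, relative to address zero
```

All three calls land on the same four bytes; only the choice of base differs.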

-r

Tedd

Consider offset to be the same as address, i.e. what a pointer refers to.

24 is 4+20 -- list has 4 bytes, var2 has 20 bytes.

In memory it would look like this (addresses in hex; the list values are shown in decimal, as written in the source):

00401000  10    ;list
00401001  20
00401002  30
00401003  40
00401004  00    ;var2
00401005  00
00401006  00
     :
00401015  00
00401016  00
00401017  00

So, ListSize = ($ - list) = 00401018 - 00401000 = 18h (24 decimal)


If you actually want the size of list, you'll have to move the ListSize line to immediately after list's definition, so that $ points just past the last byte of list.
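
For anyone who wants to check the hex arithmetic, here is a minimal Python sketch. The layout function is a made-up helper, and the base address 00401000 is simply the one used in the listing above; a real program may be loaded somewhere else.

```python
def layout(base):
    """Return (address of list, address of var2, value of $ after var2)."""
    list_addr = base
    var2_addr = list_addr + 4     # list occupies 4 bytes
    end = var2_addr + 20          # var2 occupies 20 bytes
    return list_addr, var2_addr, end


list_addr, var2_addr, dollar = layout(0x00401000)

print(f"{var2_addr:08X}")             # 00401004
print(f"{dollar:08X}")                # 00401018
print(f"{dollar - list_addr:X}h")     # 18h
print(dollar - list_addr)             # 24, what ($ - list) gives after var2
print(var2_addr - list_addr)          # 4, what it gives right after list
```

The differences stay at 24 and 4 whatever base is passed in, since the base cancels out in the subtraction.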

K_F

On most other CPUs the memory map is one continuous block, so references to variables are absolute memory locations (addresses).

Intel (and later, others) introduced what are called segmented addressing modes, where a base address is selected by the appropriately named segment register (CS, DS, etc.) and everything else is defined as an OFFSET from that base (ADDRESS = SEGMENT BASE ADDRESS + OFFSET ADDRESS).

This allows a program written for a continuous memory block (as above) to be loaded anywhere in memory and work as 'normal', as long as its memory references are OFFSETs from the base/segment register. The compiler/assembler handles the offsets for you.

As you can see, you now have the ability to load many programs (multitasking) anywhere in memory without having to worry about absolute memory locations.
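
A small Python sketch of the base-plus-offset idea, assuming 16-bit real mode on the 8086, where the base is the segment register value times 16 and addresses wrap at 20 bits. Protected mode finds the base differently, so this is only one concrete case, and the function name and sample values are made up for illustration.

```python
def real_mode_address(segment, offset):
    """Physical address for segment:offset in 8086 real mode (assumed here)."""
    return ((segment << 4) + offset) & 0xFFFFF   # base = segment * 16, 20-bit wrap


# The same offset under two different segment values: the program's
# offsets stay the same, only the base moves.
print(hex(real_mode_address(0x1234, 0x0010)))   # 0x12350
print(hex(real_mode_address(0x2000, 0x0010)))   # 0x20010

# Different segment:offset pairs can also name the same byte.
print(hex(real_mode_address(0x1235, 0x0000)))   # 0x12350
```

The first two lines show the relocation effect described above: changing only the segment value moves every reference by the same amount.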