The MASM Forum


Title: How to access structure array elements
Post by: NoCforMe on June 28, 2012, 07:05:57 AM
I feel like such an idiot.

I should know this. I'm trying to create an array of structures, and access an element of the array using a pointer.

I've created the array with no problem:


$test STRUCT
  s1 DB 20 DUP(?)
  s2 DB 10 DUP(?)
  s3 DB 4 DUP(?)
$test ENDS

TestStructs $test 4 DUP (<>)


Problem is, when I try to access an element of the array, the subscript I use becomes a byte offset within one of the strings, not an offset to the nth element.

In other words, if I do this:


LEA EAX, TestStructs[1].s2


I end up pointing to the 2nd byte within s2 of the first element--not at all what I want.

I thought I knew how to do this. MASM's behavior here seems completely counter-intuitive. If I say TestStructs[1].s2, I'm saying I want the 2nd element (0-based) of the array of structures (what's to the left of the period), and then I want the offset to field s2 within that element. Right?

Obviously, wrong. The following little program shows it clearly:


;============================================
; Array addressing testbed
;============================================


include \masm32\include\masm32rt.inc


;============================================
; Defines, macros, prototypes, etc.
;============================================

$test STRUCT
  s1 DB 20 DUP(?)
  s2 DB 10 DUP(?)
  s3 DB 4 DUP(?)
$test ENDS


;============================================
; HERE BE DATA
;============================================
.data

TestStructs $test 4 DUP (<>)

Addrfmt DB "Address of TestStructs[0].s2: %x", 13, 10
DB "Address of TestStructs[1].s2: %x", 13, 10, 0

buffer DB 200 DUP(?)


;============================================
; CODE LIVES HERE
;============================================
.code


start: INVOKE wsprintf, OFFSET buffer, OFFSET Addrfmt,
OFFSET TestStructs[0].s2, OFFSET TestStructs[1].s2
INVOKE MessageBox, 0, OFFSET buffer, NULL, MB_OK

INVOKE ExitProcess, EAX

END start


So what's the correct syntax for what I'm trying to do? I know this is a piece of cake with structures that don't contain arrays (i.e., strings). It seems that the subscript is being applied to the field rather than the element, since the field is an array of bytes.

(I realize those OFFSETs don't really do anything--I just tried putting them in out of desperation!)


Title: Re: How to access structure array elements
Post by: jj2007 on June 28, 2012, 07:20:35 AM
include \masm32\include\masm32rt.inc

$test      STRUCT
  s1      DB 20 DUP(?)
  s2      DB 10 DUP(?)
  s3      DB 4 DUP(?)
$test      ENDS

.data?
TestStructs   $test 4 DUP (<>)

.code
start:   lea   edi, TestStructs[3*$test].s2
   mov byte ptr [edi], 123
   print str$(TestStructs[3*$test].s2), 9, "value", 13, 10
   mov eax, edi
   sub eax, offset TestStructs
   print str$(eax), 9, "offset", 13, 10
   exit
end start

HTH, jj
Title: Re: How to access structure array elements
Post by: FORTRANS on June 28, 2012, 07:22:56 AM
Hi,

LEA EAX, TestStructs[1].s2


   This says: load EAX with the following address.
Take the address of TestStructs, add the offset of .s2,
then add 1 ([1]).  You probably want to use something
like:

$test STRUCT
  s1 DB 20 DUP(?)
  s2 DB 10 DUP(?)
  s3 DB 4 DUP(?)
$test ENDS

TestStructs $test 4 DUP (<>)
SizeOfTest      EQU     34      ; Your struc has 34 bytes.

LEA EAX, TestStructs[1*SizeOfTest].s2
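
For anyone who finds the arithmetic easier to follow in C, here is a rough sketch of the two interpretations. The struct mirrors $test from the thread; the function names masm_style_offset and element_offset are made up for illustration and are not MASM features.

```c
#include <assert.h>
#include <stddef.h>

/* Mirror of the $test structure from the thread: 20 + 10 + 4 = 34 bytes. */
struct test {
    char s1[20];
    char s2[10];
    char s3[4];
};

/* What MASM computes for TestStructs[n].s2: the bracketed value is a plain
   byte displacement that is simply added to the field offset. */
size_t masm_style_offset(size_t n)
{
    return n + offsetof(struct test, s2);
}

/* What a C-style subscript means: skip n whole elements, then add the
   field offset. In MASM this corresponds to TestStructs[n * SIZEOF $test].s2. */
size_t element_offset(size_t n)
{
    assert(n < 4);  /* the array in the thread has 4 elements */
    return n * sizeof(struct test) + offsetof(struct test, s2);
}
```

With these definitions, masm_style_offset(1) is 21 (one byte into s2 of element 0), while element_offset(1) is 54 (the start of s2 in element 1).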


   Oops, jj2007 posted something, better forget mine.

Regards,

Steve N.
Title: Re: How to access structure array elements
Post by: jj2007 on June 28, 2012, 07:43:49 AM
Quote from: FORTRANS on June 28, 2012, 07:22:56 AM
Oops, jj2007 posted something, better forget mine.

That doesn't mean your post isn't correct. On the contrary, you added theory to my practical example :biggrin:

By the way, try this:
SizeOfTest      EQU     34

lea   esi, TestStructs[3*SizeOfTest].s2
lea   edi, TestStructs[3*$test].s2
.if esi==edi
    shout "the same"
...
Title: Re: How to access structure array elements
Post by: dedndave on June 28, 2012, 09:34:28 AM
ok - my turn - lol
i would not use LEA, in this case
LEA might be needed if one of the registers already contained part of the address

if you use MOV reg,OFFSET ..... the assembler will calculate the required address for you
the assembler knows the size of the structure and the base address of the array

sometimes, it isn't so obvious until you look at the disassembled code whether LEA is required

TestStructs[1].s2
use of the brackets ([]) and the period have the same effect, here
the assembler will add the 3 elements together:
the base address of TestStructs
1
s2 (the offset of s2 in a $test structure)

Title: Re: How to access structure array elements
Post by: NoCforMe on June 28, 2012, 12:40:35 PM
Quote from: FORTRANS on June 28, 2012, 07:22:56 AM
Hi,

LEA EAX, TestStructs[1].s2


   This says: load EAX with the following address.
Take the address of TestStructs, add the offset of .s2,
then add 1 ([1]).

That's not what I would have ASS-U-med about this at all. (Even though you are correct.)

My immediate reaction is that MASM's behavior in this case is brain-damaged and illogical. On more sober contemplation, it seems that MASM simply lacks true array processing.

Why do I say "brain-damaged and illogical"? Because, well, C handles array references in a way that seems logical: array[n].field says "Take the offset of the nth element of array and add to that the offset of field". Everything to the left of the dot has to do with selecting the array element; everything to the right adds an offset to that selection.

That's the way array references should work. But MASM has it bass-ackwards. (I confirmed it with a little test prog. Doesn't matter if the fields are DDs or whatever.) How did they come up with that behavior?

In other words, what I thought was a subscript is actually just an offset, much like [EBX + VarName + 1]. The really annoying thing is that I haven't even been able to find documentation of this, at least not in the official Micro$oft MASM manual.

So is there no good shorthand method of referencing array elements using subscripts?

By the way, rather than using an equate using a hard-coded number (which would be incorrect if the size of the structure changed), I would prefer to do things this way:

TestStructs[SIZEOF $test * 1].s2

Still sucks compared to the way it should work, though ...
Title: Re: How to access structure array elements
Post by: dedndave on June 28, 2012, 12:53:10 PM
 :biggrin:

it is perfectly logical - just low-level
that is one of the major differences in programming in ASM vs compiled languages
you have to do a little more work in order to get a lot more control
and - you get to see what goes on inside the processor's "head"   :P
Title: Re: How to access structure array elements
Post by: NoCforMe on June 28, 2012, 01:28:25 PM
Quote from: dedndave on June 28, 2012, 12:53:10 PM
:biggrin:

it is perfectly logical - just low-level

Sorry, no; it's not logical at all. At least not syntactically.

Look: I make an array reference like TextField[2].field1. How in the world can you say that interpreting "[2]" as being an offset added to the offset of "field1" makes sense? It doesn't; everything on the left of the period should be evaluated as referencing a particular array element, not an offset from the 0th element. Otherwise, why have arrays at all if you can't properly reference their elements? (Well, we can, but we have to jump through a few hoops in order to do it. And it has nothing whatever to do with "low level" vs. high level.)
Title: Re: How to access structure array elements
Post by: dedndave on June 28, 2012, 01:50:15 PM
it has everything to do with being low level

at any rate....
you sure like to be contrary, don't you - lol
you're lucky we are in the campus
i remind myself that these posts are really for reference for others
Title: Re: How to access structure array elements
Post by: NoCforMe on June 28, 2012, 04:45:28 PM
I came up with another way to access array elements:


$test STRUCT
  s1 DD ?
  s2 DD ?
  s3 DD ?
$test ENDS

ar TEXTEQU <SIZEOF $test *>

LEA EDX, TestStructs[ar 2].s2


This is more intuitively satisfying to me (i.e., the "subscript" number is what one would expect), and is still "low level".
Title: Re: How to access structure array elements
Post by: tenkey on June 28, 2012, 05:03:29 PM
In most cases, you don't want to use hard-coded subscripts.

At best, you can access bytes, words, dwords, and qwords from arrays using the following syntax forms:

    mov al,ByteArray[ecx]
    mov ax,WordArray[ecx*2]
    mov eax,DwordArray[ecx*4]
    mov eax,dword ptr QwordArray[ecx*8]  ; lower half
    mov edx,dword ptr QwordArray[ecx*8+4]   ; upper half

MASM exposes the processor, and the processor knows nothing about arrays.

For an arbitrary sized item, you are forced to do the following for a variable index:

    mov     eax,sizeof $test
    imul    index_of_array   ; compute byte offset
    mov     edx,TestStructs[eax].s1
    mov     TestStructs[eax].s2,edx

or

    mov     ecx,index_of_array
    imul    eax,ecx,sizeof $test   ; compute byte offset
    mov     edx,TestStructs[eax].s1
    mov     TestStructs[eax].s2,edx

or

    imul    eax,index_of_array,sizeof $test   ; compute byte offset
    mov     edx,TestStructs[eax].s1
    mov     TestStructs[eax].s2,edx
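
The same idea written out in C, as a sketch only: the byte offset is computed by hand (index times element size) and the fields are reached through a byte pointer, which is roughly what the imul sequences above do. The struct is the three-DWORD version of $test from earlier in the thread; copy_s1_to_s2 and COUNT are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Three-DWORD version of $test, 12 bytes per element. */
struct test {
    uint32_t s1;
    uint32_t s2;
    uint32_t s3;
};

#define COUNT 4
struct test TestStructs[COUNT];

/* Copy field s1 into field s2 of the element selected by a runtime index,
   using explicit byte-offset arithmetic instead of a C subscript. */
void copy_s1_to_s2(size_t index)
{
    assert(index < COUNT);

    unsigned char *base = (unsigned char *)TestStructs;
    size_t byte_offset = index * sizeof(struct test);  /* like imul eax, index, sizeof $test */
    uint32_t value;

    /* like mov edx, TestStructs[eax].s1 */
    memcpy(&value, base + byte_offset + offsetof(struct test, s1), sizeof value);
    /* like mov TestStructs[eax].s2, edx */
    memcpy(base + byte_offset + offsetof(struct test, s2), &value, sizeof value);
}
```

A C compiler generates this multiplication for you when you write TestStructs[index].s2; in MASM you write it yourself.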
Title: Re: How to access structure array elements
Post by: MichaelW on June 28, 2012, 05:32:33 PM
Constant indexes are easy. The array index does need to be adjusted by the size of the array elements, but if you are looping through the elements the multiply can be replaced with an addition.

;==============================================================================
include \masm32\include\masm32rt.inc
;==============================================================================
    $test STRUCT
        s0  DWORD 3 DUP(?)
        s1  DWORD 3 DUP(?)
        s2  DWORD 3 DUP(?)
    $test ENDS
;==============================================================================
.data
    TestStructs $test 3 DUP (<{0,1,2},{3,4,5},{6,7,8}>)
.code
;==============================================================================
start:
;==============================================================================
    I = sizeof $test

    printf("%d\t",  TestStructs[I*0].s0[0*4])
    printf("%d\t",  TestStructs[I*0].s0[1*4])
    printf("%d\t",  TestStructs[I*0].s0[2*4])
    printf("%d\t",  TestStructs[I*0].s1[0*4])
    printf("%d\t",  TestStructs[I*0].s1[1*4])
    printf("%d\t",  TestStructs[I*0].s1[2*4])
    printf("%d\t",  TestStructs[I*0].s2[0*4])
    printf("%d\t",  TestStructs[I*0].s2[1*4])
    printf("%d\n\n",TestStructs[I*0].s2[2*4])
    printf("%d\t",  TestStructs[I*1].s0[0*4])
    printf("%d\t",  TestStructs[I*1].s0[1*4])
    printf("%d\t",  TestStructs[I*1].s0[2*4])
    printf("%d\t",  TestStructs[I*1].s1[0*4])
    printf("%d\t",  TestStructs[I*1].s1[1*4])
    printf("%d\t",  TestStructs[I*1].s1[2*4])
    printf("%d\t",  TestStructs[I*1].s2[0*4])
    printf("%d\t",  TestStructs[I*1].s2[1*4])
    printf("%d\n\n",TestStructs[I*1].s2[2*4])
    printf("%d\t",  TestStructs[I*2].s0[0*4])
    printf("%d\t",  TestStructs[I*2].s0[1*4])
    printf("%d\t",  TestStructs[I*2].s0[2*4])
    printf("%d\t",  TestStructs[I*2].s1[0*4])
    printf("%d\t",  TestStructs[I*2].s1[1*4])
    printf("%d\t",  TestStructs[I*2].s1[2*4])
    printf("%d\t",  TestStructs[I*2].s2[0*4])
    printf("%d\t",  TestStructs[I*2].s2[1*4])
    printf("%d\n\n",TestStructs[I*2].s2[2*4])

    xor ebx, ebx
    .WHILE ebx < 3 * I
        xor esi, esi
        .WHILE esi < 3
            printf("%d\t", TestStructs[ebx].s0[esi*4])
            inc esi
        .ENDW
        xor esi, esi
        .WHILE esi < 3
            printf("%d\t", TestStructs[ebx].s1[esi*4])
            inc esi
        .ENDW
        xor esi, esi
        .WHILE esi < 3
            printf("%d\t", TestStructs[ebx].s2[esi*4])
            inc esi
        .ENDW
        printf("\n\n")
        add ebx, I
    .ENDW

    inkey
    exit
;==============================================================================
end start
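
A C sketch of the same loop shape, in case it helps: the outer loop keeps a byte offset and bumps it by the element size on every pass (the "add ebx, I" step), so there is no multiply inside the loop. The struct follows the layout above; sum_all is a made-up helper that adds up the values instead of printing them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the $test structure in the program above: 9 DWORDs. */
struct test {
    int32_t s0[3];
    int32_t s1[3];
    int32_t s2[3];
};

/* Sum every DWORD in an array of `count` structures. The outer loop
   advances a byte offset by sizeof(struct test) each pass, replacing
   index * size with a running addition. */
int32_t sum_all(const struct test *array, size_t count)
{
    assert(array != NULL);

    const unsigned char *base = (const unsigned char *)array;
    const size_t end = count * sizeof(struct test);
    int32_t total = 0;

    for (size_t off = 0; off < end; off += sizeof(struct test)) {
        const struct test *elem = (const struct test *)(base + off);
        for (size_t j = 0; j < 3; j++)
            total += elem->s0[j] + elem->s1[j] + elem->s2[j];
    }
    return total;
}
```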



Title: Re: How to access structure array elements
Post by: jj2007 on June 28, 2012, 06:17:31 PM
As Dave already wrote, Masm is low level, and [n] means "offset n bytes". But you still have elegant options available:
include \masm32\include\masm32rt.inc

$test STRUCT
  s1 DB 20 DUP(?)
  s2 DB 10 DUP(?)
  s3 DB 4 DUP(?)
$test ENDS

.data?
TestStructs $test 4 DUP (<>)

.code
start: lea edi, TestStructs[1*$test].s2 ; indirect, using edi
mov byte ptr [edi], 111 ; needs to inform Masm which size
mov TestStructs[2*$test].s2, 222 ; directly, no size info needed
print str$(TestStructs[1*$test].s2), 9, "value 1.s2", 13, 10
print str$(TestStructs[2*$test].s2), 9, "value 2.s2", 13, 10
mov eax, edi
sub eax, offset TestStructs
print str$(eax), 9, "offset 1:0", 13, 10
exit
end start

Output:
111     value 1.s2
222     value 2.s2
54      offset 1:0



If that is still not highlevelish enough, you are a candidate for MasmBasic :greensml:

include \masm32\MasmBasic\MasmBasic.inc

$test      STRUCT
  s1      DB 20 DUP(?)
  s2      DB 10 DUP(?)
  s3      DB 4 DUP(?)
$test      ENDS

   Init
   Dim TestStructs(3) As $test
   mov TestStructs(3, s2), 123
   Print Str$("Value=\t%i\n", TestStructs(3, s2))
   lea ecx, TestStructs(0, s1)   ; first item
   lea eax, TestStructs(3, s2)   ; current item
   sub eax, ecx
   Inkey Str$("Offset=\t%i", eax)
   Exit
end start

Pure MAssembler™  :biggrin:
Title: Re: How to access structure array elements
Post by: tenkey on June 28, 2012, 07:01:48 PM
Quote from: MichaelW
Constant indexes are easy. The array index does need to be adjusted by the size of the array elements, but if you are looping through the elements the multiply can be replaced with an addition.

One of the reasons for the continuing existence of the C language is to have access to some of the ASM tricks.

The indexed copy loop

  for (j = 0; j < count; j++) dest[j].f1 = src[j].f2;

can be rewritten as

  pdest = dest;   // array name is ptr constant
  psrc = src;
  n = count;
  while (n--)  (pdest++)->f1 = (psrc++)->f2;

The latter replaces the subscript multiplication with address addition. It is roughly equivalent to

    lea     edi,dest
    lea     esi,src
    mov     ecx,count
    cmp     ecx,0
    je      endlbl
lbl:
    mov     eax,[esi].src_struct.f2   ; if f1 and f2 are dwords
    mov     [edi].dest_struct.f1,eax
    add     esi,sizeof src_struct
    add     edi,sizeof dest_struct
    dec     ecx
    jnz     lbl
endlbl:
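
Here are the two C loops as complete functions, so they can be compiled and compared. This is only a sketch: the struct names come from the assembly above, but their contents (including the pad field, added so the two element sizes differ) are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layouts; only f1 and f2 appear in the thread. */
struct src_struct {
    int f2;
    int pad;   /* assumed extra field so the two strides differ */
};

struct dest_struct {
    int f1;
};

/* Subscript form: each access implies index * sizeof(element). */
void copy_indexed(struct dest_struct *dest, const struct src_struct *src, size_t count)
{
    assert(count == 0 || (dest != NULL && src != NULL));
    for (size_t j = 0; j < count; j++)
        dest[j].f1 = src[j].f2;
}

/* Pointer form: each pointer advances by one element per pass,
   like "add esi, sizeof src_struct" and "add edi, sizeof dest_struct". */
void copy_pointer(struct dest_struct *dest, const struct src_struct *src, size_t count)
{
    assert(count == 0 || (dest != NULL && src != NULL));
    while (count--)
        (dest++)->f1 = (src++)->f2;
}
```

Both functions leave the same values in dest; an optimizing compiler may well turn the first form into the second on its own.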
Title: Re: How to access structure array elements
Post by: hutch-- on June 28, 2012, 09:36:04 PM
A technique that works well and is no big deal to write in assembler is an array of pointers to other array elements. The target array can have elements of uneven size (an array of variable-length strings, for example), but what makes it fast and easy to work with is an array of predictable size (a DWORD array) where each DWORD member is a pointer to one of the uneven-size elements.

High level languages do this all the time, but it's easy enough to create an array of variable-length elements that is addressed by an array of pointers.
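
A small C sketch of that layout, under the assumption that the variable-length elements are NUL-terminated strings packed back to back in one buffer. build_index is a hypothetical helper, not something from the MASM32 library; it fills a fixed-size pointer table so that element n can be reached with a plain table[n].

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Walk a pool of back-to-back NUL-terminated strings and record the
   start address of each one in `table`. The pool must end with a NUL
   inside pool_size. Returns the number of entries stored. */
size_t build_index(const char *pool, size_t pool_size,
                   const char **table, size_t table_cap)
{
    assert(pool != NULL && table != NULL);

    size_t n = 0;
    size_t pos = 0;
    while (pos < pool_size && n < table_cap) {
        table[n++] = pool + pos;            /* fixed-size slot, variable-size target */
        pos += strlen(pool + pos) + 1;      /* skip the string and its terminator */
    }
    return n;
}
```

For a pool such as "one\0three\0seventeen", the table ends up with three pointers, and each lookup is a constant-size index into the pointer array no matter how long the strings are.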
Title: Re: How to access structure array elements
Post by: Ghandi on June 28, 2012, 10:01:46 PM
I remember running into this when i was writing some code that accessed the data directories in the PE header, i wondered why in C i could simply give the directory equate yet in MASM it was failing. I debugged it and saw the memory accesses weren't correct and realized that i had to multiply the directory index by the size of IMAGE_DATA_DIRECTORY to get it back to the correct values.
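
To make the scaling concrete, here is a C sketch using a stand-in structure with the same two-DWORD layout as IMAGE_DATA_DIRECTORY (8 bytes). The names data_directory, DIRECTORY_ENTRY_IMPORT, DIRECTORY_COUNT and directory_byte_offset are local to this example; the real definitions live in the Windows headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for IMAGE_DATA_DIRECTORY: two DWORDs, 8 bytes. */
struct data_directory {
    uint32_t VirtualAddress;
    uint32_t Size;
};

#define DIRECTORY_ENTRY_IMPORT 1u   /* same value as IMAGE_DIRECTORY_ENTRY_IMPORT */
#define DIRECTORY_COUNT        16u  /* same value as IMAGE_NUMBEROF_DIRECTORY_ENTRIES */

/* Byte displacement that has to go inside the brackets in MASM:
   the directory index scaled by the element size. */
size_t directory_byte_offset(size_t index)
{
    assert(index < DIRECTORY_COUNT);
    return index * sizeof(struct data_directory);
}
```

In C, dirs[DIRECTORY_ENTRY_IMPORT] does this multiplication implicitly; in MASM the bare equate inside the brackets would only move one byte.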

HR,
Ghandi
Title: Re: How to access structure array elements
Post by: Ryan on June 28, 2012, 11:08:15 PM
Quote from: jj2007 on June 28, 2012, 07:20:35 AM
   print str$(TestStructs[3*$test].s2), 9, "value", 13, 10
The SIZEOF operator is assumed when using the name of the struct in a calculation?
Title: Re: How to access structure array elements
Post by: dedndave on June 28, 2012, 11:11:16 PM
test$ STRUCT
  member db ?
test$ ENDS

TestStruct test$ <>


yes - test$ is a type definition and the assembler will use the size, as it has no address
TestStruct has an address, so the assembler will use that
Title: Re: How to access structure array elements
Post by: MichaelW on June 28, 2012, 11:15:33 PM
Yes, in my code I was able to reduce:
    I = sizeof $test
To:
    I = $test
And the app produced the same output as it did with the sizeof operator.



Title: Re: How to access structure array elements
Post by: NoCforMe on June 29, 2012, 03:02:51 AM
Quote from: tenkey on June 28, 2012, 05:03:29 PM
In most cases, you don't want to use hard-coded subscripts.

I agree; usually, one doesn't. However, in this case, hard-coded subscripts are what's needed, which is what prompted my question in the first place.

What I'm doing is placing addresses from one array of structures (pointers to text buffers) into another array. It's much easier to hard-code this at assemble-time (since there's a 1:1 correspondence between the two elements), rather than have to programmatically loop through the structures at runtime and plug in addresses.
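
For comparison, this is roughly what the same thing looks like in C, where the addresses are resolved at compile/link time by static initializers. The second structure (entry) and the Entries array are assumptions for the sketch, since the thread doesn't show the real ones.

```c
#include <assert.h>
#include <stddef.h>

/* The $test structure from the start of the thread. */
struct test {
    char s1[20];
    char s2[10];
    char s3[4];
};

struct test TestStructs[4];

/* Hypothetical second structure holding a pointer to a text buffer. */
struct entry {
    char *text;
};

/* Each initializer is an address constant, so no runtime loop is needed
   to plug in the pointers. */
struct entry Entries[4] = {
    { TestStructs[0].s2 },
    { TestStructs[1].s2 },
    { TestStructs[2].s2 },
    { TestStructs[3].s2 },
};

/* Distance in bytes from the start of TestStructs to the buffer that
   entry i points at. */
size_t entry_offset(size_t i)
{
    assert(i < 4);
    return (size_t)(Entries[i].text - (char *)TestStructs);
}
```

The MASM equivalent would be DD initializers of the form OFFSET TestStructs[n * SIZEOF $test].s2, one per element.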
Title: Re: How to access structure array elements
Post by: TBRANSO1 on January 25, 2019, 04:08:34 PM
Quote from: hutch-- on June 28, 2012, 09:36:04 PM
A technique that works well and is no big deal to write in assembler is an array of pointers to other array elements. The target array can have elements of uneven size (an array of variable-length strings, for example), but what makes it fast and easy to work with is an array of predictable size (a DWORD array) where each DWORD member is a pointer to one of the uneven-size elements.

High level languages do this all the time, but it's easy enough to create an array of variable-length elements that is addressed by an array of pointers.

I edited this as I figured out how to do this with pain and patience.


.NOLIST
include \masm32\include\masm32rt.inc
.LIST

ThreadFunction PROTO :DWORD
ErrorHandler PROTO :LPTSTR
ExitProcess PROTO :DWORD
Main PROTO

MYDATA STRUCT
val1 DWORD ?
val2 DWORD ?
MYDATA ENDS

public start

.CONST
   max_threads   EQU 10
   buff_size   EQU 0FFh

.DATA
  threadMsg   DB     "CreateThread ", 0
          rtlMsg      DB      "Parameters = %d, %d", 10, 0
          rtlMsg2     DB      "%s failed with error %d: %s", 0
          errMsg      DB      "Error", 0

.DATA?
gData MYDATA <?>

.CODE

start:
INVOKE Main
        ;inkey
push 0
call ExitProcess

ThreadFunction PROC USES ebx lpParam:DWORD   ; ebx is callee-saved, so preserve it
    LOCAL hStdOut:DWORD
    LOCAL msgBuf[buff_size]:BYTE
    LOCAL cchStringSize:DWORD
    LOCAL dwChars:DWORD

    push STD_OUTPUT_HANDLE
    call GetStdHandle
    mov hStdOut, eax
    cmp hStdOut, INVALID_HANDLE_VALUE
    je _ERROR
    ;printf("I am inside\n")
    mov ecx, lpParam                    ; ecx -> MYDATA block passed via CreateThread
    mov ebx, DWORD PTR [ecx]            ; val1
    mov edx, DWORD PTR [ecx+4]          ; val2
    INVOKE crt__snprintf, addr msgBuf, buff_size, addr rtlMsg, ebx, edx
    mov cchStringSize, eax              ; number of characters formatted
    INVOKE WriteConsoleA, hStdOut, addr msgBuf, cchStringSize, addr dwChars, 0
    ret
_ERROR:
    printf("I didn't make it that far inside the thread function\n")
    mov eax, 1
    ret
ThreadFunction ENDP

ErrorHandler PROC lpszFunction:LPTSTR
    LOCAL lpMsgBuf:LPVOID
    LOCAL lpDisplayBuf:LPVOID
    LOCAL dwordage:DWORD

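    mov dwordage, rv(GetLastError)
    ; With FORMAT_MESSAGE_ALLOCATE_BUFFER, lpBuffer must be the ADDRESS of the
    ; pointer variable that receives the allocated buffer.
    INVOKE FormatMessage, FORMAT_MESSAGE_ALLOCATE_BUFFER or \
                        FORMAT_MESSAGE_FROM_SYSTEM or FORMAT_MESSAGE_IGNORE_INSERTS,
                        0, dwordage, 0, addr lpMsgBuf, 0, 0
    mov eax, rv(lstrlen, lpszFunction)
    mov ebx, eax
    mov eax, rv(lstrlen, lpMsgBuf)
    add eax, ebx
    add eax, 028h                       ; room for the fixed text and the error number
    push eax                            ; uBytes
    push 0                              ; uFlags (LMEM_FIXED)
    call LocalAlloc
    mov lpDisplayBuf, eax
    ; wsprintf takes (buffer, format, args...); there is no size argument.
    INVOKE wsprintf, lpDisplayBuf, addr rtlMsg2, lpszFunction, dwordage, lpMsgBuf
    INVOKE MessageBoxA, 0, lpDisplayBuf, addr errMsg, 0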
    push lpMsgBuf
    call LocalFree
    push lpDisplayBuf
    call LocalFree
    ret
ErrorHandler ENDP

Main PROC
LOCAL localStrucArray[max_threads]:DWORD
LOCAL dwThreadIdArray[max_threads]:DWORD
LOCAL hThreadArray[max_threads]:HANDLE
LOCAL i:DWORD

mov i, 0
.WHILE i < max_threads
; Allocate one MYDATA block per thread and store its pointer in the array.
INVOKE GetProcessHeap
INVOKE HeapAlloc, eax, HEAP_ZERO_MEMORY, SIZEOF(MYDATA)
mov ecx, DWORD PTR i
mov DWORD PTR localStrucArray[ecx*4], eax
cmp DWORD PTR localStrucArray[ecx*4], 0
jne @F                                      ; allocation succeeded
INVOKE ErrorHandler, addr threadMsg         ; allocation failed: report and quit
printf("Here @ErrorHandler\n")
push 2
call ExitProcess
@@:
mov ecx, DWORD PTR i
ASSUME eax:PTR MYDATA
mov eax, DWORD PTR localStrucArray[ecx*4]   ; eax -> this thread's MYDATA
mov edx, DWORD PTR i
add edx, 10
mov (MYDATA PTR [eax]).val1, edx            ; val1 = i + 10
add edx, 90
mov (MYDATA PTR [eax]).val2, edx            ; val2 = i + 100
ASSUME eax:NOTHING
lea edx, dwThreadIdArray[ecx*4]             ; lpThreadId: address that receives the ID
mov ecx, DWORD PTR localStrucArray[ecx*4]   ; lpParameter: pointer to MYDATA
INVOKE CreateThread, 0, 0, offset ThreadFunction, ecx, 0, edx
mov ecx, DWORD PTR i
mov DWORD PTR hThreadArray[ecx*4], eax
cmp DWORD PTR hThreadArray[ecx*4], 0
jz _EXIT                                    ; CreateThread failed
inc i
.ENDW

INVOKE WaitForMultipleObjects, max_threads, addr hThreadArray, TRUE, -1

mov i, 0
.WHILE i < max_threads
mov ecx, DWORD PTR i
mov ebx, DWORD PTR hThreadArray[ecx*4]
INVOKE CloseHandle, ebx
mov ecx, DWORD PTR i
cmp localStrucArray[ecx*4], 0
je @F
call GetProcessHeap
mov ecx, DWORD PTR i
mov edx, DWORD PTR localStrucArray[ecx*4]
INVOKE HeapFree, eax, 0, edx
mov ecx, DWORD PTR i
mov DWORD PTR localStrucArray[ecx*4], 0
@@:
inc i
.ENDW
xor eax, eax
ret
_EXIT:
        printf("Here @exit\n")
push 3
call ExitProcess
Main ENDP
end start


This was the first time I attempted to tackle this bit of heap allocation.