is there a reason for not allowing struct fields resolving on locals?

Started by llm, October 26, 2024, 05:25:09 PM


llm

i do not have a real problem with that (there are several ways to fix it) but want
to understand why resolving struct fields on locals is an error

tested with UASM 2.57 (filed also an issue: https://github.com/Terraspace/UASM/issues/218)

example for just showing the assembling error - not "working"

my_struct struc
  one dw ?
  two dw ?
my_struct ends

.MODEL small
.STACK 100h
.DATA
  global_var my_struct<0>   

.CODE

local_var_test proc near
  local_var = my_struct ptr 4

  mov ax,[bp+local_var+my_struct.two] ; OK
  mov ax,[bp+local_var.two] ; ERROR: Error A2274: structure field expected <=======

local_var_test endp

start:
  mov ax,[global_var+my_struct.two] ; OK
  mov ax,[global_var.two] ; OK with global <=======
 
end start


zedd151

This both looks and seems wrong:  At least it feels 'icky'.  :tongue:
Quote from: llm on October 26, 2024, 05:25:09 PM
local_var = my_struct ptr 4


Thanks japheth, llm, (from  below this post). 
Quote from: _japheth on November 01, 2024, 11:56:25 PM
Quote from: llm
local_var = my_struct ptr 4
this is valid syntax - one of the many, many possibilities that have grown over the years with all the assemblers around - i think its origin is TASM

It's valid syntax ... but the '=' directive used here has the very strict purpose of defining an assembly-time variable.
Still looks icky.  :tongue:  :biggrin:  :biggrin:
My incorrect code is removed.  :greensml:


I felt compelled to reply since no one else has... in 5-6 days.

llm

Quote from: zedd151 on November 01, 2024, 04:28:31 PM
I felt compelled to reply since no one else has... in 5-6 days.

your response impelled me to give you the whole story  :rolleyes:

Quote
local_var = my_struct ptr 4

this is valid syntax - one of the many, many possibilities that have grown over the years with all the assemblers around - i think its origin is TASM

GOAL:

i am just trying to make UASM better at re-assembling IDA Pro output of disassembled 16/32-bit code
UASM is already very good at it and normally only minor tweaks are needed for the output to re-assemble 100% binary-equal (the best base for starting to analyze something)
but this local-struct access is one thing that would be nice for UASM to support

i've done that re-assembling several times before and i am very familiar with ASM/reversing/16-bit/DOS etc. so not a complete rookie  :angelic:

IDA can't be configured to produce different output, nor are WASM, MASM or TASM better suited - nasm has a more or less completely different syntax
and Ghidra's assembler output does not seem suited for re-assembling at all

reversing sometimes produces a lot of code - 200-300k lines are not uncommon for a simple DOS game (of which you want to understand the level or image format etc.), so having an assembler that does not need too many tweaks helps a lot - the reversing is hard enough  :tongue:

example:

localvar.c

struct my_struct
{
  int one;
  int two;
};

int localvar_test()
{
  struct my_struct local_var;
  local_var.one = 1;
  local_var.two = 2;
  return local_var.one + local_var.two;
}

struct my_struct global_var;

int main(void)
{
  global_var.one = 10;
  global_var.two = 20;
  return global_var.one + global_var.two + localvar_test();
}

build LOCALVAR.EXE with Microsoft C 5.1 (from 1988)

CL.EXE /Gs localvar.c
analyzing LOCALVAR.EXE with IDA gives this disassembly for main and localvar_test (i skipped the rest of the startup code, data segment etc. but added the my_struct type for better understanding)

...
seg000:0010 ; =============== S U B R O U T I N E =======================================
seg000:0010
seg000:0010 ; Attributes: bp-based frame
seg000:0010
seg000:0010 localvar_test  proc near              ; CODE XREF: _main+C↓p
seg000:0010
seg000:0010 local_var      = my_struct ptr -4
seg000:0010
seg000:0010                push    bp
seg000:0011                mov    bp, sp
seg000:0013                sub    sp, 4
seg000:0016                mov    [bp+local_var.one], 1
seg000:001B                mov    [bp+local_var.two], 2
seg000:0020                mov    ax, 3
seg000:0023                mov    sp, bp
seg000:0025                pop    bp
seg000:0026                retn
seg000:0026 localvar_test  endp
seg000:0026
seg000:0026 ; ---------------------------------------------------------------------------
seg000:0027                align 2
seg000:0028
seg000:0028 ; =============== S U B R O U T I N E =======================================
seg000:0028
seg000:0028
seg000:0028 ; int __cdecl main(int argc, const char **argv, const char **envp)
seg000:0028 _main          proc near              ; CODE XREF: start+8D↓p
seg000:0028                mov    global_var.one, 0Ah
seg000:002E                mov    global_var.two, 14h
seg000:0034                call    localvar_test
seg000:0037                add    ax, global_var.one
seg000:003B                add    ax, global_var.two
seg000:003F                retn
seg000:003F _main          endp
seg000:003F
...

and the problem is that i need to replace the struct field accesses in the localvar_test proc for it to be re-assemblable

so

mov    [bp+local_var.one], 1
mov    [bp+local_var.two], 2

needs to become

mov    [bp+local_var+my_struct.one],1
mov    [bp+local_var+my_struct.two],2

to be fully assemble-able with UASM - and that in my real projects hundreds of times  :undecided:
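Because the change is purely mechanical, it could in principle be applied by a small preprocessing step before assembling. The following is a rough C sketch of that idea, not a tested tool: it assumes the local's name and its struct type are already known, and `qualify_fields` and `is_ident_char` are hypothetical helper names that belong to neither IDA nor UASM.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* True for characters that can be part of a MASM-style identifier. */
static int is_ident_char(int c)
{
    return isalnum((unsigned char)c) || c == '_' || c == '@' ||
           c == '$' || c == '?';
}

/*
 * Copy `line` to `out`, turning every "<var>." into "<var>+<type>.".
 * `var` only matches as a whole identifier, so "my_local_var." is left alone.
 * Returns the number of replacements, or -1 if `out` is too small.
 */
int qualify_fields(const char *line, const char *var, const char *type,
                   char *out, size_t out_size)
{
    size_t var_len, type_len, used = 0;
    int count = 0;
    const char *p = line;

    assert(line != NULL && var != NULL && type != NULL && out != NULL);
    var_len = strlen(var);
    type_len = strlen(type);
    if (var_len == 0 || out_size == 0)
        return -1;

    while (*p != '\0') {
        int at_word_start = (p == line) || !is_ident_char(p[-1]);

        if (at_word_start && strncmp(p, var, var_len) == 0 &&
            p[var_len] == '.') {
            /* Emit "var+type"; the '.' itself is copied on the next pass. */
            if (used + var_len + 1 + type_len >= out_size)
                return -1;
            memcpy(out + used, var, var_len);
            used += var_len;
            out[used++] = '+';
            memcpy(out + used, type, type_len);
            used += type_len;
            p += var_len;
            count++;
        } else {
            if (used + 1 >= out_size)
                return -1;
            out[used++] = *p++;
        }
    }
    out[used] = '\0';
    return count;
}
```

Called with the line `mov    [bp+local_var.one], 1`, the variable `local_var` and the type `my_struct`, the sketch produces `mov    [bp+local_var+my_struct.one], 1`. Lines that are already qualified pass through unchanged, so running it twice does no harm.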

and that is only needed for "local vars" even if the local definition contains the struct as type
so it seems that UASM is ignoring the type-info coming from the local_var definition - fixing that would be great

or maybe someone can explain why it behaves differently - if there is a reason for it - other than "not implemented so far" :winking:

using your tip with

local local_var:my_struct
works only if i replace

mov    [bp+local_var.one], 1
mov    [bp+local_var.two], 2

with (removing the bp)

mov    [local_var.one],1
mov    [local_var.two],2

so there are still changes needed in all places - and sometimes these locals are not addressed via bp but via bx or whatever the original dev decided, so simply making better use of the type information would be preferable

thank you very much for your initial reply


_japheth

Quote from: llm
local_var = my_struct ptr 4
this is valid syntax - one of the many, many possibilities that have grown over the years with all the assemblers around - i think its origin is TASM

It's valid syntax ... but the '=' directive used here has the very strict purpose of defining an assembly-time variable. Hence it's always a number, without a "type". That's why the "my_struct ptr" part is simply ignored.

To change this would severely break masm compatibility - without gaining much ( if anything at all ).
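To make the arithmetic concrete: once `local_var` is only a number, `[bp+local_var+my_struct.two]` is plain addition of a frame offset and a field offset. The following C sketch is an illustration of that sum and not a description of what any assembler does internally. The value -4 is taken from the IDA listing earlier in the thread, the 16-bit field types mirror the `dw` fields, and the names `LOCAL_VAR`, `bp_displacement`, `disp_one` and `disp_two` are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit layout matching the thread's my_struct (each dw is 2 bytes). */
struct my_struct {
    int16_t one;
    int16_t two;
};

/* All the '=' directive keeps: the bare frame offset (assumed -4 here). */
enum { LOCAL_VAR = -4 };

/* Displacement from bp for a field at `field_offset` inside the struct. */
static int bp_displacement(int local_base, size_t field_offset)
{
    assert(field_offset < sizeof(struct my_struct));
    return local_base + (int)field_offset;
}

/* [bp+local_var+my_struct.one] */
int disp_one(void)
{
    return bp_displacement(LOCAL_VAR, offsetof(struct my_struct, one));
}

/* [bp+local_var+my_struct.two] */
int disp_two(void)
{
    return bp_displacement(LOCAL_VAR, offsetof(struct my_struct, two));
}
```

With these assumptions `disp_one()` is -4 and `disp_two()` is -2, i.e. `[bp-4]` and `[bp-2]`. The field name can only be resolved if the struct type is spelled out, because the number -4 carries no type of its own.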

Quote
and the problem is that i need to replace the struct field accesses in the localvar_test proc for it to be re-assemblable

so

mov    [bp+local_var.one], 1
mov    [bp+local_var.two], 2

needs to become

mov    [bp+local_var+my_struct.one],1
mov    [bp+local_var+my_struct.two],2

to be fully assemble-able with UASM - and that in my real projects hundreds of times  :undecided:

Perhaps a more compatible syntax would be:

my_struct struct
    org -4
one dw ?
two dw ?
my_struct ends

local_var textequ <my_struct>

    .code

    mov [bp+local_var.one],1
    mov [bp+local_var.two],2

Stupidity paired with audacity - a terrible power.

llm

Quote from: _japheth on November 01, 2024, 11:56:25 PM
Quote from: llm
local_var = my_struct ptr 4
this is valid syntax - one of the many, many possibilities that have grown over the years with all the assemblers around - i think its origin is TASM

It's valid syntax ... but the '=' directive used here has the very strict purpose of defining an assembly-time variable. Hence it's always a number, without a "type". That's why the "my_struct ptr" part is simply ignored.

To change this would severely break masm compatibility - without gaining much ( if anything at all ).

perfect explanation

so it seems that the IDA guys decided many, many years ago that this would be good enough for reading/understanding, but sadly never thought about real re-assembling

Quote from: _japheth on November 01, 2024, 11:56:25 PM
Quote
and the problem is that i need to replace the struct field accesses in the localvar_test proc for it to be re-assemblable

so

mov    [bp+local_var.one], 1
mov    [bp+local_var.two], 2

needs to become

mov    [bp+local_var+my_struct.one],1
mov    [bp+local_var+my_struct.two],2

to be fully assemble-able with UASM - and that in my real projects hundreds of times  :undecided:

Perhaps a more compatible syntax would be:

my_struct struct
    org -4
one dw ?
two dw ?
my_struct ends

local_var textequ <my_struct>

    .code

    mov [bp+local_var.one],1
    mov [bp+local_var.two],2



thanks for the idea - (some) changes are still needed, and it does not really work if the struct occurs as a local variable in different procs at different offsets - as it does around 50 times in my current reversing project

do you know why using

local local_var:my_struct
as zedd151 suggested does not work with bp (because local implies bp/sp-based addressing?)

_japheth

Quote from: llm on November 02, 2024, 02:21:54 AM
thanks for the idea - (some) changes are still needed, and it does not really work if the struct occurs as a local variable in different procs at different offsets - as it does around 50 times in my current reversing project

if different offsets are needed, the offset mustn't occur in the struct, of course - it should instead be added to the equate:

my_struct struct
one dw ?
two dw ?
my_struct ends

local_var textequ <-4 + my_struct>

    .code

    mov [bp+local_var.one],1
    mov [bp+local_var.two],2




Quote
do you know why using

local local_var:my_struct
as zedd151 suggested does not work with bp (because local implies bp/sp-based addressing?)


Yes, exactly.

llm

your example works

local_var textequ <-4 + my_struct>
but is there a way to have the textequ only in the local scope?
due to the nature of disassembly there are hundreds of local vars with the same name, and then i get conflicts
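The thread leaves this scoping question open. One possible workaround outside the assembler would be a preprocessing pass that picks up each proc's own `name = type ptr offset` line, so that identically named locals in different procs never share a definition. The C sketch below covers only the parsing part, under loud assumptions: the input is IDA's .asm export without the `seg000:0010` address prefix, the tokens are separated by spaces as in the listing above, and `local_def`, `parse_masm_int` and `parse_local_def` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One stack-variable definition as written by the disassembler. */
struct local_def {
    char name[64];
    char type[64];
    int  offset;
};

/* Parse a decimal number or a MASM-style hex number with trailing 'h'. */
static int parse_masm_int(const char *tok, int *value)
{
    char copy[16];
    char *end;
    long v;
    size_t len = strlen(tok);
    int base = 10;

    if (len == 0 || len >= sizeof copy)
        return 0;
    memcpy(copy, tok, len + 1);
    if (copy[len - 1] == 'h' || copy[len - 1] == 'H') {
        base = 16;
        copy[len - 1] = '\0';
    }
    v = strtol(copy, &end, base);
    if (end == copy || *end != '\0')
        return 0;
    *value = (int)v;
    return 1;
}

/*
 * Parse a line of the form "name = type ptr offset".
 * Returns 1 and fills `def` on success; returns 0 and leaves `def`
 * untouched for any other line.
 */
int parse_local_def(const char *line, struct local_def *def)
{
    struct local_def tmp;
    char ptr_kw[4];
    char number[16];

    assert(line != NULL && def != NULL);
    memset(&tmp, 0, sizeof tmp);
    if (sscanf(line, "%63s = %63s %3s %15s",
               tmp.name, tmp.type, ptr_kw, number) != 4)
        return 0;
    if (strcmp(ptr_kw, "ptr") != 0 && strcmp(ptr_kw, "PTR") != 0)
        return 0;
    if (!parse_masm_int(number, &tmp.offset))
        return 0;
    *def = tmp;
    return 1;
}
```

A pass built on this would clear its table at every `proc` line and refill it from the definitions that follow. Built-in types such as `word` are returned like any other type name, so the caller has to decide which types are structs.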


six_L

M_Opr proc
Local LOC_Vstruct:my_struct

;Write
invoke  iRand,01h,0FFFFh
mov [LOC_Vstruct].one,ax
add ax,5
mov [LOC_Vstruct].two,ax

;Read
mov ax,[LOC_Vstruct].one
mov dx,[LOC_Vstruct].two

ret                       

M_Opr endp
;¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
R_Opr proc
Local LOC_Vstruct:my_struct

;Write
invoke  iRand,01h,0FFFFh
lea rcx,LOC_Vstruct
mov (my_struct PTR [rcx]).one,ax
add ax,5
mov (my_struct PTR [rcx]).two,ax

;Read
mov ax,(my_struct PTR [rcx]).one
mov dx,(my_struct PTR [rcx]).two
ret

R_Opr endp
Say you, Say me, Say the codes together for ever.

llm

@six_L thanks but local is not flexible enough for the disassembled code and still too many changes needed