Did I find a bug in MASM?

Started by felipe, August 04, 2017, 02:38:48 AM

felipe

Preface
I put this here because I thought it was an advanced topic  :bgrin: (probably not  :biggrin:). But have a look at this, it's fun :eusa_dance: :
:greensml:

I already know that, for global variables, offset is more efficient than lea (both in speed and in size). But I started looking for alternatives to this operator and, of course, I didn't find (probably; you tell me if you know one  :biggrin:) anything more efficient than an immediate value computed at assembly time and inserted into the instruction. So offset seems to be the best choice for these cases.
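
For reference, a rough sketch of the size difference being discussed, assuming 32-bit flat-model code assembled with ml. The byte counts in the comments are the standard IA-32 encodings of the two forms; the entry point and the bare ret are only there so the fragment assembles on its own.

```asm
.386
.model  flat,stdcall
option  casemap:none

.data
foo dword   20

.code
start:
    mov eax,offset foo      ; B8 imm32     : 5 bytes, address is an immediate
    lea eax,foo             ; 8D 05 disp32 : 6 bytes, same value ends up in eax
    ret                     ; placeholder exit so the sketch is complete

end start
```

Assembling with ml /c /coff /Fl should produce a listing file in which the two encodings can be compared side by side.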

What i found, however, was a very interesting thing:

.386
.model  flat,stdcall
option  casemap:none

.data
@@:
foo dword   20

pfoo dword  foo

.code
start:
    mov eax,foo             ; The value.

    mov eax,offset foo      ; The address of foo.
    mov eax,pfoo            ; The address of foo.
    mov eax,@b              ; The address of foo.

end start


What i found interesting is that @@ will be the only label that will work for this (any other label name will give an assembly time error). You can try this code with a debugger.

And you can try this too (the .exe is attached):


.386
.model flat,stdcall
option casemap:none 
include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib

.data
@@:
foo         db       "wow!",0

.code

start:
    push MB_OK
    push    @b
    push    @b
    push 0
    call MessageBox

    push 0
    call ExitProcess

end start


And of course it will work!

But if you use more than one @@: in your data section, the code will always reference the same label (of course) and not the others (I mean when you do something like jmp @b or jz @f, etc.). So we can do this:


.386
.model  flat,stdcall
option  casemap:none

.data
@@:
foo dword   20

pfoo dword  foo
ff  dword   30

.code
start:
mov eax,foo ; The value.
mov eax,offset foo ; Address of foo.
mov eax,pfoo ; Address of foo.
mov esi,@b ; Address of foo.
add esi,8 ; Address of ff.
mov eax,[esi] ; The value (30) in eax.

end start


You can try that code with a debugger too.

So one more example of this (the .exe is attached):

.386
.model flat,stdcall
option casemap:none 
include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib

.data
@@:
tit           db "hello",0
cont             db  "wow!",0

.code

start:
    push MB_OK
    mov esi,@b             ; Address of the data section.
    push    esi               ; First element's offset .
    add     esi,6            ; We want the next string.
    push    esi              ; Second element's offset.
    push 0
    call MessageBox

    push 0
    call ExitProcess
 
end start
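
The hard-coded 6 in the example above is the length of "hello" plus its terminator. A possible variation, sketched here under the assumption that mov esi,@b behaves as reported in this post, lets the assembler compute that distance with the SIZEOF operator so the code keeps working if the first string changes.

```asm
.386
.model flat,stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib

.data
@@:
tit     db "hello",0
cont    db "wow!",0

.code

start:
    push MB_OK
    mov  esi,@b             ; address of the first data item (as reported above)
    push esi                ; lpCaption: "hello"
    add  esi,sizeof tit     ; skip the 6 bytes of tit without hard-coding them
    push esi                ; lpText: "wow!"
    push 0
    call MessageBox

    push 0
    call ExitProcess

end start
```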


Conclusion

So yeah, this is (probably; you tell me if you know  :biggrin:) not as efficient as using the offset operator for global variables. But it's fun!   :eusa_dance:

Postface

:icon_mrgreen:  :eusa_dance:

jj2007

Quote from: felipe on August 04, 2017, 02:38:48 AM
What i found interesting is that @@ will be the only label that will work for this (any other label name will give an assembly time error).

It's the other way round: @@ will be the only label that will not work for this


; .386
; .model  flat,stdcall
; option  casemap:none
include \masm32\MasmBasic\MasmBasic.inc
.data
@@:
foo dword   12345678h
pfoo dword  foo

.code
start:
  mov eax,foo                                    ; The value.
  mov ebx, pfoo                                 ; The address of foo
  mov ecx,offset foo                          ; The address of foo.
  lea edx,pfoo                                    ; The address of pfoo.
  ; mov eax, offset @b                       ; won't work: The address of foo.

  usedeb=16 ; hex please
  deb 4, "Test", eax, ebx, ecx, edx
  exit
end start


No bug anywhere...:
Test
eax             12345678
ebx             00407000
ecx             00407000
edx             00407004

felipe

I didn't code this, man: mov eax, offset @b.
So maybe you didn't read the whole post, I guess. You can try the code for yourself. I even uploaded two .exe files that run with no problems.  :icon14:

jj2007

Quote from: felipe on August 04, 2017, 03:25:31 AM
I didn't code this, man: mov eax, offset @b

Sorry, you are right: You wrote...
Quote from: felipe on August 04, 2017, 02:38:48 AM
mov eax,@b              ; The address of foo.

felipe

It's kind of strange, isn't it? I found this in the MASM32 SDK help:


Local code labels


Syntax: [instruction] @F
.
.
.
@@: [statement]
.
.
.
[instruction] @B



Description

The @@: label defines a local code label, which is in effect until the next instance of @@:. The @F and @B operands can be used in conditional and unconditional jump statements to jump to the next and previous @@: label respectively.

The @@: label is useful for defining a nearby jump point where a full label is not appropriate.


Example

cmp ax, 18h
jg @F
-            ; Less than or equal
-
-
@@:          ; Greater than


mov cx, 640  ; Set count

@@:
-            ; Loop statements
-
-
loop @B      ; Loop back



But it doesn't mention anything about using this label in the data section.
So, is it a bug? What do you think?

hutch--

felipe,

The label notation "@@:" is designed for what are normally called anonymous labels in places where having to endlessly name labels is a pest. With the DATA sections, the name is the label and in 32 bit MASM you can either use OFFSET or make a pointer to the data if you need one in code like this.

.data
  mytext db "1234567890",0
  align 4
  ptxt dd mytext

Then in the CODE section you can use the pointer rather than having to use ADDR or OFFSET.
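
A minimal sketch of what that might look like in a complete program, reusing the mytext and ptxt definitions above. The MessageBox call and the include lines are assumptions borrowed from the earlier examples in this thread, not part of the original post.

```asm
.386
.model flat,stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib

.data
  mytext db "1234567890",0
  align 4
  ptxt dd mytext                ; holds the address of mytext

.code
start:
    ; pass the pointer's contents; no ADDR or OFFSET needed
    invoke MessageBox,0,ptxt,ptxt,MB_OK

    ; the same call written with OFFSET, for comparison
    invoke MessageBox,0,offset mytext,offset mytext,MB_OK

    invoke ExitProcess,0
end start
```

Both calls show the same box. The first pushes each argument from memory (push with a 32-bit address, 6 bytes), the second pushes an immediate (5 bytes), which is the small size difference the thread keeps returning to.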

felipe

OK hutch, I know that. It adds one more item to the data section (one more item per pointer), so it's true that the offset operator is the most efficient way to get the address of a global variable, right?
In the examples I posted, using something like mov esi,@b is messier, of course. But I guess it looks like a little bug, because no other labels (not the typical names we define, like foo byte 30 or foo dd ?, but labels like foo: or some_label:) apart from @@: can be used in the data section.
So it seems like a little bug.
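
As a side note, MASM does have a documented way to name a position in the data section without allocating anything there: the LABEL directive. The sketch below is only an illustration; the name datastart is hypothetical, and the strings and the MessageBox call are borrowed from the earlier examples.

```asm
.386
.model flat,stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib

.data
datastart label byte        ; hypothetical name: marks this address, allocates no storage
tit     db "hello",0
cont    db "wow!",0

.code
start:
    push MB_OK
    push offset datastart   ; lpCaption: same address as offset tit
    push offset cont        ; lpText
    push 0
    call MessageBox

    push 0
    call ExitProcess
end start
```

Unlike @@:, a name declared this way carries a type (byte here), so it behaves like any other data name and needs OFFSET to yield its address.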

jj2007

Quote from: felipe on August 05, 2017, 12:34:17 AM
But I guess it looks like a little bug, because no other labels (not the typical names we define, like foo byte 30 or foo dd ?, but labels like foo: or some_label:) apart from @@: can be used in the data section. So it seems like a little bug.

Quote from: hutch-- on August 04, 2017, 01:46:05 PM
With the DATA sections, the name is the label

felipe

Don't get it.

jj2007

It makes no sense to ask for labels in the data section, because you already have them: the name of the variable.

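include \masm32\include\masm32rt.inc  ; provides print, str$ and ExitProcess

.data
MyDD  dd 123, 456, 789, 0  ; a "label"

.code
start:
mov esi, offset MyDD  ; use the data "label" as you like

MyCodeLabelLoop:
  lodsd                    ; eax = next dword, esi advances by 4
  test eax, eax
  je CodeLabelOut          ; stop at the 0 terminator
  print str$(eax), 13, 10
  jmp MyCodeLabelLoop

CodeLabelOut:
  invoke ExitProcess, 0
end start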

hutch--

felipe,

It sounds like you used to use TASM and I remember that TASM guys asked the same question many years ago. I still probably own my TASM disks but I doubt they would still work as I have not seen them for about 20 years. MASM has a few leftover peculiarities, it does not require square brackets around memory variables but will simply ignore them. [var] = var.

Labels are a bit different, you have scope limited labels, "label:" is only visible within a procedure where "label::" is visible across a whole module. Now while 32 and 64 bit binaries have .DATA and .DATA? sections, when you either specify initialised or uninitialised data, MASM knows the actual location and rather than get it by placing a label in front of it, the name is a global scope variable. You can either write directly to it "mov global, immediate; or register" but if you need its address as some functions require, you use OFFSET variable with notation like,
"mov eax, OFFSET variable".
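
To illustrate the scope rule described above, here is a small sketch. The procedure and label names (first, second, inner, shared) are made up for the example, and it assumes the default label scoping that MASM 6 and later apply inside PROC blocks.

```asm
.386
.model  flat,stdcall
option  casemap:none

.code
first proc
    jmp shared          ; allowed: shared is declared with a double colon
inner:                  ; single colon: visible only inside first
    ret
first endp

second proc
    ; jmp inner         ; would not assemble: inner is local to first
shared::                ; double colon: visible throughout the module
    ret
second endp

start:
    call first          ; jumps into second, whose ret returns here
    ret                 ; placeholder exit so the sketch is complete

end start
```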

I know what you are after with producing the smallest code, and using OFFSET saves you the 4 bytes that a pointer costs, but be aware that 32 bit PE binaries generally have sections aligned to 512 bytes in the file, so unless you are unlucky the extra pointer will not change the final file size; in almost all instances it just gets absorbed into a section's padding.

felipe

That's fine, hutch. I never used TASM; I used MASM 5.0.  :bgrin: And I know what you said, I understand all that. Please have a look (if you can) at what I posted. I'll repeat it here:


Labels (symbolic names) in the data section let us access the memory contents that we allocate and define. We all agree on that. If we want the offset of a data item that is global (I mean, not on the stack, not a local variable), we use the offset operator (or a similar option).

But LABELS that we use in the code section, like mainloop: or @@: or whatever:, are used (right?) to get an offset to jump to, or things like that.

If you use these latter labels in the data section, ml gives an error at assembly time, something like: register assumed error.
Except for the @@: ones.

Using @@: in the data section and then doing mov esi,@b in the code gives us the offset of the data item (defined with a label, a symbolic name) that comes right after (below) that @@: label, not its value, as we would get with mov esi,labelname.

You can try it for yourself. You can see the code i posted above.

Maybe it's a bug in ml.

Maybe who gives a sh*t.
Right?  :redface:

jj2007

Quote from: felipe on August 05, 2017, 02:06:18 AM
Using @@: in the data section and then doing mov esi,@b in the code gives us the offset of the data item (defined with a label, a symbolic name) that comes right after (below) that @@: label, not its value, as we would get with mov esi,labelname.

Interesting observation indeed :t

felipe

Quote from: hutch-- on August 05, 2017, 01:41:20 AM

Labels are a bit different, you have scope limited labels, "label:" is only visible within a procedure where "label::" is visible across a whole module.

I know what you are after with producing the smallest code, and using OFFSET saves you the 4 bytes that a pointer costs, but be aware that 32 bit PE binaries generally have sections aligned to 512 bytes in the file, so unless you are unlucky the extra pointer will not change the final file size; in almost all instances it just gets absorbed into a section's padding.

Sorry, I actually didn't know these. Thanks for sharing your knowledge.  :icon14:

hutch--

Ah, now your comments make sense. MASM 5 was a very low-level assembler compared with MASM 6.0 and 6.11, which changed how many things work. I vaguely remember there being a set of compatibility options for MASM 5.1 but I never used them. I bought MASM 6.0 very close to when it was first released and never worked with the earlier versions. MASM 6.11 was the base for the conversion to 32 bit PE files in late 1997, and even the current 64 bit version retains the bulk of the macro engine that the old version had. I own both sets of manuals, which I confess I have not used for the last 20 years or so.