h2inc+

Started by Biterider, March 29, 2024, 10:45:35 PM


Biterider

Hi Greenhorn
Quote from: Greenhorn on April 10, 2024, 07:06:52 AM
So it makes no sense to me. That's why I commented the 'signed arithmetic' out in the macro.

While reading about signed/unsigned bitfields, I found this, which makes sense for a compiler:
https://stackoverflow.com/questions/149869/does-ansi-c-support-signed-unsigned-bit-fields

Biterider

jj2007

Quote from: jj2007 on April 10, 2024, 07:20:13 AM
111=-3
110=-2
101=-1
100=-0
000=0
001=1
010=2
011=3

I admit it's a rather "forced" interpretation ;-)

So Salomon on SOF confirms my "forced" interpretation. I confess I am unsure about 100=-0, what's your take on that?

Biterider

Hi
I'm playing around with some ideas and may have another approach to the size problem: a filler field.
Since there can be more than one filler, each filler name needs a unique suffix, e.g. a sequential number. The reversal of the field order is also shown here:

REC16 RECORD REC16_Filler1:(sizeof(WORD)*8-5-2), REC16_Field2:2, REC16_Field1:5
REC32 RECORD REC32_Filler1:(sizeof(DWORD)*8-2-4), REC32_Field2:4, REC32_Field1:2

local rec16:REC16, rec32:REC32

mov ax, MASK REC16_Field1
mov eax, MASK REC32_Field1

A problem arises with :0 (=> new storage unit)
A possible solution could be something like

REC64 RECORD REC64_Filler1:(sizeof(QWORD)*8-4-32), REC64_Field2:4, REC64_Filler2:(@WordSize/2*8-3), REC64_Field1:3
Here, a second filler (REC64_Filler2) fills the rest of the first DWORD.

Note: The numbers in parentheses could be calculated by the translator. I have written them explicitly to visualise what is going on.

Biterider

Biterider

Hi
I have scanned the entire Windows API (Win11) for a :0 bitfield without finding a match. 
It seems that we can safely leave this feature for later.

Biterider

Biterider

Hi
I have now made several attempts to get the BitField to be correctly translated to asm. 
It is much more complicated than expected. The best explanation I have found is here:
https://stackoverflow.com/questions/4129961/how-is-the-size-of-a-struct-with-bit-fields-determined-measured

Biterider

Biterider

Hi
After a few hours of experimenting with BitFields using an MS compiler (which we need if we want to translate the Windows header files), I came up with a strategy that might be good enough.

It has 4 steps:

  • Determine the size of the allocation unit (BYTE, WORD, DWORD, QWORD): it is the largest of all field types (signed or unsigned).
  • Fill the allocation unit with the fields, in the reverse of the order in which they appear, until the next field would no longer fit into the allocation unit.
  • Fill the rest of the allocation unit with a dummy filler.
  • Start a new allocation unit and repeat the process from step 2 until the first field has been placed.

The implementation isn't easy but seems to work.

Biterider

jj2007

Quote: If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined.

Crystal clear indeed. I love C :thumbsup:

Using the MS compiler as "implementation" is the best idea, I guess. As you wrote already, these headers are being used by that compiler.

Biterider

Hi
After finishing the bitfield renderer, I looked at the rest of the code and found some problems that require rethinking some parts of the code. The parser is fine, but the renderer has become very large and confusing due to the complexity of the C grammar. To start from scratch, I looked for documents describing the C grammar and came across the MS C documentation, which is not so good for this purpose, but necessary because it is the target grammar.

The BNF of C seems to be a very good and compact reference: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
The GNU C Reference Manual is also a good supplement: https://www.gnu.org/software/gnu-c-manual/gnu-c-manual.html

This new version of h2inc is taking longer than expected, but I have a good feeling about it.  :biggrin:

Biterider

NoCforMe

I took a look at that BNF description. (I'm not pretending that I totally understand it, but I do get what it is, a hierarchical list of all the elements that comprise a language.) I was surprised how relatively short and compact it is.

So now are you going to go ahead and write YACC[n], now that you have the complete C language description?

(Heh, just kidding, but it might not be all that much more difficult than untangling the bitfield mess that you're dealing with.)
Assembly language programming should be fun. That's why I do it.

jj2007

Quote from: Biterider on April 20, 2024, 04:05:41 PM
the renderer has become very large and confusing due to the complexity of the C grammar

Quote from: jj2007 on March 29, 2024, 11:36:35 PM
I suppose you are aware of Gcc's option -E, which allows to get rid of most of the conditional stuff.

Do you use such an option, or do you handle the conditional stuff yourself?

Biterider

Hi
@NoCForMe: you can take a look at the code here https://github.com/ObjAsm/ObjAsm-C.2/tree/master/Projects/X/h2inc%2B
You can get a feel for it, although it is a work in progress and far from finished. I have moved the project to the public area.

@JJ: I'm translating the header files with some conditional switches fixed in the code, e.g. to get rid of all the C++ stuff. I'm leaving other well-known switches like WIN32_LEAN_AND_MEAN, INCL_WINSOCK_API_PROTOTYPES, INTERNET_PROTOCOL_VERSION, etc. in place, so that the include files can be controlled the same way as shown in the documentation and in the C examples.

Biterider

Biterider

Hi
I came across another challenge that makes the compilation more complicated: preprocessor directives inside some MASM declarations.
Example using records:

typedef struct {
    unsigned short reserved : 6,
#ifndef VERSION2
                   fBusy : 1;
#else
                   fBusy : 1,
                   fAck : 1;
#endif
} Status;


In this fictional example, a new bit has been added due to a new version.
Rendering to MASM as

Status struct
  Status_REC record \
ifndef VERSION2
    fBusy:1,
else
    fAck:1,
    fBusy:1,
endif
    reserved:6
  Status_R0 Status_REC <>
Status ends


did not work.
MASM does not accept conditional directives in the middle of a declaration that is continued across lines, and the first line (Status_REC record) must contain at least one bit declaration. In short, you cannot do it this way.

The same happens with prototypes, of which there are a few in the Windows header.

Biterider

NoCforMe

I hate to ask you this, but after all this angst and toil, do you think anyone would even use this capability if you somehow manage to implement it? I mean, who out there actually uses bitfield records in assembly language? (Except for those few instances in Win32, like the time/date fields.)

I use bit fields in my programs all the time, but I manage to do it with simple EQUs and bitmasks. No need here for packed records.

Biterider

Hi 
Translating the Windows header files is not an easy task. It poses many challenges, which I'm sharing here in order to find a practicable solution.

I've done this job at least twice and know what I'm talking about. The aim of this new version is to avoid the extensive work of manually correcting all the cases not covered by the translator. If you read my last post, you will have noticed that I used the record case as an example.
More important are the prototypes, which are much more common. 

This is one of the most important projects for the future of MASM. 
We all know about Windows.inc and its problems. Not to mention 64-bit compatibility. 

By sharing the development of the project, which is btw open source, I hope to find some people to help with some areas of the code or to test the final result. 
Besides the translation, there is a significant part at the end of the project that needs agreement on how the include files should be arranged. Technically this is a minor point, but it has a lot of potential for discussion.

If you have a better approach, please feel free to share it. The MASM community will greatly appreciate it.

Regards, Biterider

sinsi

It sounds like we need a C++ "engine". Are there any open source ones? Of course, then we need a C++ programmer to adapt the code.

I see it as three areas
 - WinAPI, easy enough to parse .lib files
 - constants
 - structures

If we can agree on syntax, we could each pick a .h file to convert?
Over the past few years I've been adding to my includes as I go (32- and 64-bit mixed) but they're not MASM64 friendly  :biggrin: