Author Topic: My assembler development update  (Read 6181 times)


  • Member
  • ***
  • Posts: 463
    • Uasm
My assembler development update
« on: June 11, 2012, 06:41:00 PM »
I've been quiet for some time.. but things are still happening.

New features included so-far:

Full ORG support, section management, and phase 1 of COFF are implemented. It can now assemble the test output to valid 32-bit and 64-bit COFF files that link and work.
Partially implemented struct/union, ifdef, ifndef..
Entry point is managed and I've made some more fixes and refactored all over the place. Now sitting at 17,000 lines of code..
ENUM support is about 70% and due to all the mad dependencies I had to start implementing scope and namespaces in parallel.

I've (partially) settled on handling namespaces and scope the same way, via the symbol table. A scope reference of 0 in the table indicates normal global scope. Namespaces/proc/enum/struct etc. are added
to the symbol table as either internal or public types. If other symbols are added that are in their scope, their scope reference changes accordingly. In the parsing of the code I use two different structures to represent:

Current Declarative Scope and
Current Referential Scope.
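
A rough Python sketch of how a scope reference in the symbol table could work. This is only an illustration of the idea, not the assembler's actual code: the names `Symbol`, `SymbolTable`, `declare` and `lookup` are made up, and the only detail taken from the description above is that scope reference 0 means global scope. `declare` would be called with the current declarative scope, `lookup` with the current referential scope.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

GLOBAL_SCOPE = 0  # scope reference 0 = normal global scope


@dataclass
class Symbol:
    name: str
    kind: str       # e.g. "namespace", "proc", "enum", "struct", "label"
    scope_ref: int  # id of the owning symbol, 0 for global scope
    sym_id: int     # this symbol's own id, usable as a scope reference


class SymbolTable:
    def __init__(self) -> None:
        self._by_key: Dict[Tuple[int, str], Symbol] = {}
        self._by_id: Dict[int, Symbol] = {}
        self._next_id = 1

    def declare(self, name: str, kind: str,
                scope_ref: int = GLOBAL_SCOPE) -> Symbol:
        """Add a symbol to the given (declarative) scope."""
        key = (scope_ref, name)
        if key in self._by_key:
            raise KeyError(f"{name!r} already declared in scope {scope_ref}")
        sym = Symbol(name, kind, scope_ref, self._next_id)
        self._next_id += 1
        self._by_key[key] = sym
        self._by_id[sym.sym_id] = sym
        return sym

    def lookup(self, name: str, scope_ref: int) -> Optional[Symbol]:
        """Search from the given (referential) scope outward to global."""
        while True:
            sym = self._by_key.get((scope_ref, name))
            if sym is not None or scope_ref == GLOBAL_SCOPE:
                return sym
            scope_ref = self._by_id[scope_ref].scope_ref
```

With this shape, the same name can exist in two namespaces without clashing, while a second declaration inside one scope is rejected.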

I'm busy finalizing all the data directives and struct/union so that COFF phase 2 work can happen: checking all the symbol types and completing the .debug$S and .debug$T sections.

I've restructured the opcode db again based on how I think this should be working (for easiest implementation).
The idea is to group all the instructions together by which parser rule can handle them.
I.e. the first block was all instructions that have NO parameters and at most an optional prefix.
I implemented this and linked it with a single parser production which handles the prefixes and 16,32,64 bit modes.
Granted it's low hanging fruit but it adds up to 130 odd instructions all fully implemented in all bit modes already.
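
To illustrate the grouping idea, here is a small Python sketch of one table plus one routine covering no-operand instructions in all three modes. The table layout and the names `NO_OPERAND`, `PREFIXES` and `encode` are invented for the example and say nothing about how the real opcode db is laid out; only a handful of instructions are listed, and the sketch does not check whether a prefix is meaningful for a given instruction.

```python
# mnemonic -> (opcode byte, operand size in bits or None, valid in 64-bit mode)
NO_OPERAND = {
    "nop":    (0x90, None, True),
    "hlt":    (0xF4, None, True),
    "clc":    (0xF8, None, True),
    "stc":    (0xF9, None, True),
    "cld":    (0xFC, None, True),
    "movsb":  (0xA4, None, True),
    "movsw":  (0xA5, 16, True),
    "movsd":  (0xA5, 32, True),
    "cwd":    (0x99, 16, True),
    "cdq":    (0x99, 32, True),
    "pushad": (0x60, 32, False),
}

PREFIXES = {"rep": 0xF3, "repe": 0xF3, "repne": 0xF2}
OPSIZE_OVERRIDE = 0x66


def encode(mnemonic, mode, prefix=None):
    """Encode a no-operand instruction for 16-, 32- or 64-bit mode."""
    if mode not in (16, 32, 64):
        raise ValueError("mode must be 16, 32 or 64")
    opcode, size, ok_in_64 = NO_OPERAND[mnemonic]
    if mode == 64 and not ok_in_64:
        raise ValueError(f"{mnemonic} is not valid in 64-bit mode")
    out = []
    if prefix is not None:
        out.append(PREFIXES[prefix])
    # Default operand size is 16 bits in 16-bit mode, 32 bits otherwise;
    # the other size is selected with the 0x66 override prefix.
    default_size = 16 if mode == 16 else 32
    if size is not None and size != default_size:
        out.append(OPSIZE_OVERRIDE)
    out.append(opcode)
    return bytes(out)
```

For example, `encode("movsw", 32, "rep")` gives the bytes F3 66 A5, while `encode("movsd", 16)` gives 66 A5.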

Once the data directive work and types are done, I'll go back to evaluating mem-expressions and finalize the rest of the instruction set support.

So at this point I've got a question:
I've started implementing the RECORD directive, but given that I've got ENUM and I've never used/or seen it used in modern times.. should I bother supporting it?

I should have a code update for you in the next week with the COFF + ENUM + INSTRUCTIONS + STRUCT/UNION and updated example code.



  • Member
  • *****
  • Posts: 7470
  • Assembler is fun ;-)
    • MasmBasic
Re: My assembler development update
« Reply #1 on: June 11, 2012, 10:00:05 PM »
I've started implementing the RECORD directive, but given that I've got ENUM and I've never used/or seen it used in modern times.. should I bother supporting it?

We've touched RECORD once in years in this thread, but despite having handy macros, I've never used them. If your macro engine will be compatible with Masm, RECORD could be implemented as a macro, too.


  • Member
  • ***
  • Posts: 463
    • Uasm
Re: My assembler development update
« Reply #2 on: June 11, 2012, 10:10:57 PM »

Tentatively I'm going to leave it out.. I guess this is the advantage of doing this yourself.. if we need it.. we can always put it in :) For now I'm sure ENUM will suffice for most bitmask-type requirements that anyone might have used it for. Right now I'm fighting my way through structs/unions, working out all their subtle nuances.. anonymous structs, nesting, named nested structs.. etc etc

Based on previous discussions around 64-bit coding, what I'm proposing as a change to the struct format is the following:
MyStruct STRUCT 4:16,[nonunique]
f1 real4 ?
f2 real4 ?
f3 real4 ?
f4 real4 ?
MyStruct ENDS

The extra parameter on the declaration specifies the alignment of the structure when it is instantiated. I.e. in the above case, if you allocate it as a stack local or in a data section it will automatically be 16-byte aligned.. ideal for putting something like _m128 on the stack.
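
The arithmetic behind that is just rounding an offset up to a power of two. A minimal Python sketch, with made-up helper names (`align_up`, `place_local`) and the assumption that the frame base itself is already 16-byte aligned, which the assembler would have to guarantee separately:

```python
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset + alignment - 1) & ~(alignment - 1)


def place_local(frame_size: int, size: int, alignment: int) -> int:
    """Reserve an aligned stack local and return the new frame size.

    The local then lives at [frame_base - returned value]; it is only
    truly aligned if frame_base is aligned as well (assumption).
    """
    return align_up(frame_size + size, alignment)
```

So a 16-byte `MyStruct` placed after one DWORD local would move the frame size from 4 to 32 rather than 20, and in a data section the current offset would simply go through `align_up(offset, 16)` before the instance is emitted.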


  • Member
  • **
  • Posts: 93
Re: My assembler development update
« Reply #3 on: June 11, 2012, 11:47:28 PM »
Hi John,

records are used in a lot of the Windows SDK includes.
And sometimes someone needs them.

The annoying thing with records in MASM is that they must have a globally unique name, and that inside structures they don't get the default size of 4 bytes in Win32 (or 8 bytes in Win64) if you declare, e.g., just 8 bits.



  • Member
  • ***
  • Posts: 463
    • Uasm
Re: My assembler development update
« Reply #4 on: June 12, 2012, 01:31:08 AM »
I've been checking and found 10 instances so far. Enough to justify implementing it :( oh well.. haha
I will add some support to them to try and overcome the issues you mentioned: they'd be given a default size in accordance with the default word size, and their names would obviously fall into my normal scoping model.. So in the GLOBAL namespace they'd have to be unique, but you'd be able to create others inside other namespaces.
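
A quick Python sketch of what "default size in accordance with default word size" could look like for a record layout. It assumes MASM's convention that the first field declared takes the most significant used bits and the last field ends at bit 0; the function names and the `word_bits` parameter are invented for the example, and the field list is the DDEACK layout quoted later in the thread.

```python
def layout_record(fields, word_bits=32):
    """fields: list of (name, width) in declaration order.

    Returns (size_in_bytes, {name: (shift, mask)}). The record always
    occupies a full word of word_bits, however few bits are declared.
    """
    if any(width <= 0 for _, width in fields):
        raise ValueError("field widths must be positive")
    total = sum(width for _, width in fields)
    if total > word_bits:
        raise ValueError(f"{total} bits do not fit in a {word_bits}-bit record")
    layout = {}
    shift = total
    for name, width in fields:
        shift -= width  # the last field declared ends up at bit 0
        layout[name] = (shift, ((1 << width) - 1) << shift)
    return word_bits // 8, layout


def pack(layout, **values):
    """Combine field values into one record word."""
    word = 0
    for name, value in values.items():
        shift, mask = layout[name]
        if (value << shift) & ~mask:
            raise ValueError(f"{value} does not fit in field {name}")
        word |= value << shift
    return word


size, ddeack = layout_record(
    [("bAppReturnCode", 8), ("reserved0", 6), ("fBusy", 1), ("fAck0", 1)])
```

Here `size` is 4 bytes even though only 16 bits are declared, and `bAppReturnCode` gets shift 8 with mask 0FF00h.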


  • Member
  • **
  • Posts: 93
Re: My assembler development update
« Reply #5 on: June 12, 2012, 08:27:45 AM »
It would be nice if you could implement it properly.  :t
I'm looking forward to your development ...

BTW, here is what I've found in my includes:

Find all "RECORD", Match case, Whole word, Subfolders, Find Results 1, "C:\Win7Inc\include"
  C:\Win7Inc\include\   DDEACK_R0   RECORD   bAppReturnCode:8,reserved0:6,fBusy:1,fAck0:1
  C:\Win7Inc\include\   DDEADVISE_R0   RECORD   reserved1:14,fDeferUpd0:1,fAckReq0:1
  C:\Win7Inc\include\   DDEDATA_R0   RECORD   unused:12,fResponse:1,fRelease0:1,reserved2:1,fAckReq1:1
  C:\Win7Inc\include\   DDEPOKE_R0   RECORD   unused0:13,fRelease1:1,fReserved0:2;* 12 unused bits.                       *
  C:\Win7Inc\include\   DDELN_R0   RECORD   unused1:13,fRelease2:1,fDeferUpd1:1,fAckReq2:1
  C:\Win7Inc\include\   DDEUP_R0   RECORD   unused2:12,fAck1:1,fRelease3:1,fReserved1:1,fAckReq3:1
  C:\Win7Inc\include\      PSAPI_WORKING_SET_BLOCK_R0   RECORD   Protection:5,ShareCount:3,Shared:1,Reserved:3,
  C:\Win7Inc\include\      PSAPI_WORKING_SET_BLOCK_R1   RECORD   VirtualPage:20
  C:\Win7Inc\include\      PSAPI_WORKING_SET_EX_BLOCK_R0   RECORD   Valid:1,ShareCount:3,Win32Protection:11,Shared:1,Node:6,Locked:1,LargePage:1
  C:\Win7Inc\include\   MIDL_STUB_MESSAGE_R0   RECORD   fInDontFree:1,fDontCallFreeInst:1,fInOnlyParam:1,fHasReturn:1,fHasExtensions:1,fHasNewCorrDesc:1,fIsIn:1,fIsOut:1,fIsOicf:1,fBufferValid:1,fHasMemoryValidateCallback:1,fInFree:1,fNeedMCCP:1,fUnused:3,fUnused2:16
  C:\Win7Inc\include\   SCRIPT_CONTROL_R0   RECORD   uDefaultLanguage:16,fContextDigits:1,fInvertPreBoundDir:1,fInvertPostBoundDir:1,fLinkStringBefore:1,fLinkStringAfter:1,fNeutralOverride:1,fNumericOverride:1,fLegacyBidiClass:1,fMergeNeutralItems:1,fReserved:7
  C:\Win7Inc\include\   SCRIPT_STATE_R0   RECORD   uBidiLevel:5,fOverrideDirection:1,fInhibitSymSwap:1,fCharShape:1,fDigitSubstitute:1,fInhibitLigate:1,fDisplayZWG:1,fArabicNumContext:1,fGcpClusters:1,fReserved:1,fEngineReserved:2
  C:\Win7Inc\include\   SCRIPT_ANALYSIS_R0   RECORD   eScript:10,fRTL:1,fLayoutRTL:1,fLinkBefore:1,fLinkAfter:1,fLogicalOrder:1,fNoGlyphIndex:1
  C:\Win7Inc\include\   SCRIPT_VISATTR_R0   RECORD   uJustification:4,fClusterStart:1,fDiacritic:1,fZeroWidth:1,fReserved:1,fShapeReserved:8
  C:\Win7Inc\include\   SCRIPT_LOGATTR_R0   RECORD   fSoftBreak:1,fWhiteSpace:1,fCharStop:1,fWordStop:1,fInvalid:1,fReserved:3
  C:\Win7Inc\include\   SCRIPT_PROPERTIES_R0   RECORD   langid:16,fNumeric:1,fComplex:1,fNeedsWordBreaking:1,fNeedsCaretInfo:1,bCharSet:8,fControl:1,fPrivateUseArea:1,fNeedsCharacterJustify:1,fInvalidGlyph:1,fInvalidLogAttr:1,fCDM:1,fAmbiguousCharSet:1,fClusterSizeVaries:1,fRejectInvalid:1
  C:\Win7Inc\include\   SCRIPT_DIGITSUBSTITUTE_R0   RECORD   NationalDigitLanguage:16,TraditionalDigitLanguage:16,DigitSubstitute:8
  C:\Win7Inc\include\   SCRIPT_CHARPROP_R0   RECORD   fCanGlyphAlone:1,reserved:15   ;// Reserved
  C:\Win7Inc\include\   COMSTAT_R0   RECORD   fCtsHold:1,fDsrHold:1,fRlsdHold:1,fXoffHold:1,fXoffSent:1,fEof:1,fTxim:1,fReserved:25
  C:\Win7Inc\include\   DCB_R0   RECORD   fBinary:1,fParity:1,fOutxCtsFlow:1,fOutxDsrFlow:1,fDtrControl:2,fDsrSensitivity:1,fTXContinueOnXoff:1,fOutX:1,fInX:1,fErrorChar:1,fNull:1,fRtsControl:2,fAbortOnError:1,fDummy2:17
  C:\Win7Inc\include\         DISPLAYCONFIG_TARGET_DEVICE_NAME_FLAGS_R0   RECORD   friendlyNameFromEdid:1,friendlyNameForced:1,edidIdsValid:1,reserved29:29
  C:\Win7Inc\include\         DISPLAYCONFIG_SET_TARGET_PERSISTENCE_R0   RECORD   bootPersistenceOn:1,reserved31:31
  C:\Win7Inc\include\         PR_IN_R0   RECORD   ServiceAction:5,Reserved1:3
  C:\Win7Inc\include\         PR_OUT_R0   RECORD   ServiceAction:5,Reserved1:3,Type_:4,Scope:4;//
  C:\Win7Inc\include\         Bits_R0   RECORD   BaseMid:8,Type_:5,Dpl:2,Pres:1,LimitHi:4,Sys:1,Reserved_0:1,Default_Big:1,Granularity:1,BaseHi:8
  C:\Win7Inc\include\         Bits_R0   RECORD   BaseMid:8,Type_:5,Dpl:2,Pres:1,LimitHi:4,Sys:1,Reserved_0:1,Default_Big:1,Granularity:1,BaseHi:8
  C:\Win7Inc\include\      DUMMYSTRUCTNAME_R0   RECORD   RatePercent:7,Reserved0:25
  C:\Win7Inc\include\   XSTATE_CONFIGURATION_R0   RECORD   OptimizedSave:1
  C:\Win7Inc\include\         DUMMYSTRUCTNAME_R00   RECORD   AllowScaling1:1,Disabled1:1,Reserved14:14
  C:\Win7Inc\include\   PROCESSOR_POWER_POLICY_INFO_R0   RECORD   AllowDemotion:1,AllowPromotion:1,Reserved30:30
  C:\Win7Inc\include\   PROCESSOR_POWER_POLICY_R0   RECORD   DisableCStates:1,Reserved31:31
  C:\Win7Inc\include\            DUMMYSTRUCTNAME_R000   RECORD   NoDomainAccounting:1,IncreasePolicy:2,DecreasePolicy:2,Reserved2:3
  C:\Win7Inc\include\         DUMMYSTRUCTNAME_R01   RECORD   NameOffset:31,NameIsString:1
  C:\Win7Inc\include\         DUMMYSTRUCTNAME2_R0   RECORD   OffsetToDirectory:31,DataIsDirectory:1
  C:\Win7Inc\include\   IMAGE_CE_RUNTIME_FUNCTION_ENTRY_R0   RECORD   PrologLen:8,FuncLen:22,ThirtyTwoBit:1,ExceptionFlag:1
  C:\Win7Inc\include\   FPO_DATA_R0   RECORD   cbProlog:8,cbRegs:3,fHasSEH:1,fUseBP:1,reserved:1,cbFrame:2
  C:\Win7Inc\include\   IMAGE_ARCHITECTURE_HEADER_R0   RECORD   AmaskValue:1,int2_:7,AmaskShift:8,int1_:16
  C:\Win7Inc\include\   IMPORT_OBJECT_HEADER_R0   RECORD   Type2:2,NameType3:3,Reserved11:11
  C:\Win7Inc\include\      Header8_R0   RECORD   Depth8:16,Sequence8:9,NextEntry8:39
  C:\Win7Inc\include\      Header8_R01   RECORD   HeaderType8:1,Init8:1,Reserved8:59,Region:3
  C:\Win7Inc\include\      Header16_R0   RECORD   Depth16:16,Sequence16:48
  C:\Win7Inc\include\      Header16_R01   RECORD   HeaderType16:1,Init16:1,Reserved16:2,NextEntry16:60
  C:\Win7Inc\include\      HeaderX64_R0   RECORD   Depth64:16,Sequence64:48
  C:\Win7Inc\include\      HeaderX64_R01   RECORD   HeaderType64:1,Reserved64:3,NextEntry64:60
  C:\Win7Inc\include\         s_R0   RECORD   LongFunctionV3:1,PersistentV3:1,PrivateV3:30
  C:\Win7Inc\include\         s_R0   RECORD   LongFunctionV1:1,PersistentV1:1,PrivateV1:30
  C:\Win7Inc\include\   MENUBARINFO_R0   RECORD   fBarFocused:1,fFocused:1
  C:\Win7Inc\include\         SCOPE_ID_R0   RECORD   Zone:28,Level:4
  Matching lines: 48    Matching files: 10    Total files searched: 170


  • Member
  • *****
  • Posts: 7470
  • Assembler is fun ;-)
    • MasmBasic
Re: My assembler development update
« Reply #6 on: June 12, 2012, 12:43:16 PM »
BTW, here is what I've found in my includes:
...  C:\Win7Inc\include\   DDEACK_R0   RECORD   bAppReturnCode:8,reserved0:6,fBusy:1,fAck0:1

The standard Masm32 installation has only these eight matches:
Code: [Select]
rName   RECORD NameIsString
rDirectory    RECORD DataIsDirectory
; FPOProlog    RECORD cbFrame
ImportRec RECORD Reserved
SHELLFLAGSTATE record fShowAllObjects
9 lines found

You miss the last one because of "Match case" (the search was case-sensitive). Check yourself - exe attached.

include \masm32\MasmBasic\   ; download
  GetFiles \Masm32\include\*.inc, "RECORD", 99, 4+1   ; 99=max matches, 4=whole word, 1=case-insensitive
  push eax
  For_ ebx=0 To eax-1
   Let esi=Trim$(Files$(ebx))
   PrintLine Left$(esi, Instr_(esi, ":")-1)
  pop eax
  Inkey Str$("%i lines found", eax)
end start


  • Member
  • **
  • Posts: 93
Re: My assembler development update
« Reply #7 on: June 13, 2012, 02:56:11 AM »
Thanks Jochen,

this was just a quick search with Visual Studio in the includes I've translated from the Windows SDK 7.1. ;)


  • Member
  • ***
  • Posts: 463
    • Uasm
Re: My assembler development update
« Reply #8 on: June 28, 2012, 05:12:23 PM »
Ok.. RECORD is implemented, TEXTEQU and EQU <Literal> are done. Code substitution for them, which also forms the basis for expanding macros, is done. CRC checking and duplicate include file testing are done.
At this point I realized I can't continue without fully supporting IF/IFDEF etc., as most of the test files I use define EQUs within these blocks, which breaks my parser rule of not allowing EQU redefinition.
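
To show why the conditional blocks matter here, a small Python sketch that only records EQUs from blocks that are actually assembled, so the same name in an IFDEF branch and its ELSE branch no longer counts as a redefinition. The function name `collect_equs` is made up, only IFDEF/IFNDEF/ELSE/ENDIF are handled, names are compared case-sensitively, and comments are stripped naively at the first `;`.

```python
def collect_equs(lines, predefined=()):
    """Return {name: text} for EQUs found in active conditional blocks."""
    defined = {name: "" for name in predefined}
    active = [True]  # active[-1]: is the current block being assembled?
    for raw in lines:
        tokens = raw.split(";", 1)[0].split()
        if not tokens:
            continue
        head = tokens[0].upper()
        if head in ("IFDEF", "IFNDEF"):
            if len(tokens) < 2:
                raise SyntaxError(f"{head} needs a symbol name")
            known = tokens[1] in defined
            cond = known if head == "IFDEF" else not known
            active.append(active[-1] and cond)
        elif head == "ELSE":
            if len(active) < 2:
                raise SyntaxError("ELSE without IF")
            was_active = active.pop()
            active.append(active[-1] and not was_active)
        elif head == "ENDIF":
            if len(active) < 2:
                raise SyntaxError("ENDIF without IF")
            active.pop()
        elif active[-1] and len(tokens) >= 3 and tokens[1].upper() == "EQU":
            name = tokens[0]
            if name in defined:
                raise ValueError(f"EQU redefinition of {name}")
            defined[name] = " ".join(tokens[2:])
    if len(active) != 1:
        raise SyntaxError("missing ENDIF")
    return defined
```

Feeding it `IFDEF UNICODE / CHARSIZE EQU 2 / ELSE / CHARSIZE EQU 1 / ENDIF` yields a single `CHARSIZE`, while two unconditional definitions of the same name still raise an error.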

While looking at these expressions and bearing in mind a lot of things still to come.. I think I have to completely refactor the expression system and parser.
The parser is just getting a bit out of hand being monolithic, so I want to separate it into different files and blocks:
IE:,,, ..

That should be fairly easy.

In addition, and this is quite a major change, I want to try and make as much common logic as possible generic and built into each parser production rule.. I already have macros that do things like:
LOOKUP_ADD_SYMBOL,PARSER_RESET, PARSER_RESTART <ptr>, NOT_VALID_IN_ENUM, NOT_VALID_IN_STRUCT etc which sit inside the various blocks to tidy up the code for handling all that common stuff, but I want to take it further with expressions.

At the moment the expression evaluator deals with simple numerical expressions and a few operators/functions like SIN, COS, TAN, ATAN, OFFSET... So it's ideal for calculating values used by variables and equs.
It handles all the different numerical types, integer/float and promotes an expression to the correct type.

Now here's my thinking.. sanity check required...:
By creating an expression result structure
and implementing a load of other operators and logic... the one system could evaluate numerical expressions, as well as logical constructs and memory addresses?

mov eax,[esi+(10+20)]

(10+20) would be handled currently... the [ ] operators would set a flag to say memory indirection.. esi would be set as the base.. any identifiers would be looked up, and the relocation generation flag would be set if required.. if it found a : operator, the previous seg reg token would be set as the override...
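
As a sanity check of the idea, here is a deliberately tiny Python sketch of one evaluator that folds constants and fills in memory-operand fields in the same pass. Everything in it is illustrative: `ExprResult` and `evaluate` are invented names, only decimal numbers, `+`, `-`, parentheses, `[ ]`, 32-bit register names and a `seg:` override are handled, and identifiers, scale factors and relocations are left out.

```python
import re
from dataclasses import dataclass
from typing import Optional

REGS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}
SEGS = {"cs", "ds", "es", "fs", "gs", "ss"}
TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[\[\]()+\-:])")


@dataclass
class ExprResult:
    is_mem: bool = False
    seg_ovr: Optional[str] = None
    base_reg: Optional[str] = None
    idx_reg: Optional[str] = None
    disp: int = 0


def evaluate(text: str) -> ExprResult:
    text, toks, pos = text.strip(), [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected input at {text[pos:]!r}")
        toks.append(m.group(1).lower())
        pos = m.end()
    res, i = ExprResult(), 0

    def peek():
        return toks[i] if i < len(toks) else None

    def take(expected=None):
        nonlocal i
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        i += 1
        return tok

    def primary(sign):
        tok = take()
        if tok.isdigit():
            res.disp += sign * int(tok)      # constant folding
        elif tok in REGS:
            if sign < 0:
                raise SyntaxError("cannot subtract a register")
            if res.base_reg is None:
                res.base_reg = tok
            elif res.idx_reg is None:
                res.idx_reg = tok
            else:
                raise SyntaxError("too many registers")
        elif tok in SEGS and peek() == ":":
            take(":")
            res.seg_ovr = tok                # segment override
            primary(sign)
        elif tok in ("(", "["):
            if tok == "[":
                res.is_mem = True            # memory indirection flag
            total(sign)
            take(")" if tok == "(" else "]")
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def total(sign):
        primary(sign)
        while peek() in ("+", "-"):
            primary(sign if take() == "+" else -sign)

    total(1)
    if i != len(toks):
        raise SyntaxError(f"trailing token {toks[i]!r}")
    return res
```

For `[esi+(10+20)]` this returns a result flagged as memory with base `esi` and displacement 30; for `es:[edi+ebx-4]` it also records the `es` override and `ebx` as index.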

Code: [Select]
EXPTYPE_MEM    equ 3
EXPTYPE_HLL    equ 5
EXPTYPE_REG    equ 6

expType     db ?
isEvaluated db ?
isTrue      db ?
iValue      dq ?
fValue      REAL8 0.0
segOvr      dd ?
regPtr      dd ?
symPtr      dd ?
scopePtr    dd ?
bAddr       dq ?
baseReg     dd ?
idxReg      dd ?
scale       dd ?
emitReloc   db ?

Something along those lines...

So in a parse rule to deal with MOV EAX,EBX

EAX and EBX will both be assumed to be expressions, fed to this system, and the two results will be of type EXPTYPE_REG..
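
In Python terms, the parse rule would then pick an encoding path purely from the result types. A minimal sketch, using only the two EXPTYPE values that appear in the listing above; `operand_type`, `RULES` and `select_rule` are invented names, the rule table is a toy, and operands are split naively on commas.

```python
EXPTYPE_MEM = 3  # values copied from the listing above
EXPTYPE_REG = 6

REGS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}


def operand_type(text: str) -> int:
    """Classify one operand; a stand-in for the real expression system."""
    text = text.strip().lower()
    if text in REGS:
        return EXPTYPE_REG
    if text.startswith("[") and text.endswith("]"):
        return EXPTYPE_MEM
    raise NotImplementedError(f"operand kind not covered by this sketch: {text!r}")


# (mnemonic, operand result types) -> which instruction form applies
RULES = {
    ("mov", (EXPTYPE_REG, EXPTYPE_REG)): "mov r32, r32",
    ("mov", (EXPTYPE_REG, EXPTYPE_MEM)): "mov r32, m32",
    ("mov", (EXPTYPE_MEM, EXPTYPE_REG)): "mov m32, r32",
}


def select_rule(line: str) -> str:
    mnemonic, _, rest = line.strip().partition(" ")
    types = tuple(operand_type(op) for op in rest.split(","))
    return RULES[(mnemonic.lower(), types)]
```

`select_rule("MOV EAX,EBX")` sees two EXPTYPE_REG results and lands on the register-to-register form, while `select_rule("mov eax,[esi+(10+20)]")` lands on the register-from-memory form.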

Does this make sense?

If anyone is interested I'd be happy to post up the full source of what's done so far... maybe someone else is sick in the head enough like me to help out :) :)