Author Topic: LLVM Macro Assembler - LLVM ML - Masm x64


  • Member
  • ****
  • Posts: 675
LLVM Macro Assembler - LLVM ML - Masm x64
« on: January 22, 2020, 09:52:29 PM »

LLVM will probably support the Microsoft Macro Assembler; maybe someone wants to discuss this project: Clang as an x64 assembler - Masm x64

Code:
Hi all,

I'm proposing to add MASM support to LLVM's assembler capabilities, which
should nearly complete LLVM's support for cross-platform Windows compilation.

Goal: Match the functionality of Microsoft's ml.exe and ml64.exe.

I'm currently trying to implement this as a set of extensions to the
assembler; it will correctly handle assembly files containing both GNU and
MASM syntax.

The features would be added to AsmParser and controlled via MCAsmInfo, and
would be available through both llvm-mc and clang. Both tools would remain
able to parse GNU-syntax assembler, but when passed the appropriate flags,
would also handle MASM. My first thought is to trigger MASM support
whenever we're targeting Windows, but we could also make it a discrete
function controlled by a different command-line flag.

We will also define a new driver (llvm-ml) that matches the command-line
interface of ml.exe and/or ml64.exe. This will likely be similar to
clang-cl and llvm-lib in building on top of a previous driver: either
llvm-mc or clang.

Known obstacles:
1. Support for "ifdef <register>": already completed, with a new
tryParseRegister method added to all TargetAsmParsers.
2. Syntax variations: MASM uses infix notation for many directives, and
does not use "."-prefixes on its directives.
3. Macro functions: MASM includes macro functions, which can emit
parameters and not just full instructions. (Probably tricky.)
4. Richer macro language in general: some MASM files rely heavily on
text-substitution for named symbols, as well as resolution of fields in
structs.

3 & 4 might be best handled by adding a preprocessor stage, but the syntax
is not similar to the C preprocessor... I'm going to try augmenting the
existing macro support first.

Any obstacles I've missed, or critiques of the approaches, would be very
welcome.

- Eric Astor

RFC: MASM support
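
To make obstacle 2 (infix directives, no "."-prefixes) a bit more concrete, here is a minimal sketch of the syntax gap. It is a toy translator written for this thread, not how llvm-ml or AsmParser actually works; the function name `masm_line_to_gnu` and the directive table are assumptions made up for illustration, and only `EQU` and the `DB`/`DW`/`DD`/`DQ` data directives are covered. MASM-style literals such as `0FFh` are not converted.

```python
import re

# Assumed mapping from MASM data directives to GNU-as directives on x86.
DATA_DIRECTIVES = {"DB": ".byte", "DW": ".short", "DD": ".long", "DQ": ".quad"}


def masm_line_to_gnu(line: str) -> str:
    """Translate one simple MASM line into GNU-as syntax (toy subset only)."""
    stripped = line.strip()

    # Infix constant definition: "name EQU value" -> ".set name, value"
    m = re.fullmatch(r"(\w+)\s+EQU\s+(.+)", stripped, flags=re.IGNORECASE)
    if m:
        return f".set {m.group(1)}, {m.group(2)}"

    # Infix data definition: "name DB 1, 2" -> "name: .byte 1, 2"
    m = re.fullmatch(r"(\w+)\s+(DB|DW|DD|DQ)\s+(.+)", stripped, flags=re.IGNORECASE)
    if m:
        directive = DATA_DIRECTIVES[m.group(2).upper()]
        return f"{m.group(1)}: {directive} {m.group(3)}"

    # Anything else (e.g. plain instructions) is passed through unchanged.
    return stripped
```

In MASM the symbol name comes first and the directive sits in the middle of the line, whereas GNU syntax leads with a "."-prefixed directive. A parser that handles both has to look one token past the identifier before deciding what kind of statement it is reading.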

Code:
Hi all,

I'm working on a project that uses clang-cl & lld-link to build for
Windows, along with some tools out of the Windows SDK... but we're
currently pre-building some pieces of MASM assembly code using Microsoft's
ml.exe & ml64.exe. Unfortunately, it's not all inline assembly, which clang
can already handle, and Microsoft's file-level directives are a bit unusual.

I plan to work on getting llvm-mc to compile (relatively simple) MASM files
when targeting a Windows x86-based platform, with the goal of matching the
output of ml.exe and ml64.exe. I've already drafted a proof-of-concept
patch that lets llvm-mc handle MASM's variants of conditional assembly
macros (including the idiomatic use of "ifdef rax" to check if a build is
targeting x86-64)... but macro functions & structs are of course looking a
bit harder.

A few questions:

1. Should all of the changes be locked behind an equivalent to clang's
-fms-compatibility flag, or would it be good if some subset of the
functionality were shared? [e.g., should .ifdef rax be a valid way to check
if the rax register exists?]

2. Is there anyone around who would be willing to answer questions
regarding the intended architecture of llvm-mc and the AsmParser classes?
I'd like to make sure my proposals fit well into the design... and I'm
starting to have trouble finding where these extensions should go. (Also,
I've had some trouble getting used to the recursive-descent parser
conventions being used. For example, how should one handle "try parsing
this identifier as a register, and if that fails, check if it's defined as
a symbol" while not emitting Errors from the first attempt?)

- Eric
llvm-mc & Microsoft's MASM
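
The "try parsing this identifier as a register, and if that fails, check if it's defined as a symbol" question can be sketched as a speculative parse that returns a "no match" value instead of recording an error. This is a toy model only: `ToyParser`, its fields, and the tiny register set are hypothetical names invented here, and none of it is LLVM's actual AsmParser API.

```python
from dataclasses import dataclass, field

# A tiny stand-in register set; a real target would supply the full list.
X86_64_REGISTERS = {"rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rsp", "rbp"}


@dataclass
class ToyParser:
    registers: set
    symbols: dict = field(default_factory=dict)
    errors: list = field(default_factory=list)

    def try_parse_register(self, name):
        """Speculative variant: returns None on failure and reports nothing."""
        lowered = name.lower()
        return lowered if lowered in self.registers else None

    def parse_register(self, name):
        """Strict variant: records an error when the name is not a register."""
        reg = self.try_parse_register(name)
        if reg is None:
            self.errors.append(f"invalid register name: {name}")
        return reg

    def is_defined(self, name):
        """Model of MASM-style 'ifdef name': register first, then symbol."""
        if self.try_parse_register(name) is not None:
            return True
        return name in self.symbols
```

With this split, `ifdef rax` evaluates to true for a parser configured with the 64-bit register set and false for one without it, and the failed register lookup leaves the error list untouched. The strict variant is built on the speculative one, so the error is emitted in one place only.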

Code:
Hi all,

Continuing work on llvm-ml (a MASM assembler)... and my latest obstacle is
in enabling MASM's convention that (unless specified) all memory location
references should be RIP-relative. Without it, we emit the wrong
instructions for "call", "jmp", etc., and anything we build fails at the
linking stage.

My best attempt at this so far is a small patch to X86AsmParser.cpp - just
taking any Intel expression with no specified base register and switching
it to use RIP - and this works alright. There's at least one exception: it
breaks the "jcc" instructions, at least "jcc <label>". The issue seems to
be that the "jcc" family exclusively takes a relative offset, never an
absolute reference... so adding a base register causes the operand not to
match. ("jcc" is always RIP-relative anyway.)

I'm not very familiar with the operand-matching logic, and am still pretty
new to LLVM as a whole. Are there more X86 instructions this will interact
badly with? Any thoughts on how this could be handled better?

If this is mostly a valid approach, might there be a way to change the
operand type of "jcc" to accept offset(base) operands, as long as base ==
X86::RIP, then ignore the RIP bit?

- Eric
[llvm-dev] MASM & RIP-relative addressing
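
The approach described in the email (give RIP as the base to any memory reference that names no base register, except for the "jcc" family) can be modelled roughly as below. This is a sketch under stated assumptions, not the actual X86AsmParser.cpp patch: `MemOperand`, `default_to_rip`, and the `RELATIVE_ONLY` set are hypothetical, the set lists only a sample of conditional jumps, and leaving operands with an index register alone is an extra assumption of this sketch.

```python
from dataclasses import dataclass, replace
from typing import Optional

# A small sample of conditional jumps, not the full jcc family (assumption).
RELATIVE_ONLY = {"je", "jne", "jz", "jnz", "jl", "jle", "jg", "jge", "ja", "jb"}


@dataclass(frozen=True)
class MemOperand:
    symbol: str
    base: Optional[str] = None   # base register, if the source specified one
    index: Optional[str] = None  # index register, if any


def default_to_rip(mnemonic: str, op: MemOperand) -> MemOperand:
    """Apply the 'RIP-relative unless specified' default to one operand."""
    # jcc only takes a relative offset, so adding a base would stop it matching.
    if mnemonic.lower() in RELATIVE_ONLY:
        return op
    # No explicit base (and, by assumption here, no index): default to RIP.
    if op.base is None and op.index is None:
        return replace(op, base="rip")
    # The source named its own addressing registers; leave them alone.
    return op
```

For example, `default_to_rip("call", MemOperand("foo"))` gains `base="rip"`, while `default_to_rip("jne", MemOperand("done"))` comes back unchanged. Whether a mnemonic list like this is the right place for the exception, as opposed to the operand-matching tables, is exactly the open question raised in the email.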


  • Member
  • *****
  • Posts: 10258
  • Assembler is fun ;-)
    • MasmBasic
Re: LLVM Macro Assembler - LLVM ML - Masm x64
« Reply #1 on: January 22, 2020, 10:32:51 PM »
Wishful thinking :cool:
Known obstacles:
3. Macro functions: MASM includes macro functions, which can emit
parameters and not just full instructions. (Probably tricky.)


  • Member
  • ***
  • Posts: 396
Re: LLVM Macro Assembler - LLVM ML - Masm x64
« Reply #2 on: March 12, 2020, 05:12:03 AM »
My guess is that his best option is to look at the JWASM source code.
MASM's macros are very powerful, and thus will be difficult to implement from scratch.


  • Administrator
  • Member
  • ******
  • Posts: 7212
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: LLVM Macro Assembler - LLVM ML - Masm x64
« Reply #3 on: March 12, 2020, 01:46:43 PM »
Sounds like pipe dreams. I understand John and nidud tweaking the old JWASM to get better results, but my view is that a new assembler should not try to ape an old one. The form of MASM dates to just before 1990, and TASM to about the same time, yet technology has advanced a very long way since that type of architecture was designed. A new macro assembler should produce its own syntax and pre-processor and be free of old assumptions.
hutch at movsd dot com    :biggrin:  :skrewy: