HJWasm Macro Library Suggestions

Started by johnsa, March 31, 2017, 08:00:16 AM

johnsa

So as mentioned in another thread, HJWasm 2.22+ features a built-in macro library which automatically adapts to the selected OPTION ARCH:<SSE|AVX> settings etc.

If you have any ideas for macros that should be built into HJWasm (custom invokes, prologues, helper functions, etc.), post them here.

For example, we might add DELPHI32_INVOKE and DELPHI32_PROLOGUE/DELPHI32_EPILOGUE as built-in macros to enable that form of ABI.

aw27

Quote from: johnsa on March 31, 2017, 08:00:16 AM
For example we might add a DELPHI32_INVOKE, and DELPHI32_PROLOGUE/DELPHI32_EPILOGUE as built in macros to enable that form of ABI.

That would be amazing.

Vortex

Hi johnsa,

A relaxed invoke macro option that can call through registers and variables without prototyping:

include     MsgBoxTimeout.inc

.data

user32      db 'user32.dll',0
text        db 'This message box will destroy itself after 4000 milliseconds',0
caption     db 'Self-destroying message box',0
func        db 'MessageBoxTimeoutA',0

.data?

hModule     dd ?

.code

start:

    invoke  LoadLibrary,ADDR user32
    mov     hModule,eax

    invoke  GetProcAddress,eax,ADDR func
    test    eax,eax
    jz      _exit

   _invoke  eax,0,ADDR text,ADDR caption,\
            MB_ICONWARNING,LANG_NEUTRAL,TIMEOUT

_exit:

    invoke  FreeLibrary,hModule

    invoke  ExitProcess,0

END start

mineiro

A return keyword on prototypes, so that an invoke can be used as an operand:
mov myvar,invoke function,par1,par2,par3
From what I have seen on MS-DOS, Linux and Windows (32 or 64 bit), most functions return their value in the ax/eax/rax register, and a second return value, if there is one, goes in dx/edx/rdx. Some functions return their result in the flags instead. This could be extended to xmm registers and so on.
If the function above has a void return type, an error message should inform the user.
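
Until an inline invoke like the one above exists, something close can be approximated with a macro function. The following is only a minimal sketch, assuming MASM-style macro functions and a callee that returns its value in eax. `retval` is a hypothetical name (the MASM32 SDK ships a similar `rv` macro), and `function`, `myvar` and `par1`..`par3` are the placeholders from the post:

```asm
; Hypothetical helper: invoke a prototyped function and expand to eax,
; so the call can be written where an operand is expected.
retval MACRO fn:REQ, args:VARARG
    IFNB <args>
        invoke fn, args         ; forward the argument list unchanged
    ELSE
        invoke fn               ; no arguments supplied
    ENDIF
    EXITM <eax>                 ; the macro call itself is replaced by "eax"
ENDM

    ; expands to: invoke function,par1,par2,par3  followed by  mov myvar,eax
    mov myvar, retval(function, par1, par2, par3)
```

A macro like this knows nothing about the return type, so it cannot raise the error for void functions that the post asks for; that check would need support in the assembler itself.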

An enum macro would be useful too.

A syscall-like invoke (with prototype checking):
__NR_exit equ 60 <--- an enum
syscall __NR_exit,0 <---
I cannot create prototypes for the syscall instruction, just as I can't for the 'int' instruction (int 80h, int 21h, int 2Fh...).

syscall eax,rdi,rsi,rdx,r10,r8,r9   <-- eax holds the function number (the enum), the other registers are parameters; used on Linux x86-64.
int 80h,eax,ebx,ecx,edx,esi,edi,ebp  <-- eax holds the function number; used on 32-bit Linux to make native system calls.
Please check the ABI to be sure the sequences above are right.
So you would need a new name (it cannot be invoke, because 'call' and 'syscall' can be mixed in the same source code); I don't have a suggestion. The program would then look like ideal mode and be portable: 32-bit Linux code could be ported to 64 bits this way, because only the enumerations change. On 32 bits, "__NR_exit EQU 1" and on x86-64, "__NR_exit EQU 60".

I am not sure whether 32-bit Linux has two different calling methods; maybe kernels below 2.2 use one and kernels from 2.2 up use the one listed above.
So Linux to BSD porting could be done too (BSD uses another ABI or calling convention; I don't know its exact name).
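
As a rough illustration of the idea, the macro below loads the Linux x86-64 argument registers in the order given in the post and then issues the system call. It is a sketch under assumptions, not a proposal for the built-in library: `sys_invoke` is a hypothetical name, only three arguments are handled, and no prototype checking is done.

```asm
; Linux x86-64 system call number (32-bit Linux would use __NR_exit EQU 1)
__NR_exit   EQU 60

; Hypothetical helper: load up to three arguments, then the call number.
; Arguments are loaded left to right, so an argument passed in a register
; that was already overwritten (e.g. rdi given as a2) would be clobbered.
sys_invoke MACRO nr:REQ, a1, a2, a3
    IFNB <a1>
        mov rdi, a1
    ENDIF
    IFNB <a2>
        mov rsi, a2
    ENDIF
    IFNB <a3>
        mov rdx, a3
    ENDIF
    mov eax, nr                 ; writing eax zero-extends into rax
    syscall                     ; the instruction overwrites rcx and r11
ENDM

    sys_invoke __NR_exit, 0     ; exit(0)
```

A 32-bit variant would keep the same call sites and swap the register list (ebx, ecx, edx, ...) and the final instruction for `int 80h`, which is the portability the post is after.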

aw27

Option to Align XMM Local variables to 16 bytes under x86.


jj2007

Quote from: aw27 on April 04, 2017, 10:42:33 PM
Option to Align XMM Local variables to 16 bytes under x86.

Could be easily done in the PROLOG macro, with the advantage that the source would still assemble with ML. There is a huge 32-bit codebase...

aw27

Quote from: jj2007 on April 05, 2017, 01:46:20 AM
Quote from: aw27 on April 04, 2017, 10:42:33 PM
Option to Align XMM Local variables to 16 bytes under x86.

Could be easily done in the PROLOG macro, with the advantage that the source would still assemble with ML. There is a huge 32-bit codebase...

I guess so, Jochen, but ML has not used a macro for the prolog since the pre-6.0 days, and I am not the man to develop one for this job.
I am ready to sacrifice ML, which I am currently using for x86, for the option I requested.

hutch--

In 64 bit MASM a 32 byte aligned stack frame is this easy. Tweak is done in the stackframe macro.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

tstproc proc

    LOCAL .ymm12 :YMMWORD
    LOCAL .ymm13 :YMMWORD
    LOCAL .ymm14 :YMMWORD
    LOCAL .ymm15 :YMMWORD

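    ; spill ymm12..ymm15 to the locals (vmovntdq faults unless the address is 32-byte aligned)
    vmovntdq .ymm12, ymm12
    vmovntdq .ymm13, ymm13
    vmovntdq .ymm14, ymm14
    vmovntdq .ymm15, ymm15

    ; placeholder for the procedure body
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop

    ; reload the registers (vmovntdqa with ymm operands needs AVX2 and 32-byte alignment)
    vmovntdqa ymm12, .ymm12
    vmovntdqa ymm13, .ymm13
    vmovntdqa ymm14, .ymm14
    vmovntdqa ymm15, .ymm15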

    ret

tstproc endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

johnsa

The biggest problem with a 32-byte aligned stack is that the OS doesn't guarantee it on entry in the first place, so you'd need to start with a manual adjustment of RSP to get things aligned to 32 bytes before the right prologue/epilogue can keep it that way.
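
The manual adjustment could look roughly like the sketch below. This is hand-written prologue/epilogue code, not something HJWasm generates; the procedure name `aligned32` and the 128-byte local area are assumptions, and a real Win64 version would also need unwind information.

```asm
; Assumes OPTION PROLOGUE:NONE / EPILOGUE:NONE so nothing else touches rsp.
aligned32 proc
    push rbp
    mov  rbp, rsp               ; remember the unadjusted stack pointer
    sub  rsp, 128               ; room for four YMMWORD slots
    and  rsp, -32               ; round rsp down to a 32-byte boundary

    vmovdqa ymmword ptr [rsp],    ymm12   ; aligned stores are now safe
    vmovdqa ymmword ptr [rsp+32], ymm13

    vmovdqa ymm12, ymmword ptr [rsp]      ; reload from the aligned slots
    vmovdqa ymm13, ymmword ptr [rsp+32]

    mov  rsp, rbp               ; undoes both the sub and the and
    pop  rbp
    ret
aligned32 endp
```

The locals are addressed relative to rsp here because, after the `and`, their distance from rbp is no longer a fixed constant.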

aw27

Quote from: hutch-- on April 05, 2017, 03:49:23 AM
In 64 bit MASM a 32 byte aligned stack frame is this easy. Tweak is done in the stackframe macro.

You will need to place all the 16-byte variables together at the beginning, otherwise they will not remain aligned. If you intersperse variable types you will have an alignment problem.
On the other hand, the stack is always aligned to 16 bytes at the beginning, after you push rbp, so why bother aligning again? After your tweak, which I have not seen, it will be aligned to 32 bytes, but the conclusion is the same.
This is how I read the Prolog macro, but something may have escaped me.

Adamanteus

I think built-in macros should only replace built-in assembler instructions (otherwise it could become AsmC) that can slow the system down when repeated a lot (especially on cheap hardware), such as xlat, scasX, stosX, lodsX, movsX, cld/std (which might need to be empty), movsx/movzx (and not used in invoke), and enter/leave; as far as one can tell, compiler authors avoid them too.

hutch--

> If you intersperse variable types you will have an alignment problem.

This is correct and it involves the discipline of stacking LOCAL variables in descending order of size, YMM then XMM then 64 bit and so on down to BYTE variables. The default prologue macro I use is 16 byte aligned which works fine with XMM, all it needs to handle YMM registers is 32 byte alignment. What I am inclined to do with MASM is just add the 32 byte version to the macro options as the main one works well and I don't want to further complicate the macro call with another option.

The reason for making this suggestion is that at a design level of an assembler, it would be very easy to test the data sizes while parsing the source code to ensure that a top down ordering is done and create an alignment error if the locals are out of order.
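
For illustration, the "big ones first" ordering might look like the sketch below. It assumes a prologue macro that leaves the frame base 32-byte aligned and allocates locals contiguously in declaration order; the procedure and variable names are made up.

```asm
orderproc proc
    LOCAL y1 :YMMWORD           ; 32 bytes, directly below the aligned frame base
    LOCAL x1 :XMMWORD           ; 16 bytes, lands on a 16-byte boundary
    LOCAL q1 :QWORD             ;  8 bytes, lands on an 8-byte boundary
    LOCAL d1 :DWORD             ;  4 bytes, lands on a 4-byte boundary
    LOCAL b1 :BYTE              ;  1 byte, no alignment requirement
    ; Declaring b1 before y1 would push y1 off its 32-byte boundary
    ; under the assumptions above.
    ret
orderproc endp
```

Under those assumptions each local ends up naturally aligned without padding, which is the property the suggested assembler check would enforce.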

aw27

Quote from: hutch-- on April 05, 2017, 11:39:07 AM
> If you intersperse variable types you will have an alignment problem.
This is correct and it involves the discipline of stacking LOCAL variables in descending order of size,

That only works if we assume all big structure and union variables have their fields aligned to at least 16 bytes, which can be a waste of memory when they don't contain SIMD data.

jj2007

Quote from: hutch-- on April 05, 2017, 11:39:07 AM
ensure that a top down ordering is done and create an alignment error if the locals are out of order.

John has remarked that ordering can be problematic if the coder accesses more than one variable at once, e.g. with a movups covering 4 dwords. But an error (or a warning?) for unaligned locals would be a great solution, as it forces the coder to observe the "big ones first" logic. Better than chasing mysterious bugs.