
Confusion about Architecture Selection (AVX or SSE)


johnsa:
Hi all,

In a recent conversation with a regular UASM user there was some confusion over why UASM was generating executables that wouldn't run on machines without AVX support. So I thought I'd copy the detail here to serve as an explanation and reminder for anyone who may run into this:

OK, so to give you some background on what is going on: UASM/ASMC/JWASM all generate a lot of code for invoke, prologue and epilogue on procs etc. UASM now generates even more, with its more advanced prologue/epilogue and macro library.

Traditionally ALL of these instructions were generated as SSE (ASMC and JWASM). The problem then arises that if you write a procedure that uses AVX, AVX2 or AVX512, there is a massive penalty for transitioning between SSE and AVX modes. To reduce this penalty you can insert VZEROALL or VZEROUPPER instructions, so that the state change doesn't cost thousands of cycles.

The problem was that under some arrangements, with SSE as the default used in the prologue, there was no opportunity for the programmer to insert these instructions to avoid the penalty; in others you might simply forget and have no idea why the code is so slow.

Because of this we totally re-worked ALL that code to work more like a fully-fledged compiler (like VC/GCC etc) which gives you an option.

We have OPTION ARCH:SSE and OPTION ARCH:AVX to control this. Depending on that setting, all proc/invoke/prologue/epilogue code and built-in macro library functionality will use the corresponding instruction set. You can switch back and forth in your code as often as you like, depending on which instruction set each part requires.

In your case, where the code needs to run on machines with SSE only and no AVX support, you should add OPTION ARCH:SSE to the code, or specify it on the command line via a switch.

OPTION ARCH:AVX was determined to be the best default, but with the command line switch or the OPTION directive it's entirely up to you.
If you add OPTION ARCH:SSE then you should get MOVQ instead of the AVX equivalent VMOVQ.

The command line switches are listed when you use -?
They are:

-archSSE OR -archAVX

You can use the OPTION multiple times in code without restriction, so you could wrap sets of SSE and AVX functions in them to provide different execution paths or library calls etc.

John
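
As a rough illustration of the "different execution paths" idea, here is a minimal C sketch of a host program choosing between two routines at run time. Everything named here is an assumption for illustration: add_sse_path and add_avx_path are hypothetical stand-ins for procedures that would be assembled under OPTION ARCH:SSE and OPTION ARCH:AVX respectively (plain C bodies are used so the sketch builds anywhere), and cpu_has_avx relies on the GCC/Clang builtin __builtin_cpu_supports, falling back to "no AVX" on other toolchains.

```c
#include <stddef.h>

/* Signature shared by both hypothetical code paths. */
typedef void (*add_fn)(const float *a, const float *b, float *out, size_t n);

/* Stand-in for a routine assembled under OPTION ARCH:SSE (hypothetical name). */
void add_sse_path(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Stand-in for a routine assembled under OPTION ARCH:AVX (hypothetical name). */
void add_avx_path(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Returns 1 if the CPU reports AVX support, otherwise 0.
   Assumption: GCC or Clang on x86; anything else conservatively returns 0. */
int cpu_has_avx(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx") ? 1 : 0;
#else
    return 0;
#endif
}

/* Pick the AVX path only when the caller says AVX is available. */
add_fn select_add(int has_avx)
{
    return has_avx ? add_avx_path : add_sse_path;
}
```

A caller would do something like select_add(cpu_has_avx())(a, b, out, n) once at start-up and keep the returned pointer, so machines without AVX never execute the AVX-encoded routine.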

johnsa:
Just to let you know, we have uploaded an update to 2.39 dated today, 4th September.
All that has changed is that the default architecture is now SSE instead of AVX, to maximise default compatibility.

The OPTION ARCH and command line switches work as before.
So if you wanted to generate AVX opcodes in invoke/prologue/epilogue you'd explicitly enable it with OPTION ARCH:AVX or -archAVX on the command line.

John

aw27:
 :t

habran:
 :biggrin:
There is one more thing that was added in that last build and that John forgot to include in the Extended Manual:
 OPTION SWITCHSIZE:SIZE, which we limited to 8000h; the default is 4000h.
Its purpose is to let the programmer choose between speed and size.
Usage, e.g.:
  OPTION SWITCHSIZE:2000h
The mechanism is in hll.c:

--- Code: ---
.....
swsize = 0x4000;
.....
          if (ModuleInfo.Ofssize == USE32 || hll->csize == 4) {
            bubblesort(hll, hll->plabels, hll->pcases, hll->casecnt);
            if ((hll->delta * 4) <= (hll->casecnt * 4 + hll->casecnt * 2))
              hll->cflag = 6;                   /* we need only jump table */
            else if (hll->delta < 256)
              hll->cflag = 4;                   /* we need both jump table and count table byte size */
            else if (hll->delta < swsize)       /* limit is swsize (default 0x4000) */
              hll->cflag = 7;                   /* we need both jump table and count table word size */
            else
              hll->cflag = 5;                   /* we will use a binary tree */
          }

--- End code ---

The Samples folder for both 32 and 64 bit contains switch32.asm, which gives an example of each kind of case
and an explanation of what it produces.
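
To make the effect of OPTION SWITCHSIZE easier to see, the selection logic quoted from hll.c above can be sketched as a standalone C function. This is only a sketch under stated assumptions, not the UASM source: choose_switch_strategy and the enum names are hypothetical, swsize stands for the value set by OPTION SWITCHSIZE, and delta is assumed to be the spread between the lowest and highest case value (the fragment does not say so). The returned numbers are the cflag values from the fragment.

```c
/* cflag values as they appear in the quoted hll.c fragment;
   the enum names themselves are hypothetical. */
enum {
    SW_JUMP_AND_BYTE_TABLE = 4, /* jump table + byte-sized count table */
    SW_BINARY_TREE         = 5, /* binary tree of compares */
    SW_JUMP_TABLE_ONLY     = 6, /* dense cases: jump table only */
    SW_JUMP_AND_WORD_TABLE = 7  /* jump table + word-sized count table */
};

/* Mirrors the if/else chain from the fragment.
   delta   - assumed to be the range covered by the case values
   casecnt - number of cases
   swsize  - limit set by OPTION SWITCHSIZE (default 0x4000 per the post) */
int choose_switch_strategy(unsigned delta, unsigned casecnt, unsigned swsize)
{
    if (delta * 4 <= casecnt * 4 + casecnt * 2)
        return SW_JUMP_TABLE_ONLY;
    else if (delta < 256)
        return SW_JUMP_AND_BYTE_TABLE;
    else if (delta < swsize)
        return SW_JUMP_AND_WORD_TABLE;
    else
        return SW_BINARY_TREE;
}
```

Under these assumptions, lowering SWITCHSIZE pushes sparse switches with a wide range into the binary-tree form sooner, while raising it keeps them in the table forms for longer, which is presumably the speed-versus-size trade-off described above.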


nidud:

--- Quote from: nidud on August 27, 2017, 05:09:21 AM ---This crashed with the latest build of Uasm (v2.39):

--- Code: ---
    .x64
    .model  flat, fastcall

    option  dllimport:<msvcrt>
    printf  proto :ptr byte, :vararg
    exit    proto :qword

    .data
    error  db "Uasm Error: %d",10,0

    .code

sw_uasm proc val

    .switch ecx

    enum = 0
    repeat 300
%   .case @CatStr(%enum)
    mov eax,enum
    enum = enum + 1
    endm

    enum = 600
    repeat 60
%   .case @CatStr(%enum)
    mov eax,enum
    enum = enum + 1
    endm

    enum = 1000
    repeat 1000
%   .case @CatStr(%enum)
    mov eax,enum
    enum = enum + 1
    endm

    .default
        xor eax,eax

    .endswitch
    ret

sw_uasm endp

main proc

    mov esi,299
    .while esi
        invoke sw_uasm,esi
        .if eax != esi
            invoke printf,addr error,esi
            .break
        .endif
        dec esi
    .endw
    mov esi,659
    .while esi >= 600
        invoke sw_uasm,esi
        .if eax != esi
            invoke printf,addr error,esi
            .break
        .endif
        dec esi
    .endw
    mov esi,1999
    .while esi >= 1000
        invoke sw_uasm,esi
        .if eax != esi
            invoke printf,addr error,esi
            .break
        .endif
        dec esi
    .endw
    mov edi,1000
    .while edi
        mov esi,2000
        .while esi
            invoke sw_uasm,esi
            dec esi
        .endw
        dec edi
    .endw
    invoke exit,0

main endp

    end main

--- End code ---

--- End quote ---
