JWasmAVX2 source

Started by habran, November 25, 2014, 07:19:31 AM

habran

Here is the main folder with the source files and a Visual Studio 2013 project.
I have fixed bugs, and it is supposed to run in 32-bit and 64-bit with any option.
Remember that this is my version with all the goodies inside (I will add a detailed manual later),
so if you want to access them in 64-bit, you should use:

option casemap : none
option win64 : 11
option frame : auto
option stackbase : rsp

Note that you need the H folder from the post below to compile successfully.
Cod-Father

habran

Here is a folder with the headers.
Decompress it and drop it into the main folder.
When you open the project, you will have to change
Project->Properties->C/C++->General->Additional Include Directories:C:\Users\Hn\Desktop\JWasm2014AVX2\H
to your own location.
Cod-Father

habran

Goodies added to this version:
PRE-EXISTING FLAG CONDITIONS (dedndave was the godfather to these children)
signed jumps
   LESS?        JGE  skip
   !LESS?       JL   skip
   GREATER?     JLE  skip
   !GREATER?    JG   skip
unsigned missing jumps
   ABOVE?       JBE  skip
   !ABOVE?      JA   skip
;//----------------------------------------------------------------------------------
BUILT IN .FOR AND .ENDFOR C-LIKE HLL LOOP
The first difference is that ';' is replaced with '¦', which is character 0A6h or 166.
To type it, hold ALT down, type 221 and then release ALT.
The second difference is that it is highly optimised and runs at lightning speed.
initializers and counters:
=, *=, /=, %=, +=, -=, <<=, >>=, &=, ^=, |=
condition operators:
==,!=,> ,< ,>=,<=,&&,||,& ,!,ZERO?,CARRY?,SIGN?,PARITY?,OVERFLOW?,LESS?,GREATER?,ABOVE?

E.g.  eax += 10,  ebx <<= 16 (shl),  ecx >>= 8 (shr)
Here is how it can be used:

.for (edx=88,ecx=4¦eax != 24 && hWnd > lParam || ebx <= 20 || ebx >= 3¦eax=23,edx=24,ebx++)
    nop
.endfor

.for (¦r8¦r8++,[rcx].RECT.top=eax)
    nop
    nop
    .if (rax)
        .continue
    .endif
    mov [rcx], dl
.endfor

;forever loop
.for (¦¦)
    .break .if eax
.endfor
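
For readers who think in C, the second loop above can be read roughly as an ordinary `for` statement. This is only a sketch of the semantics, not JWasm output. It assumes that `.continue` still runs the counter part, as C's `continue` does, and that a bare `r8` condition means "loop while r8 is non-zero". The `RECT` type is a local stand-in, and `max_passes` is a hypothetical extra that keeps the demo finite.

```c
#include <stdint.h>

/* Local stand-in for the Windows RECT structure (assumption for this sketch). */
typedef struct { int32_t left, top, right, bottom; } RECT;

/*
 * Rough C reading of:  .for (¦r8¦r8++,[rcx].RECT.top=eax)
 *   initializer : empty
 *   condition   : r8 (non-zero means keep looping)
 *   counters    : r8++ and rc->top = eax, run after every pass
 * Returns the number of passes executed.
 */
static uint64_t for_sketch(uint64_t r8, uint64_t rax, int32_t eax,
                           RECT *rc, uint64_t max_passes)
{
    uint64_t passes = 0;

    for (; r8; r8++, rc->top = eax) {
        if (++passes >= max_passes)
            break;              /* hypothetical exit, plays the role of .break */
        if (rax)
            continue;           /* .continue: the counter part still runs */
        /* "mov [rcx], dl" from the original body would go here */
    }
    return passes;
}
```

Calling `for_sketch(1, 1, 7, &rc, 3)` makes three passes and leaves `rc.top` at 7, because the counter part runs after each `continue` but not after the `break`.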

;//----------------------------------------------------------------------------------
HOMING SPACE
Although the first four parameters are passed in registers, space is still allocated on the stack for them. This is called the parameter homing space, and it is used to store parameter values if the function accesses the parameters by address instead of by value, or if the function is compiled with the homeparams flag. The minimum size of this homing space is 0x20 bytes, or four 64-bit slots, even if the function takes fewer than four parameters. When the homing space is not used to store parameter values, JWasm uses it to save non-volatile registers.

00000000`ff4f34bb mov     qword ptr [rax+8],rbx
00000000`ff4f34bf mov     qword ptr [rax+10h],rbp
00000000`ff4f34c3 mov     qword ptr [rax+18h],rsi
00000000`ff4f34c7 mov     qword ptr [rax+20h],rdi
00000000`ff4f34cb push    r12
00000000`ff4f34cd sub     rsp,70h
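
The offsets in this listing follow from the Win64 stack layout: at function entry the return address sits at [rsp], and the four home slots come right after it. The sketch below just computes those offsets. It assumes that rax holds the entry value of rsp (the instruction that sets it is not part of the listing), and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Offset of the home slot for register parameter n (1..4),
 * relative to RSP at function entry, where [rsp+0] is the return address.
 */
static size_t home_slot_offset(unsigned n)
{
    assert(n >= 1 && n <= 4);
    return 8u * n;              /* 08h, 10h, 18h, 20h */
}

/*
 * Offset of stack parameter n (5 or higher), relative to RSP in the
 * caller just before the CALL instruction. The first 20h bytes are the
 * home slots reserved for the four register parameters.
 */
static size_t caller_arg_offset(unsigned n)
{
    assert(n >= 5);
    return 8u * (n - 1);        /* 5th -> 20h, 6th -> 28h */
}
```

With these, `home_slot_offset(1)` through `home_slot_offset(4)` give 8, 10h, 18h and 20h, the four slots where the listing stores rbx, rbp, rsi and rdi.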

The automatic save of the first 4 parameter registers skips any register that is not used.

If a procedure contains no invoke, the homing space is not allocated unnecessarily.
If a procedure has no locals, no FRAME is created,
so you will not need to use:
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

;//----------------------------------------------------------------------------------
INVOKE OPTIMIZATION
If one of the first 4 invoke parameters is FALSE, NULL or 0, it will not assemble to MOV REG,0
but to XOR REG,REG
;//----------------------------------------------------------------------------------
AVX2 IMPLEMENTED




Cod-Father

jj2007

Quote from: habran on November 25, 2014, 06:06:19 PM
INVOKE OPTIMIZATION
If invoke has in first 4 parameters FALSE,NULL or 0 it will not assemble to MOV REG,0
but XOR REG,REG

Interesting - we often do that "by hand" :icon14:
Can you give an example?
Which reg is being used? What if coder wants to pass a value with this register?

habran

Here it is:

invoke testproc5, NULL,FALSE,NULL, 0,0, rdx
000000013FB618F0 33 C9                xor         ecx,ecx 
000000013FB618F2 33 D2                xor         edx,edx 
000000013FB618F4 45 33 C0             xor         r8d,r8d 
000000013FB618F7 45 33 C9             xor         r9d,r9d 
000000013FB618FA 48 C7 44 24 20 00 00 00 00 mov         qword ptr [rsp+20h],0 
000000013FB61903 48 89 54 24 28       mov         qword ptr [rsp+28h],rdx 
000000013FB61908 E8 76 00 00 00       call        testproc5 (013FB61983h) 


So not only is the register xor-ed, but you can then use it for zeros in the following parameters.
Pay attention to rdx being reused for zero in the sixth parameter.

Now I will also use it in the fifth parameter. Have a look at this:

invoke testproc5, NULL,FALSE,NULL, 0,rdx, rdx
000000013F8C18F0 33 C9                xor         ecx,ecx 
000000013F8C18F2 33 D2                xor         edx,edx 
000000013F8C18F4 45 33 C0             xor         r8d,r8d 
000000013F8C18F7 45 33 C9             xor         r9d,r9d 
000000013F8C18FA 48 89 54 24 20       mov         qword ptr [rsp+20h],rdx 
000000013F8C18FF 48 89 54 24 28       mov         qword ptr [rsp+28h],rdx 
000000013F8C1904 E8 76 00 00 00       call        testproc5 (013F8C197Fh)
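
Tallying the opcode bytes from the two listings above shows what is saved. In this sketch the xor and store encodings are copied from the listings. The 5-byte `mov ecx, 0` encoding is added only for comparison and is an assumption about what a plain MOV would cost, since the thread does not show the unoptimized output.

```c
#include <stddef.h>
#include <stdint.h>

/* From the listings: xor ecx,ecx */
static const uint8_t xor_ecx_ecx[] = { 0x33, 0xC9 };

/* Comparison only, not from the thread: mov ecx,0 (B9 imm32) */
static const uint8_t mov_ecx_0[] = { 0xB9, 0x00, 0x00, 0x00, 0x00 };

/* From the first listing: mov qword ptr [rsp+20h],0 */
static const uint8_t store_imm0[] = { 0x48, 0xC7, 0x44, 0x24, 0x20,
                                      0x00, 0x00, 0x00, 0x00 };

/* From the second listing: mov qword ptr [rsp+20h],rdx */
static const uint8_t store_rdx[] = { 0x48, 0x89, 0x54, 0x24, 0x20 };

/* Bytes saved per zeroed register parameter (xor instead of mov imm32). */
static size_t bytes_saved_reg(void)
{
    return sizeof mov_ecx_0 - sizeof xor_ecx_ecx;
}

/* Bytes saved per zero stack parameter (store a zeroed register instead of imm32). */
static size_t bytes_saved_store(void)
{
    return sizeof store_imm0 - sizeof store_rdx;
}
```

The store saving of 4 bytes matches the listings: the call instruction sits at ...1908 in the first version and at ...1904 in the second.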


MIRACLE!!! Isn't it? ;)

I could have used rcx or r8 instead :biggrin:

Quote
Which reg is being used? What if coder wants to pass a value with this register?

The register is zeroed only if the parameter is zero; otherwise it holds the value of the parameter.
I gave more intelligence to JWasm :t
Cod-Father

Gunther

Hi habran,

good idea for all the macro fans.  :t

By the way, I've checked the AVX2 code generation of JWasm (cross-checked against YASM). It works well and generates the same machine code.  :t

Gunther
You have to know the facts before you can distort them.

habran

Thanks Gunther :t
That means that we now have the best assembler on this planet :biggrin:
Cod-Father

habran

Quote
Interesting - we often do that "by hand" :icon14:
I hope that doesn't mean something rude ::)
If it does, be aware of the danger: you could go blind :bgrin:
Cod-Father

Gunther

Habran,

Quote from: habran on November 25, 2014, 07:21:40 PM
That means that we have now best assembler on this planet :biggrin:

I would say nearly. The DOS version is a bit out of date. I'm not sure about the other platforms (Linux, BSD, OS/2).

Gunther
You have to know the facts before you can distort them.

habran

You are just being too polite, my friend :biggrin:
Cod-Father

Gunther

Quote from: habran on November 25, 2014, 07:40:04 PM
you are just being to polite my friend :biggrin:

No, it's just a summary of the current state, no more and no less.

Gunther
You have to know the facts before you can distort them.

ToutEnMasm


Quote
That means that we have now best assembler on this planet
It needs some improvements. I get a "general failure" with the version built from this source code, on unchanged code that was fine before the change of compiler. The first 64-bit version you posted was better than the new one.

 
Fa is a musical note to play with CL

jj2007

Quote from: habran on November 25, 2014, 07:09:45 PM
000000013FB618F2 33 D2                xor         edx,edx 
000000013FB61903 48 89 54 24 28       mov         qword ptr [rsp+28h],rdx 

How long is the and qword ptr [rsp+x] encoding? I'm not yet in 64-bit, so I can't test it...

0040100F   │.  33C9           xor ecx, ecx
00401011   │.  894C24 10      mov [esp+10], ecx
00401015   │.  836424 10 00   and dword ptr [esp+10], 00000000


Another option would be to use the shorter rbp encodings instead of rsp:

0040101A   │.  894D 10        mov [ebp+10], ecx
0040101D   │.  8365 10 00     and dword ptr [ebp+10], 00000000
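
On the length question: going by the x86-64 encoding rules, the qword form should be the 32-bit encoding plus a REX.W prefix (48h), which gives 6 bytes for `and qword ptr [rsp+x], 0` with an 8-bit displacement and 5 bytes with rbp as the base. The sketch below builds the bytes by hand to show where each one comes from. Nothing here was assembled or measured; the function name is hypothetical, and only RSP/RBP bases with an 8-bit displacement are covered.

```c
#include <stddef.h>
#include <stdint.h>

/* Register numbers as used in the ModRM r/m field. */
enum base_reg { BASE_RSP = 4, BASE_RBP = 5 };

/*
 * Encode  and [base+disp8], 0  using opcode 83 /4 ib.
 * wide != 0 adds the REX.W prefix (qword operand).
 * Writes at most 6 bytes to out and returns the count.
 */
static size_t encode_and_zero(enum base_reg base, int8_t disp,
                              int wide, uint8_t out[6])
{
    size_t n = 0;

    if (wide)
        out[n++] = 0x48;                            /* REX.W */
    out[n++] = 0x83;                                /* group 1, imm8 */
    out[n++] = (uint8_t)(0x40 | (4 << 3) | base);   /* mod=01, reg=/4 (AND), rm=base */
    if (base == BASE_RSP)
        out[n++] = 0x24;                            /* SIB byte, required for an rsp base */
    out[n++] = (uint8_t)disp;                       /* 8-bit displacement */
    out[n++] = 0x00;                                /* immediate 0 */
    return n;
}
```

With `wide` set to 0 the function reproduces the two 32-bit encodings shown above (83 64 24 10 00 and 83 65 10 00), which is a quick sanity check on the byte layout.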

Biterider


habran

Biterider, do you have the same problem as ToutEnMasm?
Cod-Father