The MASM Forum

64 bit assembler => UASM Assembler Development => Topic started by: KradMoonRa on March 30, 2018, 04:03:07 AM

Title: How to pass an integer input parameter to shufps
Post by: KradMoonRa on March 30, 2018, 04:03:07 AM
Hi,

I'm stuck on this one.

How can I pass an input parameter as the imm8 operand of the shufps instruction?


_TEXT segment
align 16
uXm_xmm_shuffle_ps proto UX_VECCALL (xmmword) ;InXmm_A:xmmword, InXmm_B:xmmword, _Imm8:dword

align 16
uXm_xmm_shuffle_ps proc UX_VECCALL (xmmword) frame ;InXmm_A:xmmword, InXmm_B:xmmword, _Imm8:dword

local _Imm8:dword
mov _Imm8, eparam1 ;64bits-ecx/edi,32bits-ecx
shufps xmm0, xmm1, _Imm8 ;Invalid operation with shufps

ret
uXm_xmm_shuffle_ps endp
_TEXT ends
Title: Re: How to pass an integer input parameter to shufps
Post by: Siekmanski on March 30, 2018, 04:42:38 AM
It has to be an immediate 8-bit value (imm8).
You cannot use locals, globals or registers.

Shuffle MACRO V0,V1,V2,V3
    EXITM %((V0 shl 6) or (V1 shl 4) or (V2 shl 2) or (V3))
ENDM

With a macro:
    shufps  xmm0,xmm0,Shuffle(1,3,2,0)
or direct:
    shufps  xmm0,xmm0,01111000b
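
For readers coming from the C side, the same bit packing can be sketched as a small function. This is only an illustration of the layout used by the Shuffle macro above; the name shuffle_imm8 is made up here. Since it computes a runtime value, it documents the imm8 layout but cannot feed shufps directly, which still needs the result as an assemble-time constant.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the Shuffle macro: packs four 2-bit
   element selectors into the imm8 layout that shufps expects.
   v0 lands in bits 7:6, v3 in bits 1:0. */
static uint8_t shuffle_imm8(unsigned v0, unsigned v1, unsigned v2, unsigned v3)
{
    return (uint8_t)(((v0 & 3u) << 6) | ((v1 & 3u) << 4) |
                     ((v2 & 3u) << 2) |  (v3 & 3u));
}
```

For example, shuffle_imm8(1, 3, 2, 0) gives 01111000b, the same constant as the direct form above.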
Title: Re: How to pass an integer input parameter to shufps
Post by: jj2007 on March 30, 2018, 05:26:59 AM
Quote from: Siekmanski on March 30, 2018, 04:42:38 AM
It has to be an immediate 8-bit value (imm8).
You cannot use locals, globals or registers.

Nothing is impossible in assembler 8)

include \masm32\MasmBasic\MasmBasic.inc         ; download (http://masm32.com/board/index.php?topic=94.0)
.data
src OWORD 11223344556677889900AABBCCDDEEFFh

  Init
  movups xmm1, src
  mov ecx, 8C200h

  mov cl, 00011011b             ; the "immediate" parameter in a register ;-)

  push ecx
  mov eax, 0C1700f66h
  push eax
  call esp
  deb 4, "shuffled!!", x:xmm1, x:xmm0
EndOfCode


shuffled!
x:xmm1          11223344 55667788 9900AABB CCDDEEFF
x:xmm0          CCDDEEFF 9900AABB 55667788 11223344
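
What the two pushes leave on the stack can be reproduced in C without executing anything. The sketch below only rebuilds the eight bytes that "call esp" then runs; build_pshufd_stub and store_le32 are hypothetical names. The bytes decode as pshufd xmm0, xmm1, imm8 (66 0F 70 C1 ib) followed by ret 8 (C2 08 00), where the ret 8 also discards the two pushed dwords.

```c
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order, as push does on x86. */
static void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Rebuild the 8 bytes that the two pushes place at [esp]. */
static void build_pshufd_stub(uint8_t imm8, uint8_t out[8])
{
    uint32_t ecx = 0x0008C200u | imm8;  /* mov ecx, 8C200h / mov cl, imm8 */
    uint32_t eax = 0xC1700F66u;         /* mov eax, 0C1700F66h            */
    store_le32(out + 4, ecx);           /* first push: higher address     */
    store_le32(out, eax);               /* second push: where esp points  */
}
```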
Title: Re: How to pass an integer input parameter to shufps
Post by: daydreamer on March 30, 2018, 05:42:49 AM
I think it's impossible now with DEP to change an immediate value;
you get a gpf if you try.
Check PSHUFB instead: it takes its values from an xmm reg and can do the job,
but it's an SSSE3 instruction.
PSHUFB xmm1, xmm2/m128 ;second operand controls how it's shuffled
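
How a control register could stand in for the immediate can be sketched with a plain C model of PSHUFB's byte-select rule, plus a helper that expands a shufps-style imm8 into a 16-byte control mask. Both function names are hypothetical. Note that PSHUFB reorders the bytes of a single source register, so this only matches shufps when both shufps operands are the same register.

```c
#include <stdint.h>

/* Model of PSHUFB: each control byte picks a source byte by its low
   4 bits, or yields zero when its top bit is set. */
static void pshufb_model(uint8_t dst[16], const uint8_t src[16],
                         const uint8_t ctrl[16])
{
    uint8_t tmp[16];
    for (int i = 0; i < 16; i++)
        tmp[i] = (ctrl[i] & 0x80) ? 0 : src[ctrl[i] & 0x0F];
    for (int i = 0; i < 16; i++)
        dst[i] = tmp[i];
}

/* Expand a shufps-style imm8 (four 2-bit dword selectors) into the
   equivalent 16-byte PSHUFB control mask. */
static void imm8_to_pshufb_ctrl(uint8_t imm8, uint8_t ctrl[16])
{
    for (int lane = 0; lane < 4; lane++) {
        unsigned sel = (imm8 >> (2 * lane)) & 3u;
        for (int b = 0; b < 4; b++)
            ctrl[4 * lane + b] = (uint8_t)(4 * sel + b);
    }
}
```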
Title: Re: How to pass an integer input parameter to shufps
Post by: jj2007 on March 30, 2018, 06:07:18 AM
Quote from: daydreamer on March 30, 2018, 05:42:49 AM
you get a gpf if you try

I tried hard on Win7-64 and Win10, no gpf :(
Title: Re: How to pass an integer input parameter to shufps
Post by: Siekmanski on March 30, 2018, 07:07:48 AM
Quote
Nothing is impossible in assembler 8)

Hi JJ, can you write me a routine to increase my bank account?  :biggrin:

Another way is to create an executable data section and write the imm8 value straight into the bytes of the encoded shufps instruction.
Title: Re: How to pass an integer input parameter to shufps
Post by: jj2007 on March 30, 2018, 02:02:22 PM
Quote from: Siekmanski on March 30, 2018, 07:07:48 AM
Hi JJ, can you write me a routine to increase my bank account?  :biggrin:

Not my league, Marinus, sorry. But rumours say that at the end of the Cold War, many good programmers lost their jobs in the military-industrial complex, and found better paid ones at Wall Street :icon_cool:

Quote
Another way is to create an executable data section and write the imm8 value straight into the bytes of the encoded shufps instruction.

But that is exactly what my code does... only that I used pshufd instead of shufps:
  movups xmm1, src
  mov ecx, 8C200h
  mov cl, 00011011b ; the "immediate" parameter in a register ;-)
  push ecx
  push 0C1700f66h
  call esp


P.S.: The x64 equivalent - and that one stumbles indeed over DEP (but there is a working solution in the attachment):
  mov rax, 0C1700F660008C200h
  mov al, 00011011b ; the "immediate" parameter in a register ;-)
  rol rax, 32
  push rax
  call rsp
Title: Re: How to pass an integer input parameter to shufps
Post by: Siekmanski on March 30, 2018, 10:18:28 PM
Hi Jochen,
OK, but I meant only 1 byte memory write. (SMC)  8)

Hi KradMoonRa,
Why the need for a "shufps" procedure with 3 inputs (2 xmm regs, 1 imm8)?
Is there a reason you need to have a variable imm8 in memory?
Title: Re: How to pass an integer input parameter to shufps
Post by: jj2007 on March 30, 2018, 10:51:50 PM
Quote from: Siekmanski on March 30, 2018, 10:18:28 PM
I meant only 1 byte memory write. (SMC)

Did you manage to do that in 64-bit land? My solution works, but it is admittedly a bit clumsy.
Title: Re: How to pass an integer input parameter to shufps
Post by: Siekmanski on March 30, 2018, 11:02:30 PM
I haven't tried it with masm 64 bit.
The only 64 bit experience I have is setting Masm64 up to use it with RadASM and run some of Hutch's examples...
Title: Re: How to pass an integer input parameter to shufps
Post by: KradMoonRa on March 31, 2018, 07:27:14 AM
Hi @Siekmanski @jj2007 @daydreamer,

Awesome, thank you.

@Siekmanski
Quote
Why the need for a "shufps" procedure with 3 inputs (2 xmm regs, 1 imm8)?
Is there a reason you need to have a variable imm8 in memory?

I'm recreating the SSE functions in asm and exporting them to the cc language, to make them available with the intrinsic struct in the library.
After some hours of trying to figure out how to pass the register value as a constant imm8 (and I had not figured out that it is a byte value), my best bet at the time was to store it in 32-bit memory.

With your recommendations, I managed to do something like this.



;xmm4shuffle(1:3<<6,1:3<<4,1:3<<2,1:3)
xmm4shuffle0000 equ 0
xmm4shuffle0001 equ 1
xmm4shuffle0002 equ 2
xmm4shuffle0003 equ 3
xmm4shuffle0010 equ 4
xmm4shuffle0011 equ 5
........
........
........
xmm4shuffle3333 equ 255


uXm_xmm4shuffled_ps macro reg0, reg1, reg2
.switch reg2
.case xmm4shuffle0000
shufps reg0, reg1, xmm4shuffle0000
.break
.case xmm4shuffle0001
shufps reg0, reg1, xmm4shuffle0001
.break
.case xmm4shuffle0002
shufps reg0, reg1, xmm4shuffle0002
.break
.case xmm4shuffle0003
shufps reg0, reg1, xmm4shuffle0003
.break
.case xmm4shuffle0010
shufps reg0, reg1, xmm4shuffle0010
.break
.case xmm4shuffle0011
shufps reg0, reg1, xmm4shuffle0011
.break
..........
..........
..........
.case xmm4shuffle3333
shufps reg0, reg1, xmm4shuffle3333
.break
.endswitch
endm

_TEXT segment
align 16
uXm_xmm_shuffle_ps proto UX_VECCALL (xmmword) ;InXmm_A:xmmword, InXmm_B:xmmword, _Imm8:dword

align 16
uXm_xmm_shuffle_ps proc UX_VECCALL (xmmword) frame ;InXmm_A:xmmword, InXmm_B:xmmword, _Imm8:dword

uXm_xmm4shuffled_ps xmm0, xmm1, rparam3

ret
uXm_xmm_shuffle_ps endp
_TEXT ends


It's working, but I can probably do better?

The library is in my signature.
Title: Re: How to pass an integer input parameter to shufps
Post by: KradMoonRa on March 31, 2018, 07:29:19 AM
Going to sync the library to GitHub. In the meantime it's available with the latest changes.
Title: Re: How to pass an integer input parameter to shufps
Post by: Siekmanski on March 31, 2018, 03:10:52 PM
I'm not familiar with your coding language, but maybe something like this is possible.

#define uXm_XMM_SHUFFLE(V0,V1,V2,V3) (((V0) << 6) | ((V1) << 4) | ((V2) << 2) | ((V3)))

uXm_xmm_shufps macro reg0, reg1, sp0, sp1, sp2, sp3

    shufps  reg0,reg1,uXm_XMM_SHUFFLE(sp0,sp1,sp2,sp3)

endm


Or you could try to use a jump table with all the 256 imm8 entries instead of the switch/case approach.
It will be a lot faster.
Title: Re: How to pass an integer input parameter to shufps
Post by: habran on March 31, 2018, 05:53:03 PM
Hi KradMoonRa :biggrin:

It is not necessary to use .break in the switch block; UASM takes care of that 8)
You can just write it like this:

.switch reg2
.case xmm4shuffle0000
shufps reg0, reg1, xmm4shuffle0000
.case xmm4shuffle0001
shufps reg0, reg1, xmm4shuffle0001
.case xmm4shuffle0002
shufps reg0, reg1, xmm4shuffle0002
.case xmm4shuffle0003
shufps reg0, reg1, xmm4shuffle0003
.case xmm4shuffle0010
shufps reg0, reg1, xmm4shuffle0010
.case xmm4shuffle0011
shufps reg0, reg1, xmm4shuffle0011
..........
..........
..........
.case xmm4shuffle3333
shufps reg0, reg1, xmm4shuffle3333
.endswitch
Title: Re: How to pass an integer input parameter to shufps
Post by: KradMoonRa on April 01, 2018, 02:07:53 AM
Hi,

I came up with this after some hours of looking at the object file's debug source. It produces better results, but I'm thinking about a .for .endf solution, which would probably produce less and faster code. I'm researching .for.


uXm_xmm4shuffled2_ps macro reg0, reg1, reg2

.if((bparam1 >= 240) && (bparam1 <= 255))
jmp xmm4jelabel_16x16
.endif

.if((bparam1 >= 224) && (bparam1 <= 239))
jmp xmm4jelabel_15x16
.endif
........
........
........
........
.if((bparam1 >= 0) && (bparam1 <= 15))
jmp xmm4jelabel_1x16
.endif

xmm4jelabel_1x16:
.if(reg2 == xmm4shuffle0000)
shufps reg0, reg1, xmm4shuffle0000
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0001)
shufps reg0, reg1, xmm4shuffle0001
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0002)
shufps reg0, reg1, xmm4shuffle0002
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0003)
shufps reg0, reg1, xmm4shuffle0003
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0010)
shufps reg0, reg1, xmm4shuffle0010
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0011)
shufps reg0, reg1, xmm4shuffle0011
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0012)
shufps reg0, reg1, xmm4shuffle0012
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0013)
shufps reg0, reg1, xmm4shuffle0013
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0020)
shufps reg0, reg1, xmm4shuffle0020
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0021)
shufps reg0, reg1, xmm4shuffle0021
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0022)
shufps reg0, reg1, xmm4shuffle0022
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0023)
shufps reg0, reg1, xmm4shuffle0023
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0030)
shufps reg0, reg1, xmm4shuffle0030
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0031)
shufps reg0, reg1, xmm4shuffle0031
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0032)
shufps reg0, reg1, xmm4shuffle0032
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle0033)
shufps reg0, reg1, xmm4shuffle0033
jmp xmm4shuffle_END
.endif
........
........
........
xmm4jelabel_16x16:
.if(reg2 == xmm4shuffle3300)
shufps reg0, reg1, xmm4shuffle3300
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3301)
shufps reg0, reg1, xmm4shuffle3301
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3302)
shufps reg0, reg1, xmm4shuffle3302
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3303)
shufps reg0, reg1, xmm4shuffle3303
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3310)
shufps reg0, reg1, xmm4shuffle3310
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3311)
shufps reg0, reg1, xmm4shuffle3311
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3312)
shufps reg0, reg1, xmm4shuffle3312
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3313)
shufps reg0, reg1, xmm4shuffle3313
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3320)
shufps reg0, reg1, xmm4shuffle3320
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3321)
shufps reg0, reg1, xmm4shuffle3321
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3322)
shufps reg0, reg1, xmm4shuffle3322
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3323)
shufps reg0, reg1, xmm4shuffle3323
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3330)
shufps reg0, reg1, xmm4shuffle3330
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3331)
shufps reg0, reg1, xmm4shuffle3331
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3332)
shufps reg0, reg1, xmm4shuffle3332
jmp xmm4shuffle_END
.endif
.if(reg2 == xmm4shuffle3333)
shufps reg0, reg1, xmm4shuffle3333
jmp xmm4shuffle_END
.endif
xmm4shuffle_END:
endm


_TEXT segment
align 16
uXm_xmm_shuffle_ps proto UX_VECCALL (xmmword) ;InXmm_A:xmmword, InXmm_B:xmmword, _Imm8:byte

align 16
uXm_xmm_shuffle_ps proc UX_VECCALL (xmmword) frame ;InXmm_A:xmmword, InXmm_B:xmmword, _Imm8:byte

uXm_xmm4shuffled2_ps  xmm0, xmm1, rparam3

ret
uXm_xmm_shuffle_ps endp
_TEXT ends

Title: Re: How to pass an integer input parameter to shufps
Post by: Siekmanski on April 01, 2018, 02:42:30 AM
You can get rid of all the compare instructions by using a table of 256 offsets and jumping directly to the code.

.data

Imm8Jump    dd offset Imm8_0,offset Imm8_1,offset Imm8_2,........... Imm8_253,offset Imm8_254,offset Imm8_255

.code

    movzx eax,reg2
    jmp         [Imm8Jump+eax*4]


Imm8_0:
    shufps reg0,reg1,0
    ret
Imm8_1:
    shufps reg0,reg1,1
    ret
Imm8_2:
    shufps reg0,reg1,2
    ret

-------
-------
-------

Imm8_253:
    shufps reg0,reg1,253
    ret
Imm8_254:
    shufps reg0,reg1,254
    ret
Imm8_255:
    shufps reg0,reg1,255
    ret
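
To check that each of the 256 stubs really does what its index says, a plain C model of shufps can serve as a reference. This is a sketch of the instruction's documented selection rule, not code from the thread; shufps_model is a hypothetical name.

```c
/* Reference model of "shufps dst, src, imm8": the two low result
   elements are selected from dst, the two high ones from src. */
static void shufps_model(float dst[4], const float src[4], unsigned imm8)
{
    float r[4];
    r[0] = dst[imm8 & 3u];
    r[1] = dst[(imm8 >> 2) & 3u];
    r[2] = src[(imm8 >> 4) & 3u];
    r[3] = src[(imm8 >> 6) & 3u];
    for (int i = 0; i < 4; i++)
        dst[i] = r[i];
}
```

Calling the model and a stub with the same inputs for every imm8 from 0 to 255 and comparing the results would be one way to test the table.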

Title: Re: How to pass an integer input parameter to shufps
Post by: KradMoonRa on April 02, 2018, 03:00:23 AM
Hi @Siekmanski  :biggrin:

Thank you, after some good hours I managed to get it compiled.
I tried to put the label data outside of the code, but uasm complains about unknown label and offset symbols.
I tried to replace the ret with a jump to an end label, but then the object header is full of jmp errors.
Some errors in the object header still persist about decisions in the .text code, but it compiles and I think it's no big deal.
It took some work to port the relocatable addresses to 64 bits, but it's working.

And finally:  :greenclp:

_TEXT segment
align 16
uXm_xmm_shuffle_ps proto UX_VECCALL (xmmword) ;InXmm_A:xmmword, InXmm_B:xmmword, _Imm8:dword

align 16
uXm_xmm_shuffle_ps proc UX_VECCALL (xmmword) ;InXmm_A:xmmword, InXmm_B:xmmword, _Imm8:dword

ifndef __X64__
xor ecx, ecx
xor eax, eax
movzx ecx, byte ptr [dparam3]
lea eax, [shpsjmptable]
mov ecx, [eax+ecx*4]
mov eax, ecx
jmp eax
else
xor rcx, rcx
xor rax, rax
movzx rcx, byte ptr [rparam3]
lea rax, [shpsjmptable]
mov rcx, [rax+rcx*8] ; table entries are qwords (dq) here: scale by 8 and load all 64 bits
mov rax, rcx
jmp rax
endif

ifndef __X64__
shpsword textequ <dword>
shpsiword textequ <dd>
else
shpsword textequ <qword>
shpsiword textequ <dq>
endif

;uasm complains that the line is too long, and the table only works when declared here
shpsjmptable label shpsword
shpsiword offset shps_0, .....
shpsiword offset shps_51, .....
shpsiword offset shps_101, .....
shpsiword offset shps_151, .....
shpsiword offset shps_201, .....
shpsiword offset shps_251, ....

shps_0 label shpsword
shufps xmm0, xmm1, 0
ret
shps_1 label shpsword
shufps xmm0, xmm1, 1
ret
.................
.................
.................
.................
.................
.................
shps_254 label shpsword
shufps xmm0, xmm1, 254
ret
shps_255 label shpsword
shufps xmm0, xmm1, 255
ret

uXm_xmm_shuffle_ps endp
_TEXT ends

Title: Re: How to pass an integer input parameter to shufps
Post by: Siekmanski on April 02, 2018, 03:56:07 AM
 :t
Cool that it works, but do you really need 7 instructions for the jump table execution?
In Masm it works with only 2 instructions.
Both xor instructions are useless.
Title: Re: How to pass an integer input parameter to shufps
Post by: jj2007 on April 02, 2018, 04:05:56 AM
Quote from: Siekmanski on April 02, 2018, 03:56:07 AM
In Masm it works with only 2 instructions.

One should be sufficient: In 64-bit land, the first 4 args are passed in registers 8)
Title: Re: How to pass an integer input parameter to shufps
Post by: KradMoonRa on April 10, 2018, 03:18:02 AM
Hi,

Reading some specs in the Intel manuals, I found it interesting that some opcodes can be pushed and used.

If someone knows: I think I'm near something, can it be done?

header file:

extern "CC" {
extern __uXm128 uXm_mm_shuffle_ps(__uXm128 InXmm_A, __uXm128 InXmm_B, unsigned int _Imm8);
}


asm file:

ifndef __X64__
   dparam3 textequ <esp+16*2+4>
else
ifdef WINDOWS
   dparam3 textequ <r8d>
else
   dparam3 textequ <ecx>
endif
endif

align 16
uXm_mm_shuffle_ps proc UX_VECCALL (xmmword) ;InXmm_A:xmmword, InXmm_B:xmmword, _Imm8:dword

ifndef __X64__
push ebp
mov ebp, esp
sub esp, 16*2+4 ; allocate space on stack
movups [ebp-16], xmm0 ; push xmm param 1
movups [ebp-16*2], xmm1 ; push xmm param 2
mov [ebp-16*2+4], dparam3 ; push param 3
db 0fh, 0c6h, 3h ; shufps imm encoding
mov dparam3, [ebp-16*2+4] ; pop param 3
movups xmm1, [ebp-16*2] ; pop xmm param 2
movups xmm0, [ebp-16] ; pop xmm param 1
add esp, 16*2+4 ; deallocate space on stack
mov esp, ebp
pop ebp
else
push rbp
mov rbp, rsp
sub rsp, 16*2+4 ; allocate space on stack
movups [rbp-16], xmm0 ; push xmm param 1
movups [rbp-16*2], xmm1 ; push xmm param 2
mov [rbp-16*2+4], dparam3 ; push param 3
db 0fh, 0c6h, 3h ; shufps imm encoding
mov dparam3, [rbp-16*2+4] ; pop param 3
movups xmm1, [rbp-16*2] ; pop xmm param 2
movups xmm0, [rbp-16] ; pop xmm param 1
add rsp, 16*2+4 ; deallocate space on stack
mov rsp, rbp
pop rbp
endif

ret
uXm_mm_shuffle_ps endp


EDIT: This produces unexpected results. It can't be done like this. The db bytes always assemble to the same fixed opcode, so the instruction still needs a fixed imm.
Title: Re: How to pass an integer input parameter to shufps
Post by: Siekmanski on April 10, 2018, 05:00:36 AM
I'm not exactly sure what your goal is?
Maybe I'm missing something?
Using so many instructions to execute 1 shufps instruction really slows down the code execution a lot....
Why not set the imm8 with a macro construction?

uXm_xmm_shufps macro reg0, reg1, _imm8

    shufps  reg0,reg1,_imm8

endm
Title: Re: How to pass an integer input parameter to shufps
Post by: KradMoonRa on April 10, 2018, 05:36:07 AM
Hi Siekmanski,

Actually, I have done it:

cc header file:

/*******************************************************/
/* MACRO for use uXm_mm_shuffle_****_ps(). */
/* Argument fp3 is a digit[0123] that represents the fp*/
/* from argument "b" of uXm_mm_shuffle_****_ps that will be     */
/* placed in fp3 of result. fp2 is the same for fp2 in */
/* result. fp1 is a digit[0123] that represents the fp */
/* from argument "a" of uXm_mm_shuffle_****_ps that will be     */
/* placed in fp1 of result. fp0 is the same for fp0 of */
/* result                                              */
/* const __uXm128 temp = uXm_MM_SHUFFLE_IMR_PS(InXmm_A, InXmm_B, 0, 1, 2, 3); */
/*******************************************************/
#define uXm_MM_SHUFFLE_IM_PS(VA,VB,fp3,fp2,fp1,fp0) uXm_mm_shuffle_##fp3##fp2##fp1##fp0##_ps(VA,VB)
#define uXm_MM_SHUFFLE_IMR_PS(VA,VB,fp0,fp1,fp2,fp3) uXm_mm_shuffle_##fp3##fp2##fp1##fp0##_ps(VA,VB)


asm file:

align 16
uXm_mm_shuffle_0000_ps proc UX_VECCALL (xmmword) ;InXmm_A:xmmword, InXmm_B:xmmword
shufps xmm0, xmm1, 0
ret
uXm_mm_shuffle_0000_ps endp

align 16
uXm_mm_shuffle_0001_ps proc UX_VECCALL (xmmword) ;InXmm_A:xmmword, InXmm_B:xmmword
shufps xmm0, xmm1, 1
ret
uXm_mm_shuffle_0001_ps endp

align 16
uXm_mm_shuffle_0002_ps proc UX_VECCALL (xmmword) ;InXmm_A:xmmword, InXmm_B:xmmword
shufps xmm0, xmm1, 2
ret
uXm_mm_shuffle_0002_ps endp

align 16
uXm_mm_shuffle_0003_ps proc UX_VECCALL (xmmword) ;InXmm_A:xmmword, InXmm_B:xmmword
shufps xmm0, xmm1, 3
ret
uXm_mm_shuffle_0003_ps endp

align 16
uXm_mm_shuffle_0010_ps proc UX_VECCALL (xmmword) ;InXmm_A:xmmword, InXmm_B:xmmword
shufps xmm0, xmm1, 4
ret
uXm_mm_shuffle_0010_ps endp
.............
.............


But I'm really searching for how to take the imm from the C function and use it as the imm in asm; I know I can't mix the two things. The imm is part of the instruction's opcode, but how do I really select it from the C function's imm in a simple, clear manner?

I have to do the same for the m128 bit shifting as well; the count is also an imm (opcode).


/*******************************************************/
/* MACRO for use uXm_mm_slli_si128_*(). */
/* result                                              */
/* const __uXm128i temp = uXm_MM_SLLI_SI128_IM(InXmm_A, 3); */
/*******************************************************/
#define uXm_MM_SLLI_SI128_IM(VA,IMM) uXm_mm_slli_si128_##IMM(VA)



align 16
uXm_mm_slli_si128_0 proc UX_VECCALL (xmmword) ;Inxmm_A:xmmword
pslldq xmm0, 0
ret
uXm_mm_slli_si128_0 endp


The macros expand to the named function declared in the header as extern.

I'm excited about new and different programming approaches.

I'd love to know how to pass the value from the C function as an opcode to shufps/pslldq/psllq... etc.
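
The name-pasting idea can be tried in isolation with a few lines of C. In this sketch the two functions are stand-ins that simply return their imm8 (in the library they are separate assembly procedures), and all names are shortened, hypothetical versions of the ones above. It shows that the selector digits must be literal tokens, because ## works at preprocessing time and never sees runtime values.

```c
/* Stand-ins for two of the per-immediate routines; here they just
   return the imm8 that the real assembly procedure would hard-code. */
static unsigned shuffle_0010_ps(void) { return 4u; }
static unsigned shuffle_3333_ps(void) { return 255u; }

/* ## pastes the four selector digits into the function name at
   preprocessing time, so the "immediate" is chosen by name. */
#define SHUFFLE_IM_PS(fp3, fp2, fp1, fp0) shuffle_##fp3##fp2##fp1##fp0##_ps()
```

SHUFFLE_IM_PS(0, 0, 1, 0) expands to shuffle_0010_ps(); passing a variable instead of a digit would paste the variable's name, not its value.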

Title: Re: How to pass an integer input parameter to shufps
Post by: Siekmanski on April 10, 2018, 06:06:08 AM
Sadly, I don't speak CC language.
Does it have inline ASM support inside a MACRO construction?
Because IMHO it is a waste to use a function for 1 instruction.
Title: Re: How to pass an integer input parameter to shufps
Post by: KradMoonRa on April 11, 2018, 04:02:01 AM
Unfortunately the VC compiler for 64 bits can't compile inline assembly.
Yes, one instruction is really compact and fast code. Being called from cc with a jmp to the code really hurts the speed. But, Murphy's law permitting, it can be maybe 40% faster than the cc function counterpart, or not, depending on how the function is used.
I believe the jmp is the best option for now, until I learn how to build machine code at runtime with the imm converted to a byte, something like this:


;0Fh, 0C6h:    shufps opcode
;0C1h:         ModRM byte: mod=11b (register-direct, not r/m memory), reg=xmm0, rm=xmm1
;0FFh:         imm8 255 "shuffle4(3,3,3,3)"
;0C3h:         retn

__shuffleps proc
; how to make this at runtime? It must look like this; only the imm8 byte (0FFh) has to be replaced by regparam3, something like regtohex(regparam3).
byte 0Fh, 0C6h, 0C1h, 0FFh, 0C3h
__shuffleps endp
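
Producing those bytes at runtime is the easy part and can be sketched in C. The function below only fills a buffer with the encoding of shufps xmm<dst>, xmm<src>, imm8 followed by ret; encode_shufps_ret is a hypothetical name, and registers xmm8 and above (which need a REX prefix) are deliberately left out. Actually running the buffer is a separate problem: it has to live in memory the OS marks executable (for example from VirtualAlloc with PAGE_EXECUTE_READWRITE on Windows), which is where DEP comes in.

```c
#include <stddef.h>
#include <stdint.h>

/* Emit "shufps xmm<dst>, xmm<src>, imm8" followed by "ret" into buf.
   Only xmm0..xmm7 are handled (no REX prefix). Returns the number of
   bytes written (5), or 0 on bad arguments. */
static size_t encode_shufps_ret(uint8_t *buf, size_t cap,
                                unsigned dst, unsigned src, uint8_t imm8)
{
    if (buf == NULL || cap < 5 || dst > 7 || src > 7)
        return 0;
    buf[0] = 0x0F;                                /* two-byte opcode escape */
    buf[1] = 0xC6;                                /* shufps                 */
    buf[2] = (uint8_t)(0xC0 | (dst << 3) | src);  /* ModRM with mod=11b     */
    buf[3] = imm8;                                /* the shuffle selector   */
    buf[4] = 0xC3;                                /* ret                    */
    return 5;
}
```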



Title: Re: How to pass an integer input parameter to shufps
Post by: habran on April 11, 2018, 05:01:52 AM
Hi KradMoonRa :biggrin:
The best way to learn assembly programming is to write functions in C and then look through the disassembly and try to shorten and/or speed it up.
Such altered disassembly can be used to create functions in assembly.
You can play with optimisation and see what difference the C compiler produces.
You can play with intrinsics as well.

Title: Re: How to pass an integer input parameter to shufps
Post by: KradMoonRa on April 11, 2018, 11:05:32 AM
Hi habran,

Yes, I have been decompiling the functions from c and cc; it's really fun to follow them and try to reproduce them in asm.

But I'm seeing that learning some of the hidden secrets of asm and machine code is the way to do it right.
Title: Re: How to pass an integer input parameter to shufps
Post by: KradMoonRa on April 24, 2018, 12:15:12 PM
Hi,  :biggrin:

Finally I got the jmp address working in 64 bits like the 32-bit version; it still takes 3 instructions to cope with the 64-bit linker's /LARGEADDRESSAWARE.


_uXm_m128_cvtelts_f32 proc UX_VECCALL (real4) ;InXmm_A:xmmword, InInt_BSel:dword

;.if(rparam2 > 3)
; ret
;.else

ifndef __X64__
movzx eax, byte ptr [rparam2]
;mov rbx, dword ptr [rbx+rparam2*4]
jmp dword ptr [m128cvteltsf32jmptable+eax*4]
else
;movzx rax, byte ptr [rparam2]
lea rax, qword ptr [m128cvteltsf32jmptable] ; rax is volatile; rbx would have to be preserved for the caller
mov rax, qword ptr [rax+rparam2*8]
jmp rax
endif

ifndef __X64__
m128cvteltsf32word textequ <dword>
m128cvteltsf32iword textequ <dd>
else
m128cvteltsf32word textequ <qword>
m128cvteltsf32iword textequ <dq>
endif

m128cvteltsf32_0 label m128cvteltsf32word
movss xmm0, xmm0
ret
m128cvteltsf32_1 label m128cvteltsf32word
shufps xmm0, xmm0, _uXm_mm_shuffler4(1,1,1,1)
movss xmm0, xmm0
ret
m128cvteltsf32_2 label m128cvteltsf32word
shufps xmm0, xmm0, _uXm_mm_shuffler4(2,2,2,2)
movss xmm0, xmm0
ret
m128cvteltsf32_3 label m128cvteltsf32word
shufps xmm0, xmm0, _uXm_mm_shuffler4(3,3,3,3)
movss xmm0, xmm0
ret
;.endif

_uXm_m128_cvtelts_f32 endp


All the code is attached.
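
The same table-dispatch pattern, seen from the C side, can be sketched with an array of function pointers. This is only an analogy to the four-label jump table above, with hypothetical names; returning 0 for an out-of-range selector is an assumption made here, in the spirit of the commented-out .if check.

```c
/* One small routine per lane, like the four labels in the jump table. */
static float lane0(const float v[4]) { return v[0]; }
static float lane1(const float v[4]) { return v[1]; }
static float lane2(const float v[4]) { return v[2]; }
static float lane3(const float v[4]) { return v[3]; }

typedef float (*lane_fn)(const float v[4]);

static const lane_fn lane_table[4] = { lane0, lane1, lane2, lane3 };

/* Index the table instead of comparing against each value.
   Assumption: selectors above 3 yield 0 rather than jumping anywhere. */
static float extract_lane(const float v[4], unsigned sel)
{
    if (sel > 3u)
        return 0.0f;
    return lane_table[sel](v);
}
```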