The MASM Forum

General => The Workshop => Topic started by: daydreamer on July 08, 2018, 05:18:25 PM

Title: is there ways to shuffle this data?
Post by: daydreamer on July 08, 2018, 05:18:25 PM
Hi
Is there a way to shuffle this data, or should a SIMD mask be used instead?
First I am going to shuffle x into all four floats (symbolized by 3.0); after that I need to mix 1.0's and x's.

            ;pattern for mulps, to get x^3,x^5,x^7,x^9 (x=3.0 here)
x3x5x7x9    real4 3.0,3.0,3.0,3.0 ;x*x*x = mulps,mulps = x^3
            real4 1.0,3.0,3.0,3.0
            real4 1.0,1.0,3.0,3.0
            real4 1.0,1.0,1.0,3.0


Title: Re: is there ways to shuffle this data?
Post by: Siekmanski on July 09, 2018, 07:05:35 AM
Hi Magnus,

It's not clear what needs to be shuffled in what order.
Do you have an example from start positions to end positions?
And which need to be multiplied by 3 5 7 or 9?
Title: Re: is there ways to shuffle this data?
Post by: daydreamer on July 09, 2018, 08:57:17 PM
Sorry, I shortened the comment too much. It should be x^3,x^5,x^7,x^9. The first step is to shuffle one x into 4 x's, then mulps until I have x^3,x^3,x^3,x^3 in one xmm reg, then change the multiplier to 1.0,x,x,x and keep multiplying.
What is important is to get the multiplier pattern shown in the data section posted above.
I thought of maybe using movups on a .data section holding 1.0,1.0,1.0,1.0,x,x,x,x, but would that be a slow alternative?
Or an SSE2 shift?
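
The shift idea can be tried out in plain C before committing to instructions. This is only a sketch: the function name `shift_in_one` is made up for illustration, and the four-element arrays stand in for an xmm register with lane 0 at the lowest address, as in the data table above. The lane move presumably corresponds to a 4-byte `pslldq` and the OR to `orps`/`por`, but that mapping is an assumption, not tested assembly.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: shift the four 32-bit lanes up by one position
   (lane 0 becomes zero), then OR the bit pattern of 1.0f into lane 0.
   x,x,x,x becomes 1.0,x,x,x; applying it again gives 1.0,1.0,x,x. */
void shift_in_one(const float in[4], float out[4])
{
    uint32_t bits[4];
    uint32_t shifted[4];
    uint32_t one_bits;
    const float one = 1.0f;

    memcpy(bits, in, sizeof bits);            /* reinterpret floats as raw bits */
    memcpy(&one_bits, &one, sizeof one_bits); /* 0x3F800000 for IEEE-754 floats */

    shifted[0] = 0;                           /* vacated low lane */
    shifted[1] = bits[0];
    shifted[2] = bits[1];
    shifted[3] = bits[2];                     /* the old top lane falls off */

    shifted[0] |= one_bits;                   /* OR with 1.0,0,0,0 */

    memcpy(out, shifted, sizeof shifted);
}
```

Starting from x,x,x,x, three calls produce the remaining three rows of the table, so only the broadcast x would have to come from a shuffle.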
Title: Re: is there ways to shuffle this data?
Post by: Siekmanski on July 09, 2018, 10:20:09 PM
Something like this?


.const
Multipliers real4 8.0, 8.0, 8.0, 8.0 ; ^3

.data
YourData   ???????

.code
    movaps      xmm0,oword ptr Multipliers

    movaps      xmm1,oword ptr YourData
    pshufd      xmm2,xmm1,???? ; shuffle your data into place
    mulps       xmm2,xmm0      ; result

    pslld       xmm0,2  ; ^5   ( update multipliers from ^3 to ^5 )
    ; repeat steps with next set of data
Title: Re: is there ways to shuffle this data?
Post by: daydreamer on July 10, 2018, 03:25:08 AM
Many years ago I saw someone make a fast integer sqrt; maybe the opposite should be possible?
What about a 32-bit SHIFT combined with OR 1.0,0,0,0?
I almost never use shuffles.
This is what I have come up with so far; I am making a sine Taylor series.
Remember that x is in radians.

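.code
    start:

    lea ebx,fconstant
    add ebx,16              ; skip the first 16 bytes of fconstant
    lea edx,x3x5x7x9

    movaps xmm0,x           ; x in all four lanes
    movaps xmm7,[edx]       ; row 0: x,x,x,x
    movaps xmm6,[ebx]
    mulps xmm0,xmm7         ; x^2 in all 4 lanes
    mulps xmm0,xmm7         ; x^3 in all 4 lanes
    add edx,16              ; row 1: 1.0,x,x,x
    mulps xmm0,[edx]        ; x^4 in 3 lanes
    mulps xmm0,[edx]        ; x^5 in 3 lanes
    add edx,16              ; row 2: 1.0,1.0,x,x
    mulps xmm0,[edx]        ; x^6 in 2 lanes
    mulps xmm0,[edx]        ; x^7 in 2 lanes
    add edx,16              ; row 3: 1.0,1.0,1.0,x
    mulps xmm0,[edx]        ; x^8 in 1 lane
    mulps xmm0,[edx]        ; x^9 in 1 lane, xmm0 = x^3,x^5,x^7,x^9
    mulps xmm0,xmm6         ; times reciprocals of 3!,5!,7!,9! with the right - or + signs, to prepare for haddps
    ;haddps here
    ;haddps
    movss sinex,xmm0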
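
To check the arithmetic of this scheme without an assembler, the lanes can be modelled with plain C arrays. This is a sketch under assumptions: the name `sine_taylor9` is invented here, the coefficient row is assumed to hold the signed reciprocals of 3!, 5!, 7!, 9! as described in the comment above, and the leading x term of the series is added separately because the four lanes only carry x^3 through x^9.

```c
/* Lane-wise model of the multiply pattern; arrays stand in for xmm registers. */
float sine_taylor9(float x)
{
    /* Same layout as the x3x5x7x9 table, with x in place of 3.0. */
    const float rows[4][4] = {
        { x,    x,    x,    x },
        { 1.0f, x,    x,    x },
        { 1.0f, 1.0f, x,    x },
        { 1.0f, 1.0f, 1.0f, x },
    };
    /* Assumed coefficient row: -1/3!, +1/5!, -1/7!, +1/9!. */
    const float coeff[4] = {
        -1.0f / 6.0f, 1.0f / 120.0f, -1.0f / 5040.0f, 1.0f / 362880.0f
    };
    float acc[4] = { x, x, x, x };  /* x broadcast to all four lanes */
    float sum = x;                  /* first Taylor term, not held in any lane */
    int r, pass, lane;

    for (r = 0; r < 4; ++r)
        for (pass = 0; pass < 2; ++pass)       /* two mulps per row */
            for (lane = 0; lane < 4; ++lane)
                acc[lane] *= rows[r][lane];
    /* acc now holds x^3, x^5, x^7, x^9 */

    for (lane = 0; lane < 4; ++lane)           /* scale, then horizontal add */
        sum += acc[lane] * coeff[lane];
    return sum;
}
```

For example, `sine_taylor9(0.5f)` should agree with sin(0.5) to roughly single precision. The eight multiplies are spent on building powers; a Horner-style evaluation gets the same polynomial with fewer dependent steps, at the cost of not using the lanes this way.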


Title: Re: is there ways to shuffle this data?
Post by: Siekmanski on July 10, 2018, 04:43:14 AM
We did some trig testing routines a while back on the forum.
The Chebyshev/Remez approximation with a 9th-degree polynomial came out as the most accurate (depending on the number of coefficients, of course).
4 optimized constants give a maximum error of about 3.3381e-9 over -1/2 pi to +1/2 pi.


double fastsin2(double x)
{
    const double a3 = -1.666665709650470145824129400050267289858e-1;
    const double a5 = 8.333017291562218127986291618761571373087e-3;
    const double a7 = -1.980661520135080504411629636078917643846e-4;
    const double a9 = 2.600054767890361277123254766503271638682e-6;

    return x + x*x*x * (a3 + x*x * (a5 + x*x * (a7 + x*x * a9)));
}
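
The quoted error figure can be checked with a small harness. The sketch below is not from the original thread: `sin_reference` and `fastsin2_max_error` are hypothetical helpers, and the reference sine is a Taylor series up to x^21, which on this interval is accurate far beyond the error being measured.

```c
/* The polynomial from the post above, repeated so the sketch compiles alone. */
static double fastsin2(double x)
{
    const double a3 = -1.666665709650470145824129400050267289858e-1;
    const double a5 = 8.333017291562218127986291618761571373087e-3;
    const double a7 = -1.980661520135080504411629636078917643846e-4;
    const double a9 = 2.600054767890361277123254766503271638682e-6;

    return x + x*x*x * (a3 + x*x * (a5 + x*x * (a7 + x*x * a9)));
}

/* Reference sine: Taylor series through the x^21 term. */
static double sin_reference(double x)
{
    double term = x;
    double sum = x;
    int k;

    for (k = 1; k <= 10; ++k) {
        term *= -x * x / ((2.0 * k) * (2.0 * k + 1.0));
        sum += term;
    }
    return sum;
}

/* Largest absolute difference over steps+1 samples in [-pi/2, +pi/2]. */
double fastsin2_max_error(int steps)
{
    const double half_pi = 1.5707963267948966;
    double worst = 0.0;
    int i;

    for (i = 0; i <= steps; ++i) {
        double x = -half_pi + (2.0 * half_pi) * i / steps;
        double err = fastsin2(x) - sin_reference(x);
        if (err < 0.0)
            err = -err;
        if (err > worst)
            worst = err;
    }
    return worst;
}
```

Calling `fastsin2_max_error(100000)` samples the interval densely enough to see whether the result lands near the stated maximum.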


The thread linked below has a routine to calculate 4 real4 sines at once (it could be rewritten to do 2 sines and 2 cosines at once)
and a routine to calculate 2 real8 at once.

http://masm32.com/board/index.php?topic=4118.msg49276#msg49276
Title: Re: is there ways to shuffle this data?
Post by: daydreamer on July 11, 2018, 02:38:06 PM
Thanks for the link, Marinus.
Check out the asmc large integers and floats thread, about real16's:
http://masm32.com/board/index.php?topic=6454.15
Title: Re: is there ways to shuffle this data?
Post by: daydreamer on July 14, 2018, 03:38:19 AM
Wouldn't it be a good candidate for pi calculation, using 6*arcsin(0.5)?
With fixed point it would be a simple shift instruction to get powers of 0.5, 0.25, etc.
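
A rough sketch of that idea in C, assuming a 64-bit fixed-point format with 62 fractional bits (the format and the name `pi_from_arcsin_half` are choices made here for illustration). It uses the series arcsin(x) = sum of c_n * x^(2n+1) / (2n+1) with c_0 = 1 and c_(n+1) = c_n * (2n+1) / (2n+2); with x = 0.5 each step multiplies by x^2 = 0.25, which is the shift right by 2.

```c
#include <stdint.h>

/* pi = 6 * arcsin(0.5), summed in unsigned Q2.62 fixed point. */
double pi_from_arcsin_half(void)
{
    uint64_t t = (uint64_t)1 << 61;  /* c_0 * x^1 = 0.5 in Q2.62 */
    uint64_t sum = 0;
    uint64_t n;

    for (n = 0; t != 0; ++n) {
        sum += t / (2 * n + 1);               /* add c_n * x^(2n+1) / (2n+1) */
        t = t * (2 * n + 1) / (2 * n + 2);    /* c_n -> c_(n+1) */
        t >>= 2;                              /* times x^2 = 0.25 is a shift */
    }
    /* 6 * sum stays below 2^64 because sum is about (pi/6) * 2^62. */
    return (double)(6 * sum) / 4611686018427387904.0;  /* divide by 2^62 */
}
```

The loop ends after about 30 terms, once the term has been shifted down to zero. The two integer divisions per term are the expensive part; the power of 0.25 itself costs only the shift.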