ArcSin timings

Started by jj2007, December 21, 2020, 11:16:57 AM

jj2007

Two algos that calculate arcsin(x) in the range x = 0 ... 0.5. The first one, Arcsinus(), follows Raymond's tutorial; the second one, ArcSin, uses FastMath with a table of Arcsinus() values:

FastMath ArcSin        ; define a math function
  For_ fct=0.0 To 1.0 Step 0.0001
        fld fct
        fstp REAL10 ptr [edi]
        void Arcsinus(fct)
        fstp REAL10 ptr [edi+REAL10]
        add edi, 2*REAL10
  Next
FastMath
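
The table-building loop above can be sketched outside assembly. The following Python sketch is a rough illustration only: it samples arcsin at the same 0.0001 step over 0.0 ... 1.0 and then answers queries by linear interpolation between neighbouring samples. The names `TABLE` and `arcsin_lut` are made up for this example, and linear interpolation is an assumption; the thread does not say how FastMath interpolates internally.

```python
import math

N = 10000  # 1.0 / 0.0001 intervals, matching "Step 0.0001" in the loop above

# Slow path, run once: one precise arcsin value per sample point x = i / N.
TABLE = [math.asin(i / N) for i in range(N + 1)]

def arcsin_lut(x):
    """Approximate arcsin(x) in radians for 0 <= x <= 1 (linear interpolation assumed)."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be in 0 ... 1")
    pos = x * N                  # fractional index into the table
    i = min(int(pos), N - 1)     # left neighbour; clamp so i + 1 stays valid
    frac = pos - i               # distance from the left neighbour, 0 ... 1
    return TABLE[i] + frac * (TABLE[i + 1] - TABLE[i])

print(math.degrees(arcsin_lut(0.5)))  # close to 30 degrees
```

The slow function is paid for once while the table is built; each later call costs a multiplication, two table reads and a little arithmetic, which is the trade-off the timings in this thread measure. With linear interpolation, accuracy drops close to x = 1, where arcsin becomes steep.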


May I have some timings please?

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)

15296   cycles for 100 * Arcsinus
1907    cycles for 100 * ArcSin

15373   cycles for 100 * Arcsinus
1899    cycles for 100 * ArcSin

15238   cycles for 100 * Arcsinus
1912    cycles for 100 * ArcSin

15206   cycles for 100 * Arcsinus
1910    cycles for 100 * ArcSin

15219   cycles for 100 * Arcsinus
1905    cycles for 100 * ArcSin

58      bytes for Arcsinus
209     bytes for ArcSin

Real8   29.99999926061033761    Arcsinus
Real8   29.99999926061034117    ArcSin
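
To put the numbers above in proportion, here is a small Python sketch that takes the median of the five runs posted for the i5-2450M and derives cycles per call and the speed-up factor. The values are copied from the output above; the variable names are just for illustration.

```python
from decimal import Decimal
from statistics import median

# Cycles for 100 calls, copied from the i5-2450M results above
arcsinus_runs = [15296, 15373, 15238, 15206, 15219]
arcsin_runs = [1907, 1899, 1912, 1910, 1905]

slow = median(arcsinus_runs)   # 15238
fast = median(arcsin_runs)     # 1907
speedup = slow / fast

print(f"Arcsinus: {slow / 100:.1f} cycles per call")
print(f"ArcSin:   {fast / 100:.1f} cycles per call")
print(f"speed-up: {speedup:.2f}x")

# Difference between the two printed results, in degrees
diff = Decimal("29.99999926061034117") - Decimal("29.99999926061033761")
print(f"difference: {diff:.2E} degrees")
```

On these figures the table-based ArcSin is roughly eight times faster, and the two results agree to about 15 significant digits.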

Siekmanski

Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz (SSE4)

18643   cycles for 100 * Arcsinus
2256    cycles for 100 * ArcSin

18636   cycles for 100 * Arcsinus
2246    cycles for 100 * ArcSin

18650   cycles for 100 * Arcsinus
2261    cycles for 100 * ArcSin

18629   cycles for 100 * Arcsinus
2255    cycles for 100 * ArcSin

18640   cycles for 100 * Arcsinus
2261    cycles for 100 * ArcSin

58      bytes for Arcsinus
209     bytes for ArcSin

Real8   29.99999926061033761    Arcsinus
Real8   29.99999926061034117    ArcSin

--- ok ---
Creative coders use backward thinking techniques as a strategy.

jj2007

Thanks, Marinus :thup:

I wonder why my old i5 is a tick faster... doesn't make much sense :cool:

Core i5-2450M
Core i7-4930K

Siekmanski

Hi Jochen,
My system is clocked down to prevent noise, for live audio recordings.
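
One possible reading of the down-clocking remark, sketched in Python: if the cycle counts come from a counter that ticks at a fixed rate while the core itself runs slower, the same work shows up as more counted cycles. That is an assumption, since the thread does not say how the cycles are measured, and the frequencies below are hypothetical, not Siekmanski's actual settings.

```python
def counted_ticks(core_cycles, counter_hz, core_hz):
    """Ticks seen by a fixed-rate counter while the core spends core_cycles."""
    seconds = core_cycles / core_hz
    return seconds * counter_hz

# Hypothetical: 15000 core cycles of work, counter ticking at 3.4 GHz
at_full_speed = counted_ticks(15000, 3.4e9, 3.4e9)
clocked_down = counted_ticks(15000, 3.4e9, 2.8e9)
print(round(at_full_speed), round(clocked_down))
```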

daydreamer

Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz (SSE4)

16580   cycles for 100 * Arcsinus
1378    cycles for 100 * ArcSin

16408   cycles for 100 * Arcsinus
1378    cycles for 100 * ArcSin

16397   cycles for 100 * Arcsinus
1358    cycles for 100 * ArcSin

16463   cycles for 100 * Arcsinus
1368    cycles for 100 * ArcSin

16674   cycles for 100 * Arcsinus
1358    cycles for 100 * ArcSin

58      bytes for Arcsinus
209     bytes for ArcSin

Real8   29.99999926061033761    Arcsinus
Real8   29.99999926061034117    ArcSin

-

I also wonder how the old-school raycasting optimization, an arccos LUT, stands compared to this? :biggrin:

I have a general SSE trig PROC that's untested: just input 4 floats and an offset value that controls which set of constants it points to, so it becomes a different Taylor series.
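
The idea described here, one routine whose offset picks the coefficient set, can be sketched in Python as follows. This is not the PROC mentioned above (which was not posted); the names `COEFFS` and `poly4` and the choice of sine and cosine series are assumptions made for illustration, and a plain loop over four values stands in for the SSE lanes.

```python
import math

# Taylor coefficients in ascending powers of x; the key plays the role of the offset
COEFFS = {
    "sin": [0.0, 1.0, 0.0, -1 / 6, 0.0, 1 / 120, 0.0, -1 / 5040],
    "cos": [1.0, 0.0, -1 / 2, 0.0, 1 / 24, 0.0, -1 / 720, 0.0],
}

def poly4(values, which):
    """Evaluate the selected series for four inputs with Horner's scheme."""
    if len(values) != 4:
        raise ValueError("expected exactly 4 inputs")
    coeffs = COEFFS[which]
    results = []
    for x in values:                 # an SSE version would do all four at once
        acc = 0.0
        for c in reversed(coeffs):   # highest power first
            acc = acc * x + c
        results.append(acc)
    return results

xs = [0.0, 0.1, 0.25, 0.5]
print(poly4(xs, "sin"))
print(poly4(xs, "cos"))
```

Because the evaluation loop never changes, switching functions only means pointing at a different row of constants. The truncated series used here are only accurate for small |x|.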


@Marinus
I thought you underclocked it because it would be a bigger challenge to optimize it to run on a slower CPU :badgrin:
AMIGA clock speed today would be really challenging :biggrin:
my non-asm creations
https://masm32.com/board/index.php?topic=6937.msg74303#msg74303
I am an Invoker
"An Invoker is a mage who specializes in the manipulation of raw and elemental energies."
Like SIMD coding

Siekmanski

@Magnus
Still miss the Amiga days, banging directly to the hardware was a lot of fun.

HSE

:biggrin: What is ArcSinInit's timing?
Equations in Assembly: SmplMath

jj2007

Quote from: HSE on December 22, 2020, 12:21:51 AM
:biggrin: What is ArcSinInit's timing?

Test yourself - it might be enough for a coffee break, who knows? :biggrin:

ArcSinInit:
NanoTimer()
FastMath ArcSin        ; define a math function
  For_ fct=0.0 To 1.0 Step 0.0001
        fld fct
        fstp REAL10 ptr [edi]
        void Arcsinus(fct)
        fstp REAL10 ptr [edi+REAL10]
        add edi, 2*REAL10
  Next
FastMath
PrintLine NanoTimer$(), " for initialising the ArcSin macro"
retn
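
For readers without MasmBasic, the same measurement can be approximated in Python: time the one-off table build separately from the lookups. This is only an analogy; `build_table` is a made-up name, and the timing it prints says nothing about how long ArcSinInit takes.

```python
import math
import time

def build_table(n=10000):
    """Precompute arcsin at n + 1 evenly spaced points in 0 ... 1."""
    return [math.asin(i / n) for i in range(n + 1)]

start = time.perf_counter()
table = build_table()
elapsed = time.perf_counter() - start

print(f"{elapsed * 1e3:.3f} ms for initialising a table of {len(table)} entries")
```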

HSE

Quote from: jj2007 on December 22, 2020, 12:50:55 AM
Test yourself - it might be enough for a coffee break, who knows? :biggrin:

I tried previously, but I got a little crash related to memory allocation :biggrin:

jj2007

Post the exe, I am curious

TouEnMasm


Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz (SSE4)

18942   cycles for 100 * Arcsinus
2095    cycles for 100 * ArcSin

18950   cycles for 100 * Arcsinus
2100    cycles for 100 * ArcSin

19068   cycles for 100 * Arcsinus
2137    cycles for 100 * ArcSin

18970   cycles for 100 * Arcsinus
2596    cycles for 100 * ArcSin

18904   cycles for 100 * Arcsinus
2112    cycles for 100 * ArcSin

58      bytes for Arcsinus
209     bytes for ArcSin

Real8   29.99999926061033761    Arcsinus
Real8   29.99999926061034117    ArcSin

--- ok ---
Fa is a musical note to play with CL

TimoVJL

AMD Ryzen 5 3400G with Radeon Vega Graphics     (SSE4)

19735   cycles for 100 * Arcsinus
2530    cycles for 100 * ArcSin

19817   cycles for 100 * Arcsinus
2535    cycles for 100 * ArcSin

19796   cycles for 100 * Arcsinus
2309    cycles for 100 * ArcSin

19779   cycles for 100 * Arcsinus
2326    cycles for 100 * ArcSin

19822   cycles for 100 * Arcsinus
2321    cycles for 100 * ArcSin

58      bytes for Arcsinus
209     bytes for ArcSin

Real8   29.99999926061033761    Arcsinus
Real8   29.99999926061034117    ArcSin

-
May the source be with you

HSE

Quote from: jj2007 on December 22, 2020, 02:36:31 AM
Post the exe, I am curious

:thumbsup: The MbProHeap initialization was missing:
  ifdef MbBufferInit
        call MbBufferInit
  endif

quarantined


AMD A6-9220e RADEON R4, 5 COMPUTE CORES 2C+3G   (SSE4)
1.60 GHz

28377   cycles for 100 * Arcsinus
4369    cycles for 100 * ArcSin

28350   cycles for 100 * Arcsinus
4247    cycles for 100 * ArcSin

28358   cycles for 100 * Arcsinus
4414    cycles for 100 * ArcSin

28341   cycles for 100 * Arcsinus
4440    cycles for 100 * ArcSin

28374   cycles for 100 * Arcsinus
4300    cycles for 100 * ArcSin

58      bytes for Arcsinus
209     bytes for ArcSin

Real8   29.99999926061033761    Arcsinus
Real8   29.99999926061034117    ArcSin


Windows 7 Pro, 32 bit

coaster

Intel(R) Core(TM) i5-7300U CPU @ 2.60GHz (SSE4)

14448   cycles for 100 * Arcsinus
1225    cycles for 100 * ArcSin

14644   cycles for 100 * Arcsinus
1214    cycles for 100 * ArcSin

14663   cycles for 100 * Arcsinus
1237    cycles for 100 * ArcSin

14606   cycles for 100 * Arcsinus
1206    cycles for 100 * ArcSin

14518   cycles for 100 * Arcsinus
1272    cycles for 100 * ArcSin

58      bytes for Arcsinus
209     bytes for ArcSin

Real8   29.99999926061033761    Arcsinus
Real8   29.99999926061034117    ArcSin