FastMath timings

Started by jj2007, September 04, 2020, 09:18:30 PM

jj2007

May I have some timings, please? The aim is to check whether slow math functions can be replaced with faster proxies.

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)
1337 µs for initialising FastLog10

2885    cycles for 100 * Log10 (Guga)
1476    cycles for 100 * FastMath Log10
10679   cycles for 100 * fptan
1402    cycles for 100 * FastTan
8200    cycles for 100 * fsin
1486    cycles for 100 * FastSin
12313   cycles for 100 * CRT BesselJ0
1397    cycles for 100 * FastBesselJ0

2879    cycles for 100 * Log10 (Guga)
1500    cycles for 100 * FastMath Log10
10672   cycles for 100 * fptan
1396    cycles for 100 * FastTan
8201    cycles for 100 * fsin
1493    cycles for 100 * FastSin
12318   cycles for 100 * CRT BesselJ0
1397    cycles for 100 * FastBesselJ0

2883    cycles for 100 * Log10 (Guga)
1488    cycles for 100 * FastMath Log10
10679   cycles for 100 * fptan
1395    cycles for 100 * FastTan
8204    cycles for 100 * fsin
1490    cycles for 100 * FastSin
12290   cycles for 100 * CRT BesselJ0
1387    cycles for 100 * FastBesselJ0

2884    cycles for 100 * Log10 (Guga)
1510    cycles for 100 * FastMath Log10
10709   cycles for 100 * fptan
1397    cycles for 100 * FastTan
8183    cycles for 100 * fsin
1491    cycles for 100 * FastSin
12306   cycles for 100 * CRT BesselJ0
1386    cycles for 100 * FastBesselJ0

2869    cycles for 100 * Log10 (Guga)
1486    cycles for 100 * FastMath Log10
10696   cycles for 100 * fptan
1394    cycles for 100 * FastTan
8183    cycles for 100 * fsin
1497    cycles for 100 * FastSin
12308   cycles for 100 * CRT BesselJ0
1392    cycles for 100 * FastBesselJ0

Real8   0.6989700043360187465   Log10 (Guga)
Real8   0.6989700043360188575   FastMath Log10
Real8   1.000003619996628679    fptan
Real8   1.000003614159206133    FastTan
Real8   0.4794255386042030054   fsin
Real8   0.4794255386042030054   FastSin
Real8   0.9384698072408128589   CRT BesselJ0
Real8   0.9384698072408128589   FastBesselJ0
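
The thread does not show how the Fast* proxies work internally. The one-off initialisation time and the "5001 elements" line printed in a later output hint at precomputed tables, so here is a minimal sketch of that general idea, assuming a plain lookup table with linear interpolation. The names `TABLE_SIZE`, `X_MAX`, `SIN_TABLE` and `fast_sin` are hypothetical, and the table size is merely borrowed from that output line. Linear interpolation is far less accurate than the FastSin result posted above, so the real library presumably uses a better scheme.

```python
import math

TABLE_SIZE = 5001               # hypothetical; borrowed from the "5001 elements" line
X_MAX = 2.0 * math.pi           # hypothetical range covered by the table
STEP = X_MAX / (TABLE_SIZE - 1)

# One-off initialisation cost: call the slow function once per grid point.
SIN_TABLE = [math.sin(i * STEP) for i in range(TABLE_SIZE)]

def fast_sin(x):
    """Approximate sin(x) for 0 <= x <= X_MAX by linear interpolation."""
    pos = x / STEP
    i = min(int(pos), TABLE_SIZE - 2)   # lower grid index, clamped at the top end
    frac = pos - i                      # position between grid points i and i+1
    return SIN_TABLE[i] + frac * (SIN_TABLE[i + 1] - SIN_TABLE[i])

print(fast_sin(0.5), math.sin(0.5))
```

The trade-off in such a scheme is memory and start-up time against per-call cost: each call is one index computation, two table reads and a multiply-add, independent of how expensive the original function is.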

FORTRANS

Hi,

   Two results (sort of).


Windows XP
Intel(R) Pentium(R) M processor 1.70GHz
"LOG10_FA.EXE has encountered a problem and needs to close.
We are sorry for the inconvenience."
... "Please tell Microsoft about this problem."



Microsoft Windows [Version 6.3.9600]
(c) 2013 Microsoft Corporation. All rights reserved.

C:\Windows\System32>g:

G:\>cd temp

G:\TEMP>cd test

G:\TEMP\TEST>LOG10_FA.EXE
Intel(R) Core(TM) i3-4005U CPU @ 1.70GHz (SSE4)
5001 elements
5001 elements
5001 elements
16 ms for initialising FastLog10

2792    cycles for 100 * Log10 (Guga)
1522    cycles for 100 * FastMath Log10
13172   cycles for 100 * fptan
1549    cycles for 100 * FastTan
9875    cycles for 100 * fsin
1686    cycles for 100 * FastSin
14603   cycles for 100 * CRT BesselJ0
1547    cycles for 100 * FastBesselJ0

2803    cycles for 100 * Log10 (Guga)
1523    cycles for 100 * FastMath Log10
13190   cycles for 100 * fptan
1548    cycles for 100 * FastTan
9873    cycles for 100 * fsin
1696    cycles for 100 * FastSin
14600   cycles for 100 * CRT BesselJ0
1548    cycles for 100 * FastBesselJ0

2792    cycles for 100 * Log10 (Guga)
1524    cycles for 100 * FastMath Log10
13172   cycles for 100 * fptan
1552    cycles for 100 * FastTan
9879    cycles for 100 * fsin
1686    cycles for 100 * FastSin
14602   cycles for 100 * CRT BesselJ0
1548    cycles for 100 * FastBesselJ0

2798    cycles for 100 * Log10 (Guga)
1522    cycles for 100 * FastMath Log10
13175   cycles for 100 * fptan
1548    cycles for 100 * FastTan
9876    cycles for 100 * fsin
1686    cycles for 100 * FastSin
14602   cycles for 100 * CRT BesselJ0
1547    cycles for 100 * FastBesselJ0

2793    cycles for 100 * Log10 (Guga)
1521    cycles for 100 * FastMath Log10
13172   cycles for 100 * fptan
1548    cycles for 100 * FastTan
9873    cycles for 100 * fsin
1687    cycles for 100 * FastSin
14601   cycles for 100 * CRT BesselJ0
1552    cycles for 100 * FastBesselJ0

Real8   0.6989700043360187465   Log10 (Guga)
Real8   0.6989700043360188575   FastMath Log10
Real8   1.000003619996628679    fptan
Real8   1.000003614159206133    FastTan
Real8   0.4794255386042030054   fsin
Real8   0.4794255386042030054   FastSin
Real8   0.9384698072408128589   CRT BesselJ0
Real8   0.9384698072408128589   FastBesselJ0

--- ok ---


Regards,

Steve N.
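
The Real8 lines allow a quick accuracy check of the proxies. The sketch below only subtracts the values printed above, using Python's `decimal` module so that the 19 printed digits are not rounded on input. The dictionary layout and the name `abs_diff` are made up for illustration.

```python
from decimal import Decimal

# (reference result, Fast* result), copied from the Real8 lines of the output
results = {
    "Log10":    (Decimal("0.6989700043360187465"), Decimal("0.6989700043360188575")),
    "tan":      (Decimal("1.000003619996628679"),  Decimal("1.000003614159206133")),
    "sin":      (Decimal("0.4794255386042030054"), Decimal("0.4794255386042030054")),
    "BesselJ0": (Decimal("0.9384698072408128589"), Decimal("0.9384698072408128589")),
}

def abs_diff(pair):
    """Absolute difference between the reference and the fast result."""
    reference, fast = pair
    return abs(reference - fast)

for name, pair in results.items():
    print(f"{name:9s} {abs_diff(pair):.3E}")
```

For this single test argument, sin and BesselJ0 agree to every printed digit, Log10 differs by 1.11E-16 (about the spacing of adjacent doubles between 0.5 and 1), and tan differs by roughly 5.8E-9. One argument per function says nothing about the worst case over the whole input range.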

TouEnMasm

Hello,
Windows 10:
Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz (SSE4)
13 ms for initialising five Fast* functions

2922    cycles for 100 * Log10 (Guga)
1550    cycles for 100 * FastMath Log10
13298   cycles for 100 * fptan
2066    cycles for 100 * FastTan
10263   cycles for 100 * fsin
2043    cycles for 100 * FastSin
15159   cycles for 100 * CRT BesselJ0
1653    cycles for 100 * FastBesselJ0

2829    cycles for 100 * Log10 (Guga)
1535    cycles for 100 * FastMath Log10
13356   cycles for 100 * fptan
1615    cycles for 100 * FastTan
10025   cycles for 100 * fsin
1626    cycles for 100 * FastSin
14824   cycles for 100 * CRT BesselJ0
1586    cycles for 100 * FastBesselJ0

2857    cycles for 100 * Log10 (Guga)
1615    cycles for 100 * FastMath Log10
13501   cycles for 100 * fptan
1648    cycles for 100 * FastTan
9932    cycles for 100 * fsin
1618    cycles for 100 * FastSin
14886   cycles for 100 * CRT BesselJ0
1832    cycles for 100 * FastBesselJ0

3454    cycles for 100 * Log10 (Guga)
1656    cycles for 100 * FastMath Log10
13335   cycles for 100 * fptan
1587    cycles for 100 * FastTan
9952    cycles for 100 * fsin
1571    cycles for 100 * FastSin
14808   cycles for 100 * CRT BesselJ0
1578    cycles for 100 * FastBesselJ0

2834    cycles for 100 * Log10 (Guga)
1721    cycles for 100 * FastMath Log10
13306   cycles for 100 * fptan
1560    cycles for 100 * FastTan
10006   cycles for 100 * fsin
1567    cycles for 100 * FastSin
14748   cycles for 100 * CRT BesselJ0
1566    cycles for 100 * FastBesselJ0

Real8   0.6989700043360187465   Log10 (Guga)
Real8   0.6989700043360188575   FastMath Log10
Real8   1.000003619996628679    fptan
Real8   1.000003614159206133    FastTan
Real8   0.4794255386042030054   fsin
Real8   0.4794255386042030054   FastSin
Real8   0.9384698072408128589   CRT BesselJ0
Real8   0.9384698072408128589   FastBesselJ0
-
Fa is a musical note to play with CL

jj2007

Thanks, Steve and Yves. I wonder if the Pentium M supports fisttp :cool:
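
Some background on the fisttp question: the instruction arrived with SSE3, and as far as I know the Pentium M implements SSE2 but not SSE3, which would fit the crash reported above. What fisttp adds over fistp is that it always truncates toward zero, whereas fistp follows the rounding mode in the FPU control word (round to nearest, ties to even, by default). The sketch below models only that difference; the function names are hypothetical and integer overflow is ignored.

```python
import math

def fistp_default(x):
    """Model of fistp under the default x87 rounding mode (nearest, ties to even)."""
    return round(x)          # Python's round() also rounds ties to even

def fisttp(x):
    """Model of fisttp: truncates toward zero whatever the rounding mode is."""
    return math.trunc(x)

for value in (2.5, 3.5, 2.7, -2.7):
    print(value, fistp_default(value), fisttp(value))
```

Truncation is what a table index computation usually wants, and without fisttp it requires changing the control word around fistp or a correction step after rounding.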

WinXP is not to blame: in my VM the results are very similar (as expected), with one exception: the CRT Bessel function is much slower.
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)
1369 µs for initialising five Fast* functions

3013    cycles for 100 * Log10 (Guga)
1500    cycles for 100 * FastMath Log10
11406   cycles for 100 * fptan
1528    cycles for 100 * FastTan
8650    cycles for 100 * fsin
1498    cycles for 100 * FastSin
20460   cycles for 100 * CRT BesselJ0
1457    cycles for 100 * FastBesselJ0

2976    cycles for 100 * Log10 (Guga)
1491    cycles for 100 * FastMath Log10
11296   cycles for 100 * fptan
1467    cycles for 100 * FastTan
8653    cycles for 100 * fsin
1501    cycles for 100 * FastSin
20471   cycles for 100 * CRT BesselJ0
1468    cycles for 100 * FastBesselJ0
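
To put numbers on "very similar" and "much slower", the cycle counts can be turned into ratios. This is a small sketch using only the first run of the native i5-2450M output at the top of the thread and the first run of the VM output above; the names `native`, `vm`, `speedup` and `vm_slowdown` are made up for illustration.

```python
# Cycles per 100 calls as (slow function, fast proxy), first run of each output
native = {
    "Log10":    (2885, 1476),
    "tan":      (10679, 1402),
    "sin":      (8200, 1486),
    "BesselJ0": (12313, 1397),
}
vm = {
    "Log10":    (3013, 1500),
    "tan":      (11406, 1528),
    "sin":      (8650, 1498),
    "BesselJ0": (20460, 1457),
}

def speedup(slow, fast):
    """How many times fewer cycles the proxy needs than the function it replaces."""
    return slow / fast

def vm_slowdown(name):
    """Cycle count of the slow function in the VM relative to the native run."""
    return vm[name][0] / native[name][0]

for name, (slow, fast) in native.items():
    print(f"{name:9s} speedup {speedup(slow, fast):5.2f}x, VM slowdown {vm_slowdown(name):4.2f}x")
```

On these figures the three x87-based functions cost between 4% and 7% more cycles in the VM, while CRT BesselJ0 needs about two thirds more.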

Siekmanski

Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz (SSE4)
3553 µs for initialising five Fast* functions

3151    cycles for 100 * Log10 (Guga)
1693    cycles for 100 * FastMath Log10
13133   cycles for 100 * fptan
1679    cycles for 100 * FastTan
10034   cycles for 100 * fsin
1726    cycles for 100 * FastSin
14843   cycles for 100 * CRT BesselJ0
1677    cycles for 100 * FastBesselJ0

3151    cycles for 100 * Log10 (Guga)
1694    cycles for 100 * FastMath Log10
13132   cycles for 100 * fptan
1676    cycles for 100 * FastTan
10044   cycles for 100 * fsin
1730    cycles for 100 * FastSin
14852   cycles for 100 * CRT BesselJ0
1676    cycles for 100 * FastBesselJ0

3153    cycles for 100 * Log10 (Guga)
1694    cycles for 100 * FastMath Log10
13132   cycles for 100 * fptan
1677    cycles for 100 * FastTan
10028   cycles for 100 * fsin
1726    cycles for 100 * FastSin
14826   cycles for 100 * CRT BesselJ0
1674    cycles for 100 * FastBesselJ0

3150    cycles for 100 * Log10 (Guga)
1693    cycles for 100 * FastMath Log10
13134   cycles for 100 * fptan
1677    cycles for 100 * FastTan
10031   cycles for 100 * fsin
1724    cycles for 100 * FastSin
14859   cycles for 100 * CRT BesselJ0
1674    cycles for 100 * FastBesselJ0

3148    cycles for 100 * Log10 (Guga)
1694    cycles for 100 * FastMath Log10
13133   cycles for 100 * fptan
1675    cycles for 100 * FastTan
10042   cycles for 100 * fsin
1725    cycles for 100 * FastSin
14833   cycles for 100 * CRT BesselJ0
1673    cycles for 100 * FastBesselJ0

Real8   0.6989700043360187465   Log10 (Guga)
Real8   0.6989700043360188575   FastMath Log10
Real8   1.000003619996628679    fptan
Real8   1.000003614159206133    FastTan
Real8   0.4794255386042030054   fsin
Real8   0.4794255386042030054   FastSin
Real8   0.9384698072408128589   CRT BesselJ0
Real8   0.9384698072408128589   FastBesselJ0

--- ok ---
Creative coders use backward thinking techniques as a strategy.

six_L

Intel(R) Core(TM) i5-9400H CPU @ 2.50GHz (SSE4)
1498 µs for initialising five Fast* functions

1607    cycles for 100 * Log10 (Guga)
677     cycles for 100 * FastMath Log10
9076    cycles for 100 * fptan
655     cycles for 100 * FastTan
6974    cycles for 100 * fsin
659     cycles for 100 * FastSin
7928    cycles for 100 * CRT BesselJ0
694     cycles for 100 * FastBesselJ0

1634    cycles for 100 * Log10 (Guga)
678     cycles for 100 * FastMath Log10
8921    cycles for 100 * fptan
662     cycles for 100 * FastTan
6948    cycles for 100 * fsin
667     cycles for 100 * FastSin
8052    cycles for 100 * CRT BesselJ0
664     cycles for 100 * FastBesselJ0

1611    cycles for 100 * Log10 (Guga)
746     cycles for 100 * FastMath Log10
8873    cycles for 100 * fptan
656     cycles for 100 * FastTan
6967    cycles for 100 * fsin
664     cycles for 100 * FastSin
7934    cycles for 100 * CRT BesselJ0
653     cycles for 100 * FastBesselJ0

1626    cycles for 100 * Log10 (Guga)
663     cycles for 100 * FastMath Log10
8861    cycles for 100 * fptan
655     cycles for 100 * FastTan
6911    cycles for 100 * fsin
669     cycles for 100 * FastSin
7933    cycles for 100 * CRT BesselJ0
664     cycles for 100 * FastBesselJ0

1676    cycles for 100 * Log10 (Guga)
663     cycles for 100 * FastMath Log10
8868    cycles for 100 * fptan
653     cycles for 100 * FastTan
6942    cycles for 100 * fsin
671     cycles for 100 * FastSin
7947    cycles for 100 * CRT BesselJ0
668     cycles for 100 * FastBesselJ0

Real8   0.6989700043360187465   Log10 (Guga)
Real8   0.6989700043360188575   FastMath Log10
Real8   1.000003619996628679    fptan
Real8   1.000003614159206133    FastTan
Real8   0.4794255386042030054   fsin
Real8   0.4794255386042030054   FastSin
Real8   0.9384698072408128589   CRT BesselJ0
Real8   0.9384698072408128589   FastBesselJ0

-
Say you, Say me, Say the codes together for ever.

guga



AMD Ryzen 5 2400G with Radeon Vega Graphics     (SSE4)
2994 µs for initialising five Fast* functions

2720    cycles for 100 * Log10 (Guga)
2245    cycles for 100 * FastMath Log10
9354    cycles for 100 * fptan
1707    cycles for 100 * FastTan
11454   cycles for 100 * fsin
1755    cycles for 100 * FastSin
17719   cycles for 100 * CRT BesselJ0
1702    cycles for 100 * FastBesselJ0

2736    cycles for 100 * Log10 (Guga)
1702    cycles for 100 * FastMath Log10
9490    cycles for 100 * fptan
1692    cycles for 100 * FastTan
11479   cycles for 100 * fsin
1717    cycles for 100 * FastSin
17690   cycles for 100 * CRT BesselJ0
1718    cycles for 100 * FastBesselJ0

2738    cycles for 100 * Log10 (Guga)
1700    cycles for 100 * FastMath Log10
9389    cycles for 100 * fptan
1685    cycles for 100 * FastTan
11528   cycles for 100 * fsin
1753    cycles for 100 * FastSin
17613   cycles for 100 * CRT BesselJ0
1678    cycles for 100 * FastBesselJ0

2872    cycles for 100 * Log10 (Guga)
1713    cycles for 100 * FastMath Log10
9539    cycles for 100 * fptan
1719    cycles for 100 * FastTan
11417   cycles for 100 * fsin
1680    cycles for 100 * FastSin
18221   cycles for 100 * CRT BesselJ0
1680    cycles for 100 * FastBesselJ0

2736    cycles for 100 * Log10 (Guga)
1718    cycles for 100 * FastMath Log10
9320    cycles for 100 * fptan
1702    cycles for 100 * FastTan
11544   cycles for 100 * fsin
1694    cycles for 100 * FastSin
17514   cycles for 100 * CRT BesselJ0
1687    cycles for 100 * FastBesselJ0

Real8   0.6989700043360187465   Log10 (Guga)
Real8   0.6989700043360188575   FastMath Log10
Real8   1.000003619996628679    fptan
Real8   1.000003614159206133    FastTan
Real8   0.4794255386042030054   fsin
Real8   0.4794255386042030054   FastSin
Real8   0.9384698072408128589   CRT BesselJ0
Real8   0.9384698072408128589   FastBesselJ0

--- ok ---


Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com