Topic: CRT vs GSL

jj2007

  • Member
  • *****
  • Posts: 7549
  • Assembler is fun ;-)
    • MasmBasic
CRT vs GSL
« on: October 18, 2016, 05:06:20 AM »
Timings for one million calls to the Bessel functions J0 and Y0, trusty old CRT vs. GNU Scientific Library (GSL):

Code:
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
J0 =-0.177596771314338304...    Wolfram Alpha
J0 =-0.17759677131433826        crt
J0 =-0.17759677131433829        Gsl

Y0 =-0.308517625249033780...    Wolfram Alpha
Y0 =-0.30851762524903474        crt
Y0 =-0.30851762524903376        Gsl

Result: -0.17759677131433826    53 ms for CRT J0
Result: -0.17759677131433829    519 ms for GSL

Result: -0.30851762524903474    115 ms for CRT Y0
Result: -0.30851762524903376    514 ms for GSL
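
To put the digits and timings above side by side, here is a small Python sketch (not part of the original test) that computes each library's absolute error against the Wolfram Alpha digits and the GSL/CRT time ratio. All numbers are copied from the output above. The Wolfram Alpha values are truncated there, so the errors are only meaningful down to about 1e-18, and the names `REFERENCE`, `RESULTS`, `abs_error` and `slowdown` are made up for illustration.

```python
from decimal import Decimal

# Values copied from the posted output (the Wolfram Alpha digits are truncated).
REFERENCE = {
    "J0": Decimal("-0.177596771314338304"),
    "Y0": Decimal("-0.308517625249033780"),
}

# (function, library) -> (printed result, milliseconds for one million calls)
RESULTS = {
    ("J0", "CRT"): (Decimal("-0.17759677131433826"), 53),
    ("J0", "GSL"): (Decimal("-0.17759677131433829"), 519),
    ("Y0", "CRT"): (Decimal("-0.30851762524903474"), 115),
    ("Y0", "GSL"): (Decimal("-0.30851762524903376"), 514),
}


def abs_error(func, lib):
    """Absolute difference between a printed result and the reference digits."""
    value, _ms = RESULTS[(func, lib)]
    return abs(value - REFERENCE[func])


def slowdown(func):
    """How many times longer GSL took than the CRT for the same function."""
    return RESULTS[(func, "GSL")][1] / RESULTS[(func, "CRT")][1]


if __name__ == "__main__":
    for (func, lib), (_value, ms) in RESULTS.items():
        print(f"{func} {lib}: error {abs_error(func, lib):.1E}, {ms} ms")
    for func in REFERENCE:
        print(f"{func}: GSL takes {slowdown(func):.1f}x the CRT time")
```

On these numbers the CRT is roughly ten times faster for J0 and between four and five times faster for Y0, while its Y0 result is the least accurate of the four.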

jj2007

Re: CRT vs GSL
« Reply #1 on: October 18, 2016, 11:31:58 AM »
I've tried to add an example from Intel's Math Kernel Library (MKL); it's a free 613 MB download.

Technically, the MKL can be called from assembler (see attachment), but I couldn't find an example for the simple Bessel functions ::)

Code:
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Intel(R) Math Kernel Library Version 2017.0.0 Product Build 20160801 for 32-bit applications
CPU current:    2.89 GHz
CPU max:        2.50 GHz
CPU clocks:     2.49 GHz

jj2007

Re: CRT vs GSL
« Reply #2 on: October 19, 2016, 12:29:35 AM »
Found another test case for the MKL - square root:
Code:
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Intel(R) Math Kernel Library Version 2017.0.0 Product Build 20160801 for 32-bit applications

SQRT: 10 elements, 1000000 loops
 70 ms for FPU, fsqrt
110 ms for MKL, vdSqrt

SQRT: 1000 elements, 1000000 loops
7815 ms for FPU, fsqrt
3573 ms for MKL, vdSqrt

sq =1.000000000000000000        MKL square root
sq =1.414213562373095145        MKL square root
sq =1.732050807568877193        MKL square root
sq =2.000000000000000000        MKL square root
sq =2.236067977499789805        MKL square root

This loads a double array with 0, 1, 2, etc. and then calculates the square root of each element. For a small number of elements (<=20), the FPU is faster, but for high element counts, the MKL is about twice as fast (7815 ms vs. 3573 ms at 1000 elements).
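
As a rough cross-check of that crossover, the four timings above can be turned into a simple cost model: a fixed overhead per pass plus a cost per element. This Python sketch is not part of the attached source. It assumes the cost is linear in the element count, which two data points per method cannot confirm, and the names `TIMINGS_NS`, `fit_line` and `break_even` are hypothetical.

```python
# Timings copied from the posted output. Total milliseconds for 1,000,000
# loops is numerically the same as nanoseconds per loop (one pass over the array).
TIMINGS_NS = {
    "FPU": {10: 70.0, 1000: 7815.0},
    "MKL": {10: 110.0, 1000: 3573.0},
}


def fit_line(points):
    """Return (overhead_ns, ns_per_element) for the line through two points."""
    (n1, t1), (n2, t2) = sorted(points.items())
    slope = (t2 - t1) / (n2 - n1)
    return t1 - slope * n1, slope


def break_even(a, b):
    """Element count at which two linear cost models take the same time."""
    (a_overhead, a_slope), (b_overhead, b_slope) = a, b
    return (b_overhead - a_overhead) / (a_slope - b_slope)


fpu = fit_line(TIMINGS_NS["FPU"])
mkl = fit_line(TIMINGS_NS["MKL"])
crossover = break_even(fpu, mkl)

if __name__ == "__main__":
    print(f"FPU: {fpu[0]:.1f} ns overhead, {fpu[1]:.2f} ns per element")
    print(f"MKL: {mkl[0]:.1f} ns overhead, {mkl[1]:.2f} ns per element")
    print(f"break-even at about {crossover:.1f} elements")
```

Under that assumption the MKL pays a fixed cost of several dozen nanoseconds per call and then needs less than half the FPU's time per element, which puts the break-even point just under 20 elements. The slightly negative FPU overhead that the fit produces shows the limits of a two-point model.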

Source and exe attached, but the MKL is required; the exe will invite you to visit Intel if it can't find it. The download is free but registration is required.

(The *.asc source is in RTF format; it opens in WordPad, but looks much better in RichMasm.)
« Last Edit: October 19, 2016, 08:37:53 PM by jj2007 »

jj2007

Re: CRT vs GSL
« Reply #3 on: October 20, 2016, 09:31:33 AM »
Another test, this time with the Intel compiler's libmmd.dll, which I just discovered on my Win7-64 machine:

Code:
include \masm32\MasmBasic\MasmBasic.inc      ; requires the MasmBasic package (separate download)
  Init

  Dll "%CommonProgramFiles%\Intel\Shared Libraries\redist\ia32\compiler\libmmd.dll"      ; Intel's math library
  Declare double j0, C:1      ; import j0 (C calling convention, 1 argument)

  PrintLine "Bessel J0(5)=-0.17759677131433830435      (Wolfram Alpha)"      ; reference digits
  Print Str$("Bessel J0(5)=%Jf      (Intel)\n", j0(5.0)#)
  invoke crt__j0, FP8(5.0)      ; CRT _j0, result is returned in ST(0)
  Print Str$("Bessel J0(5)=%Jf      (CRT)\n\n", ST(0)#)

  PrintLine "Bessel J0(7)= 0.30007927051955559665      (Wolfram Alpha)"
  Print Str$("Bessel J0(7)= %Jf      (Intel)\n", j0(7.0)#)
  invoke crt__j0, FP8(7.0)
  Inkey Str$("Bessel J0(7)= %Jf      (CRT)", ST(0)#)
EndOfCode


The output shows that the Intel library returns its result with REAL10 (80-bit) precision, while the CRT result is only good to REAL8 precision:
Code:
Bessel J0(5)=-0.17759677131433830435    (Wolfram Alpha)
Bessel J0(5)=-0.1775967713143383044     (Intel)
Bessel J0(5)=-0.1775967713143382590     (CRT)

Bessel J0(7)= 0.30007927051955559665    (Wolfram Alpha)
Bessel J0(7)= 0.3000792705195555967     (Intel)
Bessel J0(7)= 0.3000792705195550060     (CRT)
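
For anyone who wants to check the reference digits without Wolfram Alpha, here is a minimal Python sketch. It sums the standard power series J0(x) = sum over k of (-1)^k (x^2/4)^k / (k!)^2 with 40-digit decimal arithmetic and then measures how far the posted Intel and CRT digits are from it. It is not what any of the libraries do internally, and `bessel_j0`, `POSTED` and `posted_error` are names chosen for illustration.

```python
from decimal import Decimal, getcontext

getcontext().prec = 40  # working precision, well beyond REAL10's roughly 19 digits


def bessel_j0(x):
    """J0(x) from its power series: sum of (-1)^k (x^2/4)^k / (k!)^2."""
    q = Decimal(x) ** 2 / 4
    term = total = Decimal(1)
    k = 0
    while abs(term) > Decimal("1e-38"):
        k += 1
        term = -term * q / (k * k)  # ratio of consecutive terms is -q / k^2
        total += term
    return total


# Digits copied from the posted output.
POSTED = {
    5: {"Intel": "-0.1775967713143383044", "CRT": "-0.1775967713143382590"},
    7: {"Intel": "0.3000792705195555967", "CRT": "0.3000792705195550060"},
}


def posted_error(x, lib):
    """Absolute difference between a posted result and the series value."""
    return abs(Decimal(POSTED[x][lib]) - bessel_j0(x))


if __name__ == "__main__":
    for x, libs in POSTED.items():
        print(f"J0({x}) = {bessel_j0(x):.20f}")
        for lib in libs:
            print(f"  {lib:5s} error {posted_error(x, lib):.1E}")
```

The series is only practical for small arguments like these, where cancellation between terms is mild. On the posted digits, the Intel values agree with it to about 19 decimals, while the CRT values start to differ at the 16th decimal.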

Does everybody have the file C:\Program Files (x86)\Common Files\Intel\Shared Libraries\redist\ia32\compiler\libmmd.dll?
These DLLs are available for download, but I don't remember installing them knowingly; perhaps some other package did ::)