FastMath

Started by jj2007, September 03, 2020, 09:31:58 AM


jj2007

Some mathematical functions such as sine, cosine and tangent are a little slow, so a speedier alternative might be useful. This is what the new FastMath macro is for: you feed it a range of y=f(x) values, and afterwards you can call a new, fast function. Example:

  FastMath MyBessel     ; -------- create the MyBessel function --------
  xor ecx, ecx                  ; counter for the arrays
  For_ fct=-10.0 To 20.0 Step 0.1
        fld fct
        fst x(ecx)              ; assign to x() array
        fstp REAL10 ptr [edi]   ; and to the FastMath macro
        void gsl_sf_bessel_J0(fct)      ; see GNU Scientific Library Reference
        fst y(ecx)              ; assign to y() array
        fstp REAL10 ptr [edi+REAL10]    ; and to the FastMath macro
        add edi, 2*REAL10
        inc ecx
  Next
  FastMath


The usage of the new function is pretty simple, for example:
Print Str$("MyBessel(10.0)=%f\n", MyBessel(10.0))
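For readers who do not use MasmBasic, the idea of "feed a table of y=f(x) values, then call a fast function" can be sketched in Python. This is only an illustration of the general lookup-table technique: the thread does not say how FastMath interpolates between the stored points, so linear interpolation is an assumption here, and `make_fast` and `fast_sin` are hypothetical names.

```python
import bisect
import math

def make_fast(f, lo, hi, step):
    """Tabulate f on [lo, hi]; return a function that interpolates linearly.

    ASSUMPTION: linear interpolation. The FastMath macro may work differently.
    """
    n = int(round((hi - lo) / step))
    xs = [lo + i * step for i in range(n + 1)]   # X values fed to the table
    ys = [f(x) for x in xs]                      # Y values fed to the table

    def fast(x):
        if not xs[0] <= x <= xs[-1]:
            raise ValueError("x outside the tabulated range")
        # Index of the table interval containing x, clamped to a valid interval.
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), len(xs) - 2)
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return fast

# Same range and step as the MyBessel loop above, with sin() standing in for J0.
fast_sin = make_fast(math.sin, -10.0, 20.0, 0.1)
print(fast_sin(10.0), math.sin(10.0))
```

The expensive function is evaluated only while the table is built; each later call costs one search and one multiply-add, at the price of a small interpolation error that depends on the step.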

Attached is a full example with a graphical interface. Building requires the MasmBasic version of 3 September 2020. At a certain point it may ask you to install the GSL DLLs; if that works (it should, it's only 3MB), you'll see them afterwards in \masm32\MasmBasic\GnuScLib\DLL\*.dll

jj2007

New version; it needs the MasmBasic version of 4 September to build. The attachment no longer needs the GNU Scientific Library because I discovered that the ordinary CRT has a Bessel function, too :cool:

Now there are two versions of FastMath available:

a)   FastMath FastLog10    ; define a math function
       For_ fct=0.0 To 10.0 Step 0.5   ; max 10,000 iterations
               fld fct         ; X
               fstp REAL10 ptr [edi]
               void Log10(fct)         ; Y (built-in MasmBasic function)
               fstp REAL10 ptr [edi+REAL10]
               add edi, 2*REAL10
       Next
     FastMath
     Print Str$("Log(5.0)=%Jf", FastLog10(5))


b) If you save the REAL10 X/Y pairs to disk, you can use a one-liner, e.g. FastMath MyLog10, "log10.dat"

For the one-liner, the second argument is a filename, a resource ID (the data are in an RCDATA resource), or even a URL.
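As a rough picture of variant b), the following Python sketch writes X/Y pairs to a binary file and reads them back. It assumes 64-bit doubles stored back to back, because Python's `struct` module has no 80-bit type; the actual file expected by FastMath holds REAL10 pairs, so this layout is not compatible with it. The helper names and the file name `log10_demo.dat` are hypothetical.

```python
import math
import os
import struct
import tempfile

# ASSUMPTION: little-endian 64-bit doubles, not the 80-bit REAL10 of the thread.
PAIR = struct.Struct("<dd")

def save_pairs(path, pairs):
    """Write (x, y) pairs back to back as raw binary."""
    with open(path, "wb") as fh:
        for x, y in pairs:
            fh.write(PAIR.pack(x, y))

def load_pairs(path):
    """Read the file back into a list of (x, y) tuples."""
    with open(path, "rb") as fh:
        data = fh.read()
    last = len(data) - PAIR.size
    return [PAIR.unpack_from(data, off) for off in range(0, last + 1, PAIR.size)]

# Tabulate log10 from 0.5 to 10.0 in steps of 0.5 (x=0 is skipped here).
pairs = [(0.5 * i, math.log10(0.5 * i)) for i in range(1, 21)]
path = os.path.join(tempfile.gettempdir(), "log10_demo.dat")  # hypothetical name
save_pairs(path, pairs)
loaded = load_pairs(path)
print(len(loaded), loaded[9])
```

Storing the table once means the slow function never has to run in the program that consumes the data.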

jj2007

Inspired by srvaldez at the FreeBasic forum:
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
ExpE_slow: 1987 ms for the sum 3.3071e+300
ExpE_fast:  699 ms for the sum 3.3074e+300

ExpE_slow: 1974 ms for the sum 3.3071e+300
ExpE_fast:  701 ms for the sum 3.3074e+300

ExpE_slow: 1982 ms for the sum 3.3071e+300
ExpE_fast:  692 ms for the sum 3.3074e+300

ExpE_slow: 1952 ms for the sum 3.3071e+300
ExpE_fast:  691 ms for the sum 3.3074e+300

Speed gain factor: 2.83
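The two sums above differ in the fourth significant digit (3.3071e+300 versus 3.3074e+300), which is the accuracy cost of the table. A small Python sketch can show where such a difference may come from. It assumes linear interpolation on a grid with step 0.08, which is not a claim about how FastMath actually interpolates, and `exp_table` is a hypothetical name.

```python
import math

STEP = 0.08                       # same step as the ExpE_fast table in the thread
N = int(round(680.0 / STEP))      # number of table intervals
xs = [i * STEP for i in range(N + 1)]
ys = [math.exp(x) for x in xs]

def exp_table(x):
    """ASSUMPTION: linear interpolation between neighbouring table entries."""
    i = min(int(x / STEP), N - 1)          # interval index, clamped at the top end
    t = (x - xs[i]) / STEP
    return ys[i] + t * (ys[i + 1] - ys[i])

# Same arguments as the benchmark loop: the integers 0..680.
exact = sum(math.exp(n) for n in range(681))
approx = sum(exp_table(n) for n in range(681))
rel = approx / exact - 1.0
print(f"relative difference of the sums: {rel:.2e}")
print(f"relative error at x=1: {exp_table(1.0) / math.exp(1.0) - 1.0:.2e}")
```

With this step, even integers fall on table nodes while odd integers fall halfway between two nodes, where linear interpolation of a convex function such as exp overestimates most. Under these assumptions the relative difference of the sums is on the order of 1e-4, the same order as the difference reported in the thread.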


Source (attached):
include \masm32\MasmBasic\MasmBasic.inc
  SetGlobals ct, sum:REAL10, fct:REAL8, timeSlow, timeFast
  Init
  PrintCpu 0
  FastMath ExpE_fast
        For_ fct=0.0 To 680.0 Step 0.08
                fld fct         ; X
                fstp REAL10 ptr [edi]
                void ExpE(fct)  ; Y (ExpE is a built-in MasmBasic function)
                fstp REAL10 ptr [edi+REAL10]
                add edi, 2*REAL10
        Next
  FastMath

  push 4
  .Repeat
        NanoTimer()
        Clr sum
        For_ ct=1 To 100000-1
                xor ecx, ecx
                .Repeat
                        void ExpE(ecx)  ; calculates ExpE(ecx) and leaves the result in ST(0)
                        fld sum
                        fadd            ; sum+=ExpE(ecx)
                        fstp sum
                        inc ecx
                .Until ecx>680
        Next
        add timeSlow, NanoTimer(ms)
        Print Str$("ExpE_slow: %___i ms", eax), Str$(" for the sum %5e\n", sum)  ; expected sum =  3.31e+300
        fldz
        fstp sum        ; Clr sum
        NanoTimer()
        Clr sum
        For_ ct=1 To 100000-1
                xor ecx, ecx
                .Repeat
                        void ExpE_fast(ecx)     ; calculates ExpE(ecx) approximately and leaves the result in ST(0)
                        fld sum
                        fadd
                        fstp sum
                        inc ecx
                .Until ecx>680
        Next
        add timeFast, NanoTimer(ms)
        Print Str$("ExpE_fast: %___i ms", eax), Str$(" for the sum %5e\n\n", sum)  ; expected sum =  3.31e+300
        dec stack
  .Until Sign?
  Inkey Str$("Speed gain factor: %3f", timeSlow/timeFast)
EndOfCode