### Author Topic: FastMath

« on: September 03, 2020, 09:31:58 AM »
Some mathematical functions such as sine, cosine and tangent are a little slow, so a speedier alternative can be useful. That is what the new FastMath macro is for: you feed it a range of y=f(x) values, and afterwards you can call a new, fast function. Example:

FastMath MyBessel     ; -------- create the MyBessel function --------
xor ecx, ecx                  ; counter for the arrays
For_ fct=-10.0 To 20.0 Step 0.1
  fld fct
  fst x(ecx)              ; assign to x() array
  fstp REAL10 ptr [edi]   ; and to the FastMath macro
  void gsl_sf_bessel_J0(fct)
  fst y(ecx)              ; assign to y() array
  fstp REAL10 ptr [edi+REAL10]    ; and to the FastMath macro
  inc ecx
Next
FastMath

The usage of the new function is pretty simple, for example:
`Print Str$("MyBessel(10.0)=%f\n", MyBessel(10.0))`
Attached is a full example with a graphical interface. Building requires the MasmBasic version of 3 September 2020. At some point it may ask you to install the GSL DLLs; if that works (it should, it's only 3MB), you'll see them afterwards in \masm32\MasmBasic\GnuScLib\DLL\*.dll
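
To illustrate the general idea of feeding a table of y=f(x) values and getting a fast function back, here is a minimal Python sketch. It is not the FastMath implementation: the thread does not say how FastMath interpolates, so linear interpolation between neighbouring table entries is an assumption, `make_fast_function` is a hypothetical name, and `math.sin` stands in for the Bessel function.

```python
import math

def make_fast_function(f, start, stop, step):
    """Tabulate f on an evenly spaced grid and return a lookup-based approximation."""
    count = int(round((stop - start) / step)) + 1      # number of table nodes
    ys = [f(start + i * step) for i in range(count)]   # y values at the nodes

    def fast(x):
        pos = (x - start) / step                       # fractional index into the table
        if pos < 0 or pos > count - 1:
            raise ValueError("x lies outside the tabulated range")
        i = min(int(pos), count - 2)                   # left node, clamped so i + 1 is valid
        t = pos - i                                    # position between the two nodes
        return ys[i] + t * (ys[i + 1] - ys[i])         # linear interpolation (assumed)

    return fast

# Same range and step as the loop in the post; math.sin is a stand-in function.
fast_sin = make_fast_function(math.sin, -10.0, 20.0, 0.1)
print(fast_sin(10.0), math.sin(10.0))
```

Because the grid is evenly spaced, the lookup needs one division and no search, which is where a table-based function can save time. Accuracy depends on the step size: a finer step costs more memory but reduces the interpolation error.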

« Reply #1 on: September 05, 2020, 06:08:53 AM »
New version; it needs the MasmBasic version of 4 September to build. The attachment no longer needs the Gnu Scientific Library because I discovered that the ordinary CRT has a Bessel function, too. Now there are two versions of FastMath available:

a)   FastMath FastLog10    ; define a math function
For_ fct=0.0 To 10.0 Step 0.5   ; max 10,000 iterations
  fld fct         ; X
  fstp REAL10 ptr [edi]
  void Log10(fct)         ; Y (built-in MasmBasic function)
  fstp REAL10 ptr [edi+REAL10]
Next
FastMath
Print Str$("Log(5.0)=%Jf", FastLog10(5))

b) If you save the REAL10 X/Y pairs to disk, you can use a one-liner, e.g. `FastMath MyLog10, "log10.dat"`

For the one-liner, the second argument is a filename, a resource ID (with the data in an RCDATA resource), or even a URL.
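
For anyone who wants to inspect such a data file outside MasmBasic, the following Python sketch decodes REAL10 values, i.e. the 10-byte x87 extended-precision format. The file layout is an assumption: the thread only says that REAL10 X/Y pairs are saved, so the sketch presumes consecutive X, Y values with no header. `real10_to_float` and `read_pairs` are hypothetical helper names.

```python
import math
import struct

def real10_to_float(raw):
    """Decode one little-endian x87 80-bit extended value into a Python float."""
    if len(raw) != 10:
        raise ValueError("a REAL10 is exactly 10 bytes")
    mantissa, sign_exp = struct.unpack("<QH", raw)   # 64-bit mantissa, then sign + exponent
    sign = -1.0 if sign_exp & 0x8000 else 1.0
    exponent = sign_exp & 0x7FFF
    # The mantissa has an explicit integer bit, so the value is
    # mantissa * 2**(exponent - 16383 - 63). Infinities and NaNs are not handled here.
    return sign * math.ldexp(mantissa, exponent - 16383 - 63)

def read_pairs(data):
    """Split bytes holding consecutive REAL10 X/Y pairs (assumed layout) into tuples."""
    if len(data) % 20:
        raise ValueError("expected a whole number of 20-byte X/Y pairs")
    return [
        (real10_to_float(data[o:o + 10]), real10_to_float(data[o + 10:o + 20]))
        for o in range(0, len(data), 20)
    ]

# Hand-encoded sample: X = 1.0, Y = 5.0
sample = bytes.fromhex("0000000000000080ff3f" "00000000000000a00140")
print(read_pairs(sample))
```

Converting to a Python float drops the extra precision of REAL10 (64 mantissa bits versus 53), which is fine for inspection but not for regenerating the table.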

« Reply #2 on: February 02, 2021, 01:12:48 AM »
Inspired by srvaldez at the FreeBasic forum:

    Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
    ExpE_slow: 1987 ms for the sum 3.3071e+300
    ExpE_fast:  699 ms for the sum 3.3074e+300

    ExpE_slow: 1974 ms for the sum 3.3071e+300
    ExpE_fast:  701 ms for the sum 3.3074e+300

    ExpE_slow: 1982 ms for the sum 3.3071e+300
    ExpE_fast:  692 ms for the sum 3.3074e+300

    ExpE_slow: 1952 ms for the sum 3.3071e+300
    ExpE_fast:  691 ms for the sum 3.3074e+300

    Speed gain factor: 2.83

Source (attached):

    include \masm32\MasmBasic\MasmBasic.inc
      SetGlobals ct, sum:REAL10, fct:REAL8, timeSlow, timeFast
      Init
      PrintCpu 0
      FastMath ExpE_fast
        For_ fct=0.0 To 680.0 Step 0.08
          fld fct                        ; X
          fstp REAL10 ptr [edi]
          void ExpE(fct)                 ; Y (ExpE is a built-in MasmBasic function)
          fstp REAL10 ptr [edi+REAL10]
          add edi, 2*REAL10
        Next
      FastMath
      push 4
      .Repeat
        NanoTimer()
        Clr sum
        For_ ct=1 To 100000-1
          xor ecx, ecx
          .Repeat
            void ExpE(ecx)               ; calculates ExpE(ecx) and leaves the result in ST(0)
            fld sum
            fadd                         ; sum+=ExpE(ecx)
            fstp sum
            inc ecx
          .Until ecx>680
        Next
        add timeSlow, NanoTimer(ms)
        Print Str$("ExpE_slow: %___i ms", eax), Str$(" for the sum %5e\n", sum)  ; expected sum =  3.31e+300
        fldz
        fstp sum                         ; Clr sum
        NanoTimer()
        Clr sum
        For_ ct=1 To 100000-1
          xor ecx, ecx
          .Repeat
            void ExpE_fast(ecx)          ; calculates ExpE(ecx) approximately and leaves the result in ST(0)
            fld sum
            fadd
            fstp sum
            inc ecx
          .Until ecx>680
        Next
        add timeFast, NanoTimer(ms)
        Print Str$("ExpE_fast: %___i ms", eax), Str$(" for the sum %5e\n\n", sum)  ; expected sum =  3.31e+300
        dec stack
      .Until Sign?
      Inkey Str$("Speed gain factor: %3f", timeSlow/timeFast)
    EndOfCode

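
As a quick cross-check of the posted numbers, the short Python sketch below recomputes the speed gain from the four printed timing pairs and estimates how far the fast sum deviates from the slow one. It uses only the figures shown in the output above. The sums are printed with five significant digits, so the deviation is only a rough estimate.

```python
# Timings in milliseconds, copied from the posted output
slow_ms = [1987, 1974, 1982, 1952]
fast_ms = [699, 701, 692, 691]

# The source accumulates timeSlow and timeFast over all runs and divides the totals
gain = sum(slow_ms) / sum(fast_ms)

# Relative deviation between the printed sums (limited by the 5-digit printout)
sum_slow = 3.3071e300
sum_fast = 3.3074e300
rel_dev = abs(sum_fast - sum_slow) / sum_slow

print(f"speed gain: {gain:.2f}")
print(f"relative deviation of the sums: {rel_dev:.1e}")
```

The totals give a gain of about 2.84, in line with the 2.83 reported by the program, and the two sums differ by roughly 0.009%.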