News:

Masm32 SDK description, downloads and other helpful links
Message to All Guests
NB: Posting URL's See here: Posted URL Change

Main Menu

Exponentiation

Started by JK, July 31, 2022, 06:18:09 AM

Previous topic - Next topic

JK

I want do exponentiation of floats, so i had a look at Raymonds FPU tutorial and found sample code. This is what i use for testing:


include <windows.inc>
includelib kernel32.lib


.data

r1 real10  2.0
r2 real10 -2.4
r3 real10 -100.123
i4 dword   4


qf qword ?             
qs qword ?             
qe qword ?             


.code


MAIN PROC
;*************************************************************************************
; main proc
;*************************************************************************************


  invoke QueryPerformanceCounter, addr qs


  FINIT

  xor esi, esi
  .Repeat

    FILD i4
;    FLD r1
    FLD r2
;    FLD r3

    FYL2X                        ;ST(0)=y*log2(x), ST(1)=zzz
    FLD st(0)                    ;make a second copy, ST(0)=y*log2(x), ST(1)=y*log2(x), ST(2)=zzz
    FRNDINT                      ;round it to an integer, ST(0)=int[y*log2(x)], ST(1)=y*log2(x), ST(2)=zzz
    FSUB st(1), st(0)            ;this will leave only a fractional portion in ST(1), ST(0)=int[y*log2(x)], ST(1)=y*log2(x)-int[y*log2(x)], ST(2)=zzz
    FXCH                         ;ST(0)=y*log2(x)-int[y*log2(x)], ST(1)=int[y*log2(x)], ST(2)=zzz
    F2XM1                        ;get the fractional power of 2 (minus 1), ST(0)=2ST(0)-1, ST(1)=int[y*log2(x)], ST(2)=zzz
    FLD1                         ;ST(0)=1, ST(1)=2ST(0)-1, ST(2)=int[y*log2(x)], ST(3)=zzz
    FADD                         ;add the 1 to ST(1) and POP ST(0), ST(0)=2ST(0), ST(1)=int[y*log2(x)], ST(2)=zzz
    FSCALE                       ;add the integer in ST(1) to the exponent of ST(0), effectively multiplying the content of ST(0) by 2int, and yielding the final result of x^y, ST(0)=x^y, ST(1)=int[
    FSTP st(1)                   ;the content of ST(1) has become useless
                                 ;overwrite the content of ST(1) with the result and POP ST(0)
                                 ;ST(0)=x^y, ST(1)=zzz

    inc esi
  .Until esi>=1000000


  invoke QueryPerformanceCounter, addr qe
  invoke QueryPerformanceFrequency, addr qf


  FILD qe
  FILD qs
  FSUBP

  FILD qf
  FDIVP

int 3

RET
MAIN ENDP


;*************************************************************************************


END MAIN


The code delivers correct results. Running it 1 mio. times in a loop takes about 1 second on my (old) laptop. Running the same code (x = -2.4^4) in a high level language takes about 0.1 seconds, that is 1/10 of the above code. Obviously there is method of doing it 10 times faster - how?

JK

jj2007

    FILD i4
;    FLD r1
    FLD r2
;    FLD r3

    FYL2X                        ;ST(0)=y*log2(x), ST(1)=zzz


At this point, you have a NAN on your FPU.

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
493 ms for JK's code yielding 33.178
423 ms for JJ's code yielding 33.178

490 ms for JK's code yielding 33.178
422 ms for JJ's code yielding 33.178

491 ms for JK's code yielding 33.178
427 ms for JJ's code yielding 33.178

494 ms for JK's code yielding 33.178
432 ms for JJ's code yielding 33.178

489 ms for JK's code yielding 33.178
425 ms for JJ's code yielding 33.178


This is 10 Million times, so it's a bit faster than the HLL. Even your (slightly modified) version is a factor 2 faster than the HLL, but of course, it's difficult to compare without having the HLL executable.

ExpXY(2.4, 4, result)          ; calculates 2.4^4 and saves it to a REAL8 variable called result

That's what I use :cool:

daydreamer

What hll?
Cpp for example use only float=real4 and double =real8
There is precision settings in fpu control word you can test
Newer math library can take advantage of SSE which is easier to perform several math ops simultaneously, instead of your scalar fpu
my none asm creations
https://masm32.com/board/index.php?topic=6937.msg74303#msg74303
I am an Invoker
"An Invoker is a mage who specializes in the manipulation of raw and elemental energies."
Like SIMD coding

jj2007

Quote from: daydreamer on July 31, 2022, 06:47:50 AMNewer math library can take advantage of SSE which is easier to perform several math ops simultaneously, instead of your scalar fpu

Interesting. Can you link to one that performs -2.4^4 using SIMD instructions? I'm curious.

JK

Thanks jj,

:sad: my bad, my code is buggy and a logarithm of a negative number results in a nan (as i should have learned in school). So the speed of code is ok, but my implementation definitely needs improvement.

What must i do, if x is negative in x^y (which basically is not forbidden)? But my implementation leads to calculating the logarithm of a negative number (which is forbidden).

JK   

jj2007

Quote from: JK on July 31, 2022, 07:43:24 AMmy code is buggy

Your code was almost perfect, but you forgot to pop the result, see fstp result.

QuoteWhat must i do, if x is negative in x^y (which basically is not forbidden)?

It is ok if your exponent is an integer. But launch the Windows calculator and try -2.4^3.999 :rolleyes:

HSE

Quote from: JK on July 31, 2022, 06:18:09 AM
Obviously there is method of doing it 10 times faster - how?

-2.4 * -2.4 * -2.4 *-2.4

Equations in Assembly: SmplMath

JK

Yes, i forgot to pop the FPU. If you google for -2.4^3.999 you get a calculator and as a result -33.1485667591 (which implies -(2.4^3.999, but (-2.4)^3.999 yields the same as a positve result), my old pocket calculator can do it too. Why the Windows calculator rejects it as "improper entry" - i donĀ“t know.

Mathematically -2.4^3.999 is not forbidden, so what must i do, if x is negative in x^y? Flip sign if negative, look at the exponent, if the exponent rounded to an integer mod 2 = 0 (that is, an even number) the result is positve, negative otherwise - is this correct?


@HSE: yes , i already thought of this, but what if the exponent is not an integer but a float?

JK

jj2007

Quote from: JK on July 31, 2022, 08:32:35 AMMathematically -2.4^3.999 is not forbidden, so what must i do, if x is negative in x^y?

Windows calculator:
-2.4^3.999999999999= -33.1776
-2.4^4= +33.1776


That an even integer exponent produces a positive number is a convention, not math. If your number is negative, flip the sign, perform the calc as usual, then flip the sign for the result.

jack

Windows calculator doesn't handle complex numbers, when you try to raise -2.4 ^ 3.999 it's the same as (-2.4) ^ 3.999 which give a complex number
-2.4 ^ 3.999 can be written as -(2.4 ^ 3.999) which is totally different from (-2.4) ^ 3.999

HSE

Quote from: JK on July 31, 2022, 08:32:35 AMBut what if the exponent is not an integer but a float?

Make another tests  :biggrin:
Equations in Assembly: SmplMath

NoCforMe

Don't get it. How is -2.4 different from (-2.4)? All those parens do is form a group, which so far as I can tell does nothing here. Or am I missing something?

And since when does -2.4 ^ 3.999 yield a complex number?

BTW, Windows calculator doesn't allow fractional exponents, at least so far as I could tell. (Gives me "invalid input".)
Assembly language programming should be fun. That's why I do it.

jack

hi  NoCforMe
the expression -2.4 is interpreted as the negative of 2.4, the - operator (sign) is not bound to the number 2.4
the expression -2.4 ^ 3.999 means the negative of 2.4 ^ 3.999 which is not complex but (-2.4) ^ 3.999 is complex
when you enter 2.4 in the calculator and then negate the number and then try to raise to some power it's as if you had entered (-2.4) ^ some power

NoCforMe

I still think you're wrong. Perfectly OK to take the power of a negative number: the sign of the result flips between positive and negative:

-2^2 = 4
-2^3 = -8
-2^4 = 16
-2^5 = -32

and so on.

So far as complex numbers go, I think you're thinking of roots, not powers, as in i = sqrt(-1).

Amiright?
Assembly language programming should be fun. That's why I do it.

TimoVJL

This might been here earlier, in year 2019 ?
#include <stdio.h>
#include <math.h>
int __cdecl main(void)
{
for (double d = 1.7; d < 2.3; d += 0.1)
printf("%f\n", pow(-1.23, d));
return 0;
}
nan
nan
nan
1.512900
nan
nan
May the source be with you