Exponentiation

JK · July 31, 2022, 06:18:09 AM

I want do exponentiation of floats, so i had a look at Raymonds FPU tutorial and found sample code. This is what i use for testing:

Code Select


include <windows.inc>
includelib kernel32.lib


.data

r1 real10  2.0
r2 real10 -2.4
r3 real10 -100.123
i4 dword   4


qf qword ?              
qs qword ?              
qe qword ?              


.code


MAIN PROC 
;*************************************************************************************
; main proc
;*************************************************************************************


  invoke QueryPerformanceCounter, addr qs


  FINIT

  xor esi, esi
  .Repeat

    FILD i4
;    FLD r1
    FLD r2
;    FLD r3

    FYL2X                        ;ST(0)=y*log2(x), ST(1)=zzz
    FLD st(0)                    ;make a second copy, ST(0)=y*log2(x), ST(1)=y*log2(x), ST(2)=zzz
    FRNDINT                      ;round it to an integer, ST(0)=int[y*log2(x)], ST(1)=y*log2(x), ST(2)=zzz
    FSUB st(1), st(0)            ;this will leave only a fractional portion in ST(1), ST(0)=int[y*log2(x)], ST(1)=y*log2(x)-int[y*log2(x)], ST(2)=zzz
    FXCH                         ;ST(0)=y*log2(x)-int[y*log2(x)], ST(1)=int[y*log2(x)], ST(2)=zzz
    F2XM1                        ;get the fractional power of 2 (minus 1), ST(0)=2ST(0)-1, ST(1)=int[y*log2(x)], ST(2)=zzz
    FLD1                         ;ST(0)=1, ST(1)=2ST(0)-1, ST(2)=int[y*log2(x)], ST(3)=zzz
    FADD                         ;add the 1 to ST(1) and POP ST(0), ST(0)=2ST(0), ST(1)=int[y*log2(x)], ST(2)=zzz
    FSCALE                       ;add the integer in ST(1) to the exponent of ST(0), effectively multiplying the content of ST(0) by 2int, and yielding the final result of x^y, ST(0)=x^y, ST(1)=int[
    FSTP st(1)                   ;the content of ST(1) has become useless
                                 ;overwrite the content of ST(1) with the result and POP ST(0)
                                 ;ST(0)=x^y, ST(1)=zzz

    inc esi
  .Until esi>=1000000


  invoke QueryPerformanceCounter, addr qe
  invoke QueryPerformanceFrequency, addr qf


  FILD qe
  FILD qs
  FSUBP

  FILD qf
  FDIVP

int 3

RET
MAIN ENDP


;*************************************************************************************


END MAIN

The code delivers correct results. Running it 1 mio. times in a loop takes about 1 second on my (old) laptop. Running the same code (x = -2.4^4) in a high level language takes about 0.1 seconds, that is 1/10 of the above code. Obviously there is method of doing it 10 times faster - how?

JK

jj2007 · July 31, 2022, 06:45:33 AM

Code Select

    FILD i4
;    FLD r1
    FLD r2
;    FLD r3

    FYL2X                        ;ST(0)=y*log2(x), ST(1)=zzz

At this point, you have a NAN on your FPU.

Code Select

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
493 ms for JK's code yielding 33.178
423 ms for JJ's code yielding 33.178

490 ms for JK's code yielding 33.178
422 ms for JJ's code yielding 33.178

491 ms for JK's code yielding 33.178
427 ms for JJ's code yielding 33.178

494 ms for JK's code yielding 33.178
432 ms for JJ's code yielding 33.178

489 ms for JK's code yielding 33.178
425 ms for JJ's code yielding 33.178

This is 10 Million times, so it's a bit faster than the HLL. Even your (slightly modified) version is a factor 2 faster than the HLL, but of course, it's difficult to compare without having the HLL executable.

ExpXY(2.4, 4, result) ; calculates 2.4^4 and saves it to a REAL8 variable called result

That's what I use

daydreamer · July 31, 2022, 06:47:50 AM

What hll?
Cpp for example use only float=real4 and double =real8
There is precision settings in fpu control word you can test
Newer math library can take advantage of SSE which is easier to perform several math ops simultaneously, instead of your scalar fpu

jj2007 · July 31, 2022, 07:11:53 AM

Quote from: daydreamer on July 31, 2022, 06:47:50 AMNewer math library can take advantage of SSE which is easier to perform several math ops simultaneously, instead of your scalar fpu

Interesting. Can you link to one that performs -2.4^4 using SIMD instructions? I'm curious.

JK · July 31, 2022, 07:43:24 AM

Thanks jj,

my bad, my code is buggy and a logarithm of a negative number results in a nan (as i should have learned in school). So the speed of code is ok, but my implementation definitely needs improvement.

What must i do, if x is negative in x^y (which basically is not forbidden)? But my implementation leads to calculating the logarithm of a negative number (which is forbidden).

JK

jj2007 · July 31, 2022, 07:48:20 AM

Quote from: JK on July 31, 2022, 07:43:24 AMmy code is buggy

Your code was almost perfect, but you forgot to pop the result, see fstp result.

QuoteWhat must i do, if x is negative in x^y (which basically is not forbidden)?

It is ok if your exponent is an integer. But launch the Windows calculator and try -2.4^3.999

HSE · July 31, 2022, 08:14:23 AM

Quote from: JK on July 31, 2022, 06:18:09 AM
Obviously there is method of doing it 10 times faster - how?

Code Select

-2.4 * -2.4 * -2.4 *-2.4

JK · July 31, 2022, 08:32:35 AM

Yes, i forgot to pop the FPU. If you google for -2.4^3.999 you get a calculator and as a result -33.1485667591 (which implies -(2.4^3.999, but (-2.4)^3.999 yields the same as a positve result), my old pocket calculator can do it too. Why the Windows calculator rejects it as "improper entry" - i don´t know.

Mathematically -2.4^3.999 is not forbidden, so what must i do, if x is negative in x^y? Flip sign if negative, look at the exponent, if the exponent rounded to an integer mod 2 = 0 (that is, an even number) the result is positve, negative otherwise - is this correct?

@HSE: yes , i already thought of this, but what if the exponent is not an integer but a float?

JK

jj2007 · July 31, 2022, 08:58:38 AM

Quote from: JK on July 31, 2022, 08:32:35 AMMathematically -2.4^3.999 is not forbidden, so what must i do, if x is negative in x^y?

Windows calculator:

Code Select

-2.4^3.999999999999=	-33.1776
-2.4^4=			+33.1776

That an even integer exponent produces a positive number is a convention, not math. If your number is negative, flip the sign, perform the calc as usual, then flip the sign for the result.

jack · July 31, 2022, 09:14:27 AM

Windows calculator doesn't handle complex numbers, when you try to raise -2.4 ^ 3.999 it's the same as (-2.4) ^ 3.999 which give a complex number
-2.4 ^ 3.999 can be written as -(2.4 ^ 3.999) which is totally different from (-2.4) ^ 3.999

HSE · July 31, 2022, 11:16:04 AM

Quote from: JK on July 31, 2022, 08:32:35 AMBut what if the exponent is not an integer but a float?

Make another tests

NoCforMe · July 31, 2022, 11:20:13 AM

Don't get it. How is -2.4 different from (-2.4)? All those parens do is form a group, which so far as I can tell does nothing here. Or am I missing something?

And since when does -2.4 ^ 3.999 yield a complex number?

BTW, Windows calculator doesn't allow fractional exponents, at least so far as I could tell. (Gives me "invalid input".)

jack · July 31, 2022, 11:56:37 AM

hi NoCforMe
the expression -2.4 is interpreted as the negative of 2.4, the - operator (sign) is not bound to the number 2.4
the expression -2.4 ^ 3.999 means the negative of 2.4 ^ 3.999 which is not complex but (-2.4) ^ 3.999 is complex
when you enter 2.4 in the calculator and then negate the number and then try to raise to some power it's as if you had entered (-2.4) ^ some power

NoCforMe · July 31, 2022, 01:33:54 PM

I still think you're wrong. Perfectly OK to take the power of a negative number: the sign of the result flips between positive and negative:

-2^2 = 4
-2^3 = -8
-2^4 = 16
-2^5 = -32

and so on.

So far as complex numbers go, I think you're thinking of roots, not powers, as in i = sqrt(-1).

Amiright?

TimoVJL · July 31, 2022, 02:03:29 PM

This might been here earlier, in year 2019 ?

Code Select

#include <stdio.h>
#include <math.h>
int __cdecl main(void)
{
	for (double d = 1.7; d < 2.3; d += 0.1)
		printf("%f\n", pow(-1.23, d));
	return 0;
}

Code Select

nan
nan
nan
1.512900
nan
nan

The MASM Forum

News:

Exponentiation

JK

jj2007

daydreamer

jj2007

JK

jj2007

HSE

JK

jj2007

jack

HSE

NoCforMe

jack

NoCforMe

TimoVJL