### Author Topic: Fast Exp approximation  (Read 1510 times)

#### jj2007

• Member
• Posts: 10636
• Assembler is fun ;-)
##### Re: Fast Exp approximation
« Reply #15 on: August 12, 2020, 09:04:50 PM »
I see this:

I actually meant "what kind of pattern do you see, and how could we use it to fumble mantissa and exponent"
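
One common reading of "fumbling mantissa and exponent" is range reduction: write x = k·ln 2 + r, so that exp(x) = 2^k · exp(r) and the 2^k factor only touches the exponent field. Below is a minimal Python sketch of that idea, not code from this thread. The helper name `exp_split` and the Taylor degree are assumptions made for illustration.

```python
import math

LN2 = math.log(2.0)

def exp_split(x, degree=13):
    """Hypothetical helper: exp(x) as 2**k * exp(r) with |r| <= ln(2)/2."""
    k = round(x / LN2)          # integer part, ends up in the exponent field
    r = x - k * LN2             # small remainder, handled by a polynomial
    # Horner evaluation of the Taylor series of exp(r) up to r**degree
    p = 1.0
    for n in range(degree, 0, -1):
        p = 1.0 + r * p / n
    return math.ldexp(p, k)     # multiply by 2**k by adjusting the exponent

for x in (-5.0, 0.5, 5.0):
    print(x, exp_split(x), math.exp(x))
```

The polynomial only ever sees |r| ≤ 0.35, which is why a short series is enough. A real implementation would more likely use a minimax or Chebyshev polynomial, or a table such as the exp2_table mentioned later.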

Quote
Btw..how you managed to create this curve window ? I like that a lot

See ArrayPlot in the help file. All you need is an array:

Event Paint
ArrayPlot MyData(), RgbCol(200, 255, 255, 0), lines=4
ArrayPlot exit, "Exponential function"

#### Siekmanski

• Member
• Posts: 2365
##### Re: Fast Exp approximation
« Reply #16 on: August 12, 2020, 09:05:31 PM »
Hi Guga,

You said: MY main concern is about precision.
So, a very fast Chebyshev polynomial approximation is out of the question?

For what purpose do you need the Exp function?
Creative coders use backward thinking techniques as a strategy.

#### guga

• Member
• Posts: 1344
• Assembly is a state of art.
##### Re: Fast Exp approximation
« Reply #17 on: August 12, 2020, 09:40:55 PM »
Hi Guga,

You said: MY main concern is about precision.
So, a very fast Chebyshev polynomial approximation is out of the question?

For what purpose do you need the Exp function?

Hi Marinus

It depends on the level of precision of the Chebyshev polynomial. I think it's useful for what we are doing. Even with some loss of precision, it seems to me it won't affect the final result.
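
To get a feeling for "the level of precision of the Chebyshev polynomial", here is a rough Python sketch that interpolates exp on [-1, 1] at Chebyshev nodes and measures the worst error for a few sizes. It is an illustration under my own assumptions (interval, node count, the helper names `cheb_coeffs`, `cheb_eval`, `max_error`), not Siekmanski's routine.

```python
import math

def cheb_coeffs(f, n):
    """Coefficients of the degree n-1 Chebyshev interpolant of f on [-1, 1]."""
    nodes = [math.cos(math.pi * (k + 0.5) / n) for k in range(n)]
    fx = [f(x) for x in nodes]
    return [
        (2.0 / n) * sum(fx[k] * math.cos(math.pi * j * (k + 0.5) / n)
                        for k in range(n))
        for j in range(n)
    ]

def cheb_eval(c, x):
    """Clenshaw recurrence; c[0] carries the usual factor 1/2."""
    b1 = b2 = 0.0
    for cj in reversed(c[1:]):
        b1, b2 = 2.0 * x * b1 - b2 + cj, b1
    return x * b1 - b2 + 0.5 * c[0]

def max_error(n, samples=2001):
    """Largest absolute error against math.exp on a uniform grid."""
    c = cheb_coeffs(math.exp, n)
    xs = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    return max(abs(cheb_eval(c, x) - math.exp(x)) for x in xs)

for n in (4, 8, 12):
    print(n, max_error(n))
```

The error drops quickly with each extra term, so the trade-off between speed and precision can be tuned by the number of coefficients.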

I was using the exp function to test the Lambert W function I mentioned in the other post. Even if we don't use it for the watermark remover, we can use it for other purposes in other image filters we might make. I tried the Laplace 2D algorithm that uses these exp and pow functions to compute the W function. After you explained it, I understand this Laplace thing better, but I want to try calculating sigma as the standard deviation of the whole image and see whether we need this Lambert algorithm or not.

But even if we don't need it, we can use a faster pow and exp to build other algorithms, such as a faster (and more accurate) CieLab, for example, or the other algorithm I tried last year (HSM or something).
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

#### guga

• Member
• Posts: 1344
• Assembly is a state of art.
##### Re: Fast Exp approximation
« Reply #18 on: August 12, 2020, 10:04:51 PM »
I see this:

I actually meant "what kind of pattern do you see, and how could we use it to fumble mantissa and exponent"

Quote
Btw..how you managed to create this curve window ? I like that a lot

See ArrayPlot in the help file. All you need is an array:

Event Paint
ArrayPlot MyData(), RgbCol(200, 255, 255, 0), lines=4
ArrayPlot exit, "Exponential function"

Hi JJ

You mean, the result ?

I took a look at the result and compared it with the same value in WolframAlpha and... to my surprise, my algo seems to be 100% accurate (at least up to the 13th digit)!

These are the results I got:

When using the input as Real8, I see this number:

exp(5) = 148.4131591025766

and in wolframalpha, the result is:

exp(5) = 148.4131591025766034211155800405522796234876675938789890467...

exp(-5) = 6.737946999085467e-3

in wolframalpha results in:

exp(-5) = 6.737946999085467096636048423148424248849585027355085430e-3
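
A comparison like this can also be automated. The sketch below counts how many significant digits of a Real8 result agree with a 50-digit reference. Note that it checks Python's own `math.exp`, not the routine discussed in this thread, and `matching_digits` is a hypothetical helper name.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50          # 50 significant digits for the reference

def matching_digits(x):
    """Approximate count of correct significant digits of math.exp(x)."""
    reference = Decimal(x).exp()        # correctly rounded to 50 digits
    computed = Decimal(math.exp(x))     # exact value of the Real8 result
    rel = abs((computed - reference) / reference)
    return float("inf") if rel == 0 else float(-rel.log10())

for x in (5, -5):
    print(x, repr(math.exp(x)), round(matching_digits(x), 1))
```

A Real8 holds roughly 15 to 17 significant decimal digits, so agreement on all 16 printed digits is about as good as the format allows.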

When i use the input as Real4(Float), i see this:

exp(5) = 148.4131591025766
exp(-5) = 6.737946999085467e-3

When i use the input as int, i see this:
exp(5) = 148.4131591025766
exp(-5) = 6.737946999085467e-3

This really surprises me, because in my initial tests the original version from Windows 10 lost precision after the 6th or 7th digit, but somehow I managed to fix this damn algo, regardless of the input format. So even an int or Real4 value will produce a precise result without loss.
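
One caveat on the input formats: 5 and -5 are exactly representable as int, Real4 and Real8, so all three inputs are the same number before exp is even called. A small Python sketch of the check (`to_real4` is a hypothetical helper) shows that a value such as 0.1 behaves differently:

```python
import struct

def to_real4(x):
    """Round a Python float (Real8) to the nearest Real4 and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

for value in (5.0, -5.0, 0.1):
    # True means the Real4 input carries exactly the same value as the Real8
    print(value, to_real4(value), to_real4(value) == value)
```

For a test of the Real4 path it is therefore safer to use inputs that are not small integers.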

#### nidud

• Member
• Posts: 1989
##### Re: Fast Exp approximation
« Reply #19 on: August 13, 2020, 01:29:54 AM »
I got 30 times "error A2071:initializer magnitude too large for specified size"

Then I switched to UAsm64 (see OPT_Assembler at the end), and everything was ok

@Nidud: with AsmC /Znk I get that error A2071

Well, it is an erroneous number according to the specification used:

dq 2.42294657314453310e-310

float.inc

DBL_MAX equ 1.7976931348623158e+308
DBL_MIN equ 2.2250738585072014e-308

From the regression test (float8.asm).

Since the exp2_table uses hex values, I assume they are still in range according to the above.

exp.asm
exp2_table.asm

However, ML and ML64 version 14 do not produce this error.
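
To see why the literal trips a range check, and how a hex table entry sidesteps the decimal parser, here is a short Python sketch. It classifies the value by its Real8 bit pattern and prints the raw bits. The helper names are mine, and the trailing `h` only mimics MASM-style hex notation.

```python
import struct
import sys

def real8_bits(x):
    """Raw 64-bit pattern of a Real8."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def is_subnormal(x):
    bits = real8_bits(x)
    exponent_field = (bits >> 52) & 0x7FF
    mantissa_field = bits & ((1 << 52) - 1)
    # Denormals have an all-zero exponent field and a non-zero mantissa
    return exponent_field == 0 and mantissa_field != 0

x = float("2.42294657314453310e-310")
print(x < sys.float_info.min, is_subnormal(x))
print(f"{real8_bits(x):016X}h")
```

The value is below DBL_MIN but still representable as a denormal, so whether an assembler accepts the decimal form depends on how strictly it checks the range.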

#### jj2007

• Member
• Posts: 10636
• Assembler is fun ;-)
##### Re: Fast Exp approximation
« Reply #20 on: August 13, 2020, 03:33:04 AM »
@Nidud: with AsmC /Znk I get that error A2071
Quote
Well, it is an erroneous number according to the specification used:

dq 2.42294657314453310e-310

...
However, ML and ML64 version 14 do not produce this error.

Indeed. And Olly reports 2.4229465731445330820e-310 in ST(0) after a fld with that number.

Code:
include \masm32\MasmBasic\MasmBasic.inc
ExpTable dq 2.42294657314453310e-310 ; invalid number for AsmC
  Init
  fld ExpTable
  fld FP10(1.0e310)
  fmul
  PrintLine "expected:  2.42294657314453310"
  PrintLine Str$("obtained:  %Jf", ST(0)v)
EndOfCode
Output:
Code:
expected:  2.42294657314453310
obtained:  2.422946573144533082
With the exponent changed to e-320, a certain loss of precision creeps in:
Code:
expected:  2.42294657314453310
obtained:  2.422897927205473053
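
The loss can be quantified: every Real8 denormal is an integer multiple of 2^-1074, so the number of usable bits shrinks as the value gets smaller. A rough Python sketch (`subnormal_steps` is a hypothetical helper):

```python
import math

def subnormal_steps(text):
    """How many multiples of 2**-1074 the nearest Real8 to `text` contains."""
    x = float(text)
    return int(math.ldexp(x, 1074))   # exact for denormal inputs

for text in ("2.42294657314453310e-310", "2.42294657314453310e-320"):
    steps = subnormal_steps(text)
    digits = math.log10(2 * steps)    # worst-case rounding error is half a step
    print(text, steps, steps.bit_length(), round(digits, 1))
```

At e-320 the stored value is 4904 steps of 2^-1074, which is about 2.4228979e-320 and matches the "obtained" digits above. At e-310 there are still 46 significant bits left.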

#### HSE

• Member
• Posts: 1399
• <AMD>< 7-32>
##### Re: Fast Exp approximation
« Reply #21 on: August 13, 2020, 05:12:28 AM »
DBL_MAX equ 1.7976931348623158e+308
DBL_MIN equ 2.2250738585072014e-308

That is Normalized range.

The Intel specification says that only 80-bit denormalized values can be loaded without raising an exception... I don't know, I have an AMD here.
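
For reference, the limits of the normalized range and the smallest denormal follow directly from the Real8 layout. The sketch below rebuilds them in Python. The name `DBL_TRUE_MIN` is borrowed from C11 and does not appear in float.inc as quoted.

```python
import math
import sys

DBL_MIN = math.ldexp(1.0, -1022)                         # smallest normalized
DBL_MAX = math.ldexp(2.0 - math.ldexp(1.0, -52), 1023)   # largest finite
DBL_TRUE_MIN = math.ldexp(1.0, -1074)                    # smallest denormal

print(DBL_MIN == sys.float_info.min, DBL_MAX == sys.float_info.max)
print(DBL_TRUE_MIN, DBL_TRUE_MIN / 2 == 0.0)
```

Everything between DBL_TRUE_MIN and DBL_MIN is the denormal range, which is where 2.42294657314453310e-310 sits.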

#### jj2007

• Member
• Posts: 10636
• Assembler is fun ;-)
##### Re: Fast Exp approximation
« Reply #22 on: August 13, 2020, 05:24:35 AM »
Yep, but the hardware allows much more: at e-320 you still have over 4 digits of precision, more than enough for most purposes.

#### nidud

• Member
• Posts: 1989
##### Re: Fast Exp approximation
« Reply #23 on: August 13, 2020, 06:23:14 AM »
The Masm compatible version (@Version) for Asmc is 8 and 10 for Asmc64, but even Masm v12 will reject this.

#### jj2007

• Member
• Posts: 10636
• Assembler is fun ;-)
##### Re: Fast Exp approximation
« Reply #24 on: August 13, 2020, 08:11:51 AM »
http://www.website.masmforum.com/tutorials/fptute/fpuchap4.htm
Quote
If the source is a denormalized number, a Denormal exception will be detected, setting the related flag in the Status Word. The value will still be loaded and normalized if possible.

This is exactly what you can observe with Olly when loading from somevar REAL8 2.42294657314453310e-320 - the D flag is set. The question is how to deal with it. IMHO UAsm does it right: accept denormal values. If a programmer insists on working in this exotic range, he should know how to handle the loss of precision. A warning would be ok, though.

#### HSE

• Member
• Posts: 1399
• <AMD>< 7-32>
##### Re: Fast Exp approximation
« Reply #25 on: August 13, 2020, 08:24:47 AM »

#### nidud

• Member
• Posts: 1989
##### Re: Fast Exp approximation
« Reply #26 on: August 13, 2020, 09:59:25 AM »
http://www.website.masmforum.com/tutorials/fptute/fpuchap4.htm
Quote
If the source is a denormalized number, a Denormal exception will be detected, setting the related flag in the Status Word. The value will still be loaded and normalized if possible.

This is exactly what you can observe with Olly when loading from somevar REAL8 2.42294657314453310e-320 - the D flag is set. The question is how to deal with it.

It's the creator's job (here, the assembler) to deal with this. If the FPU creates the number:

.data
f tbyte 2.42294657314453310e-310 ; legal input
x real8 0.0
.code
fld f
fstp x

The FPU will set the Precision and Underflow flags. If the Underflow Mask is set to 0, it instructs the FPU to generate an interrupt whenever that particular exception is detected, so that the program can take whatever action is deemed necessary before returning control to the FPU.
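
The two conditions behind those flags can be checked without touching the FPU: the result is tiny (below the normalized range) and inexact (rounding changed the value). The Python sketch below only models the rounding of the decimal literal to Real8. It does not read the x87 status word, and the variable names are mine.

```python
import sys
from decimal import Decimal

text = "2.42294657314453310e-310"
stored = float(text)                        # nearest Real8, a denormal

tiny = 0.0 < stored < sys.float_info.min    # below the normalized range
inexact = Decimal(stored) != Decimal(text)  # rounding changed the value

print("tiny:", tiny, "inexact:", inexact)
```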

Quote
IMHO UAsm does it right: accept denormal values.

Asmc will accept some of them, like NaN, but I'm a bit sceptical about this approach though.

Quote
If a programmer insists on working in this exotic range, he should know how to handle the loss of precision.

You think that is the case here?

movsd   xmm0,1.0
subsd   xmm0,x
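
As a side note on that expression, and assuming x holds a denormal such as the table value above, the subtraction cannot see it at all in Real8 arithmetic. A minimal Python check:

```python
x = float("2.42294657314453310e-310")   # denormal, as in the table entry
result = 1.0 - x                        # x is far below half an ulp of 1.0
print(result == 1.0)
```

Whether that is what the code intends is exactly the question being asked here.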

#### jj2007

• Member
• Posts: 10636
• Assembler is fun ;-)
##### Re: Fast Exp approximation
« Reply #27 on: August 13, 2020, 10:37:20 AM »

The hardware can handle the value, but AsmC refuses to accept it? Let's use UAsm then.

Seriously: if the risk of setting the Underflow flag bothers you, issue a warning. The case is so exotic that I wouldn't worry at all.

#### nidud

• Member
• Posts: 1989
##### Re: Fast Exp approximation
« Reply #28 on: August 13, 2020, 11:43:27 AM »

The hardware can handle the value, but AsmC refuses to accept it? Let's use UAsm then.

Sure, why not.

Quote
Seriously: if the risk of setting the Underflow flag bothers you, issue a warning. The case is so exotic that I wouldn't worry at all.

The assembler doesn't use the FPU (or SSE), so it doesn't set the underflow flag; rather, the function that converts the string sets errno to ERANGE. I assume the changes in Masm (and UAsm) have to do with external library functions rather than any deliberate changes to the assembler.

errno = 0; /* v2.11: init errno; errno is set on over- and under-flow */
double_value = strtod( inp, NULL );
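
The library dependence is easy to probe. The C standard leaves it implementation-defined whether strtod sets errno to ERANGE on underflow, so two assemblers built against different runtimes can disagree on the same literal. Below is a rough Python sketch that calls the C library's strtod through ctypes. It assumes a POSIX system where `ctypes.CDLL(None)` exposes libc, and `strtod_with_errno` is a hypothetical helper.

```python
import ctypes
import errno
import sys

# Assumption: a POSIX system where CDLL(None) exposes the C library.
libc = ctypes.CDLL(None, use_errno=True)
libc.strtod.restype = ctypes.c_double
libc.strtod.argtypes = [ctypes.c_char_p, ctypes.c_void_p]

def strtod_with_errno(text):
    ctypes.set_errno(0)                     # same init as in the snippet above
    value = libc.strtod(text.encode("ascii"), None)
    return value, ctypes.get_errno()

value, err = strtod_with_errno("2.42294657314453310e-310")
print(value, err == errno.ERANGE, value < sys.float_info.min)
```

If the second field prints True, that runtime reports the denormal as a range error even though it returns a usable value, and an assembler that treats any ERANGE as fatal would emit A2071.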

#### guga

• Member
• Posts: 1344
• Assembly is a state of art.
##### Re: Fast Exp approximation
« Reply #29 on: August 13, 2020, 12:40:45 PM »
DBL_MAX equ 1.7976931348623158e+308
DBL_MIN equ 2.2250738585072014e-308

That is Normalized range.

The Intel specification says that only 80-bit denormalized values can be loaded without raising an exception... I don't know, I have an AMD here.

Thank you so much for the equates, HSE