Equivalence angle conversion in SSE2

HSE · July 20, 2023, 09:47:51 PM

Hi Guga!

Quote from: guga on July 19, 2023, 05:15:28 PM
I'm not able to make it work for larger numbers such as 33e17, 12.56e100, etc.

Where do you get these unrealistic angles? It's possible that the previous step is not so smart.

guga · July 20, 2023, 10:03:08 PM

Siekmanski

I guess I found the limit for our SSE routines.

Whenever the input number is bigger than 8e11, it fails. The same happens for values close to 8e11, such as 7.999999999e11. So it seems that 7e11 is a safe limit for calculating this way.

I'm trying another solution for really huge numbers, such as 1.4564457e100 or 1.44654e900, etc. I'll try to do it with tables after calculating the log10 of each one of them. It won't be pretty, and it may slow the algo down a bit, but perhaps it will work with reasonably good precision. I'll try to adapt the SSE log routine we did a long time ago and see what happens.

guga · July 20, 2023, 10:07:49 PM

Quote from: HSE on July 20, 2023, 09:47:51 PM
Hi Guga!

Quote from: guga on July 19, 2023, 05:15:28 PM
I'm not able to make it work for larger numbers such as 33e17, 12.56e100, etc.

Where do you get these unrealistic angles? It's possible that the previous step is not so smart.

Hi HSE

Those numbers are for testing only. I created an SSE version of tangent and realized that someone might input a really high value such as the ones I posted (angles of 33e17, 12.56e100, etc.).

I wanted to know if there was a way to calculate the tangent of angles that go beyond the limits of 360º. It works fine according to WolframAlpha, but it does have some limitations in both my version and Siekmanski's. (The limit seems to be around 8e11, so anything below 7e11 would be safe to calculate using SSE2.) In JJ's version this limit is a bit higher because it uses TenByte (80 bits) for the scope, but it may also lose precision and will end up with the same limitation as the SSE2 version.

In WolframAlpha, all it does to calculate the tangent of huge numbers is:
1st - convert the value to the equivalent angle inside 360º
2nd - calculate the tangent of this equivalent angle

I'm trying to do the same thing, but I'm facing some limitations using SSE2. Perhaps I've found a solution, as I mentioned to Siekmanski, but I'm still working on it to see what happens.

For example, in WolframAlpha, if you input 33e17 degrees, it converts it to the equivalent angle, which is around 240º:
https://www.wolframalpha.com/input?i=33e17+degrees
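
A minimal Python sketch of those two steps, assuming the input is a double holding degrees. The helper names `equivalent_angle` and `tan_degrees` are made up for illustration and do not come from any routine in this thread. `math.fmod` returns the exact remainder of the double it is given, so any error comes from the stored input value, not from the reduction itself. 33e17 happens to be exactly representable as a double, which is why the result is exact here.

```python
import math

def equivalent_angle(deg: float) -> float:
    """Reduce an angle in degrees to the range [0, 360)."""
    r = math.fmod(deg, 360.0)   # exact remainder, keeps the sign of deg
    if r < 0.0:
        r += 360.0              # move negative remainders into [0, 360)
        if r >= 360.0:          # a tiny negative r can round up to 360.0
            r = 0.0
    return r

def tan_degrees(deg: float) -> float:
    """Tangent of an angle in degrees, reduced to [0, 360) first."""
    return math.tan(math.radians(equivalent_angle(deg)))

print(equivalent_angle(33e17))   # 240.0
print(tan_degrees(33e17))        # close to sqrt(3), since tan(240º) = tan(60º)
```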

HSE · July 20, 2023, 10:20:37 PM

Quote from: guga on July 20, 2023, 10:07:49 PM
input

The input to these calculations is wrong. Something is spinning around billions and billions of times.

jj2007 · July 20, 2023, 10:22:01 PM

Hi Guga,

Can you explain the practical relevance of translating an angle of 33e17 to its 0...360° equivalent? Do you have a specific project that requires this?

Using the FPU:
8.704100e+02 = 150.4
1.234560e+06 = 120.0
1.234560e+08 = 120.0
1.234560e+10 = 120.0
1.234560e+12 = 120.0
1.234560e+14 = 120.0

Using SIMD instructions:
8.704100e+02 = 150.4
1.234560e+06 = 120.0
1.234560e+08 = 120.1
1.234560e+10 = 129.9
1.234560e+12 = 27.65
1.234560e+14 = 244.8

I put everything in a macro now, see attached source, so that I can switch easily between SIMD and FPU mode.
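
As a reference point, the same inputs can be reduced with an exact floating-point remainder. This is a Python sketch, not the macro from the attachment. The five `1.23456eN` inputs are whole numbers below 2^53, so a double stores them exactly and the remainder modulo 360 is exactly 120, which agrees with the FPU column above.

```python
import math

inputs = [8.7041e2, 1.23456e6, 1.23456e8, 1.23456e10, 1.23456e12, 1.23456e14]
reduced = [math.fmod(x, 360.0) for x in inputs]   # exact remainder of each double

for x, r in zip(inputs, reduced):
    print(f"{x:.6e} = {r:.4g}")
```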

guga · July 20, 2023, 11:29:16 PM

Hi JJ

I'm creating a variation of libm_sse2_tan_precise from msvcrt which is fast and also precise. It works fine for small numbers (i.e. inside the range of 360º, no matter whether the input is a negative or positive angle). The problem is that there are some angles (positive or negative) larger than that which may also need to be calculated, such as equivalent angles beyond those limits like 870.41º, or other really weird values such as 33e11, 5.589e17, 5.9587e150, etc.

I realize that numbers higher than 360º are unlikely to be used, but since WolframAlpha succeeded in calculating the tangent of very weird values with very good precision, I'm trying to do the same while maintaining speed and precision as well.

I'll give your new file a test, but I'm trying to do it for SSE2 only, since I haven't implemented SSE4 for RosAsm yet and there are still people who don't have a newer processor that handles SSE4, so I can't yet create a variation that handles both.

Btw, I'm doing these tests first for positive numbers beyond the limits, and then I'll see if it is OK to do the same for negative angles. Siekmanski's test works fine, but not for negative angles, where it seems not to calculate the equivalent angle properly. The same goes for my original version, which also fails to handle the equivalent angle when the input is negative. That's why I'm trying positive numbers first and will later do the same for negative numbers, i.e. also find the equivalent angle when the input is negative (as WolframAlpha does).
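
A small Python sketch of why negative inputs behave differently, under the assumption that the reduction is computed as a remainder after dividing by 360. The names `wrap_trunc` and `wrap_floor` are hypothetical. The two variants agree for positive angles and differ only in which way the quotient is rounded for negative ones.

```python
import math

def wrap_trunc(deg: float) -> float:
    """Remainder with the quotient truncated toward zero (sign follows deg)."""
    return math.fmod(deg, 360.0)

def wrap_floor(deg: float) -> float:
    """Remainder with the quotient rounded toward minus infinity."""
    return deg - 360.0 * math.floor(deg / 360.0)

for deg in (870.41, -870.41, -30.0):
    print(deg, wrap_trunc(deg), wrap_floor(deg))
```

With truncation, -30º stays at -30º; with floor it becomes 330º, which is the equivalent angle inside 0...360º.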

HSE · July 20, 2023, 11:42:27 PM

If you know what precision you need in the resulting angle, you can work out the maximum valid input.

Raymond must know the exact value. But for example, if you want integer angles, any REAL8 number bigger than 1.0e17 doesn't contain the final angle.

Quote from: guga on July 20, 2023, 11:29:16 PM
5.9587e150

If this number is a REAL8, it doesn't hold any information about the angle you need.
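
To put numbers on this, here is a short Python sketch, assuming REAL8 means an IEEE 754 double. `math.ulp` (Python 3.9 or later) returns the gap between a double and the next larger one. Once that gap is bigger than 1, whole degrees can no longer be told apart, and once it is bigger than 360, neighbouring doubles are more than a full turn apart.

```python
import math

for x in (2.0**53, 1.0e17, 5.9587e150):
    print(f"{x:.4e}: neighbouring doubles are {math.ulp(x):.4g} apart")
```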

HSE · July 20, 2023, 11:54:45 PM

Quote from: jj2007 on July 20, 2023, 10:22:01 PM
1.234560e+08 = 120.0
1.234560e+10 = 120.0
1.234560e+12 = 120.0
1.234560e+14 = 120.0

These numbers don't contain the angle. You need to fill in all the fractional positions (unless you know that all the missing positions are zero).

Quote from: jj2007 on July 20, 2023, 10:22:01 PM
frndint

By default the FPU rounds to nearest, not to floor. You have to change that. For example, Biterider's macro:

; ——————————————————————————————————————————————————————————————————————————————————————————————————
; Macro:      fInt
; Purpose:    Calculate the integer part of the content of st0.
; Arguments:  None.
; Return:     Nothing.

fInt macro
  sub xsp, @WordSize                                    ;;Reserve stack place for one word
  fstcw WORD ptr [xsp]                                  ;;Store FPU control word
  push XWORD ptr [xsp]                                  ;;Duplicate value
  BitSet WORD ptr [xsp], (BIT10 or BIT11)               ;;Modify the control word, int(x) = Truncate (toward 0)
  fldcw WORD ptr [xsp]                                  ;;Load modified FPU control word
  frndint                                               ;;Round to integer (truncates toward 0 under the modified control word)
  fldcw WORD ptr[xsp + @WordSize]                       ;;Restore previous FPU control word
  add xsp, 2*@WordSize                                  ;;Restore stack. Don't use pop xax. We'll not destroy it.
endm
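
The effect of the rounding mode on the reduction can be sketched in Python. This is an illustration, not a translation of the macro: the built-in `round` (round half to even) stands in for the FPU default mode, `math.trunc` for the truncating mode the macro selects, and `reduce_with` is a made-up helper name.

```python
import math

def reduce_with(rounding, deg: float) -> float:
    """Compute deg - 360*n, where n is deg/360 rounded by `rounding`."""
    return deg - 360.0 * rounding(deg / 360.0)

deg = 350.0
print(reduce_with(round, deg))        # round to nearest: n = 1, result -10.0
print(reduce_with(math.trunc, deg))   # truncate toward zero: n = 0, result 350.0
```

With round-to-nearest, any angle in the upper half of a turn comes out negative, which is why the control word has to be changed before `frndint` when a 0...360º result is wanted.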

jj2007 · July 21, 2023, 01:36:52 AM

Quote from: guga on July 20, 2023, 11:29:16 PM
Btw, I'm doing these tests first for positive numbers beyond the limits, and then I'll see if it is OK to do the same for negative angles.

Quote from: jj2007 on July 20, 2023, 11:43:20 AM
P.S.: In the source, change the third roundsd operator to 11. Both 9 and 11 work, but the result will be different for negative inputs. See the RoundSD thread for explanations.

Quote from: HSE on July 20, 2023, 11:54:45 PM
Quote from: jj2007 on July 20, 2023, 10:22:01 PM
frndint

By default the FPU rounds to nearest, not to floor. You have to change that.

Init
Read i$()
Cls 3
push 360
FpuSet MbDown64

HSE · July 21, 2023, 02:21:46 AM

Quote from: jj2007 on July 21, 2023, 01:36:52 AM
FpuSet MbDown64

raymond · July 21, 2023, 06:13:31 AM

Precision, precision, precision, ......

The precision available with the equipment using "floating point" in currently available computers has a specific limit. For REAL4, it is the equivalent to 7 significant digits. For example, even if you know the square root of 2.0 would be equal to

1.41421 35623 73095 04880 16887 24209 69807 85696 71875 37694 80731 76679 73799 07324 78462 10703 88503 87534 32764 15727

a REAL4 would only provide it at best as 1.4142136 even if you tried to feed it manually with the 100 decimal digits.

Before wasting time trying to handle astronomical numbers of angles with the SSE, one should examine the necessity of converting such numbers having more than 7 digits in the integer portion, i.e. numbers in degrees expressed in scientific notation as >9.0e7.
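
The REAL4 limit is easy to observe with a short Python sketch. It assumes REAL4 is an IEEE 754 single and uses the standard `struct` module to round a double to single precision and widen it back; `to_real4` is a made-up helper name.

```python
import math
import struct

def to_real4(x: float) -> float:
    """Round a Python float (REAL8) to the nearest REAL4 and widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

print(repr(to_real4(math.sqrt(2.0))))   # only the leading digits match sqrt(2)
print(to_real4(90000001.0))             # 90000000.0: the trailing 1 does not fit
```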

Where and when would such numbers be generated???

NoCforMe · July 21, 2023, 06:27:07 AM

But why limit yourself to 7 digits of precision when you have REAL10s available?

jj2007 · July 21, 2023, 07:26:14 AM

This is a strange use case, and I am still waiting for a convincing answer from Guga why he needs that.

Generally speaking, the Windows x64 ABI does allow the use of the FPU. From what I saw in The Laboratory in the past, the FPU is a) much more precise but b) not slower than SIMD code. The only convincing reason to not use it is if you have lots of data that you can process in parallel. That is the only area where SIMD shines.

six_L · July 21, 2023, 06:08:51 PM

Hi, all
interesting...
If the laser weapon is to hit the missile, what precision is needed in the following picture?
R=500km
α=α0+0.000000000000001°
β=β0-0.000000000000001°
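
A rough Python estimate of what such a step means, using only the figures above plus one assumption: the base angle of 45º is hypothetical and is there only to show how a REAL8 treats the increment.

```python
import math

R = 500_000.0                        # range in metres (500 km)
delta_deg = 1e-15                    # angular step from the post, in degrees
miss = R * math.radians(delta_deg)   # arc length = radius * angle in radians
print(f"{miss:.3e} m")               # on the order of picometres

alpha0 = 45.0                        # hypothetical base angle, not from the post
print(alpha0 + delta_deg == alpha0)  # True: the step is below REAL8 resolution at 45
```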

mineiro · July 21, 2023, 09:42:46 PM

This is one example where you may need higher precision:

Below is a really "big" string:
m i n e i r o
6dh,69h,6eh,65h,69h,72h,6fh
109,105,110,101,105,114,111

1/x + uint = 1/x +uint = 1/x + uint = ...

Example for the letters "mi": the first value gets +1 added, the following ones do not.
1/(1/(109+1) + 105) = 0,009522985

Getting the "encoded" values back, LIFO (last in, first out):
1/0,009522985 = 105,009091162 - 105 = 0,009091162
1/0,009091162 = 109,996939885 - 109 = ...
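
The drift in the last line (109,99... instead of 110) comes from the rounded decimals. A Python sketch with exact rationals shows the round trip without any loss. It reads the encoding as 1/(1/(109+1) + 105), which is what reproduces the value 0,009522985, and it assumes every byte is at least 1; `encode` and `decode` are made-up names.

```python
from fractions import Fraction

def encode(data: bytes) -> Fraction:
    """Fold bytes into one rational: x = 1/(first+1), then x = 1/(x + next)."""
    x = Fraction(1, data[0] + 1)
    for b in data[1:]:
        x = 1 / (x + b)
    return x

def decode(x: Fraction, n: int) -> bytes:
    """Undo encode() for n bytes; values come back last in, first out."""
    out = []
    for _ in range(n - 1):
        inv = 1 / x
        b = int(inv)          # integer part is the byte
        out.append(b)
        x = inv - b           # fractional part carries the earlier bytes
    out.append(int(1 / x) - 1)  # the first byte was stored with +1
    return bytes(reversed(out))

code = encode(b"mineiro")
print(code)                   # exact numerator/denominator
print(decode(code, 7))        # b'mineiro'
```

The numerator and denominator grow with every byte, which is where a bignum library such as GMP comes in.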

On Linux I'm using the GNU Multiple Precision Arithmetic Library (bignum). The only limit is computer memory.
https://gmplib.org/

The MASM Forum
