Equivalence angle conversion in SSE2

guga · July 25, 2023, 10:16:25 AM

Quote from: NoCforMe on July 25, 2023, 09:34:51 AM
OK, thanks for answer. But you didn't address one thing:

The FPU uses 80-bit numbers; why are you restricting yourself to 64 bits?

JJ's FPU solution is elegant, but it seems to lose precision at some point. Using the SSE4 opcode seems to fix this particular problem and also seems to be faster, but I haven't implemented it yet to check.

HSE · July 25, 2023, 10:20:01 AM

Quote from: jj2007 on July 25, 2023, 10:10:40 AM
I can set that register, but results don't change

~~I can't set the register~~, but it happens that the result is 220

JBasic error is with https://masm32.com/board/index.php?msg=121818

Later: setting the register was the other way around, ST-LD

File for Masm64 SDK, just in case

jj2007 · July 25, 2023, 10:41:18 AM

Quote from: guga on July 25, 2023, 10:16:25 AM
JJ´s solution on FPU is elegant, but, seems to lead to some lack of precision at some point.

Where?

guga · July 25, 2023, 11:32:11 AM

Hi JJ

I haven't fully tested it with the FPU yet, but wasn't that the precision issue you talked about in answer #12?

Quote
If you use the second version of my proggie (which is SIMD, not FPU), you will see that 33.0e17 as input will not work. However, 33.0e10 will indeed work. As HSE rightly noted, the issue is precision.

I haven't tested it since then to see whether the precision issue you were talking about is resolved.

NoCforMe · July 25, 2023, 01:36:27 PM

Quote
... (which is SIMD, not FPU) ...

guga · July 25, 2023, 02:54:02 PM

Oops, thanks for the clarification, NoCforMe. Sorry, JJ, I misread it.

daydreamer · July 25, 2023, 05:54:14 PM

It doesn't need to be AND 0fffh; you can use any 2^x-1 mask that works for your problem.
I myself used AND 1023 with UV coordinates for drawing a 1024*1024 tiled water texture.
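
The masking trick described above can be sketched as follows, in Python rather than assembly for brevity. For non-negative integers, AND with 2^x-1 gives the same result as a modulo by 2^x. It does not carry over to 360, which is not a power of two. The function name `wrap_pow2` is made up for this illustration.

```python
def wrap_pow2(value: int, size: int) -> int:
    """Wrap a non-negative integer into [0, size), where size is a power of two."""
    # A power of two has exactly one bit set, so size & (size - 1) is zero.
    if size <= 0 or size & (size - 1) != 0:
        raise ValueError("size must be a power of two")
    # size - 1 is a mask of all the low bits below that single set bit.
    return value & (size - 1)


print(wrap_pow2(1500, 1024))  # 476, same as 1500 % 1024
print(wrap_pow2(4097, 4096))  # 1, the 0fffh mask case
```

The guard matters: with a size such as 360 the mask would silently produce wrong results, so the sketch rejects it instead.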

HSE · July 25, 2023, 10:11:11 PM

Quote from: guga on July 25, 2023, 09:31:02 AM
to avoid extra checking on a RGB to CieLCH function

Then that is where the error is, and where it must be corrected. Trying to manage the error later, inside a simple modulus calculation, is just foolish.

(Somebody must remind me of this from time to time.)

jj2007 · July 26, 2023, 06:41:00 PM

I've added a new macro, MbMod(number, divisor) to MasmBasic:

include \masm32\MasmBasic\MasmBasic.inc
TestData REAL8 870.41, 33.0e17, 1.11116e15, 2.2222e16, 3.3333e17, 4.4444e18,
  5.5555e19, 6.6666e20, 7.7777e21, 8.8888e22
  Init
  mov esi, offset TestData
  PrintLine CrLf$, "Source mod 360"
  .Repeat
    Print Str$("%9f", REAL8 ptr [esi]), Str$("\t%4f\n", MbMod(REAL8 ptr [esi], 360))
    add esi, REAL8
  .Until esi>=offset TestData+sizeof TestData
  Inkey
EndOfCode

The data marked in red are those proposed by Guga in the first post.

Output:

Source          mod 360
870.410000      150.4
3.30000000e+18  240.0
1.11116000e+15  200.0
2.22220000e+16  280.0
3.33330000e+17  240.0
4.44440000e+18  200.0
5.55550000e+19  160.0
6.66660000e+20  120.0
7.77770000e+21  1.489e+12
8.88880000e+22  4.517e+11

Note that the limit for correct results is about 4.72e21. You can use the Windows Calculator to verify that; it uses a BigNum lib (kind of...).
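
As an independent cross-check of the table above, here is a minimal Python sketch that computes the remainder with exact rational arithmetic and compares it with the C library style `math.fmod`. It is not how MbMod works internally (the thread does not show that), and the helper name `exact_mod` is made up for this illustration. Only inputs up to 6.6666e20 are used, since those decimal values happen to be exactly representable as REAL8.

```python
import math
from fractions import Fraction


def exact_mod(x: float, m: int = 360) -> float:
    """Remainder of the exact value stored in the double x, via rationals."""
    # Fraction(x) converts the double without any rounding.
    return float(Fraction(x) % m)


test_data = (870.41, 33.0e17, 1.11116e15, 2.2222e16,
             3.3333e17, 4.4444e18, 5.5555e19, 6.6666e20)

for x in test_data:
    # Both columns should agree, because fmod is exact for doubles.
    print(x, exact_mod(x), math.fmod(x, 360.0))
```

For 33.0e17 both columns give 240.0, matching the second row of the table.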

raymond · July 27, 2023, 04:16:22 AM

Sorry but this is getting TOTALLY ABSURD from a MATHEMATICAL point of view. Might as well use the following as an example:

1.234560e16 mod 360 = 120 for ALL the integers between 12345600000000000 and 12345601000000000.

With REAL4, the maximum number of significant digits it can offer is 7. The above example source could easily be generated as a REAL4; in fact, REAL4 can represent numbers up to 3.4e38, but still with only a maximum of 7 significant digits.

The only way a modulo of 360 will make any sense is if the integer portion of the source contains all significant digits. Fractional digits (when the number is displayed without exponents) would have no importance if a precision of +/-1 degree is sufficient.

Furthermore, loading a number generated as a REAL4 onto the FPU to be used for REAL8 or REAL10 computations will NOT improve the precision of results even though those are capable of better accuracy when fed with more accurate data.

Considering REAL8 data, this can be accurate to 15 significant digits AS LONG AS all input data were also accurate to at least 15 significant digits. Although REAL8 can handle numbers up to 1.79e308, using it to get a modulo 360 with an accuracy of +/-1 degree would still be limited to 1.0e16 (or less depending on its own accuracy) for the source.
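
The precision argument above can be illustrated with a short Python sketch (Python floats are REAL8; the REAL4 rounding is simulated with `struct`). The helper name `to_real4` is made up for this illustration, and `math.ulp` needs Python 3.9 or later.

```python
import math
import struct


def to_real4(x: float) -> float:
    """Round x to the nearest single-precision value, returned as a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


x = 1.23456e16
as_real4 = to_real4(x)

# Gap between adjacent REAL8 values near x: already 2.0 at this magnitude.
print(math.ulp(x))

# Error introduced by storing x as REAL4; it can be as large as 2**29.
print(abs(as_real4 - x))

# Promoting the REAL4 value back to REAL8 does not recover the lost digits,
# so the two remainders generally differ.
print(math.fmod(x, 360.0), math.fmod(as_real4, 360.0))
```

The first line of output shows why roughly 1.0e16 is the practical ceiling for a +/-1 degree result with REAL8: beyond 2^53 (about 9.0e15), neighbouring values are more than one unit apart.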

jj2007 · July 27, 2023, 06:40:50 AM

Dear Ray,

I stand corrected, you are right, of course. Here is an example that respects the precision limits:

include \masm32\MasmBasic\MasmBasic.inc    ; thread
  SetGlobals fct:REAL8
  Init
  PrintLine CrLf$, "Source              mod 360"
  For_ fct=1234567890000350.0 To 1234567890000366.0 Step 1.0
    Print Str$("%Hf", fct), Str$("\t%4f\n", MbMod(fct, 360))
  Next
  Inkey
EndOfCode

Output:

Source                mod 360
1234567890000350.0      350.0
1234567890000351.0      351.0
1234567890000352.0      352.0
1234567890000353.0      353.0
1234567890000354.0      354.0
1234567890000355.0      355.0
1234567890000356.0      356.0
1234567890000357.0      357.0
1234567890000358.0      358.0
1234567890000359.0      359.0
1234567890000360.0      0.0
1234567890000361.0      1.0
1234567890000362.0      2.0
1234567890000363.0      3.0
1234567890000364.0      4.0
1234567890000365.0      5.0
1234567890000366.0      6.0

raymond · July 27, 2023, 10:11:47 AM

I would agree conditionally with all those results.

(i) All those middle 0's must be considered as significant digits and you now have 16 significant digits for an integer lower than 1.0e19 which can be processed as a REAL10.

(ii) The next question would be which components were used to generate those numbers. Did they all have the equivalent 16 significant digits of accuracy?
(A multiplication by any integer would be considered as having an infinite number of significant digits, even if it only has a single digit. Addition/subtraction of integers is treated in the same category. A division by any number would be allowed as long as there would be no remainder.)

(iii) Would any component appear to be an integer after being truncated by some other process? (Must remember that this thread was started to process data generated as REAL4!!!!)

Otherwise, I'm glad someone finally saw the light.

NoCforMe · July 27, 2023, 12:49:42 PM

One question to the OP, Guga, and I think this touches on the issues that Raymond has raised here:

You never answered this: Is there any reason you don't want to use the FPU (which gives you more precision) rather than other methods? I'm still puzzled by this.

guga · July 28, 2023, 11:53:49 PM

Quote from: raymond on July 27, 2023, 10:11:47 AM
(iii) Would any component appear to be an integer after being truncated by some other process. (Must remember that this thread was started to process data generated as REAL4!!!!)

Otherwise, I'm glad someone finally saw the light.

One correction only: it started as Real8 ("R$ = Real8 number"), but it can work for Real4 or TenByte as well. Your comments are valid and helped me work out the proper fixes, or better ways to represent the equivalent angles (assuming they are valid ones, of course).

Hi NoCforMe

I'll use the FPU TenByte format as well, but I'll first check speed to see which is faster when assuming TenByte data (SSE or regular FPU). I haven't had time to finish the checks yet, so I'm using the FPU for Real10 until I finish.

The MASM Forum
