Equivalence angle conversion in SSE2

Started by guga, July 19, 2023, 05:15:28 PM

guga

Quote from: NoCforMe on July 25, 2023, 09:34:51 AM
OK, thanks for answer. But you didn't address one thing:

The FPU uses 80-bit numbers; why are you restricting yourself to 64 bits?

JJ's solution on the FPU is elegant, but it seems to lead to some lack of precision at some point. Using the SSE4 opcode seems to fix this particular problem as well and seems to be faster, but I haven't implemented it yet to check.
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

HSE

Quote from: jj2007 on July 25, 2023, 10:10:40 AM
I can set that register, but results don't change

:biggrin: I can't set the register, but it happens that the result is 220  :rolleyes:

JBasic error is with https://masm32.com/board/index.php?msg=121818

Later: setting the register had to be done the other way around, ST-LD :biggrin:

File for Masm64 SDK, just in case
Equations in Assembly: SmplMath

jj2007

Quote from: guga on July 25, 2023, 10:16:25 AM
JJ's solution on the FPU is elegant, but it seems to lead to some lack of precision at some point.

Where?

guga

Hi JJ

I haven't fully tested it with the FPU yet, but wasn't the precision issue what you talked about in answer #12?

Quote
If you use the second version of my proggie (which is SIMD, not FPU), you will see that 33.0e17 as input will not work. However, 33.0e10 will indeed work. As HSE rightly noted, the issue is precision.

I haven't tested after that to see whether the precision you were talking about was OK.

NoCforMe

Quote
... (which is SIMD, not FPU) ...
Assembly language programming should be fun. That's why I do it.

guga

Oops, thanks for the clarification, NoCforMe. Sorry, JJ, I misread it.  :azn:

daydreamer

It doesn't need to be AND 0fffh; you can use any 2^x-1 mask that works for your problem.
I used AND 1023 with UV coordinates to draw a 1024*1024 tiled water texture.
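
To make the mask trick concrete, here is a minimal C sketch (C rather than assembly, only to keep it portable). The helper name wrap_pow2 is hypothetical and not taken from any post in this thread; it only illustrates that AND with 2^x-1 gives the same result as a modulus by 2^x for unsigned integers.

```c
#include <assert.h>
#include <stdint.h>

/* Wrap a coordinate into [0, size) when size is a power of two.
   wrap_pow2 is a hypothetical helper name, used for illustration only. */
uint32_t wrap_pow2(uint32_t value, uint32_t size)
{
    /* The mask trick is only valid for power-of-two sizes. */
    assert(size != 0 && (size & (size - 1)) == 0);
    return value & (size - 1);   /* same result as value % size */
}
```

For a 1024*1024 texture, wrap_pow2(u, 1024) is the AND 1023 case. Since 360 is not a power of two, this only helps with angles when a full turn is stored as 2^x steps rather than in degrees.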

my none asm creations
https://masm32.com/board/index.php?topic=6937.msg74303#msg74303
I am an Invoker
"An Invoker is a mage who specializes in the manipulation of raw and elemental energies."
Like SIMD coding

HSE

Quote from: guga on July 25, 2023, 09:31:02 AM
to avoid extra checking on a RGB to CieLCH function

Then that is where the error is, and where it must be corrected. Trying to manage the error later, inside a simple modulus calculation, is just foolish.

(Somebody must remind me of this from time to time  :biggrin: )

jj2007

I've added a new macro, MbMod(number, divisor) to MasmBasic:

include \masm32\MasmBasic\MasmBasic.inc
TestData    REAL8 870.41, 33.0e17, 1.11116e15, 2.2222e16, 3.3333e17, 4.4444e18,
                  5.5555e19, 6.6666e20, 7.7777e21, 8.8888e22
  Init
  mov esi, offset TestData
  PrintLine CrLf$, "Source        mod 360"
  .Repeat
    Print Str$("%9f", REAL8 ptr [esi]), Str$("\t%4f\n", MbMod(REAL8 ptr [esi], 360))
    add esi, REAL8
  .Until esi>=offset TestData+sizeof TestData
  Inkey
EndOfCode

The data marked in red are those proposed by Guga in the first post.

Output:
Source          mod 360
870.410000      150.4
3.30000000e+18  240.0
1.11116000e+15  200.0
2.22220000e+16  280.0
3.33330000e+17  240.0
4.44440000e+18  200.0
5.55550000e+19  160.0
6.66660000e+20  120.0
7.77770000e+21  1.489e+12
8.88880000e+22  4.517e+11

Note the limit for correct results is about 4.72e21 - you can use the Windows Calculator to verify that, it uses a BigNum lib (kind of...).

raymond

Sorry but this is getting TOTALLY ABSURD from a MATHEMATICAL point of view. Might as well use the following as an example:

1.234560e16 mod 360 = 120 for ALL the integers between 12345600000000000 and 12345601000000000.

Using REAL4, the maximum number of significant digits it can offer is 7. The above example source could easily be generated as a REAL4; in fact REAL4 can generate numbers up to 3.4e38, but still with only a maximum of 7 significant digits.

The only way a modulo of 360 will make any sense is if the integer portion of the source contains all significant digits. Fractional digits (when the number is displayed without exponents) would have no importance if a precision of +/-1 degree is sufficient.

Furthermore, loading a number generated as a REAL4 onto the FPU to be used for REAL8 or REAL10 computations will NOT improve the precision of results even though those are capable of better accuracy when fed with more accurate data.
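
A small C sketch of that point, assuming float and double correspond to REAL4 and REAL8 (true on the usual x86 compilers). The helper name widen_from_real4 is hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Store a value as REAL4 (float), then widen it back to REAL8 (double).
   widen_from_real4 is a hypothetical name, used for illustration only. */
double widen_from_real4(double value)
{
    /* Out-of-range conversions to float are not defined, so reject them. */
    assert(isfinite(value) && fabs(value) <= FLT_MAX);
    float narrow = (float)value;   /* rounds to a 24-bit significand */
    return (double)narrow;         /* widening is exact: no digits come back */
}
```

For example, 123456789.0 comes back as 123456792.0: the digits lost when the value was stored as a float are gone, no matter how wide the later computation is.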

Considering REAL8 data, this can be accurate to 15 significant digits AS LONG AS all input data are also accurate to at least 15 significant digits. Although REAL8 can handle numbers up to 1.79e308, using it to get a modulo 360 with an accuracy of +/-1 degree would still be limited to sources of 1.0e16 (or less, depending on the source's own accuracy).
Whenever you assume something, you risk being wrong half the time.
http://www.ray.masmcode.com

jj2007

Dear Ray,

I stand corrected, you are right, of course. Here is an example that respects the precision limits:

include \masm32\MasmBasic\MasmBasic.inc    ; thread
  SetGlobals fct:REAL8
  Init
  PrintLine CrLf$, "Source              mod 360"
  For_ fct=1234567890000350.0 To 1234567890000366.0 Step 1.0
    Print Str$("%Hf", fct), Str$("\t%4f\n", MbMod(fct, 360))
  Next
  Inkey
EndOfCode

Source                mod 360
1234567890000350.0      350.0
1234567890000351.0      351.0
1234567890000352.0      352.0
1234567890000353.0      353.0
1234567890000354.0      354.0
1234567890000355.0      355.0
1234567890000356.0      356.0
1234567890000357.0      357.0
1234567890000358.0      358.0
1234567890000359.0      359.0
1234567890000360.0      0.0
1234567890000361.0      1.0
1234567890000362.0      2.0
1234567890000363.0      3.0
1234567890000364.0      4.0
1234567890000365.0      5.0
1234567890000366.0      6.0

raymond

I would agree conditionally with all those results.

(i) All those middle 0's must be considered as significant digits and you now have 16 significant digits for an integer lower than 1.0e19 which can be processed as a REAL10.

(ii) The next question would be which components were used to generate those numbers. Did they all have the equivalent 16 significant digits of accuracy?
      (A multiplication by any integer would be considered as having an infinite number of significant digits, even if it only has a single digit. Addition/subtraction of integers is treated in the same category. A division by any number would be allowed as long as there would be no remainder.)

(iii) Would any component appear to be an integer after being truncated by some other process? (Must remember that this thread was started to process data generated as REAL4!!!!)


Otherwise, I'm glad someone finally saw the light.

NoCforMe

One question to the OP, Guga, and I think this touches on the issues that Raymond has raised here:

You never answered this: Is there any reason you don't want to use the FPU (which gives you more precision) rather than other methods? I'm still puzzled by this.

guga

Quote from: raymond on July 27, 2023, 10:11:47 AM
(iii) Would any component appear to be an integer after being truncated by some other process? (Must remember that this thread was started to process data generated as REAL4!!!!)


Otherwise, I'm glad someone finally saw the light.

One correction only. It started as Real8 ("R$ = Real8 number"), but it can work for Real4 or TenByte as well. Your comments are valid and helped me work out the proper fixes, or better ways to represent the equivalent angles (assuming they are valid ones, of course).

Hi NoCforMe

I'll use the FPU TenByte format as well, but I'll check for speed first to see which is faster when assuming TenByte data (SSE or regular FPU). I haven't had time to finish the checks yet, so I'm using the FPU for Real10 until I finish them.