Author Topic: ArcSin timings  (Read 1939 times)

daydreamer

  • Member
  • *****
  • Posts: 1553
  • building nextdoor
Re: ArcSin timings
« Reply #30 on: December 24, 2020, 06:44:30 AM »
I start wondering what is the motivation behind your critical comments, Héctor :badgrin:

Just beating what I can. It's the laboratory:
Quote
Algorithm and code design research laboratory. This is the place to post assembler algorithms and code design for discussion, optimisation and any other improvements that can be made on it. Post code here to be beaten to death to make it better, smaller, faster or more powerful. Feel free to explain the optimisation methods used so that everyone can get a feel for the code design.


Tables are interesting tools, especially if your macro makes them so easy to create, but how to build them and when to use them deserve some consideration.
Well, if you use trig later for drawing in this 64-bit era, wouldn't a fixed-point table be a faster alternative? Even 32-bit code supports MUL, which leaves the low 32 bits of the product in eax and the high 32 bits in edx. DIV also uses both 32-bit registers, taking edx:eax as the dividend.
SIMD fan and macro fan
Happy new year 2021 that can only turn out to become better than worse 2020 :)

HSE

  • Member
  • *****
  • Posts: 1612
  • <AMD>< 7-32>
Re: ArcSin timings
« Reply #31 on: December 24, 2020, 07:08:13 AM »
Quote
Well, if you use trig later for drawing in this 64-bit era, wouldn't a fixed-point table be a faster alternative? Even 32-bit code supports MUL, which leaves the low 32 bits of the product in eax and the high 32 bits in edx. DIV also uses both 32-bit registers, taking edx:eax as the dividend.

You can time that to be sure  :biggrin:

daydreamer

  • Member
  • *****
  • Posts: 1553
  • building nextdoor
Re: ArcSin timings
« Reply #32 on: December 28, 2020, 12:42:47 AM »
It's best to do optimisation and timings in practical uses as well, such as whole tunnel (stargate) or sphere (planet) code. Float-to-int conversion not only takes some cycles; mixing SSE floating-point code with SSE2 integer code also costs you a penalty.
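For example, a circle can be anything from a 32x32 sprite to a big hi-res 4K screen: 32 diameter * pi (about 101 pixels of circumference) vs 1080 diameter * pi (about 3393 pixels). So a general 360-degree LUT works best for a circle of about 360 pixels circumference; it has too many entries for a 32 diameter and too few for a 1080 diameter.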
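A rough sketch of that sizing argument in C. The helper name `lut_entries_for_diameter` is hypothetical, and the rule of one table entry per pixel of circumference is my assumption about what counts as enough. It uses the integer approximation pi ~ 355/113 so no floating point is needed.

```c
#include <stdint.h>

/* Assumed rule of thumb: one angle step per pixel of circumference,
   i.e. roughly ceil(pi * diameter), with pi approximated as 355/113. */
static uint32_t lut_entries_for_diameter(uint32_t diameter_px) {
    return (uint32_t)(((uint64_t)diameter_px * 355u + 112u) / 113u);
}

/* Compare a fixed table size against what the circle needs:
   returns -1 if the table is too small, 0 if it matches exactly,
   and +1 if it has more entries than pixels on the circle. */
static int lut_fit(uint32_t table_entries, uint32_t diameter_px) {
    uint32_t needed = lut_entries_for_diameter(diameter_px);
    if (table_entries < needed) return -1;
    if (table_entries > needed) return 1;
    return 0;
}
```

With these, `lut_fit(360, 32)` reports an oversized table and `lut_fit(360, 1080)` an undersized one, which is the mismatch described above.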