Double-precision to string

Started by hyder, May 14, 2024, 01:16:18 PM

hyder

A long time ago, for "The Art of 64-bit Assembly", I wrote code to convert a floating-point value (real10) to a string, in both decimal and scientific notation. This code used the FPU.
A couple of years ago I wrote "The Art of ARM Assembly" (currently in the copy edit/production stage at No Starch Press), and I cloned the x86 code to 64-bit ARM, where it supports 64-bit double-precision values.
Most recently, I've been working on "The Art of ARM Assembly, Volume 2" which is for 32-bit ARM CPUs (mainly embedded parts such as Arduino and Pico, but also for the Pi running a 32-bit OS). Rather than simply translate the 64-bit code from the first volume to 32-bit code for the second, I decided to play around with the algorithm and see if I could produce faster code. Using several huge lookup tables and some specialized coding, I was able to increase the speed by quite a bit.

Once I got the 32-bit ARM code working, I wondered how much better that algorithm would be on the x86. So I translated the 32-bit ARM code (largely instruction by instruction) and compared the result to the original x86 code I wrote for "The Art of 64-bit Assembly". It ran a little better than 2.6 times faster; a successful job.

Comparing the output from the two algorithms, I found that most strings produced by the two algorithms were the same; the few differences were largely due to the extra precision afforded by the real10 format (versus the real8 double-precision format). Also, the real10 code didn't handle special cases such as NaN, Inf, and denormalized numbers (which the new code does). I've included output files for the two algorithms if you want to compare their output.
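
For readers unfamiliar with the special cases mentioned above, here is a minimal C sketch of how a real8 (IEEE 754 double) can be sorted into zero, denormal, normal, Inf, or NaN by looking at its exponent and fraction bits before any digit conversion starts. This is only an illustration and is not the code attached to this post; the names `double_kind`, `double_from_bits`, and `classify_double` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical classification labels for a 64-bit double (real8). */
typedef enum { DK_ZERO, DK_DENORMAL, DK_NORMAL, DK_INF, DK_NAN } double_kind;

/* Helper for experiments: build a double from a raw 64-bit pattern. */
double double_from_bits(uint64_t bits)
{
    double value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Classify a double by its 11-bit exponent and 52-bit fraction fields. */
double_kind classify_double(double value)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);      /* well-defined way to read the bits */

    uint64_t exponent = (bits >> 52) & 0x7FF;
    uint64_t fraction = bits & 0x000FFFFFFFFFFFFFULL;

    if (exponent == 0x7FF)                   /* all ones: Inf or NaN */
        return fraction ? DK_NAN : DK_INF;
    if (exponent == 0)                       /* all zeros: zero or denormal */
        return fraction ? DK_DENORMAL : DK_ZERO;
    return DK_NORMAL;                        /* sign bit is ignored throughout */
}
```

A conversion routine could call something like `classify_double` first and emit a fixed string for the Inf and NaN cases, leaving only zero, denormal, and normal values for the digit-generation path.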

Because this x86 code is almost a line-by-line (instruction-by-instruction) translation, I'm sure there is ample opportunity for additional improvement (using x86 coding style and optimization rather than ARM). Have at it.
Cheers,
Randy Hyde

jack