Converting real4 to string

dedndave · April 23, 2013, 05:26:20 AM

"hara-kiri"

that is a simple function, Rui
for 32-bits, not even sure i would put it in a PROC - maybe a macro
you just want to use the BSR instruction :t

RuiLoureiro · April 23, 2013, 05:34:52 AM

Quote from: dedndave on April 23, 2013, 05:26:20 AM
"hara-kiri"

it is what you do, Dave; i do arakiri, a kind of hara_kiri ! :P

dedndave · April 23, 2013, 05:36:20 AM

not me

my wife won't let me :(

RuiLoureiro · April 24, 2013, 06:38:52 AM

Here are some timings for converting
Real10 to string with 19, 18 and 10 digits

note: CT = compress table

Quote
19 digits



3.141592653589793238  <---- pi 19 digits
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
 *** STOP - ConvertFloat10DF ***
3.141592653589793238
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
 *** STOP - ConvertFloat10DR ***
3.141592653589793238
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
 *** STOP - ConvertFloat10DX ***
3.141592653589793238
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
 *** STOP - ConvertFloat10DY ***
2384 cycles, ConvertFloat10ZX, _Real10_2
3706 cycles, ConvertFloat10, _Real10_2
2380 cycles, ConvertFloat10Z, _Real10_2
2272 cycles, ConvertFloat10DF, _Real10_2
2169 cycles, ConvertFloat10DR, _Real10_2
2602 cycles, ConvertFloat10BX, _Real10_2
2681 cycles, ConvertFloat10BF, _Real10_2
2647 cycles, ConvertFloat10BY, _Real10_2
2128 cycles, ConvertFloat10DY, _Real10_2
2146 cycles, ConvertFloat10DX, _Real10_2
2400 cycles, ConvertFloat10CT, _Real10_2

 *** Press any key to get the time table ***

 ***** Time table *****

2128  cycles, ConvertFloat10DY, direct, esp
2146  cycles, ConvertFloat10DX, direct, esp
2169  cycles, ConvertFloat10DR, direct, ebp
2272  cycles, ConvertFloat10DF, direct, ebp
2380  cycles, ConvertFloat10Z, BCD
2384  cycles, ConvertFloat10ZX, BCD
2400  cycles, ConvertFloat10CT, BCD, esp, CT
2602  cycles, ConvertFloat10BX, BCD
2647  cycles, ConvertFloat10BY, BCD, esp
2681  cycles, ConvertFloat10BF, BCD
3706  cycles, ConvertFloat10, BCD, Save FPU
 ********** END **********

; ##########################################

Quote
18 digits



3.14159265358979324  <---- pi 18 digits
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
 *** STOP - ConvertFloat10DF ***
3.14159265358979324
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
 *** STOP - ConvertFloat10DR ***
3.14159265358979324
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
 *** STOP - ConvertFloat10DX ***
3.14159265358979324
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
 *** STOP - ConvertFloat10DY ***
1067 cycles, ConvertFloat10ZX, _Real10_2
2958 cycles, ConvertFloat10, _Real10_2
1083 cycles, ConvertFloat10Z, _Real10_2
2154 cycles, ConvertFloat10DF, _Real10_2
2021 cycles, ConvertFloat10DR, _Real10_2
863 cycles, ConvertFloat10BX, _Real10_2
855 cycles, ConvertFloat10BF, _Real10_2
896 cycles, ConvertFloat10BY, _Real10_2
1909 cycles, ConvertFloat10DY, _Real10_2
1910 cycles, ConvertFloat10DX, _Real10_2
883 cycles, ConvertFloat10CT, _Real10_2

 *** Press any key to get the time table ***

 ***** Time table *****

855  cycles, ConvertFloat10BF, BCD
863  cycles, ConvertFloat10BX, BCD
883  cycles, ConvertFloat10CT, BCD, esp, CT
896  cycles, ConvertFloat10BY, BCD, esp
1067  cycles, ConvertFloat10ZX, BCD
1083  cycles, ConvertFloat10Z, BCD
1909  cycles, ConvertFloat10DY, direct, esp
1910  cycles, ConvertFloat10DX, direct, esp
2021  cycles, ConvertFloat10DR, direct, ebp
2154  cycles, ConvertFloat10DF, direct, ebp
2958  cycles, ConvertFloat10, BCD, Save FPU
 ********** END **********

; #########################################

Quote
10 digits



3.141592654  <---- pi 10 digits
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
 *** STOP - ConvertFloat10DF ***
3.141592654
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
 *** STOP - ConvertFloat10DR ***
3.141592654
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
 *** STOP - ConvertFloat10DX ***
3.141592654
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
 *** STOP - ConvertFloat10DY ***
1132 cycles, ConvertFloat10ZX, _Real10_2
3015 cycles, ConvertFloat10, _Real10_2
1146 cycles, ConvertFloat10Z, _Real10_2
638 cycles, ConvertFloat10DF, _Real10_2
505 cycles, ConvertFloat10DR, _Real10_2
759 cycles, ConvertFloat10BX, _Real10_2
769 cycles, ConvertFloat10BF, _Real10_2
758 cycles, ConvertFloat10BY, _Real10_2
500 cycles, ConvertFloat10DY, _Real10_2
490 cycles, ConvertFloat10DX, _Real10_2
748 cycles, ConvertFloat10CT, _Real10_2

 *** Press any key to get the time table ***

 ***** Time table *****

490  cycles, ConvertFloat10DX, direct, esp
500  cycles, ConvertFloat10DY, direct, esp
505  cycles, ConvertFloat10DR, direct, ebp
638  cycles, ConvertFloat10DF, direct, ebp
748  cycles, ConvertFloat10CT, BCD, esp, CT
758  cycles, ConvertFloat10BY, BCD, esp
759  cycles, ConvertFloat10BX, BCD
769  cycles, ConvertFloat10BF, BCD
1132  cycles, ConvertFloat10ZX, BCD
1146  cycles, ConvertFloat10Z, BCD
3015  cycles, ConvertFloat10, BCD, Save FPU
 ********** END **********

RuiLoureiro · April 26, 2013, 04:34:17 AM

Some people say the BCD method is slow.
Here we can see that this is not true (on my P4).

*** STOP - ConvertFloat10DY ***
1168 cycles, ConvertFloat10ZX, _Real10_2
3016 cycles, ConvertFloat10, _Real10_2
1144 cycles, ConvertFloat10Z, _Real10_2
602 cycles, ConvertFloat10DF, _Real10_2
468 cycles, ConvertFloat10DR, _Real10_2
730 cycles, ConvertFloat10BX, _Real10_2
736 cycles, ConvertFloat10BF, _Real10_2
716 cycles, ConvertFloat10BY, _Real10_2
467 cycles, ConvertFloat10DY, _Real10_2
463 cycles, ConvertFloat10DX, _Real10_2
761 cycles, ConvertFloat10CT, _Real10_2

*** Press any key to get the time table ***

Quote
***** Time table *****

463 cycles, ConvertFloat10DX, direct, fxam, fxtract, esp - 10 digits
467 cycles, ConvertFloat10DY, direct, examine, fxtract, esp - 10 digits
468 cycles, ConvertFloat10DR, direct, examine, fxtract, ebp - 10 digits
602 cycles, ConvertFloat10DF, direct, examine, fyl2x, ebp - 10 digits
716 cycles, ConvertFloat10BY, BCD, examine, fxtract, esp - 10 digits
730 cycles, ConvertFloat10BX, BCD, fxam, fxtract, ebp - 10 digits
736 cycles, ConvertFloat10BF, BCD, fxam, fxtract, esp - 10 digits
761 cycles, ConvertFloat10CT, BCD-CT, fxam, fxtract, esp - 10 digits
1144 cycles, ConvertFloat10Z, BCD - old - 10 digits
1168 cycles, ConvertFloat10ZX, BCD - old - 10 digits
3016 cycles, ConvertFloat10, BCD, Save FPU - old - 10 digits
********** END **********

; ---------------------------------------------------------------------

*** STOP - ConvertFloat10DY ***
1049 cycles, ConvertFloat10ZX, _Real10_2
2933 cycles, ConvertFloat10, _Real10_2
1057 cycles, ConvertFloat10Z, _Real10_2
1408 cycles, ConvertFloat10DF, _Real10_2
1283 cycles, ConvertFloat10DR, _Real10_2
774 cycles, ConvertFloat10BX, _Real10_2
762 cycles, ConvertFloat10BF, _Real10_2
761 cycles, ConvertFloat10BY, _Real10_2
1457 cycles, ConvertFloat10DY, _Real10_2
1296 cycles, ConvertFloat10DX, _Real10_2
784 cycles, ConvertFloat10CT, _Real10_2

*** Press any key to get the time table ***

Quote
***** Time table *****

761 cycles, ConvertFloat10BY, BCD, examine, fxtract, esp - 15 digits
762 cycles, ConvertFloat10BF, BCD, fxam, fxtract, esp - 15 digits
774 cycles, ConvertFloat10BX, BCD, fxam, fxtract, ebp - 15 digits
784 cycles, ConvertFloat10CT, BCD-CT, fxam, fxtract, esp - 15 digits <- compresstable
1049 cycles, ConvertFloat10ZX, BCD - old - 15 digits
1057 cycles, ConvertFloat10Z, BCD - old - 15 digits
1283 cycles, ConvertFloat10DR, direct, examine, fxtract, ebp - 15 digits
1296 cycles, ConvertFloat10DX, direct, fxam, fxtract, esp - 15 digits
1408 cycles, ConvertFloat10DF, direct, examine, fyl2x, ebp - 15 digits
1457 cycles, ConvertFloat10DY, direct, examine, fxtract, esp - 15 digits
2933 cycles, ConvertFloat10, BCD, Save FPU - old - 15 digits
********** END **********

; ---------------------------------------------------------------------
*** STOP - ConvertFloat10DY ***
1161 cycles, ConvertFloat10ZX, _Real10_2
3043 cycles, ConvertFloat10, _Real10_2
1219 cycles, ConvertFloat10Z, _Real10_2
1729 cycles, ConvertFloat10DF, _Real10_2
1639 cycles, ConvertFloat10DR, _Real10_2
820 cycles, ConvertFloat10BX, _Real10_2
812 cycles, ConvertFloat10BF, _Real10_2
792 cycles, ConvertFloat10BY, _Real10_2
1636 cycles, ConvertFloat10DY, _Real10_2
1609 cycles, ConvertFloat10DX, _Real10_2
881 cycles, ConvertFloat10CT, _Real10_2

*** Press any key to get the time table ***

Quote
***** Time table *****

792 cycles, ConvertFloat10BY, BCD, examine, fxtract, esp - 17 digits
812 cycles, ConvertFloat10BF, BCD, fxam, fxtract, esp - 17 digits
820 cycles, ConvertFloat10BX, BCD, fxam, fxtract, ebp - 17 digits
881 cycles, ConvertFloat10CT, BCD-CT, fxam, fxtract, esp - 17 digits
1161 cycles, ConvertFloat10ZX, BCD - old - 17 digits
1219 cycles, ConvertFloat10Z, BCD - old - 17 digits
1609 cycles, ConvertFloat10DX, direct, fxam, fxtract, esp - 17 digits
1636 cycles, ConvertFloat10DY, direct, examine, fxtract, esp - 17 digits
1639 cycles, ConvertFloat10DR, direct, examine, fxtract, ebp - 17 digits
1729 cycles, ConvertFloat10DF, direct, examine, fyl2x, ebp - 17 digits
3043 cycles, ConvertFloat10, BCD, Save FPU - old - 17 digits
********** END **********

; ---------------------------------------------------------------------

*** STOP - ConvertFloat10DY ***
2258 cycles, ConvertFloat10DF, _Real10_2
2144 cycles, ConvertFloat10DR, _Real10_2
2157 cycles, ConvertFloat10DY, _Real10_2
2135 cycles, ConvertFloat10DX, _Real10_2

*** Press any key to get the time table ***

Quote
We cannot use BCD for 19 digits (fbstp stores at most 18 packed BCD digits)

***** Time table *****
2135 cycles, ConvertFloat10DX, direct, fxam, fxtract, esp - 19 digits
2144 cycles, ConvertFloat10DR, direct, examine, fxtract, ebp - 19 digits
2157 cycles, ConvertFloat10DY, direct, examine, fxtract, esp - 19 digits
2258 cycles, ConvertFloat10DF, direct, examine, fyl2x, ebp - 19 digits
********** END **********

dedndave · April 26, 2013, 04:36:53 AM

BCD probably is slow if you want to multiply or divide long strings - lol

but, for adding and subtracting, it's quite fast :t

a good bignum library is probably faster for large values, though - fewer memory accesses

RuiLoureiro · April 26, 2013, 04:42:35 AM

Quote from: dedndave on April 26, 2013, 04:36:53 AM
BCD probably is slow if you want to multiply or divide long strings - lol

but, for adding and subtracting, it's quite fast :t

a good bignum library is probably faster for large values, though - fewer memory accesses

Dave,
the last results show that if we want to convert
a real10 to a string with 15, 17 or 18 digits, BCD is
the best. But if we want only 10 digits, converting
the integer directly is the best.
Do you know what i mean by BCD ? It has nothing to do
with multiplying or dividing: just use fbstp tbyte ...
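
To make the "fbstp + unpack" idea concrete, here is a rough sketch in Python rather than MASM. It assumes the usual packed BCD tbyte layout: 9 bytes holding 18 digits, two per byte with the low digits first, and the sign in bit 7 of the tenth byte. The names pack_packed_bcd and unpack_packed_bcd are made up for this illustration, and pack_packed_bcd only imitates what fbstp would store for an integer value.

```python
def pack_packed_bcd(n):
    """Imitate the 10-byte packed BCD image that fbstp stores for integer n."""
    if abs(n) >= 10**18:
        raise ValueError("packed BCD holds at most 18 decimal digits")
    sign = 0x80 if n < 0 else 0x00
    n = abs(n)
    out = bytearray()
    for _ in range(9):                      # 9 bytes, two digits each, low pair first
        n, pair = divmod(n, 100)
        out.append(((pair // 10) << 4) | (pair % 10))
    out.append(sign)                        # byte 9: sign in bit 7
    return bytes(out)


def unpack_packed_bcd(tbyte):
    """Unpack a 10-byte packed BCD image into an 18-digit decimal string."""
    if len(tbyte) != 10:
        raise ValueError("expected 10 bytes")
    digits = []
    for byte in reversed(tbyte[:9]):        # most significant byte first
        digits.append(chr(0x30 + (byte >> 4)))     # high nibble -> ASCII digit
        digits.append(chr(0x30 + (byte & 0x0F)))   # low nibble  -> ASCII digit
    sign = "-" if tbyte[9] & 0x80 else ""
    return sign + "".join(digits)


print(unpack_packed_bcd(pack_packed_bcd(314159265358979324)))
```

The unpack loop uses only shifts, masks and adds, no multiply or divide. The 18-digit capacity of the tbyte is also why this route stops at 18 digits.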

dedndave · April 26, 2013, 04:45:35 AM

that's because you can get a 10-decimal-digit binary into a dword register, if it's under 4294967296 :P
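
A small sketch of that point, in Python for brevity: a 10-digit integer can be converted with plain 32-bit divisions only while it stays under 4294967296. The helper names are hypothetical, and the loop is just the textbook divide-by-10 method, not any routine from the timings above.

```python
DWORD_LIMIT = 2**32  # 4294967296


def fits_in_dword(n):
    """True if n can be held in one unsigned 32-bit register."""
    return 0 <= n < DWORD_LIMIT


def dword_to_decimal(n):
    """Convert by repeated division by 10, the way a 32-bit div loop would."""
    if not fits_in_dword(n):
        raise ValueError("value needs more than 32 bits")
    digits = []
    while True:
        n, remainder = divmod(n, 10)
        digits.append(chr(ord("0") + remainder))
        if n == 0:
            break
    return "".join(reversed(digits))


print(dword_to_decimal(3141592654))   # pi scaled to 10 digits still fits
print(fits_in_dword(9999999999))      # not every 10-digit value does
```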

by the way, Ray has done some cool stuff with BCD, if you didn't already know...

http://www.ray.masmcode.com/BCDtut.html

RuiLoureiro · April 26, 2013, 04:57:20 AM

Quote from: dedndave on April 26, 2013, 04:45:35 AM
that's because you can get a 10-decimal-digit binary into a dword register, if it's under 4294967296 :P

Yes, of course ! But BCD has nothing to do with multiply/divide.
We simply use fbstp tbyte _packedBCD and then unpack
it to a string; no need to multiply, divide, add or subtract.
I also tried the compress table, but it is not better.

Well, it seems that converting EDX:EAX to string is not as fast as
fbstp + unpacking. Well, in 32 bit !!!!!

The other things about that stuff are not useful for me.
About your other post, i need to test your results. I need time.

dedndave · April 26, 2013, 05:16:12 AM

there are some very fast 64-bit routines (32-bit code)
Paul Dixon wrote one that uses a look-up table that is nearly the fastest, i think
but - the table is rather large - lol

for a non-LUT version, probably drizz has the best code
you can search the old forum for routines

RuiLoureiro · April 26, 2013, 06:27:36 AM

Quote from: dedndave on April 26, 2013, 05:16:12 AM
there are some very fast 64-bit routines (32-bit code)
Paul Dixon wrote one that uses a look-up table that is nearly the fastest, i think
but - the table is rather large - lol

for a non-LUT version, probably drizz has the best code
you can search the old forum for routines

It is good to know. Nevertheless, the cycles i got
are already good enough for me. Dont forget i am using the
old routines in The calculator and it solves the
expressions and shows them very fast. I think less 400
or more cycles. For now i want to finish this work.
Maybe later i will think about improving it more. :t

dedndave · April 26, 2013, 10:50:41 AM

one thing i learned today about floating point to string conversion.....

the hard parts are handling the special values and rounding - lol
those sound easy, right ?

floating point conversion was the easy part :lol:

RuiLoureiro · April 26, 2013, 09:09:49 PM

Dave,
About the question of processing denormalized numbers,
my first quick conclusion is this: either they should not be
processed or they need special processing.
In your previous post you said: to process a denormal
(exponent=0000) we simply inc the exponent etc.

But pay attention: i mean with the method i am using, not
the way you are working. Each method has its own
limitations.

Of course, first we both need to understand
the method to understand where the problem is.

Meanwhile, i think the denormalized numbers are
numbers near zero, beyond the limit (exponent
near -4932 etc.), if i am correct. (?)

If we have X=a.bcd... * 10^-4931, for instance,
and we want to get an integer with 16 digits,
we need to multiply X by

10^(15+4931)=10^4946

and we get infinity and then an invalid operation
when we try to multiply. The limit is 10^4932.

This limit means this (for normal processing):

exponent= -(4932-dp) (dp=decimal places)

For 18 digits, dp=17 => exponent= -4915 is the limit

What does it mean ? For instance, it means that if after
some FPU operations we get X=a.bcd... * 10^-4916 and
then we want to convert this X to string, it must
give an error because the procedure cannot calculate
more than 10^4932.

Dave, did you understand, or did i make some error ?

So, if i am correct and the denormalized numbers
are numbers near zero beyond the limit, it means
we need not pay attention to them (procedure limitations).

I hope you study this question before giving an answer.
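
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the limit rule as stated in this post, assuming the scaling factor must not exceed 10^4932; the function names are invented for the example.

```python
REAL10_MAX_POW10 = 4932   # largest power of ten the scaling step can use


def scale_power(dec_exp, digits):
    """Power of ten that turns a.bcd... * 10^dec_exp into an integer with `digits` digits."""
    return (digits - 1) - dec_exp


def can_scale(dec_exp, digits):
    """True if the required scaling factor stays within 10^4932."""
    return scale_power(dec_exp, digits) <= REAL10_MAX_POW10


def lowest_exponent(digits):
    """Smallest decimal exponent that can still be converted: -(4932 - dp), dp = digits - 1."""
    return -(REAL10_MAX_POW10 - (digits - 1))


print(scale_power(-4931, 16))   # 16 digits at 10^-4931
print(lowest_exponent(18))      # limit for 18 digits
print(can_scale(-4916, 18))     # one step beyond the limit
```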

dedndave · April 26, 2013, 09:26:01 PM

the smallest normal in REAL10 format is 0001_80000000_00000000
for REAL10, the integer bit (bit 63) is explicit, and is 1 for all normals

the next lower value is the largest denormal, 0000_7FFFFFFF_FFFFFFFF

it depends on how your code is written, i suppose
if you are using the FPU to evaluate reals, you might multiply it by 10^N, then adjust the exponent when done

in my code, i convert the values so that the integer bit is bit 0, instead of bit 63
that is simple - i just subtract 63 from the exponent :P
that makes it easy to evaluate denormals
for some other method, that may not work so well - you have to choose what works :t

in my routine, i convert 0000_7FFFFFFF_FFFFFFFF to 0001_7FFFFFFF_FFFFFFFF
with the shifted exponent, it evaluates correctly using the same code as normals
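
For anyone following along, here is a hedged sketch of that idea in Python with exact fractions. It is not my actual routine: it assumes the standard REAL10 layout (15-bit exponent with bias 16383, 64-bit mantissa with an explicit integer bit), and the names real10_value and leading_digits are made up for the illustration.

```python
import math
from fractions import Fraction

BIAS = 16383  # REAL10 exponent bias


def real10_value(sign_exp, mantissa):
    """Exact value of a finite REAL10 from its 16-bit sign/exponent word and 64-bit mantissa."""
    exp = sign_exp & 0x7FFF
    if exp == 0x7FFF:
        raise ValueError("infinity or NaN")
    if exp != 0 and not (mantissa >> 63):
        raise ValueError("integer bit clear with nonzero exponent")
    if exp == 0:
        exp = 1   # denormal: evaluate with the same exponent as the smallest normal
    # Treat the mantissa as a plain integer (integer bit at bit 0 instead of
    # bit 63), so 63 is subtracted from the binary exponent.
    value = Fraction(mantissa) * Fraction(2) ** (exp - BIAS - 63)
    return -value if sign_exp & 0x8000 else value


def leading_digits(value, count):
    """First `count` significant decimal digits (truncated) and the decimal exponent."""
    if value <= 0:
        raise ValueError("positive values only")
    # Rough estimate from bit lengths, then correct it exactly.
    bits = value.numerator.bit_length() - value.denominator.bit_length()
    exp10 = math.floor(bits * 0.30103)
    while value < Fraction(10) ** exp10:
        exp10 -= 1
    while value >= Fraction(10) ** (exp10 + 1):
        exp10 += 1
    scaled = value * Fraction(10) ** (count - 1 - exp10)
    return str(scaled.numerator // scaled.denominator), exp10


smallest_normal = real10_value(0x0001, 0x8000000000000000)
largest_denormal = real10_value(0x0000, 0x7FFFFFFFFFFFFFFF)
print(leading_digits(smallest_normal, 21))
print(leading_digits(largest_denormal, 21))
```

Because the exponent 0000 case is bumped to 0001 before scaling, the denormal goes through exactly the same arithmetic as a normal value.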

RuiLoureiro · April 26, 2013, 11:08:51 PM

Quote from: dedndave on April 26, 2013, 09:26:01 PM
in my routine, i convert 0000_7FFFFFFF_FFFFFFFF to 0001_7FFFFFFF_FFFFFFFF
with the shifted exponent, it evaluates correctly using the same code as normals

Ok, i think i found the solution and will explain it,
but what do you get ? What is the decimal value for
0001_7FFFFFFF_FFFFFFFF ?

EDIT:
In the other post you have

Quote
0000_00000000_00000000: 00000000 00000002 +0
8000_00000000_00000000: 00000000 00000002 -0
0000_00000000_00000001: 00000000 00002CE7 364519953188247460252
0000_7FFFFFFF_FFFFFFFF: 00000000 00002CFA 336210314311209350589
0000_80000000_00000000: 00000000 00002CFA 336210314311209350626
0000_FFFFFFFF_FFFFFFFF: 00000000 00002CFA 672420628622418701216
0001_00000000_00000000: 00000006 00000007 Invalid
7FFE_7FFFFFFF_FFFFFFFF: 00000006 00000007 Invalid
0001_80000000_00000000: 00000000 00002CFA 336210314311209350626
4000_C90FDAA2_2168C235: 00000000 0000003F 314159265358979323851
7FFE_FFFFFFFF_FFFFFFFF: 00000000 00001345 118973149535723176502
7FFF_00000000_00000000: 00000006 00000007 Invalid
7FFF_7FFFFFFF_FFFFFFFF: 00000006 00000007 Invalid
7FFF_80000000_00000000: 00000001 00000002 +∞
FFFF_80000000_00000000: 00000001 00000002 -∞
7FFF_80000000_00000001: 00000005 00000004 SNaN
7FFF_BFFFFFFF_FFFFFFFF: 00000005 00000004 SNaN
7FFF_C0000000_00000000: 00000003 0000000F Indefinite QNaN
7FFF_C0000000_00000001: 00000004 00000004 QNaN
7FFF_FFFFFFFF_FFFFFFFF: 00000004 00000004 QNaN

So it seems there is an INVALID range (red)
and a valid one (green).
Maybe you meant to say not 0001_7FFFFFFF_FFFFFFFF
but 0001_FFFFFFFF_FFFFFFFF, no ?
I saw it just now!
