Converting real4 to string

Started by RuiLoureiro, April 19, 2013, 05:50:44 AM


dedndave

"hara-kiri"   :badgrin:

that is a simple function, Rui
for 32-bits, not even sure i would put it in a PROC - maybe a macro
you just want to use the BSR instruction   :t
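For readers who do not have the instruction in mind: BSR (bit scan reverse) returns the index of the highest set bit of its source operand. The sketch below only models that result in Python. The function name `bsr` and the 32-bit range check are assumptions made for illustration, and the thread does not show how the result would be used in Rui's routine.

```python
def bsr(value: int) -> int:
    """Index of the highest set bit of a 32-bit value, as x86 BSR would return it."""
    if not 0 < value <= 0xFFFFFFFF:
        # For a zero source BSR sets ZF and leaves the destination undefined,
        # so this model refuses zero instead of inventing a result.
        raise ValueError("value must be a non-zero 32-bit integer")
    return value.bit_length() - 1


print(bsr(1))           # 0
print(bsr(0x80000000))  # 31
```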

RuiLoureiro

Quote from: dedndave on April 23, 2013, 05:26:20 AM
"hara-kiri"   :badgrin:

        it is what you do, Dave; i do arakiri, a kind of hara_kiri !  :P

dedndave

not me   :biggrin:

my wife won't let me   :(

RuiLoureiro

Here are some timings for converting
Real10 to string with 19, 18 and 10 digits

note: CT = compress table
Quote
                       19 digits


3.141592653589793238  <---- pi 19 digits
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
*** STOP - ConvertFloat10DF ***
3.141592653589793238
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
*** STOP - ConvertFloat10DR ***
3.141592653589793238
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
*** STOP - ConvertFloat10DX ***
3.141592653589793238
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
*** STOP - ConvertFloat10DY ***
2384 cycles, ConvertFloat10ZX, _Real10_2
3706 cycles, ConvertFloat10, _Real10_2
2380 cycles, ConvertFloat10Z, _Real10_2
2272 cycles, ConvertFloat10DF, _Real10_2
2169 cycles, ConvertFloat10DR, _Real10_2
2602 cycles, ConvertFloat10BX, _Real10_2
2681 cycles, ConvertFloat10BF, _Real10_2
2647 cycles, ConvertFloat10BY, _Real10_2
2128 cycles, ConvertFloat10DY, _Real10_2
2146 cycles, ConvertFloat10DX, _Real10_2
2400 cycles, ConvertFloat10CT, _Real10_2

*** Press any key to get the time table ***

***** Time table *****

2128  cycles, ConvertFloat10DY, direct, esp
2146  cycles, ConvertFloat10DX, direct, esp
2169  cycles, ConvertFloat10DR, direct, ebp
2272  cycles, ConvertFloat10DF, direct, ebp
2380  cycles, ConvertFloat10Z, BCD
2384  cycles, ConvertFloat10ZX, BCD
2400  cycles, ConvertFloat10CT, BCD, esp, CT
2602  cycles, ConvertFloat10BX, BCD
2647  cycles, ConvertFloat10BY, BCD, esp
2681  cycles, ConvertFloat10BF, BCD
3706  cycles, ConvertFloat10, BCD, Save FPU
********** END **********

; ##########################################
Quote
                  18 digits

3.14159265358979324  <---- pi 18 digits
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
*** STOP - ConvertFloat10DF ***
3.14159265358979324
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
*** STOP - ConvertFloat10DR ***
3.14159265358979324
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
*** STOP - ConvertFloat10DX ***
3.14159265358979324
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
*** STOP - ConvertFloat10DY ***
1067 cycles, ConvertFloat10ZX, _Real10_2
2958 cycles, ConvertFloat10, _Real10_2
1083 cycles, ConvertFloat10Z, _Real10_2
2154 cycles, ConvertFloat10DF, _Real10_2
2021 cycles, ConvertFloat10DR, _Real10_2
863 cycles, ConvertFloat10BX, _Real10_2
855 cycles, ConvertFloat10BF, _Real10_2
896 cycles, ConvertFloat10BY, _Real10_2
1909 cycles, ConvertFloat10DY, _Real10_2
1910 cycles, ConvertFloat10DX, _Real10_2
883 cycles, ConvertFloat10CT, _Real10_2

*** Press any key to get the time table ***

***** Time table *****

855  cycles, ConvertFloat10BF, BCD
863  cycles, ConvertFloat10BX, BCD
883  cycles, ConvertFloat10CT, BCD, esp, CT
896  cycles, ConvertFloat10BY, BCD, esp
1067  cycles, ConvertFloat10ZX, BCD
1083  cycles, ConvertFloat10Z, BCD
1909  cycles, ConvertFloat10DY, direct, esp
1910  cycles, ConvertFloat10DX, direct, esp
2021  cycles, ConvertFloat10DR, direct, ebp
2154  cycles, ConvertFloat10DF, direct, ebp
2958  cycles, ConvertFloat10, BCD, Save FPU
********** END **********

; #########################################
Quote
                        10 digits

3.141592654  <---- pi 10 digits
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
*** STOP - ConvertFloat10DF ***
3.141592654
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
*** STOP - ConvertFloat10DR ***
3.141592654
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
*** STOP - ConvertFloat10DX ***
3.141592654
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
*** STOP - ConvertFloat10DY ***
1132 cycles, ConvertFloat10ZX, _Real10_2
3015 cycles, ConvertFloat10, _Real10_2
1146 cycles, ConvertFloat10Z, _Real10_2
638 cycles, ConvertFloat10DF, _Real10_2
505 cycles, ConvertFloat10DR, _Real10_2
759 cycles, ConvertFloat10BX, _Real10_2
769 cycles, ConvertFloat10BF, _Real10_2
758 cycles, ConvertFloat10BY, _Real10_2
500 cycles, ConvertFloat10DY, _Real10_2
490 cycles, ConvertFloat10DX, _Real10_2

748 cycles, ConvertFloat10CT, _Real10_2

*** Press any key to get the time table ***

***** Time table *****

490  cycles, ConvertFloat10DX, direct, esp
500  cycles, ConvertFloat10DY, direct, esp
505  cycles, ConvertFloat10DR, direct, ebp
638  cycles, ConvertFloat10DF, direct, ebp
748  cycles, ConvertFloat10CT, BCD, esp, CT
758  cycles, ConvertFloat10BY, BCD, esp
759  cycles, ConvertFloat10BX, BCD
769  cycles, ConvertFloat10BF, BCD
1132  cycles, ConvertFloat10ZX, BCD
1146  cycles, ConvertFloat10Z, BCD
3015  cycles, ConvertFloat10, BCD, Save FPU
********** END **********


RuiLoureiro

Some people say the BCD method is slow.
Here we can see that it is not true (on my P4).

*** STOP - ConvertFloat10DY ***
1168 cycles, ConvertFloat10ZX, _Real10_2
3016 cycles, ConvertFloat10, _Real10_2
1144 cycles, ConvertFloat10Z, _Real10_2
602 cycles, ConvertFloat10DF, _Real10_2
468 cycles, ConvertFloat10DR, _Real10_2
730 cycles, ConvertFloat10BX, _Real10_2
736 cycles, ConvertFloat10BF, _Real10_2
716 cycles, ConvertFloat10BY, _Real10_2
467 cycles, ConvertFloat10DY, _Real10_2
463 cycles, ConvertFloat10DX, _Real10_2
761 cycles, ConvertFloat10CT, _Real10_2

*** Press any key to get the time table ***
Quote
***** Time table *****

463  cycles, ConvertFloat10DX, direct, fxam, fxtract, esp - 10 digits
467  cycles, ConvertFloat10DY, direct, examine, fxtract, esp - 10 digits
468  cycles, ConvertFloat10DR, direct, examine, fxtract, ebp - 10 digits
602  cycles, ConvertFloat10DF, direct, examine, fyl2x, ebp - 10 digits
716  cycles, ConvertFloat10BY, BCD, examine, fxtract, esp - 10 digits
730  cycles, ConvertFloat10BX, BCD, fxam, fxtract, ebp - 10 digits
736  cycles, ConvertFloat10BF, BCD, fxam, fxtract, esp - 10 digits
761  cycles, ConvertFloat10CT, BCD-CT, fxam, fxtract, esp - 10 digits
1144  cycles, ConvertFloat10Z, BCD -old - 10 digits
1168  cycles, ConvertFloat10ZX,BCD - old - 10 digits
3016  cycles, ConvertFloat10,  BCD, Save FPU -old - 10 digits
********** END **********
; ---------------------------------------------------------------------

*** STOP - ConvertFloat10DY ***
1049 cycles, ConvertFloat10ZX, _Real10_2
2933 cycles, ConvertFloat10, _Real10_2
1057 cycles, ConvertFloat10Z, _Real10_2
1408 cycles, ConvertFloat10DF, _Real10_2
1283 cycles, ConvertFloat10DR, _Real10_2
774 cycles, ConvertFloat10BX, _Real10_2
762 cycles, ConvertFloat10BF, _Real10_2
761 cycles, ConvertFloat10BY, _Real10_2
1457 cycles, ConvertFloat10DY, _Real10_2
1296 cycles, ConvertFloat10DX, _Real10_2
784 cycles, ConvertFloat10CT, _Real10_2

*** Press any key to get the time table ***

Quote
***** Time table *****

761  cycles, ConvertFloat10BY, BCD, examine, fxtract, esp - 15 digits
762  cycles, ConvertFloat10BF, BCD, fxam, fxtract, esp - 15 digits
774  cycles, ConvertFloat10BX, BCD, fxam, fxtract, ebp - 15 digits
784  cycles, ConvertFloat10CT, BCD-CT, fxam, fxtract, esp - 15 digits <- compresstable
1049  cycles, ConvertFloat10ZX, BCD - old - 15 digits
1057  cycles, ConvertFloat10Z, BCD -old - 15 digits
1283  cycles, ConvertFloat10DR, direct, examine, fxtract, ebp - 15 digits
1296  cycles, ConvertFloat10DX, direct, fxam, fxtract, esp - 15 digits
1408  cycles, ConvertFloat10DF, direct, examine, fyl2x, ebp - 15 digits
1457  cycles, ConvertFloat10DY, direct, examine, fxtract, esp - 15 digits
2933  cycles, ConvertFloat10, BCD, Save FPU -old - 15 digits
********** END **********

; ---------------------------------------------------------------------
*** STOP - ConvertFloat10DY ***
1161 cycles, ConvertFloat10ZX, _Real10_2
3043 cycles, ConvertFloat10, _Real10_2
1219 cycles, ConvertFloat10Z, _Real10_2
1729 cycles, ConvertFloat10DF, _Real10_2
1639 cycles, ConvertFloat10DR, _Real10_2
820 cycles, ConvertFloat10BX, _Real10_2
812 cycles, ConvertFloat10BF, _Real10_2
792 cycles, ConvertFloat10BY, _Real10_2
1636 cycles, ConvertFloat10DY, _Real10_2
1609 cycles, ConvertFloat10DX, _Real10_2
881 cycles, ConvertFloat10CT, _Real10_2

*** Press any key to get the time table ***

Quote
***** Time table *****

792  cycles, ConvertFloat10BY, BCD, examine, fxtract, esp - 17 digits
812  cycles, ConvertFloat10BF, BCD, fxam, fxtract, esp - 17 digits
820  cycles, ConvertFloat10BX, BCD, fxam, fxtract, ebp - 17 digits
881  cycles, ConvertFloat10CT, BCD-CT, fxam, fxtract, esp - 17 digits
1161  cycles, ConvertFloat10ZX, BCD - old - 17 digits
1219  cycles, ConvertFloat10Z, BCD -old - 17 digits
1609  cycles, ConvertFloat10DX, direct, fxam, fxtract, esp - 17 digits
1636  cycles, ConvertFloat10DY, direct, examine, fxtract, esp - 17 digits
1639  cycles, ConvertFloat10DR, direct, examine, fxtract, ebp - 17 digits
1729  cycles, ConvertFloat10DF, direct, examine, fyl2x, ebp - 17 digits
3043  cycles, ConvertFloat10, BCD, Save FPU -old - 17 digits
********** END **********
; ---------------------------------------------------------------------

*** STOP - ConvertFloat10DY ***
2258 cycles, ConvertFloat10DF, _Real10_2
2144 cycles, ConvertFloat10DR, _Real10_2
2157 cycles, ConvertFloat10DY, _Real10_2
2135 cycles, ConvertFloat10DX, _Real10_2

*** Press any key to get the time table ***
Quote
We cannot use BCD for 19 digits (fbstp stores at most 18 digits)

***** Time table *****
2135  cycles, ConvertFloat10DX, direct, fxam, fxtract, esp - 19 digits
2144  cycles, ConvertFloat10DR, direct, examine, fxtract, ebp - 19 digits
2157  cycles, ConvertFloat10DY, direct, examine, fxtract, esp - 19 digits
2258  cycles, ConvertFloat10DF, direct, examine, fyl2x, ebp - 19 digits
********** END **********

dedndave

BCD probably is slow if you want to multiply or divide long strings - lol

but, for adding and subtracting, it's quite fast   :t

a good bignum library is probably faster for large values, though - fewer memory accesses

RuiLoureiro

Quote from: dedndave on April 26, 2013, 04:36:53 AM
BCD probably is slow if you want to multiply or divide long strings - lol

but, for adding and subtracting, it's quite fast   :t

a good bignum library is probably faster for large values, though - fewer memory accesses
Dave,
       the last results show that if we want to convert
       a real10 to string with 15-17-18 digits, the best is
       BCD. But if we want only 10, the best is converting
       the integer (direct).
       Do you know what i mean by BCD ? It has nothing to do
       with multiply or divide: use fbstp tbyte ...
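A rough model of the "fbstp + unpack" idea, written in Python rather than assembly so it can be run anywhere. It assumes the standard x87 packed BCD layout: 10 bytes, two decimal digits per byte in bytes 0 to 8 (least significant pair first), and the sign in bit 7 of byte 9. The helper names `pack_bcd` and `unpack_bcd` are made up for this sketch; `pack_bcd` only stands in for what fbstp would store, it is not Rui's routine.

```python
def pack_bcd(n: int) -> bytes:
    """Build the 10-byte packed BCD image that fbstp would store for integer n."""
    if abs(n) >= 10**18:
        raise ValueError("packed BCD holds at most 18 decimal digits")
    digits = f"{abs(n):018d}"
    # Byte 0 holds the two least significant digits, byte 8 the two most significant.
    body = bytes(
        (int(digits[i]) << 4) | int(digits[i + 1])
        for i in range(16, -2, -2)
    )
    return body + (b"\x80" if n < 0 else b"\x00")


def unpack_bcd(image: bytes) -> str:
    """Unpack a 10-byte packed BCD image into a decimal string (nibbles only)."""
    if len(image) != 10:
        raise ValueError("a packed BCD image is exactly 10 bytes")
    sign = "-" if image[9] & 0x80 else ""
    chars = []
    for byte in reversed(image[:9]):          # most significant pair first
        chars.append(chr(0x30 + (byte >> 4)))
        chars.append(chr(0x30 + (byte & 0x0F)))
    return sign + ("".join(chars).lstrip("0") or "0")


print(unpack_bcd(pack_bcd(314159265358979324)))  # 314159265358979324
```

The unpack loop uses only shifts, masks and an add of 30h per nibble, which is the point made above: no multiply or divide is needed once the FPU has produced the BCD image. The 18-digit limit of the format is also why the 19-digit runs have no BCD entries.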

dedndave

that's because you can get a 10-decimal-digit binary into a dword register, if it's under 4294967296   :P
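To make the dword remark concrete, here is a small Python sketch of the "direct" integer path for a value that fits in 32 bits. The function name `dword_to_decimal` is hypothetical, and the divide-by-10 loop is just the textbook method, not necessarily what the ConvertFloat10D* routines do.

```python
def dword_to_decimal(value: int) -> str:
    """Convert an unsigned 32-bit integer to decimal by repeated division by 10."""
    if not 0 <= value < 2**32:
        raise ValueError("value does not fit in a dword")
    digits = []
    while True:
        value, remainder = divmod(value, 10)   # one DIV per digit on a 32-bit CPU
        digits.append(chr(0x30 + remainder))
        if value == 0:
            break
    return "".join(reversed(digits))


print(dword_to_decimal(3141592654))  # 3141592654
```

Pi scaled to 10 digits is 3141592654, which is below 4294967296, so one register is enough. A 10-digit value such as 9999999999 would not fit and would need the 64-bit (EDX:EAX) path.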

by the way, Ray has done some cool stuff with BCD, if you didn't already know...

http://www.ray.masmcode.com/BCDtut.html

RuiLoureiro

Quote from: dedndave on April 26, 2013, 04:45:35 AM
that's because you can get a 10-decimal-digit binary into a dword register, if it's under 4294967296   :P
Yes, of course ! But BCD has nothing to do with multiply/divide.
                 We simply use fbstp tbyte _packedBCD and then we need to unpack
                 it to a string: no need to multiply, divide, add or subtract.
                 I use compresstable also but it is not better.

                 Well, it seems that converting EDX:EAX to string is not as fast as
                 fbstp + unpacking it. Well, in 32 bit !!!!!

                 The other things about that stuff are not useful for me.
                 About your other post, i need to test your results. I need time.

dedndave

there are some very fast 64-bit routines (32-bit code)
Paul Dixon wrote one that uses a look-up table that is nearly the fastest, i think
but - the table is rather large - lol

for a non-LUT version, probably drizz has the best code
you can search the old forum for routines

RuiLoureiro

Quote from: dedndave on April 26, 2013, 05:16:12 AM
there are some very fast 64-bit routines (32-bit code)
Paul Dixon wrote one that uses a look-up table that is nearly the fastest, i think
but - the table is rather large - lol

for a non-LUT version, probably drizz has the best code
you can search the old forum for routines
It is good to know. Nevertheless the cycles i got
               are already good for me. Don't forget i am using the
               old routines in The calculator and it solves the
               expressions and shows them very fast. I think less 400
               or more cycles. For now i want to finish this work.
               Maybe later i will think about improving it more.  :t

dedndave

one thing i learned today about floating point to string conversion.....

the hard parts are handling the special values and rounding - lol
those sound easy, right ?

floating point conversion was the easy part   :lol:

RuiLoureiro

Dave,
        About the question of processing denormalized numbers
        my first quick conclusion is this: they should not be
        processed, or they need a particular processing.
        In your previous post you said: to process Denormal
        (exponent=0000) we simply inc exponent etc.

        But pay attention: i am talking about the method i am
        using, not the way you are working. Each method has its
        own limitations.

        Of course, first, we must and you must understand
        the method to understand where is the problem.

        Meanwhile, i think, the denormalized numbers are
        numbers near zero beyond the limit (exponent
        near -4932 etc.) if i am correct. (?)

        If we have X=a.bcd... * 10^-4931, for instance,
        and if we want to get an integer with 16 digits
        we need to multiply X by

                    10^(15+4931)=10^4946

        and we get infinity and then we get invalid operation
        when we try to multiply. The limit is 10^4932.

        This limit means this (for normal processing):

                         exponent= -(4932-dp)    (dp=decimal places)


        For 18 digits, dp=17 => exponent= -4915  is the limit

        What does it mean ? For instance, it means that if after
        some FPU operations we get X=a.bcd... * 10^-4916 and
        then we want to convert this X to string, it must
        give an error because the procedure cannot calculate
        more than 10^4932.
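The limit arithmetic in this post can be checked with exact integers. The sketch below is in Python and assumes only the REAL10 format itself: the largest finite value is FFFFFFFF_FFFFFFFF * 2^(16383-63), roughly 1.19 * 10^4932, so 10^4932 is the largest power of ten that fits. The function names are invented for the illustration.

```python
# Largest finite REAL10 (7FFE_FFFFFFFF_FFFFFFFF) as an exact integer.
REAL10_MAX = (2**64 - 1) * 2**(16383 - 63)
MAX_POWER_OF_TEN = 4932   # 10^4932 fits in a REAL10, 10^4933 does not


def scale_power(digits: int, decimal_exponent: int) -> int:
    """Power of ten that turns a.bcd... * 10^decimal_exponent into a `digits`-digit integer."""
    return (digits - 1) - decimal_exponent


def smallest_exponent(digits: int) -> int:
    """Most negative decimal exponent the scaling method can handle: -(4932 - dp)."""
    return -(MAX_POWER_OF_TEN - (digits - 1))


def scale_fits(digits: int, decimal_exponent: int) -> bool:
    """True if the required power of ten is itself representable as a REAL10."""
    return 10 ** scale_power(digits, decimal_exponent) <= REAL10_MAX


print(scale_power(16, -4931))   # 4946
print(smallest_exponent(18))    # -4915
print(scale_fits(18, -4915))    # True
print(scale_fits(18, -4916))    # False
```

For 16 digits and an exponent of -4931 the scale is 10^4946, far above the limit. For 18 digits the cut-off is exactly where the post puts it: -4915 still works, -4916 does not.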

        Dave, did you understand or did i make some error ?

        So, if i am correct and the denormalized numbers
        are numbers near zero beyond the limit, it means
        we do not need to pay attention to them (procedure limitations).

        I hope you study this question before giving an answer.

dedndave

the smallest normal in REAL10 format is 0001_80000000_00000000
for REAL10, the integer bit (bit 63) is explicit, and is 1 for all normals

the next lower value is the largest denormal, 0000_7FFFFFFF_FFFFFFFF

it depends on how your code is written, i suppose
if you are using the FPU to evaluate reals, you might multiply it by 10^N, then adjust the exponent when done

in my code, i convert the values so that the integer bit is bit 0, instead of bit 63
that is simple - i just subtract 63 from the exponent   :P
that makes it easy to evaluate denormals
for some other method, that may not work so well - you have to choose what works   :t

in my routine, i convert 0000_7FFFFFFF_FFFFFFFF to 0001_7FFFFFFF_FFFFFFFF
with the shifted exponent, it evaluates correctly using the same code as normals
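The trick described here (treat the 64-bit mantissa as an integer, subtract 63 from the exponent, and bump a zero exponent field to 1) can be sketched in Python with exact fractions. This is a model of the idea only, not Dave's routine. The name `real10_value` and the choice of `Fraction` are assumptions for the example, and the function does not reject invalid encodings.

```python
from fractions import Fraction

BIAS = 16383


def real10_value(exp_word: int, mantissa: int) -> Fraction:
    """Exact value of a REAL10 given its 16-bit sign/exponent word and 64-bit mantissa."""
    sign = -1 if exp_word & 0x8000 else 1
    exponent = exp_word & 0x7FFF
    if exponent == 0x7FFF:
        raise ValueError("infinity and NaN have no numeric value")
    if exponent == 0:
        exponent = 1          # denormals use the same scale as the smallest normal
    # Integer bit moved from bit 63 to bit 0: subtract 63 from the exponent.
    shift = exponent - BIAS - 63
    return sign * mantissa * Fraction(2) ** shift


smallest_normal = real10_value(0x0001, 0x8000000000000000)
largest_denormal = real10_value(0x0000, 0x7FFFFFFFFFFFFFFF)
print(smallest_normal == Fraction(1, 2**16382))   # True
print(largest_denormal < smallest_normal)         # True
print(float(real10_value(0x4000, 0xC90FDAA22168C235)))
```

With the exponent field forced from 0 to 1, the denormal goes through the same multiply as a normal value, and the only difference is that bit 63 of the mantissa happens to be clear.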

RuiLoureiro

Quote from: dedndave on April 26, 2013, 09:26:01 PM
in my routine, i convert 0000_7FFFFFFF_FFFFFFFF to 0001_7FFFFFFF_FFFFFFFF
with the shifted exponent, it evaluates correctly using the same code as normals
Ok, i think i found the solution and will explain it
                  but what do you get ? What is the decimal value for
                  0001_7FFFFFFF_FFFFFFFF ?

EDIT:
In the other post you have
Quote
0000_00000000_00000000: 00000000 00000002 +0
8000_00000000_00000000: 00000000 00000002 -0
0000_00000000_00000001: 00000000 00002CE7 364519953188247460252
0000_7FFFFFFF_FFFFFFFF: 00000000 00002CFA 336210314311209350589
0000_80000000_00000000: 00000000 00002CFA 336210314311209350626
0000_FFFFFFFF_FFFFFFFF: 00000000 00002CFA 672420628622418701216
0001_00000000_00000000: 00000006 00000007 Invalid
7FFE_7FFFFFFF_FFFFFFFF: 00000006 00000007 Invalid

0001_80000000_00000000: 00000000 00002CFA 336210314311209350626
4000_C90FDAA2_2168C235: 00000000 0000003F 314159265358979323851
7FFE_FFFFFFFF_FFFFFFFF: 00000000 00001345 118973149535723176502
7FFF_00000000_00000000: 00000006 00000007 Invalid
7FFF_7FFFFFFF_FFFFFFFF: 00000006 00000007 Invalid
7FFF_80000000_00000000: 00000001 00000002 +∞
FFFF_80000000_00000000: 00000001 00000002 -∞
7FFF_80000000_00000001: 00000005 00000004 SNaN
7FFF_BFFFFFFF_FFFFFFFF: 00000005 00000004 SNaN
7FFF_C0000000_00000000: 00000003 0000000F Indefinite QNaN
7FFF_C0000000_00000001: 00000004 00000004 QNaN
7FFF_FFFFFFFF_FFFFFFFF: 00000004 00000004 QNaN
So it seems there is an INVALID range (red)
                and a valid one (green).
                Maybe you meant not 0001_7FFFFFFF_FFFFFFFF
                but 0001_FFFFFFFF_FFFFFFFF, no ?
                I saw it just now!
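The valid and invalid ranges in the quoted table follow from the REAL10 encoding rules, and a small classifier makes them easier to see. The Python sketch below assumes only the documented x87 layout (explicit integer bit in bit 63, quiet bit in bit 62). The function name and the class labels are chosen for the illustration, and it does not single out the "Indefinite" QNaN.

```python
INTEGER_BIT = 1 << 63
QUIET_BIT = 1 << 62


def classify_real10(exp_word: int, mantissa: int) -> str:
    """Classify a REAL10 from its 16-bit sign/exponent word and 64-bit mantissa."""
    exponent = exp_word & 0x7FFF
    has_integer_bit = bool(mantissa & INTEGER_BIT)
    if exponent == 0:
        if mantissa == 0:
            return "zero"
        return "pseudo-denormal" if has_integer_bit else "denormal"
    if not has_integer_bit:
        return "invalid"      # non-zero exponent needs the integer bit set
    if exponent != 0x7FFF:
        return "normal"
    if mantissa == INTEGER_BIT:
        return "infinity"
    return "QNaN" if mantissa & QUIET_BIT else "SNaN"


print(classify_real10(0x0001, 0x7FFFFFFFFFFFFFFF))  # invalid
print(classify_real10(0x0001, 0xFFFFFFFFFFFFFFFF))  # normal
print(classify_real10(0x0000, 0x7FFFFFFFFFFFFFFF))  # denormal
```

Read this way, 0001_7FFFFFFF_FFFFFFFF is an invalid encoding if it is handed to the FPU as a REAL10, which matches the Invalid rows of the table. In Dave's description it is only the internal form a denormal takes inside his routine after the exponent adjustment.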