SIMD Real4 to ASCII string routine

Started by Siekmanski, September 28, 2018, 07:58:04 AM

Siekmanski

Hi Rui,
You are right, this is not what we want.  :(
The only logical explanation I can think of right now is that the 8-digit calculation is not enough to cover the rounding behaviour of 32-bit floating point.
Have to think this all over, suggestions are welcome of course.
Creative coders use backward thinking techniques as a strategy.

RuiLoureiro

Quote from: Siekmanski on September 29, 2018, 12:52:19 PM
Hi Rui,
You are right, this is not what we want.  :(
The only logical thing I can think of right now is that the 8 digit calculation is not enough to cover the 32bit floating-point rounding phenomena.
Have to think this all over, suggestions are welcome of course.

Hi Siekmanski,
It seems that you need to study the problem further, or you may try another algorithm to get the digits. There seems to be a problem when it prints the string in scientific format, or it doesn't do so at all. When we multiply -12345.678 by -123456.78 it gives 000000005, but the result is 1.5241577E+9 (last digit rounded). So I think you need time and a little bit of work. Try another algorithm.  :t

Siekmanski

Tomorrow I'll try something else; my purpose was to write it in SSE only, with 32-bit float calculations (max 8 digits).
1.5241577E+7 fits inside a real4, but 1.5241577E+9 doesn't.

AFAIK the largest number that fits inside a real4 is 16777215 (24bit), maybe I'm wrong here?
If so I need to do calculations for more than 8 digits.

RuiLoureiro

Quote from: Siekmanski on September 30, 2018, 04:42:44 AM
Tomorrow I'll try something else; my purpose was to write it in SSE only, with 32-bit float calculations (max 8 digits).
1.5241577E+7 fits inside a real4, but 1.5241577E+9 doesn't.

AFAIK the largest number that fits inside a real4 is 16777215 (24bit), maybe I'm wrong here?
If so I need to do calculations for more than 8 digits.

You are not right: in real4 the decimal exponent goes from -38 to +38, so the converter should show numbers from d.xxxxxxE-38 to d.xxxxxxE+38 (see simplyFPU). So if it shows 1.5241577E+9, it is far from the limit. You don't need to do calculations for more than 8 digits, but you do need to decode the exponent part. It seems that you don't do that.
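
To illustrate the point about decoding the exponent part, here is a minimal sketch in C (not MASM, and not Siekmanski's routine). The names `Real4Fields` and `real4_decode` are hypothetical. It assumes the usual IEEE-754 single layout: 1 sign bit, 8 exponent bits biased by 127, and 23 fraction bits. Note that 16777215 is 2^24 - 1, the limit up to which every integer is exactly representable, not the largest value a real4 can hold.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three raw fields of a real4 (IEEE-754 single). */
typedef struct {
    uint32_t sign;      /* 1 bit: 0 = positive, 1 = negative */
    int32_t  exponent;  /* unbiased binary exponent (stored value minus 127);
                           only meaningful for normalized numbers */
    uint32_t mantissa;  /* the 23 stored fraction bits, without the implicit leading 1 */
} Real4Fields;

/* Split a float into its fields. memcpy is used to reinterpret the bits safely. */
static Real4Fields real4_decode(float f)
{
    uint32_t bits;
    Real4Fields r;
    memcpy(&bits, &f, sizeof bits);
    r.sign     = bits >> 31;
    r.exponent = (int32_t)((bits >> 23) & 0xFF) - 127;
    r.mantissa = bits & 0x7FFFFF;
    return r;
}
```

With this sketch, 1.5241577E+9 decodes to a binary exponent of 30 (it lies between 2^30 and 2^31), which is well inside what the 8-bit exponent field can hold.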

Siekmanski

Thanks Rui,

Now I know what to do.  :t

nidud

#20
deleted

Siekmanski

Thanks nidud,  :t

My mistake was that I thought there would be no more than 8 digits in the largest value.
I misunderstood the real4 format.

Thanks to Rui I'm a bit wiser now.

So far I tested this in masm to see what the largest real4 value would be:

masm real4 3.40282356E+38     maximum input for a real4 value
sprintf    340282306073709650000000000000000000000.000000 39:6 digits this is the result from sprintf
sprintf    3.40282e+038 scientific notation

A real4 can have up to 39 digits before the decimal point. Only the 7 most significant digits are reliable; the digits that follow are garbage but still have to be counted to present the number, and the remaining digits are just zeros.
sprintf prints the first 7-8 digits, followed by 9 or 10 garbage digits; the rest are zeros.

If the whole number fits in 8 digits, I'll print it as such; otherwise I'll print it in scientific notation with 8 digits.
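
The decision described above could be sketched in C roughly as follows. This is only an illustration built on `snprintf`, not the SSE routine from this thread. The name `real4_to_text` is hypothetical, and the interpretation of "fits in 8 digits" as 1e-4 <= |f| < 1e8 (or zero) is an assumption.

```c
#include <stdio.h>

/* Hypothetical helper: plain notation when the value fits in 8 digits,
   scientific notation with 8 significant digits otherwise.
   Returns the number of characters written (as snprintf does). */
static int real4_to_text(char *buf, size_t size, float f)
{
    double a = (f < 0.0f) ? -(double)f : (double)f;   /* magnitude */

    /* Assumption: "fits in 8 digits" means 1e-4 <= |f| < 1e8, or zero. */
    if (a == 0.0 || (a >= 1e-4 && a < 1e8))
        return snprintf(buf, size, "%.8g", (double)f); /* up to 8 significant digits */

    return snprintf(buf, size, "%.7e", (double)f);     /* 1 digit + 7 decimal places */
}
```

For example, 1234.5 is printed as plain digits, while 1.5241577E+9 goes to the scientific branch. The exponent width (e+09 or e+009) depends on the C runtime.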

RuiLoureiro

Hi Siekmanski,
>> If the number fits as a whole in 8 digits, I'll print it as such; else, I will print it as scientific notation with 8 digits. (which means 7 decimal places)

Very well, that seems to be a good decision (we don't need to see garbage) :t

Siekmanski

In the previous sources I calculated with 8 digits, which caused the occasional rounding errors.
I also didn't have enough knowledge of the internal workings of the floating-point format.

This new routine uses 7 digits for the calculations and does the job without errors (as far as I have tested it, no errors occurred).
It now covers the whole range, with magnitudes from 1.175494E-38 to 3.402823E+38.

I'm still not happy with the speed of the maximum-digits-count routine.
It now uses a fast Log10(x)+1 approximation, but it needs a few checks to get the exact number of digits from the float.
For now it only prints in scientific notation, but it's fast and needs no memory swaps to insert the decimal point when constructing the string.
The decimal point is now integrated into the ASCII converter constant.
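
One common way to get Log10(x)+1 with only one correction check is to estimate it from the binary exponent and then compare against a table of powers of ten. The C sketch below shows that idea. It is not the routine from the posted source: the name `real4_int_digits` is hypothetical, and it assumes a normalized input with 1 <= |f| < 1e22, so that every table entry is an exact double.

```c
#include <stdint.h>
#include <string.h>

/* Powers of ten; every value up to 1e22 is exactly representable as a double. */
static const double POW10[23] = {
    1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
    1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
    1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22
};

/* Hypothetical helper: number of decimal digits in the integer part of |f|.
   Assumption: f is normalized and 1 <= |f| < 1e22. */
static int real4_int_digits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);

    int e = (int)((bits >> 23) & 0xFF) - 127;  /* floor(log2 |f|) */
    int t = ((e + 1) * 1233) >> 12;            /* about (e+1) * log10(2); 1233/4096 = 0.30103 */
    double a = (f < 0.0f) ? -(double)f : (double)f;

    /* The estimate t is either floor(log10 |f|) or one too high: fix with one compare. */
    return t + 1 - (a < POW10[t]);
}
```

For the benchmark value 1.234567e+14 this gives 15 digits: the binary exponent is 46, the estimate is 14, and the value is not below 1e14.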

I'll continue and try to write the fastest possible float-to-ASCII routine.
I still have another idea: writing the maximum-digits-count routine in a completely different way, which I hope will be faster than the Log10(x)+1 approach.
Next week I'll start coding it and see whether it is faster.
To be continued.

I have posted the fully commented source code in the first post. http://masm32.com/board/index.php?topic=7441.msg81351#msg81351

SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 69 RoutineTime: 0.022400193 seconds
sprintf       Cycles: 1955 RoutineTime: 0.600757429 seconds

Result Real4_2_ASCII:  1.234567e+14
Result sprintf      : 1.234567e+014

Press any key to continue...

mabdelouahab


SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Core(TM) i5-4210U CPU @ 1.70GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 61 RoutineTime: 0.028886304 seconds
sprintf       Cycles: 1866 RoutineTime: 3.438464466 seconds

Result Real4_2_ASCII:  1.234567e+14
Result sprintf      : 1.234567e+014

Press any key to continue...

HSE


AMD A6-3500 APU with Radeon(tm) HD Graphics

Real4_2_ASCII Cycles: 98 RoutineTime: 0.049566970 seconds
sprintf       Cycles: 2630 RoutineTime: 1.251230629 seconds

:t
Equations in Assembly: SmplMath

Siekmanski

Rui found a typo in the floating-point exceptions list for the +Infinity message.

Change 0FF800000h to 07F800000h. It should be:

    cmp         eax,07F800000h
    je          message_Inf


HSE

Those lines are easier to read with a macro:
    ; check floating-point exceptions
    check macro value, message
        cmp         eax, &value
        je          &message
    endm   

    check 0FFFFFFFFh, message_QnegNaN
    check 0FFC00001h, message_QnegNaN
    check 0FFBFFFFFh, message_SnegNaN
    check 0FF800001h, message_SnegNaN
    check 0FFC00000h, message_Indeterm
    check 0FF800000h, message_NegInf
    check 0FF7FFFFFh, message_NegNorm
    check 07F7FFFFFh, message_Norm
    check 07F800000h, message_Inf
    check 07F800001h, message_SNaN
    check 07FBFFFFFh, message_SNaN
    check 07FC00000h, message_QNaN
    check 07FFFFFFFh, message_QNaN
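
The constants in the list above are the end points of each class of bit patterns. As an illustration only, the same classes can be derived from the exponent and fraction fields, so that every pattern inside a range is covered. This C sketch is not part of the posted source; `Real4Class` and `real4_classify_bits` are hypothetical names, and it assumes the x86 convention that the top fraction bit marks a quiet NaN.

```c
#include <stdint.h>

/* Hypothetical classification of a real4 bit pattern (sign handled separately). */
typedef enum {
    R4_ZERO, R4_DENORMAL, R4_NORMAL, R4_INF, R4_SNAN, R4_QNAN
} Real4Class;

static Real4Class real4_classify_bits(uint32_t bits)
{
    uint32_t exp  = (bits >> 23) & 0xFF;   /* 8-bit biased exponent */
    uint32_t frac = bits & 0x7FFFFF;       /* 23-bit fraction */

    if (exp == 0xFF) {                     /* all exponent bits set: Inf or NaN */
        if (frac == 0)
            return R4_INF;                 /* 7F800000h / FF800000h */
        /* Assumption (x86): top fraction bit set = quiet NaN, clear = signaling NaN. */
        return (frac & 0x400000) ? R4_QNAN : R4_SNAN;
    }
    if (exp == 0)                          /* exponent zero: zero or denormal */
        return frac ? R4_DENORMAL : R4_ZERO;
    return R4_NORMAL;                      /* up to 7F7FFFFFh / FF7FFFFFh */
}
```

The sign is simply bit 31 (`bits >> 31`), so +Infinity and -Infinity land in the same class here and can be told apart afterwards.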


Siekmanski

Quote from: HSE on October 12, 2018, 11:09:41 AM
:t

FORTRANS

Hi.

= = =
Redirect to file.

SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Pentium(R) M processor 1.70GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 106 RoutineTime: 0.082408239 seconds
sprintf       Cycles: 4925 RoutineTime: 3.293436177 seconds

Result Real4_2_ASCII:  1.234567e+14
Result sprintf      : 1.234567e+014

Press any key to continue...
= = =
Screen Capture
F:\TEMP\TEST>REAL4_2_.EXE

SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Pentium(R) M processor 1.70GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 107 RoutineTime: 0.066058396 seconds
sprintf       Cycles: 4941 RoutineTime: 2.925534111 seconds

Result Real4_2_ASCII:  1.234567e+14
Result sprintf      : 1.234567e+014

Press any key to continue...
= = =

SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Core(TM) i3-4005U CPU @ 1.70GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 68 RoutineTime: 0.043186625 seconds
sprintf       Cycles: 2923 RoutineTime: 1.514711201 seconds

Result Real4_2_ASCII:  1.234567e+14
Result sprintf      : 1.234567e+014

Press any key to continue...


There is some timing difference between redirecting the results to a file
and a screen capture of the results.

HTH,

Steve N.