The MASM Forum

Topic started by: Siekmanski on September 28, 2018, 07:58:04 AM

Title: SIMD Real4 to ASCII string routine
Post by: Siekmanski on September 28, 2018, 07:58:04 AM
Wrote a Real4 to ASCII string routine.
It checks for floating-point exceptions and reports a message when one occurs.

edit: See Reply #23 for comments http://masm32.com/board/index.php?topic=7441.msg81609#msg81609

If you spot something stupid or find optimizations please tell me.  :idea:

SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 69 RoutineTime: 0.022400193 seconds
sprintf       Cycles: 1955 RoutineTime: 0.600757429 seconds

Result Real4_2_ASCII:  1.234567e+14
Result sprintf      : 1.234567e+014




EDIT: Latest source code (October 11, 2018):
Title: Re: SIMD Real4 to ASCII string routine
Post by: LiaoMi on September 28, 2018, 08:27:46 AM
Hi Siekmanski,

cool implementation, thanks!


SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Core(TM) i7-4810MQ CPU @ 2.80GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 263 RoutineTime: 0.094416496 seconds
sprintf       Cycles: 1052 RoutineTime: 0.376839394 seconds

Result Real4_2_ASCII: -0.12345670
Result sprintf      : -0.123457

Press any key to continue...
Title: Re: SIMD Real4 to ASCII string routine
Post by: jj2007 on September 28, 2018, 08:35:24 AM
Timings for the i7's little brother:
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 302 RoutineTime: 0.128082656 seconds
sprintf       Cycles: 1328 RoutineTime: 0.513514157 seconds
Title: Re: SIMD Real4 to ASCII string routine
Post by: Siekmanski on September 28, 2018, 09:09:42 AM
Minor correction in the source code: the wrong character was sent to the string when correcting the rounding.  :lol: :lol:
New source code posted in original post.
Title: Re: SIMD Real4 to ASCII string routine
Post by: HSE on September 28, 2018, 09:38:31 AM
Very nice  :t


SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

AMD A6-3500 APU with Radeon(tm) HD Graphics

Routine timers starting now....

Real4_2_ASCII Cycles: 802 RoutineTime: 0.391841977 seconds
sprintf       Cycles: 1602 RoutineTime: 0.771314581 seconds

Result Real4_2_ASCII: -0.12345675
Result sprintf      : -0.123457

Press any key to continue...
Title: Re: SIMD Real4 to ASCII string routine
Post by: Mikl__ on September 28, 2018, 05:40:38 PM
Hi, Siekmanski!
SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Core(TM) i3-4330 CPU @ 3.50GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 327 RoutineTime: 0.094714404 seconds
sprintf       Cycles: 1308 RoutineTime: 0.374306773 seconds

Result Real4_2_ASCII: -0.12345675
Result sprintf      : -0.123457

Press any key to continue...
Title: Re: SIMD Real4 to ASCII string routine
Post by: HSE on September 28, 2018, 10:19:54 PM
Almost thinking :biggrin:

How much of the difference is due to the algorithm, and how much is because the function is inside a system DLL?
Title: Re: SIMD Real4 to ASCII string routine
Post by: FORTRANS on September 28, 2018, 11:25:12 PM
Hi,

   A mixed set of results.

{P-III}
= = =
SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

????

Routine timers starting now....

{Illegal Instruction Occurred.}
= = =
SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Pentium(R) M processor 1.70GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 1249 RoutineTime: 0.741747624 seconds
sprintf       Cycles: 3200 RoutineTime: 1.885160265 seconds

Result Real4_2_ASCII: -0.12345675
Result sprintf      : -0.123457

Press any key to continue...
= = =
SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Core(TM) i3-4005U CPU @ 1.70GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 338 RoutineTime: 0.196068641 seconds
sprintf       Cycles: 1449 RoutineTime: 0.906805013 seconds

Result Real4_2_ASCII: -0.12345675
Result sprintf      : -0.123457

Press any key to continue...


   Except not a cursor key apparently.

HTH,

Steve N.
Title: Re: SIMD Real4 to ASCII string routine
Post by: RuiLoureiro on September 28, 2018, 11:33:59 PM
Hi Siekmanski,
Good job! It seems to work fine  :t
But the last digit always seems to be 5: 0 is printed as 0.00000005.
Shouldn't it be 0 or 0.0000000?
I won't post my results, my P4 is too old! :biggrin:
Title: Re: SIMD Real4 to ASCII string routine
Post by: Siekmanski on September 29, 2018, 01:02:40 AM
Thanks guys,   :biggrin:

The rounding problem is solved. I will post the new sources in a few hours, and maybe it is now 32 times faster than sprintf. Stay tuned.  :t
Title: Re: SIMD Real4 to ASCII string routine
Post by: RuiLoureiro on September 29, 2018, 01:22:35 AM
Quote from: Siekmanski on September 29, 2018, 01:02:40 AM
Thanks guys,   :biggrin:

The rounding problem is solved. I will post the new sources in a few hours, and maybe it is now 32 times faster than sprintf. Stay tuned.  :t
Well, you could call it Real4_3_ASCII, Real4_4_ASCII... to mark version 3, version 4,... so we can follow it easily. It is only an idea :t

LATER: if the result is 0, should the routine print 0.0000000? If it is 0.1, should it print 0.1000000?
Title: Re: SIMD Real4 to ASCII string routine
Post by: Siekmanski on September 29, 2018, 02:51:13 AM
Massive improvement in speed and precision.
I got new insight into the logic of the algorithm when I reread the routine notes, which I had first written on paper.  :idea:
Now it is +/- 50 times faster than sprintf.  :eusa_dance:

New source code uploaded in the first post.
Title: Re: SIMD Real4 to ASCII string routine
Post by: HSE on September 29, 2018, 03:25:43 AM
Only 41 times!

SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

AMD A6-3500 APU with Radeon(tm) HD Graphics

Routine timers starting now....

Real4_2_ASCII Cycles: 49 RoutineTime: 0.026793207 seconds
sprintf       Cycles: 2017 RoutineTime: 1.048435206 seconds

Result Real4_2_ASCII:  1.6777216
Result sprintf      : 1.677722

Press any key to continue...
Title: Re: SIMD Real4 to ASCII string routine
Post by: LiaoMi on September 29, 2018, 04:07:20 AM

SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Core(TM) i7-4810MQ CPU @ 2.80GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 24 RoutineTime: 0.010178617 seconds
sprintf       Cycles: 1239 RoutineTime: 0.439053661 seconds

Result Real4_2_ASCII:  1.6777216
Result sprintf      : 1.677722

Press any key to continue...
Title: Re: SIMD Real4 to ASCII string routine
Post by: RuiLoureiro on September 29, 2018, 04:58:01 AM
 :biggrin:
Hi Siekmanski,
The rounding problem is not solved yet: every string ends in 5, starting with 0, which gives 0.00000005, and that is not zero. -333.3 gives -333.20995 (only 3 digits are correct), I don't know why. -99999999.9 gives -0000000/5.
0.00000001  gives  0.00000005
0.000000001 gives 0.0000000<<< the same and the same for 2,3,4,5,6,7,8,9.
Title: Re: SIMD Real4 to ASCII string routine
Post by: Siekmanski on September 29, 2018, 12:52:19 PM
Hi Rui,
You are right, this is not what we want.  :(
The only logical thing I can think of right now is that the 8 digit calculation is not enough to cover the 32bit floating-point rounding phenomena.
Have to think this all over, suggestions are welcome of course.
Title: Re: SIMD Real4 to ASCII string routine
Post by: RuiLoureiro on September 30, 2018, 04:01:08 AM
Quote from: Siekmanski on September 29, 2018, 12:52:19 PM
Hi Rui,
You are right, this is not what we want.  :(
The only logical thing I can think of right now is that the 8 digit calculation is not enough to cover the 32bit floating-point rounding phenomena.
Have to think this all over, suggestions are welcome of course.
Hi Siekmanski,
It seems that you need to study the problem, or you could try another algorithm to get the digits. There seems to be a problem when the string should be printed in scientific format: either it goes wrong or it is not done at all. When we multiply -12345.678 by -123456.78 it gives 000000005, but the result should be 1.5241577E+9 (last digit rounded). So I think you need time and a little bit of work. Try another algo.  :t
Title: Re: SIMD Real4 to ASCII string routine
Post by: Siekmanski on September 30, 2018, 04:42:44 AM
Tomorrow I'll try something else. My purpose was to write it in SSE only, with 32-bit float calculations (max 8 digits).
1.5241577E+7 fits inside a real4, but 1.5241577E+9 doesn't.

AFAIK the largest number that fits inside a real4 is 16777215 (24bit), maybe I'm wrong here?
If so I need to do calculations for more than 8 digits.
Title: Re: SIMD Real4 to ASCII string routine
Post by: RuiLoureiro on September 30, 2018, 04:59:58 AM
Quote from: Siekmanski on September 30, 2018, 04:42:44 AM
Tomorrow I'll try something else. My purpose was to write it in SSE only, with 32-bit float calculations (max 8 digits).
1.5241577E+7 fits inside a real4, but 1.5241577E+9 doesn't.

AFAIK the largest number that fits inside a real4 is 16777215 (24bit), maybe I'm wrong here?
If so I need to do calculations for more than 8 digits.
You are not right: in real4 the decimal exponent goes from -38 to +38, so the converter should show numbers from d.xxxxxxE-38 to d.xxxxxxE+38 (see simplyFPU). A result of 1.5241577E+9 is far from the limit. You don't need to calculate with more than 8 digits, but you do need to decode the exponent part, and it seems that you don't.
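
To make the exponent point concrete, here is a minimal sketch in Python (not MASM, and not taken from the posted routine). It splits a real4 into its sign, binary exponent and fraction fields. The names `decode_real4` and `stored_real4` are made up for this illustration. It suggests that 16777216 (2^24) is only the limit for exactly representable consecutive integers, while much larger values are stored with a bigger exponent and a rounded fraction.

```python
import struct

def decode_real4(x):
    """Split a value, rounded to real4, into (sign, binary exponent, fraction).

    The exponent is unbiased (stored value minus 127), which is only
    meaningful for normal numbers, not for zero or denormals.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31
    exponent = ((bits >> 23) & 0xFF) - 127
    fraction = bits & 0x7FFFFF          # 23 stored fraction bits
    return sign, exponent, fraction

def stored_real4(x):
    """Return the value a real4 actually holds after rounding x."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# 2^24 fits exactly, the next integer does not and is rounded away
print(decode_real4(16777216.0), stored_real4(16777217.0))

# 1.5241577E+9 is well inside the real4 range, just not exact
print(decode_real4(1.5241577e9), stored_real4(1.5241577e9))
```

The second example lands on a multiple of 128, because that is the spacing between neighbouring real4 values in the range 2^30 to 2^31.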
Title: Re: SIMD Real4 to ASCII string routine
Post by: Siekmanski on October 01, 2018, 09:10:04 AM
Thanks Rui,

Now I know what to do.  :t
Title: Re: SIMD Real4 to ASCII string routine
Post by: nidud on October 01, 2018, 09:27:33 AM
deleted
Title: Re: SIMD Real4 to ASCII string routine
Post by: Siekmanski on October 01, 2018, 10:01:11 AM
Thanks nidud,  :t

My mistake was that I thought there would be no more than 8 digits in the largest value.
I misunderstood the real4 format.

Thanks to Rui I'm a bit wiser now.

So far I tested this in masm to see what the largest real4 value would be:

masm real4 3.40282356E+38     maximum input for a real4 value
sprintf    340282306073709650000000000000000000000.000000 39:6 digits this is the result from sprintf
sprintf    3.40282e+038 scientific notation

The maximum number of digits before the decimal point is 39. Only the 7 most significant digits are reliable; the rest is garbage, but it still has to be counted as digits to represent the magnitude of the number.
sprintf prints the first 7-8 digits, followed by 9 or 10 garbage digits; the rest are zeros.

If the number fits as a whole in 8 digits, I'll print it as such; otherwise I'll print it in scientific notation with 8 digits.
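
The digit counts above can be checked with a small Python sketch (an illustration only, not part of the posted MASM source). It expands the largest finite real4, bit pattern 07F7FFFFFh, and its nearest smaller neighbour into exact integers and counts how many leading digits they share. The helper names `real4_from_bits` and `shared_leading_digits` are assumptions made for this example.

```python
import struct

def real4_from_bits(bits):
    """Reinterpret a 32-bit pattern as a real4 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def shared_leading_digits(a, b):
    """Count the leading decimal digits two positive integers have in common."""
    count = 0
    for da, db in zip(str(a), str(b)):
        if da != db:
            break
        count += 1
    return count

largest = int(real4_from_bits(0x7F7FFFFF))     # exact integer value
neighbour = int(real4_from_bits(0x7F7FFFFE))   # next smaller real4

print(largest, len(str(largest)))
print(neighbour)
print(shared_leading_digits(largest, neighbour))
```

Two adjacent real4 values this large already differ in the 8th digit, which is why the digits after the first 7 or 8 carry no information about the stored number.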
Title: Re: SIMD Real4 to ASCII string routine
Post by: RuiLoureiro on October 02, 2018, 03:32:44 AM
Hi Siekmanski,
                     >> If the number fits as a whole in 8 digits, i'll print it as such else, I will print it as scientific notation with 8 digits. (which means 7 decimal places)

                        Very well, that seems to be a good decision (we don't need to see garbage) :t
Title: Re: SIMD Real4 to ASCII string routine
Post by: Siekmanski on October 12, 2018, 07:57:32 AM
In the previous sources I calculated with 8 digits, which caused the occasional rounding errors.
I also didn't know enough about the internal workings of the floating-point format.

This new routine uses 7 digits for the calculations and does the job without errors (as far as I have tested it, no errors occurred).
And it now covers the whole range -1.175494E-38 to 3.402823E+38

I'm still not happy with the speed of the routine that counts the maximum number of digits.
It now uses a fast Log10(x)+1 approximation, but it needs a few checks to get the exact number of digits of the float.
For now it only prints in scientific notation, but it's fast and needs no memory swaps to insert the decimal point while constructing the string.
The decimal point is now integrated into the ASCII converter constant.
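
One way such a Log10(x)+1 estimate with correction checks might look is sketched below in Python. This is an assumption about the general technique, not the code from the posted routine: it estimates the decimal exponent from the binary exponent field multiplied by log10(2), then adjusts the estimate with exact comparisons against powers of ten. The name `decimal_exponent` is made up for this example; the digit count before the decimal point would be this value plus 1 for numbers of 1 or larger.

```python
import math
import struct
from fractions import Fraction

LOG10_2 = math.log10(2.0)

def decimal_exponent(x):
    """Return floor(log10(|x|)) for a finite, non-zero value rounded to real4."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    if bits & 0x7FFFFFFF == 0 or (bits >> 23) & 0xFF == 0xFF:
        raise ValueError("zero, infinity and NaN have no digit count")
    # exact magnitude of the stored real4 value
    stored = abs(Fraction(struct.unpack("<f", struct.pack("<I", bits))[0]))
    # cheap estimate from the binary exponent field
    estimate = math.floor((((bits >> 23) & 0xFF) - 127) * LOG10_2)
    # correction checks: move the estimate until 10^e <= stored < 10^(e+1)
    while stored < Fraction(10) ** estimate:
        estimate -= 1
    while stored >= Fraction(10) ** (estimate + 1):
        estimate += 1
    return estimate

print(decimal_exponent(1.234567e14))   # value from the benchmark output
print(decimal_exponent(-333.3))
```

For normal numbers the estimate is off by at most one, so a single comparison is enough there; the loops also cover denormals, where the exponent field no longer tracks the magnitude.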

I'll continue and try to write the fastest possible float to ASCII routine.
I still have another idea: writing the maximum digits count routine in a totally different way, which I hope will be faster than the Log10(x)+1 approach.
Next week I'll start coding it and see whether it is faster or not.
Will be continued.

I have posted the fully commented source code in the first post. http://masm32.com/board/index.php?topic=7441.msg81351#msg81351

SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 69 RoutineTime: 0.022400193 seconds
sprintf       Cycles: 1955 RoutineTime: 0.600757429 seconds

Result Real4_2_ASCII:  1.234567e+14
Result sprintf      : 1.234567e+014

Press any key to continue...
Title: Re: SIMD Real4 to ASCII string routine
Post by: mabdelouahab on October 12, 2018, 09:48:43 AM

SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Core(TM) i5-4210U CPU @ 1.70GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 61 RoutineTime: 0.028886304 seconds
sprintf       Cycles: 1866 RoutineTime: 3.438464466 seconds

Result Real4_2_ASCII:  1.234567e+14
Result sprintf      : 1.234567e+014

Press any key to continue...
Title: Re: SIMD Real4 to ASCII string routine
Post by: HSE on October 12, 2018, 10:17:54 AM

AMD A6-3500 APU with Radeon(tm) HD Graphics

Real4_2_ASCII Cycles: 98 RoutineTime: 0.049566970 seconds
sprintf       Cycles: 2630 RoutineTime: 1.251230629 seconds

:t
Title: Re: SIMD Real4 to ASCII string routine
Post by: Siekmanski on October 12, 2018, 10:54:45 AM
Rui found a typo in the floating-point exceptions list for the +Infinity message.

change 0FF800000h to 07F800000h
Should be:

    cmp         eax,07F800000h
    je          message_Inf

Title: Re: SIMD Real4 to ASCII string routine
Post by: HSE on October 12, 2018, 11:09:41 AM
Those lines are easier to read this way:
    ; check floating-point exceptions
    check macro value, message
        cmp         eax, &value
        je          &message
    endm   

    check 0FFFFFFFFh, message_QnegNaN
    check 0FFC00001h, message_QnegNaN
    check 0FFBFFFFFh, message_SnegNaN
    check 0FF800001h, message_SnegNaN
    check 0FFC00000h, message_Indeterm
    check 0FF800000h, message_NegInf
    check 0FF7FFFFFh, message_NegNorm
    check 07F7FFFFFh, message_Norm
    check 07F800000h, message_Inf
    check 07F800001h, message_SNaN
    check 07FBFFFFFh, message_SNaN
    check 07FC00000h, message_QNaN
    check 07FFFFFFFh, message_QNaN
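
The list above compares against individual bit patterns, mostly the first and last value of each NaN range. Whether the full routine also catches the patterns in between is not shown in this thread. As a hedged illustration, here is a Python sketch (not MASM, and not the author's code) that classifies any real4 bit pattern from its exponent and fraction fields. The function name `classify_real4` and the returned labels are made up for this example.

```python
def classify_real4(bits):
    """Classify a 32-bit real4 pattern by its exponent and fraction fields."""
    sign = "-" if bits >> 31 else "+"
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0xFF:
        if fraction == 0:
            return sign + "Inf"
        # the top fraction bit separates quiet NaNs from signaling NaNs
        return sign + ("QNaN" if fraction & 0x400000 else "SNaN")
    if exponent == 0:
        return sign + ("Zero" if fraction == 0 else "Denormal")
    return sign + "Normal"

for pattern in (0x7F800000, 0xFF800000, 0x7FC00000, 0x7FA12345, 0x7F7FFFFF):
    print(f"{pattern:08X}h", classify_real4(pattern))
```

In this scheme 0FFC00000h, the pattern labelled Indeterm in the list, comes out as a negative quiet NaN, and 07FA12345h, which lies between the two listed SNaN boundary values, is still reported as a signaling NaN.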

Title: Re: SIMD Real4 to ASCII string routine
Post by: Siekmanski on October 12, 2018, 11:16:10 AM
Quote from: HSE on October 12, 2018, 11:09:41 AM
Those lines are easier to read this way:
    ; check floating-point exceptions
    check macro value, message
        cmp         eax, &value
        je          &message
    endm   

    check 0FFFFFFFFh, message_QnegNaN
    check 0FFC00001h, message_QnegNaN
    check 0FFBFFFFFh, message_SnegNaN
    check 0FF800001h, message_SnegNaN
    check 0FFC00000h, message_Indeterm
    check 0FF800000h, message_NegInf
    check 0FF7FFFFFh, message_NegNorm
    check 07F7FFFFFh, message_Norm
    check 07F800000h, message_Inf
    check 07F800001h, message_SNaN
    check 07FBFFFFFh, message_SNaN
    check 07FC00000h, message_QNaN
    check 07FFFFFFFh, message_QNaN



:t
Title: Re: SIMD Real4 to ASCII string routine
Post by: FORTRANS on October 13, 2018, 01:09:16 AM
Hi.

= = =
Redirect to file.

SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Pentium(R) M processor 1.70GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 106 RoutineTime: 0.082408239 seconds
sprintf       Cycles: 4925 RoutineTime: 3.293436177 seconds

Result Real4_2_ASCII:  1.234567e+14
Result sprintf      : 1.234567e+014

Press any key to continue...
= = =
Screen Capture
F:\TEMP\TEST>REAL4_2_.EXE

SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Pentium(R) M processor 1.70GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 107 RoutineTime: 0.066058396 seconds
sprintf       Cycles: 4941 RoutineTime: 2.925534111 seconds

Result Real4_2_ASCII:  1.234567e+14
Result sprintf      : 1.234567e+014

Press any key to continue...
= = =

SIMD Real4 to ASCII conversion by Siekmanski 2018.

1000000 calls per Run for the Cycle counter and the Routine timer.

Intel(R) Core(TM) i3-4005U CPU @ 1.70GHz

Routine timers starting now....

Real4_2_ASCII Cycles: 68 RoutineTime: 0.043186625 seconds
sprintf       Cycles: 2923 RoutineTime: 1.514711201 seconds

Result Real4_2_ASCII:  1.234567e+14
Result sprintf      : 1.234567e+014

Press any key to continue...


   There is some timing difference between redirecting the results to a file
and capturing them from the screen.

HTH,

Steve N.
Title: Re: SIMD Real4 to ASCII string routine
Post by: Siekmanski on October 13, 2018, 01:37:08 AM
Thank you guys.

@Steve, it seems to run faster on older CPUs.