Dword to ascii (dw2a, dwtoa, dw2str, Str$, ...)

Started by jj2007, April 01, 2024, 03:42:50 AM

jj2007

Inspired by ahsat's "Binary to displayable text, any base" thread, I searched the Lab for our fastest dword-to-string. To my surprise, there is almost nothing; please correct me if I am wrong.

So I fumbled together a testbed. Can I have some timings, please? Even better: some really fast algos?

AMD Athlon Gold 3150U with Radeon Graphics      (SSE4)

4471    cycles for 100 * dwtoa
31143   cycles for 100 * dw2a
3531    cycles for 100 * dw2str

4428    cycles for 100 * dwtoa
31325   cycles for 100 * dw2a
3456    cycles for 100 * dw2str

4463    cycles for 100 * dwtoa
31232   cycles for 100 * dw2a
3440    cycles for 100 * dw2str

4400    cycles for 100 * dwtoa
31364   cycles for 100 * dw2a
3458    cycles for 100 * dw2str

4440    cycles for 100 * dwtoa
31783   cycles for 100 * dw2a
3472    cycles for 100 * dw2str

Averages:
4444    cycles for dwtoa
31307   cycles for dw2a
3462    cycles for dw2str

20      bytes for dwtoa
20      bytes for dw2a
74      bytes for dw2str

dwtoa                                   123456789
dw2a                                    123456789
dw2str                                  123456789

dw2a is CRT printf
dw2str is an adaptation of the Masm32 lib dwtoa algo

jj2007

Version 2, with Ray's algo:
AMD Athlon Gold 3150U with Radeon Graphics      (SSE4)

4485    cycles for 100 * dwtoa
3452    cycles for 100 * dw2str
16911   cycles for 100 * MasmBasic Str$()
12479   cycles for 100 * Ray's algo

4412    cycles for 100 * dwtoa
3480    cycles for 100 * dw2str
17000   cycles for 100 * MasmBasic Str$()
12466   cycles for 100 * Ray's algo

4446    cycles for 100 * dwtoa
3470    cycles for 100 * dw2str
16844   cycles for 100 * MasmBasic Str$()
12497   cycles for 100 * Ray's algo

4400    cycles for 100 * dwtoa
3444    cycles for 100 * dw2str
16814   cycles for 100 * MasmBasic Str$()
12511   cycles for 100 * Ray's algo

Averages:
4429    cycles for dwtoa
3461    cycles for dw2str
16878   cycles for MasmBasic Str$()
12488   cycles for Ray's algo

20      bytes for dwtoa
74      bytes for dw2str
16      bytes for MasmBasic Str$()
138     bytes for Ray's algo

dwtoa                                   123456789
dw2a                                    123456789
dw2str                                  123456789
MasmBasic Str$()                        123456789
Ray's algo                              123456789

Here is Ray's algo, with minor modifications:
;----------------------------------------------------------------------------
;          Original by Ray Gwinn
; Any number to any base with a specified digit length, will pad leading zeros.
; Parameters:
;   eax    The number to convert
;   ebx    The desired base
;   ecx    The desired digit count
;   edi    Where to place the converted string
;----------------------------------------------------------------------------
ctAnyToAny:
  push edx ; save edx, or a digit
  cdq ; zero edx; cdq sign-extends eax, so this only equals xor edx, edx while eax < 80000000h
  div ebx ; generate next digit in edx
  dec ecx ; decrement digit count
  .if !Zero? ; br if done
    call ctAnyToAny ; generate next digit
  .endif
  mov al, @digits[edx] ; get the ascii value
  stosb ; save it
  pop edx ; restore edx, or get next digit
  ret
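
For anyone who wants to play with the recursion outside MASM, here is a rough C rendering of the same idea. It is only a sketch, not Ray's code: the names any_to_any and to_text are made up, the digit table is assumed to be 0-9 followed by A-Z (the post does not show @digits), and the count test is written as "> 0" so that a zero count cannot run away.

```c
#include <stdint.h>

/* Assumed digit table; the real @digits is not shown in the post. */
static const char DIGITS[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

/* Emit exactly `count` digits of `value` in `base` (2..36), most significant
   digit first. Returns a pointer just past the last digit written, much like
   edi after the final stosb. */
char *any_to_any(uint32_t value, uint32_t base, int count, char *out)
{
    uint32_t digit = value % base;          /* div ebx leaves the remainder in edx */
    if (--count > 0)                        /* dec ecx / .if !Zero? */
        out = any_to_any(value / base, base, count, out);
    *out++ = DIGITS[digit];                 /* mov al, @digits[edx] / stosb */
    return out;
}

/* Hypothetical convenience wrapper: convert, then zero-terminate. */
char *to_text(uint32_t value, uint32_t base, int count, char *buf)
{
    *any_to_any(value, base, count, buf) = '\0';
    return buf;
}
```

Each call holds its own digit back until the deeper calls have written the higher digits, so no reversing pass is needed. Short values are padded with leading zeros, e.g. to_text(255, 16, 4, buf) gives "00FF". A value with more digits than count loses its leading digits: 4171510507 with a count of 9 gives "171510507".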

sinsi

13th Gen Intel(R) Core(TM) i9-13900KF (SSE4)

2155    cycles for 100 * dwtoa
1928    cycles for 100 * dw2str
11807   cycles for 100 * MasmBasic Str$()
2912    cycles for 100 * Ray's algo

2137    cycles for 100 * dwtoa
1897    cycles for 100 * dw2str
11673   cycles for 100 * MasmBasic Str$()
2922    cycles for 100 * Ray's algo

2133    cycles for 100 * dwtoa
1913    cycles for 100 * dw2str
11662   cycles for 100 * MasmBasic Str$()
2934    cycles for 100 * Ray's algo

2133    cycles for 100 * dwtoa
1897    cycles for 100 * dw2str
11687   cycles for 100 * MasmBasic Str$()
2924    cycles for 100 * Ray's algo

Averages:
2135    cycles for dwtoa
1905    cycles for dw2str
11680   cycles for MasmBasic Str$()
2923    cycles for Ray's algo

20      bytes for dwtoa
74      bytes for dw2str
16      bytes for MasmBasic Str$()
138     bytes for Ray's algo

dwtoa                                   123456789
dw2a                                    123456789
dw2str                                  123456789
MasmBasic Str$()                        123456789
Ray's algo                              123456789

TimoVJL

AMD Athlon(tm) II X2 220 Processor (SSE3)

7967    cycles for 100 * dwtoa
7799    cycles for 100 * dw2str
27640   cycles for 100 * MasmBasic Str$()
31049   cycles for 100 * Ray's algo

7999    cycles for 100 * dwtoa
7750    cycles for 100 * dw2str
27413   cycles for 100 * MasmBasic Str$()
31224   cycles for 100 * Ray's algo

7968    cycles for 100 * dwtoa
7747    cycles for 100 * dw2str
27365   cycles for 100 * MasmBasic Str$()
31046   cycles for 100 * Ray's algo

7965    cycles for 100 * dwtoa
7747    cycles for 100 * dw2str
27587   cycles for 100 * MasmBasic Str$()
31092   cycles for 100 * Ray's algo

Averages:
7968    cycles for dwtoa
7748    cycles for dw2str
27500   cycles for MasmBasic Str$()
31070   cycles for Ray's algo

20      bytes for dwtoa
74      bytes for dw2str
16      bytes for MasmBasic Str$()
138     bytes for Ray's algo

dwtoa                                   123456789
dw2a                                    123456789
dw2str                                  123456789
MasmBasic Str$()                        123456789
Ray's algo                              123456789

--- ok ---
May the source be with you

ahsat

Any base that is a power of two, such as 2=binary, 8=octal, 16=hex, will convert faster because they can be done with shifts, avoiding time-consuming multiply and divide instructions. They will always be faster than the recursive code.

Any base that is not a power of two, like our everyday base 10=decimal and base 60=Sexagesimal :-), will convert faster using the recursive code.

Don't forget that in the everyday world, base 10 is the most used.
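
To make the power-of-two point concrete, here is a small C sketch of a hex conversion that uses only a shift and a table lookup per digit. The name dword_to_hex is made up for this example and it is not one of the algos timed in this thread; it just shows why no multiply or divide is needed when the base is 16.

```c
#include <stdint.h>

/* Write `value` as 8 hex digits plus a terminating zero; buf needs 9 bytes. */
char *dword_to_hex(uint32_t value, char *buf)
{
    static const char hex[] = "0123456789ABCDEF";
    for (int i = 0; i < 8; i++) {
        buf[i] = hex[value >> 28];   /* the top nibble is the next digit */
        value <<= 4;                 /* move the following nibble to the top */
    }
    buf[8] = '\0';
    return buf;
}
```

The same pattern should work for binary (shift by 1, 32 digits). Octal needs a little extra care because 32 bits do not split evenly into 3-bit groups.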

ahsat

Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz (SSE4)

3709    cycles for 100 * dwtoa
3567    cycles for 100 * dw2str
17710   cycles for 100 * MasmBasic Str$()
11878   cycles for 100 * Ray's algo

4336    cycles for 100 * dwtoa
3609    cycles for 100 * dw2str
15036   cycles for 100 * MasmBasic Str$()
9591    cycles for 100 * Ray's algo

3862    cycles for 100 * dwtoa
3340    cycles for 100 * dw2str
13489   cycles for 100 * MasmBasic Str$()
10160   cycles for 100 * Ray's algo

4346    cycles for 100 * dwtoa
3768    cycles for 100 * dw2str
16407   cycles for 100 * MasmBasic Str$()
10623   cycles for 100 * Ray's algo

ahsat

Quote from: sinsi on April 01, 2024, 06:22:32 AM
13th Gen Intel(R) Core(TM) i9-13900KF (SSE4)

Intel must have really improved the speed of a divide on this processor.
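
On the cost of div: where the base is fixed at 10, a common way to sidestep the divide altogether is to multiply by a scaled reciprocal and shift. The C sketch below shows that trick. It is not taken from any of the algos timed here, the names div10 and udword_to_dec are made up, and whether it actually beats div on a given CPU would need to be measured.

```c
#include <stdint.h>

/* n / 10 without a divide: multiply by ceil(2^35 / 10) = 0xCCCCCCCD,
   then shift the 64-bit product right by 35. Exact for every 32-bit n. */
uint32_t div10(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 35);
}

/* Unsigned dword to decimal; buf needs 11 bytes. Returns the digit count. */
int udword_to_dec(uint32_t n, char *buf)
{
    char tmp[10];
    int len = 0;
    do {
        uint32_t q = div10(n);
        tmp[len++] = (char)('0' + (n - q * 10));   /* remainder, also without div */
        n = q;
    } while (n != 0);
    for (int i = 0; i < len; i++)                  /* digits came out lowest first */
        buf[i] = tmp[len - 1 - i];
    buf[len] = '\0';
    return len;
}
```

In 32-bit assembly the multiply maps onto a single mul, with the high half of the product landing in edx, followed by a shift right by 3.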

jj2007

Version 3, dword to decimal string, this time with a negative number:

Averages:
4670    cycles for dwtoa
2923    cycles for dw2str
17191   cycles for MasmBasic Str$()
14168   cycles for Ray's algo

20      bytes for dwtoa
74      bytes for dw2str
16      bytes for MasmBasic Str$()
110     bytes for Ray's algo

dwtoa                                   -123456789
dw2a                                    4171510507
dw2str                                  -123456789
MasmBasic Str$()                        -123456789
Ray's algo                              171510507

For positive numbers, all algos work fine. My dw2str got a little speed boost, strangely enough also thanks to a jecxz :cool:
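
The two different results above are the same 32 bits read two ways: -123456789 as a signed dword is 4171510507 as an unsigned one. A minimal C sketch of a signed conversion follows. The name sdword_to_dec is made up, and this is not the code of dwtoa or dw2str, just one plausible way to handle the sign.

```c
#include <stdint.h>

/* Signed dword to decimal; buf needs 12 bytes ("-2147483648" plus the zero). */
char *sdword_to_dec(int32_t value, char *buf)
{
    char tmp[10];
    int len = 0;
    char *p = buf;
    uint32_t mag = (uint32_t)value;     /* same bits, unsigned view */
    if (value < 0) {
        *p++ = '-';
        mag = 0u - mag;                 /* negate as unsigned; also fine for 80000000h */
    }
    do {
        tmp[len++] = (char)('0' + mag % 10);
        mag /= 10;
    } while (mag != 0);
    while (len > 0)                     /* digits came out lowest first */
        *p++ = tmp[--len];
    *p = '\0';
    return buf;
}
```

Negating in unsigned arithmetic avoids the overflow that a signed negate of 80000000h would cause.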

NoCforMe

OK, it's spring, and apparently that's time for a favorite indoor sport around here: How Fast is My Algo?

I always have to laugh at all this attention paid here to speed when it comes to numeric-conversion routines, which seem to be the favorite category in this sport. Why? Because it's all so ridiculous; really, except in a few cases, who cares how fast they are? Usually these routines will be used to either accept or display user input/output, in which case the time it takes to do the conversion is grossly swamped by the input/output time.

But "Oh, look, my a2bin function is a whole 25 microseconds faster than yours!".

But I suppose it's a harmless sport at that. And so far as our OP here goes (here's looking at you, @ahsat), it's all part of your assembly learning experience, so I won't begrudge you that. Just don't fall for the hype here.
Assembly language programming should be fun. That's why I do it.

jj2007

Oh dear... can you explain, in a few words, why you are hanging around here? Apparently, it's not for assembly programming.

NoCforMe

I just like pissing you off, JJ.

Besides, as you well know, I've posted plenty of code here and continue to do so.

sinsi

Quote from: NoCforMe on April 01, 2024, 09:37:41 AM
OK, it's spring, and apparently that's time for a favorite indoor sport around here: How Fast is My Algo?

I always have to laugh at all this attention paid here to speed when it comes to numeric-conversion routines, which seem to be the favorite category in this sport. Why? Because it's all so ridiculous; really, except in a few cases, who cares how fast they are? Usually these routines will be used to either accept or display user input/output, in which case the time it takes to do the conversion is grossly swamped by the input/output time.

But I suppose it's a harmless sport at that. And so far as our OP here goes (here's looking at you, @ahsat), it's all part of your assembly learning experience, so I won't begrudge you that. Just don't fall for the hype here.

So when Excel wants to show 10000 integers in a spreadsheet an algo 4 times slower is enough?


More results
13th Gen Intel(R) Core(TM) i9-13900KF (SSE4)

3219    cycles for 100 * dwtoa
2059    cycles for 100 * dw2str
14581   cycles for 100 * MasmBasic Str$()
3617    cycles for 100 * Ray's algo

2692    cycles for 100 * dwtoa
2056    cycles for 100 * dw2str
14797   cycles for 100 * MasmBasic Str$()
3627    cycles for 100 * Ray's algo

3120    cycles for 100 * dwtoa
2063    cycles for 100 * dw2str
11700   cycles for 100 * MasmBasic Str$()
2900    cycles for 100 * Ray's algo

2201    cycles for 100 * dwtoa
1669    cycles for 100 * dw2str
11825   cycles for 100 * MasmBasic Str$()
2916    cycles for 100 * Ray's algo

Averages:
2906    cycles for dwtoa
2058    cycles for dw2str
13203   cycles for MasmBasic Str$()
3266    cycles for Ray's algo

20      bytes for dwtoa
74      bytes for dw2str
16      bytes for MasmBasic Str$()
110     bytes for Ray's algo

dwtoa                                   -123456789
dw2a                                    4171510507
dw2str                                  -123456789
MasmBasic Str$()                        -123456789
Ray's algo                              171510507

--- ok ---

jj2007

Quote from: NoCforMe on April 01, 2024, 09:57:18 AM
I just like pissing you off, JJ.

Besides, as you well know, I've posted plenty of code here and continue to do so.

No problem. From now on, I will intervene under each and every one of your coding attempts and leave some crispy remarks. I'm quite good at that game, too :thup:

Btw what happened to your funny "cubby holes" editor? Any chance to see a release version this year?

NoCforMe

Quote from: sinsi on April 01, 2024, 10:00:49 AM
Quote from: NoCforMe on April 01, 2024, 09:37:41 AM
OK, it's spring, and apparently that's time for a favorite indoor sport around here: How Fast is My Algo?

I always have to laugh at all this attention paid here to speed when it comes to numeric-conversion routines, which seem to be the favorite category in this sport. Why? Because it's all so ridiculous; really, except in a few cases, who cares how fast they are? Usually these routines will be used to either accept or display user input/output, in which case the time it takes to do the conversion is grossly swamped by the input/output time.
So when Excel wants to show 10000 integers in a spreadsheet an algo 4 times slower is enough?

As I wrote: except in a few cases, who cares how fast they are? That's one of those few cases.

NoCforMe

Quote from: jj2007 on April 01, 2024, 10:02:21 AM
Btw what happened to your funny "cubby holes" editor? Any chance to see a release version this year?

Yes, good question and thanks for asking. Yes, there'll be a new "release" coming along. And I think that's a perfectly fine (and humorous) way to describe it. (I do wish I had a bit more feedback on it, though.)

And of course I wish you the best of luck with your editor efforts (which I have admitted have some very nice functionality that mine lacks).