The MASM Forum

General => The Laboratory => Topic started by: jj2007 on April 17, 2024, 02:06:33 AM

Title: Very fast hex$ (our hobby is beating the CRT)
Post by: jj2007 on April 17, 2024, 02:06:33 AM
Timings please, just for fun. Credits go to qWord :thumbsup:

AMD Athlon Gold 3150U with Radeon Graphics      (SSE4)

5503    cycles for 100 * Masm32 hex$
4298    cycles for 100 * MasmBasic Hex$
352     cycles for 100 * qWord mmx hex$
428     cycles for 100 * qWord xmm hex$
44520   cycles for 100 * CRT hex$

5675    cycles for 100 * Masm32 hex$
4338    cycles for 100 * MasmBasic Hex$
377     cycles for 100 * qWord mmx hex$
431     cycles for 100 * qWord xmm hex$
44424   cycles for 100 * CRT hex$

5499    cycles for 100 * Masm32 hex$
4305    cycles for 100 * MasmBasic Hex$
353     cycles for 100 * qWord mmx hex$
427     cycles for 100 * qWord xmm hex$
44564   cycles for 100 * CRT hex$

5479    cycles for 100 * Masm32 hex$
4320    cycles for 100 * MasmBasic Hex$
352     cycles for 100 * qWord mmx hex$
433     cycles for 100 * qWord xmm hex$
47006   cycles for 100 * CRT hex$

5513    cycles for 100 * Masm32 hex$
4400    cycles for 100 * MasmBasic Hex$
361     cycles for 100 * qWord mmx hex$
429     cycles for 100 * qWord xmm hex$
44405   cycles for 100 * CRT hex$

Averages:
5505    cycles for Masm32 hex$
4321    cycles for MasmBasic Hex$
355     cycles for qWord mmx hex$
429     cycles for qWord xmm hex$
44503   cycles for CRT hex$

16      bytes for Masm32 hex$
12      bytes for MasmBasic Hex$
92      bytes for qWord mmx hex$
124     bytes for qWord xmm hex$
29      bytes for CRT hex$

Masm32 hex$                             1234ABCD
MasmBasic Hex$                          1234ABCD
qWord mmx hex$                          1234ABCD
qWord xmm hex$                          1234ABCD
CRT hex$                                1234ABCD
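
A side note on the "Averages" lines: for the runs posted in this thread they match what you get by dropping the lowest and the highest of the five timings and averaging the remaining three. That is inferred from the printed numbers only, not from the test program's source, and the name `trimmed_mean5` is made up for illustration.

```c
#include <stdlib.h>

/* qsort comparator for ints, ascending */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: average of five timings after discarding the
   lowest and the highest one, rounded to the nearest integer.
   Assumption: this mirrors how the "Averages" block is computed. */
int trimmed_mean5(const int runs[5])
{
    int sorted[5];
    for (int i = 0; i < 5; i++)
        sorted[i] = runs[i];
    qsort(sorted, 5, sizeof sorted[0], cmp_int);
    int sum = sorted[1] + sorted[2] + sorted[3];
    return (2 * sum + 3) / 6;   /* sum / 3, rounded half up */
}
```

Fed with the five Masm32 timings above (5503, 5675, 5499, 5479, 5513) it returns 5505, the value shown under "Averages"; the plain mean of all five would be about 5534.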
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: zedd on April 17, 2024, 02:11:53 AM
Intel(R) Core(TM)2 Duo CPU    E8400  @ 3.00GHz (SSE4)

4132    cycles for 100 * Masm32 hex$
8482    cycles for 100 * MasmBasic Hex$
797    cycles for 100 * qWord mmx hex$
1017    cycles for 100 * qWord xmm hex$
75531  cycles for 100 * CRT hex$

4105    cycles for 100 * Masm32 hex$
8427    cycles for 100 * MasmBasic Hex$
797    cycles for 100 * qWord mmx hex$
1018    cycles for 100 * qWord xmm hex$
75587  cycles for 100 * CRT hex$

4090    cycles for 100 * Masm32 hex$
8443    cycles for 100 * MasmBasic Hex$
795    cycles for 100 * qWord mmx hex$
1020    cycles for 100 * qWord xmm hex$
75549  cycles for 100 * CRT hex$

4089    cycles for 100 * Masm32 hex$
8482    cycles for 100 * MasmBasic Hex$
801    cycles for 100 * qWord mmx hex$
1017    cycles for 100 * qWord xmm hex$
75745  cycles for 100 * CRT hex$

4262    cycles for 100 * Masm32 hex$
8521    cycles for 100 * MasmBasic Hex$
808    cycles for 100 * qWord mmx hex$
1019    cycles for 100 * qWord xmm hex$
75669  cycles for 100 * CRT hex$

Averages:
4109    cycles for Masm32 hex$
8469    cycles for MasmBasic Hex$
798    cycles for qWord mmx hex$
1018    cycles for qWord xmm hex$
75602  cycles for CRT hex$

16      bytes for Masm32 hex$
12      bytes for MasmBasic Hex$
92      bytes for qWord mmx hex$
124    bytes for qWord xmm hex$
29      bytes for CRT hex$

Masm32 hex$                            1234ABCD
MasmBasic Hex$                          1234ABCD
qWord mmx hex$                          1234ABCD
qWord xmm hex$                          1234ABCD
CRT hex$                                1234ABCD

--- ok ---

Quote:
Timings please, just for fun.
Cycle counts??  :wink2:
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: jj2007 on April 17, 2024, 02:14:34 AM
Thanks, sudoku :thumbsup:

It seems my attempt to port qWord's algo from mmx to xmm failed miserably :sad:
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: zedd on April 17, 2024, 02:21:54 AM
Need some more intels, so I can compare others to my computer.
Masm32 beats MasmBasic. No, it can't be.  :joking:
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: sinsi on April 17, 2024, 02:40:45 AM
13th Gen Intel(R) Core(TM) i9-13900KF (SSE4)

1041    cycles for 100 * Masm32 hex$
2648    cycles for 100 * MasmBasic Hex$
317     cycles for 100 * qWord mmx hex$
246     cycles for 100 * qWord xmm hex$
18828   cycles for 100 * CRT hex$

1046    cycles for 100 * Masm32 hex$
2578    cycles for 100 * MasmBasic Hex$
314     cycles for 100 * qWord mmx hex$
249     cycles for 100 * qWord xmm hex$
18788   cycles for 100 * CRT hex$

1034    cycles for 100 * Masm32 hex$
2655    cycles for 100 * MasmBasic Hex$
315     cycles for 100 * qWord mmx hex$
248     cycles for 100 * qWord xmm hex$
18821   cycles for 100 * CRT hex$

1035    cycles for 100 * Masm32 hex$
2607    cycles for 100 * MasmBasic Hex$
334     cycles for 100 * qWord mmx hex$
254     cycles for 100 * qWord xmm hex$
18860   cycles for 100 * CRT hex$

1051    cycles for 100 * Masm32 hex$
2629    cycles for 100 * MasmBasic Hex$
321     cycles for 100 * qWord mmx hex$
248     cycles for 100 * qWord xmm hex$
18909   cycles for 100 * CRT hex$

Averages:
1041    cycles for Masm32 hex$
2628    cycles for MasmBasic Hex$
318     cycles for qWord mmx hex$
248     cycles for qWord xmm hex$
18836   cycles for CRT hex$

16      bytes for Masm32 hex$
12      bytes for MasmBasic Hex$
92      bytes for qWord mmx hex$
124     bytes for qWord xmm hex$
29      bytes for CRT hex$

Masm32 hex$                             1234ABCD
MasmBasic Hex$                          1234ABCD
qWord mmx hex$                          1234ABCD
qWord xmm hex$                          1234ABCD
CRT hex$                                1234ABCD
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: daydreamer on April 17, 2024, 03:25:59 AM
18900 cycles for the CRT vs 248 for xmm. If all CRT functions are that much slower than a library of fast asm functions, does that mean a C program using CRT functions is that much slower than an asm program using an asm library?
That's about 76.2 times faster (18900 / 248).
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: NoCforMe on April 17, 2024, 06:28:04 AM
Quote from: daydreamer on April 17, 2024, 03:25:59 AM
does that mean a C program using CRT functions is that much slower than an asm program using an asm library?

Well, of course it means that that particular CRT function is that much slower compared to the other (assembler) functions. It doesn't necessarily mean that the program as a whole is that much slower. And as I always try to point out, it may not mean anything at all if these functions are only used to display user input or output, instead of in a way which would significantly affect overall program speed (say if one is processing 100,000 numbers in a spreadsheet or something).
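
The point about the program as a whole can be put into numbers with Amdahl's law. A minimal sketch; the function name and the example fractions are assumptions for illustration, not measurements from this thread.

```c
/* Amdahl's law: overall speedup when a fraction p (0..1) of the run time
   is made s times faster and the rest stays as it is. */
double overall_speedup(double p, double s)
{
    return 1.0 / ((1.0 - p) + p / s);
}
```

With s = 76 (the CRT vs xmm ratio quoted above), a program that spends 1% of its time converting numbers gets about 1.01 times faster overall, while one that spends 90% of its time there gets about 8.9 times faster.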
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: jj2007 on April 17, 2024, 07:08:04 AM
Quote from: sudoku on April 17, 2024, 02:21:54 AM
Masm32 beats MasmBasic. No, it can't be.  :joking:

You are right - I forgot the fast option :thumbsup:

mov somevar, Hex$(123456789, fast) (https://www.jj2007.eu/MasmBasicQuickReference.htm#Mb1198) (DWORDs only)
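
For readers who want to see what such a conversion boils down to, here is a rough C sketch of a table-driven DWORD-to-hex routine next to a formatted-output call. It is not the code of MasmBasic, Masm32 or qWord's routines, and whether the "CRT hex$" test uses a call like this is an assumption; the names `hex8_table` and `hex8_crt` are made up.

```c
#include <stdint.h>
#include <stdio.h>

/* Writes exactly 8 uppercase hex digits plus a terminating zero.
   Illustrative table lookup only; not any routine timed in this thread. */
void hex8_table(uint32_t value, char out[9])
{
    static const char digits[] = "0123456789ABCDEF";
    for (int i = 7; i >= 0; i--) {
        out[i] = digits[value & 0xF];   /* lowest nibble -> last digit */
        value >>= 4;
    }
    out[8] = '\0';
}

/* Same result through the C runtime's formatted output, which has to
   interpret the format string on every call. */
void hex8_crt(uint32_t value, char out[9])
{
    snprintf(out, 9, "%08lX", (unsigned long)value);
}
```

Both give 1234ABCD for 0x1234ABCD, and 075BCD15 for the 123456789 used in the line above.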

AMD Athlon Gold 3150U with Radeon Graphics      (SSE4)

Averages:
5528   cycles for Masm32 hex$
275    cycles for MasmBasic Hex$
357    cycles for qWord mmx hex$
433    cycles for qWord xmm hex$
44624  cycles for CRT hex$
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: zedd on April 17, 2024, 07:10:29 AM
Intel(R) Core(TM)2 Duo CPU     E8400  @ 3.00GHz (SSE4)

4105    cycles for 100 * Masm32 hex$
624     cycles for 100 * MasmBasic Hex$
797     cycles for 100 * qWord mmx hex$
1018    cycles for 100 * qWord xmm hex$
75615   cycles for 100 * CRT hex$

4128    cycles for 100 * Masm32 hex$
603     cycles for 100 * MasmBasic Hex$
801     cycles for 100 * qWord mmx hex$
1036    cycles for 100 * qWord xmm hex$
75388   cycles for 100 * CRT hex$

4114    cycles for 100 * Masm32 hex$
594     cycles for 100 * MasmBasic Hex$
801     cycles for 100 * qWord mmx hex$
1021    cycles for 100 * qWord xmm hex$
75292   cycles for 100 * CRT hex$

4110    cycles for 100 * Masm32 hex$
593     cycles for 100 * MasmBasic Hex$
801     cycles for 100 * qWord mmx hex$
1020    cycles for 100 * qWord xmm hex$
75426   cycles for 100 * CRT hex$

4103    cycles for 100 * Masm32 hex$
594     cycles for 100 * MasmBasic Hex$
798     cycles for 100 * qWord mmx hex$
1021    cycles for 100 * qWord xmm hex$
75507   cycles for 100 * CRT hex$

Averages:
4110    cycles for Masm32 hex$
597     cycles for MasmBasic Hex$
800     cycles for qWord mmx hex$
1021    cycles for qWord xmm hex$
75440   cycles for CRT hex$

16      bytes for Masm32 hex$
16      bytes for MasmBasic Hex$
92      bytes for qWord mmx hex$
124     bytes for qWord xmm hex$
29      bytes for CRT hex$

Masm32 hex$                             1234ABCD
MasmBasic Hex$                          1234ABCD
qWord mmx hex$                          1234ABCD
qWord xmm hex$                          1234ABCD
CRT hex$                                1234ABCD

--- ok ---

Turbo Mode!  :biggrin:
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: sinsi on April 17, 2024, 07:21:49 AM
13th Gen Intel(R) Core(TM) i9-13900KF (SSE4)

Averages:
1036    cycles for Masm32 hex$
132     cycles for MasmBasic Hex$
306     cycles for qWord mmx hex$
249     cycles for qWord xmm hex$
18706   cycles for CRT hex$
Showoff
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: jj2007 on April 17, 2024, 07:43:15 AM
Thanks, folks :thup:

Honestly, I had forgotten the fast option, which was tested in this thread (https://masm32.com/board/index.php?msg=122102) in July 2023 :cool:

Quote from: NoCforMe on April 17, 2024, 06:28:04 AM
it may not mean anything at all if these functions are only used to display user input or output

That would indeed be nonsense, but nobody proposed that.
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: NoCforMe on April 17, 2024, 08:22:48 AM
Quote from: jj2007 on April 17, 2024, 07:43:15 AM
Thanks, folks :thup:

Honestly, I had forgotten the fast option, which was tested in this thread (https://masm32.com/board/index.php?msg=122102) in July 2023 :cool:

Quote from: NoCforMe on April 17, 2024, 06:28:04 AM
it may not mean anything at all if these functions are only used to display user input or output

That would indeed be nonsense, but nobody proposed that.

No, nobody proposed that, but everybody seems to be ignoring that.

I'd be willing to bet that 80-90% of use cases for these (numeric conversion) functions are for displaying or inputting small amounts of user output/input. Can't prove it, of course, but I'm pretty sure.

How many people here are actually writing applications where it does make a difference in speed? Someone somewhere here mentioned spreadsheets, but really, who besides Micro$oft is actually coding a spreadsheet?
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: jj2007 on April 17, 2024, 08:44:59 AM
Quote from: NoCforMe on April 17, 2024, 08:22:48 AM
who besides Micro$oft is actually coding a spreadsheet?

May 28, 2014, 01:53:07 PM: Spreadsheet viewer (https://masm32.com/board/index.php?topic=3231.0) (nowadays an ultrafast spreadsheet editor)

April 01, 2024, 09:53:55 AM: Obsession with speed (https://masm32.com/board/index.php?msg=128433)

You are missing the point. An application that uses a slow library will be slow because everything it does depends on slow functions. Every sane programmer will avoid using slow functions in an innermost loop, but if the only library he has is as slow as the CRT, then inevitably his applications will be, ehm, a bit slow. And the world of software is full of awfully slow applications. Oh, btw, LibreOffice seems to use a slow library: I once measured its spreadsheet editor Calc's sorting performance against M$ Excel: horrible, more than a factor of 10 slower (others have done that, too (https://gigazine.net/gsc_news/en/20191216-spreadsheet-benchmark/)). Which doesn't mean that Excel is fast, of course - my spreadsheet editor sorts columns considerably faster than Excel...

Quote:
In addition, when Google Sheet exceeded 20,000 lines, processing took 1.5 seconds, and when it exceeded 50,000 lines, it was found that it waited nearly 5 seconds. (https://gigazine.net/gsc_news/en/20191216-spreadsheet-benchmark/)

Quote:
all three spreadsheet systems, i.e., Excel, Calc, and Google Sheets, require more than 500ms to sort a spreadsheet with 10k, 6k, and 10k rows, respectively (https://sajjadur.net/files/benchmarking_spreadsheets.pdf)

My spreadsheet editor is not perfect, but it sorts a 44,600-row spreadsheet in about 30 milliseconds. It uses a fast library.
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: sinsi on April 17, 2024, 09:10:01 AM
OK, real world scenario. You are using an Explorer-like interface and your ListView needs to show file sizes.
Would a function that takes 10 times longer be OK? No worries for a half-screen of files, but 10,000?
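
To put a rough number on a scenario like this, the cycle counts posted in the thread can be scaled up. A back-of-the-envelope sketch only: the hex$ figures stand in for whatever formatting a ListView would really need, the 3 GHz clock is an assumed value, and `convert_ms` is a made-up name.

```c
/* Hypothetical estimate: milliseconds spent converting n numbers, given a
   benchmark figure in cycles per 100 calls and an assumed clock in GHz. */
double convert_ms(double n, double cycles_per_100, double clock_ghz)
{
    double cycles = n * cycles_per_100 / 100.0;
    return cycles / (clock_ghz * 1e6);   /* 1 GHz = 1e6 cycles per ms */
}
```

With the i9-13900KF averages above, 10,000 conversions come to roughly 0.63 ms at 18836 cycles per 100 calls (CRT) and roughly 0.008 ms at 248 cycles per 100 calls (qWord xmm), both at the assumed 3 GHz.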

Quote from: NoCforMe on April 17, 2024, 08:22:48 AM
who besides Micro$oft is actually coding a spreadsheet?
[rant]I worked as an on-site tech, the number of people who use Excel for everything would astound you (and make you slightly ill). Weekly budgets where they just add a few rows per week until there are thousands of rows, all automatically refreshing themselves every time data is entered. One lady used it to store recipes  :rolleyes: granted there's not much use for numbers there.
Every accountant in every business I've serviced has their own pet spreadsheet, filled with VBA that uses strings for numbers because they are accountants, not programmers.

Because MS made it part of the basic versions of Office, everyone uses and abuses it.
[/rant]


Off topic, but I find it hard to take someone seriously when they use the term Micro$oft.
Makes you sound 14 :biggrin:
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: zedd on April 17, 2024, 09:21:56 AM
[off topic]I'll just sit here and eat my popcorn. [/off topic] :tongue:  :joking:  :rofl:  :badgrin:

To be ON topic, fast enough is fast enough. Always depends on the application of any algo, and how often you need to use it.
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: jj2007 on April 17, 2024, 09:35:11 AM
Popcorn is always good :thumbsup:

I am trying to get some real data from the UN (https://unstats.un.org/sdgs/indicators/databaseLegacy), but it may take a while.
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: NoCforMe on April 17, 2024, 11:02:30 AM
Quote from: sinsi on April 17, 2024, 09:10:01 AM
Quote from: NoCforMe on April 17, 2024, 08:22:48 AM
who besides Micro$oft is actually coding a spreadsheet?
[rant]I worked as an on-site tech, the number of people who use Excel for everything would astound you (and make you slightly ill).
Ackshooly, no it wouldn't and no it wouldn't: the CFO for a company I worked for for many years (in the computer industry) used not Excel, not Lotus 1-2-3 but the Lotus clone, Quattro Pro, for absolutely everything. Including some very clever uses most people would never have thought of. None of which I would consider an "abuse" of that tool (what, you're only supposed to use software for things approved of by the maker and documented in the user manual?)
Quote:
Off topic, but I find it hard to take someone seriously when they use the term Micro$oft.
Makes you sound 14 :biggrin:
Well, that's your problem, not mine.
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: NoCforMe on April 17, 2024, 11:14:22 AM
Quote from: sinsi on April 17, 2024, 09:10:01 AM
OK, real world scenario. You are using an Explorer-like interface and your ListView needs to show file sizes.
Would a function that takes 10 times longer be OK? No worries for a half-screen of files, but 10,000?
As I have taken pains to write every damn time I bring this up, that is one of the exceptions. Obviously. I still hold that those exceptions are maybe 5-10% of most usage of these functions, if that.

I was hoping JJ's UN database would show the distribution of such usages, but instead it seems to be a collection of wokeness and unattainable goals, quantified somehow.
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: TimoVJL on April 17, 2024, 05:02:10 PM
CSV / TSV tables are in text format and need conversions.
Also, endian-free tables avoid binary formats.
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: jj2007 on April 17, 2024, 05:56:54 PM
Quote from: sinsi on April 17, 2024, 09:10:01 AM
OK, real world scenario.

Another one: you have a 30k Assembly source, and you are debugging the beast. Will you use MASM (20 seconds) or UAsm (4 seconds) to build it?

Staring four entire seconds at the screen is wayyyy too slow btw. The UAsm developers should try one day to use a fast library (http://masm32.com/board/index.php?topic=94.0) for string parsing and the like :cool:
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: NoCforMe on April 17, 2024, 05:57:09 PM
I know about big-endian and little-endian, but what are "endian free tables"?
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: TimoVJL on April 17, 2024, 06:51:07 PM
Quote from: NoCforMe on April 17, 2024, 05:57:09 PM
I know about big-endian and little-endian, but what are "endian free tables"?
Tables of numerical values in text format, for transfers between different systems.
Sometimes in database formats.

jj2007 might have told that many times.
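
A small C sketch of the difference, with made-up function names: a number stored as text reads back the same on any machine, while four raw bytes from a binary file give a value that depends on the reader's byte order.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse a hex field as found in a text table. The result depends only on
   the digits, not on the byte order of the machine reading it. */
uint32_t parse_hex_field(const char *text)
{
    return (uint32_t)strtoul(text, NULL, 16);
}

/* For contrast: reinterpret four raw bytes from a binary file as a DWORD.
   The value obtained depends on the reader's byte order. */
uint32_t read_raw_dword(const unsigned char bytes[4])
{
    uint32_t v;
    memcpy(&v, bytes, sizeof v);
    return v;
}
```

`parse_hex_field("1234ABCD")` is 0x1234ABCD everywhere; the bytes CD AB 34 12 read as 0x1234ABCD on a little-endian machine and as 0xCDAB3412 on a big-endian one. The price of the text form is the conversion step discussed in this thread.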
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: NoCforMe on April 17, 2024, 06:59:46 PM
Quote from: jj2007 on April 17, 2024, 05:56:54 PM
Another one: you have a 30k Assembly source, and you are debugging the beast. Will you use MASM (20 seconds) or UAsm (4 seconds) to build it?
20 seconds? Maybe you need a new computer, JJ: my current assembly project source is 52+k, and it assembles and links (with ml.exe and link.exe) in a blink of an eye. (ML 6.14.8444) Oh, and that includes the resource compiler & converter.
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: jj2007 on April 17, 2024, 08:26:56 PM
Quote from: NoCforMe on April 17, 2024, 06:59:46 PM
my current assembly project source is 52+k, and it assembles and links (with ml.exe and link.exe) in a blink of an eye

Lucky you :thumbsup:

My sources are a bit complex. RichMasm, for example, has 25k lines, but a lot of that is macros. The resulting exe is 186,880 bytes, of which only 13k are resources*). For some time, I compared the assembly times (link and rc are always negligible), and ML was typically a factor of 3-5 slower than UAsm, which in turn was about 25% slower than AsmC.

Same for the MasmBasic library at 3.7 seconds with UAsm, over 10 seconds with ML 6.15 - it creates a 154k lib file.

Note that I invested a lot of time to make sure that all my templates build fine with MASM. They do. Sadly enough, my personal sources don't: ML.exe (any version) can't handle their complexities any more. Often, it fails with "internal error". Bad luck.

*) For comparison, your EdAsm exe has 118784 bytes, of which 62440 bytes are resources, i.e. 56,344 net exe bytes for a 10.9 kLines source with a high share of comments
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: FORTRANS on April 17, 2024, 10:22:22 PM
Hi,

   Three laptops, Intel processors.

Intel(R) Core(TM) i3-4005U CPU @ 1.70GHz (SSE4)

3276 cycles for 100 * Masm32 hex$
412 cycles for 100 * MasmBasic Hex$
678 cycles for 100 * qWord mmx hex$
686 cycles for 100 * qWord xmm hex$
60721 cycles for 100 * CRT hex$

3294 cycles for 100 * Masm32 hex$
380 cycles for 100 * MasmBasic Hex$
765 cycles for 100 * qWord mmx hex$
686 cycles for 100 * qWord xmm hex$
58533 cycles for 100 * CRT hex$

3286 cycles for 100 * Masm32 hex$
375 cycles for 100 * MasmBasic Hex$
688 cycles for 100 * qWord mmx hex$
679 cycles for 100 * qWord xmm hex$
58595 cycles for 100 * CRT hex$

3267 cycles for 100 * Masm32 hex$
375 cycles for 100 * MasmBasic Hex$
695 cycles for 100 * qWord mmx hex$
675 cycles for 100 * qWord xmm hex$
58223 cycles for 100 * CRT hex$

3266 cycles for 100 * Masm32 hex$
374 cycles for 100 * MasmBasic Hex$
676 cycles for 100 * qWord mmx hex$
675 cycles for 100 * qWord xmm hex$
62799 cycles for 100 * CRT hex$

Averages:
3276 cycles for Masm32 hex$
377 cycles for MasmBasic Hex$
687 cycles for qWord mmx hex$
680 cycles for qWord xmm hex$
59283 cycles for CRT hex$

16 bytes for Masm32 hex$
16 bytes for MasmBasic Hex$
92 bytes for qWord mmx hex$
124 bytes for qWord xmm hex$
29 bytes for CRT hex$

Masm32 hex$                             1234ABCD
MasmBasic Hex$                          1234ABCD
qWord mmx hex$                          1234ABCD
qWord xmm hex$                          1234ABCD
CRT hex$                                1234ABCD

--- ok ---

Intel(R) Pentium(R) M processor 1.70GHz (SSE2)

5053 cycles for 100 * Masm32 hex$
1152 cycles for 100 * MasmBasic Hex$
1224 cycles for 100 * qWord mmx hex$
2215 cycles for 100 * qWord xmm hex$
95099 cycles for 100 * CRT hex$

5066 cycles for 100 * Masm32 hex$
1117 cycles for 100 * MasmBasic Hex$
1218 cycles for 100 * qWord mmx hex$
2220 cycles for 100 * qWord xmm hex$
95139 cycles for 100 * CRT hex$

5056 cycles for 100 * Masm32 hex$
1122 cycles for 100 * MasmBasic Hex$
1221 cycles for 100 * qWord mmx hex$
2221 cycles for 100 * qWord xmm hex$
95172 cycles for 100 * CRT hex$

5073 cycles for 100 * Masm32 hex$
1110 cycles for 100 * MasmBasic Hex$
1209 cycles for 100 * qWord mmx hex$
2211 cycles for 100 * qWord xmm hex$
95145 cycles for 100 * CRT hex$

5057 cycles for 100 * Masm32 hex$
1126 cycles for 100 * MasmBasic Hex$
1217 cycles for 100 * qWord mmx hex$
2219 cycles for 100 * qWord xmm hex$
95141 cycles for 100 * CRT hex$

Averages:
5060 cycles for Masm32 hex$
1122 cycles for MasmBasic Hex$
1219 cycles for qWord mmx hex$
2218 cycles for qWord xmm hex$
95142 cycles for CRT hex$

16 bytes for Masm32 hex$
16 bytes for MasmBasic Hex$
92 bytes for qWord mmx hex$
124 bytes for qWord xmm hex$
29 bytes for CRT hex$

Masm32 hex$                             1234ABCD
MasmBasic Hex$                          1234ABCD
qWord mmx hex$                          1234ABCD
qWord xmm hex$                          1234ABCD
CRT hex$                                1234ABCD

--- ok ---

Intel(R) Core(TM) i3-10110U CPU @ 2.10GHz (SSE4)

3186 cycles for 100 * Masm32 hex$
791 cycles for 100 * MasmBasic Hex$
892 cycles for 100 * qWord mmx hex$
669 cycles for 100 * qWord xmm hex$
66380 cycles for 100 * CRT hex$

2895 cycles for 100 * Masm32 hex$
412 cycles for 100 * MasmBasic Hex$
603 cycles for 100 * qWord mmx hex$
579 cycles for 100 * qWord xmm hex$
50447 cycles for 100 * CRT hex$

2569 cycles for 100 * Masm32 hex$
351 cycles for 100 * MasmBasic Hex$
977 cycles for 100 * qWord mmx hex$
1061 cycles for 100 * qWord xmm hex$
72118 cycles for 100 * CRT hex$

3579 cycles for 100 * Masm32 hex$
515 cycles for 100 * MasmBasic Hex$
553 cycles for 100 * qWord mmx hex$
540 cycles for 100 * qWord xmm hex$
51155 cycles for 100 * CRT hex$

2973 cycles for 100 * Masm32 hex$
366 cycles for 100 * MasmBasic Hex$
533 cycles for 100 * qWord mmx hex$
795 cycles for 100 * qWord xmm hex$
57763 cycles for 100 * CRT hex$

Averages:
3018 cycles for Masm32 hex$
431 cycles for MasmBasic Hex$
683 cycles for qWord mmx hex$
681 cycles for qWord xmm hex$
58433 cycles for CRT hex$

16 bytes for Masm32 hex$
16 bytes for MasmBasic Hex$
92 bytes for qWord mmx hex$
124 bytes for qWord xmm hex$
29 bytes for CRT hex$

Masm32 hex$                             1234ABCD
MasmBasic Hex$                          1234ABCD
qWord mmx hex$                          1234ABCD
qWord xmm hex$                          1234ABCD
CRT hex$                                1234ABCD

--- ok ---

Regards,

Steve
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: daydreamer on April 18, 2024, 01:30:20 AM
Quote from: NoCforMe on April 17, 2024, 08:22:48 AM
How many people here are actually writing applications where it does make a difference in speed? Someone somewhere here mentioned spreadsheets, but really, who besides Micro$oft is actually coding a spreadsheet?
In non-hardware-accelerated games speed does matter, especially if you are limited to a 16-bit DOS emulator, which makes x86 code run much slower than on your 3 GHz CPU.
I have had several Android devices with many different office clone apps; it seems each brand developed its own version of an office clone.
Btw the name Quattro Pro sounds more like an Audi car model than a computer program  :biggrin:
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: jj2007 on April 18, 2024, 01:51:34 AM
Quote from: FORTRANS on April 17, 2024, 10:22:22 PM
Three laptops, Intel processors.

Thanks a lot, Steve :thup:
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: sinsi on April 19, 2024, 12:02:37 PM
I've been playing around with threads and had a crazy idea...multithreaded hex$  :biggrin:

Create 4 suspended threads. Each thread is passed the offset of a byte ([number+i]) and a word ([result+j]).
Fill each thread's byte, start the timer and the threads and have them convert it to two ASCII characters.
Wait until all threads finish, get the elapsed time.
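
The data split behind this idea can be sketched in C as below. The four jobs are run one after another here, so this shows only how the work divides, not how fast it would be; the thread API is deliberately left out, and the function names are made up.

```c
#include <stdint.h>

/* One job as described above: read one byte, write two ASCII hex digits.
   Each job touches only its own input byte and its own output word, so
   four such jobs do not depend on each other. */
void hex_byte_job(const uint8_t *src, char *dst)
{
    static const char digits[] = "0123456789ABCDEF";
    dst[0] = digits[*src >> 4];
    dst[1] = digits[*src & 0x0F];
}

/* Runs the four jobs sequentially; in the proposal each call would be
   handed to its own thread. bytes[] holds the number in little-endian
   order, as on x86, so byte i lands at character position 2*(3-i). */
void hex_dword_as_four_jobs(const uint8_t bytes[4], char out[9])
{
    for (int i = 0; i < 4; i++)
        hex_byte_job(&bytes[i], &out[2 * (3 - i)]);
    out[8] = '\0';
}
```

For the bytes CD AB 34 12 (the DWORD 1234ABCDh in memory) the output buffer ends up holding 1234ABCD.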

Faster?
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: zedd on April 19, 2024, 12:19:41 PM
Quote from: sinsi on April 19, 2024, 12:02:37 PM
...multithreaded hex$  :biggrin:
You would have to include the overhead for the thread creation, etc. in the times... to be fair to the other algos, i.e. not from thread(s) start to thread(s) end only.  Are you already experimenting with it?
...  Yes, *should* be faster.
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: sinsi on April 19, 2024, 12:29:32 PM
You could create the threads and leave them suspended, just wake them up to use them then back to sleep.

Quote from: sudoku on April 19, 2024, 12:19:41 PM
Are you already experimenting with it?
I'm waiting for someone to tell me "you're crazy" or "hmmm, interesting".
It's a silly idea as far as using it for something so trivial, but there aren't that many practical uses for threads.
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: zedd on April 19, 2024, 12:31:28 PM
@sinsi: You're crazy.   :tongue:

I "might" take a look at this later... "maybe" (I'm lazy)
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: zedd on April 19, 2024, 12:37:03 PM
Quote from: sinsi on April 19, 2024, 12:29:32 PM
It's a silly idea as far as using it for something so trivial...
Right. That's why you would do as many conversions as possible to justify the overhead... not just one conversion. The overhead might negate any savings otherwise. Would be perfect for a spreadsheet, methinks.
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: daydreamer on April 19, 2024, 03:52:03 PM
Quote from: sinsi on April 19, 2024, 12:29:32 PM
You could create the threads and leave them suspended, just wake them up to use them then back to sleep.

Quote from: sudoku on April 19, 2024, 12:19:41 PM
Are you already experimenting with it?
I'm waiting for someone to tell me "you're crazy" or "hmmm, interesting".
It's a silly idea as far as using it for something so trivial, but there aren't that many practical uses for threads.
Interesting. I tried SIMT using one worker thread that adds together very big Fibonacci numbers, while the main thread takes care of printing the numbers, which takes milliseconds.
So I reduce the printing in the loop that adds the Fibonacci numbers, i.e. reduce the slowest thing, which takes milliseconds, when the rest of the loop takes clock cycles.
I also tried a worker thread in a Windows program and timed it: it took 10 times more clock cycles than the PeekMessage method.


Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: jj2007 on April 19, 2024, 05:51:27 PM
Quote from: sinsi on April 19, 2024, 12:02:37 PM
Faster?

Thread overhead is far too high for individual numbers. However, if you have a gigabyte to convert, splitting the task will certainly be faster.
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: sinsi on April 19, 2024, 06:11:37 PM
Create the threads beforehand (like making a lookup table, once is enough).
Each thread suspends itself after the calculation by setting a bit to say "I've finished" and SuspendThread.
The thread waits for the next job via ResumeThread.

A more realistic use might be to fill a huge block of memory?
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: jj2007 on April 19, 2024, 06:41:35 PM
Quote from: sinsi on April 19, 2024, 06:11:37 PM
A more realistic use might be to fill a huge block of memory?

Yes indeed. And each thread working on "its" slots, i.e.
thread #1 working on start+16*n+0
thread #2 working on start+16*n+4
thread #3 working on start+16*n+8
thread #4 working on start+16*n+12

That would ensure good usage of the L1 cache while keeping busy a significant share of the CPU. I wonder, though, about cache misses if one thread lags behind. That is, if there can be a significant speed difference.
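
The slot assignment listed above can be sketched in C as follows. Assumptions: the input is an array of DWORDs, each result takes 8 characters in the output buffer, and `convert_slots` is a made-up name. The four workers are called one after another here; in real use each call would run on its own thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Converts the DWORDs at byte offsets 16*n + 4*worker (worker = 0..3),
   i.e. every fourth DWORD starting at index `worker`. Each result takes
   8 characters in dst, without a terminator. */
void convert_slots(const uint32_t *src, char *dst, size_t count, size_t worker)
{
    static const char digits[] = "0123456789ABCDEF";
    for (size_t i = worker; i < count; i += 4) {
        uint32_t v = src[i];
        for (int d = 7; d >= 0; d--) {
            dst[8 * i + (size_t)d] = digits[v & 0xF];
            v >>= 4;
        }
    }
}
```

Calling it with worker = 0, 1, 2 and 3 covers every element exactly once. Whether this interleaved layout is cache-friendly in practice would have to be measured: threads on different cores writing into the same cache line can slow each other down (false sharing).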
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: daydreamer on April 20, 2024, 12:15:06 AM
I would like to start a new thread for SIMT exercises + timings.
Sleep in the main thread suggests the CPU switches to another thread after starting x number of threads.
Should the program decide the # of threads after checking the CPU's # of cores? We have very different CPUs with different # of cores.
I am curious whether CreateThread can start execution directly, with a custom, bigger stack size set so local arrays can be used, vs. several invokes:
Invoke CreateThread suspended
Invoke alloc memory
Invoke start thread?
Title: Re: Very fast hex$ (our hobby is beating the CRT)
Post by: zedd on April 20, 2024, 12:39:46 AM
Quote from: daydreamer on April 20, 2024, 12:15:06 AM
I would like to start a new thread for SIMT exercises + timings
Sleep in main thread suggest cpu switches to another thread after starting x number of threads
Program Decides # threads after check cpu # of cores?, because we have very different cpu's with different # of cores
Post some code and we'll test it, Magnus...
We're always up for a new speed test/challenge...