The MASM Forum

General => The Laboratory => Topic started by: jj2007 on February 11, 2024, 06:33:52 AM

Title: Fast hexstring to binary conversion
Post by: jj2007 on February 11, 2024, 06:33:52 AM
May I have some timings, please?

AMD Athlon Gold 3150U with Radeon Graphics      (SSE4)

20314   cycles for 100 * MasmBasic Val
39045   cycles for 100 * CRT a2ud
2536    cycles for 100 * Masm32 SDK hex2bin
2686    cycles for 100 * HexStr2Bin
2092    cycles for 100 * HexStr2BinT (with table)
9181    cycles for 100 * HexStr2X (64-bit)

19732   cycles for 100 * MasmBasic Val
40132   cycles for 100 * CRT a2ud
3281    cycles for 100 * Masm32 SDK hex2bin
3952    cycles for 100 * HexStr2Bin
2231    cycles for 100 * HexStr2BinT (with table)
8839    cycles for 100 * HexStr2X (64-bit)

19825   cycles for 100 * MasmBasic Val
39673   cycles for 100 * CRT a2ud
3598    cycles for 100 * Masm32 SDK hex2bin
2747    cycles for 100 * HexStr2Bin
2054    cycles for 100 * HexStr2BinT (with table)
8678    cycles for 100 * HexStr2X (64-bit)

20010   cycles for 100 * MasmBasic Val
39973   cycles for 100 * CRT a2ud
3242    cycles for 100 * Masm32 SDK hex2bin
3565    cycles for 100 * HexStr2Bin
2150    cycles for 100 * HexStr2BinT (with table)
9123    cycles for 100 * HexStr2X (64-bit)

19712   cycles for 100 * MasmBasic Val
39055   cycles for 100 * CRT a2ud
2539    cycles for 100 * Masm32 SDK hex2bin
3479    cycles for 100 * HexStr2Bin
2385    cycles for 100 * HexStr2BinT (with table)
9158    cycles for 100 * HexStr2X (64-bit)

3       bytes for MasmBasic Val
19      bytes for CRT a2ud
12      bytes for Masm32 SDK hex2bin
48      bytes for HexStr2Bin
76      bytes for HexStr2BinT (with table)
104     bytes for HexStr2X (64-bit)

12ABCDEFh       eax MasmBasic Val
12ABCDEFh       eax CRT a2ud
12ABCDEFh       eax Masm32 SDK hex2bin
12ABCDEFh       eax HexStr2Bin
12ABCDEFh       eax HexStr2BinT (with table)
56789DEFh       eax HexStr2X (64-bit)

Remarks:
- The string used is 12AbCdEfh
- MasmBasic Val is slower because it's an allrounder; you can throw $123, 456h, 12345, 010010101y at it, and you'll always get the correct result. Therefore it's only twice as fast as crt_sscanf :biggrin:
- The last one, HexStr2X, uses the string 1234AbcD56789Defh; however, it returns the result in xmm0, therefore (for technical reasons) eax shows only the second half, 56789DEFh
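
For readers without the attachment, here is a minimal C sketch of what a table-driven conversion in the spirit of "HexStr2BinT (with table)" could look like. It is an illustration only, not the code that was timed above; the names hex_table, init_hex_table and hex_to_u64 are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* 256-entry lookup table: digit value for hex characters, 0xFF for anything else. */
static uint8_t hex_table[256];

static void init_hex_table(void)
{
    memset(hex_table, 0xFF, sizeof hex_table);
    for (int i = 0; i < 10; i++)
        hex_table['0' + i] = (uint8_t)i;
    for (int i = 0; i < 6; i++) {
        hex_table['A' + i] = (uint8_t)(10 + i);
        hex_table['a' + i] = (uint8_t)(10 + i);
    }
}

/* Convert the leading hex digits of s. Stops at the first non-hex character
   (for example the trailing 'h' or the terminating zero).
   Overflow beyond 64 bits is not detected in this sketch. */
static uint64_t hex_to_u64(const char *s)
{
    uint64_t value = 0;
    uint8_t d;
    while ((d = hex_table[(unsigned char)*s++]) != 0xFF)
        value = (value << 4) | d;
    return value;
}
```

Call init_hex_table() once before the first conversion. Truncating the 64-bit result for 1234AbcD56789Defh to 32 bits leaves 56789DEFh, which is the "second half" effect described in the remarks.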
Title: Re: Fast hexstring to binary conversion
Post by: NoCforMe on February 11, 2024, 07:20:27 AM
Two things:
Title: Re: Fast hexstring to binary conversion
Post by: fearless on February 11, 2024, 07:20:54 AM
AMD Ryzen 9 5950X 16-Core Processor            (SSE4)

15921  cycles for 100 * MasmBasic Val
26961  cycles for 100 * CRT a2ud
2126    cycles for 100 * Masm32 SDK hex2bin
2209    cycles for 100 * HexStr2Bin
1845    cycles for 100 * HexStr2BinT (with table)
6857    cycles for 100 * HexStr2X (64-bit)

16110  cycles for 100 * MasmBasic Val
26237  cycles for 100 * CRT a2ud
2205    cycles for 100 * Masm32 SDK hex2bin
2314    cycles for 100 * HexStr2Bin
1763    cycles for 100 * HexStr2BinT (with table)
6459    cycles for 100 * HexStr2X (64-bit)

15842  cycles for 100 * MasmBasic Val
26733  cycles for 100 * CRT a2ud
2039    cycles for 100 * Masm32 SDK hex2bin
2167    cycles for 100 * HexStr2Bin
1740    cycles for 100 * HexStr2BinT (with table)
6541    cycles for 100 * HexStr2X (64-bit)

15816  cycles for 100 * MasmBasic Val
26510  cycles for 100 * CRT a2ud
2085    cycles for 100 * Masm32 SDK hex2bin
2287    cycles for 100 * HexStr2Bin
1792    cycles for 100 * HexStr2BinT (with table)
6441    cycles for 100 * HexStr2X (64-bit)

15748  cycles for 100 * MasmBasic Val
26419  cycles for 100 * CRT a2ud
2087    cycles for 100 * Masm32 SDK hex2bin
2131    cycles for 100 * HexStr2Bin
1665    cycles for 100 * HexStr2BinT (with table)
6532    cycles for 100 * HexStr2X (64-bit)

3      bytes for MasmBasic Val
19      bytes for CRT a2ud
12      bytes for Masm32 SDK hex2bin
48      bytes for HexStr2Bin
76      bytes for HexStr2BinT (with table)
104    bytes for HexStr2X (64-bit)

12ABCDEFh      eax MasmBasic Val
12ABCDEFh      eax CRT a2ud
12ABCDEFh      eax Masm32 SDK hex2bin
12ABCDEFh      eax HexStr2Bin
12ABCDEFh      eax HexStr2BinT (with table)
56789DEFh      eax HexStr2X (64-bit)

--- ok ---
Title: Re: Fast hexstring to binary conversion
Post by: HSE on February 11, 2024, 08:00:35 AM
Intel(R) Core(TM) i3-10100 CPU @ 3.60GHz (SSE4)

21786   cycles for 100 * MasmBasic Val
38609   cycles for 100 * CRT a2ud
2921    cycles for 100 * Masm32 SDK hex2bin
2688    cycles for 100 * HexStr2Bin
1686    cycles for 100 * HexStr2BinT (with table)
6975    cycles for 100 * HexStr2X (64-bit)

19965   cycles for 100 * MasmBasic Val
38482   cycles for 100 * CRT a2ud
2960    cycles for 100 * Masm32 SDK hex2bin
3349    cycles for 100 * HexStr2Bin
1829    cycles for 100 * HexStr2BinT (with table)
6983    cycles for 100 * HexStr2X (64-bit)

19990   cycles for 100 * MasmBasic Val
38434   cycles for 100 * CRT a2ud
2973    cycles for 100 * Masm32 SDK hex2bin
3279    cycles for 100 * HexStr2Bin
1725    cycles for 100 * HexStr2BinT (with table)
6875    cycles for 100 * HexStr2X (64-bit)

19922   cycles for 100 * MasmBasic Val
38457   cycles for 100 * CRT a2ud
3042    cycles for 100 * Masm32 SDK hex2bin
2749    cycles for 100 * HexStr2Bin
1725    cycles for 100 * HexStr2BinT (with table)
6979    cycles for 100 * HexStr2X (64-bit)

19995   cycles for 100 * MasmBasic Val
38463   cycles for 100 * CRT a2ud
7092    cycles for 100 * Masm32 SDK hex2bin
4304    cycles for 100 * HexStr2Bin
5392    cycles for 100 * HexStr2BinT (with table)
14712   cycles for 100 * HexStr2X (64-bit)

3       bytes for MasmBasic Val
19      bytes for CRT a2ud
12      bytes for Masm32 SDK hex2bin
48      bytes for HexStr2Bin
76      bytes for HexStr2BinT (with table)
104     bytes for HexStr2X (64-bit)

12ABCDEFh       eax MasmBasic Val
12ABCDEFh       eax CRT a2ud
12ABCDEFh       eax Masm32 SDK hex2bin
12ABCDEFh       eax HexStr2Bin
12ABCDEFh       eax HexStr2BinT (with table)
56789DEFh       eax HexStr2X (64-bit)

--- ok ---
Title: Re: Fast hexstring to binary conversion
Post by: jj2007 on February 11, 2024, 08:11:41 AM
Quote from: NoCforMe on February 11, 2024, 07:20:27 AM
  • Is my HexString2Bin() in there somewhere?

I don't think so, but check yourself. Search the source for endp.

Quote
  • Again: who really cares how fast this is?

The OP?

@fearless & Héctor: Thanks :thup:
Title: Re: Fast hexstring to binary conversion
Post by: NoCforMe on February 11, 2024, 08:52:15 AM
Quote from: jj2007 on February 11, 2024, 08:11:41 AM
Quote from: NoCforMe on February 11, 2024, 07:20:27 AM
  • Again: who really cares how fast this is?
The OP?
Speaking of which:
Quote from: hyder on January 30, 2024, 08:09:18 AMA while back I posted a function that converts 64-bit numeric values to a string of hexadecimal digits. After much work, I've come up with an algorithm for the reverse operation: converting hexadecimal strings to a 64-bit numeric value.
So Mr. Hyde, do you have anything to say about what's evolved here from your post?
Title: Re: Fast hexstring to binary conversion
Post by: jj2007 on February 11, 2024, 12:37:47 PM
Version 2 beats the CRT by a factor of 17:
AMD Athlon Gold 3150U with Radeon Graphics      (SSE4)

19526   cycles for 100 * MasmBasic Val
39707   cycles for 100 * CRT a2ud
3668    cycles for 100 * Masm32 SDK hex2bin
3695    cycles for 100 * HexStr2Bin
2588    cycles for 100 * HexStr2BinT (with table)
2139    cycles for 100 * HexStr2X (64-bit)

20294   cycles for 100 * MasmBasic Val
39182   cycles for 100 * CRT a2ud
3594    cycles for 100 * Masm32 SDK hex2bin
3672    cycles for 100 * HexStr2Bin
2596    cycles for 100 * HexStr2BinT (with table)
2207    cycles for 100 * HexStr2X (64-bit)

19826   cycles for 100 * MasmBasic Val
39041   cycles for 100 * CRT a2ud
2552    cycles for 100 * Masm32 SDK hex2bin
3765    cycles for 100 * HexStr2Bin
2543    cycles for 100 * HexStr2BinT (with table)
2186    cycles for 100 * HexStr2X (64-bit)

20136   cycles for 100 * MasmBasic Val
39231   cycles for 100 * CRT a2ud
3732    cycles for 100 * Masm32 SDK hex2bin
3786    cycles for 100 * HexStr2Bin
2509    cycles for 100 * HexStr2BinT (with table)
2237    cycles for 100 * HexStr2X (64-bit)

20169   cycles for 100 * MasmBasic Val
39727   cycles for 100 * CRT a2ud
5039    cycles for 100 * Masm32 SDK hex2bin
2817    cycles for 100 * HexStr2Bin
2662    cycles for 100 * HexStr2BinT (with table)
2386    cycles for 100 * HexStr2X (64-bit)

3       bytes for MasmBasic Val
19      bytes for CRT a2ud
12      bytes for Masm32 SDK hex2bin
48      bytes for HexStr2Bin
76      bytes for HexStr2BinT (with table)
128     bytes for HexStr2X (64-bit)

12ABCDEFh       eax MasmBasic Val
12ABCDEFh       eax CRT a2ud
12ABCDEFh       eax Masm32 SDK hex2bin
12ABCDEFh       eax HexStr2Bin
12ABCDEFh       eax HexStr2BinT (with table)
12ABCDEFh       eax HexStr2X (64-bit)

For a better comparison, the same string is used for all algos. Note that the 64-bit HexStr2X is now the fastest, thanks to some SSE 4.1 acrobatics - sorry for our friends with legacy CPUs :cool:

Note also that the Masm32 SDK hex2bin has a strange little bug with odd-sized strings.
Title: Re: Fast hexstring to binary conversion
Post by: sinsi on February 11, 2024, 12:50:53 PM
Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz (SSE4)

15415   cycles for 100 * MasmBasic Val
27670   cycles for 100 * CRT a2ud
2401    cycles for 100 * Masm32 SDK hex2bin
2577    cycles for 100 * HexStr2Bin
1515    cycles for 100 * HexStr2BinT (with table)
1991    cycles for 100 * HexStr2X (64-bit)

15419   cycles for 100 * MasmBasic Val
27507   cycles for 100 * CRT a2ud
2387    cycles for 100 * Masm32 SDK hex2bin
2578    cycles for 100 * HexStr2Bin
1538    cycles for 100 * HexStr2BinT (with table)
1776    cycles for 100 * HexStr2X (64-bit)

15464   cycles for 100 * MasmBasic Val
27932   cycles for 100 * CRT a2ud
2384    cycles for 100 * Masm32 SDK hex2bin
2589    cycles for 100 * HexStr2Bin
1611    cycles for 100 * HexStr2BinT (with table)
1776    cycles for 100 * HexStr2X (64-bit)

15432   cycles for 100 * MasmBasic Val
27581   cycles for 100 * CRT a2ud
2393    cycles for 100 * Masm32 SDK hex2bin
2529    cycles for 100 * HexStr2Bin
1555    cycles for 100 * HexStr2BinT (with table)
1728    cycles for 100 * HexStr2X (64-bit)

15411   cycles for 100 * MasmBasic Val
27860   cycles for 100 * CRT a2ud
2441    cycles for 100 * Masm32 SDK hex2bin
2613    cycles for 100 * HexStr2Bin
1654    cycles for 100 * HexStr2BinT (with table)
1735    cycles for 100 * HexStr2X (64-bit)

3       bytes for MasmBasic Val
19      bytes for CRT a2ud
12      bytes for Masm32 SDK hex2bin
48      bytes for HexStr2Bin
76      bytes for HexStr2BinT (with table)
128     bytes for HexStr2X (64-bit)

12ABCDEFh       eax MasmBasic Val
12ABCDEFh       eax CRT a2ud
12ABCDEFh       eax Masm32 SDK hex2bin
12ABCDEFh       eax HexStr2Bin
12ABCDEFh       eax HexStr2BinT (with table)
12ABCDEFh       eax HexStr2X (64-bit)

--- ok ---
The first version ran OK; this second one tripped Windows Defender.
Title: Re: Fast hexstring to binary conversion
Post by: daydreamer on February 11, 2024, 06:20:52 PM
Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz (SSE4)

26332   cycles for 100 * MasmBasic Val
49023   cycles for 100 * CRT a2ud
3193    cycles for 100 * Masm32 SDK hex2bin
6746    cycles for 100 * HexStr2Bin
3455    cycles for 100 * HexStr2BinT (with table)
10090   cycles for 100 * HexStr2X (64-bit)

25977   cycles for 100 * MasmBasic Val
42327   cycles for 100 * CRT a2ud
3540    cycles for 100 * Masm32 SDK hex2bin
3117    cycles for 100 * HexStr2Bin
1996    cycles for 100 * HexStr2BinT (with table)
7329    cycles for 100 * HexStr2X (64-bit)

29533   cycles for 100 * MasmBasic Val
43779   cycles for 100 * CRT a2ud
4857    cycles for 100 * Masm32 SDK hex2bin
5937    cycles for 100 * HexStr2Bin
3429    cycles for 100 * HexStr2BinT (with table)
7939    cycles for 100 * HexStr2X (64-bit)

41624   cycles for 100 * MasmBasic Val
50731   cycles for 100 * CRT a2ud
3926    cycles for 100 * Masm32 SDK hex2bin
6351    cycles for 100 * HexStr2Bin
2003    cycles for 100 * HexStr2BinT (with table)
7744    cycles for 100 * HexStr2X (64-bit)

36712   cycles for 100 * MasmBasic Val
42581   cycles for 100 * CRT a2ud
3201    cycles for 100 * Masm32 SDK hex2bin
6077    cycles for 100 * HexStr2Bin
2002    cycles for 100 * HexStr2BinT (with table)
7664    cycles for 100 * HexStr2X (64-bit)

3       bytes for MasmBasic Val
19      bytes for CRT a2ud
12      bytes for Masm32 SDK hex2bin
48      bytes for HexStr2Bin
76      bytes for HexStr2BinT (with table)
104     bytes for HexStr2X (64-bit)

12ABCDEFh       eax MasmBasic Val
12ABCDEFh       eax CRT a2ud
12ABCDEFh       eax Masm32 SDK hex2bin
12ABCDEFh       eax HexStr2Bin
12ABCDEFh       eax HexStr2BinT (with table)
56789DEFh       eax HexStr2X (64-bit)

I'm also working on an SSE2 packed conversion. I wrote it while commuting on the train; now that I'm home, I need to run it through the debugger and fix it.

Title: Re: Fast hexstring to binary conversion
Post by: jj2007 on February 11, 2024, 07:55:44 PM
Quote from: sinsi on February 11, 2024, 12:50:53 PMthis second one tripped Windows Defender

The OS must be defended against exotic modern stuff like psllq, pextrb and pinsrb :thumbsup: 

Quote from: daydreamer on February 11, 2024, 06:20:52 PMworking on an SSE2 packed conversion

What kind of conversion?
Title: Re: Fast hexstring to binary conversion
Post by: hyder on February 14, 2024, 08:19:47 AM
Quote from: NoCforMe on February 11, 2024, 08:52:15 AMSo Mr. Hyde, do you have anything to say about what's evolved here from your post?
FWIW, I'm currently working on 32-bit ARM code for Volume 2 of "The Art of ARM Assembly" and I will occasionally post an x86 conversion here to see if I can get some ideas for improving the ARM code. Most of the crazy SSE/AVX stuff won't translate well, but the generic x86 code is useful to look at. I would have loved to find an SSE/AVX algorithm that processes multiple characters at a time, but nothing like that appears here (that I could see, anyway).
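
One way to process multiple characters at a time without any SSE/AVX is SWAR ("SIMD within a register"). The following is a hedged C sketch, not something posted in this thread: it converts exactly eight hex characters in one go, assumes a little-endian CPU and already validated input, and the name hex8_swar is made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Convert exactly 8 hex characters to a 32-bit value, all 8 at once.
   Assumes a little-endian CPU and valid input; nothing is validated. */
static uint32_t hex8_swar(const char *s)
{
    uint64_t x;
    memcpy(&x, s, 8);                       /* s[0] lands in the lowest byte */

    /* Per byte: take the low nibble, add 9 if bit 6 is set (A-F and a-f). */
    uint64_t n = (x & 0x0F0F0F0F0F0F0F0FULL)
               + 9 * ((x >> 6) & 0x0101010101010101ULL);

    /* Merge neighbours: nibbles -> bytes -> 16-bit words -> 32-bit result. */
    n = ((n << 4) | (n >> 8))  & 0x00FF00FF00FF00FFULL;
    n = ((n << 8) | (n >> 16)) & 0x0000FFFF0000FFFFULL;
    return (uint32_t)((n << 16) | (n >> 32));
}
```

The digit trick works because '0'-'9' have bit 6 clear and a low nibble of 0-9, while the letters have bit 6 set and a low nibble of 1-6. Since it needs only shifts, masks and one multiply on a general-purpose register, the same idea should carry over to ARM, although that has not been tried here.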

There are considerable apples-to-oranges comparisons going on here. For example, none of the routines I've seen handle underscores in the input, so comparing my function against those is not a good comparison; likewise, MasmBasic Val does so much more that it is also an unfair comparison. Running the tests on a single input string is dangerous, to say the least. That's why I used a large number of strings as inputs to my function, chosen so that they tended to hit some boundary conditions. Of course, in the real world, most input strings are going to be relatively short (probably four digits or less), so choosing a large number of longer strings (or a long string as your only input) can be misleading for certain algorithms.
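
To make the point about underscores and boundary conditions concrete, here is a small C sketch. It is not the function from the original post: the name parse_hex_u64 and the exact rules (an underscore is accepted anywhere, an empty string is an error) are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Parse a hex string that may contain '_' separators.
   Returns false for input without digits, an invalid character,
   or a value that does not fit into 64 bits. */
static bool parse_hex_u64(const char *s, uint64_t *out)
{
    uint64_t value = 0;
    bool seen_digit = false;

    for (; *s != '\0'; s++) {
        unsigned c = (unsigned char)*s;
        unsigned d;
        if (c == '_')
            continue;                       /* separator: skip */
        if (c >= '0' && c <= '9')
            d = c - '0';
        else if ((c | 0x20u) >= 'a' && (c | 0x20u) <= 'f')
            d = (c | 0x20u) - 'a' + 10;     /* fold upper case to lower case */
        else
            return false;
        if (value >> 60)
            return false;                   /* one more digit would overflow */
        value = (value << 4) | d;
        seen_digit = true;
    }
    if (!seen_digit)
        return false;
    *out = value;
    return true;
}
```

Inputs worth including in a test set: a single digit, the largest value FFFF_FFFF_FFFF_FFFF, a string that overflows by one digit, an empty string, a string of underscores only, and a string with an invalid character in the middle.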

Also, I rarely drop down into optimizations involving instruction scheduling or code alignment. Such code executes well on *one* CPU, not as well on other CPUs (in the same CPU family). Modern compilers (with command-line switches) do a much better job of this kind of optimization these days. I'm not saying that a human couldn't beat a compiler if they really tried; I'm just saying that a human probably wouldn't redo the code for every CPU possibility, whereas the C programmer can just change a command-line option and get better code for a different CPU. FWIW, I back-ported my ARM code to (very unstructured) C code and the compiler generated code almost identical to my hand-written code. I was then able to generate Cortex-A53 (Pi 3), -A72 (Pi 4), -A76 (Pi 5), Cortex-M7F (Teensy 4.1), Cortex-M4 (Teensy 3.2), and Cortex-M0+ (Pico) code just by changing a command-line option. Except for the Cortex-M0+ (a brain-dead instruction set), the resulting code was quite good (this was all 32-bit code, btw).


And to answer the question about "why would anyone care about the speed?"
If you're writing library code for others to call, it should be optimized for space or speed (depending on the user's requirements). I generally choose speed. Of course, for generic library code, you cannot get away with some of the algorithms posted here as they wouldn't mesh well with calling code. As I am writing my code for use in "The Art of ARM Assembly Volume 2" (32-bit code), I like to preserve all the registers, which makes the code much easier to use by assembly language programmers (especially beginners, who tend to be the ones reading my books) even if it costs a little performance. For example, I have a "print" function I call, which is a front end for the C printf function, that preserves all the registers that printf() might wipe out (a large number, considering the SSE/AVX set). Not that making printf() any faster would be noticeable (as it is *really* slow to begin with), but not having to preserve any registers around the call to print (other than possible parameters you are passing in registers) is a big win, even with the performance loss. 


Cheers,
Randy Hyde
Title: Re: Fast hexstring to binary conversion
Post by: jj2007 on February 14, 2024, 11:20:44 AM
Quote from: hyder on February 14, 2024, 08:19:47 AMfor generic library code, you cannot get away with some of the algorithms posted here as they wouldn't mesh well with calling code

Hi Randy,

Can you elaborate a bit on that one? The algos posted here are normally compatible with the Windows ABI. Some of mine may not run on very old CPUs, but they run perfectly on 99% of all machines... so I don't quite understand what you mean :cool:
Title: Re: Fast hexstring to binary conversion
Post by: jj2007 on February 14, 2024, 12:45:31 PM
Version 3: CRT sscanf is out, slightly faster strtoull is in:

AMD Athlon Gold 3150U with Radeon Graphics      (SSE4)

19228   cycles for 100 * MasmBasic Val
2638    cycles for 100 * Masm32 SDK hex2bin
25866   cycles for 100 * strtoull
1647    cycles for 100 * HexVal

19188   cycles for 100 * MasmBasic Val
2627    cycles for 100 * Masm32 SDK hex2bin
25486   cycles for 100 * strtoull
1713    cycles for 100 * HexVal

19315   cycles for 100 * MasmBasic Val
2649    cycles for 100 * Masm32 SDK hex2bin
25776   cycles for 100 * strtoull
1651    cycles for 100 * HexVal

19143   cycles for 100 * MasmBasic Val
2678    cycles for 100 * Masm32 SDK hex2bin
25735   cycles for 100 * strtoull
1665    cycles for 100 * HexVal

19124   cycles for 100 * MasmBasic Val
2641    cycles for 100 * Masm32 SDK hex2bin
25542   cycles for 100 * strtoull
1840    cycles for 100 * HexVal

strtoull sits in ucrtbase.dll, which might not be available on older Windows versions. The program checks for its presence, though.
Title: Re: Fast hexstring to binary conversion
Post by: TimoVJL on February 14, 2024, 02:31:59 PM
OS msvcrt.dll
_strtoui64 (https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/strtoui64-wcstoui64-strtoui64-l-wcstoui64-l?view=msvc-170)

AMD Athlon(tm) II X2 220 Processor (SSE3)

42190   cycles for 100 * MasmBasic Val
4935    cycles for 100 * Masm32 SDK hex2bin
41442   cycles for 100 * strtoull
Title: Re: Fast hexstring to binary conversion
Post by: jj2007 on February 14, 2024, 08:29:49 PM
Quote from: TimoVJL on February 14, 2024, 02:31:59 PMAMD Athlon(tm) II X2 220 Processor (SSE3)

42190   cycles for 100 * MasmBasic Val
41442   cycles for 100 * strtoull

Congrats, you beat MasmBasic :thup:
Title: Re: Fast hexstring to binary conversion
Post by: TimoVJL on February 14, 2024, 09:20:27 PM
An old AMD is still in use; I got it from my niece when I bought her an HP Intel i5 laptop.
I let the AMD Ryzen rest upstairs with its 32 GB of memory: your AMD Athlon Gold 3150U is the same kind of CPU, so it wouldn't add anything to the tests.
Title: Re: Fast hexstring to binary conversion
Post by: HSE on February 14, 2024, 10:14:56 PM
Quote from: jj2007 on February 14, 2024, 08:29:49 PMCongrats, you beat MasmBasic :thup:

Apparently it's not so hard :biggrin:, but not always :thumbsup:

Intel(R) Core(TM) i3-10100 CPU @ 3.60GHz (SSE4)

20432   cycles for 100 * MasmBasic Val
3208    cycles for 100 * Masm32 SDK hex2bin
20506   cycles for 100 * strtoull
2656    cycles for 100 * HexVal

20648   cycles for 100 * MasmBasic Val
3658    cycles for 100 * Masm32 SDK hex2bin
20626   cycles for 100 * strtoull
2764    cycles for 100 * HexVal

20714   cycles for 100 * MasmBasic Val
3649    cycles for 100 * Masm32 SDK hex2bin
20098   cycles for 100 * strtoull
2762    cycles for 100 * HexVal

20720   cycles for 100 * MasmBasic Val
3628    cycles for 100 * Masm32 SDK hex2bin
20784   cycles for 100 * strtoull
2771    cycles for 100 * HexVal

20596   cycles for 100 * MasmBasic Val
3613    cycles for 100 * Masm32 SDK hex2bin
20639   cycles for 100 * strtoull
2734    cycles for 100 * HexVal
Title: Re: Fast hexstring to binary conversion
Post by: jj2007 on February 14, 2024, 11:03:11 PM
Quote from: HSE on February 14, 2024, 10:14:56 PMApparently it's not so hard :biggrin:, but not always :thumbsup:

One should not compare apples and oranges: you can throw 0x123, $123, 123h, 1234 (ordinary decimal number) or 101010101010b at MasmBasic Val() (https://www.jj2007.eu/MasmBasicQuickReference.htm#Mb1202), and always get a correct answer - plus, for fast parsing, the number of used characters in edx *).
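
To show why an allrounder costs more than a pure hex converter, here is a rough C sketch of notation detection. It is not how MasmBasic Val works internally; val_any and digit_value are made-up names, and the sketch assumes that the string contains nothing but the number.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static int digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Guess the base from the prefix or suffix, then convert.
   *used receives the number of characters consumed, markers included. */
static uint64_t val_any(const char *s, size_t *used)
{
    size_t len = strlen(s), start = 0, end = len;
    unsigned base = 10;

    if (len > 2 && s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
        base = 16; start = 2;               /* 0x123 */
    } else if (len > 1 && s[0] == '$') {
        base = 16; start = 1;               /* $123 */
    } else if (len > 1 && (s[len - 1] == 'h' || s[len - 1] == 'H')) {
        base = 16; end = len - 1;           /* 123h */
    } else if (len > 1 && (s[len - 1] == 'b' || s[len - 1] == 'y')) {
        base = 2; end = len - 1;            /* 1010b or 1010y */
    }

    uint64_t value = 0;
    size_t i = start;
    for (; i < end; i++) {
        int d = digit_value(s[i]);
        if (d < 0 || (unsigned)d >= base)
            break;                          /* stop at the first invalid digit */
        value = value * base + (unsigned)d;
    }
    *used = (i == end) ? len : i;
    return value;
}
```

Every call pays for the prefix and suffix checks before the first digit is converted, which a dedicated hex routine can skip.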

The proper comparison is between strtoull and the new HexVal macro, which both handle hexadecimal Ascii strings; and there I see a factor 7.x ;-)

Besides, while the strtoull docs say "long long" (IBM (https://www.ibm.com/docs/en/zos/2.4.0?topic=programs-strtoull-convert-string-unsigned-long-long): "strtoull() returns the converted unsigned long long value, represented in the string"), they don't say that you have to grab the long long from eax and edx:
include \masm32\MasmBasic\MasmBasic.inc
  Init
  Cls 3
  MbHexQ=1        ; limit Hex$ to 16 bytes
  Dll "ucrtbase"
  Declare strtoull, C:3
  Let esi="1234567890AbCdEfh"
  PrintLine "strtoull returns ", Hex$(strtoull(esi, 0, 16))
  PrintLine "HexVal returns  ", Hex$(HexVal(esi, 64))
  MbHexQ=0        ; no limit
  Inkey "HexVal returns  ", Hex$(HexVal("1234567890AbCdEf1234567890AbCdEfh", 128))
EndOfCode

strtoull returns 90ABCDEF
HexVal returns  12345678 90ABCDEF
HexVal returns  12345678 90ABCDEF 12345678 90ABCDEF

*) RichMasm uses Val(): Paste 0x123, $123, 123h, 1234, 101010101010y, then select each number and hit Ctrl N.
Title: Re: Fast hexstring to binary conversion
Post by: HSE on February 14, 2024, 11:39:44 PM
Hi JJ,

Quote from: jj2007 on February 14, 2024, 11:03:11 PMOne should not compare apples and oranges

If MasmBasic Val() is not comparable, just don't compare  :biggrin:

Apparently strtoull is less friendly than MasmBasic Val(), but can process more numeric bases. I didn't even know about that function. Good investigation  :thumbsup:
Title: Re: Fast hexstring to binary conversion
Post by: jj2007 on February 15, 2024, 12:03:43 AM
Quote from: HSE on February 14, 2024, 11:39:44 PMcan process more numeric bases

If I need to convert octal numbers, I'll use strtoull then, thanks.
Title: Re: Fast hexstring to binary conversion
Post by: HSE on February 15, 2024, 12:10:26 AM
Just to note that strtoull returns its result in edx:eax

strtoull returns in edx 12345678
strtoull returns in eax 90ABCDEF
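
Seen from C, the same split looks as follows. This is only an illustrative sketch: u64_halves and parse_hex_halves are made-up names, and the struct merely mirrors what a 32-bit caller finds in edx (high half) and eax (low half).

```c
#include <stdint.h>
#include <stdlib.h>

/* The two halves of a 64-bit result, as a 32-bit caller sees them in edx:eax. */
typedef struct {
    uint32_t high;  /* edx */
    uint32_t low;   /* eax */
} u64_halves;

static u64_halves parse_hex_halves(const char *s, char **end)
{
    /* With base 16, strtoull stops at the first non-hex character,
       here the trailing 'h', and stores its address in *end. */
    unsigned long long v = strtoull(s, end, 16);
    u64_halves r = { (uint32_t)(v >> 32), (uint32_t)v };
    return r;
}
```

For "1234567890AbCdEfh" the high half is 12345678h and the low half is 90ABCDEFh, matching the two lines above.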
Title: Re: Fast hexstring to binary conversion
Post by: jj2007 on February 15, 2024, 12:46:39 AM
Quote from: jj2007 on February 14, 2024, 11:03:11 PMwhile the strtoull docs say "long long" ..., they don't say that you have to grab the long long from eax and edx

HexVal(pStr, 64) returns a QWORD in xmm0, which is easier to handle than two volatile registers.
Title: Re: Fast hexstring to binary conversion
Post by: daydreamer on February 15, 2024, 02:40:44 AM
For the best print performance, put the code in a worker thread: it has a whole separate set of registers, and the main thread prints the results.
But I have no knowledge of how ARM SIMT works.
So is the easiest way to port to ARM assembler to stick to scalar general-purpose registers, both 32-bit and 64-bit?


Title: Re: Fast hexstring to binary conversion
Post by: jj2007 on February 15, 2024, 02:42:38 AM
Hi Daydreamer, you are obviously in the wrong thread...
Title: Re: Fast hexstring to binary conversion
Post by: TimoVJL on February 15, 2024, 09:09:02 PM
Quote
It really depends on the calling convention used, but typically EAX is used for 32-bit and smaller integral data types, floating point values tend to use FPU or MMX registers, and 64-bit integral types tend to use a combination of EAX and EDX instead. Then there is the issue of complex class/struct types, in which case the compiler may decide to optimize away the return value and use an extra output parameter on the call stack to pass the returned object by reference to the caller.
Does the return value always go into eax register after a method call? (https://stackoverflow.com/questions/21195910/does-the-return-value-always-go-into-eax-register-after-a-method-call)
Title: Re: Fast hexstring to binary conversion
Post by: jj2007 on February 15, 2024, 09:53:52 PM
Quote from: TimoVJL on February 15, 2024, 09:09:02 PM64-bit integral types tend to use a combination of EAX and EDX

Quote from: jj2007 on February 15, 2024, 12:46:39 AMHexVal(pStr, 64) returns a QWORD in xmm0, which is easier to handle than two volatile registers.

HexVal("1234ABCDh") returns the value in eax, which covers probably 99% of all use cases.

When I need a 64-bit value, however, xmm0 is a far better choice, at least in a library that has no problem to Print Str$(xmm0) :cool:

It's not my fault that the C/C++ family has problems with accepting xmm regs as input...
Title: Re: Fast hexstring to binary conversion
Post by: TimoVJL on February 16, 2024, 08:58:32 PM
Quote from: jj2007 on February 15, 2024, 09:53:52 PMIt's not my fault that the C/C++ family has problems with accepting xmm regs as input...
There are so-called standards that rule some features out, aren't there?
The i386 was also tricky for standards, as CPUs don't dictate the standards of common programming languages.
Title: Re: Fast hexstring to binary conversion
Post by: jj2007 on February 16, 2024, 09:25:19 PM
Quote from: TimoVJL on February 16, 2024, 08:58:32 PMThere are so-called standards that rule some features out, aren't there?

You want us all to go back to the dark pre-SIMD ages?
Title: Re: Fast hexstring to binary conversion
Post by: TimoVJL on February 16, 2024, 10:25:25 PM
Quote from: jj2007 on February 16, 2024, 09:25:19 PM
Quote from: TimoVJL on February 16, 2024, 08:58:32 PMThere are so-called standards that rule some features out, aren't there?

You want us all to go back to the dark pre-SIMD ages?
No, x64 is already for us now  :biggrin:
Title: Re: Fast hexstring to binary conversion
Post by: daydreamer on February 22, 2024, 02:20:58 AM
Quote from: jj2007 on February 16, 2024, 09:25:19 PM
Quote from: TimoVJL on February 16, 2024, 08:58:32 PMThere are so-called standards that rule some features out, aren't there?

You want us all to go back to the dark pre-SIMD ages?
Don't forget that those dark ages of non-SIMD code, slow CPUs and tiny RAM are what started this fun of shaving clock cycles in asm; without them, we would find it pointless to make code faster on 3+ GHz CPUs and very powerful GPUs.

Some members still enjoy the bigger challenge that 16-bit real-mode DOS coding is.

After coding one non-SIMD converter and one SIMD version, I saw the pros and cons, and I'm now trying to code a hybrid version.

Title: Re: Fast hexstring to binary conversion
Post by: jj2007 on February 22, 2024, 03:03:03 AM
Quote from: daydreamer on February 22, 2024, 02:20:58 AMAfter coding one non-SIMD converter and one SIMD version, I saw the pros and cons, and I'm now trying to code a hybrid version

Show me, I'm curious :biggrin:
Title: Re: Fast hexstring to binary conversion
Post by: LiaoMi on February 22, 2024, 06:15:58 AM
Quote from: jj2007 on February 14, 2024, 12:45:31 PMVersion 3: CRT sscanf is out, slightly faster strtoull is in:

13th Gen Intel(R) Core(TM) i9-13980HX (SSE4)

11628   cycles for 100 * MasmBasic Val
1082    cycles for 100 * Masm32 SDK hex2bin
6352    cycles for 100 * strtoull
847     cycles for 100 * HexVal

11620   cycles for 100 * MasmBasic Val
1079    cycles for 100 * Masm32 SDK hex2bin
6427    cycles for 100 * strtoull
858     cycles for 100 * HexVal

11619   cycles for 100 * MasmBasic Val
1070    cycles for 100 * Masm32 SDK hex2bin
6360    cycles for 100 * strtoull
844     cycles for 100 * HexVal

11600   cycles for 100 * MasmBasic Val
1109    cycles for 100 * Masm32 SDK hex2bin
6444    cycles for 100 * strtoull
857     cycles for 100 * HexVal

11669   cycles for 100 * MasmBasic Val
1090    cycles for 100 * Masm32 SDK hex2bin
6359    cycles for 100 * strtoull
858     cycles for 100 * HexVal

3       bytes for MasmBasic Val
12      bytes for Masm32 SDK hex2bin
16      bytes for strtoull
0       bytes for HexVal

12ABCDEFh       eax MasmBasic Val
12ABCDEFh       eax CRT sscanf
12ABCDEFh       eax Masm32 SDK hex2bin
12ABCDEFh       eax strtoull
12ABCDEFh       eax HexVal

--- ok ---
Title: Re: Fast hexstring to binary conversion
Post by: jj2007 on February 22, 2024, 07:03:23 AM
Quote from: LiaoMi on February 22, 2024, 06:15:58 AM11619   cycles for 100 * MasmBasic Val
1070    cycles for 100 * Masm32 SDK hex2bin
6360    cycles for 100 * strtoull
844     cycles for 100 * HexVal

That's a pretty fast cpu :thumbsup:
Title: Re: Fast hexstring to binary conversion
Post by: TimoVJL on February 22, 2024, 07:32:22 AM
Keep testing.
This test isn't reliable on variable-speed CPUs, and C compiler optimizations cause trouble.
Tests between msvcrt.dll _strtoui64 and ucrtbase.dll strtoull are interesting.
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

// Both CRT exports are cdecl and return 64 bits. The generic PROC type
// (WINAPI, returning INT_PTR) has the wrong calling convention and
// truncates the result in 32-bit builds.
typedef unsigned long long (__cdecl *STRTOULL)(const char *, char **, int);

int __cdecl main(void)
{
    volatile unsigned long long ll; // volatile: the calls must not be optimized away
    LARGE_INTEGER t1, t2;           // QueryPerformanceCounter ticks, not CPU cycles
    // LARGE_INTEGER freq;
    // QueryPerformanceFrequency(&freq);

    // 1. strtoull as linked by the compiler's own CRT
    QueryPerformanceCounter(&t1);
    for (int i = 0; i < 100; i++)
        ll = strtoull("12ABCDEFh", NULL, 16);
    QueryPerformanceCounter(&t2);
    printf("%lld\n", t2.QuadPart - t1.QuadPart);

    // 2. _strtoui64 from the OS msvcrt.dll
    HMODULE hDll = LoadLibraryA("msvcrt.dll");
    STRTOULL pProc = hDll ? (STRTOULL)GetProcAddress(hDll, "_strtoui64") : NULL;
    if (pProc) {
        QueryPerformanceCounter(&t1);
        for (int i = 0; i < 100; i++)
            ll = pProc("12ABCDEFh", NULL, 16);
        QueryPerformanceCounter(&t2);
        printf("%lld\n", t2.QuadPart - t1.QuadPart);
    }
    if (hDll) FreeLibrary(hDll);

    // 3. strtoull from ucrtbase.dll
    hDll = LoadLibraryA("ucrtbase.dll");
    pProc = hDll ? (STRTOULL)GetProcAddress(hDll, "strtoull") : NULL;
    if (pProc) {
        QueryPerformanceCounter(&t1);
        for (int i = 0; i < 100; i++)
            ll = pProc("12ABCDEFh", NULL, 16);
        QueryPerformanceCounter(&t2);
        printf("%lld\n", t2.QuadPart - t1.QuadPart);
    }
    if (hDll) FreeLibrary(hDll);

    return 0;
}
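
A portable variation on the same measurement, as a hedged sketch: taking the fastest of several rounds reduces the influence of frequency changes and interruptions, and storing each result in a volatile variable discourages the compiler from dropping the calls. The names sink, now_ns and best_round_ns are made up for this example; timespec_get is C11 and reads the wall clock, so the numbers are nanoseconds, not cycles.

```c
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

static volatile unsigned long long sink;    /* each result is stored here */

static int64_t now_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);            /* C11 wall-clock time */
    return (int64_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
}

/* Run `rounds` rounds of `calls` conversions each and
   return the duration of the fastest round in nanoseconds. */
static int64_t best_round_ns(const char *text, int rounds, int calls)
{
    int64_t best = INT64_MAX;
    for (int r = 0; r < rounds; r++) {
        int64_t t0 = now_ns();
        for (int i = 0; i < calls; i++)
            sink = strtoull(text, NULL, 16);
        int64_t dt = now_ns() - t0;
        if (dt < best)
            best = dt;
    }
    return best;
}
```

A call such as best_round_ns("12ABCDEFh", 5, 100) mirrors the five rounds of 100 conversions used in the timings of this thread.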
Title: Re: Fast hexstring to binary conversion
Post by: hyder on March 01, 2024, 08:01:45 AM
Quote from: jj2007 on February 14, 2024, 11:20:44 AM
Quote from: hyder on February 14, 2024, 08:19:47 AMfor generic library code, you cannot get away with some of the algorithms posted here as they wouldn't mesh well with calling code

Hi Randy,

Can you elaborate a bit on that one? The algos posted here are normally compatible with the Windows ABI. Some of mine may not run on very old CPUs, but they run perfectly on 99% of all machines... so I don't quite understand what you mean :cool:
I am referring to my particular code calling these routines.
For example, as an assembly programmer, I always preserve all registers I modify that don't explicitly return values. This is slower than the Intel ABI, but safer for use in assembly programs.
Cheers,
Randy Hyde