Fast hexstring to binary conversion

Started by jj2007, February 11, 2024, 06:33:52 AM

jj2007

Quote from: daydreamer on February 22, 2024, 02:20:58 AM
After trying to code one non-SIMD converter and one SIMD version, I saw the pros and cons, and I am now trying to code a hybrid version.

Show me, I'm curious :biggrin:

LiaoMi

Quote from: jj2007 on February 14, 2024, 12:45:31 PM
Version 3: CRT sscanf is out, slightly faster strtoull is in:

AMD Athlon Gold 3150U with Radeon Graphics      (SSE4)

19228  cycles for 100 * MasmBasic Val
2638    cycles for 100 * Masm32 SDK hex2bin
25866  cycles for 100 * strtoull
1647    cycles for 100 * HexVal

19188  cycles for 100 * MasmBasic Val
2627    cycles for 100 * Masm32 SDK hex2bin
25486  cycles for 100 * strtoull
1713    cycles for 100 * HexVal

19315  cycles for 100 * MasmBasic Val
2649    cycles for 100 * Masm32 SDK hex2bin
25776  cycles for 100 * strtoull
1651    cycles for 100 * HexVal

19143  cycles for 100 * MasmBasic Val
2678    cycles for 100 * Masm32 SDK hex2bin
25735  cycles for 100 * strtoull
1665    cycles for 100 * HexVal

19124  cycles for 100 * MasmBasic Val
2641    cycles for 100 * Masm32 SDK hex2bin
25542  cycles for 100 * strtoull
1840    cycles for 100 * HexVal

strtoull sits in ucrtbase.dll, which might not be available on older Windows versions. The program checks for its presence, though.
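
A minimal sketch in C of how such a presence check could look. This is an assumption, not the code the benchmark program actually uses: the helper name find_strtoull and the fallback to _strtoui64 in msvcrt.dll are hypothetical choices for illustration. On non-Windows builds the sketch simply returns the linked strtoull so that it still compiles.

```c
#include <stddef.h>
#include <stdlib.h>
#ifdef _WIN32
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#endif

/* Signature shared by strtoull and _strtoui64 (cdecl, 64-bit result). */
typedef unsigned long long (*strtoull_fn)(const char *, char **, int);

/* Hypothetical helper: returns a usable conversion routine,
   or NULL if neither DLL exports one. */
static strtoull_fn find_strtoull(void)
{
#ifdef _WIN32
    /* The modules are deliberately left loaded so that the
       returned pointer stays valid for the life of the process. */
    HMODULE h = LoadLibraryA("ucrtbase.dll");
    FARPROC p = h ? GetProcAddress(h, "strtoull") : NULL;
    if (!p) {
        /* Assumed fallback for older Windows versions. */
        h = LoadLibraryA("msvcrt.dll");
        p = h ? GetProcAddress(h, "_strtoui64") : NULL;
    }
    return (strtoull_fn)p;
#else
    return strtoull;   /* non-Windows: use the linked C library */
#endif
}
```

A caller would test the returned pointer against NULL before timing it, and skip that row of the benchmark if the routine is missing.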

13th Gen Intel(R) Core(TM) i9-13980HX (SSE4)

11628   cycles for 100 * MasmBasic Val
1082    cycles for 100 * Masm32 SDK hex2bin
6352    cycles for 100 * strtoull
847     cycles for 100 * HexVal

11620   cycles for 100 * MasmBasic Val
1079    cycles for 100 * Masm32 SDK hex2bin
6427    cycles for 100 * strtoull
858     cycles for 100 * HexVal

11619   cycles for 100 * MasmBasic Val
1070    cycles for 100 * Masm32 SDK hex2bin
6360    cycles for 100 * strtoull
844     cycles for 100 * HexVal

11600   cycles for 100 * MasmBasic Val
1109    cycles for 100 * Masm32 SDK hex2bin
6444    cycles for 100 * strtoull
857     cycles for 100 * HexVal

11669   cycles for 100 * MasmBasic Val
1090    cycles for 100 * Masm32 SDK hex2bin
6359    cycles for 100 * strtoull
858     cycles for 100 * HexVal

3       bytes for MasmBasic Val
12      bytes for Masm32 SDK hex2bin
16      bytes for strtoull
0       bytes for HexVal

12ABCDEFh       eax MasmBasic Val
12ABCDEFh       eax CRT sscanf
12ABCDEFh       eax Masm32 SDK hex2bin
12ABCDEFh       eax strtoull
12ABCDEFh       eax HexVal

--- ok ---
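
For readers who want to see what all these routines compute, here is a minimal scalar sketch in portable C. It is not HexVal, not the Masm32 SDK hex2bin, and not any of the assembly routines timed above; the name hex_to_u64 is hypothetical. It only illustrates the basic technique of shifting in one nibble per character and stopping at the first non-hex character, such as the trailing "h".

```c
#include <stdint.h>

/* Hypothetical helper: converts the leading hex digits of s.
   Stops at the first character that is not 0-9, a-f or A-F.
   More than 16 digits silently shift the oldest ones out. */
static uint64_t hex_to_u64(const char *s)
{
    uint64_t value = 0;
    for (;; s++) {
        unsigned c = (unsigned char)*s;
        unsigned digit;
        if (c - '0' <= 9u) {
            digit = c - '0';                 /* '0'..'9' */
        } else if ((c | 0x20) - 'a' <= 5u) {
            digit = (c | 0x20) - 'a' + 10;   /* 'a'..'f' or 'A'..'F' */
        } else {
            break;                           /* NUL, 'h', or anything else */
        }
        value = (value << 4) | digit;
    }
    return value;
}
```

The unsigned subtraction turns each range test into a single compare, because characters below the range wrap around to a large value. OR-ing with 0x20 folds upper case onto lower case, so no lookup table is needed.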

jj2007

Quote from: LiaoMi on February 22, 2024, 06:15:58 AM
11619   cycles for 100 * MasmBasic Val
1070    cycles for 100 * Masm32 SDK hex2bin
6360    cycles for 100 * strtoull
844     cycles for 100 * HexVal

That's a pretty fast cpu :thumbsup:

TimoVJL

Keep testing.
This test is not reliable on CPUs with variable clock speeds, and C compiler optimizations can cause trouble.
Comparing _strtoui64 from msvcrt.dll with strtoull from ucrtbase.dll is interesting:
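#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

// strtoull and _strtoui64 share this signature: cdecl, 64-bit result.
// The generic PROC type returns int, which would truncate the value.
typedef unsigned long long (__cdecl *PFNSTRTOULL)(const char *, char **, int);

int __cdecl main(void)
{
    // volatile keeps the optimizer from removing the conversion loops
    volatile unsigned long long ll;
    // long long llFrequency;
    // QueryPerformanceFrequency((LARGE_INTEGER *)&llFrequency);

    // 1. strtoull from the CRT this program is linked against
    long long llT1, llT2;
    QueryPerformanceCounter((LARGE_INTEGER *)&llT1);
    for (int i=0; i<100; i++)
        ll = strtoull("12ABCDEFh", NULL, 16);
    QueryPerformanceCounter((LARGE_INTEGER *)&llT2);
    printf("%lld\n", llT2-llT1);

    // 2. _strtoui64 loaded from msvcrt.dll
    HMODULE hDll = LoadLibraryA("msvcrt.dll");
    PFNSTRTOULL pProc = hDll ? (PFNSTRTOULL)GetProcAddress(hDll, "_strtoui64") : NULL;
    if (pProc) {
        QueryPerformanceCounter((LARGE_INTEGER *)&llT1);
        for (int i=0; i<100; i++)
            ll = pProc("12ABCDEFh", NULL, 16);
        QueryPerformanceCounter((LARGE_INTEGER *)&llT2);
        printf("%lld\n", llT2-llT1);
    }
    if (hDll) FreeLibrary(hDll);

    // 3. strtoull loaded from ucrtbase.dll (may be missing on older Windows)
    hDll = LoadLibraryA("ucrtbase.dll");
    pProc = hDll ? (PFNSTRTOULL)GetProcAddress(hDll, "strtoull") : NULL;
    if (pProc) {
        QueryPerformanceCounter((LARGE_INTEGER *)&llT1);
        for (int i=0; i<100; i++)
            ll = pProc("12ABCDEFh", NULL, 16);
        QueryPerformanceCounter((LARGE_INTEGER *)&llT2);
        printf("%lld\n", llT2-llT1);
    }
    if (hDll) FreeLibrary(hDll);

    return 0;
}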
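
One common way to soften the variable-clock problem mentioned above is to repeat the timed batch several times and keep the fastest run. The sketch below is an assumption about how that could be done, not part of the test above: the names now_ns and best_of_runs are hypothetical, and it uses the portable C11 timespec_get instead of QueryPerformanceCounter so that it compiles outside Windows too.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Wall-clock time in nanoseconds via C11 timespec_get. */
static int64_t now_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (int64_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
}

/* Hypothetical helper: times `runs` batches of 100 strtoull calls
   and returns the fastest batch in nanoseconds. The last converted
   value is stored in *result so the caller can verify it. */
static int64_t best_of_runs(int runs, unsigned long long *result)
{
    assert(runs > 0);
    /* volatile keeps the compiler from dropping the inner loop */
    volatile unsigned long long sink = 0;
    int64_t best = INT64_MAX;
    for (int r = 0; r < runs; r++) {
        int64_t t0 = now_ns();
        for (int i = 0; i < 100; i++)
            sink = strtoull("12ABCDEFh", NULL, 16);
        int64_t elapsed = now_ns() - t0;
        if (elapsed < best)
            best = elapsed;   /* keep the least disturbed run */
    }
    *result = sink;
    return best;
}
```

Taking the minimum rather than the average discards runs that were slowed by a clock ramp-up or a context switch. It does not make results comparable across machines.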
May the source be with you

hyder

Quote from: jj2007 on February 14, 2024, 11:20:44 AM
Quote from: hyder on February 14, 2024, 08:19:47 AM
for generic library code, you cannot get away with some of the algorithms posted here as they wouldn't mesh well with calling code

Hi Randy,

Can you elaborate a bit on that one? The algos posted here are normally compatible with the Windows ABI. Some of mine may not run on very old CPUs, but they run perfectly on 99% of all machines... so I don't quite understand what you mean :cool:

I am referring to my particular code calling these routines.
For example, as an assembly programmer, I always preserve all registers I modify that don't explicitly return values. This is slower than the Intel ABI, but safer for use in assembly programs.
Cheers,
Randy Hyde