AsciiHextoDword (SSE2 version)

Started by guga, March 08, 2025, 11:29:50 AM

ognil

Quote:
You are Lingo's lost twin brother :badgrin:

Thank you Guga,
Nice... :badgrin:  :badgrin: 
Yes, I'm Ognil Da Vito :badgrin:
"Not keeping emotions under control is another type of mental distortion."

zedd151

Okay guga, where did we leave off before this side-show started?
Oh yeah, you posted a new attachment. I will download it once I am back at my computer. I'm on my iPad on the back porch right now....
¯\_(ツ)_/¯   :azn:

'As we don't do "requests", show us your code first.'  -  hutch—

zedd151

Okay, I'm back at my computer....
From your latest attachment guga, "BenchMarkTest3a2" ...
Quote:
17 cycles -> Ascii Hex to Dw by Guga (new version. variable lenght),  Input: 0F2A45B7 . Return in EAX: F2A45B7
17 cycles -> Ascii Hex to Dw by Guga (new version. variable lenght),  Input: A45B7 . Return in EAX: A45B7
15 cycles -> Ascii Hex to Dw by Guga (Old version - fixed Lenght),  Input: 0F2A45B7 . Return in EAX: F2A45B7

52 cycles -> Ascii Hex to Qword by Guga (Variable Lenght),  Input: 543210F2A45B7 . Return in EAX: 13 (Bytes)
Output:
D$ 0x54321
D$ 0xF2A45B7

52 cycles -> Ascii Hex to Qword by Guga (Variable Lenght),  Input: 18F2A45B7 . Return in EAX: 9 (Bytes)
Output:
D$ 0x1
D$ 0x8F2A45B7

52 cycles -> Ascii Hex to Qword by Guga (Variable Lenght),  Input: 76543210F2A45B7 . Return in EAX: 15 (Bytes)
Output:
D$ 0x7654321
D$ 0xF2A45B7

52 cycles -> Ascii Hex to Qword by Guga (Variable Lenght),  Input: 76543210F2A45B7 . Return in EAX: 15 (Bytes)
Output:
D$ 0x7654321
D$ 0xF2A45B7

25 cycles -> Ascii Hex to Qword by Guga (Variable Lenght),  Input: 876543210F2A45B7 . Return in EAX: 16 (Bytes)
Output:
D$ 0x87654321
D$ 0xF2A45B7
:thumbsup:
Did you make any changes to the algorithm code here?

NoCforMe

OK, let me set a minor cat amongst the pigeons here:

I've cooked up a really simple ASCII hex--> binary conversion routine.
So simple it borders on the dumbass: just translates each ASCII char. to a binary nybble in a loop and stuffs it into an accumulator:

;====================================
; Partial ASCII table:
; This contains translation elements up to
; ASCII 'f'. Non-hex values have zeroes;
; Valid hex chars. have their corresponding values.
;====================================

HexXlatTable    LABEL BYTE
; Chars. below '0':
    DB    48 DUP (0)
; '0'-'9':
    DB    0, 1, 2, 3, 4, 5, 6, 7, 8, 9
; Chars. up to 'A':
    DB    7 DUP (0)
; 'A' - 'F':
    DB    10, 11, 12, 13, 14, 15
; Chars. up to 'a':
    DB    26 DUP (0)
; 'a' - 'f':
    DB    10, 11, 12, 13, 14, 15


;====================================
; AscHex2Bin()
;
; Converts a string of ASCII hex characters
; @ EAX to a numeric value in EAX
; (string must be NULL-terminated)
;
; Hex string can contain:
;  o 0-9
;  o A-F
;  o a-f
;
; No error checking is done on ASCII text.
;====================================

AscHex2Bin    PROC

    PUSH    EBX
    MOV     EBX, OFFSET HexXlatTable
    MOV     ECX, EAX        ;ECX--> ASCII chars.

    XOR     EDX, EDX        ;EDX: accumulator.

next:
    XOR     EAX, EAX        ;Clear entire register.
    MOV     AL, [ECX]       ;Get next ASCII char.
    INC     ECX
    TEST    EAX, EAX        ;Check for end of string.
    JZ      done
    XLATB                   ;Get its hex value.
    SHL     EDX, 4          ;Shift existing accumulator contents.
    OR      EDX, EAX        ;Lay nybble into the accumulator.
    JMP     next

done:
    MOV     EAX, EDX        ;Put into return reg.
    POP     EBX
    RET

AscHex2Bin    ENDP

So I'm curious how much slower this might be than all that fancy-schmancy SSE/XMM or whatever code y'all are using here.

Anyone care to put this into a testbed and give it a spin? Zedd? Shouldn't be hard to do.

Routine takes the hex chars. pointed to by EAX, returns value in EAX.
Does a maximum of 8 hex chars. (largest value in 32-bit reg.).
Could easily be converted to 64-bit, giving a max. of 16 hex chars (1 register) or 32 chars. (2-register pair). 32-bit code could be expanded to max. 16. hex chars in a 2-register pair.

If anyone wants the testbed (console program) I can attach that here.
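
For anyone who wants to check results outside the assembler, here is a minimal C sketch of the same table-driven loop with a 64-bit accumulator, as suggested above. It is a reference model only, not a port of the MASM routine. The names hex_xlat and asc_hex_to_u64 are made up for illustration, and the table is padded to 256 entries (an assumption on my part, since the table above stops at 'f').

```c
#include <stdint.h>

/* Model of HexXlatTable, padded to 256 entries (assumption: the table
   above ends at 'f', so larger byte values would index past its end).
   Entries not listed are zero, as in the original. */
static const uint8_t hex_xlat[256] = {
    ['0'] = 0,  ['1'] = 1,  ['2'] = 2,  ['3'] = 3,  ['4'] = 4,
    ['5'] = 5,  ['6'] = 6,  ['7'] = 7,  ['8'] = 8,  ['9'] = 9,
    ['A'] = 10, ['B'] = 11, ['C'] = 12, ['D'] = 13, ['E'] = 14, ['F'] = 15,
    ['a'] = 10, ['b'] = 11, ['c'] = 12, ['d'] = 13, ['e'] = 14, ['f'] = 15,
};

/* Hypothetical 64-bit counterpart of AscHex2Bin: same loop, wider
   accumulator. Handles up to 16 hex chars in a NUL-terminated string.
   Like the original, it does no error checking on the text. */
static uint64_t asc_hex_to_u64(const char *s)
{
    uint64_t acc = 0;
    while (*s != '\0') {
        /* Shift the accumulator one nybble, then lay in the new one. */
        acc = (acc << 4) | hex_xlat[(unsigned char)*s];
        s++;
    }
    return acc;
}
```

Calling asc_hex_to_u64("1234cDeF") should give 1234CDEFh, the same value the 32-bit routine returns for that string.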
Assembly language programming should be fun. That's why I do it.

zedd151

Quote from: NoCforMe on March 11, 2025, 02:55:50 PM
So I'm curious how much slower this might be than all that fancy-schmancy SSE/XMM or whatever code y'all are using here.

This should give you some indication of the possible speed differences. One of guga's algorithms was tested against one of jj's, and jj usually has pretty fast code:
AMD Athlon Gold 3150U with Radeon Graphics      (SSE4)

18532  cycles for 100 * Val()
708    cycles for 100 * AsciiHex2dwNew

18615  cycles for 100 * Val()
711    cycles for 100 * AsciiHex2dwNew

18780  cycles for 100 * Val()
743    cycles for 100 * AsciiHex2dwNew

18562  cycles for 100 * Val()
741    cycles for 100 * AsciiHex2dwNew

18535  cycles for 100 * Val()
725    cycles for 100 * AsciiHex2dwNew

Averages:
18571  cycles for Val()
726    cycles for AsciiHex2dwNew

13      bytes for Val()
202    bytes for AsciiHex2dwNew

1234ABCDh      eax Val()
1234ABCDh      eax AsciiHex2dwNew

I would write my version very similar to yours, NoCforMe. What guga is doing is totally different. It's like comparing apples to oranges.

NoCforMe

Quote from: zedd151 on March 11, 2025, 03:08:36 PM
What guga is doing is totally different. It's like comparing apples to oranges.

Yeah, yeah, I know all that: he's using them fancy bit-shuffling instructions instead of the regular x86 stuff.

Just curious how much slower my "old-school" code is.

zedd151

Quote from: NoCforMe on March 11, 2025, 03:12:45 PM
Quote from: zedd151 on March 11, 2025, 03:08:36 PM
What guga is doing is totally different. It's like comparing apples to oranges.

Yeah, yeah, I know all that: he's using them fancy bit-shuffling instructions instead of the regular x86 stuff.

Just curious how much slower my "old-school" code is.

Maybe guga can set them both up in a testbed. He knows best how his own function works.

guga

Quote from: zedd151 on March 11, 2025, 02:31:13 PM
Did you make any changes to the algorithm code here?

Hi Zedd

No, I didn't make any changes to this version. It is the same one I used for JJ's test and in the other attachment I uploaded, which didn't show the extra '0' before each Dword.
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

zedd151

Quote from: guga on March 11, 2025, 03:16:44 PM
No, I didn't make any changes to this version. It is the same one I used for JJ's test and in the other attachment I uploaded, which didn't show the extra '0' before each Dword.

Just cosmetic changes to the display, got it.

guga

Quote from: NoCforMe on March 11, 2025, 02:55:50 PM
If anyone wants the testbed (console program) I can attach that here.

Yes, please. It would be nice to compare the results.

guga

Quote from: NoCforMe on March 11, 2025, 03:12:45 PM
Quote from: zedd151 on March 11, 2025, 03:08:36 PM
What guga is doing is totally different. It's like comparing apples to oranges.

Yeah, yeah, I know all that: he's using them fancy bit-shuffling instructions instead of the regular x86 stuff.

Just curious how much slower my "old-school" code is.

Btw, this was the old code used in RosAsm

Equates
[LowSigns            31
    TextSign            30

  NoSpaceAfterThis    29
    numSign             28   ; #  01C
    IfNumSign           27   ; Substitute of # for the Conditional macros #If, ... 01B

    OpenParaMacro       26   ; { for ParaMacros  01A
  NoSpaceBeforeThis   25
    CloseParaMacro      24   ; } for ParaMacros

    CommaSign           23   ; ,

    OpenVirtual         22   ; [   016 (Macros expanded '[' -{-)
    CloseVirtual        21   ; ]   015 (Macros expanded ']' -}-) 019
    OpenBracket         20   ; [   014
    CloseBracket        19   ; ]   013
; 18, 17 >>> NewOpenBracket / NewCloseBracket
  PartEnds            16
    memMarker           15   ; $ or $  example: MOV B$MYVALUE 1
    colonSign           14   ; :
    openSign            13   ; (
    closeSign           12   ; )

  OperatorSigns       11
    addSign             10   ; +
    subSign              9   ; -
    mulSign              8   ; *
    divSign              7   ; /
    expSign              6   ; ^
; 5
  Separators          4
   ; Statement           0FF
    Space               3    ; space
    EOI                 2    ; |  End Of Instruction (separator)
    meEOI               1]   ; |  End Of Instruction in macro expansion
                             ; 0 is used as erase sign inside treatments


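TranslateHexa:
    lodsb                                               ; clear first '0'
NackedHexa:
    mov ebx 0,  edx 0, ecx 0                            ; ebx = low dword, edx = high bits, ecx = previous edx
L0: lodsb | cmp al LowSigns | jbe L9>                   ; next char; any code <= LowSigns (31) ends the number
        sub al '0' | cmp al 9 | jbe L2>                 ; '0'-'9' -> 0-9
            sub al 7                                    ; 'A'-'F' -> 10-15 (upper case only)
L2: shld edx ebx 4 | shl ebx 4 | or bl al               ; shift edx:ebx left one nibble, lay in the new digit
    cmp edx ecx | jb L8>                                ; error if edx dropped below its previous value
        mov ecx edx
            cmp al 0F | jbe L0<                         ; valid nibble: loop; anything else falls through to the error
L8: mov ecx D$HexTypePtr | jmp BadNumberFormat ; <--- for errors routine
L9: mov eax ebx
ret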

This routine was written in the late '90s, so it was long overdue for an upgrade.
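
For readers who don't know the RosAsm syntax, here is a rough C model of the loop from NackedHexa down to L9. It is only a sketch for following the logic, not an exact port: the function name translate_hexa_model is made up, the jump to BadNumberFormat is modelled as a false return value, and the edx:ebx pair is returned as one 64-bit value (the routine itself only copies ebx to eax).

```c
#include <stdbool.h>
#include <stdint.h>

/* Rough model of the NackedHexa loop (assumptions: error exit becomes
   'return false', and the edx:ebx pair is reported as a single uint64_t). */
static bool translate_hexa_model(const char *s, uint64_t *out)
{
    uint32_t lo = 0, hi = 0, prev_hi = 0;          /* ebx, edx, ecx */

    for (;;) {
        uint8_t c = (uint8_t)*s++;                 /* lodsb */
        if (c <= 31) {                             /* cmp al LowSigns | jbe L9> */
            *out = ((uint64_t)hi << 32) | lo;
            return true;
        }
        uint8_t d = (uint8_t)(c - '0');            /* sub al '0' */
        if (d > 9) {
            d = (uint8_t)(d - 7);                  /* sub al 7: 'A'-'F' -> 10-15 */
        }
        hi = (hi << 4) | (lo >> 28);               /* shld edx ebx 4 */
        lo = (lo << 4) | d;                        /* shl ebx 4 | or bl al */
        if (hi < prev_hi) {
            return false;                          /* cmp edx ecx | jb L8> */
        }
        prev_hi = hi;                              /* mov ecx edx */
        if (d > 0x0F) {
            return false;                          /* cmp al 0F fails: bad digit */
        }
    }
}
```

Unlike the table lookup, this version computes each nibble arithmetically and rejects anything that is not 0-9 or A-F, including lower-case a-f.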

NoCforMe

Easy peasy:
HexChars2Test DB "1234cDeF", 0

MOV EAX, OFFSET HexChars2Test
CALL AscHex2Bin

; result now in EAX

NoCforMe

BTW, what a weird assembler, RosAsm, that is.

guga

Hi NoCforMe, here is the test on yours.

The equivalent for RosAsm is:

Naked version (ported as is):

[HexXlatTable:
 HexXlatTable.Chars:    B$ 0 #48, ; Chars. below '0':
 HexXlatTable.Numbers:  B$ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ; '0'-'9':
 HexXlatTable.Chars2:   B$ 0 #7, ; Chars. up to 'A':
 HexXlatTable.AtoF:     B$ 10, 11, 12, 13, 14, 15, ; 'A' - 'F':
 HexXlatTable.SmallCaps:  B$ 0 #26,; Chars. up to 'a':
 HexXlatTable.atof2:     B$ 10, 11, 12, 13, 14, 15] ; 'a'-'f' (ASCII 97-102): 10 to 15

Proc AscHex2Bin:
    Uses ebx

    mov ebx HexXlatTable   ; EBX points to translation table
    mov ecx eax            ; ECX = pointer to ASCII string
    xor edx edx            ; EDX = accumulator, zeroed

@NextChar:
    xor eax eax
    mov al B$ecx
    inc ecx
    test eax eax | jz @Done
    xlatb
    shl edx 4
    or edx eax
    jmp @NextChar

@Done:
    mov eax edx

EndP


Modified for test comparisons (registers preserved and using a parameter as input):


Proc AscHex2BinRegPreserved:
    Arguments @pString
    Uses ebx, ecx, edx

    mov ebx HexXlatTable   ; EBX points to translation table
    mov ecx D@pString      ; ECX = pointer to ASCII string
    xor edx edx            ; EDX = accumulator, zeroed

@NextChar:
    xor eax eax
    mov al B$ecx
    inc ecx
    test eax eax | jz @Done
    xlatb
    shl edx 4
    or edx eax
    jmp @NextChar

@Done:
    mov eax edx

EndP


Quote:
BTW, what a weird assembler, RosAsm, that is.

Well... the syntax is based on NASM and it is very old. I never had enough time to fix all the necessary issues in it, nor to change the syntax to be a bit more MASM-friendly (although you can emulate some of MASM's syntax with the preparser token).


zedd151

Not bad, NoCforMe. Only slightly slower than guga's algorithm, for 8 char input.

24 cycles -> Ascii Hex to Qword by Guga (Variable Lenght),  Input: 87654321 . Return in EAX: 8 (Bytes)
Output:
D$ 0x87654321
D$ 0x0

30 cycles -> Ascii Hex to Dword by NoCForMe,  Input: 87654321 . Return in EAX: 87654321

31 cycles -> Ascii Hex to Dword by NoCForMe - Registers preserved,  Input: 87654321 . Return in EAX: 87654321

Guga...
Maybe you could optimize that one, and then there would be no need for SSE code? I thought that there would be a much bigger difference between NoCforMe's old-school bytewise algo and yours using SSE...