Fast DwordtoHex ?

Started by guga, November 27, 2015, 11:16:24 PM


guga

Hi guys, I'm facing a small problem.

I need a fast dword-to-hex algorithm that produces short strings with exactly one zero at the start.
Example:
Input AAB12F
Output String; 0AAB12F

Input AA
Output String; 0AA

Input FFFFFFFF
Output String; 0FFFFFFFF

etc.

I'm using Biterider's optimized algo (no loops):


Proc dwtohexEx:
    Arguments @dNumber, @pBuffer
    Uses edx, ecx, eax, edi

    mov edx D@dNumber
    mov ecx edx
    shr edx 4
    and edx 0F0F0F0F
    and ecx 0F0F0F0F

    mov eax edx
    mov edi ecx

    add edx (080808080 - 0A0A0A0A) ; Build mask to discern digit > 9
    add ecx (080808080 - 0A0A0A0A)
    shr edx 4
    shr ecx 4
    not edx
    not ecx
    and edx 07070707 ; Mask digit > 9 ... mask = 0111
    and ecx 07070707
    add edx eax ; Add 'A' - '9' if digit > 9
    add ecx edi
    add edx 030303030 ; Add ascii '0'
    add ecx 030303030

    mov edi D@pBuffer ; Using edi is faster
    mov B$edi+7 cl
    mov B$edi+6 dl
    mov B$edi+5 ch
    mov B$edi+4 dh
    shr ecx 16
    shr edx 16
    mov B$edi+3 cl
    mov B$edi+2 dl
    mov B$edi+1 ch
    mov B$edi+0 dh
    mov B$edi+8 0

EndP
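
The masking arithmetic in the routine above can be hard to follow in register form. Here is a rough C sketch of the same steps, assuming the nibbles are spread one per byte across a 32-bit word. The function names `nibbles_to_ascii` and `dword_to_hex_swar` are made up for illustration.

```c
#include <stdint.h>

/* Convert four nibbles (one per byte, each 0..15) to four ASCII hex digits. */
static uint32_t nibbles_to_ascii(uint32_t x)
{
    /* Adding 0x76 (= 0x80 - 0x0A) sets bit 7 of a byte only when its nibble is >= 10. */
    uint32_t t = x + 0x76767676u;
    /* Shift, invert and mask: each byte becomes 7 if its nibble was >= 10, else 0. */
    t = ~(t >> 4) & 0x07070707u;
    /* '0' + nibble, plus 7 to jump from ':' up to 'A' for the digits A..F. */
    return x + t + 0x30303030u;
}

/* Fixed-width conversion: always writes 8 digits plus a terminating NUL. */
void dword_to_hex_swar(uint32_t value, char out[9])
{
    uint32_t hi = nibbles_to_ascii((value >> 4) & 0x0F0F0F0Fu); /* high nibble of each byte */
    uint32_t lo = nibbles_to_ascii(value & 0x0F0F0F0Fu);        /* low nibble of each byte */
    for (int i = 0; i < 4; ++i) {
        /* Byte i of value, counted from the most significant end, yields digits 2i and 2i+1. */
        int shift = 24 - 8 * i;
        out[2 * i]     = (char)((hi >> shift) & 0xFF);
        out[2 * i + 1] = (char)((lo >> shift) & 0xFF);
    }
    out[8] = '\0';
}
```

Like the assembler version, this sketch always produces eight digits, so leading zeros are kept.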


But the result is fixed at an 8-byte string (e.g. input 012A, output 0000012A) instead of the short variant I need. Does anyone know a fast algo that can do this conversion? (As in the examples I posted: input 12A, output 012A.)
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

jj2007

Just use a buffer for the biggest possible string length plus 1.
- call the algo
- check the bytes at the start until you find a non-"0"
- put "0" before that byte, and use this as the start address.
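
For anyone who wants to see those three steps outside assembler, here is a minimal C sketch. The name `dword_to_short_hex` and the 10-byte buffer (one spare slot, eight digits, one NUL) are assumptions for illustration, and a plain nibble loop stands in for whichever fast fixed-width algo is actually called.

```c
#include <stdint.h>

/* buf must hold at least 10 bytes: 1 spare slot + 8 digits + NUL.
   Returns a pointer into buf where the short string starts. */
char *dword_to_short_hex(uint32_t value, char buf[10])
{
    static const char digits[] = "0123456789ABCDEF";
    char *p = buf + 1;                 /* leave buf[0] free for the extra "0" */

    /* Step 1: fixed-width conversion (stand-in for the fast algo). */
    for (int i = 7; i >= 0; --i) {
        p[i] = digits[value & 0xF];
        value >>= 4;
    }
    p[8] = '\0';

    /* Step 2: skip leading "0" characters, but keep at least one digit. */
    while (*p == '0' && p[1] != '\0')
        ++p;

    /* Step 3: put a "0" in front and use that as the start address. */
    *--p = '0';
    return p;
}
```

With this sketch, 0AA comes out for input AA and 0FFFFFFFF for input FFFFFFFF. An input of zero gives 00, because the scan keeps the last digit.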

fearless

Haven't tested it, and it could probably be optimized, but based on JJ's suggestion this is something I'd write to format the hex string. Hopefully that helps.

FormatHexString PROTO :DWORD, :DWORD

;-------------------------------------------------------------------------------
; Formats hex string so that it starts with one zero
; Input: lpszHexString
; Output: lpszFormattedString
;
; make sure buffer pointed to at lpszFormattedString is large enough
;
;
; Example: Invoke FormatHexString, Addr szHEX, Addr szMyNewHexString
;
;-------------------------------------------------------------------------------
FormatHexString PROC USES EDI ESI lpszHexString:DWORD, lpszFormattedString:DWORD
    LOCAL Position:DWORD
    LOCAL FlagFoundHexChars:DWORD
    LOCAL nMaxLen:DWORD ; length of the input string, set from szLen below
   
    Invoke szLen, lpszHexString
    mov nMaxLen, eax
   
    mov edi, lpszFormattedString
    mov esi, lpszHexString
   
    mov byte ptr [edi], '0' ; start our formatted string with a ascii zero character
    inc edi

    mov FlagFoundHexChars, FALSE ; set flag to false initially
    mov Position, 0
    mov eax, 0
    .WHILE eax < nMaxLen
       
        .IF FlagFoundHexChars == FALSE
            movzx eax, byte ptr [esi]
            .IF al != '0' ; ascii zero
                mov FlagFoundHexChars, TRUE ; looks like we found some ascii chars
                mov byte ptr [edi], al ; so start storing the first of them into our formatted string (edi)
                inc edi ; position for next char when we loop next and we branch to next bit below till end of string
            .ENDIF
        .ELSE ; we have a flag set, so we fetch rest of characters in string till we hit end of string or a null char
            movzx eax, byte ptr [esi]
            .IF al != 0 ; null
                mov byte ptr [edi], al ; start storing next byte in formatted string (edi)
                inc edi ; position for next char when we loop next
            .ELSE
                .BREAK ; break if null found
            .ENDIF
        .ENDIF

        inc esi
        inc Position
        mov eax, Position
    .ENDW
   
    mov byte ptr [edi], 0 ; final null of formatted string

    ret
   
FormatHexString ENDP

dedndave

adding the 0 is simple enough
for converting to hex, a look-up table is likely the fastest
i would think a 512-byte table would work well

if speed isn't that important, but you want UNICODE aware....

awDw2Hex PROC USES EDI dwVal:DWORD,lpBuf:LPSTR

;UNICODE aware Dword to Hex - DednDave, 3-2013

;  Returns: EAX = pointer to string buffer
;           ECX = length in characters (8)
;           EDX = original binary value

;the buffer must be large enough for at least 9 TCHAR's (includes null terminator)

;--------------------------------------------

    mov     ecx,8
    mov     edx,dwVal
    mov     edi,lpBuf
    push    ecx
    push    edx
    push    edi
    IFDEF __UNICODE__
        add     edi,16
        mov word ptr [edi],0
    ELSE
        add     edi,ecx
        mov byte ptr [edi],0
    ENDIF
    .repeat
        mov     eax,edx
        IFDEF __UNICODE__
            sub     edi,2
        ELSE
            dec     edi
        ENDIF
        and     eax,0Fh
        shr     edx,4
        cmp     al,0Ah
        sbb     al,69h
        das
        dec     ecx
        IFDEF __UNICODE__
            mov     [edi],ax
        ELSE
            mov     [edi],al
        ENDIF
    .until ZERO?
    pop     eax
    pop     edx
    pop     ecx
    ret

awDw2Hex ENDP


just modify that to add a 0, as required

guga

Thanks, guys...

I'll give it a try. The most important thing for me is speed (that's why I used Biterider's algo), but I need to create a "short" output, not the whole 8-byte string.

Perhaps using a bswap at the beginning and doing what JJ suggested?

I'll give it a try and test all of the algos to check for speed.

jj2007

Check
\Masm32\m32lib\dw2hex.asm
\Masm32\m32lib\dw2h_ex.asm

P.S.: I have hacked together an algo using a table, here are some results.
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)

4144    cycles for 100 * dw2hex
7838    cycles for 100 * MB Hex$
51698   cycles for 100 * CRT sprintf
1066    cycles for 100 * Bin2Hex
765     cycles for 100 * Bin2Hex2

4143    cycles for 100 * dw2hex
7854    cycles for 100 * MB Hex$
51713   cycles for 100 * CRT sprintf
1066    cycles for 100 * Bin2Hex
765     cycles for 100 * Bin2Hex2

4141    cycles for 100 * dw2hex
7848    cycles for 100 * MB Hex$
51750   cycles for 100 * CRT sprintf
1066    cycles for 100 * Bin2Hex
765     cycles for 100 * Bin2Hex2

4145    cycles for 100 * dw2hex
7884    cycles for 100 * MB Hex$
51708   cycles for 100 * CRT sprintf
1065    cycles for 100 * Bin2Hex
764     cycles for 100 * Bin2Hex2

20      bytes for dw2hex
17      bytes for MB Hex$
29      bytes for CRT sprintf
225     bytes for Bin2Hex
150     bytes for Bin2Hex2

00345678        = eax dw2hex
00345678        = eax MB Hex$
345678  = eax CRT sprintf
345678  = eax Bin2Hex
00345678        = eax Bin2Hex2


As you can see, both CRT sprintf and the first variant of my algo can handle the short form.

guga

Hi JJ

I analysed it and I'm trying to gain a bit more speed over dw2hex.asm and dw2h_ex.asm.

In my tests I made a faster variation of dw2hex using a fixed table (instead of computing it as in your example).

Can you please test it to check whether it is really that fast? In my tests it takes half the time of dw2hex and is 10-18% faster than your Bin2Hex2.

The variation I made was:

[hex_table: B$ "000102030405060708090A0B0C0D0E0F"
            B$ "101112131415161718191A1B1C1D1E1F"
            B$ "202122232425262728292A2B2C2D2E2F"
            B$ "303132333435363738393A3B3C3D3E3F"
            B$ "404142434445464748494A4B4C4D4E4F"
            B$ "505152535455565758595A5B5C5D5E5F"
            B$ "606162636465666768696A6B6C6D6E6F"
            B$ "707172737475767778797A7B7C7D7E7F"
            B$ "808182838485868788898A8B8C8D8E8F"
            B$ "909192939495969798999A9B9C9D9E9F"
            B$ "A0A1A2A3A4A5A6A7A8A9AAABACADAEAF"
            B$ "B0B1B2B3B4B5B6B7B8B9BABBBCBDBEBF"
            B$ "C0C1C2C3C4C5C6C7C8C9CACBCCCDCECF"
            B$ "D0D1D2D3D4D5D6D7D8D9DADBDCDDDEDF"
            B$ "E0E1E2E3E4E5E6E7E8E9EAEBECEDEEEF"
            B$ "F0F1F2F3F4F5F6F7F8F9FAFBFCFDFEFF", 0]

Proc Bin2Hex6:
    Arguments @Input, @Output
    Local @DwordStorage
    Uses eax, edi ; preserves eax and edi on output

    mov eax D@Input
    mov edi D@Output

    mov D@DwordStorage eax
    movzx eax B@DwordStorage+3 | mov ax W$hex_table+eax*2 | stosw ; | mov W$edi ax. Using stosw is faster than mov W$ on an i7
    movzx eax B@DwordStorage+2 | mov ax W$hex_table+eax*2 | stosw ; | mov W$edi+2 ax. Using stosw is faster than mov W$ on an i7
    movzx eax B@DwordStorage+1 | mov ax W$hex_table+eax*2 | stosw ; | mov W$edi+4 ax. Using stosw is faster than mov W$ on an i7
    movzx eax B@DwordStorage+0 | mov ax W$hex_table+eax*2 | stosw ; | mov W$edi+6 ax. Using stosw is faster than mov W$ on an i7
    mov B$edi 0

EndP
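
The byte-pair lookup used above translates to C roughly as follows. This is only a sketch under a few assumptions: the names `hex_pairs`, `init_hex_pairs` and `dword_to_hex8` are hypothetical, and the table is filled at run time here instead of being written out as a 512-character constant.

```c
#include <stdint.h>

/* "00" "01" ... "FF": two characters per possible byte value, 512 bytes in total. */
static char hex_pairs[512];

/* Fill the table once before the first conversion. */
void init_hex_pairs(void)
{
    static const char digits[] = "0123456789ABCDEF";
    for (int b = 0; b < 256; ++b) {
        hex_pairs[2 * b]     = digits[b >> 4];
        hex_pairs[2 * b + 1] = digits[b & 0xF];
    }
}

/* Fixed-width conversion: out must hold 9 bytes (8 digits + NUL). */
void dword_to_hex8(uint32_t value, char out[9])
{
    for (int i = 0; i < 4; ++i) {
        /* Take the bytes from most significant to least significant. */
        unsigned b = (value >> (24 - 8 * i)) & 0xFF;
        out[2 * i]     = hex_pairs[2 * b];
        out[2 * i + 1] = hex_pairs[2 * b + 1];
    }
    out[8] = '\0';
}
```

Each input byte costs one table index and one two-character copy, so there is no per-nibble arithmetic. The output is still the full eight digits.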


The above version does not produce the shorter string. I'm trying to speed it up first, because if I use the "repeat+until" macros the code will slow down due to the loop.

I'll try replacing it with a test opcode to see if that speeds it up a bit.

jj2007

Hi Guga,

Of course, my algo uses the same identical table. If you have an algo that uses this table, please post it (in Masm syntax), and I will add it to the testbed.

Btw if Repeat ... Until loops slow down your code, check what RosAsm generates. In Masm, the macro produces the fastest possible version.

dedndave

i haven't looked at the current algorithms

but - if you can write the routine so the address is returned in EAX,
rather than left-justifying the string in a fixed buffer, it should help speed it up
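
One way to read that suggestion, sketched in C: write the digits backwards from the end of the buffer, stop as soon as the value is used up, and hand back a pointer to the first character instead of moving the string to the front. The name `hex_right_justified` and the 10-byte buffer size are assumptions for illustration.

```c
#include <stdint.h>

/* buf must hold at least 10 bytes. The returned pointer lies inside buf;
   nothing is copied to the front of the buffer. */
char *hex_right_justified(uint32_t value, char buf[10])
{
    static const char digits[] = "0123456789ABCDEF";
    char *p = buf + 9;
    *p = '\0';

    /* Emit digits from least significant to most significant. */
    do {
        *--p = digits[value & 0xF];
        value >>= 4;
    } while (value != 0);

    *--p = '0';                        /* the single leading zero */
    return p;
}
```

The caller uses the returned pointer as the string start, so short values need no scan over leading zeros and no copy.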

dedndave

do you want a leading 0 when the first character is less than A ?

guga

Hi Dave, yes. I need a leading 0 on all values, whether they start with 0 or F. Ex: 00, 01234A, 023, 0FFFF, etc.

guga

JJ, I'll try porting it to Masm syntax so you can test it.

guga

Hi JJ, here is the masm syntax



hex_table       db '000102030405060708090A0B0C0D0E0F101112131415161718191A1B1C1D1E1F2'
db '02122232425262728292A2B2C2D2E2F303132333435363738393A3B3C3D3E3F40'
db '4142434445464748494A4B4C4D4E4F505152535455565758595A5B5C5D5E5F606'
db '162636465666768696A6B6C6D6E6F707172737475767778797A7B7C7D7E7F8081'
db '82838485868788898A8B8C8D8E8F909192939495969798999A9B9C9D9E9FA0A1A'
db '2A3A4A5A6A7A8A9AAABACADAEAFB0B1B2B3B4B5B6B7B8B9BABBBCBDBEBFC0C1C2'
db 'C3C4C5C6C7C8C9CACBCCCDCECFD0D1D2D3D4D5D6D7D8D9DADBDCDDDEDFE0E1E2E'
db '3E4E5E6E7E8E9EAEBECEDEEEFF0F1F2F3F4F5F6F7F8F9FAFBFCFDFEFF',0

; =============== S U B R O U T I N E =======================================

; Attributes: bp-based frame

Bin2Hex6 proc near

DwordStorage = dword ptr -4
Input = dword ptr  8
Output = dword ptr  0Ch

push ebp
mov ebp, esp
sub esp, 4
push eax
push edi
mov eax, [ebp+Input]
mov edi, [ebp+Output]
mov [ebp+DwordStorage], eax
movzx eax, byte ptr [ebp+DwordStorage+3]
mov ax, word ptr hex_table[eax*2]
stosw
movzx eax, byte ptr [ebp+DwordStorage+2]
mov ax, word ptr hex_table[eax*2]
stosw
movzx eax, byte ptr [ebp+DwordStorage+1]
mov ax, word ptr hex_table[eax*2]
stosw
movzx eax, byte ptr [ebp+DwordStorage]
mov ax, word ptr hex_table[eax*2]
stosw
mov byte ptr [edi], 0
pop edi
pop eax
mov esp, ebp
pop ebp
retn 8
Bin2Hex6 endp



jj2007

Congrats, Guga, it's much faster than the standard Masm32 algo:
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)

4120    cycles for 100 * dw2hex
7851    cycles for 100 * MB Hex$
52652   cycles for 100 * CRT sprintf
1071    cycles for 100 * Bin2Hex
772     cycles for 100 * Bin2Hex2 cx
1656    cycles for 100 * Bin2Hex6

4115    cycles for 100 * dw2hex
7832    cycles for 100 * MB Hex$
52043   cycles for 100 * CRT sprintf
1072    cycles for 100 * Bin2Hex
771     cycles for 100 * Bin2Hex2 cx
1660    cycles for 100 * Bin2Hex6

4115    cycles for 100 * dw2hex
7820    cycles for 100 * MB Hex$
52089   cycles for 100 * CRT sprintf
1069    cycles for 100 * Bin2Hex
771     cycles for 100 * Bin2Hex2 cx
1658    cycles for 100 * Bin2Hex6

4115    cycles for 100 * dw2hex
7819    cycles for 100 * MB Hex$
52067   cycles for 100 * CRT sprintf
1070    cycles for 100 * Bin2Hex
773     cycles for 100 * Bin2Hex2 cx
1662    cycles for 100 * Bin2Hex6

20      bytes for dw2hex
17      bytes for MB Hex$
29      bytes for CRT sprintf
225     bytes for Bin2Hex
150     bytes for Bin2Hex2 cx
616     bytes for Bin2Hex6

00345678        = eax dw2hex
00345678        = eax MB Hex$
345678  = eax CRT sprintf
345678  = eax Bin2Hex
00345678        = eax Bin2Hex2 cx
12345678        = eax Bin2Hex6

guga

Thanks... can you test it with the other algos preserving registers too? I would like to compare the true speed.
For example, my version saves the registers it uses (eax and edi), while the other versions do not save anything. I would like to test the functions with the same functionality to compare their speeds.

For example, when I use your version of Bin2Hex3, mine is still fast. I don't understand the different speeds.

For the benchmark tests I'm using the GUI version that Steve made, the one that uses the GetTickCount and SleepEx APIs as part of the calibration algo.