Fast DwordtoHex ?

Started by guga, November 27, 2015, 11:16:24 PM

guga

Btw, dave and guys, I succeeded in making the version with shorter output. The speed was kept intact in my tests. It seems fast.


[hex_table: B$ "000102030405060708090A0B0C0D0E0F"
            B$ "101112131415161718191A1B1C1D1E1F"
            B$ "202122232425262728292A2B2C2D2E2F"
            B$ "303132333435363738393A3B3C3D3E3F"
            B$ "404142434445464748494A4B4C4D4E4F"
            B$ "505152535455565758595A5B5C5D5E5F"
            B$ "606162636465666768696A6B6C6D6E6F"
            B$ "707172737475767778797A7B7C7D7E7F"
            B$ "808182838485868788898A8B8C8D8E8F"
            B$ "909192939495969798999A9B9C9D9E9F"
            B$ "A0A1A2A3A4A5A6A7A8A9AAABACADAEAF"
            B$ "B0B1B2B3B4B5B6B7B8B9BABBBCBDBEBF"
            B$ "C0C1C2C3C4C5C6C7C8C9CACBCCCDCECF"
            B$ "D0D1D2D3D4D5D6D7D8D9DADBDCDDDEDF"
            B$ "E0E1E2E3E4E5E6E7E8E9EAEBECEDEEEF"
            B$ "F0F1F2F3F4F5F6F7F8F9FAFBFCFDFEFF", 0]

Proc Bin2Hex7:
    Arguments @Input, @Output
    Local @DwordStorage
    Uses eax, edi, ecx

    mov eax D@Input
    mov edi D@Output
    mov D@DwordStorage eax
    mov B$edi '0' | inc edi

    movzx eax B@DwordStorage+3
    Test_If eax eax
        On ax <= 0F, dec edi
        mov ax W$hex_table+eax*2 | stosw
        movzx eax B@DwordStorage+2 | mov ax W$hex_table+eax*2 | stosw
        movzx eax B@DwordStorage+1 | mov ax W$hex_table+eax*2 | stosw
        movzx eax B@DwordStorage+0 | mov ax W$hex_table+eax*2 | stosw
        mov B$edi 0
        ExitP
    Test_End

    movzx eax B@DwordStorage+2
    Test_If eax eax
        On ax <= 0F, dec edi
        mov ax W$hex_table+eax*2 | stosw
        movzx eax B@DwordStorage+1 | mov ax W$hex_table+eax*2 | stosw
        movzx eax B@DwordStorage+0 | mov ax W$hex_table+eax*2 | stosw
        mov B$edi 0
        ExitP
    Test_End


    movzx eax B@DwordStorage+1
    On ax <= 0F, dec edi
    Test_If eax eax
        mov ax W$hex_table+eax*2 | stosw
    Test_End

    movzx eax B@DwordStorage+0 | mov ax W$hex_table+eax*2 | stosw
    mov B$edi 0

EndP
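
For readers who do not use RosAsm, the lookup idea behind hex_table can be modelled in C. This is a minimal sketch, not a port of the procedure above: it builds the same 512-byte pair table at run time and always emits 8 digits. The names hex_pairs, init_hex_pairs and dword_to_hex8 are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* 256 two-character entries: "00", "01", ... "FF" (no terminator). */
static char hex_pairs[256 * 2];

/* Fill the pair table once; must be called before dword_to_hex8. */
void init_hex_pairs(void)
{
    static const char digits[] = "0123456789ABCDEF";
    for (unsigned i = 0; i < 256; i++) {
        hex_pairs[2 * i]     = digits[i >> 4];
        hex_pairs[2 * i + 1] = digits[i & 0xF];
    }
}

/* Convert a dword to 8 hex digits plus a terminator; out needs 9 bytes.
   Each byte of the value selects one 2-character entry in the table. */
void dword_to_hex8(uint32_t value, char *out)
{
    for (int i = 0; i < 4; i++) {
        unsigned byte = (value >> (24 - 8 * i)) & 0xFF;  /* most significant byte first */
        memcpy(out + 2 * i, hex_pairs + 2 * byte, 2);
    }
    out[8] = '\0';
}
```

One table lookup per byte replaces two per-nibble conversions, which is presumably where the speed of the table-based versions comes from.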
Coding in Assembly requires a mix of:
80% of brain, passion, intuition, creativity
10% of programming skills
10% of alcoholic levels in your blood.

My Code Sites:
http://rosasm.freeforums.org
http://winasm.tripod.com

dedndave

add this one to your tests   :P

0
01
012
0123
01234
012345
0123456
01234567
012345678

Press any key to continue ...

guga

Thanks dave

It is quite close. The problem is that the pointer it returns in eax is moved forward from the original output address, which leaves zero bytes at the beginning of the buffer if the input is short. For example:
[OutputBuff: B$ 0 #12] ; 12 bytes long

call FastHex 01000, OutputBuff
OutputBuff then holds 0 0 0 0 0 + "01000": 5 leading zero bytes at the start, followed by the converted data.

Although the result is correct, this may cause problems if the output is part of a string chain. For example:

[OutputBuff: B$ "Test" 0#256]

mov edi OutputBuff
add edi 4 ;
call FastHex 01000, edi

The result will be:
[OutputBuff: B$ "Test" 0 0 0 0 0
                     B$ "01000"
                     B$ 0....]

instead of
[OutputBuff: B$ "Test01000"
                     B$ 0....]
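
The chaining problem described above can be reproduced in C. This is a simplified model under stated assumptions, not a port of FastHex: hex_right_aligned (a hypothetical name) writes its digits backwards from the end of a 12-byte area and returns a pointer into it, while append_hex (also hypothetical) copies the result to the start of the destination so the chain stays intact.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Right-aligned conversion: digits are written backwards from buf+10,
   and the returned pointer addresses the first character of the string.
   Bytes in front of that pointer are left untouched. buf needs 12 bytes. */
char *hex_right_aligned(uint32_t value, char *buf)
{
    static const char digits[] = "0123456789ABCDEF";
    char *p = buf + 10;
    *p = '\0';
    do {
        *--p = digits[value & 0xF];
        value >>= 4;
    } while (value != 0);
    *--p = '0';                 /* leading '0' prefix */
    return p;
}

/* Chain-friendly wrapper: convert into a scratch buffer, then copy the
   string (with terminator) to dst. Returns the length without terminator. */
size_t append_hex(char *dst, uint32_t value)
{
    char tmp[12];
    char *s = hex_right_aligned(value, tmp);
    size_t len = (size_t)(tmp + 10 - s);
    memcpy(dst, s, len + 1);
    return len;
}
```

Calling hex_right_aligned directly on a buffer holding "Test" followed by zeros leaves the zero bytes between "Test" and the digits, so the buffer still reads as "Test"; append_hex on the same buffer yields "Test01000".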

Concerning the speed, I made a couple of tests; here is the result:

Below is your code ported to RosAsm so that it behaves the same as the one I'm testing. Both preserve the registers they use internally (with the "Uses" macro, which is a simple push/pop operation). I'm doing this to be sure about the speed of all functions working under the same conditions. The main change I'll try next is to make yours return the length of the converted data in eax, so that both functions behave and work exactly the same and I can get a better idea of the speed.


Proc FastHex:
    Arguments @Input, @Output
    Uses ecx, edx

    mov eax D@Output | add eax 8
    mov ecx D@Input
    test ecx ecx
    mov D$eax 03030 | je P2>
   
FHex00: M6:
    movzx edx cl
    mov dx W$edx*2+hex_table
    mov W$eax  dx
    sub eax 02
    shr ecx 08 | jne M6<
    inc eax
    mov B$eax  030
   
FHex01: P2:
    cmp B$eax  030
    lea eax D$eax+01 | je P2<
    sub eax 02
EndP


Btw, I updated my version:


Proc dwtoHex_Ex2:
    Arguments @Input, @Output
    Local @DwordStorage
    Uses edi

    mov eax D@Input
    mov edi D@Output
    mov D@DwordStorage eax
    mov B$edi '0' | inc edi

    movzx eax B@DwordStorage+3
    Test_If eax eax
        On ax <= 0F, dec edi
        mov ax W$hex_table+eax*2 | stosw
        movzx eax B@DwordStorage+2 | mov ax W$hex_table+eax*2 | stosw
        movzx eax B@DwordStorage+1 | mov ax W$hex_table+eax*2 | stosw
        movzx eax B@DwordStorage+0 | mov ax W$hex_table+eax*2 | stosw
        mov B$edi 0
        sub edi D@Output | mov eax edi
        ExitP
    Test_End

    movzx eax B@DwordStorage+2
    Test_If eax eax
        On ax <= 0F, dec edi
        mov ax W$hex_table+eax*2 | stosw
        movzx eax B@DwordStorage+1 | mov ax W$hex_table+eax*2 | stosw
        movzx eax B@DwordStorage+0 | mov ax W$hex_table+eax*2 | stosw
        mov B$edi 0
        sub edi D@Output | mov eax edi
        ExitP
    Test_End

    movzx eax B@DwordStorage+1
    Test_If eax eax
        On ax <= 0F, dec edi
        mov ax W$hex_table+eax*2 | stosw
        movzx eax B@DwordStorage+0
    Test_Else
        movzx eax B@DwordStorage+0
        On ax <= 0F, dec edi
    Test_End

    mov ax W$hex_table+eax*2 | stosw
    mov B$edi 0
    sub edi D@Output | mov eax edi

EndP
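
The output format of the updated procedure (a '0' prefix, then the hex digits without leading zeros, with the string length as the return value) can be modelled in C. This is a sketch for illustration under stated assumptions: the name dw_to_hex_trimmed is hypothetical, and a 16-character digit string stands in for the 512-byte hex_table.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes '0' followed by the hex digits of value without leading zeros,
   terminates the string and returns its length. out needs 10 bytes. */
size_t dw_to_hex_trimmed(uint32_t value, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    char *p = out;
    int started = 0;

    *p++ = '0';                                  /* tentative prefix */
    for (int shift = 24; shift >= 0; shift -= 8) {
        unsigned byte = (value >> shift) & 0xFF;
        if (!started) {
            if (byte == 0 && shift != 0)
                continue;                        /* skip leading zero bytes */
            if (byte <= 0x0F)
                p--;                             /* the pair's own high '0' serves as the prefix */
            started = 1;
        }
        *p++ = digits[byte >> 4];
        *p++ = digits[byte & 0xF];
    }
    *p = '\0';
    return (size_t)(p - out);
}
```

With this model, 0x1000 gives "01000" (length 5), 0x12 gives "012" (length 3) and 0 gives "00" (length 2).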

jj2007

Quote from: dedndave on November 29, 2015, 01:11:41 PM
add this one to your tests   :P

It gets a bit crowded now :biggrin:
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)

4166    cycles for 100 * dw2hex
7871    cycles for 100 * MB Hex$
52117   cycles for 100 * CRT sprintf
660     cycles for 100 * Bin2Hex
760     cycles for 100 * Bin2Hex2 cx
1635    cycles for 100 * Bin2Hex6
1981    cycles for 100 * FastHex

4217    cycles for 100 * dw2hex
7799    cycles for 100 * MB Hex$
52408   cycles for 100 * CRT sprintf
659     cycles for 100 * Bin2Hex
759     cycles for 100 * Bin2Hex2 cx
1645    cycles for 100 * Bin2Hex6
1763    cycles for 100 * FastHex

4180    cycles for 100 * dw2hex
7841    cycles for 100 * MB Hex$
52083   cycles for 100 * CRT sprintf
658     cycles for 100 * Bin2Hex
778     cycles for 100 * Bin2Hex2 cx
1656    cycles for 100 * Bin2Hex6
1904    cycles for 100 * FastHex

4214    cycles for 100 * dw2hex
7866    cycles for 100 * MB Hex$
52062   cycles for 100 * CRT sprintf
660     cycles for 100 * Bin2Hex
757     cycles for 100 * Bin2Hex2 cx
1647    cycles for 100 * Bin2Hex6
1995    cycles for 100 * FastHex

20      bytes for dw2hex
17      bytes for MB Hex$
29      bytes for CRT sprintf
138     bytes for Bin2Hex
150     bytes for Bin2Hex2 cx
616     bytes for Bin2Hex6
66      bytes for FastHex

00345678        = eax dw2hex
00345678        = eax MB Hex$
345678  = eax CRT sprintf
12345678        = eax Bin2Hex
00345678        = eax Bin2Hex2 cx
12345678        = eax Bin2Hex6
012345678       = eax FastHex

sinsi


AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G (SSE4)
11919   cycles for 100 * dw2hex
10902   cycles for 100 * MB Hex$
54302   cycles for 100 * CRT sprintf
742     cycles for 100 * Bin2Hex
932     cycles for 100 * Bin2Hex2 cx
2277    cycles for 100 * Bin2Hex6
2171    cycles for 100 * FastHex

Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz (SSE4)
3566    cycles for 100 * dw2hex
7054    cycles for 100 * MB Hex$
53879   cycles for 100 * CRT sprintf
589     cycles for 100 * Bin2Hex
722     cycles for 100 * Bin2Hex2 cx
1430    cycles for 100 * Bin2Hex6
1497    cycles for 100 * FastHex

Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz (SSE4)
3077    cycles for 100 * dw2hex
5258    cycles for 100 * MB Hex$
44338   cycles for 100 * CRT sprintf
526     cycles for 100 * Bin2Hex
521     cycles for 100 * Bin2Hex2 cx
1157    cycles for 100 * Bin2Hex6
1261    cycles for 100 * FastHex

🍺🍺🍺

TWell

AMD Athlon(tm) II X2 220 Processor (SSE3)

8622    cycles for 100 * dw2hex
8122    cycles for 100 * MB Hex$
78598   cycles for 100 * CRT sprintf
902     cycles for 100 * Bin2Hex
901     cycles for 100 * Bin2Hex2 cx
4208    cycles for 100 * Bin2Hex6
2019    cycles for 100 * FastHex

Siekmanski

Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz (SSE4)

3956    cycles for 100 * dw2hex
7890    cycles for 100 * MB Hex$
50529   cycles for 100 * CRT sprintf
665     cycles for 100 * Bin2Hex
760     cycles for 100 * Bin2Hex2 cx
1550    cycles for 100 * Bin2Hex6
1952    cycles for 100 * FastHex

3955    cycles for 100 * dw2hex
7845    cycles for 100 * MB Hex$
50451   cycles for 100 * CRT sprintf
663     cycles for 100 * Bin2Hex
760     cycles for 100 * Bin2Hex2 cx
1553    cycles for 100 * Bin2Hex6
1974    cycles for 100 * FastHex

3953    cycles for 100 * dw2hex
7889    cycles for 100 * MB Hex$
50417   cycles for 100 * CRT sprintf
662     cycles for 100 * Bin2Hex
762     cycles for 100 * Bin2Hex2 cx
1561    cycles for 100 * Bin2Hex6
2015    cycles for 100 * FastHex

3933    cycles for 100 * dw2hex
7851    cycles for 100 * MB Hex$
50340   cycles for 100 * CRT sprintf
661     cycles for 100 * Bin2Hex
766     cycles for 100 * Bin2Hex2 cx
1584    cycles for 100 * Bin2Hex6
1967    cycles for 100 * FastHex

20      bytes for dw2hex
17      bytes for MB Hex$
29      bytes for CRT sprintf
138     bytes for Bin2Hex
150     bytes for Bin2Hex2 cx
616     bytes for Bin2Hex6
66      bytes for FastHex

00345678        = eax dw2hex
00345678        = eax MB Hex$
345678  = eax CRT sprintf
12345678        = eax Bin2Hex
00345678        = eax Bin2Hex2 cx
12345678        = eax Bin2Hex6
012345678       = eax FastHex
Creative coders use backward thinking techniques as a strategy.

guga

JJ, I'm trying to make the tested functions behave the same (I mean, all inside a regular proc instead of a void function), but I'm having problems with the MASM syntax.

I rebuilt the function as:



Bin2Hex6 proc Input:DWORD, Output:DWORD
Local DwordStorage:DWORD

  push eax
  push edi

  mov eax, Input
  mov edi, Output
  mov DwordStorage, eax
  movzx eax, byte ptr [DwordStorage+3]
  mov ax, word ptr hex_table[eax*2]
  stosw
  movzx eax, byte ptr [DwordStorage+2]
  mov ax, word ptr hex_table[eax*2]
  stosw
  movzx eax, byte ptr [DwordStorage+1]
  mov ax, word ptr hex_table[eax*2]
  stosw
  movzx eax, byte ptr [DwordStorage]
  mov ax, word ptr hex_table[eax*2]
  stosw
  mov byte ptr [edi], 0

  pop edi
  pop eax

Bin2Hex6 endp

NameG equ <Bin2Hex6> ; assign a descriptive name here
TestG proc
  mov ebx, AlgoLoops-1 ; loop e.g. 100x
  align 4
  .Repeat
;push offset somestring
;push 12345678h
call Bin2Hex6 12345678h, offset somestring
dec ebx
  .Until Sign?
  mov eax, offset somestring
  ret
TestG endp


But why can't MASM assemble it? It reports a symbol redefinition. Is this the proper syntax?

jj2007

Well... using Input and Output as equates is kind of courageous ;)

guga

Equates? I thought they were arguments of the function :icon_mrgreen: It's been a long time since I last used MASM, but I managed to port something that produces more similar output. Despite the difference in the way the results are built, the timings are close :)

I hope the syntax is OK now. I rebuilt the function Bin2Hex so that it works as a proc with 2 arguments:


Bin2Hex proc near
_Input = dword ptr 8
_Output = dword ptr 0Ch
push ebp
mov ebp, esp
push ecx
push edx
push edi
mov eax, [ebp+_Input]
mov edi, [ebp+_Output]
mov edx, offset hex_table ; "000102030405060708090A0B0C0D0E0F1011121"...
movzx ecx, al
movzx ecx, word ptr [edx+ecx*2]
mov [edi+6], cx
movzx ecx, ah
movzx ecx, word ptr [edx+ecx*2]
mov [edi+4], cx
shr eax, 10h
movzx ecx, al
movzx ecx, word ptr [edx+ecx*2]
mov [edi+2], cx
movzx ecx, ah
movzx ecx, word ptr [edx+ecx*2]
mov [edi], cx
mov byte ptr [edi+8], 0
lea eax, [edi]
pop edi
pop edx
pop ecx
mov esp, ebp
pop ebp
retn 8
Bin2Hex endp



Intel(R) Core(TM) i7 CPU         870  @ 2.93GHz (SSE4)

8867 cycles for 100 * dw2hex
7048 cycles for 100 * MB Hex$
48977 cycles for 100 * CRT sprintf
787 cycles for 100 * Bin2Hex
245 cycles for 100 * Bin2Hex2 cx
1243 cycles for 100 * Bin2Hex6
1679 cycles for 100 * FastHex

4721 cycles for 100 * dw2hex
9396 cycles for 100 * MB Hex$
65490 cycles for 100 * CRT sprintf
1332 cycles for 100 * Bin2Hex
558 cycles for 100 * Bin2Hex2 cx
1598 cycles for 100 * Bin2Hex6
1727 cycles for 100 * FastHex

4414 cycles for 100 * dw2hex
9983 cycles for 100 * MB Hex$
77615 cycles for 100 * CRT sprintf
1574 cycles for 100 * Bin2Hex
682 cycles for 100 * Bin2Hex2 cx
1934 cycles for 100 * Bin2Hex6
1990 cycles for 100 * FastHex

5609 cycles for 100 * dw2hex
11614 cycles for 100 * MB Hex$
81394 cycles for 100 * CRT sprintf
1708 cycles for 100 * Bin2Hex
760 cycles for 100 * Bin2Hex2 cx
1992 cycles for 100 * Bin2Hex6
2185 cycles for 100 * FastHex

20 bytes for dw2hex
17 bytes for MB Hex$
29 bytes for CRT sprintf
139 bytes for Bin2Hex
150 bytes for Bin2Hex2 cx
616 bytes for Bin2Hex6
66 bytes for FastHex

00345678 = eax dw2hex
00345678 = eax MB Hex$
345678 = eax CRT sprintf
12345678 = eax Bin2Hex
00345678 = eax Bin2Hex2 cx
12345678 = eax Bin2Hex6
012345678 = eax FastHex

--- ok ---




jj2007

Quote from: guga on November 30, 2015, 02:28:00 AM
Intel(R) Core(TM) i7 CPU         870  @ 2.93GHz (SSE4)
787 cycles for 100 * Bin2Hex
245 cycles for 100 * Bin2Hex2 cx

Your i7 is cheating, Guga :eusa_naughty:

My code is fast but 2.45 cycles is fake 8)

guga

Cheating? How is that possible?
I didn't touch Bin2Hex2 cx.



dedndave

i suggest you write a little test piece, like i did for FastHex, to verify proper results
if it takes 2 or 3 clock cycles to finish, it's a good bet that it isn't working to begin with
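
A small verification piece of the kind suggested here could look like the following C sketch. It is an illustration only, not dedndave's test program: the candidate is compared against snprintf for a handful of values, and all names (hex8_fn, candidate_hex8, broken_hex8, count_mismatches) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef void (*hex8_fn)(uint32_t value, char *out);

/* A straightforward converter used as the candidate under test. */
void candidate_hex8(uint32_t value, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    for (int i = 7; i >= 0; i--) {
        out[i] = digits[value & 0xF];
        value >>= 4;
    }
    out[8] = '\0';
}

/* A "suspiciously fast" routine: it returns at once without converting. */
void broken_hex8(uint32_t value, char *out)
{
    (void)value;
    out[0] = '\0';
}

/* Compare fn against snprintf("%08lX") and return the number of mismatches. */
int count_mismatches(hex8_fn fn)
{
    static const uint32_t samples[] = {
        0u, 1u, 0x12u, 0x1000u, 0x345678u, 0x12345678u, 0xFFFFFFFFu
    };
    int bad = 0;
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        char got[16], want[16];
        fn(samples[i], got);
        snprintf(want, sizeof want, "%08lX", (unsigned long)samples[i]);
        if (strcmp(got, want) != 0) {
            printf("mismatch: got \"%s\", expected \"%s\"\n", got, want);
            bad++;
        }
    }
    return bad;
}
```

A routine that reports only a few cycles but fails a check like this is not doing the work it is timed for.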

guga

There is something very weird. I isolated JJ's Bin2Hex2 cx code so that the app displays only this algo (I deleted all the others), and I keep getting different results whenever I run the app.

To get these different results, all I did was open the app and close it, wait 3 to 5 seconds, open it again and close it, and so on.

jj2007

Hi Guga,

I was just making fun, but this is indeed weird! I thought it was just a strange outlier. These are my very stable results:

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)
3386    cycles for 100 * dw2hex
6473    cycles for 100 * MB Hex$
42045   cycles for 100 * CRT sprintf
1141    cycles for 100 * Bin2Hex Guga
613     cycles for 100 * Bin2Hex2 cx
1328    cycles for 100 * Bin2Hex6
1348    cycles for 100 * FastHex

3389    cycles for 100 * dw2hex
6370    cycles for 100 * MB Hex$
42029   cycles for 100 * CRT sprintf
1147    cycles for 100 * Bin2Hex Guga
612     cycles for 100 * Bin2Hex2 cx
1329    cycles for 100 * Bin2Hex6
1633    cycles for 100 * FastHex


Try setting AlgoLoops or TimerLoops ten times higher; sometimes this helps to stabilise timings.
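
Why more loops can help may be easier to see in a C sketch. This is not how the test harness in this thread is implemented; it is an assumed model in which algo_loops calls are timed as one batch, the batch is repeated timer_loops times, and the smallest batch time is kept so that occasional interruptions do not inflate the result. The names best_time and sample_algo are hypothetical, and clock() is far coarser than a cycle counter.

```c
#include <stdint.h>
#include <time.h>

typedef void (*algo_fn)(uint32_t value, char *out);

/* Volatile sink so the compiler cannot discard the timed calls. */
static volatile char sink;

/* Stand-in algorithm: a plain 8-digit conversion. */
void sample_algo(uint32_t value, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    for (int i = 7; i >= 0; i--) {
        out[i] = digits[value & 0xF];
        value >>= 4;
    }
    out[8] = '\0';
}

/* Time algo_loops calls as one batch, repeat timer_loops times,
   and return the smallest batch time in clock() ticks. */
clock_t best_time(algo_fn fn, int algo_loops, int timer_loops)
{
    char buf[16];
    clock_t best = 0;
    for (int t = 0; t < timer_loops; t++) {
        clock_t start = clock();
        for (int i = 0; i < algo_loops; i++) {
            fn(0x12345678u + (uint32_t)i, buf);
            sink = buf[0];
        }
        clock_t elapsed = clock() - start;
        if (t == 0 || elapsed < best)
            best = elapsed;     /* keep the least disturbed batch */
    }
    return best;
}
```

Raising algo_loops makes each batch long enough to swamp the timer overhead, and raising timer_loops gives more chances for an undisturbed batch.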