Ascii to DWORD replacement

hutch-- · January 25, 2013, 01:58:28 PM

What I am after is a good clean replacement for Iczelion's old algo, which has known problems. I am loath to touch heirlooms, and while I am happy enough with the extended version, it would be useful to have a conventional version as well. What I would be interested in seeing is a modern version that does exactly the same thing and can be used as a drop-in replacement for Iczelion's antique.

Requirement is no stack frame, no ancient string instructions and no high level masm operators, just genuine low level fast code.

This is the original spec.

atodw

atodw proc uses edi esi String:PTR BYTE

Description
atodw converts a decimal string to dword.
Note that the parameter String is an address of DWORD size.

Parameter
1. String The address of the decimal string to convert

Return Value
The DWORD value is returned in eax.

Example
invoke atodw,ADDR MyDecimalString
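
A minimal sketch of how the spec above might be exercised from a console program. It assumes the MASM32 SDK is installed in \masm32 and uses its runtime include with the print, str$ and exit macros; the label start and the sample string contents are made up for illustration.

```asm
; sketch only: assumes the MASM32 SDK is installed in \masm32
include \masm32\include\masm32rt.inc

.data
MyDecimalString db "12345",0            ; sample input, zero terminated

.code
start:
    invoke atodw,ADDR MyDecimalString   ; eax = value of the decimal string
    print str$(eax),13,10               ; show the result as text
    exit

end start
```

A drop-in replacement would be called the same way: one stack argument, result in eax, callee cleans the stack.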

dedndave · January 25, 2013, 02:03:34 PM

Quote: no ancient string instructions

crap - i was gonna use LODSB, STOSB, and LOOP

jj2007 · January 25, 2013, 02:53:42 PM

Quote from: hutch-- on January 25, 2013, 01:58:28 PM
Requirement is no stack frame, no ancient string instructions and no high level masm operators, just genuine low level fast code.

What else? Should fit in one para aka 16 bytes, at least ten times as fast as the current version?
Be more specific, Hutch

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
atodwJJ proc Src
push esi
xor edx, edx ; will be integer value
mov esi, [esp+8] ; Src: after push esi the argument sits at esp+8, esp+4 is the return address
lea ecx, [esi+9] ; max address
@@:
movzx eax, byte ptr [esi] ; load a char
lea edx, [edx+4*edx] ; edx=5*edx
lea edx, [2*edx+eax-48] ; edx=10*edx+(eax-48)
inc esi
cmp ecx, esi ; >9 digits means we need the FPU; 4294967295 aka 2^32-1 is the limit
jne @B
pop esi
xchg eax, edx
retn 4
atodwJJ endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

dedndave · January 25, 2013, 03:17:55 PM

here is my first whack at it
it will handle signed or unsigned
and, it's UNICORN aware

no error checking

Code Select

        OPTION  PROLOGUE:None
        OPTION  EPILOGUE:None

a2dw    PROC    lpAscStr:LPSTR

        push    esi
        mov     esi,[esp+8]
        xor     eax,eax
        movzx   edx,byte ptr [esi]
        mov     cl,10
        cmp     dl,'-'
        push    edx
        jnz     a2dw02

        IFDEF __UNICODE__
            add     esi,2
        ELSE
            inc     esi
        ENDIF
        jmp short a2dw01

a2dw00: mul     cl
        lea     eax,[eax+edx-30h]

a2dw01: mov     dl,[esi]

a2dw02:
        IFDEF __UNICODE__
            add     esi,2
        ELSE
            inc     esi
        ENDIF
        or      dl,dl
        jnz     a2dw00

        pop     edx
        cmp     dl,'-'
        jnz     a2dw03

        neg     eax

a2dw03: pop     esi
        ret     4

a2dw    ENDP

        OPTION  PROLOGUE:PrologueDef
        OPTION  EPILOGUE:EpilogueDef

it's untested, but gives you the concept

EDIT: removed an unneeded line of code
changed MOV CL,10 to MOV ECX,10
rearranged a couple instructions in the preamble

EDIT: oops - went back to MOV CL,10 and MUL CL
that's not going to work - lol
i need another register :(
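
The trouble with the two MUL forms comes from how the operand size selects the registers involved. A small illustration with values chosen for the example; presumably this is why the first version, which keeps the character in EDX, needs another register once MUL ECX is used.

```asm
; 8-bit form: AX = AL * CL, the upper 16 bits of EAX are left alone
        mov     eax,1000                ; 3E8h, so AL = 0E8h = 232
        mov     cl,10
        mul     cl                      ; AX = 232*10 = 2320, so EAX = 2320, not 10000

; 32-bit form: EDX:EAX = EAX * ECX, so EDX is overwritten
        mov     eax,1000
        mov     ecx,10
        mul     ecx                     ; EAX = 10000, EDX = 0 (whatever EDX held is lost)
```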

dedndave · January 25, 2013, 03:53:33 PM

ok - tested it, this time - lol
signed or unsigned, UNICORN, no error checking

Code Select

;***********************************************************************************************

        OPTION  PROLOGUE:None
        OPTION  EPILOGUE:None

a2dwDave PROC   lpAscStr:LPSTR

        push    esi
        push    ebx
        mov     esi,[esp+12]
        xor     eax,eax
        movzx   ebx,byte ptr [esi]
        mov     ecx,10
        cmp     bl,'-'
        push    ebx
        jnz     a2dw02

        IFDEF __UNICODE__
            add     esi,2
        ELSE
            inc     esi
        ENDIF
        jmp short a2dw01

a2dw00: mul     ecx
        lea     eax,[eax+ebx-30h]

a2dw01: mov     bl,[esi]

a2dw02:
        IFDEF __UNICODE__
            add     esi,2
        ELSE
            inc     esi
        ENDIF
        or      bl,bl
        jnz     a2dw00

        pop     edx
        cmp     dl,'-'
        jnz     a2dw03

        neg     eax

a2dw03: pop     ebx
        pop     esi
        ret     4

a2dwDave ENDP

        OPTION  PROLOGUE:PrologueDef
        OPTION  EPILOGUE:EpilogueDef

;***********************************************************************************************

dedndave · January 25, 2013, 04:08:27 PM

i get about 130 cycles on my prescott for '4294967295',0 or '-2147483648',0
doesn't seem to care if it's UNICORN or not

dedndave · January 25, 2013, 04:33:30 PM

i changed this

Code Select

a2dw00: mul     ecx
        lea     eax,[eax+ebx-30h]

to this

Code Select

a2dw00: lea     eax,[4*eax+eax]
        lea     eax,[2*eax+ebx-30h]

almost twice as fast, 79 clock cycles
that means i don't need an extra register
new version coming...
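
For reference, a hand trace of the LEA pair on the two characters of "42", written as straight-line code. This is only an illustration of the arithmetic (10*eax + digit computed as 2*(5*eax) + char - 30h), not part of the routine.

```asm
; trace for "42": '4' = 34h, '2' = 32h
        xor     eax,eax                 ; eax = 0
        mov     ecx,'4'                 ; ecx = 34h
        lea     eax,[4*eax+eax]         ; eax = 5*0 = 0
        lea     eax,[2*eax+ecx-30h]     ; eax = 2*0 + 34h - 30h = 4
        mov     ecx,'2'                 ; ecx = 32h
        lea     eax,[4*eax+eax]         ; eax = 5*4 = 20
        lea     eax,[2*eax+ecx-30h]     ; eax = 2*20 + 32h - 30h = 42
```

Unlike MUL, LEA writes only its destination register, so EDX stays free for the string pointer.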

dedndave · January 25, 2013, 04:41:02 PM

73 cycles for '4294967295',0 or '-2147483648',0

Code Select

;***********************************************************************************************

        OPTION  PROLOGUE:None
        OPTION  EPILOGUE:None

a2dwDave PROC   lpszAscStr:LPSTR

;Ascii to Dword, Signed or Unsigned, UNICODE Aware, No Error Checking
;DednDave, 1-2013

;-------------------------------------------------

        mov     edx,[esp+4]
        xor     eax,eax
        movzx   ecx,byte ptr [edx]
        cmp     cl,'-'
        push    ecx
        jnz     a2dw02

        IFDEF __UNICODE__
            add     edx,2
        ELSE
            inc     edx
        ENDIF
        jmp short a2dw01

a2dw00: lea     eax,[4*eax+eax]
        lea     eax,[2*eax+ecx-30h]

a2dw01: mov     cl,[edx]

a2dw02:
        IFDEF __UNICODE__
            add     edx,2
        ELSE
            inc     edx
        ENDIF
        or      cl,cl
        jnz     a2dw00

        pop     edx
        cmp     dl,'-'
        jnz     a2dw03

        neg     eax

a2dw03: ret     4

a2dwDave ENDP

        OPTION  PROLOGUE:PrologueDef
        OPTION  EPILOGUE:EpilogueDef

;***********************************************************************************************

hutch-- · January 25, 2013, 05:16:59 PM

I just found the version that was supposed to replace the old one. It has not been renamed to match the old one.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

align 16

atou proc String:DWORD

; ------------------------------------------------
; Convert decimal string into UNSIGNED DWORD value
; ------------------------------------------------

mov edx, [esp+4]
xor ecx, ecx
movzx eax, BYTE PTR [edx]
test eax, eax
jz quit

lpst:
add edx, 1
lea ecx, [ecx+ecx*4] ; mul ecx * 5
lea ecx, [eax+ecx*2-48]
movzx eax, BYTE PTR [edx]
test eax, eax
jz quit

add edx, 1
lea ecx, [ecx+ecx*4] ; mul ecx * 5
lea ecx, [eax+ecx*2-48]
movzx eax, BYTE PTR [edx]
test eax, eax
jnz lpst

quit:
lea eax, [ecx]

ret 4

atou endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

hutch-- · January 26, 2013, 12:51:31 AM

JJ,

The new version is much faster than an old version you posted some time ago but the new version does not return the correct results.

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

atodJJ proc String:DWORD

   mov edx, [esp+4]
   xor eax, eax

@@:   movzx ecx, byte ptr [edx]
   inc edx
   lea eax, [eax+eax*4]
   cmp byte ptr [edx], ch
   lea eax, [ecx+eax*2-30h]
   jnz @b
   ret 4

atodJJ endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

; OPTION PROLOGUE:NONE
; OPTION EPILOGUE:NONE
;
; atodJJ proc String:DWORD ; Src
;
; push esi
; xor edx, edx ; will be integer value
; mov esi, [esp+8] ; Src: after push esi the argument sits at esp+8
; lea ecx, [esi+9] ; max address
;
; @@:
; movzx eax, byte ptr [esi] ; load a char
; lea edx, [edx+4*edx] ; edx=5*edx
; lea edx, [2*edx+eax-48] ; edx=10*edx+(eax-48)
; inc esi
; cmp ecx, esi ; >9 digits means we need the FPU; 4294967295 aka 2^32-1 is the limit
; jne @B
;
; pop esi
; xchg eax, edx
; retn 4
;
; atodJJ endp
;
; OPTION PROLOGUE:PrologueDef
; OPTION EPILOGUE:EpilogueDef

jj2007 · January 26, 2013, 01:28:42 AM

Quote from: hutch-- on January 26, 2013, 12:51:31 AM
JJ,

The new version is much faster than an old version you posted some time ago but the new version does not return the correct results.

Hutch,
I have not tested it much, little time for that, and there is no error checking yet, but the few strings I tried were ok... probably because they were exactly 8 characters long

Thanks,
JJ

qWord · January 26, 2013, 01:44:21 AM

maybe a bit off topic, but such a short algo can also simply be inlined:

Code Select

; return: EAX = number
; tchr2dw psz
; tchr2dw &sz
; tchr2dw ADDR sz
; tchr2dw Addr sz
; tchr2dw addr sz
tchr2dw macro psz:req
	LOCAL l1,l2
	%	FOR arg,<reparg(psz)>
			IF @InStr(1,<&arg>,<ADDR >) OR @InStr(1,<&arg>,<Addr >) OR @InStr(1,<&arg>,<addr >)
				lea ecx,@SubStr(<&arg>,5)
			ELSEIFIDN @SubStr(<&arg>,1,1),<!&>
				lea ecx,@SubStr(<&arg>,2)
			ELSE
				mov ecx,arg
			ENDIF
			EXITM
		ENDM
		
	   	xor eax,eax
	l1:	movzx edx,TCHAR ptr [ecx]
		test edx,edx
		jz l2
		lea eax,[eax+eax*4]
		lea eax,[eax*2+edx-'0']
		lea ecx,[ecx+SIZEOF TCHAR]
		jmp l1
	l2:
endm

dedndave · January 26, 2013, 01:56:22 AM

Jochen's routine does not check for null termination
looks ok, otherwise
although, you could swap usage of EAX and EDX and save the XCHG instruction :P

i like qWord's idea :P
saves the call/ret overhead
but, i would mod it a little

Code Select

        mov     edx,arg
        xor     eax,eax
        jmp short loop01

loop00: lea     eax,[eax+eax*4]
        lea     eax,[eax*2+ecx-'0']

loop01: movzx   ecx,TCHAR ptr [edx]
        test    ecx,ecx
        lea     edx,[edx+sizeof TCHAR]
        jnz     loop00

add a sign check and you've got a cool macro
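
One way the sign check might look, as a rough sketch built on the fragment above. The labels sgn00 to sgn02 are made up for this example (inside a macro they would need to be declared LOCAL), arg is the string pointer as before, and there is still no error checking.

```asm
        mov     edx,arg                     ; edx -> string
        xor     eax,eax                     ; eax = running value
        movzx   ecx,TCHAR ptr [edx]         ; first character
        push    ecx                         ; keep it for the sign test at the end
        cmp     ecx,'-'
        jne     sgn01                       ; no sign: convert from the first char
        lea     edx,[edx+sizeof TCHAR]      ; skip the '-'
        jmp short sgn01

sgn00:  lea     eax,[eax+eax*4]             ; eax = 5*eax
        lea     eax,[eax*2+ecx-'0']         ; eax = 10*old eax + digit

sgn01:  movzx   ecx,TCHAR ptr [edx]
        test    ecx,ecx
        lea     edx,[edx+sizeof TCHAR]      ; LEA leaves the flags from TEST intact
        jnz     sgn00

        pop     ecx                         ; first character again
        cmp     ecx,'-'
        jne     sgn02
        neg     eax                         ; leading '-': negate the result

sgn02:
```

The first character is parked on the stack so that only EAX, ECX and EDX are touched, the same trick used in a2dwDave.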

jj2007 · January 26, 2013, 03:21:47 AM

Hi all,
Here is a new version with correct results and error checks. Of course, it's now slow and bloated :(

testing 1: 1 ok
testing -1: -1 ok
testing 123: 123 ok
testing 123456789: 123456789 ok
testing 1234567890: source too long
testing -9876x54321: x is an invalid character
testing -987.654321: . is an invalid character
testing -987654321: -987654321 ok
testing -9876543210: source too long
testing -987654h: h is an invalid character
48 bytes for atodw
11 bytes for Dec2Dword
6 bytes for Dec2Dword2
62 bytes for atodwJJ

AMD Athlon(tm) Dual Core Processor 4450B (SSE3)
++++++++++++++++++++
93 cycles for 10 * Dec2Dword
724 cycles for 10 * atodwJJ
3554 cycles for 10 * a2dw

93 cycles for 10 * Dec2Dword
724 cycles for 10 * atodwJJ
3558 cycles for 10 * a2dw

93 cycles for 10 * Dec2Dword
725 cycles for 10 * atodwJJ
3553 cycles for 10 * a2dw

92 cycles for 10 * Dec2Dword
725 cycles for 10 * atodwJJ
3556 cycles for 10 * a2dw

93 cycles for 10 * Dec2Dword
724 cycles for 10 * atodwJJ
3693 cycles for 10 * a2dw

@Dave: Eliminating xchg eax, edx is not a good idea. Two bytes longer and many cycles slower...
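
The code behind the results above is not shown in this post. As a rough sketch of what a per-character validity check can look like (not the actual atodwJJ), here is an unsigned ANSI-only variant; the name atouChk, its labels, and the convention of reporting an invalid character through the carry flag are assumptions made for this example, and overflow ("source too long") is not detected.

```asm
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

atouChk proc String:DWORD               ; hypothetical name

        mov     edx,[esp+4]             ; edx -> zero terminated ANSI string
        xor     eax,eax                 ; eax = running value

chk00:  movzx   ecx,byte ptr [edx]      ; load a char
        test    ecx,ecx                 ; TEST always clears the carry flag
        jz      chk02                   ; terminator: done, carry is clear
        sub     ecx,'0'                 ; '0'..'9' becomes 0..9
        cmp     ecx,9
        ja      chk01                   ; unsigned compare also rejects chars below '0'
        inc     edx
        lea     eax,[eax+eax*4]         ; eax = 5*eax
        lea     eax,[ecx+eax*2]         ; eax = 10*old eax + digit
        jmp     chk00

chk01:  stc                             ; invalid character: report via carry (assumed convention)

chk02:  ret     4

atouChk endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
```

A caller would test the flag right after the call, for example with jc to an error handler; on error EAX holds the value converted up to the bad character.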

Gunther · January 26, 2013, 06:46:47 AM

My results:

Code Select


testing 1:	1	 ok
testing -1:	-1	 ok
testing 123:	123	 ok
testing 123456789:	123456789	 ok
testing 1234567890:	source too long
testing -9876x54321:	x is an invalid character
testing -987.654321:	. is an invalid character
testing -987654321:	-987654321	 ok
testing -9876543210:	source too long
testing -987654h:	h is an invalid character
48	bytes for atodw
11	bytes for Dec2Dword
6	bytes for Dec2Dword2
62	bytes for atodwJJ

Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz (SSE4)
loop overhead is approx. 35/10 cycles


23	cycles for 10 * Dec2Dword
1015	cycles for 10 * atodwJJ
1570	cycles for 10 * a2dw

24	cycles for 10 * Dec2Dword
407	cycles for 10 * atodwJJ
1615	cycles for 10 * a2dw

23	cycles for 10 * Dec2Dword
720	cycles for 10 * atodwJJ
1575	cycles for 10 * a2dw

23	cycles for 10 * Dec2Dword
1022	cycles for 10 * atodwJJ
2183	cycles for 10 * a2dw

24	cycles for 10 * Dec2Dword
1018	cycles for 10 * atodwJJ
2180	cycles for 10 * a2dw

--- ok ---

Gunther
