qword to ascii conversion

Started by allynm, June 06, 2012, 02:36:32 AM

dedndave

well - that is not a great routine - lol
i wrote that long ago
since then, i have found that STD and CLD are very slow instructions
you could easily convert that code so that it did not use LODS/STOS and make a much nicer version

also - that routine does not follow the ABI
and - the values are passed in registers
it needs to be re-written   :P

dedndave

here - this one is a little better
example of use:
        INVOKE  Asc64,LoDword,HiDword
;EAX = address of ASCII decimal string


Asc64   PROTO   :DWORD,:DWORD

        .DATA

AscBuf  DB      '01234567890123456789',0  ;20 ASCII digits

        .CODE

Asc64   PROC USES EBX ESI EDI dwLoDword:DWORD,dwHiDword:DWORD

;Convert 64-bit unsigned integer to ASCII decimal string
;
;Call With: dwLoDword = low DWORD of QWORD value to convert
;           dwHiDword = high DWORD of QWORD value to convert
;
;  Returns: EAX = Offset into AscBuf of first numchar

        mov     edi,offset AscBuf+18
        mov     ecx,dwHiDword
        mov     esi,dwLoDword
        mov     ebx,100

Asc64a: xor     edx,edx
        xchg    eax,ecx
        div     ebx
        xchg    eax,ecx
        xchg    eax,esi
        div     ebx
        xchg    eax,esi
        xchg    eax,edx
        aam
        xchg    al,ah
        or      ax,3030h
        mov     [edi],ax
        mov     eax,ecx
        sub     edi,2
        or      eax,esi
        jnz     Asc64a

        inc     edi
        inc     edi
        cmp byte ptr [edi],30h       ;leading 0 ?
        jnz     Asc64b               ;no - done

        inc     edi                  ;yes - suppress it

Asc64b: xchg    eax,edi              ;return pointer in EAX
        ret

Asc64   ENDP


it could be modified so that you pass a pointer to the QWORD in memory instead of passing it directly
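
For readers who find C easier to follow, the loop in Asc64 can be modeled roughly as below. This is only a sketch of the arithmetic, not a replacement for the routine: the names asc_buf and asc64_model are made up for illustration, and the two chained 32-bit DIVs are written as ordinary C division.

```c
#include <stdint.h>

/* Hypothetical C model of the Asc64 loop (names are illustrative only).
   The QWORD arrives as two DWORDs and is divided by 100 with two chained
   divisions; each remainder (0..99) yields two ASCII digits, written
   from right to left. */
static char asc_buf[21];                  /* 20 digits + terminating NUL */

static const char *asc64_model(uint32_t lo, uint32_t hi)
{
    int i = 18;                           /* last digit pair, like AscBuf+18 */
    asc_buf[20] = '\0';
    do {
        uint32_t rem = hi % 100;          /* first DIV: remainder of the high DWORD */
        hi /= 100;
        uint64_t part = ((uint64_t)rem << 32) | lo;   /* EDX:EAX for the second DIV */
        lo  = (uint32_t)(part / 100);     /* fits in 32 bits because rem < 100 */
        rem = (uint32_t)(part % 100);
        asc_buf[i]     = (char)('0' + rem / 10);      /* AAM: tens digit */
        asc_buf[i + 1] = (char)('0' + rem % 10);      /* AAM: ones digit */
        i -= 2;
    } while (hi | lo);
    i += 2;                               /* step back to the first pair written */
    if (asc_buf[i] == '0')                /* suppress a single leading zero */
        i++;
    return asc_buf + i;
}
```

Calling asc64_model(0xFFFFFFFF, 0xFFFFFFFF) returns a pointer to "18446744073709551615", and asc64_model(0, 0) returns "0". Like the assembly version, the result lives in a shared static buffer, so it is overwritten by the next call.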

allynm

Hello everyone,

Just an update.  After figuring out how to use the EAX register in Raymond's code, the program ran beautifully... as of course one would expect.  Dave mentioned "speed" several times.  Seems to me it's worth running MichaelW's timers on these programs.  I want to look at Drizz's program too.

Mark

dedndave

a lot depends on how it is to be used...

if you are going to convert a few values during the program session - speed isn't much of an issue
if your program uses FPU/MMX or SSE code, you may not want the routine to disturb those registers
if your program uses FPU/MMX or SSE code, and you are converting a lot of values - one like Drizz's may be what you want
if your program doesn't use FPU/MMX or SSE code, and you are converting a lot of values - use one like Ray's, or perhaps, an SSE version

jj2007


allynm

Hi Dave and JJ,

When I mentioned MichaelW's timer programs I was thinking of just a single conversion. As you say, slight differences might not matter for a single conversion, but if the conversions need to be done many times over, it could be another story altogether.

JJ-Thanks for the link.  I will read it.

I should add that when I mentioned Raymond's code and the EAX register, I should have made it clear that Raymond's documentation was very clear on the role EAX plays in his solution. It just took me a while to recognize exactly what he meant.  My bad, as they say...

Regards,
Mark

jj2007

Quote from: allynm on June 08, 2012, 06:40:36 AM
When I mentioned MichaelW's timer programs...

I was trying to get the two algos by Paul Dixon and Lingo running but no luck. They are fast but don't produce the expected result - is that wrong usage?? ::)
mov esi, offset Src
mov edi, offset Dest
invoke uqword, edi, esi


Attention: Lingo's b2a3264 may crash because it doesn't preserve registers.

dedndave

Quote from: jj2007 on June 08, 2012, 06:31:40 AM
There is an incredible essay titled "TRANSFER OF AN INTEGER BINARY VALUE INTO DECIMAL ASCII STRING" by Andrija Radović. Must read :t

Converting from binary to decimal is simply a matter of base conversion. There are a few methods that can be used, as he pointed out. However, in the Horner's Rule method, he only mentioned dividing by 10. Furthermore, I don't think he noticed it was an application of Horner's Rule, or Ling Long Kai Fang.

A concept that I learned while writing the Ling Long Kai Fang routines was that the selection of bases is somewhat arbitrary. For example, many 32-bit routines might divide by 10,000, and convert 4 decimal digits to ASCII at once. Strictly speaking, this is not really conversion from binary (base 2) to decimal (base 10). Rather, it may be viewed as conversion from base 4,294,967,296 to base 10,000. While the values stored in a dword may indeed be binary, we can think of that dword as a single digit. If we were to apply the same line of thought to bytes, we might call it base 256, and words, base 65536.

I found it was important to realize the difference as I wrote the Ling Long Kai Fang routines. In the first version of the routines, I converted from base 4,294,967,296 to base 100,000,000. Each intermediate dword then held 8 decimal digits.

In the later version, I convert from base 4,294,967,296 to base 1,000,000,000. Now, each intermediate dword holds 9 decimal digits. It made the loop for conversion to ASCII considerably more tedious, but gave a significant performance improvement for very large integers.
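
The "dword as a single digit" idea above can be sketched in C. This is not the Ling Long Kai Fang code, just a minimal illustration under stated assumptions: the integer is a little-endian array of at most MAX_LIMBS dwords, and the names limbs_divmod, limbs_to_decimal and MAX_LIMBS are invented for the example. Each pass divides the whole array by 1,000,000,000, and every remainder is one base-1,000,000,000 digit holding 9 decimal digits.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_LIMBS 8   /* illustrative limit: up to 256-bit values */

/* Divide a little-endian array of base-4,294,967,296 digits in place by a
   small divisor and return the remainder. */
static uint32_t limbs_divmod(uint32_t *limbs, int n, uint32_t divisor)
{
    uint64_t rem = 0;
    for (int i = n - 1; i >= 0; i--) {
        uint64_t cur = (rem << 32) | limbs[i];
        limbs[i] = (uint32_t)(cur / divisor);
        rem = cur % divisor;
    }
    return (uint32_t)rem;
}

/* Convert value (n dwords, n <= MAX_LIMBS) to a decimal string.
   out must have room for 10 * n + 1 characters. */
static void limbs_to_decimal(const uint32_t *value, int n, char *out)
{
    uint32_t work[MAX_LIMBS];
    uint32_t chunks[2 * MAX_LIMBS];   /* base-1e9 digits, least significant first */
    int count = 0;

    memcpy(work, value, (size_t)n * sizeof work[0]);
    for (;;) {
        chunks[count++] = limbs_divmod(work, n, 1000000000u);
        int nonzero = 0;
        for (int i = 0; i < n; i++)
            nonzero |= (work[i] != 0);
        if (!nonzero)
            break;
    }
    /* The most significant chunk is printed unpadded, the rest as exactly
       9 digits each. */
    out += sprintf(out, "%u", (unsigned)chunks[--count]);
    while (count > 0)
        out += sprintf(out, "%09u", (unsigned)chunks[--count]);
}
```

With the two dwords {0xFFFFFFFF, 0xFFFFFFFF} this produces 18446744073709551615. Choosing 10,000 or 100,000,000 as the divisor works the same way; only the padding width in the second sprintf changes (4 or 8 instead of 9).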

KeepingRealBusy

Allynm,

Here is my version:

I have included a zip of my QWORD conversion code (signed and unsigned) and the
test code and the timing code (included this text also). The test code is for
reference only, you need the output routines which I have not included (no, I
don't use the masm32.lib, I roll my own). The timing and conversion PROCs should
be free standing.

The following timing is extracted from a console output for one pass of 8
conversions with output, followed by 1,000,000 (a million) passes of the 8
conversions only - all in just over 2 seconds. That is fast enough for any
output I need to display.

The current time is: 8:59:14.68

18,446,744,073,709,551,615
0
4,294,967,295
4,294,967,296
9,223,372,036,854,775,807
-9,223,372,036,854,775,808
0
-1

The current time is:  8:59:16.81
The current time is:  8:59:14.68
                      ----------
                      0:00:02.13


Dave.
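
For anyone without the zip, the eight sample lines above (unsigned and signed values with thousands separators) can be reproduced with a short C sketch. This is not Dave's code and says nothing about its speed; u64_commas and s64_commas are hypothetical names, and the digits are produced with plain division.

```c
#include <stdint.h>
#include <string.h>

/* Write v in decimal with a comma after every three digits.
   out needs 27 bytes: 26 characters for the largest value plus the NUL. */
static char *u64_commas(uint64_t v, char *out)
{
    char tmp[27];
    int pos = 26, digits = 0;
    tmp[pos] = '\0';
    do {
        if (digits && digits % 3 == 0)
            tmp[--pos] = ',';             /* separator before each new group */
        tmp[--pos] = (char)('0' + v % 10);
        v /= 10;
        digits++;
    } while (v);
    memcpy(out, tmp + pos, (size_t)(27 - pos));   /* copy digits and NUL */
    return out;
}

/* Signed variant; out needs 28 bytes. The negation is done in unsigned
   arithmetic so the most negative value does not overflow. */
static char *s64_commas(int64_t v, char *out)
{
    if (v < 0) {
        out[0] = '-';
        u64_commas(0 - (uint64_t)v, out + 1);
    } else {
        u64_commas((uint64_t)v, out);
    }
    return out;
}
```

u64_commas(UINT64_MAX, buf) gives 18,446,744,073,709,551,615 and s64_commas(INT64_MIN, buf) gives -9,223,372,036,854,775,808, matching the first and sixth lines of the listing.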

jj2007

Hi Dave,
What is the correct usage? For...
.data
Src QWORD 1234567890123456
Dest db 100 dup(?)

.code
start:
mov edx, offset Src
mov edi, offset Dest
call UBTD

... I get 15,001,234,558,140,725,952 ::)

dedndave

Jochen,
after the routine has been called, EDI points to the first char
.data
Src QWORD 1234567890123456
Dest db 100 dup(?)

.code
start:
mov edx, offset Src
mov edi, offset Dest
call UBTD
        print edi


i get...
0
:biggrin:

full program...
include \masm32\include\masm32rt.inc

.data
Src QWORD 1234567890123456
Dest db 100 dup(?)

    ALIGN                   QWORD
;    qHugeWork               QWORD 0
    q16BillionBillion       QWORD 16*1000000000*1000000000
;    q8BillionBillion        QWORD 8*1000000000*1000000000
;    q4BillionBillion        QWORD 4*1000000000*1000000000
;    q2BillionBillion        QWORD 2*1000000000*1000000000
;    q1BillionBillion        QWORD 1*1000000000*1000000000
    ALIGN                   DWORD
    dHugeBillion            DWORD 1000*1000*1000
    dHugeBillions           DWORD 0
    dTop                    DWORD 0
    ALIGN                   WORD
    cbFirstTwo              BYTE "00","01","02","03","04","05","06","07","08","09"
                            BYTE "10","11","12","13","14","15","16","17","18"
;

.code
start:
mov edx, offset Src
mov edi, offset Dest
call UBTD
        print edi
        push  0A0Dh
        print esp
        pop   eax
        exit

;-------------------------------------------------------------------------------
;   UBTD - Convert unsigned 64 bit binary number to 26 digits with separators,
;   (edx = QWORD pointer, edi = message buffer, returns edi pointing to first
;   non-zero character or to the single zero character for a zero value).
;-------------------------------------------------------------------------------
ALIGN   OWORD
UBTD    PROC USES eax ebx ecx edx esi

    mov    eax,[edx]                        ;   Get Low.
    mov    edx,[edx+4]                      ;   Get High.
    mov    dTop,0                           ;   Clear high digits value.
    mov    ebx,16                           ;   Get the value to increment dTop.
    mov    ecx,5                            ;   Get the count of test values.
    mov    esi,OFFSET q16BillionBillion     ;   Point to the test values.
;
;   Correct the input value to be below 1 billion billion
;
CorrectTop:
    cmp    edx,[esi+4]                      ;   Is high too big?
    jb     Skip                             ;   No.
    sub    eax,[esi]                        ;   Correct the value.
    sbb    edx,[esi+4]
    add    dTop,ebx                         ;   Increment the high digit value.
Skip:
    shr    ebx,1                            ;   Correct the increment value.
    lea    esi,[esi+8]                      ;   Point to the next correction value.
    dec    ecx                              ;   Decrement the test value count.
    jnz    CorrectTop                       ;   Not all tested.
;
;   Split the remaining 60 bit number to 2 30 bit numbers.
;
    mov    ebx,dHugeBillion                 ;   Get 1 billion.
    div    ebx                              ;   Convert to billions (eax) and fraction (edx).
    mov    dHugeBillions,eax                ;   Save the billions.
    mov    eax,edx                          ;   Convert the Low value.
;
;   Sample number result (maximum possible 64 bit number value):
;
;   18,446,744,073,709,551,615 BYTES
;
;   Convert the fraction then the billions.
;
    mov    esi,eax                          ;   Convert the Low value.
    mov    ecx,25                           ;   Set the character position for the last digit.
    mov    ebx,3                            ;   Set the character count for 3 digits.
;
;   Convert by multiplying.
;
Cvt:
    mov    edx,3435973837                   ;   Get magic number for divisor of 10.
    mul    edx                              ;   edx = (quotient * 8) + garbage.
    shr    edx,3                            ;   edx = quotient.
    lea    eax,[edx+edx*4]                  ;   eax = quotient * 5.
    shl    eax,1                            ;   eax = quotient * 10.
    neg    eax                              ;   eax = - quotient * 10.
    lea    eax,[esi+eax+"0"]                ;   eax = LSD.
    mov    [edi+ecx],al                     ;   Save digit.
    mov    eax,edx                          ;   eax = quotient.
    mov    esi,eax                          ;   Save quotient.
    dec    ecx                              ;   Point to prior digit space.
    dec    ebx                              ;   Decrement 1000's
    jnz    Cnt                              ;   Not there.
    dec    ecx                              ;   Skip comma.
    mov    ebx,3                            ;   Set for the next 3 digits.
;
;   Check the end of the low 9 digits.
;
Cnt:
    cmp    ecx,13                           ;   Offset for the billions?
    jg     Cvt                              ;   No, not done with Low conversion.
    je     GetHigh                          ;   Yes, get the high value.
    cmp    ecx,1                            ;   Total conversion complete?
    jns    Cvt                              ;   No, keep converting High.
CvtTop:
    mov    eax,dTop                         ;   Get the value for the top 2 digits (0 to 18).
    mov    ebx,OFFSET cbFirstTwo            ;   Point to conversion characters.
    mov    ax,[ebx+eax*2]                   ;   Get the first two characters in ax (little endian will reverse them).
    mov    cl,','                           ;   Get a separator.
    mov    [edi],ax                         ;   Save them (little endian will reverse them back to the correct order).
    jmp    Separate                         ;   Go to add separators.
;
;   Get the high value (billions) to convert.
;
GetHigh:
    mov    eax,dHugeBillions                ;   Get billions value.
    mov    esi,eax                          ;   Convert the High value (billions).
    jmp    Cvt
;
;   Separate with commas.
;
Separate:
    mov    [edi+2],cl
    mov    [edi+6],cl
    mov    [edi+10],cl
    mov    [edi+14],cl
    mov    [edi+18],cl
    mov    [edi+22],cl
    mov    cl,' '                           ;   Get leading blank pad.
    jmp    ScanNonZero                      ;   Scan for decimal digit > 0.
;
;   Blank leading commas and leading zeros, not last zero.
;
BlankFill:
    mov    [edi],cl                         ;   Blank the character.
    inc    edi                              ;   Point to the next character.
;
;   Scan for additional digits.
;
ScanNonZero:
    mov    al,[edi]                         ;   Get the character.
    cmp    al,","                           ;   Is it a leading comma?
    jz     BlankFill                        ;   Yes, blank it.
    cmp    al,"0"                           ;   Is it a leading '0'?
    ja     Exit                             ;   No, a digit > '0'. done with blanking.
    or     al,al                            ;   Is it the trailing null at the end of the string?
    jnz    BlankFill                        ;   No, blank it.
    dec    edi                              ;   Point to the last character.
    mov    BYTE PTR [edi],"0"               ;   Force the single '0' back.
;
;   Exit UBTD.
;
Exit:
    ret                                     ;   Exit PROC UBTD.
UBTD    ENDP

;-------------------------------------------------------------------------------
;   End of PROC UBTD.
;-------------------------------------------------------------------------------

end start
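
The two arithmetic ideas in UBTD can be checked in C: the "magic number" multiply that replaces a divide by 10, and the reduction of the QWORD to a top value (0 to 18) plus two numbers below one billion. The sketch below is an illustration only, with made-up names div10 and split_u64. One deliberate difference, stated as an assumption: it compares the full 64-bit value against each test constant, whereas the CorrectTop loop above compares only the high DWORD.

```c
#include <stdint.h>

/* Divide by 10 without DIV: multiply by 3435973837 (0xCCCCCCCD, roughly
   2^35 / 10) and keep the top bits. In the assembly, MUL leaves the high
   DWORD in EDX (a 32-bit shift) and SHR EDX,3 supplies the other 3 bits. */
static uint32_t div10(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 3435973837u) >> 35);
}

/* Split v so that v == top * 10^18 + billions * 10^9 + fraction,
   with top in 0..18 and billions, fraction both below 10^9. */
static void split_u64(uint64_t v, uint32_t *top,
                      uint32_t *billions, uint32_t *fraction)
{
    uint64_t step = 16000000000000000000ull;   /* 16 * 10^18 */
    uint32_t weight = 16;

    *top = 0;
    for (int i = 0; i < 5; i++) {              /* 16, 8, 4, 2, 1 times 10^18 */
        if (v >= step) {                       /* full 64-bit compare (assumption) */
            v -= step;
            *top += weight;
        }
        step /= 2;
        weight /= 2;
    }
    /* v is now below 10^18, so the quotient fits in a DWORD, which is what
       lets the assembly use a single 64-by-32 DIV here. */
    *billions = (uint32_t)(v / 1000000000u);
    *fraction = (uint32_t)(v % 1000000000u);
}
```

For the test value 1234567890123456 this yields top = 0, billions = 1234567 and fraction = 890123456, which is the grouping the final string should show.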


jj2007

Daves,
Masm 6.14 and 6.15 don't like QWORD 1*1000000000*1000000000. It works fine with 8.0, 9.0 and JWasm. Workaround:
    qHugeWork               QWORD 0
if 1
    q16BillionBillion       QWORD 16000000000000000000
    q8BillionBillion        QWORD 8000000000000000000
    q4BillionBillion        QWORD 4000000000000000000
    q2BillionBillion        QWORD 2000000000000000000
    q1BillionBillion        QWORD 1000000000000000000
else
    q16BillionBillion       QWORD 16*1000000000*1000000000
    q8BillionBillion        QWORD 8*1000000000*1000000000
    q4BillionBillion        QWORD 4*1000000000*1000000000
    q2BillionBillion        QWORD 2*1000000000*1000000000
    q1BillionBillion        QWORD 1*1000000000*1000000000
endif

dedndave

 8)
just ling long kai fang it...
1234567890123456
15001234558140725952
20282409603651670423947251286015

KeepingRealBusy

Quote from: jj2007 on June 10, 2012, 06:59:53 AM
Daves,
Masm 6.14 and 6.15 don't like QWORD 1*1000000000*1000000000. It works fine with 8.0, 9.0 and JWasm. Workaround:
    qHugeWork               QWORD 0
if 1
    q16BillionBillion       QWORD 16000000000000000000
    q8BillionBillion        QWORD 8000000000000000000
    q4BillionBillion        QWORD 4000000000000000000
    q2BillionBillion        QWORD 2000000000000000000
    q1BillionBillion        QWORD 1000000000000000000
else
    q16BillionBillion       QWORD 16*1000000000*1000000000
    q8BillionBillion        QWORD 8*1000000000*1000000000
    q4BillionBillion        QWORD 4*1000000000*1000000000
    q2BillionBillion        QWORD 2*1000000000*1000000000
    q1BillionBillion        QWORD 1*1000000000*1000000000
endif


JJ,

Thank you for finding this. Another potential problem is the buffer. As I coded it, the routine expects a buffer of at least 26 characters followed by a null terminator (I never checked the terminator, just expected it to be there). That could pose a problem when I go to print it expecting strlen to size it correctly. At a minimum I should zero the terminator, but this leaves me with the potential of a GPF for a memory access error. Oh, well! At least the code describes the buffer as 26 characters and a NULL.

"Other than that little problem, Mrs Lincoln, how did you enjoy the play?" Is the code adequately described? What about speed (not that it really matters)?

Dave.

dedndave

Dave,
if you really want some constructive criticism...
passing arguments to a function in registers is old school DOS stuff (not the same as CMPSB - lol)
i do this myself - but not for "reusable" functions - only for internal functions
the routine does not preserve EDI, and thus, does not follow the ABI
also - ABI-compliant routines would return a result in EAX
although - for assembly language routines, ECX and EDX may also be used (not much good in C-callable)
at any rate - no need to preserve EAX, ECX, EDX
it does not hurt to preserve ECX and EDX if speed is not an issue
in my ling long kai fang routine, i return
EAX = status
ECX = decimal string length
EDX = buffer address
the ECX and EDX values are intended to be convenient for the (assembly language) caller
however, they are not required to use the function
so, a C-callable OBJ could be made and linked with a C program

i fixed the issue that Jochen mentioned, and i get this
8,001,234,566,890,123,456
assuming that the first few digits are extraneous, it would be this
1,234,566,890,123,456
it should be this
1,234,567,890,123,456