qword to ascii conversion

jimg · July 08, 2017, 10:25:10 AM

Looking back over what's gone before, I find I was blinded by the apparent low number of Lingo's routine, but as you said earlier, I think the Dixon routine is just as fast, and it is much better commented and logical, although they bear an uncanny resemblance to each other. I have no idea which came first, but I'd put my money on Dixon. I just wasted two days screwing around with Lingo's routine but I'm switching to Dixon simply because of the documentation :)

jj2007 · July 08, 2017, 05:35:01 PM

Check here. It looks a bit garbled, but if you search Dixon inside the page, you'll find it.

Here is January 2004 code by Paul Dixon (Assembler embedded in PowerBasic). Looks different but gives you an idea how old the final version might be. Hutch converted a similar one to Masm, also in 2010: "This post is mainly for Paul Dixon as it is a modified version of his conversion algo that Ian_B modified"

Code Select

pushad                 'save registers
sub esp,12             'create a bit of workspace on the stack
mov edi,esp            'point edi at workspace
fild  n&&              'load the data into FPU
mov eax,n&&            'must dereference it since it was passed as a parameter
fild qword [eax]
fbstp tbyte [edi]      'convert data to BCD and save it
mov esi,xp&            'pointer to result string
mov ecx,1              'loop counter for the 2 DWORDS holding the result
mov edx,0              'need to count backwards too as string is stored the opposite way to integer
lp:
mov eax,[edi+edx*4+1]  'do conversion 4 low nibbles at a time
and eax,&h0f0f0f0f
add eax,&h30303030
mov [ecx*8+esi+7],al
shr eax,8
mov [ecx*8+esi+5],al
shr eax,8
mov [ecx*8+esi+3],al
shr eax,8
mov [ecx*8+esi+1],al
shr eax,8
mov eax,[edi+edx*4+1]  'and 4 high nibbles at a time
and eax,&hf0f0f0f0
shr eax,4
add eax,&h30303030
mov [ecx*8+esi+6],al
shr eax,8
mov [ecx*8+esi+4],al
shr eax,8
mov [ecx*8+esi+2],al
shr eax,8
mov [ecx*8+esi],al
shr eax,8
inc edx
dec ecx                 'finished 2 DWORDs?
jns lp
mov eax,[edi]           'yes, now do the 2 left over digits that didn't fit (18 digits)
and eax,&h0f
add eax,&h30
mov [esi+17],al
mov eax,[edi]
and eax,&hf0
shr eax,4
add eax,&h30
mov [esi+16],al
add esp,12         'remove workspace from stack
popad              'restore registers

jimg · July 09, 2017, 03:17:49 AM

Thanks for everything, JJ.

Otherwise I'm done. I'm happy with my cleanup of the Dixon routine.
Hopefully, this is my last post on the topic and I can move up one level in my stack.

edit:
And of course, the first time I went to use it, I needed the size of the string, so I modified the attached code to return it in eax.

jj2007 · July 09, 2017, 04:24:49 AM

Quote from: jimg on July 09, 2017, 03:17:49 AM
Thanks for everything, JJ.

My pleasure, Jim

raymond · July 10, 2017, 12:31:46 PM

@jimg
Maybe I should have mentioned this earlier.

The Dixon procedure is fine as long as you realize and understand the limitations of using the fbstp instruction for the conversion. See
http://www.ray.masmcode.com/tutorial/fpuchap6.htm#fbstp

The above link also has a sub-link to an explanation of the 'packed BCD format' used by the FPU. That may help you understand what those conversion procedures are attempting to do.
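
For readers who don't follow the link: the one limitation that can be stated without quoting the tutorial is the range of the packed BCD format itself. It holds 18 decimal digits plus a sign byte, while an unsigned qword can need 20 digits, and fild qword treats its operand as signed. A rough C sketch of the resulting range check follows; the names BCD18_MAX and fits_fild_fbstp are made up for illustration and appear nowhere in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest magnitude a packed BCD tbyte can hold: eighteen 9s. */
#define BCD18_MAX 999999999999999999ULL

/* Hypothetical helper: true if an unsigned qword can go through
   fild qword / fbstp without losing digits.
   BCD18_MAX is below 2^63, so passing this test also guarantees
   that bit 63 is clear and fild sees a non-negative value. */
static bool fits_fild_fbstp(uint64_t n)
{
    return n <= BCD18_MAX;
}
```

Anything larger than 999,999,999,999,999,999 needs a non-FPU path such as the multiply-based routine discussed in this thread.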

jj2007 · July 10, 2017, 03:04:04 PM

Ray,
Can you explain to us mere mortals where the BCD elements are in Dixon's code? I can't see them...

Code Select

uqword proc ; lpbuf:DWORD, lpNumber:DWORD    ;unsigned QWORD to ASCII, Paul Dixon
	mov ecx, [esp+2*4]		; lp qword number
	mov eax, [ecx]		; eax->low dword
	mov edx, [ecx+4]		; edx->high dword
	mov ecx, [esp+1*4]		; ecx, lpbuf
	or edx, edx		;if top word is not used then..
	jz udword		; .. use unsigned Dword routine as it's faster
	push ebp		;save registers that need to be saved
	push esi
	mov ebp, eax		; save a copy of low word for later
	mov esi, edx		; save a copy of high word for later
	mov pAnswer, ecx		; save a copy of buffer pointer for later, don't stack it or it's awkward to get back
	; do 64 bit multiply by 2^110\1e14+1 to make it more likely I get no rounding errors = 0B424DC35 095CD810h
	; this is a 4 part operation, LOxLO, LOxHI, HIxLO, HIxHI and add the 4 results offset appropriately
	mov ecx, 095CD810h		; 2^110\1e14+1 low word
	mul ecx		;
	mov eax, esi		; get number high word, LSBs of MUL not needed so they're ignored
	; nop
	push edi
	push ebx
	mov edi, edx		; save high word of result
	mul ecx		; now do high word mul
	mov ecx, 0B424DC35h		; get ready for other half of MUL
	add edi, eax		; Add low word into result
	adc edx, 0		; and handle the possible carry
	mov eax, ebp		; Get low word of number again
	mov ebx, edx		; save high word of answer
	mul ecx		; do next part
	add edi, eax		; add into answer
	mov eax, esi		; get high word of number
	adc ebx, edx		; add it in to answer #####? possible carry to higher word? probably not..
	mul ecx		; do final mul
	mov ecx, 1000000		; ready for later
	add ebx, eax		; add in result
	adc edx, 0		; and carry
	add edi, 16384		; round up last bit to decrease error to within that required.
	adc ebx, 0
	adc edx, 0
	shrd edi, ebx, 14		;correct for the 14 bit shift used to increase accuracy
	shrd ebx, edx, 14
	shr edx, 14		; edx contains top 6 digits, edi:ebx contain the information to get the next 6 digits
				; edx = 2D093h ebx= 70D42573h edi=603A5EDAh
; 64 bit multiply done
; result in edx:ebx:edi , original number in esi:ebp
; x 1000000 to get next 6 digits
	mov eax, edi
	mov esi, edx		;save top6 in esi
	mul ecx
	mov eax, ebx		;do low word x1 000 000
	mov ebx, edx		;
	mul ecx		;do high word x 1 000 000
	add eax, ebx		;add both together
	adc edx, 0
	mov ebx, edx		;save 2nd 6 in ebx
; now get ((top6 x 1e6) + next6)*1e8 and sub from original number to leave last 8 digits
; since the 8 digits we want are all contained in the low word we can completely ignore the high word
; this allows imul and does away with carries to the high word
	mov eax, esi		;get top 6
	imul eax, ecx		;top 6 x 1 000 000
	mov ecx, 100000000
	add eax, ebx		;top6 x 1 000 000 + next 6
	imul eax, ecx
	sub ebp, eax		;ebp=last 8 digits
; 20 digits broken into 6,6,8 now display them
; esi=top6, ebx=next 6, ebp=last 8
; do top 6 digits
	mov eax, esi		; get top 6 digits
	mov edi, 68DB9h		; =2^32\10000+1
	mul edi
	mov esi, pAnswer		;offset TimeBuffer		;get pointer into answer buffer
	mov ecx, 100		;multiplier for later
	jnc qnextrw1		;if zero, supress them by ignoring
	cmp edx, 9		;1 digit or 2?
	ja qZeroSupressedo		;2 digits, just continue with pairs of digits to the end
	mov edx, dword ptr chartab[edx+edx]	;look up 2 digits
	mov [esi], dh		;but only write the 1 we need, supress the leading zero
	inc esi		;update pointer by 1
	jmp QZoS1		;continue with pairs of digits to the end
qnextrw1:
	mul ecx		;get next 2 digits
	jnc qnextrw2		;if zero, supress them by ignoring
	cmp edx, 9		;1 digit or 2?
	ja QZoS1a		;2 digits, just continue with pairs of digits to the end
	mov edx, dword ptr chartab[edx+edx]	;look up 2 digits
	mov [esi], dh		;but only write the 1 we need, supress the leading zero
	inc esi		;update pointer by 1
	jmp QZoS2		;continue with pairs of digits to the end
qnextrw2:
	mul ecx		;get next 2 digits
	jnc qnextrw3		;if zero, supress them by ignoring
	cmp edx, 9		;1 digit or 2?
	ja QZoS2a		;2 digits, just continue with pairs of digits to the end
	mov edx, dword ptr chartab[edx+edx]		;look up 2 digits
	mov [esi], dh		;but only write the 1 we need, supress the leading zero
	inc esi		;update pointer by 1
	jmp QZoS3		;continue with pairs of digits to the end
; next 6 digits
qnextrw3:
	mov eax,ebx		;get 2nd 6 digits
	mov ebx, 28F5C29h		;=2^32\100+1 ready for later
	mul edi		;edi=2^32\10000+1
	jnc qnextrw4		;if zero, supress them by ignoring
	cmp edx, 9		;1 digit or 2?
	ja QZSo3a		;2 digits, just continue with pairs of digits to the end
	mov edx, dword ptr chartab[edx+edx]		;look up 2 digits
	mov [esi], dh		;but only write the 1 we need, supress the leading zero
	inc esi		;update pointer by 1
	jmp QZSo4		;continue with pairs of digits to the end
	qnextrw4:
	mul ecx		;get next 2 digits
	jnc QZSo5		;if zero, supress them by ignoring
	cmp edx, 9		;1 digit or 2?
	ja QZSo4a		;2 digits, just continue with pairs of digits to the end
	mov edx, dword ptr chartab[edx+edx]		;look up 2 digits
	mov [esi], dh		;but only write the 1 we need, supress the leading zero
	inc esi		;update pointer by 1
	jmp QZSo5		;continue with pairs of digits to the end
; done top 10 digits
; since we took a short cut to the DWORD routine at the start we can never get beyond here
; as the DWORD routine would handle it instead.
; At this point we are guaranteed to have exactly 10 digits to print so just jump to the relevant spot.
qZeroSupressedo:
	mov edx, dword ptr chartab[edx+edx]		;look up the 2 digits
	mov [esi], dx
	add esi, 2
QZoS1:
	mul ecx
QZoS1a:
	mov edx, dword ptr chartab[edx+edx]		;look up the 2 digits
	mov [esi], dx
	add esi, 2
QZoS2:
	mul ecx
QZoS2a:
	mov edx, dword ptr chartab[edx+edx]		;look up the 2 digits
	mov [esi], dx
	add esi, 2
sj:
QZoS3:
;do next 6 digits
	mov eax, ebx		;get 2nd 6 digits
	mov ebx, 28F5C29h		;=2^32\100+1 ready for later
	mul edi		;edi=2^32\10000+1
QZSo3a:
	mov edx, dword ptr chartab[edx+edx]
	mov [esi], dx
	add esi, 2
QZSo4:
	mul ecx
QZSo4a:
	mov edx, dword ptr chartab[edx+edx]
	mov [esi], dx
	add esi, 2
QZSo5:
	mul ecx
;QZSo5a:
	mov edx, dword ptr chartab[edx+edx]
	mov [esi], dx
	add esi, 2
;do final 8 digits
	mov eax, ebp		;get last 8 digits
	mul ebx		;ebx=2^32\100+1
	mov eax, edx
	mov ebx, edx
	mul edi		;edi=2^32\10000+1
	mov edx, dword ptr chartab[edx+edx]		;look up next 2 digits
	mov [esi], dx
	add esi, 2
	mul ecx
	mov edx, dword ptr chartab[edx+edx]		;look up next 2 digits
	mov [esi], dx
	add esi, 2
	mul ecx
	mov edx, dword ptr chartab[edx+edx]		;look up next 2 digits
	mov [esi], dx
	add esi, 2
	mov eax, ebx		;first 6 digits of last 8
	imul eax, ecx		;x100 to shift into place
	pop ebx
	pop edi
	mov edx, ebp		;last 8 - 6 just done gives final 2 digits
	sub edx, eax		;look up final 2 digits
	mov edx, dword ptr chartab[edx+edx]
	mov [esi], dx
	add esi, 2
	mov byte ptr [esi],0		;need to zero terminate
	pop esi
	pop ebp
AllDone:
	mov eax, [esp+4]
	ret 2*4
udword:
	push edi		;save registers that need to be saved
	push esi
	mov esi, ecx		; sptr
	mov edi,eax		;eax= x		;save a copy of the number
	mov edx, 0D1B71759h		;=2^45\10000 13 bit extra shift
	mul edx		;gives 6 high digits in edx
	mov eax,68DB9h		;=2^32\10000+1
	shr edx,13		;correct for multiplier offset used to give better accuracy
	jz short skiphighdigits		;if zero then don't need to process the top 6 digits
	mov ecx,edx		;get a copy of high digits
	imul ecx,10000		;scale up high digits
	sub edi,ecx		;subtract high digits from original. EDI now = lower 4 digits
	mul edx		;get first 2 digits in edx
	mov ecx,100		;load ready for later
	jnc short next1		;if zero, supress them by ignoring
	cmp edx,9		;1 digit or 2?
	ja short ZeroSupressed		;2 digits, just continue with pairs of digits to the end
	mov edx,dword ptr chartab[edx*2]		;look up 2 digits
	mov [esi],dh		;but only write the 1 we need, supress the leading zero
	inc esi		;update pointer by 1
	jmp short ZS1		;continue with pairs of digits to the end
next1:
	mul ecx		;get next 2 digits
	jnc short next2		;if zero, supress them by ignoring
	cmp edx,9		;1 digit or 2?
	ja short ZS1a		;2 digits, just continue with pairs of digits to the end
	mov edx,dword ptr chartab[edx*2]		;look up 2 digits
	mov [esi],dh		;but only write the 1 we need, supress the leading zero
	inc esi		;update pointer by 1
	jmp short ZS2		;continue with pairs of digits to the end
next2:
	mul ecx		;get next 2 digits
	jnc short next3		;if zero, supress them by ignoring
	cmp edx,9		;1 digit or 2?
	ja short ZS2a		;2 digits, just continue with pairs of digits to the end
	mov edx,dword ptr chartab[edx*2]		;look up 2 digits
	mov [esi],dh		;but only write the 1 we need, supress the leading zero
	inc esi		;update pointer by 1
	jmp short ZS3		;continue with pairs of digits to the end
next3:
skiphighdigits:
	mov eax,edi		;get lower 4 digits
	mov ecx,100
	mov edx,28F5C29h		;2^32\100 +1
	mul edx
	jnc short next4		;if zero, supress them by ignoring
	cmp edx,9		;1 digit or 2?
	ja short ZS3a		;2 digits, just continue with pairs of digits to the end
	mov edx,dword ptr chartab[edx*2]		;look up 2 digits
	mov [esi],dh		;but only write the 1 we need, supress the leading zero
	inc esi		;update pointer by 1
	jmp short ZS4		;continue with pairs of digits to the end
next4:
	mul ecx		;this is the last pair so don't suppress a single zero
	cmp edx,9		;1 digit or 2?
	ja short ZS4a		;2 digits, just continue with pairs of digits to the end
	mov edx,dword ptr chartab[edx*2]		;look up 2 digits
	mov [esi],dh		;but only write the 1 we need, supress the leading zero
	mov byte ptr [esi+1],0		;zero terminate string
	jmp short xit		;all done
ZeroSupressed:
	mov edx,dword ptr chartab[edx*2]		;look up 2 digits
	mov [esi],dx
	add esi,2		;write them to answer
ZS1:
	mul ecx		;get next 2 digits
ZS1a:
	mov edx,dword ptr chartab[edx*2]		;look up 2 digits
	mov [esi],dx		;write them to answer
	add esi,2
ZS2:
	mul ecx		;get next 2 digits
ZS2a:
	mov edx,dword ptr chartab[edx*2]		;look up 2 digits
	mov [esi],dx		;write them to answer
	add esi,2
ZS3:
	mov eax,edi		;get lower 4 digits
	mov edx,28F5C29h		;2^32\100 +1
	mul edx		;edx= top pair
ZS3a:
	mov edx,dword ptr chartab[edx*2]		;look up 2 digits
	mov [esi],dx		;write to answer
	add esi,2		;update pointer
ZS4:
	mul ecx		;get final 2 digits
ZS4a:
	mov edx,dword ptr chartab[edx*2]		;look them up
	mov [esi],dx		;write to answer
	mov byte ptr [esi+2],0		;zero terminate string
xit:
	pop esi		;restore used registers
	pop edi
	jmp AllDone
uqword endp
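
As a reading aid for the listing above, here is a small C model of two of its ideas. It is only a sketch under stated assumptions, not Dixon's code: pair_table is a hypothetical stand-in for chartab (whose definition is not shown in the thread), split_6_6_8 uses ordinary division where the assembler multiplies by 2^110\1e14+1, and six_digits mimics the "mul by 2^32\10000+1, then mul by 100 on the fraction" sequence for one 6-digit group. Leading-zero suppression is left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for chartab: "00", "01", ... "99" back to back. */
static char pair_table[200];

static void init_pair_table(void)
{
    for (int i = 0; i < 100; i++) {
        pair_table[2 * i]     = (char)('0' + i / 10);
        pair_table[2 * i + 1] = (char)('0' + i % 10);
    }
}

/* Split n so that n = top6 * 10^14 + next6 * 10^8 + last8. */
static void split_6_6_8(uint64_t n, uint32_t *top6, uint32_t *next6,
                        uint32_t *last8)
{
    uint64_t high12 = n / 100000000u;   /* plain division, not the reciprocal multiply */
    *top6  = (uint32_t)(high12 / 1000000u);
    *next6 = (uint32_t)(high12 % 1000000u);
    /* last8 is below 2^32, so wrapping 32-bit arithmetic on the low
       dword alone is enough (the imul/sub step in the listing). */
    *last8 = (uint32_t)n - (*top6 * 1000000u + *next6) * 100000000u;
}

/* Write x (x < 1000000) as six ASCII digits, two per table lookup. */
static void six_digits(uint32_t x, char out[6])
{
    uint64_t p = (uint64_t)x * 0x68DB9u;        /* 2^32\10000+1 */
    for (int i = 0; i < 3; i++) {
        uint32_t pair = (uint32_t)(p >> 32);    /* plays the role of edx */
        out[2 * i]     = pair_table[2 * pair];
        out[2 * i + 1] = pair_table[2 * pair + 1];
        p = (uint64_t)(uint32_t)p * 100u;       /* eax (the fraction) times 100 */
    }
}
```

Call init_pair_table() once before six_digits(). The high 32 bits of each product hold the next digit pair and the low 32 bits hold the remaining fraction, which is why the listing can keep issuing mul ecx with ecx = 100 and read the next pair from edx.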

raymond · July 11, 2017, 02:19:33 AM

Quote
sub esp,12 'create a bit of workspace on the stack
mov edi,esp 'point edi at workspace
fild n&& 'load the data into FPU
mov eax,n&& 'must dereference it since it was passed as a parameter
fild qword [eax]
fbstp tbyte [edi] 'convert data to BCD and save it
mov esi,xp& 'pointer to result string
mov ecx,1 'loop counter for the 2 DWORDS holding the result
mov edx,0 'need to count backwards too as string is stored the opposite way to integer
lp:
mov eax,[edi+edx*4+1] 'do conversion 4 low nibbles at a time

Above is part of Dixon's code which you posted previously.
The mov edi,esp line shows that EDI is used for pointing to a workspace reserved on the stack by the previous instruction.
The following fild line indicates that the target qword is loaded on the FPU.
The fbstp instruction specifies that the qword gets converted to the BCD format and stored in the reserved workspace on the stack.
Then, the BCD nibbles get recovered sequentially from the stack for unpacking, as pointed to by EDI/EDX in the mov eax,[edi+edx*4+1] instruction (and other subsequent similar instructions).
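
The unpacking step can be modelled in a few lines of C. This is a simplified sketch, not a translation of the routine: it handles one byte (two digits) at a time instead of four nibbles per dword, and pack_bcd is a hypothetical stand-in for fbstp so the example runs without the FPU. It assumes the usual packed BCD layout: bytes 0 to 8 hold two digits each, least significant byte first, and byte 9 holds the sign.

```c
#include <stdint.h>

/* Hypothetical stand-in for fbstp: pack a non-negative value of at
   most 18 digits into the 10-byte packed BCD layout. */
static void pack_bcd(uint64_t n, uint8_t bcd[10])
{
    for (int i = 0; i < 9; i++) {
        uint8_t lo = (uint8_t)(n % 10); n /= 10;
        uint8_t hi = (uint8_t)(n % 10); n /= 10;
        bcd[i] = (uint8_t)((hi << 4) | lo);   /* low nibble = lower digit */
    }
    bcd[9] = 0;                               /* sign byte: positive */
}

/* Unpack 18 BCD digits into ASCII, most significant digit first.
   Leading zeros are kept, so the result is always 18 characters. */
static void bcd_to_ascii(const uint8_t bcd[10], char out[19])
{
    for (int i = 0; i < 9; i++) {
        uint8_t b = bcd[8 - i];               /* byte 8 holds the leading pair */
        out[2 * i]     = (char)('0' + (b >> 4));
        out[2 * i + 1] = (char)('0' + (b & 0x0F));
    }
    out[18] = '\0';
}
```

The string runs in the opposite direction to the stored bytes, which is the reason the assembler version counts one index up and the other down.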

jj2007 · July 11, 2017, 04:07:43 AM

Quote from: raymond on July 11, 2017, 02:19:33 AM
Above is part of Dixon's code which you posted previously.

Oops, I forgot - the old routine he posted in 2004. Yes, that is BCD code, of course.

raymond · July 11, 2017, 11:24:24 AM

Kind of confusing when the same author issues separate procedures for the same purpose.

Anyway, my initial comment was primarily to warn jimg about the use of such instructions without knowing about their limitations, in addition to explaining what they do.

jimg · July 11, 2017, 03:19:39 PM

I started out trying to use the FPU, but quickly realized it takes twice as long to do a single FINIT than the whole non-FPU routine. Rather disheartening.

jj2007 · July 11, 2017, 06:01:35 PM

Quote from: jimg on July 11, 2017, 03:19:39 PM
I started out trying to use the FPU, but quickly realized it takes twice as long to do a single FINIT than the whole non-FPU routine.

You need finit only once in your program, or at the beginning of a loop. In this earlier post, there are three FPU routines that are, for example, faster than umqtoa. Problem is that a handful of values are, ehm, incorrect because of rounding errors. Tweaking might solve this issue. As I wrote earlier, this is quite experimental.

jimg · July 11, 2017, 10:48:34 PM

So I can assume that if I call some other routine in some other library not written by me, that it will not screw up the FPU? Which implies by extension that if I write a general purpose routine, it is incumbent upon me to return the FPU in a clean state?

jj2007 · July 11, 2017, 11:29:22 PM

Good question, Jim

There is a thread on MSDN social saying "The ABI requires that the control word is preserved", and a Masm32 thread on "Application Binary Interface (ABI), calling conventions and the like". On SOF, they discuss "Is it necessary to save the FPU state here?"

If anybody has an official ABI info on the fpu in x86/x64, please post a link.

In practice, I never have seen the fpu control word change when calling a Windows function. If that function uses the fpu, and doesn't fully restore everything, then some register contents will be gone, of course.

The MasmBasic library has 65 occurrences of ffree st(7) - the usual way to ensure that the fpu behaves well when using it. Just tested with a Windows application under Win7-64, and it seems that Windows doesn't touch the fpu at all. Even after the WM_PAINT handler, all content is still there. Don't rely on that, it may differ between Windows versions.

jimg · July 12, 2017, 12:37:36 AM

In a program I've been working on, I always do a FINIT at the start of the several procs that use the FPU, and set the control word to truncate. If the ABI requires that the control word be preserved, then I need to restore it before leaving each of these procs?

How does ffree st(7) make the fpu behave? The description says it just sets it to empty.

Sounds like you have to worry about both ends. You can't count on what state you get, but have to leave in a clean state.

RuiLoureiro · July 12, 2017, 12:56:16 AM

Quote from: jimg on July 12, 2017, 12:37:36 AM
In a program I've been working on, I always do a FINIT at the start of the several procs that use the FPU, and set the control word to truncate. If the ABI requires that the control word be preserved
...

TheCalculator starts with FINIT and uses FINIT in the conversion routines only - string to real10 and real10 to string. And control word is preserved. It seems it works correctly always. Whenever it finds an invalid operation it does FCLEX and exit with an error code.
