Let's say I want to convert 123, a decimal number, to base 5.
Here's how I do it on paper:
Quote
123
5------- 3
24
5------- 4
4
5------- 4
0
That means 123 in base 5 is 443. And this is my naive code:
include \masm32\include\masm32rt.inc
.686
.code
start:
xor edx,edx
mov eax, 123 ; number to convert
mov ecx, 5 ; base
L1:
cmp eax, 0 ; if number is still > 0?
jg L2 ; yes. keep dividing
jl L3 ; done
L2:
div ecx ; number = number / base
print str$(edx) ; print remainder
jmp L1
L3:
exit
end start
After some experimentations, I found that:
1.
div doesn't always behave like the typical division operation found in HLLs. For example: 6 / 37 in HLLs is 0. But
div gives you 6.
That's why
cmp eax, 0 part is not sufficient.
2.
str$ macro trashes the EDX register, so I have to preserve it first.
So my questions are:
1. What should I do so eax can be divided repeatedly until it reaches 0?
2. How to print the result in reverse order? Assume the code above calculates the division properly, then it will print 344, which should be 443.
Look at the Axd2a_based algo.
First param - dwDWORD - is a number to convert.
Second param - dwBase - may be any base to convert to in a range 2...10 (i.e. the output string-number will be in a range of bases binary...decimal).
Third param - lpBuf - is an address to the output buffer for the string with number converted.
(Since the algo due its universal nature - it accepts any base to convert to - anyway uses the DIV instruction, which is very slow, the entire algo was written as a "size-optimized" and thus uses other more or less slow instructins as well, but they all in a bunch have no speed-decreasing value comparing with the DIV)
include \masm32\include\masm32rt.inc
.686
.mmx
.xmm
;.model flat,stdcall
.code
option prologue:none
option epilogue:none
Axd2a_based proc dwDWORD:DWORD, dwBase:DWORD, lpBuf:DWORD
mov eax,[esp+4]
xor ecx,ecx
@@:
xor edx,edx
div dword ptr [esp+8+ecx*4]
inc ecx
push edx
test eax,eax
jnz @B
mov edx,[esp+12+ecx*4]
@@:
pop eax
add eax,"0"
mov [edx],ax
inc edx
loop @B
ret 12
Axd2a_based endp
option prologue:PrologueDef
option epilogue:EpilogueDef
start proc
LOCAL buf[128]:BYTE
invoke Axd2a_based,123,5,addr buf
invoke crt_printf,CTXT("Value: %s",10),addr buf
invoke crt__getch
invoke crt_exit,0
start endp
end start
Hi
anta40 :t
Quote from: anta40 on May 18, 2013, 01:10:52 PM
After some experimentations, I found that:
1. div doesn't always behave like the typical division operation found in HLLs. For example: 6 / 37 in HLLs is 0. But div gives you 6.
That's why cmp eax, 0 part is not sufficient.
2. str$ macro trashes the EDX register, so I have to preserve it first.
So my questions are:
1. What should I do so eax can be divided repeatedly until it reaches 0?
2. How to print the result in reverse order? Assume the code above calculates the division properly, then it will print 344, which should be 443.
DIV returns you quotient of a division in an EAX, and the remainder in EDX, so, having division 6 by 37, you get 0 as a quotient and 6 as a remainder (since 6 cannot be divided by 37 with an integral quotient). HLLs just usually aren't designed to return you both values from one division. But you may use some other function or operator, like "MOD", or "%" (in C it's like value=6%37 - value will hold 6).
As for questions. The code I posted above contains solution for both questions:
1. yes, you should divide eax to the base you want to convert to, and every division save the remainder as a one digit of the target converted value. But the order is reversed.
2. to reverse the order back to normal, you may use different ways, let's say, store numbers in a buffer and then shuffle bytes from one side to another, but the code I posted uses simpler way - it just stacks the numbers, and then retrieves them again, in reversed order. Then decimal correction - add eax,"0" - then put it in an output buffer.
Here a soluce who had been tested on the old forum.
I is just an apply of what is a base.
.data
hextable BYTE "0123456789ABCDEF",0
Chiffre db 50 dup (0)
.code
invoke BaseN,123,5,addr Chiffre
;dword number to chain in any base
;################################################################
BaseN PROC uses esi edi ebx value:DWORD, base:DWORD, pout:DWORD
mov ebx,0
lea edi,Chiffre
add edi,LENGTHOF Chiffre - 2
lea esi,hextable
;début
mov eax,value
mov ecx,base ;975 = 9*10!2 + 7*10!1 + 5*10!0 10!0 = 1
newpowerofbase: ; ! = power of
xor edx,edx
div ecx ;divide by the base!1 in loop
;le reste donne
mov dl,[esi+edx]
mov [edi],dl
inc ebx
.if eax != 0 ;dividend not null,use the next power of base
;ascii numbers are written right to left
dec edi ;pointer of ascii number -1
jmp newpowerofbase
.endif
mov esi,edi
mov ecx,ebx
inc ecx ;add the zero
mov edi,pout
rep movsb
FindeBaseN:
mov eax,ebx ;return lenght of ascii chiffre without the zero
ret
BaseN endp
ToutEnMasm's variation is even more universal - it can have a base range up to 16 - due to lookup table usage instead of simple decimal "add eax,"0"" correction (but you may use eax in Axd2a_based as an index in a lookup table, so it's not a big deal to add it). This way you even may implement it to the base64 or something such - just use appropriate lookup table :t
you are using Horner's Rule for base conversion, whether you know it or not
aka Ling Long Kai Fang :biggrin:
to store the numbers, start at the end of the buffer and work toward the beginning
return the address of the first digit
Hi alex,
You're right.
mov eax, 6
mov ecx, 37
div ecx
gives EAX = 0, and EDX = 6, as expected. I guess I'm running the wrong code :redface:
So I changed the code a bit:
include \masm32\include\masm32rt.inc
.686
.code
start:
xor edx,edx
mov eax, 123
mov ecx, 5
L1:
cmp eax, 0
jg L2
je L3
L2:
div ecx
push edx
print str$(edx)
pop edx
jmp L1
L3:
exit
end start
Now it prints lots of numbers :redface:
Guess I need some break, and later examine this under ollydbg
«div ecx» divide edx:eax by ecx
so do this
xor edx, edx
div ecx
-----------------
and do this also
push ecx
print .... ; <---destroys ecx
pop ecx
it is safe if we use ebx or esi or edi
Quote from: RuiLoureiro on May 18, 2013, 09:11:58 PM
and do this also
push ecx
print .... ; <---destroys ecx
pop ecx
it is safe if we use ebx or esi or edi
That's a good hint, because it's not only a beginners fault. I've forgotten that a few months ago and had to pay with some debugger sessions.
Gunther
start:
xor edx,edx
mov eax, 123 ; number to convert
mov ecx, 5 ; base
one thing here is, that you are not converting a decimal number to base 5
MOV EAX,123 - 123 is a binary number - not decimal
furthermore, the base is an abitrary choice, really
it is binary, meaning base 2
but, it resides in a register that has 32 bits, so it can be considered a base 4294967296 number
Ling Long Kai Fang: The Rule
The rule basically states that any Nth order polynomial of a single unknown written as:
(eq. 1) f(x) = A + Bx + Cx^2 + Dx^3 + ... Nx^n
may be re-written as:
(eq. 2) f(x) = A + x(B + x(C + x(D + ... x(N))))
or:
(eq. 3) f(x) = x(x(x(x(N) ... + D) + C) + B) + A
This identity can greatly simplify solving for the roots of a polynomial.
Base Conversion Example
But, we can use this same equation to convert numbers between any two radices. Any number in any base radix may be expressed as just such a polynomial:
(eq. 4) 987654 (base 10) = 9 x^5 + 8 x^4 + 7 x^3 + 6 x^2 + 5 x + 4, where x = 10
or:
(eq. 5) 76032 (base 8 ) = 7 x^4 + 6 x^3 + 0 x^2 + 3 x + 2, where x = 8
If we want to convert that number to a different base, we apply ling long kai fang to re-write the equation and use arithmetic rules from the target radix to evaluate it:
(eq. 6) 76032 (base 8 ) = x(x(x(x(7) + 6) + 0) +3) + 2, where x = 8
The simplest way to evaluate this equation is to start from the innermost set of parenthesis and work out. If we want the result to be expressed as a base 10 value, we simply use base 10 arithmetic rules to evaluate it:
(ex. 1) 76032 (base 8 ) converted to base 10
add source digit 1 + 7 = 7
multiply by source base × 8 = 56
add source digit 2 + 6 = 62
multiply by source base × 8 = 496
add source digit 3 + 0 = 496
multiply by source base × 8 = 3968
add source digit 4 + 3 = 3971
multiply by source base × 8 = 31768
add source digit 5 + 2 = 31770 = base 10 result
Dave,
Quote from: dedndave on May 18, 2013, 11:15:51 PM
Ling Long Kai Fang: The Rule
The rule basically states that any Nth order polynomial of a single unknown written as:
(eq. 1) f(x) = A + Bx + Cx^2 + Dx^3 + ... Nx^n
may be re-written as:
(eq. 2) f(x) = A + x(B + x(C + x(D + ... x(N))))
or:
(eq. 3) f(x) = x(x(x(x(N) ... + D) + C) + B) + A
yes, it's the simple Horner scheme. Anyway, your idea is good. :t
Gunther
this seems to work
many ways to speed it up, though :P
;###############################################################################################
.XCREF
.NoList
INCLUDE \Masm32\Include\Masm32rt.inc
.List
;###############################################################################################
Base5 PROTO :DWORD,:LPSTR
;###############################################################################################
; .DATA
;***********************************************************************************************
.DATA?
buff db 16 dup(?)
;###############################################################################################
.CODE
;***********************************************************************************************
_main PROC
INVOKE Base5,123,offset buff
print eax,13,10
INVOKE Base5,4294967295,offset buff
print eax,13,10
inkey
INVOKE ExitProcess,0
_main ENDP
;***********************************************************************************************
Base5 PROC USES EDI dwValue:DWORD,lpBuffer:LPSTR
;the buffer pointed to by lpBuffer should be at least 15 bytes long
;a pointer to the first base 5 ASCII digit is returned in EAX
mov edi,lpBuffer
mov ecx,5
add edi,14
mov eax,dwValue
mov [edi],ch
Base50: xor edx,edx
dec edi
div ecx
or dl,30h
or eax,eax
mov [edi],dl
jnz Base50
xchg eax,edi
ret
Base5 ENDP
;###############################################################################################
END _main
Base5b is a bit faster :P
it uses multiply-to-divide instead of DIV
and - it divides by 25, then gets 2 digits from a look-up table
Base5a
443
32244002423140
906 909 894 910 905 900 907 891 905 901 clock cycles
Base5b
443
32244002423140
34 25 28 33 20 27 43 32 24 40 clock cycles
EDIT: oops - fixed a slight bug in the buffer offset
Hi Dave,
Quote from: dedndave on May 19, 2013, 11:13:37 PM
Base5b is a bit faster :P
it uses multiply-to-divide instead of DIV
and - it divides by 25, then gets 2 digits from a look-up table
yes it is:
Base5a
443
32244002423140
276 258 270 268 271 267 270 268 276 270 clock cycles
Base5b
443
32244002423140
35 34 37 33 34 38 34 34 36 34 clock cycles
Press any key to continue ...
Gunther
Quote from: RuiLoureiro on May 18, 2013, 09:11:58 PM
«div ecx» divide edx:eax by ecx
so do this
xor edx, edx
div ecx
-----------------
and do this also
push ecx
print .... ; <---destroys ecx
pop ecx
it is safe if we use ebx or esi or edi
Hmm I think I stil don't get it right...
L2:
xor edx, edx
div ecx
push ecx
push edx
print str$(edx)
pop edx
pop ecx
jmp L1
:redface:
these functions may (and usually do) destroy EAX, ECX, EDX
they preserve EBX, EBP, ESI, and EDI
L2:
xor edx, edx
div ecx
push eax
push ecx
push edx
print str$(edx)
pop edx
pop ecx
pop eax
jmp L1
anta40,
This is your code
Quote
include \masm32\include\masm32rt.inc
.686
.code
start:
;xor edx,edx
mov eax, 123 ; number to convert
;mov ecx, 5 ; base
mov ebx, 5 ; base ebx=5 ebx is safe
L1:
cmp eax, 0 ; if number is still > 0?
;jg L2 ; yes. keep dividing
jl L3 ; done
L2:
xor edx, edx
div ebx ; number = number eax / base ebx
push eax ; preserve the quotient EAX
;
print str$(edx) ; show the remainder edx
;
pop eax ; preserve the quotient EAX
jmp L1
L3:
exit
end start
what you are probably missing is to convert the binary remainder to ASCII before storing it
Base5 PROC USES EDI dwValue:DWORD,lpBuffer:LPSTR
;the buffer pointed to by lpBuffer should be at least 15 bytes long
;a pointer to the first base 5 ASCII digit is returned in EAX
mov edi,lpBuffer
mov ecx,5
add edi,14
mov eax,dwValue
mov [edi],ch
Base50: xor edx,edx
dec edi
div ecx
or dl,30h ;make it an ASCII char
or eax,eax
mov [edi],dl
jnz Base50
xchg eax,edi
ret
Base5 ENDP
of course, if you use the str$ macro, it does that for you
if you want to see the dividend, quotient, and remainder at each step...
Base50: push ecx
push eax
print str$(eax),32
pop eax
pop ecx
xor edx,edx
dec edi
div ecx
push ecx
push eax
push edx
push edx
print str$(eax),32
pop edx
print str$(edx),13,10
pop edx
pop eax
pop ecx
or dl,30h ;make it an ASCII char
or eax,eax
mov [edi],dl
jnz Base50
Dave,
This is a work to you
For base x digits a,b,c,d,e we have
(eq. 5) abcde (base x ) = a x^4 + b x^3 + c x^2 + d x + e
=>
(eq. 6) abcde (base x ) = x(x(x(x(a) + b) + c) +d) + e
Whats your general magic rule to get the digits p,q,r,s,t,u,v... of base y.
that equation looks worse than it is - lol
it's really simple, once you get the hang of it
i wrote this document a few years ago, when i was working on the bignum to ascii routines
hope it helps :t
i might add.....
ling long kai fang is most efficient when working with very large values
for smaller values, like single dwords, the "divide-and-conquer" method can be more efficient
well - just don't use the DIV instruction - lol
for an example of what i mean, have a look at Base5b posted above
it uses divide-and-conquer, but uses mutliply-to-divide and a larger base (25)
if we really wanted it to be fast, we might divide by 625 and use a larger look-up table :P