OK, here's yet another implementation of a 64-by-64-bit division routine for 32-bit (x86) code. Tested, works. I got this from, of all places, a Wikipedia article on division methods (https://en.wikipedia.org/wiki/Division_algorithm). I really hate Wikipedia, for many reasons, and usually try to avoid it if at all possible. But for technical stuff they're sometimes not bad.
Anyhow, the article included the following method () in pseudocode:
Q := 0 -- Initialize quotient and remainder to zero
R := 0
for i := n − 1 .. 0 do -- Where n is number of bits in N
R := R << 1 -- Left-shift R by 1 bit
R(0) := N(i) -- Set the least-significant bit of R equal to bit i of the numerator
if R ≥ D then
R := R − D
Q(i) := 1
end
end
where N is the dividend, D is the divisor and R is the remainder.
Here's my implementation. It's reasonably fast, fast enough for my purposes. (Anyone want to race this against other methods?) My previous attempt was a successive-subtraction divider, which worked but it was s-o-o s-l-o-w as to be unusable (unless you want a division function that doubles as a delay timer!).
Ironically, since I ran out of registers here, this would be an ideal candidate for 64-bit code, except that it would be totally unnecessary there. I like this as a 32-bit function. (I pass the dividend in EDX:EAX and the divisor in a small structure.)
;============================================
; Div64x64 (&divStruct)
;
; 64-bit by 64-bit successive-subtraction division.
; Quotient and remainder are 64 bits.
;
; On entry,
; EDX:EAX = dividend
; divisor in divStruct
;
; Returns:
; EDX:EAX = quotient
; remainder in divStruct
;============================================
Div64x64 PROC divStructPtr:DWORD
LOCAL divisorLo:DWORD, divisorHi:DWORD, dividendLo:DWORD, dividendHi:DWORD
LOCAL bitIndex:DWORD
; If dividend is zero, skip all this stuff, just return zero for quotient & remainder:
MOV ECX, EAX
OR ECX, EDX
JNZ @F
MOV EDX, divStructPtr
MOV [EDX].DIVSTRUCT.remainderLo, 0
MOV [EDX].DIVSTRUCT.remainderHi, 0
XOR EAX, EAX
XOR EDX, EDX
RET
@@: PUSH EBX
PUSH ESI
PUSH EDI
MOV dividendLo, EAX
MOV dividendHi, EDX
MOV ECX, divStructPtr
MOV EAX, [ECX].DIVSTRUCT.divisorLo
MOV divisorLo, EAX
MOV EAX, [ECX].DIVSTRUCT.divisorHi
MOV divisorHi, EAX
XOR ESI, ESI
XOR EDI, EDI ;EDI:ESI = quotient
XOR ECX, ECX
XOR EBX, EBX ;EBX:ECX = remainder
XOR EAX, EAX ;Clear bit carrier.
MOV bitIndex, 63
; Left-shift R:
dvloop: SHLD EBX, ECX, 1
SHL ECX, 1
; R(0) = N(i):
PUSH ECX
MOV EDX, 1 ;"Seed" for bitmask.
MOV ECX, bitIndex
CMP ECX, 32
JB doLow1
SUB ECX, 32
SHL EDX, CL ;EDX now is mask for the bit we want.
TEST dividendHi, EDX ;Is N[i] set?
JMP @F
doLow1: SHL EDX, CL
TEST dividendLo, EDX ;Is N[i] set?
@@: SETNZ AL
POP ECX
AND ECX, DWORD PTR NOT 1 ;Clear bit[0].
OR ECX, EAX ;Set in bit[0] (or not).
; If R >= D:
CMP EBX, divisorHi
JB next
JA @F
; Must be equal--need to compare low DWORDs:
CMP ECX, divisorLo
JB next
; R = R - D:
@@: SUB ECX, divisorLo
SBB EBX, divisorHi
; Q[i] = 1:
PUSH ECX
MOV EDX, 1
MOV ECX, bitIndex
CMP ECX, 32
JB doLow2
SUB ECX, 32
SHL EDX, CL ;EDX now is mask for the bit we want.
OR EDI, EDX
JMP SHORT @F
doLow2: SHL EDX, CL
OR ESI, EDX
@@: POP ECX
next: DEC bitIndex
JNS dvloop ;Make sure we travel through loop on 0.
MOV EDX, divStructPtr
MOV [EDX].DIVSTRUCT.remainderLo, ECX
MOV [EDX].DIVSTRUCT.remainderHi, EBX
MOV EAX, ESI
MOV EDX, EDI
POP EDI
POP ESI
POP EBX
RET
Div64x64 ENDP
I give it a try, but the numbers don't convince me. What am I doing wrong?
mov divstruct.divisorLo, 123
mov eax, 123456789
cdq
push edx
push eax
invoke Div64x64, addr divstruct ; expected: 123456789/123=1003713.7
deb 4, "Out div64", divstruct.remainderLo, divstruct.remainderHi, divstruct.divisorLo, divstruct.divisorHi
fild qword ptr [esp]
fild qword ptr divstruct.divisorLo
fdiv
fistp qword ptr divstruct.remainderLo
deb 4, "Out fpu", divstruct.remainderLo, divstruct.remainderHi, divstruct.divisorLo, divstruct.divisorHi
Out div64
divstruct.remainderLo 90
divstruct.remainderHi 0
divstruct.divisorLo 123
divstruct.divisorHi 0
Out fpu
divstruct.remainderLo 1003714
divstruct.remainderHi 0
divstruct.divisorLo 123
divstruct.divisorHi 0
Looks like you're not setting .divisorHi in the divstruct, so you're getting random garbage there. And the quotient comes back in EDX:EAX, not in the structure.
Quote from: NoCforMe on August 22, 2023, 03:12:14 PMLooks like you're not setting .divisorHi in the divstruct, so you're getting random garbage there.
divisorHi is zero, not a problem here.
QuoteAnd the quotient comes back in EDX:EAX, not in the structure.
That was the missing information, now it works, thanks:
Print Str$("Result: %i\n", edx::eax)
Output: Result: 1003713
The fpu wins, as usual :azn:
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Dividing 123456789 by 123, 5000000 times
85 ms for fpu fdiv, result=1003714
1146 ms for Div64x64, result=1003713
89 ms for fpu fdiv, result=1003714
1137 ms for Div64x64, result=1003713
87 ms for fpu fdiv, result=1003714
1154 ms for Div64x64, result=1003713
Thanks for those sobering results. Speed isn't an issue in my application, but it's always a nice thing to have.
So if a guy wanted to use the FPU in 32-bit code, could he do this to load up the FPU?
fild qword ptr [some 64-bit var]
Never used qwords before in X86.
Then there's the problem of getting the remainder ...
Quote from: NoCforMe on August 23, 2023, 09:00:37 AMSo if a guy wanted to use the FPU in 32-bit code, could he do this to load up the FPU?
fild qword ptr [some 64-bit var]
Yes. The fild instruction works for DWORDs, QWORDs and even WORDs.
QuoteThen there's the problem of getting the remainder ...
frndint is your friend.
David you been too late,this was solution to solve division on old z80,6502 without division opcode,there was asm code articles in computer magazine for multiply proc
Fastest 64 bit div might be special case where you can use sse2 64 bit shift /2,/4/6,/8
After seen 32 bit example program 2^x ,i made sse2 64bit shift 2^x
Quote from: daydreamer on August 23, 2023, 04:25:43 PMDavid you been too late,this was solution to solve division on old z80,6502 without division opcode
Oh, I know. This is part of my hobby, reinventing wheels that were invented decades ago. I'd actually like to do some Z80 programming someday; something about that incredibly underutilized machine just seems so attractive. (Love that context-switching mechanism to handle interrupts!) I did write a divide routine for another
uprocessor, the SX-28 which I used to make a couple projects including a (very accurate) camera shutter-speed tester.
If you keep develop it to work with bigger numbers = bigintiger division that can be used test very high prime numbers
My mindset is trained old-school Algos/cheating with luts and asm,so using SSE/fpu/gp asm is my skill doing things,sorry can't erase that to replace with pixelshader skill,demos bruteforcing in real-time with help of very fast gpus, also mindset need to understand the inverse ways of per pixel coding,instead of used to solve things with loop y and innerloop x and write pixels
Dos 16 bit is much more challenging than being spoiled by 3+ghz 64bit cpu,64GB memory, fast gpu
So mother of inventions when you dont have things is no more,but MHz cpu without SIMD opcodes ,MB memory and no gpu, so it motivate do my best to run program faster
Quote from: NoCforMe on August 22, 2023, 08:02:41 AMOK, here's yet another implementation of a 64-by-64-bit division routine for 32-bit (x86) code. Tested, works. I got this from, of all places, a Wikipedia article on division methods (https://en.wikipedia.org/wiki/Division_algorithm). I really hate Wikipedia, for many reasons, and usually try to avoid it if at all possible. But for technical stuff they're sometimes not bad.
Anyhow, the article included the following method () in pseudocode:
Q := 0 -- Initialize quotient and remainder to zero
R := 0
for i := n − 1 .. 0 do -- Where n is number of bits in N
R := R << 1 -- Left-shift R by 1 bit
R(0) := N(i) -- Set the least-significant bit of R equal to bit i of the numerator
if R ≥ D then
R := R − D
Q(i) := 1
end
end
where N is the dividend, D is the divisor and R is the remainder.
R := R << 1 that algorithm is just a shift and subtract method, right?
what's different from your previous attempt was a successive-subtraction divider
I think that it should be possible to implement a shift and subtract division by shifting 4 bits at a time
here's some public domain C code that might be of interest https://github.com/glitchub/arith64/tree/master
Quote from: jack on August 25, 2023, 08:57:33 PMwhat's different from your previous attempt was a successive-subtraction divider
Oh, that one was a really dumbass routine that simply subtracted the divisor from the dividend repeatedly until it went too far, yielding the quotient (and remainder). It works, but gawd is it slow! This one's much faster. I actually implemented a division routine that way for a microprocessor (SX-28) that had no multiply or divide instructions, which was acceptable in that application.
:smiley:
QuoteOh, I know. This is part of my hobby, reinventing wheels that were invented decades ago. I'd actually like to do some Z80 programming someday; something about that incredibly underutilized machine just seems so attractive.
you should see if you can find an old
SORD M68K :smiley:
https://www.sord.co.nz/sord-history.html (https://www.sord.co.nz/sord-history.html)
https://www.sord.co.nz/collection-sord-m68.html (https://www.sord.co.nz/collection-sord-m68.html)
We acquired one for some engineering development work we were doing waaay back, around the time IBM released it's First 'PC' !!
The SORD, had Both a Z80 A AND a Motorola 68000 on board. When using the 68000, the Z80, handled all the I/O functions.
It came with it's own Colour Graphics Programming language and ran an Excellent Compilable BASIC for the Z80.
It was an Excellent machine and realistically far more advanced than Any of the Early PC's
In an Era when people were always suggesting the Japanese made great hardware but Not software, the SORD BASIC and a SORD Database/Spreadsheet application called
PIPS and the SORD Word Processor, were 3 very good examples that disproved that theory :smiley:
Interesting; I had no idea such a system existed. Was this mainly a NZ/Aus. thing? I don't think this was ever sold widely in the US.
:smiley:
QuoteWas this mainly a NZ/Aus. thing?
No, not particularly, (as far as I know), they Do seem to be well known in Ireland & the UK but there does seem to be some very keen supporters now and then in New Zealand, with a guy from NZ actually re-writing the PIPS application around 1986 or so.
They were handled by the Mitsui Corporation, which had large business operations both here and in NZ, which may account for the machines popularity and proliferation in the 2 x countries, but it was never aimed at or marketed as a "Personal Desktop" machine, (well, at least NOT in the manner of IBM's or Compaq's marketing campaigns).
I know that the Military here and the local GM based car manufacturer Holden used SORD's.
QuoteI don't think this was ever sold widely in the US
No, but that would Not have been uncommon, considering the 'Market' and general business attitude in the computing world in the US back then, especially as IBM had only just released it's PC.
There is quite a bit of information about both SORD and it's 'Leader' Takayoshi Shiina, it was eventually purchased by and became "Toshiba Personal Computer System Corporation" in 1999.
I actually still have one of the small
M5 units, which we were given way back for some Sales we generated :smiley:
BTW: I'll delete this stuff to keep your thread clean after you have read it if you wish.
Just let me know :wink2: :smiley:
Hi,
Quote from: stoo23 on August 27, 2023, 12:52:24 PMBTW: I'll delete this stuff to keep your thread clean after you have read it if you wish.
Just let me know
Actually it was a nice diversion to a bit of history.
Cheers,
Steve
:smiley:
Well, if you liked the History and the Machine,.. this should give you the :smiley: Full details of the M68 and it's capabilities as well as PIPS,.. which was quite simply an amazing piece of Software.
https://www.sord.co.nz/sordworld/sordworld03web.pdf (https://www.sord.co.nz/sordworld/sordworld03web.pdf)
PIPS had heaps of BASIC like functions that could be used and incorporated into your applications.
Mitsui finally gave us the Extended PIPS 'goodies', that allowed us to in effect create 'stand-alone' Apps' as executables, which allowed me to build an in house accounting/record keeping system, that emulated the Filing System of my somewhat Avant Garde bosses lol but which incorporated Macro routines that assembled appropriate outputs that his accountants wanted.
It was a fun time, a great machine, SO Fast and in Colour !!! IBM XT's Sucked extremely Loudly by comparison,.. as did all of the early software as well compared to what we had on the SORD,.. We were in effect Spoiled for years waiting for alternate machines and software to eventually be as good as what we had on the SORD's :rofl: