News:

Masm32 SDK description, downloads and other helpful links
Message to All Guests
NB: Posting URL's See here: Posted URL Change

Main Menu

BINARY INTEGER SQUARE ROOTS

Started by raymond, January 31, 2025, 02:07:46 PM

Previous topic - Next topic

raymond

Some testing has been performed lately to compare the speed of different procedures to extract square roots. Since one of the purposes was to search for prime numbers, and prime numbers are integers, this prompted me to revive an old procedure to extract the 'integer' square root of 'integer' numbers with 'integer' code instructions, and compare its speed to the general use of the FPU.

Although it may be three times slower than the FPU for the largest 31-bit integers, some may be interested in seeing this simple algo. It uses only the EAX, ECX and EDX registers (and it could be easily adapted to the 64-bit system by using the three equivalent 64-bit registers, as long as the bsr and bts instructions are still in use).

This is an adaptation of a procedure which was described in an IBM user's manual for their mechanical calculator marketed in the mid-50s. Not being a mathematician, I don't know where they may have obtained such a procedure nor can I explain why it works. I would assume that it may have been developed by the ancients to use with an "abacus".

;this "getsqr" procedure would be called with the target integer already in the EAX register
;the programmer would be responsible to check it's validity
;'negative' integers would simply be handled as positive 32-bit integers

getsqr:
      bsr   edx,eax     ;get position number of most significant digit in EAX
      xor   ecx,ecx
      bts   ecx,edx     ;ECX to be used for trailing bit position
      shr   edx,1 ;check if position was in even or odd number bit
      .if   CARRY?      ;if EDX was odd, the target has even number of significant bits
            shr   ecx,1 ;shift ECX, it must start at the next even position
      .endif
      xor   edx,edx     ;EDX to be used for sqr

sq1:
      add   edx,ecx
      .if   edx > eax
            sub   edx,ecx
            jmp   sq2
      .endif
      sub   eax,edx
      add   edx,ecx
sq2:
      shr   edx,1
      shr   ecx,1
      jc    sq3         ;last integer bit of root has been set
      shr   ecx,1       ;shift under next even position
      jmp   sq1

sq3:

;at this point, the EDX register would contain the truncated square root result.
;if a rounded result would be preferred, the following code must then be used
; .if eax > edx
; inc edx
; .endif

      ret               ;sqr answer is returned in the EDX register
Whenever you assume something, you risk being wrong half the time.
https://masm32.com/masmcode/rayfil/index.html

Biterider

Hi raymond
Very interesting  :thumbsup:
Thank you.

Regards, Biterider

HSE

Hi Raymond!

Quote from: raymond on January 31, 2025, 02:07:46 PMand it could be easily adapted to the 64-bit system by using the three equivalent 64-bit registers, as long as the bsr and bts instructions are still in use

That is correct  :thumbsup:

Regards, HSE
Equations in Assembly: SmplMath

jj2007

Quote from: raymond on January 31, 2025, 02:07:46 PMNot being a mathematician

I always thought you were the only mathematician in this forum, Raymond :cool:

daydreamer

Thanks, great raymond  :thumbsup:
It Shows similarity to a division proc based on how human calculate many digits with pen & paper in school
Héctor 64 bit conversion :thumbsup:

my none asm creations
https://masm32.com/board/index.php?topic=6937.msg74303#msg74303
I am an Invoker
"An Invoker is a mage who specializes in the manipulation of raw and elemental energies."
Like SIMD coding

raymond

Just thought of a detail which I had not mentioned in my original post. Using the given code, regardless if you modify it or not to get a "rounded" square root, the EAX register would not be altered. Then, upon return, if EAX=0, the result in the EDX register would be the exact square root.

Without the rounding code, the resulting content of the EAX register would be equivalent to the remainder of a division of the target number by the truncated square root.
Whenever you assume something, you risk being wrong half the time.
https://masm32.com/masmcode/rayfil/index.html

ognil

Hi,
To compute the square root of an integer using SSE instructions in MASM64 assembly, follow these steps:

Load the integer into a general-purpose register (e.g., RAX).
Convert the integer to a double-precision float using CVTSI2SD.
Compute the square root using the SQRTSD instruction.
Result is stored in an XMM register as a double-precision float.

Here's the code example:

.data
num QWORD 25  ; Example integer (64-bit)
.code
main proc
    ; Load the integer into RAX
    mov rax, num
    ; Convert RAX to double-precision float in XMM0
    cvtsi2sd xmm0, rax
    ; Compute square root of XMM0
    sqrtsd xmm0, xmm0
    ; The result is now in XMM0 as a double-precision float
    ; Optionally convert back to integer (truncates decimal part)
    ; cvttsd2si rax, xmm0
    ret
main endp

Explanation:
cvtsi2sd xmm0, rax : Converts the 64-bit integer in RAX to a double-precision float in XMM0.
sqrtsd xmm0, xmm0 : Computes the square root of the value in XMM0, storing the result back in XMM0.

Optional Conversion : Use cvttsd2si to convert the result back to an integer if needed, but note this truncates the decimal part.
This approach handles 64-bit integers accurately using double-precision floating-point operations.
For 32-bit integers, the code remains valid as the upper bits are zero-extended. :biggrin:
"Not keeping emotions under control is another type of mental distortion."

NoCforMe

Hey, since we're now posting totally trivial code examples (yours boils down to really just one instruction: sqrtsd xmm0, xmm0), I'll offer my contribution:

    fsqrt

See how simple that is?
Assembly language programming should be fun. That's why I do it.

jj2007

Quote from: ognil on March 06, 2025, 12:13:49 PMCompute the square root using the SQRTSD instruction.

Eleven years ago, in the Lab

Quote from: NoCforMe on March 06, 2025, 05:16:55 PMyours boils down to really just one instruction: sqrtsd xmm0, xmm0

Right.