## Python anyone?

Started by raymond, November 28, 2023, 11:30:19 AM

#### raymond

I recently got into a brief discussion in another forum about the capabilities of assembly vs other languages. The following is one answer I got:

Quote: rayfil wrote:
"Show me any other language which could extract the square root of any number with a precision of 1000 decimal digits within a split second."

Following Python snippet can extract the square root with a precision of 10000 decimal digits in 0.01 second.

CODE
from decimal import *
from time import time

def sq(n, pr):
    start = time()
    getcontext().prec = pr
    print(Decimal(n).sqrt()
    print(time() - start)

sq(5, 10000)

Mathematica is even faster than Python.

I would be VERY curious to see a finished program which could deliver such a feat, i.e. 0.01 second for 10,000 digits.

Is there anyone
a) sufficiently familiar with Python to understand the offered snippet,
b) capable of generating a working exe to confirm its speed and its output?

If proven, its algo would be extremely interesting.

Thanks
Whenever you assume something, you risk being wrong half the time.
https://masm32.com/masmcode/rayfil/index.html

#### jack

hi Raymond
on my PC python takes 0.035 seconds
I have been playing with MP floating point arithmetic in FreeBasic for a while. My algorithms are very simple; the following little program builds an MP float from a 10000-character string, takes the square root, squares it and gets the relative error. The average time per loop is 0.011 seconds
mind you that there's overhead in building the number from string and then squaring it to get the relative error
the same algorithms in C took 0.0054 seconds
dim as decfloat x, y
dim as long i
dim as double t

t=timer
for i=1 to 9
    x=trim(str(i))+string(NUMBER_OF_DIGITS-1, trim(str(i))) ' build a 10000 digits number from string
    y=sqr(x) 'square root
    y=(y*y-x)/x 'calculate the relative error
    print fp2str(y, 20)
next
t=timer-t
print t/9
<obviously the decimal floating point package is not included, if you are interested I can upload it and post a link>
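
For comparison, roughly the same test can be sketched in Python with the standard decimal module. This is not a port of the DecFloat package: the value NUMBER_OF_DIGITS = 10000 and the function name relative_error_of_sqrt are assumptions made here for illustration, and the timing it prints will of course differ from machine to machine.

```python
from decimal import Decimal, localcontext
from time import perf_counter

NUMBER_OF_DIGITS = 10000  # assumed value, matching the 10000-character string in the post


def relative_error_of_sqrt(digit, digits=NUMBER_OF_DIGITS):
    """Build a number from `digits` copies of `digit`, take its square root,
    square the root again and return the relative error (y*y - x) / x."""
    with localcontext() as ctx:
        ctx.prec = digits                  # work with `digits` significant digits
        x = Decimal(str(digit) * digits)   # e.g. "5555...5", `digits` characters long
        y = x.sqrt()
        return (y * y - x) / x


if __name__ == "__main__":
    start = perf_counter()
    for i in range(1, 10):
        err = relative_error_of_sqrt(i)
        print(f"{err:.20E}")               # 20 digits after the decimal point
    print((perf_counter() - start) / 9)    # average time per loop
```

As in the FreeBasic version, the measured time includes building the number from a string and squaring the root.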

#### TimoVJL

Quote: Stefan Krah's mpdecimal package (libmpdec): a complete implementation of the General Decimal Arithmetic Specification that will – with minor restrictions – also conform to the IEEE 754-2008 Standard for Floating-Point Arithmetic. Starting from Python-3.3, libmpdec is the basis for Python's decimal module.
mpdec

This fixed version (closing parenthesis added to the first print) runs, but doesn't print anything
from decimal import *
from time import time

def sq(n, pr):
    start = time()
    getcontext().prec = pr
    print(Decimal(n).sqrt())
    print(time() - start)

sq(5, 10000)
With x64 version:
0.18700003623962402
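
Note that sq() starts the clock before printing the 10,000-digit result and stops it afterwards, so the reported figure includes console output as well as the square root itself. A minimal sketch that times only the sqrt() call could look like this; the name timed_sqrt and the use of perf_counter and localcontext are choices made here for illustration, not part of the original snippet.

```python
from decimal import Decimal, localcontext
from time import perf_counter


def timed_sqrt(n, pr):
    """Return (square root of n with pr significant digits, seconds spent in sqrt)."""
    with localcontext() as ctx:
        ctx.prec = pr
        start = perf_counter()
        root = Decimal(n).sqrt()
        elapsed = perf_counter() - start   # clock stops before any printing
    return root, elapsed


root, elapsed = timed_sqrt(5, 10000)
print(str(root)[:50] + "...")              # show only the leading digits
print(f"{elapsed:.6f} s")
```

Running both variants side by side should show how much of the original figure is due to the console.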

May the source be with you

#### jack

for people on Windows I recommend WinPython; the download is huge, but it includes the most popular libraries like SciPy and many others

#### jj2007

Quote from: jack on November 28, 2023, 06:43:20 PM
I recommend WinPython

See Using a Masm DLL from Python or this:

Py_Initialize PROTO C
Py_Finalize PROTO C
PyRun_SimpleString PROTO C :DWORD

Init
call Py_Initialize
fn PyRun_SimpleString, "print('Hello World')"
call Py_Finalize
Exit
end start

That was over 8 years ago, and now intense googling has not produced any python*.lib. They seem to fumble a lot with this language; version 3.8, for example, is the last one that works on Windows 7: "Last WinPython version that is said to still work on Windows 7 should be WinPython64-3.8.9.0"

#### HSE

On the phone the program is a little slow:
0.221851

Equations in Assembly: SmplMath

#### GoneFishing

python 3.9(64-bit)
Quote: 0.04675626754760742

Quote from: jj2007 on November 28, 2023, 08:34:51 PM
That was over 8 years ago, and now intense googling has not produced any python*.lib.

python*.lib resides in libs folder

#### jj2007

Quote: python*.lib resides in libs folder

I've done that in the meantime, using this archive. There's a problem, though:
include \masm32\include\masm32rt.inc

.data
hDll dd ?
hPyList_New dd ?
.code
start:
print hex$(eax), 9, "LoadLib", 13, 10
print hex$(rv(GetLastError)), 9, "last error", 13, 10
print hex$(hPyList_New), 9, "PyList_New", 13, 10
exit
end start

Output:
000000C1        last error
00000000        PyList_New

Googling 0xc1 loadlibrary python yields thousands of hits. Error 0xC1 = dec 193 means bad exe format. So it loads the library correctly as 32-bit code but fails to get the address of PyList_New, which is, according to Timo's TLPEView, part of python3.dll.

You can solve the problem with
invoke SetCurrentDirectory, chr$("\Python\")
Apparently, python3.dll loads libraries not from its own folder but rather 1. from the executable's folder and then 2. from Windows\System32, where it finds 64-bit DLLs. It's probably a feature
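
Since error 193 (bad exe format) typically points at a 32-bit/64-bit mismatch, it can help to check which flavour a given DLL actually is. The following is a minimal sketch that reads the Machine field from a file's PE header; the function name pe_machine and the example path in the comment are hypothetical, and only three common machine codes are listed.

```python
import struct

# A few well-known values of the PE "Machine" field
MACHINE_NAMES = {
    0x014C: "x86 (32-bit)",
    0x8664: "x64 (64-bit)",
    0xAA64: "ARM64",
}


def pe_machine(path):
    """Return a readable name for the target machine of a PE file (exe or dll)."""
    with open(path, "rb") as f:
        dos_header = f.read(0x40)
        if len(dos_header) < 0x40 or dos_header[:2] != b"MZ":
            raise ValueError("not an MZ/PE file")
        # The offset of the PE header is stored at 0x3C in the DOS header
        (pe_offset,) = struct.unpack_from("<I", dos_header, 0x3C)
        f.seek(pe_offset)
        pe_header = f.read(6)
    if len(pe_header) < 6 or pe_header[:4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 16-bit Machine field follows the 4-byte signature
    (machine,) = struct.unpack_from("<H", pe_header, 4)
    return MACHINE_NAMES.get(machine, hex(machine))


# Hypothetical usage: print(pe_machine(r"\Python\python3.dll"))
```

Running it on the DLLs in the executable's folder and on those in the Python folder should show whether a 64-bit copy is being picked up by a 32-bit program.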

#### GoneFishing

I copied python*.dlls to my test project folder and all worked fine:
00000000        last error
715A3A30        PyList_New

#### jj2007

Quote from: GoneFishing on November 29, 2023, 02:53:00 AM
I copied python*.dlls to my test project folder

And all worked fine, sure. But you would have to do that for all your projects, thus wasting an awful lot of disk space.

Next version of MasmBasic will have this macro, so that you can access the DLL from any folder:
SetDllFolder MACRO arg:=<0>
ifndef sdfOld$
.DATA?
sdfOld$ db 260 dup(?)    ; MAX_PATH
.CODE
endif
ifdif <arg>, <0>
.if 1
push repargA(arg)
call SetCurrentDirectory
else
.endif
endif
ENDM

Usage:
SetDllFolder "\Python"
...
SetDllFolder

#### GoneFishing

I prefer pure MASM32.
And you're right: python wants to know the path to its folder, which contains not only the dlls but also python*.zip with all the needed modules.
Your MasmBasic HelloWorld proggie outputs a lot of internal configuration info when it doesn't find any modules:
Quote: Python path configuration:
PYTHONHOME = (not set)
PYTHONPATH = (not set)
program name = 'python'
isolated = 0
environment = 1
user site = 1
import site = 1
sys._base_executable = 'C:\\masm32\\projects\\python\\39x32\\t2.exe'
sys.base_prefix = ''
sys.base_exec_prefix = ''
sys.platlibdir = 'lib'
sys.executable = 'C:\\masm32\\projects\\python\\39x32\\t2.exe'
sys.prefix = ''
sys.exec_prefix = ''
sys.path = [
'C:\\masm32\\projects\\python\\39x32\\python39.zip',
'.\\DLLs',
'.\\lib',
'C:\\masm32\\projects\\python\\39x32',
]
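
To see what these fields look like when the modules are found, the same values can be printed from a working interpreter and compared with the dump above. This is only a small sketch; the helper name path_configuration is made up for illustration.

```python
import sys


def path_configuration():
    """Collect the path-related settings of the running interpreter."""
    return {
        "sys.executable": sys.executable,
        "sys.prefix": sys.prefix,
        "sys.exec_prefix": sys.exec_prefix,
        "sys.base_prefix": sys.base_prefix,
        "sys.path": list(sys.path),
    }


for name, value in path_configuration().items():
    print(f"{name} = {value!r}")
```

In the dump above sys.prefix and sys.exec_prefix are empty strings, which fits the observation that the embedded interpreter could not locate its own folder.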

#### raymond

Quote from: jack on November 28, 2023, 12:55:11 PM
hi Raymond
on my PC python takes 0.035 seconds

Many thanks to all those having shown interest in this subject.

Jack's results indicate that the quoted 0.01 second may well be achievable.

My main interest was to find out if there is a significantly different way of extracting a square root compared to all those I have learned in the past.

The most promising one for speed would have to be using logarithms. But then, are there means to convert those rapidly back and forth with sufficient precision? Or is it something totally different? A self-contained working program would certainly be useful to run under ollydbg and get some of the details.

#### jack

my algorithm uses log and exp, evaluated in double precision, to get a first approximation; thereafter I use the Newton-Raphson method, something like this:
x0 = x scaled so it's between 1 and 10 but less than 10
ex = the exponent in base 10 of x
y0 = first approximation
y0 = exp(log(x0)/2)*exp(2.302585092994046*ex/2) ; the second exp is mp-exp evaluated only to 16 digits
prec=32
then do the Newton-Raphson calculations in a loop doubling the precision each time in the loop
the calculations inside the loop are only evaluated to prec precision
Paul Dixon posted an implementation of the square root in PowerBasic with inline asm: https://forum.powerbasic.com/forum/user-to-user-discussions/programming/816462-arbitrary-length-number-math?p=816479#post816479
but the Newton-Raphson method is several orders of magnitude faster than his approach
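
The scheme described above can be sketched in Python with the decimal module. This is a reading of the description, not jack's actual code. It makes the following assumptions: the iteration is the classic y = (y + x/y)/2, the exponent is made even instead of calling a second exp, and a few guard digits plus one extra pass at full precision are added because a double-precision start is good for slightly fewer than 16 digits. The names newton_sqrt and GUARD are made up for this sketch.

```python
import math
from decimal import Decimal, localcontext

GUARD = 10  # extra working digits (an assumption of this sketch)


def newton_sqrt(x, digits):
    """Square root of a positive number to `digits` significant digits,
    using a double-precision first guess and Newton-Raphson with the
    working precision doubled on every pass."""
    x = Decimal(x)
    if x <= 0:
        raise ValueError("x must be positive")
    with localcontext() as ctx:
        ctx.prec = 32
        ex = x.adjusted()              # exponent in base 10 of x
        x0 = float(x.scaleb(-ex))      # x scaled so that 1 <= x0 < 10 (up to rounding)
        if ex % 2:                     # make the exponent even so it can be halved exactly
            x0 *= 10.0
            ex -= 1
        # first approximation from double-precision log and exp
        y = Decimal(math.exp(math.log(x0) / 2)).scaleb(ex // 2)

        prec = 32
        while True:
            # calculations inside the loop are only done to the current precision
            ctx.prec = min(prec, digits) + GUARD
            y = (y + x / y) / 2
            if prec >= digits:
                break
            prec *= 2                  # double the precision each time round
        y = (y + x / y) / 2            # one extra pass at full working precision
        ctx.prec = digits
        return +y                      # unary plus rounds to `digits` digits


print(newton_sqrt(5, 60))
```

Because each pass roughly doubles the number of correct digits, about ten passes are enough for 10,000 digits, and only the last ones run at anything close to full precision.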

#### jack

just in case that someone would like to have a look at my humble implementation of multiple precision decimal floating-point I attach the code in FreeBasic DecFloat-FB

#### GoneFishing

hi Raymond,
Quote: A self-contained working program would certainly be useful
In theory the small python script in your post can be converted to exe format.
I've used PyInstaller for that purpose, but the result doesn't look "debuggable" because of its size:
exe = 1018KB + dependency folder = 10.2MB

BTW performance improved to 0.031305789947509766