How to optimize execution speed

Started by kcvinu, June 09, 2024, 07:59:47 AM

kcvinu

Hi all,
I started my masm64 journey with a DLL that creates a simple window. The DLL is called from Python with the ctypes library. So far so good. But it takes almost 3 times as long as my other DLL, which was written in the C3 programming language. For those of you not familiar with C3, it's a C-like language. The C3 DLL takes 11-15 ms, but the masm64 DLL takes 40-45 ms. What should I do to get optimal speed?
Here is my asm code.
; Program : A dll to make a window from python.
; Author : kcvinu

include C:\masm64\include64\masm64rt.inc

.data?
hInstance     dq ?
hIcon         dq ?
hCursor       dq ?
hBrush        dq ?
     
    .data
      classname db "KCV_Window",0
      caption db "മാസം വിൻഡൊ", 0 ;  This is my native language MALAYALAM!

.CODE

; This will be exported.
NewForm Proc
    mov hInstance, rv(GetModuleHandle,0)
    mov hIcon,     rv(LoadIcon,hInstance,10)
    mov hCursor,   rv(LoadCursor,0,IDC_ARROW)
    mov hBrush,    rv(CreateSolidBrush,00EEEEEEh)
    mov rax, rv(makeWindow)
    RET
NewForm endp

LibMain proc instance:DWORD, reason:DWORD, unused:DWORD
    ret
LibMain endp

makeWindow proc
    LOCAL wc      :WNDCLASSEX
   
    mov wc.cbSize,         SIZEOF WNDCLASSEX
    mov wc.style,          CS_BYTEALIGNCLIENT or CS_BYTEALIGNWINDOW
    mov wc.lpfnWndProc,    ptr$(WndProc)
    mov wc.cbClsExtra,     0
    mov wc.cbWndExtra,     0
    mrm wc.hInstance,      hInstance
    mrm wc.hIcon,          hIcon
    mrm wc.hCursor,        hCursor
    mrm wc.hbrBackground,  hBrush
    mov wc.lpszMenuName,   0
    mov wc.lpszClassName,  ptr$(classname)
    mrm wc.hIconSm,        hIcon

    invoke RegisterClassEx, ADDR wc
   
    invoke CreateWindowEx, 0, \
                          ADDR classname, addr caption, \
                          WS_OVERLAPPEDWINDOW or WS_VISIBLE,\
                          100, 100, 500, 400, 0,0,hInstance,0   
    ret
makeWindow endp

; This function also exported
showForm proc handle:HWND
    invoke ShowWindow, handle, 5
    invoke UpdateWindow, handle
    call msgloop
    ret
showForm endp

; Shamelessly copied from Examples section
msgloop proc
    LOCAL msg    :MSG
    LOCAL pmsg   :QWORD
    mov pmsg, ptr$(msg)                     ; get the msg structure address
    jmp gmsg                                ; jump directly to GetMessage()
  mloop:
    invoke TranslateMessage,pmsg
    invoke DispatchMessage,pmsg
  gmsg:
    test rax, rv(GetMessage,pmsg,0,0,0)     ; loop until GetMessage returns zero
    jnz mloop
    ret
msgloop endp

WndProc proc hWin:QWORD,uMsg:QWORD,wParam:QWORD,lParam:QWORD
    .switch uMsg
        .case WM_DESTROY
            invoke PostQuitMessage, NULL
    .endsw
    invoke DefWindowProc, hWin, uMsg, wParam, lParam
    ret
WndProc endp
End

This is my command to assemble and link.
ml64 /c /nologo py1.asm && link /ENTRY:LibMain /DLL /DEF:py1.def /OUT:py1.dll py1.obj
Please feel free to ask about anything I missed in this post. I didn't include attachments. Please comment if you want my files.
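
To see where the 40-45 ms goes, it can help to time the DLL load and the NewForm call separately from the Python side. The sketch below is only an illustration: it assumes py1.dll (the name from the link command above) is at the path you pass in and that NewForm returns the window handle in RAX; the function name time_dll is made up for this example. The load figure also includes whatever the Windows loader does to resolve the DLL's imports, so the two numbers can differ a lot.

```python
import ctypes
import os
import time


def time_dll(path):
    """Time loading the DLL and calling NewForm, each in milliseconds.

    Returns None when not on Windows or when the file is missing,
    so the script can be run anywhere without raising.
    """
    if os.name != "nt" or not os.path.exists(path):
        return None

    t0 = time.perf_counter()
    dll = ctypes.WinDLL(path)              # LoadLibrary: runs the entry point, if any
    t1 = time.perf_counter()

    dll.NewForm.restype = ctypes.c_void_p  # assumption: HWND comes back in RAX
    hwnd = dll.NewForm()                   # registers the class, creates the window
    t2 = time.perf_counter()

    return {
        "load_ms": (t1 - t0) * 1000.0,
        "newform_ms": (t2 - t1) * 1000.0,
        "hwnd": hwnd,
    }


if __name__ == "__main__":
    # Hypothetical location: py1.dll in the current directory.
    print(time_dll(os.path.abspath("py1.dll")))
```

Running the same script against the C3 DLL (with its own export name substituted) would show whether the gap is in loading or in window creation.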
 

NoCforMe

Assembly language programming should be fun. That's why I do it.

kcvinu


NoCforMe

It looks like it should be plenty fast.
The only things I can think of are the calls to LoadIcon(), LoadCursor() and CreateSolidBrush(), all of which will take some time. (The other DLL undoubtedly has to call LoadCursor()).
Does the other DLL load an icon?
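
One way to check whether those calls matter is to time them in isolation. The following is a rough sketch from the Python side using ctypes; timed_ms and time_resource_calls are names invented for this example, the ctypes call overhead is included in each figure, and LoadIcon is left out because it needs the DLL's own icon resource (ID 10 in the code above).

```python
import ctypes
import sys
import time

IDC_ARROW = 32512  # integer resource ID of the standard arrow cursor


def timed_ms(func, *args):
    """Call func(*args) and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, (time.perf_counter() - start) * 1000.0


def time_resource_calls():
    """Time LoadCursor and CreateSolidBrush; returns {} off Windows."""
    if sys.platform != "win32":
        return {}
    user32 = ctypes.WinDLL("user32")
    gdi32 = ctypes.WinDLL("gdi32")
    user32.LoadCursorW.argtypes = [ctypes.c_void_p, ctypes.c_void_p]
    user32.LoadCursorW.restype = ctypes.c_void_p
    gdi32.CreateSolidBrush.argtypes = [ctypes.c_uint32]
    gdi32.CreateSolidBrush.restype = ctypes.c_void_p
    gdi32.DeleteObject.argtypes = [ctypes.c_void_p]
    gdi32.DeleteObject.restype = ctypes.c_int

    timings = {}
    _, timings["LoadCursor"] = timed_ms(user32.LoadCursorW, None, IDC_ARROW)
    brush, timings["CreateSolidBrush"] = timed_ms(gdi32.CreateSolidBrush, 0x00EEEEEE)
    if brush:
        gdi32.DeleteObject(brush)  # free the test brush again
    return timings


if __name__ == "__main__":
    print(time_resource_calls())
```

If both figures come out far below a millisecond, the difference between the two DLLs is probably somewhere else.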

kcvinu

Quote: Does the other DLL load an icon?

Yes! And it does some more tasks, as it is a working library.

NoCforMe

Just shooting in the dark here:
You can probably eliminate the calls to ShowWindow() and UpdateWindow(). I've found these to be completely unnecessary, as long as you include the WS_VISIBLE style when you create the window.

kcvinu

Let me try that. But there is a question. I tested without using those two functions and it ran without any problem. But it was a single window. There were no controls. What if we create some controls after creating the window handle? Then do we need to call ShowWindow and UpdateWindow?

NoCforMe

I don't think so, but very easy to find out:
I would code it without those calls and see if the controls show up. Again, so long as they have the WS_VISIBLE style, they should show up right away.
If not, just put a call to UpdateWindow() in. But that shouldn't be necessary.

I've noticed that a lot of programmers tend to be superstitious about this, and I see needless calls like these sprinkled here and there, like some kind of charm or spell ...
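
A quick way to test this from the Python side, rather than by eye, is to ask Windows whether the window (or a control) is visible. The sketch below is an illustration only: has_ws_visible and is_window_visible are names made up here, the style constants are the standard winuser.h values, and the hwnd would be whatever NewForm (or a later control-creating call) returned.

```python
import ctypes
import sys

WS_VISIBLE = 0x10000000           # winuser.h
WS_OVERLAPPEDWINDOW = 0x00CF0000  # winuser.h


def has_ws_visible(style):
    """True if a window style value includes the WS_VISIBLE bit."""
    return (style & WS_VISIBLE) != 0


def is_window_visible(hwnd):
    """Ask user32 whether hwnd is visible; returns None off Windows."""
    if sys.platform != "win32":
        return None
    user32 = ctypes.WinDLL("user32")
    user32.IsWindowVisible.argtypes = [ctypes.c_void_p]
    user32.IsWindowVisible.restype = ctypes.c_int
    return bool(user32.IsWindowVisible(hwnd))


if __name__ == "__main__":
    # The style used in the CreateWindowEx call above.
    print(has_ws_visible(WS_OVERLAPPEDWINDOW | WS_VISIBLE))
```

IsWindowVisible reports on the WS_VISIBLE state of the window and its parents, so the same check can be pointed at a child control's handle.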

zedd151

One thing caught my eye, kcvinu...
; This will be exported.   
NewForm Proc
    mov hInstance, rv(GetModuleHandle,0)
    mov hIcon,    rv(LoadIcon,hInstance,10)
    mov hCursor,  rv(LoadCursor,0,IDC_ARROW)
    mov hBrush,    rv(CreateSolidBrush,00EEEEEEh)
    mov rax, rv(makeWindow)
    RET
NewForm endp

How often is this called? Are you constantly creating the same brush, or loading the same icon?
Or is that call only made once? I assume only called once, but I have to ask to be sure.

Quote: ; Shamelessly copied from Examples section
:biggrin: Many of us have used code from there at one point or another. Totally legit usage.


Another thing I noticed: are you sure that these arguments are all DWORDs?
LibMain proc instance:DWORD, reason:DWORD, unused:DWORD
    ret
LibMain endp
And no return value?
I often see 'mov rax, 1' or 'mov rax, TRUE' before the return.

From a masm64 sdk example
LibMain proc instance:QWORD,reason:QWORD,unused:QWORD

    .if reason == DLL_PROCESS_ATTACH
      mov rax, TRUE                         ; return TRUE so DLL will start

    .elseif reason == DLL_PROCESS_DETACH

    .elseif reason == DLL_THREAD_ATTACH

    .elseif reason == DLL_THREAD_DETACH

    .endif

    ret

LibMain endp
The .if block can be omitted since we always want the dll to start.

kcvinu

@sudoku,
Thanks for the reply.

Quote: How often is this called?

Only one time, right before registering the window. And FYI, I just omitted the entry point function with /NOENTRY.

zedd151

If you are only calling it once, I don't see how 30-ish milliseconds makes a lot of difference overall. In my opinion it would be much different if you were calling it repeatedly, where the extra waiting time is cumulative.

Maybe the coding gurus here have a way to speed up your code, or can offer other suggestions.

kcvinu

Okay. I have made some changes as per your suggestions and now ended up with this.
; Program : A dll to make a window from python.
; Author : kcvinu

include C:\masm64\include64\masm64rt.inc

.data?
hInstance     dq ?
hIcon         dq ?
hCursor       dq ?
hBrush        dq ?
     
.data
  classname db "KCV_Window", 0
  caption db "Just a window", 0

.CODE

registerClass proc
    LOCAL wc      :WNDCLASSEX
    mov hInstance, rv(GetModuleHandle, 0)
    mov hIcon,     rv(LoadIcon,hInstance, 10)
    mov hCursor,   rv(LoadCursor, 0, IDC_ARROW)
    mov hBrush,    rv(CreateSolidBrush,00EEEEEEh)
   
    mov wc.cbSize,         SIZEOF WNDCLASSEX
    mov wc.style,          CS_OWNDC or CS_HREDRAW or CS_VREDRAW or CS_BYTEALIGNCLIENT or CS_BYTEALIGNWINDOW
    mov wc.lpfnWndProc,    ptr$(WndProc)
    mov wc.cbClsExtra,     0
    mov wc.cbWndExtra,     0
    mrm wc.hInstance,      hInstance
    mrm wc.hIcon,          hIcon
    mrm wc.hCursor,        hCursor
    mrm wc.hbrBackground,  hBrush
    mov wc.lpszMenuName,   0
    mov wc.lpszClassName,  ptr$(classname)
    mrm wc.hIconSm,        hIcon
    invoke RegisterClassEx, ADDR wc
    ret
registerClass endp

NewForm Proc
    invoke registerClass
    invoke CreateWindowEx, 0, \
                          ADDR classname, addr caption, \
                          WS_OVERLAPPEDWINDOW or WS_VISIBLE,\
                          100, 100, 500, 400, 0,0,hInstance,0
    ret
NewForm endp

LibMain proc instance:QWORD,reason:QWORD,unused:QWORD
    .if reason == DLL_PROCESS_ATTACH
        mov rax, TRUE                         ; return TRUE so DLL will start
    .endif
    ret
LibMain endp


showForm proc
    LOCAL msg    :MSG
    LOCAL pmsg   :QWORD
    mov pmsg, ptr$(msg)                     ; get the msg structure address
    jmp gmsg                                ; jump directly to GetMessage()
  mloop:
    invoke TranslateMessage,pmsg
    invoke DispatchMessage,pmsg
  gmsg:
    test rax, rv(GetMessage,pmsg,0,0,0)     ; loop until GetMessage returns zero
    jnz mloop
    ret
showForm endp

WndProc proc hWin:QWORD,uMsg:QWORD,wParam:QWORD,lParam:QWORD
    .switch uMsg
        .case WM_DESTROY
            invoke PostQuitMessage, NULL
    .endsw
    invoke DefWindowProc, hWin, uMsg, wParam, lParam
    ret
WndProc endp
End

And this is the command:
ml64 /nologo /c py1.asm && link /ENTRY:LibMain /DLL /DEF:py1.def /nologo /SUBSYSTEM:windows /OUT:py1.dll py1.obj
Now the time is 30+ ms.

zedd151

30+ milliseconds relative to the first version, or 30 ish milliseconds total?

Try only
--------------
mov rax, TRUE
ret
--------------

Without the .if statement... in LibMain

NoCforMe

Quote from: kcvinu on June 09, 2024, 11:49:43 AM
; Program : A dll to make a window from python.
LibMain proc instance:QWORD,reason:QWORD,unused:QWORD
    .if reason == DLL_PROCESS_ATTACH
        mov rax, TRUE                        ; return TRUE so DLL will start
    .endif
    ret
LibMain endp
Just one little thing: what is RAX if reason != DLL_PROCESS_ATTACH?
Answer: undefined.
You might want to set it to zero in that case. (Remember, you're going to return something no matter what, meaning whatever is in RAX.)
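
From the Python side, the value LibMain returns for DLL_PROCESS_ATTACH shows up as success or failure of the load itself: if the entry point returns zero, LoadLibrary fails and ctypes raises OSError. A minimal sketch, assuming the DLL is loaded with ctypes.CDLL; try_load is a helper name made up for this example, and the path is hypothetical.

```python
import ctypes


def try_load(path):
    """Return (library, None) on success or (None, error) if loading failed.

    On Windows a load fails when the file is missing, when an import cannot
    be resolved, or when the entry point returns FALSE on process attach.
    """
    try:
        return ctypes.CDLL(path), None
    except OSError as exc:
        return None, exc


if __name__ == "__main__":
    lib, err = try_load("py1.dll")  # hypothetical path
    if lib is None:
        print("load failed:", err)
    else:
        print("loaded:", lib)
```

With RAX left undefined the load will usually still succeed, because any non-zero value counts as TRUE, which is why this kind of bug can go unnoticed for a long time.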

NoCforMe

Quote from: sudoku on June 09, 2024, 11:54:29 AM
30+ milliseconds relative to the first version, or 30 ish milliseconds total?

Try only
--------------
mov rax, TRUE
ret
--------------

Without the .if statement... in LibMain
Oh, come on: you can't be serious.
A simple comparison isn't going to take any time at all.