News:

Masm32 SDK description, downloads and other helpful links
Message to All Guests
NB: Posting URL's See here: Posted URL Change

Main Menu

Profiling aka performance monitoring

Started by jj2007, February 15, 2022, 09:58:35 PM

Previous topic - Next topic

jj2007

I've improved my StackWalk macro, adding the capacity to "profile" procedures. Typical output for RichMasm, immediately after loading the 43k lines MasmBasic source:
Name              #called   total time   per call
SubRE                1720      1464 ms     851 us
ReOpenFile              1      1254 ms    1253 ms
WndProc              1295      1104 ms     852 us
CbTimer                24       856 ms      35 ms
SetLcBufferP            6       532 ms      88 ms
ConvertLinks            1       455 ms     455 ms
RestoreSel              3       220 ms      73 ms
SetSelFromB             3       219 ms      73 ms
ShowLineCol             1       156 ms     155 ms
CrtRichEdit             1        75 ms      74 ms


Interesting that the mai WndProc and the RichEdit subclass (SubRE) need almost exactly the same time :rolleyes:

jj2007

#1
New version. With a simple StackWalk profile, trace after the include line, plus a StackWalk show, I get this:
#count  proc             ...called by  args
114223  WndProc       7742630A      00441FAE  00000006  00000000  00000000
114224  SetTTP        WndProc
114225  ClosePopup    WndProc
114226  DispMenu      ClosePopup
114227  SetTTP        ClosePopup
114241  WndProcMenu   7742630A      002A05F2  00000088  00000004  00000000

Name              #called   total time   per call
AssRunThread            2        40 s    20110 ms
SubRE                5102      7194 ms    1410 us
OpenLink                3      4547 ms    1515 ms
AssRun                  2      4159 ms    2079 ms
FindOptions             2      3741 ms    1870 ms
FindTextP             410      3729 ms    9095 us
FiWo                   15      2602 ms     173 ms
MyTest                  2      1980 ms     990 ms
DrawPics               62      1473 ms      23 ms
WndProc              3594      1370 ms     381 us
SetLcBufferP           66      1352 ms      20 ms
ReOpenFile              1      1212 ms    1211 ms
CbTimer              1057      1112 ms    1052 us
ConvertLinks            1       440 ms     439 ms
ReSaveFile              3       438 ms     146 ms
SetSelFromB            18       238 ms      13 ms


Of course, this is useful only for serious applications: RichMasm is a 22k lines source. Performance is hardly affected, per call to a proc the profiling costs a handful of microseconds. The second "time eater" in the list above, SubRE, takes 1410 microseconds, so it's definitely not an issue.

Test version attached, extract both files to a folder and launch RichMasm.exe. Then press F9, hold Ctrl and click "No".
The legal stuff: use at your own risk.

jj2007

After some testing, I included the StackWalk macro into the MasmBasic package (updated today after exactly one year). I attach here only the latest RichMasm executable and the guide.

You can use StackWalk in your own non-MasmBasic projects:
useMB=1
if useMB
include \masm32\MasmBasic\MasmBasic.inc
StackWalk
else
include \masm32\include\masm32rt.inc
endif

.code
TheSleep proc arg1
  invoke Sleep, arg1
  ret
TheSleep endp
MyTest proc uses edi arg
  xor edi, edi
  .Repeat
inc edi
print str$(edi), " "
invoke TheSleep, edi
  .Until edi>=20
  print chr$(13, 10)
  ret
MyTest endp
start:
  xor esi, esi
  .Repeat
invoke MyTest, 123
inc si
  .Until esi>=3
  if useMB
StackWalk show
StackWalk end
  else
inkey "hit any key"
  endif
  exit
end start


The trick is to make them temporarily MasmBasic projects, with useMB=1. Once you are satisfied, put useMB=0, and your source is again pure Masm32 SDK :cool:

jj2007

Just a little warning: StackWalk won't work with AsmC.

EDIT: From MB version 28.2.22 onwards it does work also with AsmC :thumbsup:

jj2007

It seems that Nidud has left for good, he deletes all his posts, apparently. So I needed a workaround for the AsmC bug that produces a crash in the PROLOGUE macro.

It seems to work, see attachment (pure Masm32 SDK code), but I have not yet updated the StackWalk macro.

Prolo MACRO procname, flags, argbytes, localbytes, reglist, userparms
  if localbytes or argbytes
push ebp
ifdef __ASMC__
tmp$ CATSTR <defined: >, @Environ(oAssembler)
% echo tmp$
ife argbytes
nop
sub esp, 8
nop
endif
endif
mov ebp, esp
if localbytes
sub esp, localbytes
endif
  endif
  EXITM %argbytes
ENDM

jj2007

Apparently the whole AsmC sub-board has disappeared. Bad luck...

We solved the PROLOGUE issue in the thread that disappeared with it, but now, trying to apply the corrected StackWalk macro in my RichMasm source, I ran into another issue with AsmC (normally I use UAsm):
.if edx==Mirror$("data")
m2m ecx, BaseMenu+4 ; tabs data (the other cases will not be modified)
call SetTabs
.elseif (al=="." && edx!=Mirror$("unti") && edx!=Mirror$("brea") && dx!="ne") || dword ptr [esi-1]==Mirror$("else") || word ptr [esi-1]=="fi"
; starts with dot except .Until; also for macro if else endif endw
mov edx, esi
sub edx, txrg.lpstrText
mov dword ptr [esi-1], 9 ; add a tab where the dot sits
.if edx==3 && byte ptr [esi-3]==32
mov dword ptr [esi-3], 9 ; replace 2 leading spaces with a single tab
.endif
inc esi
.endif


Looks innocent, but later it throws an error, and only by using option /Fl to create a listing file I discovered this mess (you have to scroll right to see it in all its beauty):
0000CE56                        .elseif (al=="." && edx!=Mirror$("unti") && edx!=Mirror$("brea") && dx!="ne") || dword ptr [esi-1]==Mirror$("else") || word ptr [esi-1]=="fi"mov edx, esisub edx, txrg.lpstrTextmov dword ptr [esi-1], 9.if edx==3 && byte ptr [esi-3]==32mov dword ptr [esi-3], 9.endifinc esi.endif.if gks(VK_SHIFT)mov al, [esi-2].if al==32.if al==[esi-3]dec esidec esi.endif.elseif al==9dec esi.if !byte ptr [esi-3].if byte ptr [esi-2]!=9mov byte ptr [esi-1], " "mov byte ptr [esi], " "inc esiinc esi.endif.endif.elseif !almov byte ptr [esi-4], " "mov byte ptr [esi-3], " "dec esi.endif.endifmov byte ptr [esi-1], 0pop esidec esimov byte ptr [esi], 10dec txrg.chrg.cpMinExSetSel txrgcall GetDefColForceTextCol eaxinvoke SetSelFormat, 127, CFM_BACKCOLORinvoke RepSel, addr txrg, esixchg edx, Len(esi)sub edx, RetKeyDowndec edxadd PrevLen, edxmov eax, txrg.chrg.cpMincall SaveHistoryPT.endifClr RetKeyDownRefresh_ON.endif.elseif uMsg==WM_CHARmov eax, wParam.if eax==40 || eax==VK_SPACE || eax==VK_RETURN.if eax==VK_SPACE.if signed CharsPushPop>=3 && ExtAscand CharsPushPop, 0ForceTextCol Font_RED.endif.endifinc tpf.tfCtChgExGetSel  addr txrgmov edi, txrg.chrg.cpMinmov eax, edisub eax, txrg.chrg.cpMaxinc eaxbadsel: .if eax<=1lea esi, tmpBufferkmax=24mov eax, edisub eax, kmaxjs badselmov txrg.chrg.cpMin, eaxmov txrg.lpstrText, esiGetTextRangedeb 4, "gtr", eax, txrg.chrg.cpMin, txrg.chrg.cpMax, $esimov edi, esiadd edi, kmax-1xor ecx, ecxmov cpos, ecxmov setblue, ecxmov hascomment, ecxmov byte ptr [edi+2], cl.if wParam==VK_RETURNadd ecx, 32mov [edi], ecxxor ecx, ecx.endif.Repeatinc ecxdec edimov al, [edi].Until al<="," || al=="=" ; don't trigger My0left$(  or MyStruc.left$( mov fslen, ecxpush edixor ecx, ecxxor edx, edx.Repeatmov al, [edi].if al==9inc ecxadd al, 99.elseif al==" "inc edx.endifdec edi.Until al<=13mov cspaces, edxmov ctabs, ecxpop edimov byte ptr [edi], 10mov ecx, wParam.if ecx==VK_RETURNdec esidec txrg.chrg.cpMin.elsemov [esi+kmax], cl.endifFindString UserSC$, edi, 1deb 4, "FSA", $edi, $eax, $UserSC$.if !eaxFindString Keywords$, edi, 1deb 4, "FSB", $edi, $eax, $Keywords$:200.endifmov ecx, txrg.chrg.cpMin.if edx==OldPosKey && ecx==OldPosTextxor edx, edx.endif.if fslen<3 && wParam==VK_RETURNxor edx, edx.endif.if edxmov OldPosKey, edxmov OldPosText, ecxmov edx, edisub edx, esiinc edxadd txrg.chrg.cpMin, edxadd eax, esiadd eax, kmax+1sub eax, edimov cl, [eax].if cl==124push edimov esi, eaxinc esi.Repeatlodsb.if al==59or hascomment, -1.endif.if al=="|"m2m eax, 13stosbsub eax, 4mov ecx, ctabsdec ctabs.if byte ptr [esi]!=168mov edx, cspaces.if byte ptr [esi]==60inc esijmp noTabs.endifdec edx.if edx<8mov al, 32mov ecx, edx.endiftest ecx, ecxjl noTabsrep stosbinc ctabs.endif@@: stosbnoTabs: dec ecxjge @Blodsb.endifstosb.Until al==10 || al==13 || !aldec edi.if byte ptr [edi-1]==167inc setbluedec edi.endifmov byte ptr [edi], 0mov ecx, edipop edisub ecx, edidec ecxmov fslen, ecxmov esi, edi.Repeatlodsb.if al==168sub esi, edidec esimov cpos, esi.break.endifdec ecx.Until Sign?.else.if cl==167inc setblue.endifinc edi.if byte ptr [edi]>="a"and byte ptr [edi], 255-32.endifmov ecx, fslen.endifcrashjj 2003Refresh_OFFmov edx, fslen.if wParam==VK_RETURN && edx>2m2m dword ptr [edi+edx+2], 10.endifinvoke RepSel, addr txrg, edimov ecx, txrg.chrg.cpMinadd ecx, fslenmov txrg.chrg.cpMax, ecxcrashjj 2004.if setbluecrashjj 2005push   txrg.chrg.cpMinpush ecxxor ecx, ecx.Repeatinc ecxmov al, [edi+ecx].Until al<"A" || ecx>=fslenadd ecx, txrg.chrg.cpMinmov txrg.chrg.cpMax, ecxpush ecxExSetSel txrginvoke SetSelFormat, Font_BLUE, CFM_COLORmov ecx, fslensub ecx, 2.Repeatmov al, [edi+ecx]dec ecx.Until al<"A" || ecx<=0add ecx, txrg.chrg.cpMinpop eaxif 0cmp ecx, eaxjl @Finc hascommentje @Fmov eax, txrg.chrg.cpMinmov txrg.chrg.cpMin, ecxadd eax, fslenmov txrg.chrg.cpMax, eaxExSetSel txrginvoke SetSelFormat, Font_RED, CFM_COLOR@@:endifpop txrg.chrg.cpMaxpop txrg.chrg.cpMincrashjj 2006.endif.if cposmov ecx, txrg.chrg.cpMinadd ecx, cposmov txrg.chrg.cpMin, ecxinc ecxmov txrg.chrg.cpMax, ecx.elseinc txrg.chrg.cpMaxmrm txrg.chrg.cpMin, txrg.chrg.cpMax.endifcrashjj 2007ExSetSel txrgif 0.if wParam==VK_RETURNEatKey wParam.endifendifRefresh_ONcrashjj 2008pop edipop esijmp @RetNull.endif.endif.elseif eax=="p" || eax=="o" || eax=="u"  || eax=="s" || eax=="h"inc CharsPushPop.elseif eax==9crashjj 2090inc CharsTabExGetSel  addr txrgLineFromChar -1mov tabsL1, eaxCharFromLine eaxlea edx, [eax+3].if edx==txrg.chrg.cpMinmov txrg.chrg.cpMin, eax.endifLineFromChar txrg.chrg.cpMaxsub eax, tabsL1.if signed eax>0mov tabslc, eaxxor ebx, ebxmov CharsTab, ebx.if gks(VK_SHIFT)inc ebx.endiflea esi, tmpBuffermov txrg.lpstrText, esipush txrg.chrg.cpMinmov edi, txrg.chrg.cpMaxRefresh_OFFinc tpf.tfCtChg.Repeatdec tabslc.Break .if Sign?mov eax, tabslcadd eax, tabsL1CharFromLine eaxmov txrg.chrg.cpMin, eaxinc eaxinc eaxadd eax, ebxmov txrg.chrg.cpMax, eaxExSetSel txrgsm hRE_sub, EM_GETTEXTRANGE, 0, addr txrginvoke GetKeyState, VK_RSHIFTmov rShift, eaxsub txrg.chrg.cpMax, 2movzx eax, word ptr [esi]mov edx, offset tx9or ecx, -1.if ebxinc ecx.if al==9 || al==32 || sword ptr rShift<0dec ecxmov edx, chr$("  ").if ah<=32 || al!=9dec edidec edimov edx, offset Null$.endif.endif.elseif al==32dec edi.if ah==32dec ecxdec edi.endifsub txrg.chrg.cpMax, ecx.endifjecxzero @Finc ediinvoke RepSel, addr txrg, edx@@:.Until 0pop txrg.chrg.cpMinmov txrg.chrg.cpMax, edicrashjj 2091ExSetSel txrgRefresh_ONcrashjj 2092jmp @RetNull.endif.elseif eax==59crashjj 2099dec CharsTab.if !Sign?ForceTextCol Font_GREY.endif.elseand CharsPushPop, 0and CharsTab, 0.endifcrashjj 2100.elseif uMsg==WM_LBUTTONDBLCLKcall GetTickCountmov LbDbl, eax.elseif uMsg==WM_LBUTTONDOWN.if IsRgnMove==2inc IsRgnMovecall DrawPicsClr IsRgnMove.elseif IsRegn.if IsRgnMove==0mov eax, Img2Movemov ecx, [pImages+4*eax].if ecx && !gks(VK_SHIFT)mov IsRgnMove, ecxSetMyTimer TimerRegion, 100, CbTimer.elsemovzx edx, word ptr lParammov RgnDx0, edxmovzx eax, word ptr lParam+2mov RgnDy0, eaxinc IsRgnMove.endifjmp @RetNull.endif.endif.elseif uMsg==WM_LBUTTONUPmov eax, IsAnchor.if eaxmov ecx, eaxClr IsAnchorand eax, 00ffffffhshr ecx, 24add ecx, eaxSetSelB eax, ecx, distBM, anchor.endif.if IsRgnMove==1inc IsRgnMove.endif.endifDWP2:invoke CallWindowProc, opRE, hwnd, uMsg, wParam, lParam@RetEax:ret@RetNull:xor eax, eaxjmp @RetEaxCtrlF12Win:.if uMsg==WM_SIZINGinc yPosAuxBlock.elseif uMsg==WM_SIZEcall @Fand fgPipeShow, 0.elseif uMsg==WM_CAPTURECHANGED.if yPosAuxBlockcall @Fcall AdjRichEditm2m yPosAuxBlock, -123.endif.endifjmp DWP2@@:invoke GetClientRect, hwnd, addr rcmovlps xmm0, qword ptr rc.rightmovlps qword ptr xPosAux, xmm0add yPosAux, 3retnSubRE endpSaveUserPrefs procOpen "O", #7, UserPref$.if eax==INVALID_HANDLE_VALUEMsgBox 0, "Can't save", "Hi", MB_OK.elseStore #7, userprefsClose #7.endif@MbRetSaveUserPrefs endpDelRegions procpush esimov esi, offset hRegions.While NumRegionsinvoke DeleteObject, [esi] add esi, 4dec NumRegions.Endwpop esi@MbRetDelRegions endp; deb 4, "SubRE out", xPosAux, yPosAux
                           tmp_file.asm(13827) : error A2157: missing right parenthesis


Yep, there might be a right parenthesis missing... kind of :rolleyes:

Trial & error helps: when transforming the edx!=Mirror$("unti") into the simpler edx!="itnu", the error disappears.

HSE

Quote from: jj2007 on February 27, 2022, 08:02:14 PM
It seems that Nidud has left for good, he deletes all his posts, apparently.

Look like Nidud have becoming "incompatible"  :sad:

I hope he will write AsmC book, because all references are gone.
Equations in Assembly: SmplMath

jj2007

Yes, indeed. I had my little fights with him over the use of Utf-8, but that was just fun. He is a very good coder, and an asset for this forum. It would be sad if he left.

In the meantime, I've adapted the StackWalk macro, and it's now online with the MasmBasic package of 28 February. Check above how to use profiling in your own non-MasmBasic projects.

LiaoMi

Quote from: jj2007 on February 28, 2022, 11:27:54 AM
Apparently the whole AsmC sub-board has disappeared. Bad luck...

Hi,

what happened?! What are the reasons? Questions and bugs can always be posted here - https://github.com/nidud/asmc/issues

hutch--

Nidud deleted all of his postings and made a mess of the sub forum so I deleted it.

jj2007

More fun with the StackWalk macro: it can tell you now which OS DLL called your proggie. _User32 is typically a callback like WndProc, _Kernel32 could be the start of a thread, see MbSwShow below.

Besides, there is now a watch function for strings and DWORD variable. If, for example, you wonder who changes your string buffer "FindBuf", just insert on top of the source MbWatch equ offset FindBuf, and from then on every proc that writes to this buffer will be reported in the console. Isn't that fun? StackWalk adds, in the case of RichMasm, about 7.5% to the exe, apparently without any significant performance loss (and it can be easily switched off - no code generated).

I wonder if anybody has a major source to test this feature. Some reasonably complex GUI application, with well over 1000 lines? StackWalk works fine with the RichMasm source (23,000+ lines), see attachment. To see the output, drag the *.asc over RichMasm.exe, play a little bit, then hit F9, hold Ctrl and click NO.

#count  proc          ...called by      args
1621    SubRE         _User32           01272210  00000047  00000000  0018FA88
1622    GetCS8        Copy2Pinboard     0018FB5C  00000008  00000000
1624    SubRE         _User32           01370282  0000000B  00000000  00000000
1626    SaveSel       GetCS8
1631    SubRE         _User32           01370282  0000045E  0018FA98  00449FBC
1633    RestoreSel    GetCS8
1634    SetSelFromB   RestoreSel        0000025A  0000025A  00000011
1638    SubRE         _User32           01370282  00000437  00000000  0018FA54
1639    SetMyTimerP   SetSelFromB       FFFFFF8D  00000001
1640    SubRE         _User32           01370282  000000D6  0018FB64  0000025A
1641    GetRtfStream  Copy2Pinboard     00008003
1648    SubRE         _User32           01272210  000004E1  0000006E  00000064
1649    AdjRichEdit   Copy2Pinboard
1657    SubRE         _User32           01370282  0000000B  FFFFFFFF  00000000
1659    RestoreSel    Copy2Pinboard
1660    WndProc       _User32           01161FB8  00000138  D401017C  01D703A8
1661    SetSelFromB   RestoreSel        0000025A  0000025A  00000011
1665    SubRE         _User32           01370282  00000437  00000000  0018FA84
1666    SetMyTimerP   SetSelFromB       00000040  005919CC  04000004  01370282
1688    DrawPics      SubRE
1691    SubRE         _User32           01370282  000000D7  00000000  0018F978
1692    DelRegions    DrawPics
1693    SetLcBufferP  DrawPics
1694    gtbP          SetLcBufferP
1698    SubRE         _User32           01370282  0000044B  00000000  0018F98C
1699    GetNum        DrawPics          00000002  0000000B
1700    GetNum        DrawPics          00000001  00000004
1701    SubRE         _User32           01370282  000000D6  0018F9A0  000002E6
1702    GetNum        DrawPics          00000002  00000005
1703    GetNum        DrawPics          00000002  00000007
1704    GetNum        DrawPics          00000002  00000009
1709    CbTimer       _User32           01161FB8  00000113  FFFFFF81  D8DBF0A8
1710    DrawBookmark  CbTimer
1723    WndProc       _User32           01161FB8  00000111  0001007F  00000000
1724    MyTest        WndProc
1727    WndProc       _User32           01161FB8  00000006  00000000  02032248
1728    SetTTP        WndProc
1729    ClosePopup    WndProc
1730    DispMenu      ClosePopup
1731    SetTTP        ClosePopup
1792    WndProcMenu   _User32           00F52278  0000000F  00000000  00000000
1793    MbSwShow      _Kernel32         01161FB8
1798    TextOutPlus   WndProcMenu       00000040  00591DC4  00000108  00000003  00591DC4  00000006
1810    WndProcMenu   _User32           00F52278  0000000F  00000000  00000000
1815    TextOutPlus   WndProcMenu       00000040  00591E44  00000108  00000003  00591E44  00000006
1816    CbTimer       _User32           01161FB8  00000113  FFFFFF81  D8DBFD74
1817    DispMenu      CbTimer
1829    WndProcMenu   _User32           00F52278  0000000F  00000000  00000000
1834    TextOutPlus   WndProcMenu       00000040  00593C80  00000108  00000003  00593C80  00000006
1836    SubRE         _User32           01370282  00000100  00000011  401D0001

Name              #called   total time   per call
MyTest                  1      1956 ms    1956 ms
WndProc               616       734 ms    1191 us
CbTimer                39       211 ms    5402 us
FindBookmark            3        94 ms      31 ms
CrtRichEdit             1        72 ms      71 ms
cwex                  414        58 ms     141 us
InitBMetc               1        15 ms      14 ms
ReOpenFile              1        12 ms      11 ms
WndProcMenu            55        12 ms     223 us
SubMenuWin             40        12 ms     305 us
TextOutPlus            30         8 ms     267 us
SubRE                 243         8 ms      35 us
UserMenus               1         8 ms    8341 us
DrawPics                4         6 ms    1494 us
LoadBubbles             1         6 ms    5825 us
SubList               104         4 ms      36 us
## Hit any key. Ctrl C may cause data loss. Press 'h' to hide the console, 'r' to reset ##
## offset FindBuf changed by StartFind       -> macro.AC..
## offset FindBuf changed by StartFind       -> Version...
## offset FindBuf changed by StartFind       -> EXITM.n...

jj2007

I can't find the thread where we argued about how to exit (localbytes vs parmbytes) a PROLOGUE macro, but I just found this and would like to share it here:
Q94942: DOCERR: Value Returned By Prologue Macro Documented Incorrectly

QuoteAn application developed with the Microsoft Macro Assembler (MASM) can define a
macro that controls the code generated for the procedure prologue and epilogue.
The prologue macro must return a value.

Page 206 of the Microsoft Macro Assembler "Programmer's Guide" for MASM 6.0, and
page 202 of the same manual for MASM 6.1 includes the following statement
regarding the return value for the prologue macro:

  Your macro function must return the parmbytes parameter.

This statement is incorrect and should be corrected to read as follows:

  Your macro function should return the localbytes parameter.

The sample prologue macro provided on page 207 for MASM 6.0 and page 203 for MASM
6.1 also contains an error because it returns the %bytecount parameter instead
of the %numlocals parameter. Correct the EXITM statement in the macro to read as
follows:

  EXITM %numlocals


jj2007

The profiling stuff is now integrated in MasmBasic version 3 April 22. Here is an example on how to use StackWalk and SetWatch in a non-MasmBasic project.