New timing macros

Started by jj2007, May 16, 2022, 09:42:21 AM

jj2007

Quote from: felipe on May 24, 2022, 07:59:48 AM
if you can see the program running fast then you finished the job.

Sure, but between starting the work and finishing it, weeks or years may pass. In the meantime, you might want to know which instructions are the fastest in the 100+ innermost loops of your program. For example, are inc, add and lea any different? Should you prefer one of them? Does one of them really suck on AMD CPUs, or on the cheap Celerons that your company uses?

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
once:
0        Cycles for inc eax
0        Cycles for add eax,1
0        Cycles for lea eax,[eax+1]
0        Cycles for inc eax
0        Cycles for add eax,1
0        Cycles for lea eax,[eax+1]
0        Cycles for inc eax
0        Cycles for add eax,1
0        Cycles for lea eax,[eax+1]
0        Cycles for inc eax
0        Cycles for add eax,1
0        Cycles for lea eax,[eax+1]

100x:
87       Cycles for 100*inc eax
96       Cycles for 100*add eax,1
84       Cycles for 100*lea eax,[eax+1]
84       Cycles for 100*inc eax
81       Cycles for 100*add eax,1
83       Cycles for 100*lea eax,[eax+1]
84       Cycles for 100*inc eax
85       Cycles for 100*add eax,1
84       Cycles for 100*lea eax,[eax+1]
84       Cycles for 100*inc eax
78       Cycles for 100*add eax,1
84       Cycles for 100*lea eax,[eax+1]
1 bytes for inc eax
3 bytes for add eax, 1
3 bytes for lea eax, [eax+1]
100 bytes for 100*inc eax
300 bytes for 100*add eax, 1
300 bytes for 100*lea eax, [eax+1]
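
For anyone who wants to try the same kind of comparison without the timing macros, here is a rough sketch in C. It is not the macro code that produced the output above, just one possible way to do it: time 100 dependent copies of an instruction, repeat the measurement, keep the smallest reading, and subtract the cost of an empty call. All names (ticks, min_ticks, net_ticks, inc_x100 and so on) are made up for this illustration. It assumes GCC or Clang inline assembly on x86, reads the time stamp counter without serializing, and therefore reports reference ticks rather than exact core cycles.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>

/* x86: read the time stamp counter (reference ticks, not serialized). */
static uint64_t ticks(void) { return __rdtsc(); }

/* 100 dependent copies of one instruction, emitted with .rept. */
static uint32_t inc_x100(uint32_t x) {
    __asm__ volatile(".rept 100\n\tincl %0\n\t.endr" : "+r"(x) : : "cc");
    return x;
}
static uint32_t add_x100(uint32_t x) {
    __asm__ volatile(".rept 100\n\taddl $1, %0\n\t.endr" : "+r"(x) : : "cc");
    return x;
}
static uint32_t lea_x100(uint32_t x) {
    __asm__ volatile(".rept 100\n\tleal 1(%0), %0\n\t.endr" : "+r"(x));
    return x;
}
#else
#include <time.h>

/* Elsewhere: nanoseconds from a monotonic clock instead of cycles. */
static uint64_t ticks(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Non-x86 fallback: plain C increments, only so the sketch still builds. */
static uint32_t inc_x100(uint32_t x) { for (int i = 0; i < 100; i++) x += 1; return x; }
static uint32_t add_x100(uint32_t x) { return inc_x100(x); }
static uint32_t lea_x100(uint32_t x) { return inc_x100(x); }
#endif

typedef uint32_t (*kernel_fn)(uint32_t);

/* Does nothing; used to estimate the cost of the call and the two reads. */
static uint32_t empty_kernel(uint32_t x) { return x; }

/* Smallest tick count seen over `runs` timed calls of fn. */
static uint64_t min_ticks(kernel_fn fn, int runs) {
    assert(runs > 0);
    uint64_t best = UINT64_MAX;
    volatile uint32_t sink = 0;   /* keeps the calls from being optimized away */
    for (int i = 0; i < runs; i++) {
        uint64_t t0 = ticks();
        sink = fn(sink);
        uint64_t t1 = ticks();
        if (t1 - t0 < best) best = t1 - t0;
    }
    return best;
}

/* Best time for fn minus the best time for the empty kernel, floored at 0. */
static uint64_t net_ticks(kernel_fn fn, int runs) {
    uint64_t base = min_ticks(empty_kernel, runs);
    uint64_t raw = min_ticks(fn, runs);
    return raw > base ? raw - base : 0;
}
```

Calling net_ticks(inc_x100, 1000), net_ticks(add_x100, 1000) and net_ticks(lea_x100, 1000) from a small main and printing the results gives three numbers that play the same role as the 100x rows above. Absolute values will differ with CPU, clock scaling and compiler, so only the relative ordering on one machine is worth comparing.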

felipe

Actually, that's a good point, jj. You will need to test during development. How exactly you test performance will, I suppose, always depend (at least) on the type of program you are writing.