Here is a fairly straightforward example of how to time your code. Source and exe attached - you need the Masm64 SDK (https://masm32.com/board/index.php?topic=10052.0) to build it.
It compares two instructions, using two different but equivalent settings:
a) 1024000 tests of 1024 mov... instructions
b) 1024000/2 tests of 1024*2 of the same mov... instructions
1000 mega iterations, 1024 instructions
284 megacycles for mov eax,123
276 megacycles for mov rax,123
284 megacycles for mov eax,123
292 megacycles for mov rax,123
278 megacycles for mov eax,123
277 megacycles for mov rax,123
1000 mega iterations, 2048 instructions
277 megacycles for mov eax,123
324 megacycles for mov rax,123
279 megacycles for mov eax,123
330 megacycles for mov rax,123
274 megacycles for mov eax,123
327 megacycles for mov rax,123
Here is the source. On top is a macro that...
- with no args, i.e. tCycles, loads the initial cycle count into rbx
- with a string arg, i.e. tCycles mov eax, 123, subtracts rbx from the final count and displays the difference
include \masm64\include64\masm64rt.inc ; *** standard Masm64 SDK code ***
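tCycles MACRO arg:VARARG
  rdtsc                 ; read the time-stamp counter into edx:eax
  shl rdx, 32           ; move the high 32 bits into place
  or rax, rdx           ; rax = full 64-bit cycle count
  ifb <arg>             ; no argument: start of a measurement
    mov rbx, rax        ; remember the start count in rbx
  else                  ; argument given: end of a measurement
    sub rax, rbx        ; elapsed cycles since the start count
    sar rax, 20         ; divide by 2^20, i.e. report "megacycles"
    invoke __imp__cprintf, cfm$("%i megacycles for &arg&\n"), rax
  endif
ENDM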
.code
entry_point proc
instructions=1024
tests=1024000
invoke __imp__cprintf, cfm$("%i mega iterations, %i instructions\n"), instructions*tests/1048576, instructions
REPEAT 3
tCycles
xor ecx, ecx
align 4
@@: REPEAT instructions
mov eax, 123456789 ; 5 bytes
ENDM
inc ecx
cmp ecx, tests
jnz @B
tCycles mov eax, 123 ; end of test, print "xx megacycles for mov eax, 123"
tCycles
xor ecx, ecx
align 4
@@: REPEAT instructions
mov rax, 123456789 ; 7 bytes
ENDM
inc ecx
cmp ecx, tests
jnz @B
tCycles mov rax, 123
invoke __imp__cprintf, cfm$("\n")
ENDM
instructions=instructions*2
tests=tests/2
invoke __imp__cprintf, cfm$("%i mega iterations, %i instructions\n"), instructions*tests/1048576, instructions
REPEAT 3
tCycles
xor ecx, ecx
align 4
@@: REPEAT instructions
mov eax, 123456789 ; 5 bytes
ENDM
inc ecx
cmp ecx, tests
jnz @B
tCycles mov eax, 123 ; end of test, print "xx megacycles for mov eax, 123"
tCycles
xor ecx, ecx
align 4
@@: REPEAT instructions
mov rax, 123456789 ; 7 bytes
ENDM
inc ecx
cmp ecx, tests
jnz @B
tCycles mov rax, 123
invoke __imp__cprintf, cfm$("\n")
ENDM
invoke __imp_MessageBoxA, 0, chr$("Now guess why the second run is slower for mov rax"), chr$("Mysteries of the cpu:"), MB_OK
invoke ExitProcess, 0 ; terminate process
entry_point endp
end
Now the question is obviously, "why is the second run slower for mov rax, 123?"
The answer is simple, but you won't find it on the Internet ;-)
Interesting. However, I wonder if you'd be so kind as to maybe post some code that doesn't include fancy macros and other baggage, for those among us, like myself, who are macro-averse.
I see that the most important part of this is simply the RDTSC instruction, which reads the current time-stamp counter. So this could be vastly simplified to make it more understandable, I think.
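For reference, here is a minimal sketch of the first test with the tCycles macro expanded by hand. It assumes the same Masm64 SDK environment as the original post and reuses only constructs that appear there (the masm64rt.inc include, entry_point, invoke, cfm$, REPEAT). It is not the author's posted code, and the SDK's invoke and cfm$ helpers are still macros, so this removes only the timing macro, not every macro.

```asm
include \masm64\include64\masm64rt.inc  ; same Masm64 SDK include as the original post

.code
entry_point proc
  ; --- start of measurement (what "tCycles" with no argument expands to) ---
  rdtsc                     ; edx:eax = current time-stamp counter
  shl rdx, 32               ; move the high 32 bits into place
  or rax, rdx               ; rax = full 64-bit cycle count
  mov rbx, rax              ; keep the start count in rbx

  xor ecx, ecx              ; loop counter
  align 4
@@: REPEAT 1024             ; assemble-time repeat: emits 1024 copies of the instruction
  mov eax, 123456789        ; instruction under test, 5 bytes
  ENDM
  inc ecx
  cmp ecx, 1024000          ; number of passes over the unrolled block
  jnz @B

  ; --- end of measurement (what "tCycles mov eax, 123" expands to) ---
  rdtsc
  shl rdx, 32
  or rax, rdx
  sub rax, rbx              ; elapsed cycles
  sar rax, 20               ; divide by 2^20 to report "megacycles"
  invoke __imp__cprintf, cfm$("%i megacycles for mov eax, 123456789\n"), rax

  invoke ExitProcess, 0     ; terminate process
entry_point endp
end
```

Run it from a console window so the output stays visible; unlike the original, this sketch has no MessageBox to keep the window open. To time mov rax instead, change the instruction inside the REPEAT block and the text in the format string.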
Folks,
I am really surprised that my post has been moved. I wrote a very, very simple program showing how to time code. It's 100 times simpler than some other recent stuff I've seen in the Campus. Hutch would never have moved such simple stuff away from the Campus.
And yes, it does contain a macro. N00bs working with a macro assembler should see what they are good for - and here they have a very clear and simple function. If you feel challenged by 10 lines of a simple macro: NASM and FASM are good bare metal assemblers. No need for MASM.
Quote from: NoCforMe on August 25, 2023, 05:56:31 AM
I see that the most important part of this is simply the RDTSC instruction
No.
Quote
Folks,
I am really surprised that my post has been moved.
Everyone who has been following this saga can clearly see that this topic is part of your feud with mineiro from here: https://masm32.com/board/index.php?topic=11176.0 which was split off from a topic within The Campus: https://masm32.com/board/index.php?topic=11165.0
mineiro's topic, testp (https://masm32.com/board/index.php?topic=11173.0), is also part of that feud, and I moved that one from the Campus to the Laboratory as well.
Word your comments about the move any way you like, but it does not change what has already transpired in the Campus between yourself and mineiro. Moving these topics was done to curtail the continuation of that feud within the Campus.
I have made an effort to provide a simple example, with source, and using Hutch's Masm64 SDK, of how to time code. It was meant for the Campus. I am sorry that it didn't find the approval of Hutch 2.0 :cool:
Quote from: jj2007 on August 24, 2023, 11:26:55 PM
Now the question is obviously, "why is the second run slower for mov rax, 123?"
It is interesting that nobody seems interested in finding out or discussing why mov rax, under certain conditions, is slower than mov eax.
Well, whilst obviously Not being a Programmer, (and with the Greatest of respect), may I suggest that even as a mere 'observer', perhaps the Campus is NOT the Correct place to have 'Placed' your Code example, no matter How useful and/or correct it may be.
One could argue the same 'Usefulness', about Many Code fragments and examples placed throughout this Forum.
From the 'Descriptor' for the Campus:
Quote
The Campus
A protected forum where programmers learning assembler can ask questions in a sensible and safe atmosphere without being harassed or insulted. This is also targeted at experienced programmers from other languages learning assembler that don't want to be treated like kids.
I would suggest the Campus, is NOT the Correct Board to (in effect), 'Present' Code or Programs. Perhaps, in reply to a Question or request etc but NOT 'In the First Instance'.
I have had a cursory look here and on the Old UK site for examples of where this may have occurred (and am Not suggesting there aren't Any), but have not actually found any.
As for the Code you have 'Presented', and specifically from its Title, "Timing your code: what's faster", it would seem to Me to be well placed in (or moved to) the Laboratory.
Quote
The Laboratory
This is the place to post assembler algorithms and code design for discussion, optimization and any other improvements that can be made on it. Post code here to be beaten to death to make it better, smaller, faster or more powerful.
Even as a Non Skilled Newcomer here, I am somewhat intrigued by much of the recent activity (not only by yourself) that has occurred in the Campus, and can only suggest (from my own understanding, let alone what others have suggested) that it would NOT have been tolerated by Hutch.
Please understand I don't want to offend or upset anyone here, but am simply offering MY observations :smiley:
Quote from: stoo23 on August 25, 2023, 04:01:53 PM
I would suggest the Campus, is NOT the Correct Board to (in effect), 'Present' Code or Programs. Perhaps, in reply to a Question or request etc but NOT 'In the First Instance'.
Point taken. Hutch most probably would not have moved a simple "teaching" example, but your logic is correct, Stewart.
:smiley: Thanks for your understanding JJ' :wink2: :thumbsup: