oh - and REP STOSD may still not be the fastest way to 0 the memory
It is, it is, at least for large buffers and for most CPUs - that's pretty obvious
we still have an issue to deal with, as far as timing tests:
once the stack space has been committed the first time, it's already committed on any successive pass
if you really want to know how many cycles it takes, you have to force the OS to "un-commit" before the next timing pass
Using StackBuffer will happen somewhere between "proc" and "endp". There are two extreme cases:
1. You use this proc once - then the handful of nanoseconds lost in committing will not matter.
2. You use this proc a million times - then you will not want the OS to uncommit and re-commit that stack space every time you call the proc.
So in effect the timings are extremely valid as they are...
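If you want to see what the first pass costs compared with the later ones, one option is to time each pass separately instead of forcing an un-commit. Below is a rough sketch in C, not the StackBuffer code itself: `time_zero_passes` is a made-up helper, and a `malloc` block stands in for the stack buffer, so the first-touch behaviour you see depends on your OS and C runtime.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not from the thread): zero `size` bytes `passes`
   times and store the duration of each pass, in seconds, in `seconds`.
   Pass 0 includes whatever first-touch cost the OS charges; later
   passes run on memory that has already been touched.
   Returns 0 on success, -1 on bad arguments or allocation failure. */
static int time_zero_passes(size_t size, int passes, double *seconds)
{
    if (size == 0 || passes <= 0 || seconds == NULL)
        return -1;

    /* Assumption: a heap block as stand-in for the stack buffer. */
    unsigned char *buf = malloc(size);
    if (buf == NULL)
        return -1;

    for (int i = 0; i < passes; i++) {
        clock_t t0 = clock();
        memset(buf, 0, size);
        seconds[i] = (double)(clock() - t0) / CLOCKS_PER_SEC;
    }

    /* Sanity check: the last byte must read back as zero. */
    int ok = (buf[size - 1] == 0);
    free(buf);
    return ok ? 0 : -1;
}
```

Call it with a buffer of a few hundred MB and, say, 5 passes, then compare `seconds[0]` with the rest. `clock()` is coarse, so small buffers will mostly report 0; for cycle counts you would still want the usual timing macros.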
try not to REP STOSD with ECX = 0 :lol:
See source:
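add ecx, 3 ; bufsize might be badly aligned: round up to the next multiple of 4
shr ecx, 2 ; stosD: convert bytes to dwords
xor eax, eax ; fill value = 0
rep stosd ; store eax at [edi], ecx times (edi must already point at the buffer)
For ecx=0, rep stosd does absolutely nothing... caution, though, passing negative buffer sizes might result in unexpected behaviour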
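To make the "negative buffer size" caveat concrete, here is a small C model of what `add ecx, 3` / `shr ecx, 2` leaves in ecx. It is only a sketch of the arithmetic: `dword_count` is a name made up for illustration, and it assumes ecx is treated as a plain 32-bit register with wraparound and a logical (unsigned) shift.

```c
#include <stdint.h>

/* Hypothetical model of "add ecx, 3 / shr ecx, 2":
   round a byte count up to whole dwords, using 32-bit wraparound
   and an unsigned (logical) right shift, as the CPU does. */
static uint32_t dword_count(uint32_t bufsize)
{
    uint32_t ecx = (uint32_t)(bufsize + 3u);  /* add ecx, 3 (may wrap) */
    return ecx >> 2;                          /* shr ecx, 2 */
}
```

With this model, sizes 0, 1, 4 and 5 give 0, 1, 1 and 2 dwords. A "negative" size of -1 wraps around to 2 before the shift and ends up as 0 dwords, so nothing is written at all, while -4 becomes 0x3FFFFFFF dwords, i.e. an attempt to zero almost 4 GB.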
