Timings for bswap, ror, imul, push+pop vs mov [esp+x], nnn, lodsd vs mov eax,..

Started by jj2007, November 16, 2021, 08:57:25 AM

jj2007

Quote from: FORTRANS on December 03, 2021, 12:51:20 AM
  Well, the idea was that Intel had improved LODSB in some of their CPUs.

Hi Steve,

I haven't tested lodsb, but lodsd is clearly a fast instruction on recent Intel CPUs, in comparison with the expanded mov eax, [esi] plus add esi, 4. What is striking, though, is that rep lodsd is often slower than the equivalent loop.
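For readers following along, here is a rough sketch of the variants being compared, in MASM syntax. It is a fragment rather than the actual test code from this thread; `count` is a hypothetical dword count, assumed to be greater than zero, and the direction flag is assumed clear.

```asm
; Assumptions: esi points to a dword array, DF = 0 (cld),
; count is a hypothetical constant or variable > 0.

; Variant A: single string instruction
    lodsd                   ; eax = [esi], then esi += 4

; Variant B: the expanded form
    mov eax, [esi]          ; load the dword
    add esi, 4              ; advance the pointer (modifies flags, lodsd does not)

; Variant C: rep lodsd, only the last dword read remains in eax
    mov ecx, count
    rep lodsd

; Variant D: the equivalent explicit loop
    mov ecx, count
@@:
    lodsd
    dec ecx
    jnz @B
```

Note that C and D only behave the same for a non-zero count: rep does nothing when ecx is zero, while the dec/jnz loop would wrap around.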

FORTRANS

Hi,

   Memory error on my part.  It is the faster REP MOVS/STOS,
not LODS, that can be identified with CPUID.  REP LODS is a
bit useless anyway, since each iteration overwrites the
previously loaded value.  Oh well, maybe next time.
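   A minimal sketch of such a CPUID check, assuming the feature
meant here is what Intel documents as ERMSB (Enhanced REP
MOVSB/STOSB), reported in leaf 7, sub-leaf 0, EBX bit 9.  The
labels HaveErms and NoErms are made up for illustration.

```asm
    push ebx                ; cpuid clobbers ebx, which callers expect preserved
    xor eax, eax            ; leaf 0: eax returns the highest supported basic leaf
    cpuid
    cmp eax, 7
    jb NoErms               ; leaf 7 not available on this CPU
    mov eax, 7              ; leaf 7: structured extended feature flags
    xor ecx, ecx            ; sub-leaf 0
    cpuid
    bt ebx, 9               ; EBX bit 9 = ERMSB
    jnc NoErms              ; carry clear: feature not reported
HaveErms:
    pop ebx
    ; ... path that relies on fast rep movsb/stosb ...
    jmp Done
NoErms:
    pop ebx
    ; ... fallback path ...
Done:
```

There is no comparable flag for LODS.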

Regards,

Steve N.