Title: Processor differences with different instructions.
Post by: hutch-- on February 12, 2018, 11:34:52 AM

One of the sobering facts of writing code for x86 hardware is that the performance of any given procedure varies from one processor to another. On the Haswell I am using, I just ran some speed tests on a simple byte copy, and "rep movsb" is the fastest on small memory copy tasks. On very large data the SSE and AVX versions are faster but, interestingly enough, the historical hybrid of "rep movsd" with "rep movsb" is actually slower on all data sizes.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

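NOSTACKFRAME

bcopy proc

  ; rcx = src
  ; rdx = dst
  ; r8  = count in bytes

    mov r11, rsi            ; preserve non-volatile rsi and rdi
    mov r10, rdi            ; in volatile scratch registers

    mov rsi, rcx            ; source pointer
    mov rdi, rdx            ; destination pointer
    mov rcx, r8             ; byte count for the rep prefix

    rep movsb               ; copy rcx bytes from [rsi] to [rdi]

    mov rsi, r11            ; restore rsi and rdi
    mov rdi, r10

    ret

bcopy endp

STACKFRAME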

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

Title: Re: Processor differences with different instructions.
Post by: jj2007 on February 12, 2018, 11:39:33 AM

What about the "native" size for x64, rep movsq?
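
For reference, a minimal sketch of what such a qword variant might look like, using the same register convention as bcopy above. The name "qcopy" is hypothetical and not part of the SDK, the code has not been timed here, and it assumes the direction flag is clear on entry (as the Win64 calling convention requires) and that the buffers do not overlap.

```asm
NOSTACKFRAME

qcopy proc                  ; hypothetical name, for illustration only

  ; rcx = src
  ; rdx = dst
  ; r8  = count in bytes

    mov r11, rsi            ; preserve non-volatile rsi and rdi
    mov r10, rdi

    mov rsi, rcx            ; source pointer
    mov rdi, rdx            ; destination pointer

    mov rcx, r8
    shr rcx, 3              ; number of whole qwords
    rep movsq               ; copy 8 bytes per iteration

    mov rcx, r8
    and rcx, 7              ; remaining 0 to 7 bytes
    rep movsb               ; copy the tail

    mov rsi, r11            ; restore rsi and rdi
    mov rdi, r10

    ret

qcopy endp

STACKFRAME
```

The "rep movsb" tail is needed because the byte count is not guaranteed to be a multiple of 8. Whether this beats plain "rep movsb" will depend on the processor, so it would need to be timed on each target.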

Title: Re: Processor differences with different instructions.
Post by: hutch-- on February 12, 2018, 02:35:15 PM

It's worth a try; I will have to do one at some stage. Long ago Intel made special provisions for the old REP MOVS instructions and they were always reasonably fast, but this stuff tends to vary from processor to processor. These are the timings I am getting with the different instructions.

bcopy 7141
mcopy 7281
xmmcopya 5375
ymmcopya 5407
Press any key to continue...

The MASM Forum

Microsoft 64 bit MASM => MASM64 SDK => Topic started by: hutch-- on February 12, 2018, 11:34:52 AM