Hmmmm... I am getting similar results with NOTHING and a single NOP, using the code from hutch's last posting here.

Back to the drawing board....

```
  2418 nothing
  2714 1 single nop
  2481 nothing
  2589 1 single nop
  2528 nothing
  2823 1 single nop
  2153 nothing
  2293 1 single nop
  2075 nothing
  2231 1 single nop
  2168 nothing
  2418 1 single nop
  2496 nothing
  2699 1 single nop
  2746 nothing
  3074 1 single nop

  Results

  2383 nothing average
  2605 single nop average

Press any key to continue...
```
```
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include64\masm64rt.inc

    .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    call testproc

    waitkey
    invoke ExitProcess,0

    ret

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    iterations equ <1000000000>

testproc proc

    LOCAL mem1  :QWORD
    LOCAL mem2  :QWORD
    LOCAL mem3  :QWORD
    LOCAL mem4  :QWORD
    LOCAL mem5  :QWORD
    LOCAL mem6  :QWORD
    LOCAL mem7  :QWORD

    LOCAL cnt1  :QWORD
    LOCAL cnt2  :QWORD
    LOCAL time  :QWORD
    LOCAL rslt1 :QWORD
    LOCAL rslt2 :QWORD

    USING rsi, rdi, rbx, r12, r13, r14, r15

    SaveRegs

    HighPriority

    mov rslt1, 0
    mov rslt2, 0
    mov cnt2, 8

  loopstart:

  ; ------------------------------------

    cpuid
    call GetTickCount
    mov time, rax
    mov cnt1, iterations
  @@:
    sub cnt1, 1
    jnz @B
    call GetTickCount
    sub rax, time
    add rslt1, rax
    conout "  ",str$(rax)," nothing",lf

  ; ------------------------------------

    cpuid
    call GetTickCount
    mov time, rax
    mov cnt1, iterations
  @@:
    nop
    sub cnt1, 1
    jnz @B
    call GetTickCount
    sub rax, time
    add rslt2, rax
    conout "  ",str$(rax)," 1 single nop",lf

  ; ------------------------------------

    sub cnt2, 1
    jnz loopstart

    shr rslt1, 3
    shr rslt2, 3

    conout lf,"  Results",lf,lf
    conout "  ",str$(rslt1)," nothing average",lf
    conout "  ",str$(rslt2)," single nop average",lf,lf

    NormalPriority

    RestoreRegs

    ret

testproc endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    end
```
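For scale, dividing the two averages by the 1,000,000,000 passes set in `iterations` gives a rough per-iteration cost. This is plain arithmetic on the posted figures, nothing measured separately:

```latex
\begin{aligned}
t_{\text{nothing}} &= \frac{2383\ \text{ms}}{10^{9}} = 2.383\ \text{ns per iteration} \\
t_{\text{nop}}     &= \frac{2605\ \text{ms}}{10^{9}} = 2.605\ \text{ns per iteration} \\
\Delta t           &= \frac{(2605 - 2383)\ \text{ms}}{10^{9}} = 0.222\ \text{ns per iteration}
                      \approx 9.3\%\ \text{of } t_{\text{nothing}}
\end{aligned}
```

The NOP loop was slower in all eight pairs, but the gap within a pair (108 to 328 ms) is smaller than the run-to-run spread of the empty loop on its own (2075 to 2746 ms).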
It seems the timing code itself is taking up most of the time, or else the reg/mem moves are as fast as nothing or a single NOP.

Later I tried 100 NOPs, and the 100 NOPs were faster than nothing. No extra alignment, same code otherwise.
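One possible next test, sketched here and not run: keep the loop counter in a register so it is no longer a memory local. `sub cnt1, 1` is a read-modify-write on the stack every pass, and that dependency through memory may well cost more than a NOP does, which would explain why adding NOPs barely shows up. The sketch below reuses the same macros and layout as the test above; the choice of `r12` as the counter and the changed output labels are my own assumptions.

```
    include \masm32\include64\masm64rt.inc

    .code

entry_point proc
    call testproc
    waitkey
    invoke ExitProcess,0
    ret
entry_point endp

    iterations equ <1000000000>

testproc proc

    LOCAL time :QWORD

    USING rsi, rdi, rbx, r12, r13, r14, r15
    SaveRegs
    HighPriority

  ; ---- empty loop, counter held in r12 (assumed choice) ----
    cpuid
    call GetTickCount
    mov time, rax
    mov r12, iterations         ; r12 is nonvolatile, so it survives the API calls
  @@:
    sub r12, 1                  ; register decrement, no memory access in the loop
    jnz @B
    call GetTickCount
    sub rax, time
    conout "  ",str$(rax)," nothing, register counter",lf

  ; ---- same loop with one nop ----
    cpuid
    call GetTickCount
    mov time, rax
    mov r12, iterations
  @@:
    nop
    sub r12, 1
    jnz @B
    call GetTickCount
    sub rax, time
    conout "  ",str$(rax)," 1 single nop, register counter",lf

    NormalPriority
    RestoreRegs
    ret

testproc endp

    end
```

If both loops drop well below the times posted above, the memory counter was dominating the measurement. If they stay about the same, the cause is somewhere else.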