Array Reverse with SSE

Started by frktons, December 04, 2012, 10:15:28 AM


frktons

I think I have reached the limit of what my SSSE3 code can do:
Quote
------------------------------------------------------------------------
Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz

Instructions: MMX, SSE1, SSE2, SSE3, SSSE3
------------------------------------------------------------------------
1.191  cycles for Reverse Array with PSHUFB
2.064  cycles for Reverse Array with MOV/BSWAP
1.617  cycles for Reverse Array with MOV/BSWAP with 4 GPRs
   565  cycles for Reverse Array with PSHUFB using 4 xmm unrolled 4
   594  cycles for Reverse Array with PSHUFB using 4 xmm
------------------------------------------------------------------------
   794  cycles for Reverse Array with PSHUFB
2.064  cycles for Reverse Array with MOV/BSWAP
1.639  cycles for Reverse Array with MOV/BSWAP with 4 GPRs
   565  cycles for Reverse Array with PSHUFB using 4 xmm unrolled 4
   596  cycles for Reverse Array with PSHUFB using 4 xmm
------------------------------------------------------------------------

565 CPU cycles seems to be the lowest value I can reach on my system.
While the other routines show different timings on every run, the version
using 4 xmm registers unrolled 4 times tends to give the same value every time.
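
For readers who have not followed the thread, here is a minimal C sketch of the two ideas being timed. It is not the code that produced the numbers above. It assumes the routines reverse a byte array into a separate destination buffer and that the length is a multiple of 16. The names reverse_bswap, reverse_pshufb and reverse_bytes are made up for illustration. The PSHUFB path is only compiled when the compiler enables SSSE3 (for example -mssse3 with GCC or Clang); otherwise the scalar path is used.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSSE3__)   /* set by GCC/Clang with -mssse3; an assumption of this sketch */
#include <tmmintrin.h>   /* _mm_shuffle_epi8 is the intrinsic for PSHUFB */
#endif

/* Portable stand-in for the BSWAP instruction: reverse the 4 bytes of v. */
static uint32_t bswap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* MOV/BSWAP idea: load 4 bytes from the front, swap them,
   store them at the mirrored position at the back.
   Assumes n is a multiple of 4 and dst does not overlap src. */
void reverse_bswap(uint8_t *dst, const uint8_t *src, size_t n) {
    for (size_t i = 0; i < n; i += 4) {
        uint32_t v;
        memcpy(&v, src + i, 4);          /* unaligned-safe load */
        v = bswap32(v);
        memcpy(dst + n - 4 - i, &v, 4);  /* mirrored store */
    }
}

#if defined(__SSSE3__)
/* PSHUFB idea: same scheme, 16 bytes at a time.
   Assumes n is a multiple of 16 and dst does not overlap src. */
void reverse_pshufb(uint8_t *dst, const uint8_t *src, size_t n) {
    /* Shuffle mask: output byte i takes input byte 15 - i.
       _mm_set_epi8 lists bytes from highest (15) to lowest (0). */
    const __m128i mask = _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7,
                                      8, 9, 10, 11, 12, 13, 14, 15);
    for (size_t i = 0; i < n; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        v = _mm_shuffle_epi8(v, mask);
        _mm_storeu_si128((__m128i *)(dst + n - 16 - i), v);
    }
}
#endif

/* Hypothetical dispatcher: use PSHUFB when it was compiled in,
   otherwise fall back to the scalar version. n must be a multiple of 16. */
void reverse_bytes(uint8_t *dst, const uint8_t *src, size_t n) {
#if defined(__SSSE3__)
    reverse_pshufb(dst, src, n);
#else
    reverse_bswap(dst, src, n);
#endif
}
```

The "4 xmm" variants in the timings presumably apply the same shuffle to four registers per loop iteration; that unrolling is left out here to keep the sketch short.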