Hi sinsi,
here are the results of your memcpy application:
Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz (SSE4)
----------------------------------------------
-- aligned strings --
498064 cycles - 10 ( 0) 0: crt_memcpy
890775 cycles - 10 ( 38) 1: movsd - mov eax,ecx
892888 cycles - 10 ( 37) 2: movsd - push ecx
353318 cycles - 10 ( 27) 3: movsb
-- unaligned strings --
1006514 cycles - 10 ( 0) 0: crt_memcpy
1033525 cycles - 10 ( 38) 1: movsd - mov eax,ecx
1033580 cycles - 10 ( 37) 2: movsd - push ecx
377061 cycles - 10 ( 27) 3: movsb
-- short strings 15 --
175505 cycles - 8000 ( 0) 0: crt_memcpy
335538 cycles - 8000 ( 38) 1: movsd - mov eax,ecx
344226 cycles - 8000 ( 37) 2: movsd - push ecx
291953 cycles - 8000 ( 27) 3: movsb
-- short strings 271 --
1033175 cycles - 8000 ( 0) 0: crt_memcpy
952811 cycles - 8000 ( 38) 1: movsd - mov eax,ecx
959677 cycles - 8000 ( 37) 2: movsd - push ecx
566948 cycles - 8000 ( 27) 3: movsb
-- short strings 2014 --
3224879 cycles - 4000 ( 0) 0: crt_memcpy
3153708 cycles - 4000 ( 38) 1: movsd - mov eax,ecx
3151176 cycles - 4000 ( 37) 2: movsd - push ecx
930276 cycles - 4000 ( 27) 3: movsb
--- ok ---
Gunther