Author Topic: Win64 memory copy benchmark.  (Read 1350 times)

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 4866
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Win64 memory copy benchmark.
« on: July 08, 2016, 09:23:16 PM »
Attached is a benchmark for three 64-bit memory copy algos. One uses REP MOVSQ; the other two are an aligned and an unaligned XMM version respectively. The two XMM versions do not have a tail trimmer for uneven byte counts, as this is not what I am testing. It is a normal window app with the tests being run from the menu.
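For anyone wondering what the omitted "tail trimmer" would look like, here is a rough C sketch. It is not taken from the attached MASM source: the name `copy16_with_tail` is made up for illustration, and a fixed 16-byte `memcpy` stands in for one XMM load/store pair. It assumes the source and destination do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT the benchmark's code: copy in 16-byte blocks
   (the width of one XMM register), then trim the 0..15 leftover bytes.
   Assumes the two buffers do not overlap. */
static void *copy16_with_tail(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    size_t blocks = n >> 4;   /* number of whole 16-byte blocks */
    size_t tail   = n & 15;   /* leftover bytes after the blocks */

    assert(n == 0 || (d != NULL && s != NULL));

    while (blocks--) {
        /* Fixed-size 16-byte copy; stands in for one XMM load + store. */
        memcpy(d, s, 16);
        d += 16;
        s += 16;
    }
    while (tail--) {
        /* Tail trimmer: finish byte by byte. */
        *d++ = *s++;
    }
    return dst;
}
```

With a byte count that is a multiple of 16, as in the benchmark, the tail loop never runs, which is why leaving it out does not affect what is being measured.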
« Last Edit: July 09, 2016, 04:10:22 PM by hutch-- »
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :biggrin:

mineiro

  • Member
  • ***
  • Posts: 365
Re: Win64 memory copy benchmark.
« Reply #1 on: July 09, 2016, 06:14:28 AM »
I was not able to run your example because it needs too much memory and my machine has only 1 GB. To avoid swapping, I tried to assemble the source code instead, but had no luck and got these errors:
mrm macro gives: error A2108: use of register assumed to ERROR (when filling wc structure)
movq mm7, lParam and movq lParam, mm7 give: error A2222: x87 and MMX instructions disallowed; legacy FP state not saved in Win64
Assembler used: Microsoft (R) Macro Assembler (AMD64) Version 8.00.40310.39

I have done some tests, moving the source/destination into rsp and using pop/push, but rep movsq is quicker on my machine. My test was done under ideal conditions: source, destination and size are all multiples of 16.
I think the only change needed is a head and a tail around rep movsq to deal with unaligned data.
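The head/tail idea can be sketched roughly as follows in C. This is only an illustration under assumptions, not code from this thread: `copy_head_body_tail` is a made-up name, the head aligns the destination to 16 bytes (one possible choice), the 8-byte loop stands in for what REP MOVSQ would do with rcx = bytes / 8, and the buffers must not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: byte-copy a head until the destination is
   16-byte aligned, copy the body in 8-byte words (the part REP MOVSQ
   would handle), then byte-copy the tail. Non-overlapping buffers only. */
static void *copy_head_body_tail(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    assert(n == 0 || (d != NULL && s != NULL));

    /* Head: bytes needed to reach the next 16-byte boundary of dst. */
    size_t head = (size_t)(-(uintptr_t)d & 15);
    if (head > n) {
        head = n;
    }
    n -= head;
    while (head--) {
        *d++ = *s++;
    }

    /* Body: whole 8-byte words. */
    size_t qwords = n >> 3;
    n &= 7;
    while (qwords--) {
        uint64_t q;
        memcpy(&q, s, 8);   /* load may be unaligned if src is */
        memcpy(d, &q, 8);   /* store is aligned thanks to the head */
        d += 8;
        s += 8;
    }

    /* Tail: the remaining 0..7 bytes. */
    while (n--) {
        *d++ = *s++;
    }
    return dst;
}
```

Only the destination is aligned here; if source and destination are misaligned by different amounts, the loads in the body stay unaligned whatever the head does.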

Timed with a serialized rdtsc.
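The post does not say which serializing instruction was paired with rdtsc (cpuid is the classic choice). As a hedged illustration only, the C sketch below fences `__rdtsc()` with `_mm_lfence()`, which on Intel CPUs keeps rdtsc from executing before earlier instructions have completed. The names `fenced_ticks` and `time_copy` are made up, and the `clock()` fallback is only there so the sketch compiles on non-x86 targets.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(_M_X64)
#  if defined(_MSC_VER)
#    include <intrin.h>
#  else
#    include <x86intrin.h>
#  endif
/* Read the time-stamp counter with an lfence on each side so the read
   is not reordered with the code being timed. */
static uint64_t fenced_ticks(void)
{
    _mm_lfence();
    uint64_t t = __rdtsc();
    _mm_lfence();
    return t;
}
#else
#  include <time.h>
/* Fallback for non-x86 builds; this is NOT a cycle counter. */
static uint64_t fenced_ticks(void)
{
    return (uint64_t)clock();
}
#endif

/* Hypothetical harness: ticks taken by one copy of n bytes.
   memcpy is used as a placeholder for the routine under test. */
static uint64_t time_copy(void *dst, const void *src, size_t n)
{
    uint64_t t0 = fenced_ticks();
    memcpy(dst, src, n);
    uint64_t t1 = fenced_ticks();
    return t1 - t0;
}
```

A single run is noisy, so a real test would repeat the call several times and compare runs, as the timings listed in this post do.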

Code: [Select]
rep movsq
time: 0 1219200426
time: 0 592314840
time: 0 577661148
time: 0 574660377
time: 1 592705800
time: 0 593109693
    mov rsp,rsi
    shr rcx,4
@again:
    pop qword ptr [rdi]
    pop qword ptr [rdi+8]
    add rdi,2*8
    sub rcx,1
    jnz @again
time: 0 732068685
time: 0 761895693
time: 0 744006060
time: 0 748483425
time: 1 755325054
time: 0 755273367
When I have time I'll check the XMM version.
I'd rather be this ambulant metamorphosis than to have that old opinion about everything

jj2007

  • Member
  • *****
  • Posts: 7627
  • Assembler is fun ;-)
    • MasmBasic
Re: Win64 memory copy benchmark.
« Reply #2 on: July 09, 2016, 06:44:07 AM »
There is an old memcpy thread somewhere, here is a bit about movlps etc.

Siekmanski

  • Member
  • *****
  • Posts: 1109
Re: Win64 memory copy benchmark.
« Reply #3 on: July 10, 2016, 03:12:23 PM »
A thread on Code Project....
Apex memmove - the fastest memcpy/memmove on x86/x64 ... EVER, written in C

http://www.codeproject.com/Articles/1110153/Apex-memmove-the-fastest-memcpy-memmove-on-x-x-EVE

jj2007

  • Member
  • *****
  • Posts: 7627
  • Assembler is fun ;-)
    • MasmBasic
Re: Win64 memory copy benchmark.
« Reply #4 on: July 10, 2016, 03:36:46 PM »
Quote
8 ) An optimized assembler version of these algorithms WILL be faster (I know because I have built assembler versions)

So that EVER refers to non-assembler code?  ::)

Siekmanski

  • Member
  • *****
  • Posts: 1109
Re: Win64 memory copy benchmark.
« Reply #5 on: July 10, 2016, 03:47:38 PM »
Funny article....

Code: [Select]
In late 2013, my OCD took over and I became totally obsessed with writing the fastest memcpy/memmove function in the world; which took over my work and life.
I became so obsessed that I wrote 80,000 lines of code in over 140 variations of memmove, mostly copies with small variations and tweaks.


hutch--

  • Administrator
  • Member
  • ******
  • Posts: 4866
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: Win64 memory copy benchmark.
« Reply #6 on: July 10, 2016, 06:30:02 PM »
From motor racing, "when the flag drops, the bullsh*t stops". Let's see what it clocks like.  :biggrin:

Jokaste

  • Regular Member
  • *
  • Posts: 18
  • Never be pleased, always improve
    • Grincheux's Tools
Re: Win64 memory copy benchmark.
« Reply #7 on: November 15, 2017, 04:56:49 AM »
Many years ago I had a 750GSX Suzuki Inazuma, a real pleasure.
Driving between the cars in Paris!

Here are my results, in the order of the menu:

1: 10 000
2: 10 655
3: 9 655

Info on my CPU:

Socket 1         ID = 0
   Number of cores      2 (max 2)
   Number of threads   2 (max 2)
   Name         Intel Mobile Core 2 Duo T6570
   Codename      Penryn
   Specification      Intel(R) Core(TM)2 Duo CPU     T6570  @ 2.10GHz
   Package (platform ID)   Socket P (478) (0x7)
   CPUID         6.7.A
   Extended CPUID      6.17
   Core Stepping      R0
   Technology      45 nm
   Core Speed      2094.6 MHz
   Multiplier x Bus Speed   10.5 x 199.5 MHz
   Rated Bus speed      798.0 MHz
   Stock frequency      2100 MHz
   Instructions sets   MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, EM64T, VT-x
   L1 Data cache      2 x 32 KBytes, 8-way set associative, 64-byte line size
   L1 Instruction cache   2 x 32 KBytes, 8-way set associative, 64-byte line size
   L2 cache      2048 KBytes, 8-way set associative, 64-byte line size
   Max CPUID level      0000000Dh
   Max CPUID ext. level   80000008h
   Cache descriptor   Level 1, D, 32 KB, 1 thread(s)
   Cache descriptor   Level 1, I, 32 KB, 1 thread(s)
   Cache descriptor   Level 2, U, 2 MB, 2 thread(s)
   FID/VID Control      yes
   FID range      6.0x - 10.5x
   Max VID         1.150 V
Kenavo
---------------------------
Grincheux / Jokaste

Raistlin

  • Member
  • **
  • Posts: 246
Re: Win64 memory copy benchmark.
« Reply #8 on: November 15, 2017, 04:17:10 PM »
....and Symantec Endpoint protection strikes again = SONAR.Heur.RGC!g171
Quote
Reputation was not used in this detection.
and deletes the exe <sniff> - g0d I hate AV scanners that think they know viruses....
Sorry @hutch--, can't run it at work - will need to do so from home.

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 4866
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: Win64 memory copy benchmark.
« Reply #9 on: November 15, 2017, 04:55:32 PM »
It's a malicious plot; there is not enough code in the benchmark to fit a virus into, let alone a trojan.

jj2007

  • Member
  • *****
  • Posts: 7627
  • Assembler is fun ;-)
    • MasmBasic
Re: Win64 memory copy benchmark.
« Reply #10 on: November 15, 2017, 07:57:30 PM »
Quote
Reputation was not used in this detection.

I think the correct wording should be "Symantec reputation was damaged in this detection" ;)

Apparently, you can fumble something (source):
Quote
This is a Heuristic (SONAR) detection. This is likely coming from the fact that in your SONAR policy under System Change events you have the options for 'DNS Change detected' and 'Host file change detected' set to Log. Check the policy to verify