Benchmarking #1

Started by LordAdef, November 08, 2018, 09:51:15 AM


LordAdef

Hi!
Benchmark threads are always fun!
This one compares 6 slight variations of the same thing.
The code is self-explanatory and simple.
Please run it more than once.
(Credits: MultiTimers.asm is a library by Marinus; I changed it a bit for my needs.)

Any input == very welcome!

Below are some runs on Win10 64-bit, with mixed results:

run #1:

Test A:  11164.696405
Test A2:  10969.563025

Test B1:  11163.659643
Test B2:  7418.603591  <=
Test B3:  8528.724294
Test B4:  7515.563650

run #2:

Test A:  7426.077177
Test A2:  7309.214785

Test B1:  7353.330968
Test B2:  5535.438268
Test B3:  6036.998152
Test B4:  5116.531635 <=

run #3:

Test A:  6920.856352
Test A2:  6881.980504

Test B1:  6897.736518
Test B2:  6877.552658
Test B3:  7867.744807
Test B4:  6859.148761 <=

run #4:

Test A:  5252.094767
Test A2:  5409.126498

Test B1:  5227.336097
Test B2:  5169.841561 <=
Test B3:  5940.916224
Test B4:  5178.427761

run #5:

Test A:  7407.886978
Test A2:  7352.514103

Test B1:  7413.644783
Test B2:  7349.053366 <=
Test B3:  8465.262267
Test B4:  7550.733337

jj2007

Performance test begins:

Test A:  16.096475
Test A2:  15.952794

Test B1:  16.059118
Test B2:  16.068971
Test B3:  14.664583
Test B4:  12.828328

Performance test begins:

Test A:  16.499195
Test A2:  16.025456

Test B1:  16.015603
Test B2:  16.436386
Test B3:  15.005725
Test B4:  16.929009

hutch--

Haswell E/EP

Performance test begins:

Test A:  6342.841800
Test A2:  6264.233800

Test B1:  6189.494200
Test B2:  6232.060000
Test B3:  5614.326400
Test B4:  5057.885700
Press any key to continue ...

Siekmanski

Win 8.1 i7-4930K

Performance test begins:

Test A:  20.877793
Test A2:  25.706060

Test B1:  13.619329
Test B2:  13.611995
Test B3:  12.385303
Test B4:  11.139614
Creative coders use backward thinking techniques as a strategy.

jimg

These are strange results. I ran it eight times and here is what I got, in this order:

i7-6700K @ 4.00GHz  16GB ram

Performance test begins:

Test A:  6347.129157
Test A2:  6281.266327

Test B1:  6281.467653
Test B2:  6276.904104
Test B3:  7174.972835
Test B4:  6278.010372
Press any key to continue ...

Performance test begins:

Test A:  6969.436843
Test A2:  6867.588864

Test B1:  6943.325842
Test B2:  7140.264365
Test B3:  7801.162425
Test B4:  6823.255602
Press any key to continue ...

Performance test begins:

Test A:  131.221323
Test A2:  129.574439

Test B1:  130.243055
Test B2:  131.238441
Test B3:  147.944116
Test B4:  130.590775
Press any key to continue ...

Performance test begins:

Test A:  6452.746624
Test A2:  6428.123592

Test B1:  6430.699946
Test B2:  6425.102177
Test B3:  7346.866769
Test B4:  6427.041340
Press any key to continue ...

Performance test begins:

Test A:  6766.527791
Test A2:  6820.317221

Test B1:  6823.176912
Test B2:  6828.999001
Test B3:  7792.153619
Test B4:  6830.869949
Press any key to continue ...

Performance test begins:

Test A:  1467.332506
Test A2:  1456.524033

Test B1:  1455.340607
Test B2:  1457.027347
Test B3:  1672.604833
Test B4:  1455.453789
Press any key to continue ...

Performance test begins:

Test A:  4020.751092
Test A2:  4014.235095

Test B1:  4014.843159
Test B2:  4011.761959
Test B3:  4593.479812
Test B4:  4017.827018
Press any key to continue ...

Performance test begins:

Test A:  2292.837180
Test A2:  2293.784790

Test B1:  2291.060508
Test B2:  2292.725020
Test B3:  2616.233174
Test B4:  2291.601123
Press any key to continue ...
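
One way to put a number on how strange these runs are: compare the absolute spread of a single test across runs with the within-run ratio between two tests. The Python sketch below uses the Test A, B1 and B3 figures copied from the eight runs above; the helper names `spread` and `ratios` are made up for illustration, and this is only a rough sanity check, not a proper statistical analysis.

```python
# Timings copied from the eight runs above (same order).
test_a  = [6347.129157, 6969.436843, 131.221323, 6452.746624,
           6766.527791, 1467.332506, 4020.751092, 2292.837180]
test_b1 = [6281.467653, 6943.325842, 130.243055, 6430.699946,
           6823.176912, 1455.340607, 4014.843159, 2291.060508]
test_b3 = [7174.972835, 7801.162425, 147.944116, 7346.866769,
           7792.153619, 1672.604833, 4593.479812, 2616.233174]

def spread(values):
    """Ratio of the slowest to the fastest run (hypothetical helper)."""
    return max(values) / min(values)

def ratios(numer, denom):
    """Per-run ratio of one test to another (hypothetical helper)."""
    return [n / d for n, d in zip(numer, denom)]

if __name__ == "__main__":
    print(f"Test A, slowest/fastest run: {spread(test_a):.1f}x")
    for run, r in enumerate(ratios(test_b3, test_b1), start=1):
        print(f"run {run}: B3/B1 = {r:.3f}")
```

The absolute time for Test A differs by a factor of roughly 53 between runs, while B3 stays about 12 to 15 percent slower than B1 in every run. That pattern suggests the amount of work per run is changing, rather than the relative cost of the variants.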

aw27

There is something wrong with the tests.
Too much variance.



LordAdef

Quote from: AW on November 08, 2018, 05:51:22 PM
There is something wrong with the tests.
Too much variance.
Hi Jose, you are right! I noticed this shortly after I posted but didn't have time to correct it.


Simple: Marinus's routine was trashing ecx, which is used as the loop counter. I inverted the order and everything is OK now. I'm also printing esi along with the results, just to make sure the values match.
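
To see why a clobbered loop counter scrambles the timings, here is a toy Python model, not the actual MASM source. The dictionary `regs` stands in for the CPU registers, `timer_start` is a hypothetical stand-in for the timing routine, and the leftover value 7 is arbitrary.

```python
def timer_start(regs):
    """Stand-in for a timing routine that uses ecx as scratch
    and does not restore it (assumed behaviour, for illustration)."""
    regs["ecx"] = 7  # arbitrary leftover value


def run_loop(regs):
    """Count down ecx to zero, incrementing esi once per iteration."""
    while regs["ecx"] != 0:
        regs["esi"] += 1
        regs["ecx"] -= 1
    return regs["esi"]


def buggy_order(n):
    regs = {"ecx": n, "esi": 0}  # loop counter loaded first
    timer_start(regs)            # the call overwrites ecx
    return run_loop(regs)        # loop runs the wrong number of times


def fixed_order(n):
    regs = {"ecx": 0, "esi": 0}
    timer_start(regs)            # start the timer first
    regs["ecx"] = n              # load the loop counter afterwards
    return run_loop(regs)


if __name__ == "__main__":
    print("buggy order, iterations:", buggy_order(1000))
    print("fixed order, iterations:", fixed_order(1000))
```

In the buggy order the loop runs however many times the timer call happens to leave in ecx, which would plausibly explain run times that differ wildly from one run to the next. Saving and restoring ecx around the call (push/pop) would be another way to fix it.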

Also, I introduced a "testC", where I replace the memory "cmp" with a bitmask on a single dword. I was actually surprised that this method isn't much faster than test B4 (in many runs B4 actually wins).

testA == .IF macro
testB == Asm cmp
testC == Asm cmp with bit mask
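
The difference between the B and C variants can be sketched in Python, under the assumption that the B tests compare one memory cell per flag while the C tests pack the flags into the bits of a single dword and test them with a mask. The attached source is not reproduced here, so `cells`, `pack_bits` and the two counting functions are hypothetical names.

```python
# Hypothetical flag data: one 0/1 value per element.
cells = [0, 1, 0, 0, 1, 1, 0, 1]


def pack_bits(flags):
    """Pack up to 32 flags into one dword-sized integer, bit i = flags[i]."""
    packed = 0
    for i, flag in enumerate(flags):
        if flag:
            packed |= 1 << i
    return packed & 0xFFFFFFFF


def count_set_cmp(flags):
    """cmp-style: read each cell from memory and compare it with 1."""
    return sum(1 for value in flags if value == 1)


def count_set_mask(packed, n):
    """Mask-style: test bit i of a single dword for each of n flags."""
    total = 0
    for i in range(n):
        if packed & (1 << i):
            total += 1
    return total


if __name__ == "__main__":
    packed = pack_bits(cells)
    print(f"packed dword: {packed:#010x}")
    print("cmp count: ", count_set_cmp(cells))
    print("mask count:", count_set_mask(packed, len(cells)))
```

Both approaches give the same answer; the mask version touches one dword instead of one memory cell per flag. Whether that is faster depends on how well the cells already sit in cache, which may be why test C does not clearly beat B4 in the runs below.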


Zip file attached here (I also changed it in the OP)
Performance test begins:

Test A:  5375.086564     esi: -589934592
Test A2:  5349.759736     esi: -589934592

Test B1:  5545.249401     esi: -589934592
Test B2:  3735.930867     esi: -589934592
Test B3:  4284.832457     esi: -589934592
Test B4:  3749.020400     esi: -589934592

Test C:  3740.795590     esi: -589934592 <==
Test C2:  4284.017780     esi: -589934592



Performance test begins:

Test A:  5392.448228     esi: -589934592
Test A2:  5430.023656     esi: -589934592

Test B1:  5407.402329     esi: -589934592
Test B2:  3760.736944     esi: -589934592
Test B3:  4279.407962     esi: -589934592
Test B4:  3738.908048     esi: -589934592 <==

Test C:  3752.077080     esi: -589934592
Test C2:  4292.570795     esi: -589934592


Performance test begins:

Test A:  5651.308023     esi: -589934592
Test A2:  5824.626002     esi: -589934592

Test B1:  5341.540761     esi: -589934592
Test B2:  3734.443370     esi: -589934592
Test B3:  4272.311083     esi: -589934592
Test B4:  3755.025452     esi: -589934592

Test C:  3742.148158     esi: -589934592 <==
Test C2:  4265.206180     esi: -589934592

hutch--

Haswell E/EP

Performance test begins:

Test A:  9534.766500     esi: -589934592
Test A2:  9507.666400     esi: -589934592

Test B1:  9375.177200     esi: -589934592
Test B2:  10077.460000     esi: -589934592
Test B3:  8395.685400     esi: -589934592
Test B4:  8371.843400     esi: -589934592

Test C:  8674.787200     esi: -589934592
Test C2:  8812.217000     esi: -589934592
Press any key to continue ...

aw27

It is better now.

i7-8700K @3.70GHz

Performance test begins:

Test A:  4630.867087     esi: -589934592
Test A2:  4605.566564     esi: -589934592

Test B1:  4592.344042     esi: -589934592
Test B2:  3218.725584     esi: -589934592
Test B3:  3700.482799     esi: -589934592
Test B4:  3209.868642     esi: -589934592

Test C:  3214.837650     esi: -589934592
Test C2:  3694.835833     esi: -589934592
Press any key to continue ...

LiaoMi

i7-4810mq

Performance test begins:

Test A:  5785.226788     esi: -589934592
Test A2:  5919.552373     esi: -589934592

Test B1:  5890.797644     esi: -589934592
Test B2:  6444.327186     esi: -589934592
Test B3:  5861.268009     esi: -589934592
Test B4:  5852.370886     esi: -589934592

Test C:  5294.469393     esi: -589934592
Test C2:  4645.750682     esi: -589934592
Press any key to continue ...

Siekmanski

Performance test begins:

Test A:  5928.559356     esi: -589934592
Test A2:  5968.172701     esi: -589934592

Test B1:  6011.132141     esi: -589934592
Test B2:  6515.751723     esi: -589934592
Test B3:  5909.522649     esi: -589934592
Test B4:  5915.338751     esi: -589934592

Test C:  5960.283151     esi: -589934592
Test C2:  4739.045465     esi: -589934592
Press any key to continue ...

FORTRANS

F:\TEMP\TEST>testbed
Performance test begins:

Test A:   30718.174796     esi: -589934592
Test A2:  38910.549805     esi: -589934592

Test B1:  35904.093118     esi: -589934592
Test B2:  35919.995418     esi: -589934592
Test B3:  33645.956399     esi: -589934592
Test B4:  32922.090657     esi: -589934592

Test C:   29274.593838     esi: -589934592
Test C2:  34299.185790     esi: -589934592
Press any key to continue ...

LordAdef

Hi,
I managed to consistently beat the previous times by using jecxz (the jcxz algo is slightly slower on my machine). This algo wins in every run.
testD == jcxz / jecxz

Performance test begins:

Test A:  3782.529372     esi: -589934592
Test A2:  3851.802446     esi: -589934592

Test B1:  3759.764728     esi: -589934592
Test B2:  3870.786900     esi: -589934592
Test B3:  4400.107068     esi: -589934592
Test B4:  3789.277990     esi: -589934592

Test C:  3927.359910     esi: -589934592
Test C2:  4303.804513     esi: -589934592

Test D:  2968.206445     esi: -589934592
Test D2:  2552.859739     esi: -589934592 <==

Performance test begins:

Test A:  3797.827358     esi: -589934592
Test A2:  3845.992493     esi: -589934592

Test B1:  3785.149904     esi: -589934592
Test B2:  3762.598958     esi: -589934592
Test B3:  4386.555860     esi: -589934592
Test B4:  3757.113564     esi: -589934592

Test C:  3922.216578     esi: -589934592
Test C2:  4299.821201     esi: -589934592

Test D:  2906.951777     esi: -589934592
Test D2:  2505.042136     esi: -589934592 <==

Siekmanski


Performance test begins:

Test A:  6950.205613     esi: -589934592
Test A2:  6839.243535     esi: -589934592

Test B1:  6901.606908     esi: -589934592
Test B2:  6285.683585     esi: -589934592
Test B3:  5846.524628     esi: -589934592
Test B4:  4973.496701     esi: -589934592

Test C:  5590.548869     esi: -589934592
Test C2:  6348.380381     esi: -589934592

Test D:  3791.305599     esi: -589934592
Test D2:  3245.861136     esi: -589934592
Press any key to continue ...

Performance test begins:

Test A:  6885.879560     esi: -589934592
Test A2:  6869.180720     esi: -589934592

Test B1:  6890.368748     esi: -589934592
Test B2:  6103.885550     esi: -589934592
Test B3:  5580.385077     esi: -589934592
Test B4:  4904.580051     esi: -589934592

Test C:  5620.213463     esi: -589934592
Test C2:  6236.643973     esi: -589934592

Test D:  3740.016888     esi: -589934592
Test D2:  3049.097511     esi: -589934592
Press any key to continue ...

six_L

Performance test begins:

Test A:  7568.845361     esi: -589934592
Test A2:  7395.525548     esi: -589934592

Test B1:  7401.951468     esi: -589934592
Test B2:  6732.240622     esi: -589934592
Test B3:  6061.375539     esi: -589934592
Test B4:  5388.420513     esi: -589934592

Test C:  6074.134579     esi: -589934592
Test C2:  6076.328869     esi: -589934592

Test D:  3519.020742     esi: -589934592
Test D2:  3055.109561     esi: -589934592
Press any key to continue ...
Say you, Say me, Say the codes together for ever.