SwitchToThread vs Sleep(0)

Started by jj2007, May 11, 2013, 08:17:48 AM

dedndave

it's attached in reply #57, Alex

here is version 1.0, which has the serialization code
but, version 1.1 is a little better, i think

Antariy

Ah, then the results in my previous message were for the right thing: those results are for dTime2.zip, but I thought maybe I had missed something.

Antariy

Results for dTime1.zip:

4 34 0 0 10 7 7 20 33 13
27 32 51 82 21 8 22 24 24 26
65 67 79 104 104 92 74 135 36 74
Press any key to continue ...

dedndave

thanks, Alex   :t

as i suspected, dTime2 is better
it still needs work, but i like certain features

1) the code to be timed is in a PROC, rather than using macros before and after
    and, the timing function is a PROC
2) the function allows adjustment of thread priority level
3) the loop count has been eliminated
    we know from experience that a run of ~0.5 seconds yields good results
    so, the dTime function runs until time has elapsed, rather than some pre-determined loop count
4) we try to eliminate the use of CPUID to serialize code
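
To illustrate point 3, here is a minimal sketch of a timing function that runs until a time budget has elapsed instead of using a fixed loop count. It is not the dTime code: it is written in Python with `time.perf_counter` rather than in assembly, it does not touch thread priority or serialization, and the names `time_until_elapsed`, `code_under_test` and `budget_s` are made up for the example.

```python
import time

def time_until_elapsed(fn, budget_s=0.5):
    """Call fn repeatedly until at least budget_s seconds have passed.

    Returns (calls, average_seconds_per_call).
    """
    calls = 0
    start = time.perf_counter()
    while True:
        fn()
        calls += 1
        elapsed = time.perf_counter() - start
        if elapsed >= budget_s:
            # The loop count is a result of the run, not an input to it.
            return calls, elapsed / calls

def code_under_test():
    # Hypothetical stand-in for the PROC being timed.
    sum(range(1000))

if __name__ == "__main__":
    calls, per_call = time_until_elapsed(code_under_test)
    print(f"{calls} calls, {per_call * 1e9:.0f} ns per call")
```

The caller only picks the budget (0.5 s by default, as in the list above). Fast code simply gets more iterations and slow code fewer, so nobody has to guess a loop count per test.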

FORTRANS

Hi Dave,

   dTime2 results from the oldies.

Cheers,

Steve N.


P-III
4 3 2 3 5 0 0 0 3 2
99 97 95 95 95 96 102 98 93 100
117 114 110 120 122 118 120 121 117 121
Press any key to continue ...

P-MMX
4 3 4 4 4 5 4 4 4 4
19 19 60 9 19 20 21 19 19 19
137 130 126 130 130 133 130 130 130 130
Press any key to continue ...

Pentium M
3 3 3 0 5 3 4 3 3 0
118 118 126 135 113 124 119 118 118 118
32 34 34 35 34 34 34 34 34 34
Press any key to continue ...

dedndave

thanks Steve
the results are fairly stable
it's cool to see the differences, as processors evolved   :P

jj2007

Hi Dave,
You will love this one - AMD Athlon:
39 40 40 40 40 40 40 40 40 40
80 80 80 80 80 80 80 80 80 80
120 120 120 120 120 127 120 120 120 120

dedndave

those are great numbers, Jochen   :biggrin:

still, some issues to work out
i will play with it more tomorrow

what do you think of the code ?

jj2007

Quote from: dedndave on May 17, 2013, 11:43:19 PM
what do you think of the code ?

Looks ok, but seems not to like some other CPUs ;-)

In the meantime, I've worked out something based on Michael's idea. Here are some results for the timeslice length:

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)
6       valid tests, 116714     avg kCycles
7       valid tests, 116547     avg kCycles
6       valid tests, 116796     avg kCycles
7       valid tests, 116774     avg kCycles
7       valid tests, 116523     avg kCycles
5       valid tests, 116784     avg kCycles
7       valid tests, 116728     avg kCycles
6       valid tests, 116711     avg kCycles
6       valid tests, 116722     avg kCycles
8       valid tests, 116493     avg kCycles

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
9       valid tests, 49989      avg kCycles
10      valid tests, 49811      avg kCycles
9       valid tests, 50027      avg kCycles
9       valid tests, 49934      avg kCycles
9       valid tests, 49973      avg kCycles
9       valid tests, 50028      avg kCycles
10      valid tests, 49887      avg kCycles
9       valid tests, 49899      avg kCycles
9       valid tests, 50019      avg kCycles
10      valid tests, 49805      avg kCycles
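
For readers who prefer milliseconds, the kCycles figures can be converted with a little arithmetic. This is only a sketch and rests on one assumption: that the cycle counter ticks at the nominal clock shown in the CPU string, which may not hold exactly on every machine. The helper name `kcycles_to_ms` is made up for the example.

```python
def kcycles_to_ms(kcycles, clock_ghz):
    """Convert a count in thousands of cycles to milliseconds.

    Assumes the counter ticks at clock_ghz, the nominal frequency.
    """
    cycles = kcycles * 1_000
    seconds = cycles / (clock_ghz * 1e9)
    return seconds * 1_000

if __name__ == "__main__":
    # First result line of each CPU above, with the clock from its CPU string.
    for name, kcycles, ghz in [
        ("Core i5-2450M", 116714, 2.50),
        ("Celeron M 420", 49989, 1.60),
    ]:
        print(f"{name}: {kcycles_to_ms(kcycles, ghz):.1f} ms")
```

Under that assumption the first i5 line works out to about 46.7 ms and the first Celeron M line to about 31.2 ms.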


Gunther

Jochen,

my results:


Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz (SSE4)
10      valid tests, 158126     avg kCycles
8       valid tests, 158434     avg kCycles
8       valid tests, 158700     avg kCycles
9       valid tests, 158526     avg kCycles
5       valid tests, 158665     avg kCycles
10      valid tests, 137386     avg kCycles
8       valid tests, 158749     avg kCycles
8       valid tests, 158870     avg kCycles
9       valid tests, 158754     avg kCycles
8       valid tests, 158651     avg kCycles
ok


Gunther

dedndave

that seems to look pretty good on my prescott   :biggrin:

Intel(R) Pentium(R) 4 CPU 3.00GHz (SSE3)
9       valid tests, 93738      avg kCycles
9       valid tests, 93644      avg kCycles
9       valid tests, 93766      avg kCycles
10      valid tests, 93547      avg kCycles
9       valid tests, 93741      avg kCycles
9       valid tests, 93710      avg kCycles
9       valid tests, 93745      avg kCycles
9       valid tests, 93782      avg kCycles
9       valid tests, 93754      avg kCycles
9       valid tests, 93799      avg kCycles

Antariy

Hi Jochen :t

Intel(R) Celeron(R) CPU 2.13GHz (SSE3)
9       valid tests, 65206      avg kCycles
9       valid tests, 66479      avg kCycles
9       valid tests, 66365      avg kCycles
9       valid tests, 66431      avg kCycles
10      valid tests, 66133      avg kCycles
10      valid tests, 66256      avg kCycles
9       valid tests, 66600      avg kCycles
9       valid tests, 66179      avg kCycles
10      valid tests, 65903      avg kCycles
9       valid tests, 66016      avg kCycles
ok

Obivan

Hi Jochen,

here my results:
Intel(R) Xeon(R) CPU E31230 @ 3.20GHz (SSE4)
10      valid tests, 102589     avg kCycles
9       valid tests, 103229     avg kCycles
10      valid tests, 102377     avg kCycles
10      valid tests, 102594     avg kCycles
9       valid tests, 102947     avg kCycles
9       valid tests, 102224     avg kCycles
9       valid tests, 102925     avg kCycles
10      valid tests, 102337     avg kCycles
10      valid tests, 103004     avg kCycles
10      valid tests, 102270     avg kCycles

jj2007

Thanks, Obivan, Alex, Dave and Gunther.

Results look pretty stable now. The next step would be to design a timer macro that starts at the beginning of the time slice and stops shortly before it ends... if I find time ;-)

dedndave

from what i am seeing, we don't violate that rule if we use 0.5 seconds as a loop count target
or, am i dropping a decimal point someplace - lol