The MASM Forum

General => The Laboratory => Topic started by: jj2007 on March 07, 2014, 02:21:09 PM

Title: Instr timings: Boyer-Moore not working
Post by: jj2007 on March 07, 2014, 02:21:09 PM
I've done some timings for pattern find algos, and stumbled over a problem with BMBinSearch:

AMD Athlon(tm) Dual Core Processor 4450B (SSE3)
++++++++++++++++++++
41877   kCycles for 10 * MB Instr_
58159   kCycles for 10 * InString
57885   kCycles for 10 * crt_strstr
27089   kCycles for 10 * BMBinSearch
30342   kCycles for 10 * BMHBinsearch
14482   kCycles for 10 * InstrJJ

41882   kCycles for 10 * MB Instr_
58191   kCycles for 10 * InString
57842   kCycles for 10 * crt_strstr
27063   kCycles for 10 * BMBinSearch
30348   kCycles for 10 * BMHBinsearch
14515   kCycles for 10 * InstrJJ

41941   kCycles for 10 * MB Instr_
58122   kCycles for 10 * InString
57875   kCycles for 10 * crt_strstr
27137   kCycles for 10 * BMBinSearch
30288   kCycles for 10 * BMHBinsearch
14887   kCycles for 10 * InstrJJ

2110    kCycles for 10000 * MB Instr_
2540    kCycles for 10000 * InString
2224    kCycles for 10000 * crt_strstr
4144    kCycles for 10000 * BMBinSearch
3784    kCycles for 10000 * BMHBinsearch
788     kCycles for 10000 * InstrJJ

2110    kCycles for 10000 * MB Instr_
2546    kCycles for 10000 * InString
2218    kCycles for 10000 * crt_strstr
4245    kCycles for 10000 * BMBinSearch
3794    kCycles for 10000 * BMHBinsearch
789     kCycles for 10000 * InstrJJ

2188    kCycles for 10000 * MB Instr_
2541    kCycles for 10000 * InString
2218    kCycles for 10000 * crt_strstr
4133    kCycles for 10000 * BMBinSearch
3787    kCycles for 10000 * BMHBinsearch
795     kCycles for 10000 * InstrJJ

13      bytes for MB Instr_
10      bytes for InString
15      bytes for crt_strstr
19      bytes for BMBinSearch
19      bytes for BMHBinsearch
311     bytes for InstrJJ

39      = eax MB Instr_
39      = eax InString
39      = eax crt_strstr
-1      = eax BMBinSearch
38      = eax BMHBinsearch
39      = eax InstrJJ


Problem:

39      = eax crt_strstr
-1      = eax BMBinSearch
38      = eax BMHBinsearch


What am I doing wrong here?
   invoke BMBinSearch,0, SearchString, Len(SearchString), chr$(Pattern), 5

The Boyer-Moore-Horspool variant returns a valid match, using the same invoke line.
Title: Re: Instr timings: Boyer-Moore not working
Post by: hutch-- on March 07, 2014, 06:19:37 PM
You may not be doing anything wrong, the BM variants are very hard to exhaustively verify and this is after testing them on hundreds of million of string variations.
Title: Re: Instr timings: Boyer-Moore not working
Post by: jj2007 on March 07, 2014, 08:38:07 PM
OK. I added an old algo of mine above, a bit bloated but the timings look competitive ;-)

EDIT: The attached new version yields some surprising results... ::)

Inter alia, in round 2 ("contains TXFS_") all algos return the correct result, but InString and the two Boyer-Moore variants need a lot of time for doing so... a factor 25 for Masm32 InString vs CRT strstr!

I have no clue why that is the case. Any ideas, or different results?

The comment * -= is Windows.inc and WinExtra.inc combined - more than 2 MB to mitigate cache effects.

AMD Athlon(tm) Dual Core Processor 4450B (SSE3)
++++++++++++++++++++
Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_RM_FLAG_DO_NOT_RESET_RM_AT_NE
XT_START
31716   kCycles for 3 * MB Instr_
46678   kCycles for 3 * InString
45015   kCycles for 3 * crt_strstr
15563   kCycles for 3 * BMBinSearch
15980   kCycles for 3 * BMHBinsearch
12553   kCycles for 3 * InstrJJ

31322   kCycles for 3 * MB Instr_
46642   kCycles for 3 * InString
45696   kCycles for 3 * crt_strstr
15170   kCycles for 3 * BMBinSearch
15942   kCycles for 3 * BMHBinsearch
12535   kCycles for 3 * InstrJJ

31228   kCycles for 3 * MB Instr_
46267   kCycles for 3 * InString
45374   kCycles for 3 * crt_strstr
15143   kCycles for 3 * BMBinSearch
16418   kCycles for 3 * BMHBinsearch
12569   kCycles for 3 * InstrJJ

2026935 = eax MB Instr_
2026935 = eax InString
2026935 = eax crt_strstr
2026935 = eax BMBinSearch
2026935 = eax BMHBinsearch
2026935 = eax InstrJJ

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_
251     kCycles for 3 * MB Instr_
9125    kCycles for 3 * InString
372     kCycles for 3 * crt_strstr
6391    kCycles for 3 * BMBinSearch
6416    kCycles for 3 * BMHBinsearch
74      kCycles for 3 * InstrJJ

251     kCycles for 3 * MB Instr_
8678    kCycles for 3 * InString
776     kCycles for 3 * crt_strstr
6404    kCycles for 3 * BMBinSearch
6408    kCycles for 3 * BMHBinsearch
74      kCycles for 3 * InstrJJ

252     kCycles for 3 * MB Instr_
8648    kCycles for 3 * InString
373     kCycles for 3 * crt_strstr
6412    kCycles for 3 * BMBinSearch
6483    kCycles for 3 * BMHBinsearch
76      kCycles for 3 * InstrJJ

18057   = eax MB Instr_
18057   = eax InString
18057   = eax crt_strstr
18057   = eax BMBinSearch
18057   = eax BMHBinsearch
18057   = eax InstrJJ

Testing if [This is a simple string which has toward...] contains Dupli
3497    kCycles for 10000 * MB Instr_
4322    kCycles for 10000 * InString
3980    kCycles for 10000 * crt_strstr
4953    kCycles for 10000 * BMBinSearch
5665    kCycles for 10000 * BMHBinsearch
1030    kCycles for 10000 * InstrJJ

3500    kCycles for 10000 * MB Instr_
4328    kCycles for 10000 * InString
3986    kCycles for 10000 * crt_strstr
5050    kCycles for 10000 * BMBinSearch
5379    kCycles for 10000 * BMHBinsearch
1033    kCycles for 10000 * InstrJJ

3501    kCycles for 10000 * MB Instr_
4323    kCycles for 10000 * InString
3980    kCycles for 10000 * crt_strstr
4949    kCycles for 10000 * BMBinSearch
5529    kCycles for 10000 * BMHBinsearch
1036    kCycles for 10000 * InstrJJ

75      = eax MB Instr_
75      = eax InString
75      = eax crt_strstr
0       = eax BMBinSearch
75      = eax BMHBinsearch
75      = eax InstrJJ


Same phenomenon on my Celeron M:

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_
245     kCycles for 3 * MB Instr_
8814    kCycles for 3 * InString
237     kCycles for 3 * crt_strstr
4764    kCycles for 3 * BMBinSearch
4783    kCycles for 3 * BMHBinsearch
58      kCycles for 3 * InstrJJ


EDIT(2): Here is the culprit for InString (\Masm32\m32lib\instring.asm, line 45):

if 0
   mov sLen, Len(lpSource)   ; 4830 cycles for the TXFS_ case (MasmBasic Len is a bit faster than StrLen...)
else
   invoke StrLen,lpSource   ; 8690 cycles for the TXFS_ case
   mov sLen, eax           ; source length
endif


The other algos don't check the source len beforehand, and therefore are much faster if the source string is long and there is an early match.
Title: Re: Instr timings: Boyer-Moore not working
Post by: dedndave on March 08, 2014, 11:04:59 PM
prescott w/htt
Intel(R) Pentium(R) 4 CPU 3.00GHz (SSE3)
++++++++++++++++++2 of 20 tests valid, loop overhead is approx. 8/3 cycles

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_RM_FLAG_D
O_NOT_RESET_RM_AT_NEXT_START
39731   kCycles for 3 * MB Instr_
56594   kCycles for 3 * InString
45181   kCycles for 3 * crt_strstr
12963   kCycles for 3 * BMBinSearch
17715   kCycles for 3 * BMHBinsearch
22831   kCycles for 3 * InstrJJ

39550   kCycles for 3 * MB Instr_
56437   kCycles for 3 * InString
45368   kCycles for 3 * crt_strstr
12850   kCycles for 3 * BMBinSearch
16990   kCycles for 3 * BMHBinsearch
23049   kCycles for 3 * InstrJJ

39448   kCycles for 3 * MB Instr_
56528   kCycles for 3 * InString
45606   kCycles for 3 * crt_strstr
12935   kCycles for 3 * BMBinSearch
17348   kCycles for 3 * BMHBinsearch
23027   kCycles for 3 * InstrJJ

2026771 = eax MB Instr_
2026771 = eax InString
2026771 = eax crt_strstr
2026771 = eax BMBinSearch
2026771 = eax BMHBinsearch
2026771 = eax InstrJJ

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_
338     kCycles for 3 * MB Instr_
12471   kCycles for 3 * InString
340     kCycles for 3 * crt_strstr
5087    kCycles for 3 * BMBinSearch
5153    kCycles for 3 * BMHBinsearch
144     kCycles for 3 * InstrJJ

335     kCycles for 3 * MB Instr_
12728   kCycles for 3 * InString
341     kCycles for 3 * crt_strstr
5137    kCycles for 3 * BMBinSearch
5283    kCycles for 3 * BMHBinsearch
150     kCycles for 3 * InstrJJ

341     kCycles for 3 * MB Instr_
12273   kCycles for 3 * InString
340     kCycles for 3 * crt_strstr
5117    kCycles for 3 * BMBinSearch
5492    kCycles for 3 * BMHBinsearch
146     kCycles for 3 * InstrJJ

17893   = eax MB Instr_
17893   = eax InString
17893   = eax crt_strstr
17893   = eax BMBinSearch
17893   = eax BMHBinsearch
17893   = eax InstrJJ

Testing if [This is a simple string which has toward...] contains Dupli
4225    kCycles for 10000 * MB Instr_
5270    kCycles for 10000 * InString
3536    kCycles for 10000 * crt_strstr
7601    kCycles for 10000 * BMBinSearch
8514    kCycles for 10000 * BMHBinsearch
1461    kCycles for 10000 * InstrJJ

4261    kCycles for 10000 * MB Instr_
5135    kCycles for 10000 * InString
3627    kCycles for 10000 * crt_strstr
7884    kCycles for 10000 * BMBinSearch
8420    kCycles for 10000 * BMHBinsearch
1457    kCycles for 10000 * InstrJJ

4224    kCycles for 10000 * MB Instr_
5104    kCycles for 10000 * InString
3529    kCycles for 10000 * crt_strstr
7891    kCycles for 10000 * BMBinSearch
8378    kCycles for 10000 * BMHBinsearch
1461    kCycles for 10000 * InstrJJ

75      = eax MB Instr_
75      = eax InString
75      = eax crt_strstr
0       = eax BMBinSearch
75      = eax BMHBinsearch
75      = eax InstrJJ
Title: Re: Instr timings: Boyer-Moore not working
Post by: Gunther on March 08, 2014, 11:57:17 PM
Jochen,


Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz (SSE4)
++++++++++++++++++++
Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_RM_FLAG_D
O_NOT_RESET_RM_AT_NEXT_START
14679   kCycles for 3 * MB Instr_
25238   kCycles for 3 * InString
19752   kCycles for 3 * crt_strstr
4576    kCycles for 3 * BMBinSearch
7510    kCycles for 3 * BMHBinsearch
6096    kCycles for 3 * InstrJJ

14331   kCycles for 3 * MB Instr_
27113   kCycles for 3 * InString
22980   kCycles for 3 * crt_strstr
4748    kCycles for 3 * BMBinSearch
7450    kCycles for 3 * BMHBinsearch
6046    kCycles for 3 * InstrJJ

14210   kCycles for 3 * MB Instr_
25347   kCycles for 3 * InString
22804   kCycles for 3 * crt_strstr
4637    kCycles for 3 * BMBinSearch
7607    kCycles for 3 * BMHBinsearch
6078    kCycles for 3 * InstrJJ

2026935 = eax MB Instr_
2026935 = eax InString
2026935 = eax crt_strstr
2026935 = eax BMBinSearch
2026935 = eax BMHBinsearch
2026935 = eax InstrJJ

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_
124     kCycles for 3 * MB Instr_
4489    kCycles for 3 * InString
173     kCycles for 3 * crt_strstr
1103    kCycles for 3 * BMBinSearch
1092    kCycles for 3 * BMHBinsearch
16      kCycles for 3 * InstrJJ

127     kCycles for 3 * MB Instr_
6381    kCycles for 3 * InString
170     kCycles for 3 * crt_strstr
1162    kCycles for 3 * BMBinSearch
1162    kCycles for 3 * BMHBinsearch
16      kCycles for 3 * InstrJJ

124     kCycles for 3 * MB Instr_
7405    kCycles for 3 * InString
152     kCycles for 3 * crt_strstr
1141    kCycles for 3 * BMBinSearch
1155    kCycles for 3 * BMHBinsearch
16      kCycles for 3 * InstrJJ

18057   = eax MB Instr_
18057   = eax InString
18057   = eax crt_strstr
18057   = eax BMBinSearch
18057   = eax BMHBinsearch
18057   = eax InstrJJ

Testing if [This is a simple string which has toward...] contains Dupli
1716    kCycles for 10000 * MB Instr_
2546    kCycles for 10000 * InString
1804    kCycles for 10000 * crt_strstr
2392    kCycles for 10000 * BMBinSearch
2447    kCycles for 10000 * BMHBinsearch
458     kCycles for 10000 * InstrJJ

1742    kCycles for 10000 * MB Instr_
2535    kCycles for 10000 * InString
1700    kCycles for 10000 * crt_strstr
2352    kCycles for 10000 * BMBinSearch
2425    kCycles for 10000 * BMHBinsearch
1111    kCycles for 10000 * InstrJJ

1656    kCycles for 10000 * MB Instr_
2525    kCycles for 10000 * InString
1854    kCycles for 10000 * crt_strstr
2295    kCycles for 10000 * BMBinSearch
2444    kCycles for 10000 * BMHBinsearch
1119    kCycles for 10000 * InstrJJ

75      = eax MB Instr_
75      = eax InString
75      = eax crt_strstr
0       = eax BMBinSearch
75      = eax BMHBinsearch
75      = eax InstrJJ

--- ok ---


Gunther
Title: Re: Instr timings: Boyer-Moore not working
Post by: Siekmanski on March 09, 2014, 12:44:51 AM
Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz (SSE4)
+++++++++++++++++++1 of 20 tests valid, loop overhead is approx. 13/3 cycles

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_RM_FLAG_D
O_NOT_RESET_RM_AT_NEXT_START
18560   kCycles for 3 * MB Instr_
33473   kCycles for 3 * InString
26370   kCycles for 3 * crt_strstr
6243    kCycles for 3 * BMBinSearch
9480    kCycles for 3 * BMHBinsearch
7954    kCycles for 3 * InstrJJ

18421   kCycles for 3 * MB Instr_
33493   kCycles for 3 * InString
25844   kCycles for 3 * crt_strstr
6192    kCycles for 3 * BMBinSearch
9686    kCycles for 3 * BMHBinsearch
8163    kCycles for 3 * InstrJJ

18932   kCycles for 3 * MB Instr_
33374   kCycles for 3 * InString
26216   kCycles for 3 * crt_strstr
6265    kCycles for 3 * BMBinSearch
9498    kCycles for 3 * BMHBinsearch
8013    kCycles for 3 * InstrJJ

2026935 = eax MB Instr_
2026935 = eax InString
2026935 = eax crt_strstr
2026935 = eax BMBinSearch
2026935 = eax BMHBinsearch
2026935 = eax InstrJJ

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_
143     kCycles for 3 * MB Instr_
5745    kCycles for 3 * InString
312     kCycles for 3 * crt_strstr
1798    kCycles for 3 * BMBinSearch
1400    kCycles for 3 * BMHBinsearch
20      kCycles for 3 * InstrJJ

143     kCycles for 3 * MB Instr_
6027    kCycles for 3 * InString
181     kCycles for 3 * crt_strstr
1561    kCycles for 3 * BMBinSearch
1504    kCycles for 3 * BMHBinsearch
20      kCycles for 3 * InstrJJ

143     kCycles for 3 * MB Instr_
6113    kCycles for 3 * InString
175     kCycles for 3 * crt_strstr
1692    kCycles for 3 * BMBinSearch
1504    kCycles for 3 * BMHBinsearch
19      kCycles for 3 * InstrJJ

18057   = eax MB Instr_
18057   = eax InString
18057   = eax crt_strstr
18057   = eax BMBinSearch
18057   = eax BMHBinsearch
18057   = eax InstrJJ

Testing if [This is a simple string which has toward...] contains Dupli
2166    kCycles for 10000 * MB Instr_
3191    kCycles for 10000 * InString
2224    kCycles for 10000 * crt_strstr
2962    kCycles for 10000 * BMBinSearch
3092    kCycles for 10000 * BMHBinsearch
787     kCycles for 10000 * InstrJJ

2109    kCycles for 10000 * MB Instr_
3189    kCycles for 10000 * InString
2233    kCycles for 10000 * crt_strstr
3004    kCycles for 10000 * BMBinSearch
3064    kCycles for 10000 * BMHBinsearch
585     kCycles for 10000 * InstrJJ

2194    kCycles for 10000 * MB Instr_
3162    kCycles for 10000 * InString
2214    kCycles for 10000 * crt_strstr
2927    kCycles for 10000 * BMBinSearch
3095    kCycles for 10000 * BMHBinsearch
547     kCycles for 10000 * InstrJJ

75      = eax MB Instr_
75      = eax InString
75      = eax crt_strstr
0       = eax BMBinSearch
75      = eax BMHBinsearch
75      = eax InstrJJ

--- ok ---
Title: Re: Instr timings: Boyer-Moore not working
Post by: Ficko on March 09, 2014, 01:53:08 AM
I just see no clock rate, why ?  ::)


AMD FX(tm)-8320 Eight-Core Processor            (SSE4)
++++++++++++++++++++
Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_RM_FLAG_D
O_NOT_RESET_RM_AT_NEXT_START
22894   kCycles for 3 * MB Instr_
42856   kCycles for 3 * InString
33888   kCycles for 3 * crt_strstr
5857    kCycles for 3 * BMBinSearch
9078    kCycles for 3 * BMHBinsearch
9763    kCycles for 3 * InstrJJ

23025   kCycles for 3 * MB Instr_
42396   kCycles for 3 * InString
33900   kCycles for 3 * crt_strstr
5982    kCycles for 3 * BMBinSearch
9084    kCycles for 3 * BMHBinsearch
9871    kCycles for 3 * InstrJJ

24064   kCycles for 3 * MB Instr_
42467   kCycles for 3 * InString
33944   kCycles for 3 * crt_strstr
5732    kCycles for 3 * BMBinSearch
8951    kCycles for 3 * BMHBinsearch
9755    kCycles for 3 * InstrJJ

2026935 = eax MB Instr_
2026935 = eax InString
2026935 = eax crt_strstr
2026935 = eax BMBinSearch
2026935 = eax BMHBinsearch
2026935 = eax InstrJJ

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_
198     kCycles for 3 * MB Instr_
13271   kCycles for 3 * InString
295     kCycles for 3 * crt_strstr
1845    kCycles for 3 * BMBinSearch
1883    kCycles for 3 * BMHBinsearch
80      kCycles for 3 * InstrJJ

200     kCycles for 3 * MB Instr_
8780    kCycles for 3 * InString
296     kCycles for 3 * crt_strstr
4574    kCycles for 3 * BMBinSearch
1896    kCycles for 3 * BMHBinsearch
70      kCycles for 3 * InstrJJ

196     kCycles for 3 * MB Instr_
8768    kCycles for 3 * InString
296     kCycles for 3 * crt_strstr
5267    kCycles for 3 * BMBinSearch
1788    kCycles for 3 * BMHBinsearch
79      kCycles for 3 * InstrJJ

18057   = eax MB Instr_
18057   = eax InString
18057   = eax crt_strstr
18057   = eax BMBinSearch
18057   = eax BMHBinsearch
18057   = eax InstrJJ

Testing if [This is a simple string which has toward...] contains Dupli
2243    kCycles for 10000 * MB Instr_
3933    kCycles for 10000 * InString
2940    kCycles for 10000 * crt_strstr
3951    kCycles for 10000 * BMBinSearch
4177    kCycles for 10000 * BMHBinsearch
749     kCycles for 10000 * InstrJJ

2274    kCycles for 10000 * MB Instr_
3898    kCycles for 10000 * InString
2911    kCycles for 10000 * crt_strstr
4014    kCycles for 10000 * BMBinSearch
4104    kCycles for 10000 * BMHBinsearch
741     kCycles for 10000 * InstrJJ

2570    kCycles for 10000 * MB Instr_
3881    kCycles for 10000 * InString
2887    kCycles for 10000 * crt_strstr
3903    kCycles for 10000 * BMBinSearch
4143    kCycles for 10000 * BMHBinsearch
913     kCycles for 10000 * InstrJJ

75      = eax MB Instr_
75      = eax InString
75      = eax crt_strstr
0       = eax BMBinSearch
75      = eax BMHBinsearch
75      = eax InstrJJ
Title: Re: Instr timings: Boyer-Moore not working
Post by: Ficko on March 09, 2014, 02:24:48 AM
I do not know exactly why the CPU clocking is not displayed but I do see a big bug in logic for timing multi core AMD cpus.

Quoteinvoke SetProcessAffinityMask, -1, 1   ; restrict to one core

Modern AMD multi core CPUs are running in very low clocking if the average of all cores are not above a certain % - I do not know the exact threshold but ~30% -

Mine is running at 1400Mhz but can go up to 3600Mhz on demand.

So to force the CPUs be clocked at max all cores have to be make working.
1 core 100% will not make the clocking go up.
Title: Re: Instr timings: Boyer-Moore not working
Post by: KeepingRealBusy on March 09, 2014, 02:36:29 AM
That is interesting, but not a happy interesting. On a 4 core AMD, to get good timings, you have to kick off 3 cores, locked to the cores, each in a dummy loop burning 30% cpu time, then kick off your timing test locked on the the 4th core? What a bummer. What does that do to the CPU temp?

Dave.
Title: Re: Instr timings: Boyer-Moore not working
Post by: jj2007 on March 09, 2014, 03:29:08 AM
Quote from: Ficko on March 09, 2014, 02:24:48 AMSo to force the CPUs be clocked at max all cores have to be make working.

That's correct, but the purpose of lab benchmarks is to test relative performance of algos, not of CPUs.

Thanks to all for the interesting results. I am tempted to add InstrJJ to the MB lib, but with source & pattern len passed as parameter, i.e. not limited to zero-delimited strings. For parsing files, the source len is known, and it seems that without the test for zero bytes the algo could be 15-20% faster.
Title: Re: Instr timings: Boyer-Moore not working
Post by: Ficko on March 09, 2014, 03:35:35 AM
Yes, depending what you wanna time.

If you wanna compare 2 or more algos execution time but not interested how it compares to other CPUs than the above timing can work under certain circumstances just fine.

Like in my case with 8 cores the clock will not likely to change during the measurement.
But having 4 cores only making 1 core running on 100% would perhaps force clocking fluctuation.

I am not an AMD expert but may it is possible to force programmatically the CPU to run on a certain clock or not to change its clocking.
Title: Re: Instr timings: Boyer-Moore not working
Post by: KeepingRealBusy on March 09, 2014, 03:45:27 AM
Ficko,

I'll check to see what AMD says. I certainly want to run my programs as fast as I can (take as short a time as possible to run), and my algos are usually not easily able to be split into multi core threading algos (they all use huge memory buffers and if I used multi threading, I'd get killed with memory paging).

Dave.
Title: Re: Instr timings: Boyer-Moore not working
Post by: Ficko on March 09, 2014, 04:27:04 AM
Hi Dave,
Have found something maybe interessting little OS patching to squeeze out more of "Bulldozers".

https://forums.geforce.com/default/topic/548205/pc-components/amd-cpu-clock-guide-comparison-bulldozer-vs-piledriver-/ (https://forums.geforce.com/default/topic/548205/pc-components/amd-cpu-clock-guide-comparison-bulldozer-vs-piledriver-/)
Title: Re: Instr timings: Boyer-Moore not working
Post by: KeepingRealBusy on March 09, 2014, 04:57:31 AM
Thank you for the link.

Dave.
Title: Re: Instr timings: Boyer-Moore not working
Post by: Ficko on March 09, 2014, 06:32:07 AM
I have added a combined "Instr/RevInstr" algo more size as speed optimized but performs well on random lenght string/pattern since it will quit if LEN(pattern) > LEN(string).


AMD FX(tm)-8320 Eight-Core Processor            (SSE4)
++++++++++++++++++++
Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_RM_FLAG_D
O_NOT_RESET_RM_AT_NEXT_START
22823   kCycles for 3 * MB Instr_
42185   kCycles for 3 * InString
33973   kCycles for 3 * crt_strstr
5473    kCycles for 3 * BMBinSearch
8673    kCycles for 3 * BMHBinsearch
9792    kCycles for 3 * InstrJJ
40499   kCycles for 3 * InstrFicko
1760    kCycles for 3 * RevInstrFicko

22453   kCycles for 3 * MB Instr_
42327   kCycles for 3 * InString
33933   kCycles for 3 * crt_strstr
5614    kCycles for 3 * BMBinSearch
8806    kCycles for 3 * BMHBinsearch
9729    kCycles for 3 * InstrJJ
40208   kCycles for 3 * InstrFicko
2333    kCycles for 3 * RevInstrFicko

22461   kCycles for 3 * MB Instr_
42165   kCycles for 3 * InString
33840   kCycles for 3 * crt_strstr
5849    kCycles for 3 * BMBinSearch
8807    kCycles for 3 * BMHBinsearch
9910    kCycles for 3 * InstrJJ
39540   kCycles for 3 * InstrFicko
1867    kCycles for 3 * RevInstrFicko

2026935 = eax MB Instr_
2026935 = eax InString
2026935 = eax crt_strstr
2026935 = eax BMBinSearch
2026935 = eax BMHBinsearch
2026935 = eax InstrJJ
2026935 = eax InstrFicko
2026935 = eax RevInstrFicko

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_
192     kCycles for 3 * MB Instr_
12205   kCycles for 3 * InString
300     kCycles for 3 * crt_strstr
1485    kCycles for 3 * BMBinSearch
1634    kCycles for 3 * BMHBinsearch
61      kCycles for 3 * InstrJJ
1426    kCycles for 3 * InstrFicko
1580    kCycles for 3 * RevInstrFicko

194     kCycles for 3 * MB Instr_
12415   kCycles for 3 * InString
293     kCycles for 3 * crt_strstr
1469    kCycles for 3 * BMBinSearch
1605    kCycles for 3 * BMHBinsearch
79      kCycles for 3 * InstrJJ
2699    kCycles for 3 * InstrFicko
1472    kCycles for 3 * RevInstrFicko

195     kCycles for 3 * MB Instr_
11464   kCycles for 3 * InString
295     kCycles for 3 * crt_strstr
1472    kCycles for 3 * BMBinSearch
1538    kCycles for 3 * BMHBinsearch
81      kCycles for 3 * InstrJJ
1616    kCycles for 3 * InstrFicko
1586    kCycles for 3 * RevInstrFicko

18057   = eax MB Instr_
18057   = eax InString
18057   = eax crt_strstr
18057   = eax BMBinSearch
18057   = eax BMHBinsearch
18057   = eax InstrJJ
18057   = eax InstrFicko
2028751 = eax RevInstrFicko

Testing if [This is a simple string which has toward...] contains Dupli
2256    kCycles for 10000 * MB Instr_
4103    kCycles for 10000 * InString
2918    kCycles for 10000 * crt_strstr
4292    kCycles for 10000 * BMBinSearch
4371    kCycles for 10000 * BMHBinsearch
766     kCycles for 10000 * InstrJJ
3641    kCycles for 10000 * InstrFicko
2847    kCycles for 10000 * RevInstrFicko

2294    kCycles for 10000 * MB Instr_
4264    kCycles for 10000 * InString
2919    kCycles for 10000 * crt_strstr
4260    kCycles for 10000 * BMBinSearch
4542    kCycles for 10000 * BMHBinsearch
1072    kCycles for 10000 * InstrJJ
3689    kCycles for 10000 * InstrFicko
2833    kCycles for 10000 * RevInstrFicko

2236    kCycles for 10000 * MB Instr_
4074    kCycles for 10000 * InString
3002    kCycles for 10000 * crt_strstr
4284    kCycles for 10000 * BMBinSearch
4103    kCycles for 10000 * BMHBinsearch
806     kCycles for 10000 * InstrJJ
3623    kCycles for 10000 * InstrFicko
2912    kCycles for 10000 * RevInstrFicko

75      = eax MB Instr_
75      = eax InString
75      = eax crt_strstr
0       = eax BMBinSearch
75      = eax BMHBinsearch
75      = eax InstrJJ
75      = eax InstrFicko
75      = eax RevInstrFicko
--- ok ---
Title: Re: Instr timings: Boyer-Moore not working
Post by: jj2007 on March 09, 2014, 08:55:41 AM
Thanks a lot, Ficko. I have added your algo to the original testbed, attached below, and here are my results:
Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
+++++++13 of 20 tests valid, loop overhead is approx. 10/3 cycles

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_RM_FLAG_DO_NOT_RESET_RM
_AT_NEXT_START
28115   kCycles for 3 * MB Instr_
32154   kCycles for 3 * InString
29185   kCycles for 3 * crt_strstr
18713   kCycles for 3 * BMHBinSearch, using len
7549    kCycles for 3 * BMHBinsearch, known length
9636    kCycles for 3 * InstrJJ
60151   kCycles for 3 * InstrFicko
5172    kCycles for 3 * RevInstrFicko
4894    kCycles for 3 * MB Rinstr

28012   kCycles for 3 * MB Instr_
32150   kCycles for 3 * InString
29138   kCycles for 3 * crt_strstr
18742   kCycles for 3 * BMHBinSearch, using len
7538    kCycles for 3 * BMHBinsearch, known length
9577    kCycles for 3 * InstrJJ
60022   kCycles for 3 * InstrFicko
5156    kCycles for 3 * RevInstrFicko
4900    kCycles for 3 * MB Rinstr

2026935 = eax MB Instr_
2026935 = eax InString
2026935 = eax crt_strstr
2026935 = eax BMHBinSearch, using len
2026935 = eax BMHBinsearch, known length
2026935 = eax InstrJJ
2026935 = eax InstrFicko
2026935 = eax RevInstrFicko
2026935 = eax MB Rinstr

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_
245     kCycles for 3 * MB Instr_
8609    kCycles for 3 * InString
237     kCycles for 3 * crt_strstr
11329   kCycles for 3 * BMHBinSearch, using len
120     kCycles for 3 * BMHBinsearch, known length
57      kCycles for 3 * InstrJJ
5011    kCycles for 3 * InstrFicko
5172    kCycles for 3 * RevInstrFicko
4875    kCycles for 3 * MB Rinstr

245     kCycles for 3 * MB Instr_
8611    kCycles for 3 * InString
237     kCycles for 3 * crt_strstr
11332   kCycles for 3 * BMHBinSearch, using len
118     kCycles for 3 * BMHBinsearch, known length
58      kCycles for 3 * InstrJJ
5011    kCycles for 3 * InstrFicko
5186    kCycles for 3 * RevInstrFicko
4935    kCycles for 3 * MB Rinstr

18057   = eax MB Instr_
18057   = eax InString
18057   = eax crt_strstr
18057   = eax BMHBinSearch, using len
18057   = eax BMHBinsearch, known length
18057   = eax InstrJJ
18057   = eax InstrFicko
2028751 = eax RevInstrFicko
2028751 = eax MB Rinstr

Testing if [This is a simple string which has at the...] contains Dupli
2869    kCycles for 10000 * MB Instr_
3168    kCycles for 10000 * InString
2433    kCycles for 10000 * crt_strstr
6243    kCycles for 10000 * BMHBinSearch, using len
5298    kCycles for 10000 * BMHBinsearch, known length
869     kCycles for 10000 * InstrJJ
4091    kCycles for 10000 * InstrFicko
1974    kCycles for 10000 * RevInstrFicko
1171    kCycles for 10000 * MB Rinstr

2881    kCycles for 10000 * MB Instr_
3177    kCycles for 10000 * InString
2433    kCycles for 10000 * crt_strstr
6366    kCycles for 10000 * BMHBinSearch, using len
5329    kCycles for 10000 * BMHBinsearch, known length
870     kCycles for 10000 * InstrJJ
4085    kCycles for 10000 * InstrFicko
1974    kCycles for 10000 * RevInstrFicko
1171    kCycles for 10000 * MB Rinstr


Note that I replaced the not-so-reliable standard Boyer-Moore algo with its Horspool variant, in two flavours, one using len() to determine the size of the source, the other using a precalculated source length.
Title: Re: Instr timings: Boyer-Moore not working
Post by: dedndave on March 09, 2014, 11:33:48 AM
prescott w/htt
Intel(R) Pentium(R) 4 CPU 3.00GHz (SSE3)
++++++++++++++++4 of 20 tests valid, loop overhead is approx. 11/3 cycles

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_RM_FLAG_D
O_NOT_RESET_RM_AT_NEXT_START
39387   kCycles for 3 * MB Instr_
57139   kCycles for 3 * InString
45535   kCycles for 3 * crt_strstr
31791   kCycles for 3 * BMHBinSearch, using len
12079   kCycles for 3 * BMHBinsearch, known length
23119   kCycles for 3 * InstrJJ
85090   kCycles for 3 * InstrFicko
6215    kCycles for 3 * RevInstrFicko
5637    kCycles for 3 * MB Rinstr

39401   kCycles for 3 * MB Instr_
56685   kCycles for 3 * InString
45497   kCycles for 3 * crt_strstr
31761   kCycles for 3 * BMHBinSearch, using len
11845   kCycles for 3 * BMHBinsearch, known length
23208   kCycles for 3 * InstrJJ
85101   kCycles for 3 * InstrFicko
6475    kCycles for 3 * RevInstrFicko
5527    kCycles for 3 * MB Rinstr

2026771 = eax MB Instr_
2026771 = eax InString
2026771 = eax crt_strstr
2026771 = eax BMHBinSearch, using len
2026771 = eax BMHBinsearch, known length
2026771 = eax InstrJJ
2026771 = eax InstrFicko
2026771 = eax RevInstrFicko
2026771 = eax MB Rinstr

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_
336     kCycles for 3 * MB Instr_
12614   kCycles for 3 * InString
340     kCycles for 3 * crt_strstr
19855   kCycles for 3 * BMHBinSearch, using len
172     kCycles for 3 * BMHBinsearch, known length
142     kCycles for 3 * InstrJJ
5976    kCycles for 3 * InstrFicko
6208    kCycles for 3 * RevInstrFicko
5294    kCycles for 3 * MB Rinstr

335     kCycles for 3 * MB Instr_
12458   kCycles for 3 * InString
370     kCycles for 3 * crt_strstr
19813   kCycles for 3 * BMHBinSearch, using len
170     kCycles for 3 * BMHBinsearch, known length
142     kCycles for 3 * InstrJJ
5924    kCycles for 3 * InstrFicko
6269    kCycles for 3 * RevInstrFicko
5520    kCycles for 3 * MB Rinstr

17893   = eax MB Instr_
17893   = eax InString
17893   = eax crt_strstr
17893   = eax BMHBinSearch, using len
17893   = eax BMHBinsearch, known length
17893   = eax InstrJJ
17893   = eax InstrFicko
2028587 = eax RevInstrFicko
2028587 = eax MB Rinstr

Testing if [This is a simple string which has at the...] contains Dupli
4017    kCycles for 10000 * MB Instr_
4751    kCycles for 10000 * InString
3348    kCycles for 10000 * crt_strstr
9705    kCycles for 10000 * BMHBinSearch, using len
7654    kCycles for 10000 * BMHBinsearch, known length
1389    kCycles for 10000 * InstrJJ
5823    kCycles for 10000 * InstrFicko
5583    kCycles for 10000 * RevInstrFicko
1948    kCycles for 10000 * MB Rinstr

4044    kCycles for 10000 * MB Instr_
4726    kCycles for 10000 * InString
3368    kCycles for 10000 * crt_strstr
9642    kCycles for 10000 * BMHBinSearch, using len
7655    kCycles for 10000 * BMHBinsearch, known length
1390    kCycles for 10000 * InstrJJ
5974    kCycles for 10000 * InstrFicko
5585    kCycles for 10000 * RevInstrFicko
1944    kCycles for 10000 * MB Rinstr

70      = eax MB Instr_
70      = eax InString
70      = eax crt_strstr
70      = eax BMHBinSearch, using len
70      = eax BMHBinsearch, known length
70      = eax InstrJJ
70      = eax InstrFicko
70      = eax RevInstrFicko
70      = eax MB Rinstr
Title: Re: Instr timings: Boyer-Moore not working
Post by: Siekmanski on March 09, 2014, 07:02:46 PM
[codeIntel(R) Core(TM) i7-4930K CPU @ 3.40GHz (SSE4)
+++++++++++++++++++1 of 20 tests valid, loop overhead is approx. 15/3 cycles

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_RM_FLAG_D
O_NOT_RESET_RM_AT_NEXT_START
15985   kCycles for 3 * MB Instr_
27849   kCycles for 3 * InString
22156   kCycles for 3 * crt_strstr
14938   kCycles for 3 * BMHBinSearch, using len
6891    kCycles for 3 * BMHBinsearch, known length
6804    kCycles for 3 * InstrJJ
43312   kCycles for 3 * InstrFicko
1695    kCycles for 3 * RevInstrFicko
1368    kCycles for 3 * MB Rinstr

15967   kCycles for 3 * MB Instr_
27955   kCycles for 3 * InString
22148   kCycles for 3 * crt_strstr
14943   kCycles for 3 * BMHBinSearch, using len
6900    kCycles for 3 * BMHBinsearch, known length
6802    kCycles for 3 * InstrJJ
43323   kCycles for 3 * InstrFicko
1691    kCycles for 3 * RevInstrFicko
1371    kCycles for 3 * MB Rinstr

2026935 = eax MB Instr_
2026935 = eax InString
2026935 = eax crt_strstr
2026935 = eax BMHBinSearch, using len
2026935 = eax BMHBinsearch, known length
2026935 = eax InstrJJ
2026935 = eax InstrFicko
2026935 = eax RevInstrFicko
2026935 = eax MB Rinstr

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_
145     kCycles for 3 * MB Instr_
8892    kCycles for 3 * InString
171     kCycles for 3 * crt_strstr
8151    kCycles for 3 * BMHBinSearch, using len
78      kCycles for 3 * BMHBinsearch, known length
21      kCycles for 3 * InstrJJ
4158    kCycles for 3 * InstrFicko
2351    kCycles for 3 * RevInstrFicko
1844    kCycles for 3 * MB Rinstr

143     kCycles for 3 * MB Instr_
5014    kCycles for 3 * InString
167     kCycles for 3 * crt_strstr
8134    kCycles for 3 * BMHBinSearch, using len
74      kCycles for 3 * BMHBinsearch, known length
19      kCycles for 3 * InstrJJ
1592    kCycles for 3 * InstrFicko
1730    kCycles for 3 * RevInstrFicko
1357    kCycles for 3 * MB Rinstr

18057   = eax MB Instr_
18057   = eax InString
18057   = eax crt_strstr
18057   = eax BMHBinSearch, using len
18057   = eax BMHBinsearch, known length
18057   = eax InstrJJ
18057   = eax InstrFicko
2028751 = eax RevInstrFicko
2028751 = eax MB Rinstr

Testing if [This is a simple string which has at the...] contains Dupli
1825    kCycles for 10000 * MB Instr_
2745    kCycles for 10000 * InString
1525    kCycles for 10000 * crt_strstr
3327    kCycles for 10000 * BMHBinSearch, using len
2402    kCycles for 10000 * BMHBinsearch, known length
520     kCycles for 10000 * InstrJJ
2895    kCycles for 10000 * InstrFicko
2707    kCycles for 10000 * RevInstrFicko
828     kCycles for 10000 * MB Rinstr

1842    kCycles for 10000 * MB Instr_
2746    kCycles for 10000 * InString
1532    kCycles for 10000 * crt_strstr
3354    kCycles for 10000 * BMHBinSearch, using len
2409    kCycles for 10000 * BMHBinsearch, known length
523     kCycles for 10000 * InstrJJ
2897    kCycles for 10000 * InstrFicko
2705    kCycles for 10000 * RevInstrFicko
835     kCycles for 10000 * MB Rinstr

70      = eax MB Instr_
70      = eax InString
70      = eax crt_strstr
70      = eax BMHBinSearch, using len
70      = eax BMHBinsearch, known length
70      = eax InstrJJ
70      = eax InstrFicko
70      = eax RevInstrFicko
70      = eax MB Rinstr

--- ok ---]
Title: Re: Instr timings: Boyer-Moore not working
Post by: sinsi on March 09, 2014, 07:37:25 PM
Maybe you should test a deliberate failure? I would imagine that testing strings would give lots of fails i.e. substring not found.
Title: Re: Instr timings: Boyer-Moore not working
Post by: jj2007 on March 09, 2014, 09:08:16 PM
Quote from: sinsi on March 09, 2014, 07:37:25 PM
Maybe you should test a deliberate failure? I would imagine that testing strings would give lots of fails i.e. substring not found.
Good idea - "No such string" in Windows.inc:

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains No such string
28235   kCycles for 3 * MB Instr_
30764   kCycles for 3 * InString
28484   kCycles for 3 * crt_strstr
30098   kCycles for 3 * BMHBinSearch, using len
18890   kCycles for 3 * BMHBinsearch, known length
8827    kCycles for 3 * InstrJJ
54616   kCycles for 3 * InstrFicko
36272   kCycles for 3 * RevInstrFicko
30595   kCycles for 3 * MB Rinstr

28191   kCycles for 3 * MB Instr_
30792   kCycles for 3 * InString
28334   kCycles for 3 * crt_strstr
30071   kCycles for 3 * BMHBinSearch, using len
19009   kCycles for 3 * BMHBinsearch, known length
8826    kCycles for 3 * InstrJJ
54628   kCycles for 3 * InstrFicko
36266   kCycles for 3 * RevInstrFicko
30599   kCycles for 3 * MB Rinstr
Title: Re: Instr timings: Boyer-Moore not working
Post by: Gunther on March 09, 2014, 09:44:06 PM
Jochen,

the timings from 0803:

Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz (SSE4)
+++++++++++++++++++1 of 20 tests valid, loop overhead is approx. 6/3 cycles

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_RM_FLAG_D
O_NOT_RESET_RM_AT_NEXT_START
14064   kCycles for 3 * MB Instr_
24877   kCycles for 3 * InString
19507   kCycles for 3 * crt_strstr
13258   kCycles for 3 * BMHBinSearch, using len
6090    kCycles for 3 * BMHBinsearch, known length
6102    kCycles for 3 * InstrJJ
38735   kCycles for 3 * InstrFicko
3651    kCycles for 3 * RevInstrFicko
1460    kCycles for 3 * MB Rinstr

15733   kCycles for 3 * MB Instr_
24878   kCycles for 3 * InString
23220   kCycles for 3 * crt_strstr
13145   kCycles for 3 * BMHBinSearch, using len
6102    kCycles for 3 * BMHBinsearch, known length
6154    kCycles for 3 * InstrJJ
38324   kCycles for 3 * InstrFicko
1499    kCycles for 3 * RevInstrFicko
1164    kCycles for 3 * MB Rinstr

2026935 = eax MB Instr_
2026935 = eax InString
2026935 = eax crt_strstr
2026935 = eax BMHBinSearch, using len
2026935 = eax BMHBinsearch, known length
2026935 = eax InstrJJ
2026935 = eax InstrFicko
2026935 = eax RevInstrFicko
2026935 = eax MB Rinstr

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains TXFS_
124     kCycles for 3 * MB Instr_
4392    kCycles for 3 * InString
147     kCycles for 3 * crt_strstr
7152    kCycles for 3 * BMHBinSearch, using len
68      kCycles for 3 * BMHBinsearch, known length
42      kCycles for 3 * InstrJJ
3423    kCycles for 3 * InstrFicko
2005    kCycles for 3 * RevInstrFicko
1501    kCycles for 3 * MB Rinstr

125     kCycles for 3 * MB Instr_
7601    kCycles for 3 * InString
148     kCycles for 3 * crt_strstr
10462   kCycles for 3 * BMHBinSearch, using len
89      kCycles for 3 * BMHBinsearch, known length
41      kCycles for 3 * InstrJJ
3503    kCycles for 3 * InstrFicko
3667    kCycles for 3 * RevInstrFicko
1454    kCycles for 3 * MB Rinstr

18057   = eax MB Instr_
18057   = eax InString
18057   = eax crt_strstr
18057   = eax BMHBinSearch, using len
18057   = eax BMHBinsearch, known length
18057   = eax InstrJJ
18057   = eax InstrFicko
2028751 = eax RevInstrFicko
2028751 = eax MB Rinstr

Testing if [comment * -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=...] contains No such string

14289   kCycles for 3 * MB Instr_
23227   kCycles for 3 * InString
21122   kCycles for 3 * crt_strstr
21344   kCycles for 3 * BMHBinSearch, using len
17299   kCycles for 3 * BMHBinsearch, known length
5317    kCycles for 3 * InstrJJ
35784   kCycles for 3 * InstrFicko
16121   kCycles for 3 * RevInstrFicko
15147   kCycles for 3 * MB Rinstr

14124   kCycles for 3 * MB Instr_
23151   kCycles for 3 * InString
18370   kCycles for 3 * crt_strstr
21374   kCycles for 3 * BMHBinSearch, using len
14369   kCycles for 3 * BMHBinsearch, known length
5282    kCycles for 3 * InstrJJ
33821   kCycles for 3 * InstrFicko
16071   kCycles for 3 * RevInstrFicko
18405   kCycles for 3 * MB Rinstr

0       = eax MB Instr_
0       = eax InString
0       = eax crt_strstr
0       = eax BMHBinSearch, using len
0       = eax BMHBinsearch, known length
0       = eax InstrJJ
0       = eax InstrFicko
0       = eax RevInstrFicko
0       = eax MB Rinstr

Testing if [This is a simple string which has at the...] contains Dupli
1567    kCycles for 10000 * MB Instr_
2370    kCycles for 10000 * InString
1332    kCycles for 10000 * crt_strstr
2814    kCycles for 10000 * BMHBinSearch, using len
2039    kCycles for 10000 * BMHBinsearch, known length
1180    kCycles for 10000 * InstrJJ
6366    kCycles for 10000 * InstrFicko
2789    kCycles for 10000 * RevInstrFicko
698     kCycles for 10000 * MB Rinstr

1571    kCycles for 10000 * MB Instr_
2389    kCycles for 10000 * InString
1319    kCycles for 10000 * crt_strstr
2804    kCycles for 10000 * BMHBinSearch, using len
4874    kCycles for 10000 * BMHBinsearch, known length
481     kCycles for 10000 * InstrJJ
2591    kCycles for 10000 * InstrFicko
2420    kCycles for 10000 * RevInstrFicko
747     kCycles for 10000 * MB Rinstr

70      = eax MB Instr_
70      = eax InString
70      = eax crt_strstr
70      = eax BMHBinSearch, using len
70      = eax BMHBinsearch, known length
70      = eax InstrJJ
70      = eax InstrFicko
70      = eax RevInstrFicko
70      = eax MB Rinstr

--- ok ---


Gunther
Title: Re: Instr timings: Boyer-Moore not working
Post by: jj2007 on March 09, 2014, 11:38:59 PM
Thanks, Gunther. Your results show clearly that getting the source len is not a realistic option for the Boyer-Moore algos:

7152    kCycles for 3 * BMHBinSearch, using len
68      kCycles for 3 * BMHBinsearch, known length
42      kCycles for 3 * InstrJJ

It gives me some ideas  :P
Title: Re: Instr timings: Boyer-Moore not working
Post by: Gunther on March 10, 2014, 01:23:23 AM
Jochen,

Quote from: jj2007 on March 09, 2014, 11:38:59 PM
It gives me some ideas  :P

for MB or assembly language in general?

Gunther
Title: Re: Instr timings: Boyer-Moore not working
Post by: MichaelW on March 17, 2014, 01:07:13 AM
Quote from: Ficko on March 09, 2014, 03:35:35 AM
Like in my case with 8 cores the clock will not likely to change during the measurement.
But having 4 cores only making 1 core running on 100% would perhaps force clocking fluctuation.

I am not an AMD expert but may it is possible to force programmatically the CPU to run on a certain clock or not to change its clocking.

The timing in clock cycles is independent of the clock speed.