News:

Masm32 SDK description, downloads and other helpful links
Message to All Guests
NB: Posting URL's See here: Posted URL Change

Main Menu

Bug in Boyer-Moore BM.asm ?

Started by Rav, May 07, 2022, 06:13:41 AM

Previous topic - Next topic

jj2007

#45
@Mineiro - timings using the second version:

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Finding a string 1000 times in a 954kB 'haystack' from\Masm32\include\Windows.inc

... using a 83 bytes substring
1119 ms for CRT strstr          result=977321
225 ms  for BMHBinsearch        result=977321
100 ms  for BMBinSearch         result=977321
94 ms   for Boyer-M Mineiro     result=977321
78 ms   for MB Instr_()         result=977321

... using a 43 bytes substring
1106 ms for CRT strstr          result=977321
326 ms  for BMHBinsearch        result=977321
252 ms  for BMBinSearch         result=977321
241 ms  for Boyer-M Mineiro     result=977321
81 ms   for MB Instr_()         result=977321

... using a 22 bytes substring
1105 ms for CRT strstr          result=977321
640 ms  for BMHBinsearch        result=977321
436 ms  for BMBinSearch         result=977321
410 ms  for Boyer-M Mineiro     result=0
80 ms   for MB Instr_()         result=977321

... using a 11 bytes substring
1099 ms for CRT strstr          result=977321
935 ms  for BMHBinsearch        result=977321
1284 ms for BMBinSearch         result=977321
1372 ms for Boyer-M Mineiro     result=977321
78 ms   for MB Instr_()         result=977321

... using a 6 bytes substring
1090 ms for CRT strstr          result=977321
339 ms  for BMHBinsearch        result=977321
302 ms  for BMBinSearch         result=0
762 ms  for Boyer-M Mineiro     result=977321
80 ms   for MB Instr_()         result=977321

**** last sub$ was [Duplic] -- hit any key ****


Building this project requires MasmBasic version 10 May 22 :cool:

P.S.: The second attachment is built with both your first and your second code, and a finer granularity for the length of the substrings showing where the problems are (idem for BMBinSearch - the Horspool variant seems ok). Here is its output:

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Finding a string 10 times in a 954kB 'haystack' from\Masm32\include\Windows.inc

... using a 83 bytes substring [Duplicate include file windows.inc<crlf>echo ------------------------------------------]
11 ms   for CRT strstr          result=977321
2401 µs for BMHBinsearch        result=977321
1082 µs for BMBinSearch         result=977321
996 µs  for Boyer-M Mineiro 1   result=977321
998 µs  for Boyer-M Mineiro 2   result=977321
835 µs  for MB Instr_()         result=977321

... using a 71 bytes substring [Duplicate include file windows.inc<crlf>echo ------------------------------]
11 ms   for CRT strstr          result=977321
2989 µs for BMHBinsearch        result=977321
1280 µs for BMBinSearch         result=977321
1177 µs for Boyer-M Mineiro 1   result=0
1185 µs for Boyer-M Mineiro 2   result=0
834 µs  for MB Instr_()         result=977321

... using a 60 bytes substring [Duplicate include file windows.inc<crlf>echo -------------------]
11 ms   for CRT strstr          result=977321
3095 µs for BMHBinsearch        result=977321
1620 µs for BMBinSearch         result=977321
1399 µs for Boyer-M Mineiro 1   result=0
1403 µs for Boyer-M Mineiro 2   result=0
834 µs  for MB Instr_()         result=977321

... using a 51 bytes substring [Duplicate include file windows.inc<crlf>echo ----------]
11 ms   for CRT strstr          result=977321
3022 µs for BMHBinsearch        result=977321
1891 µs for BMBinSearch         result=977321
1637 µs for Boyer-M Mineiro 1   result=977321
1665 µs for Boyer-M Mineiro 2   result=977321
843 µs  for MB Instr_()         result=977321

... using a 43 bytes substring [Duplicate include file windows.inc<crlf>echo --]
12 ms   for CRT strstr          result=977321
3486 µs for BMHBinsearch        result=977321
2676 µs for BMBinSearch         result=977321
2751 µs for Boyer-M Mineiro 1   result=977321
2615 µs for Boyer-M Mineiro 2   result=977321
837 µs  for MB Instr_()         result=977321

... using a 37 bytes substring [Duplicate include file windows.inc<crlf>e]
11 ms   for CRT strstr          result=977321
4709 µs for BMHBinsearch        result=977321
2301 µs for BMBinSearch         result=977321
2152 µs for Boyer-M Mineiro 1   result=977321
2377 µs for Boyer-M Mineiro 2   result=977321
1082 µs for MB Instr_()         result=977321

... using a 31 bytes substring [Duplicate include file windows.]
11 ms   for CRT strstr          result=977321
5671 µs for BMHBinsearch        result=977321
2737 µs for BMBinSearch         result=977321
2609 µs for Boyer-M Mineiro 1   result=977321
2715 µs for Boyer-M Mineiro 2   result=977321
1113 µs for MB Instr_()         result=977321

... using a 26 bytes substring [Duplicate include file win]
11 ms   for CRT strstr          result=977321
6292 µs for BMHBinsearch        result=977321
4377 µs for BMBinSearch         result=977321
4338 µs for Boyer-M Mineiro 1   result=0
3937 µs for Boyer-M Mineiro 2   result=0
1188 µs for MB Instr_()         result=977321

... using a 22 bytes substring [Duplicate include file]
11 ms   for CRT strstr          result=977321
6236 µs for BMHBinsearch        result=977321
4743 µs for BMBinSearch         result=977321
4248 µs for Boyer-M Mineiro 1   result=0
4701 µs for Boyer-M Mineiro 2   result=0
1156 µs for MB Instr_()         result=977321

... using a 19 bytes substring [Duplicate include f]
11 ms   for CRT strstr          result=977321
7136 µs for BMHBinsearch        result=977321
10 ms   for BMBinSearch         result=977321
10 ms   for Boyer-M Mineiro 1   result=977321
9423 µs for Boyer-M Mineiro 2   result=977321
921 µs  for MB Instr_()         result=977321

... using a 16 bytes substring [Duplicate includ]
11 ms   for CRT strstr          result=977321
8211 µs for BMHBinsearch        result=977321
5034 µs for BMBinSearch         result=977321
3128 µs for Boyer-M Mineiro 1   result=0
3165 µs for Boyer-M Mineiro 2   result=0
935 µs  for MB Instr_()         result=977321

... using a 14 bytes substring [Duplicate incl]
11 ms   for CRT strstr          result=977321
8635 µs for BMHBinsearch        result=977321
6008 µs for BMBinSearch         result=977321
5635 µs for Boyer-M Mineiro 1   result=977321
5693 µs for Boyer-M Mineiro 2   result=977321
948 µs  for MB Instr_()         result=977321

... using a 12 bytes substring [Duplicate in]
11 ms   for CRT strstr          result=977321
9443 µs for BMHBinsearch        result=977321
8710 µs for BMBinSearch         result=977321
4186 µs for Boyer-M Mineiro 1   result=977321
4147 µs for Boyer-M Mineiro 2   result=977321
973 µs  for MB Instr_()         result=977321

... using a 10 bytes substring [Duplicate ]
11 ms   for CRT strstr          result=977321
6199 µs for BMHBinsearch        result=977321
3345 µs for BMBinSearch         result=0
7191 µs for Boyer-M Mineiro 1   result=977321
6984 µs for Boyer-M Mineiro 2   result=977321
1518 µs for MB Instr_()         result=977321

... using a 9 bytes substring [Duplicate]
11 ms   for CRT strstr          result=977321
3473 µs for BMHBinsearch        result=977321
2853 µs for BMBinSearch         result=977321
5692 µs for Boyer-M Mineiro 1   result=977321
5833 µs for Boyer-M Mineiro 2   result=977321
1111 µs for MB Instr_()         result=977321

... using a 8 bytes substring [Duplicat]
11 ms   for CRT strstr          result=977321
3303 µs for BMHBinsearch        result=977321
2778 µs for BMBinSearch         result=977321
6208 µs for Boyer-M Mineiro 1   result=0
6207 µs for Boyer-M Mineiro 2   result=0
1128 µs for MB Instr_()         result=977321

... using a 7 bytes substring [Duplica]
11 ms   for CRT strstr          result=977321
3625 µs for BMHBinsearch        result=977321
3216 µs for BMBinSearch         result=977321
7117 µs for Boyer-M Mineiro 1   result=977321
7055 µs for Boyer-M Mineiro 2   result=977321
1237 µs for MB Instr_()         result=977321

... using a 6 bytes substring [Duplic]
11 ms   for CRT strstr          result=977321
3689 µs for BMHBinsearch        result=977321
3376 µs for BMBinSearch         result=0
8198 µs for Boyer-M Mineiro 1   result=977321
8125 µs for Boyer-M Mineiro 2   result=977321
1222 µs for MB Instr_()         result=977321

... using a 5 bytes substring [Dupli]
11 ms   for CRT strstr          result=977321
4588 µs for BMHBinsearch        result=977321
4129 µs for BMBinSearch         result=0
9861 µs for Boyer-M Mineiro 1   result=977321
9925 µs for Boyer-M Mineiro 2   result=977321
954 µs  for MB Instr_()         result=977321

... using a 4 bytes substring [Dupl]
4551 µs for CRT strstr          result=391356
2000 µs for BMHBinsearch        result=391356
1821 µs for BMBinSearch         result=391356
12 ms   for Boyer-M Mineiro 1   result=0
12 ms   for Boyer-M Mineiro 2   result=0
455 µs  for MB Instr_()         result=391356

mineiro

Thanks sir jj2007;
I found where the errors sits in my code.
First was an optimization, used "test" instead of "cmp".
Second was an logic error optimization, added more displacement than need.
I have corrected previous posts.
This only occurs after I tried to optimize code.
I'd rather be this ambulant metamorphosis than to have that old opinion about everything

jj2007

Quote from: mineiro on May 10, 2022, 08:22:38 PM
I found where the errors sits in my code.

Thanks, mineiro :thumbsup:

So far no errors found in your code. I am now using a 100MB "haystack" based on Windows.inc with "Doplicate" added to the end for the search string:

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Finding a string 5 times in a 100MB 'haystack' from\Masm32\include\Windows.inc

... using a 83 bytes substring [Doplicate include file windows.inc<crlf>echo -------------------

621 ms  for CRT strstr          result=105560497
618 ms  for M32 InString        result=105560497
137 ms  for BMHBinsearch        result=0
75 ms   for BMBinSearch         result=105560497
178 ms  for Boyer-M Mineiro 1   result=105560497
185 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 71 bytes substring [Doplicate include file windows.inc<crlf>echo -------------------
625 ms  for CRT strstr          result=105560497
588 ms  for M32 InString        result=105560497
167 ms  for BMHBinsearch        result=105560497
84 ms   for BMBinSearch         result=105560497
232 ms  for Boyer-M Mineiro 1   result=105560497
240 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 60 bytes substring [Doplicate include file windows.inc<crlf>echo -------------------
620 ms  for CRT strstr          result=105560497
590 ms  for M32 InString        result=105560497
177 ms  for BMHBinsearch        result=105560497
99 ms   for BMBinSearch         result=105560497
247 ms  for Boyer-M Mineiro 1   result=105560497
251 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 51 bytes substring [Doplicate include file windows.inc<crlf>echo ----------]
613 ms  for CRT strstr          result=105560497
576 ms  for M32 InString        result=105560497
176 ms  for BMHBinsearch        result=105560497
117 ms  for BMBinSearch         result=105560497
242 ms  for Boyer-M Mineiro 1   result=105560497
243 ms  for Boyer-M Mineiro 2   result=105560497
63 ms   for MB Instr_()         result=105560497

... using a 43 bytes substring [Doplicate include file windows.inc<crlf>echo --]
626 ms  for CRT strstr          result=105560497
578 ms  for M32 InString        result=105560497
181 ms  for BMHBinsearch        result=105560497
151 ms  for BMBinSearch         result=105560497
203 ms  for Boyer-M Mineiro 1   result=105560497
205 ms  for Boyer-M Mineiro 2   result=105560497
64 ms   for MB Instr_()         result=105560497

... using a 37 bytes substring [Doplicate include file windows.inc<crlf>e]
616 ms  for CRT strstr          result=105560497
586 ms  for M32 InString        result=105560497
262 ms  for BMHBinsearch        result=105560497
124 ms  for BMBinSearch         result=105560497
137 ms  for Boyer-M Mineiro 1   result=105560497
143 ms  for Boyer-M Mineiro 2   result=105560497
63 ms   for MB Instr_()         result=105560497

... using a 31 bytes substring [Doplicate include file windows.]
618 ms  for CRT strstr          result=105560497
582 ms  for M32 InString        result=105560497
310 ms  for BMHBinsearch        result=105560497
148 ms  for BMBinSearch         result=105560497
152 ms  for Boyer-M Mineiro 1   result=105560497
154 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 26 bytes substring [Doplicate include file win]
622 ms  for CRT strstr          result=105560497
582 ms  for M32 InString        result=105560497
344 ms  for BMHBinsearch        result=105560497
254 ms  for BMBinSearch         result=105560497
255 ms  for Boyer-M Mineiro 1   result=105560497
246 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 22 bytes substring [Doplicate include file]
624 ms  for CRT strstr          result=105560497
581 ms  for M32 InString        result=105560497
359 ms  for BMHBinsearch        result=105560497
259 ms  for BMBinSearch         result=105560497
252 ms  for Boyer-M Mineiro 1   result=105560497
254 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 19 bytes substring [Doplicate include f]
620 ms  for CRT strstr          result=105560497
582 ms  for M32 InString        result=105560497
387 ms  for BMHBinsearch        result=105560497
547 ms  for BMBinSearch         result=105560497
575 ms  for Boyer-M Mineiro 1   result=105560497
576 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 16 bytes substring [Doplicate includ]
624 ms  for CRT strstr          result=105560497
595 ms  for M32 InString        result=105560497
442 ms  for BMHBinsearch        result=105560497
265 ms  for BMBinSearch         result=105560497
253 ms  for Boyer-M Mineiro 1   result=105560497
259 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 14 bytes substring [Doplicate incl]
622 ms  for CRT strstr          result=105560497
585 ms  for M32 InString        result=105560497
463 ms  for BMHBinsearch        result=105560497
319 ms  for BMBinSearch         result=105560497
310 ms  for Boyer-M Mineiro 1   result=105560497
308 ms  for Boyer-M Mineiro 2   result=105560497
60 ms   for MB Instr_()         result=105560497

... using a 12 bytes substring [Doplicate in]
620 ms  for CRT strstr          result=105560497
594 ms  for M32 InString        result=105560497
506 ms  for BMHBinsearch        result=105560497
469 ms  for BMBinSearch         result=105560497
483 ms  for Boyer-M Mineiro 1   result=105560497
480 ms  for Boyer-M Mineiro 2   result=105560497
60 ms   for MB Instr_()         result=105560497

... using a 10 bytes substring [Doplicate ]
634 ms  for CRT strstr          result=105560497
609 ms  for M32 InString        result=105560497
307 ms  for BMHBinsearch        result=105560497
173 ms  for BMBinSearch         result=105560497
916 ms  for Boyer-M Mineiro 1   result=105560497
913 ms  for Boyer-M Mineiro 2   result=105560497
63 ms   for MB Instr_()         result=105560497

... using a 9 bytes substring [Doplicate]
613 ms  for CRT strstr          result=105560497
617 ms  for M32 InString        result=105560497
173 ms  for BMHBinsearch        result=105560497
142 ms  for BMBinSearch         result=105560497
312 ms  for Boyer-M Mineiro 1   result=105560497
310 ms  for Boyer-M Mineiro 2   result=105560497
60 ms   for MB Instr_()         result=105560497

... using a 8 bytes substring [Doplicat]
619 ms  for CRT strstr          result=105560497
579 ms  for M32 InString        result=105560497
161 ms  for BMHBinsearch        result=105560497
137 ms  for BMBinSearch         result=0
334 ms  for Boyer-M Mineiro 1   result=105560497
331 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 7 bytes substring [Doplica]
616 ms  for CRT strstr          result=105560497
576 ms  for M32 InString        result=105560497
174 ms  for BMHBinsearch        result=105560497
146 ms  for BMBinSearch         result=105560497
369 ms  for Boyer-M Mineiro 1   result=105560497
368 ms  for Boyer-M Mineiro 2   result=105560497
63 ms   for MB Instr_()         result=105560497

... using a 6 bytes substring [Doplic]
618 ms  for CRT strstr          result=105560497
580 ms  for M32 InString        result=105560497
185 ms  for BMHBinsearch        result=105560497
161 ms  for BMBinSearch         result=0
428 ms  for Boyer-M Mineiro 1   result=105560497
432 ms  for Boyer-M Mineiro 2   result=105560497
60 ms   for MB Instr_()         result=105560497

mineiro

Follow 3 adaptations from Boyer Moore algorithm with error check (no bounds error, zero size of string and pattern, pattern bigger than string).
This version its a byte search, so you can search for ascii, unicode, utf8, binary data, ... .

The nice thing is that algorithm can be implemented more.
One situation happens when pattern to search have a lot of repeating chars. One option is to search for less frequent char (byte). Other option is to create a strong prefix characters position (I avoid this because adds more complexity to algorithm (it's based in single digraphs, trigraphs, ..., read Boyer Moore paper in internet)). Other option is insert a linear search inside this algorithm (yes, both can be combined).
I'd rather be this ambulant metamorphosis than to have that old opinion about everything

jj2007

Your exe:
empty -8 cycles
BM13 308776 cycles, 1 ocurrences found
BM16 327836 cycles, 1 ocurrences found
BM17 344912 cycles, 1 ocurrences found

I've integrated your version 13 in the testbed (attached), here are the results:

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Finding a string 5 times in a 100MB 'haystack' from\Masm32\include\Windows.inc

... using a 83 bytes substring [Doplicate include file windows.inc<crlf>echo ------------------------------------------]
577 ms  for M32 InString        result=105560497
133 ms  for BMHBinsearch        result=0
74 ms  for BMBinSearch        result=105560497
171 ms  for Boyer-M Mineiro 13  result=105560497
62 ms  for MB Instr_()        result=105560497

... using a 71 bytes substring [Doplicate include file windows.inc<crlf>echo ------------------------------]
560 ms  for M32 InString        result=105560497
159 ms  for BMHBinsearch        result=105560497
80 ms  for BMBinSearch        result=105560497
233 ms  for Boyer-M Mineiro 13  result=105560497
61 ms  for MB Instr_()        result=105560497

... using a 60 bytes substring [Doplicate include file windows.inc<crlf>echo -------------------]
559 ms  for M32 InString        result=105560497
160 ms  for BMHBinsearch        result=105560497
96 ms  for BMBinSearch        result=105560497
249 ms  for Boyer-M Mineiro 13  result=105560497
60 ms  for MB Instr_()        result=105560497

... using a 51 bytes substring [Doplicate include file windows.inc<crlf>echo ----------]
558 ms  for M32 InString        result=105560497
164 ms  for BMHBinsearch        result=105560497
113 ms  for BMBinSearch        result=105560497
237 ms  for Boyer-M Mineiro 13  result=105560497
60 ms  for MB Instr_()        result=105560497

... using a 43 bytes substring [Doplicate include file windows.inc<crlf>echo --]
559 ms  for M32 InString        result=105560497
176 ms  for BMHBinsearch        result=105560497
147 ms  for BMBinSearch        result=105560497
207 ms  for Boyer-M Mineiro 13  result=105560497
60 ms  for MB Instr_()        result=105560497

... using a 37 bytes substring [Doplicate include file windows.inc<crlf>e]
580 ms  for M32 InString        result=105560497
249 ms  for BMHBinsearch        result=105560497
126 ms  for BMBinSearch        result=105560497
371 ms  for Boyer-M Mineiro 13  result=0
60 ms  for MB Instr_()        result=105560497

... using a 31 bytes substring [Doplicate include file windows.]
585 ms  for M32 InString        result=105560497
302 ms  for BMHBinsearch        result=105560497
144 ms  for BMBinSearch        result=105560497
444 ms  for Boyer-M Mineiro 13  result=0
61 ms  for MB Instr_()        result=105560497

... using a 26 bytes substring [Doplicate include file win]
577 ms  for M32 InString        result=105560497
335 ms  for BMHBinsearch        result=105560497
242 ms  for BMBinSearch        result=105560497
489 ms  for Boyer-M Mineiro 13  result=0
61 ms  for MB Instr_()        result=105560497

... using a 22 bytes substring [Doplicate include file]
584 ms  for M32 InString        result=105560497
347 ms  for BMHBinsearch        result=105560497
247 ms  for BMBinSearch        result=105560497
532 ms  for Boyer-M Mineiro 13  result=0
60 ms  for MB Instr_()        result=105560497

... using a 19 bytes substring [Doplicate include f]
589 ms  for M32 InString        result=105560497
381 ms  for BMHBinsearch        result=105560497
538 ms  for BMBinSearch        result=105560497
538 ms  for Boyer-M Mineiro 13  result=0
63 ms  for MB Instr_()        result=105560497

... using a 16 bytes substring [Doplicate includ]
593 ms  for M32 InString        result=105560497
426 ms  for BMHBinsearch        result=105560497
254 ms  for BMBinSearch        result=105560497
596 ms  for Boyer-M Mineiro 13  result=0
62 ms  for MB Instr_()        result=105560497

... using a 14 bytes substring [Doplicate incl]
586 ms  for M32 InString        result=105560497
444 ms  for BMHBinsearch        result=105560497
305 ms  for BMBinSearch        result=105560497
642 ms  for Boyer-M Mineiro 13  result=0
60 ms  for MB Instr_()        result=105560497

... using a 12 bytes substring [Doplicate in]
577 ms  for M32 InString        result=105560497
493 ms  for BMHBinsearch        result=105560497
451 ms  for BMBinSearch        result=105560497
693 ms  for Boyer-M Mineiro 13  result=0
61 ms  for MB Instr_()        result=105560497

... using a 10 bytes substring [Doplicate ]
603 ms  for M32 InString        result=105560497
305 ms  for BMHBinsearch        result=105560497
163 ms  for BMBinSearch        result=105560497
821 ms  for Boyer-M Mineiro 13  result=105560497
60 ms  for MB Instr_()        result=105560497

... using a 9 bytes substring [Doplicate]
585 ms  for M32 InString        result=105560497
167 ms  for BMHBinsearch        result=105560497
142 ms  for BMBinSearch        result=105560497
845 ms  for Boyer-M Mineiro 13  result=0
61 ms  for MB Instr_()        result=105560497

... using a 8 bytes substring [Doplicat]
579 ms  for M32 InString        result=105560497
160 ms  for BMHBinsearch        result=105560497
132 ms  for BMBinSearch        result=0
833 ms  for Boyer-M Mineiro 13  result=0
62 ms  for MB Instr_()        result=105560497

... using a 7 bytes substring [Doplica]
578 ms  for M32 InString        result=105560497
167 ms  for BMHBinsearch        result=105560497
147 ms  for BMBinSearch        result=105560497
885 ms  for Boyer-M Mineiro 13  result=0
63 ms  for MB Instr_()        result=105560497

... using a 6 bytes substring [Doplic]
584 ms  for M32 InString        result=105560497
187 ms  for BMHBinsearch        result=105560497
165 ms  for BMBinSearch        result=0
945 ms  for Boyer-M Mineiro 13  result=105560497
62 ms  for MB Instr_()        result=105560497

**** last sub$ was [Doplic] -- hit any key ****

mineiro

Quote from: jj2007 on August 14, 2023, 01:44:30 AM371 ms  for Boyer-M Mineiro 13  result=0
444 ms  for Boyer-M Mineiro 13  result=0

Strange no!? It's not finding some patterns.
Please, can you check for the same string not found in your tests by using the one I posted?
Thanks.
I'd rather be this ambulant metamorphosis than to have that old opinion about everything

jj2007

Please post your haystack/needle combination, or just use my source (attached above) to do it yourself.

Note that the Masm32 SDK BMHBinSearch & BMBinSearch algos also have problems. Your code works most of the time.

All others work all the time.

mineiro

Quote from: jj2007 on August 14, 2023, 04:04:11 AMPlease post your haystack/needle combination, or just use my source (attached above) to do it yourself.
I can't, I'm not a basic guy.

QuoteNote that the Masm32 SDK BMHBinSearch & BMBinSearch algos also have problems. Your code works most of the time.
All others work all the time.
That one works to single search instance all the times. That was the error, I was able to find all strings. Not a problem, this one is ready to multiple instances.
I'd rather be this ambulant metamorphosis than to have that old opinion about everything

jj2007

Congrats, mineiro, this one does not produce any errors, and it's also much faster than the previous version :thumbsup:

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Finding a string 5 times in a 100MB 'haystack' from\Masm32\include\Windows.inc

... using a 83 bytes substring [Doplicate include file windows.inc<crlf>echo ------------------------------------------]
581 ms  for M32 InString        result=105560497
130 ms  for BMHBinsearch        result=0
73 ms  for BMBinSearch        result=105560497
174 ms  for Boyer-M Mineiro 13  result=105560497
59 ms  for MB Instr_()        result=105560497

... using a 71 bytes substring [Doplicate include file windows.inc<crlf>echo ------------------------------]
571 ms  for M32 InString        result=105560497
159 ms  for BMHBinsearch        result=105560497
82 ms  for BMBinSearch        result=105560497
230 ms  for Boyer-M Mineiro 13  result=105560497
59 ms  for MB Instr_()        result=105560497

... using a 60 bytes substring [Doplicate include file windows.inc<crlf>echo -------------------]
570 ms  for M32 InString        result=105560497
167 ms  for BMHBinsearch        result=105560497
99 ms  for BMBinSearch        result=105560497
259 ms  for Boyer-M Mineiro 13  result=105560497
59 ms  for MB Instr_()        result=105560497

... using a 51 bytes substring [Doplicate include file windows.inc<crlf>echo ----------]
570 ms  for M32 InString        result=105560497
169 ms  for BMHBinsearch        result=105560497
115 ms  for BMBinSearch        result=105560497
243 ms  for Boyer-M Mineiro 13  result=105560497
61 ms  for MB Instr_()        result=105560497

... using a 43 bytes substring [Doplicate include file windows.inc<crlf>echo --]
572 ms  for M32 InString        result=105560497
182 ms  for BMHBinsearch        result=105560497
148 ms  for BMBinSearch        result=105560497
203 ms  for Boyer-M Mineiro 13  result=105560497
60 ms  for MB Instr_()        result=105560497

... using a 37 bytes substring [Doplicate include file windows.inc<crlf>e]
573 ms  for M32 InString        result=105560497
258 ms  for BMHBinsearch        result=105560497
121 ms  for BMBinSearch        result=105560497
139 ms  for Boyer-M Mineiro 13  result=105560497
62 ms  for MB Instr_()        result=105560497

... using a 31 bytes substring [Doplicate include file windows.]
575 ms  for M32 InString        result=105560497
302 ms  for BMHBinsearch        result=105560497
146 ms  for BMBinSearch        result=105560497
150 ms  for Boyer-M Mineiro 13  result=105560497
62 ms  for MB Instr_()        result=105560497

... using a 26 bytes substring [Doplicate include file win]
572 ms  for M32 InString        result=105560497
346 ms  for BMHBinsearch        result=105560497
238 ms  for BMBinSearch        result=105560497
250 ms  for Boyer-M Mineiro 13  result=105560497
62 ms  for MB Instr_()        result=105560497

... using a 22 bytes substring [Doplicate include file]
577 ms  for M32 InString        result=105560497
370 ms  for BMHBinsearch        result=105560497
246 ms  for BMBinSearch        result=105560497
246 ms  for Boyer-M Mineiro 13  result=105560497
60 ms  for MB Instr_()        result=105560497

... using a 19 bytes substring [Doplicate include f]
572 ms  for M32 InString        result=105560497
383 ms  for BMHBinsearch        result=105560497
532 ms  for BMBinSearch        result=105560497
577 ms  for Boyer-M Mineiro 13  result=105560497
61 ms  for MB Instr_()        result=105560497

... using a 16 bytes substring [Doplicate includ]
577 ms  for M32 InString        result=105560497
433 ms  for BMHBinsearch        result=105560497
259 ms  for BMBinSearch        result=105560497
254 ms  for Boyer-M Mineiro 13  result=105560497
61 ms  for MB Instr_()        result=105560497

... using a 14 bytes substring [Doplicate incl]
574 ms  for M32 InString        result=105560497
457 ms  for BMHBinsearch        result=105560497
314 ms  for BMBinSearch        result=105560497
312 ms  for Boyer-M Mineiro 13  result=105560497
60 ms  for MB Instr_()        result=105560497

... using a 12 bytes substring [Doplicate in]
577 ms  for M32 InString        result=105560497
501 ms  for BMHBinsearch        result=105560497
460 ms  for BMBinSearch        result=105560497
472 ms  for Boyer-M Mineiro 13  result=105560497
61 ms  for MB Instr_()        result=105560497

... using a 10 bytes substring [Doplicate ]
601 ms  for M32 InString        result=105560497
307 ms  for BMHBinsearch        result=105560497
165 ms  for BMBinSearch        result=105560497
849 ms  for Boyer-M Mineiro 13  result=105560497
60 ms  for MB Instr_()        result=105560497

... using a 9 bytes substring [Doplicate]
575 ms  for M32 InString        result=105560497
176 ms  for BMHBinsearch        result=105560497
144 ms  for BMBinSearch        result=105560497
310 ms  for Boyer-M Mineiro 13  result=105560497
60 ms  for MB Instr_()        result=105560497

... using a 8 bytes substring [Doplicat]
583 ms  for M32 InString        result=105560497
158 ms  for BMHBinsearch        result=105560497
131 ms  for BMBinSearch        result=0
339 ms  for Boyer-M Mineiro 13  result=105560497
60 ms  for MB Instr_()        result=105560497

... using a 7 bytes substring [Doplica]
575 ms  for M32 InString        result=105560497
176 ms  for BMHBinsearch        result=105560497
149 ms  for BMBinSearch        result=105560497
377 ms  for Boyer-M Mineiro 13  result=105560497
62 ms  for MB Instr_()        result=105560497

... using a 6 bytes substring [Doplic]
586 ms  for M32 InString        result=105560497
187 ms  for BMHBinsearch        result=105560497
166 ms  for BMBinSearch        result=0
432 ms  for Boyer-M Mineiro 13  result=105560497
62 ms  for MB Instr_()        result=105560497

**** last sub$ was [Doplic] -- hit any key ****

HSE

Hi mineiro!!

Look like you are forgetting that zero terminated strings requiere a final zero  :biggrin:

Text_test db 'aba aba aba aba abax'
String_test db 'bax'

Any way there is some problem.
Equations in Assembly: SmplMath

jj2007

That just means that Test_text is db 'aba aba aba aba abaxbax', 0

Anyway, that text isn't used in mineiro's source. With my haystack & needle(s) his latest version behaves correctly now.

HSE

:biggrin:

Quote from: jj2007 on August 14, 2023, 07:37:33 AMThat just means that Test_text is db 'aba aba aba aba abaxbax', 0

Yes, somebody put a zero!

Anyway, I was trying to run mineiro test.
Equations in Assembly: SmplMath

jj2007


mineiro

Quote from: HSE on August 14, 2023, 06:46:57 AMHi mineiro!!
Look like you are forgetting that zero terminated strings requiere a final zero  :biggrin:
Text_test db 'aba aba aba aba abax'
String_test db 'bax'

Any way there is some problem.
Hello sir HSE, that don't need end of string because search for bytes, not only string, it's a binary search.

I'd rather be this ambulant metamorphosis than to have that old opinion about everything

mineiro

Just to things get clear.
Version below search for a 1024 random bytes inside a ~100MB random bytes.
As you can guess, string search will fail into this testcase.

.data
fail db 0,0

bm.exe
empty -19 cycles
BM13 81306365 cycles, 1 ocurrences found
BM17 90373957 cycles, 1 ocurrences found
BinSearch 127647148 cycles, 1 ocurrences found
I'd rather be this ambulant metamorphosis than to have that old opinion about everything