Author Topic: Bug in Boyer-Moore BM.asm ?  (Read 3585 times)

jj2007

  • Member
  • *****
  • Posts: 13336
  • Assembly is fun ;-)
    • MasmBasic
Re: Bug in Boyer-Moore BM.asm ?
« Reply #45 on: May 10, 2022, 08:09:50 AM »
@Mineiro - timings using the second version:

Code: [Select]
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Finding a string 1000 times in a 954kB 'haystack' from\Masm32\include\Windows.inc

... using a 83 bytes substring
1119 ms for CRT strstr          result=977321
225 ms  for BMHBinsearch        result=977321
100 ms  for BMBinSearch         result=977321
94 ms   for Boyer-M Mineiro     result=977321
78 ms   for MB Instr_()         result=977321

... using a 43 bytes substring
1106 ms for CRT strstr          result=977321
326 ms  for BMHBinsearch        result=977321
252 ms  for BMBinSearch         result=977321
241 ms  for Boyer-M Mineiro     result=977321
81 ms   for MB Instr_()         result=977321

... using a 22 bytes substring
1105 ms for CRT strstr          result=977321
640 ms  for BMHBinsearch        result=977321
436 ms  for BMBinSearch         result=977321
410 ms  for Boyer-M Mineiro     result=0
80 ms   for MB Instr_()         result=977321

... using a 11 bytes substring
1099 ms for CRT strstr          result=977321
935 ms  for BMHBinsearch        result=977321
1284 ms for BMBinSearch         result=977321
1372 ms for Boyer-M Mineiro     result=977321
78 ms   for MB Instr_()         result=977321

... using a 6 bytes substring
1090 ms for CRT strstr          result=977321
339 ms  for BMHBinsearch        result=977321
302 ms  for BMBinSearch         result=0
762 ms  for Boyer-M Mineiro     result=977321
80 ms   for MB Instr_()         result=977321

**** last sub$ was [Duplic] -- hit any key ****

Building this project requires MasmBasic version 10 May 22 :cool:

P.S.: The second attachment is built with both your first and your second code, and a finer granularity for the length of the substrings showing where the problems are (idem for BMBinSearch - the Horspool variant seems ok). Here is its output:

Code: [Select]
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Finding a string 10 times in a 954kB 'haystack' from\Masm32\include\Windows.inc

... using a 83 bytes substring [Duplicate include file windows.inc<crlf>echo ------------------------------------------]
11 ms   for CRT strstr          result=977321
2401 µs for BMHBinsearch        result=977321
1082 µs for BMBinSearch         result=977321
996 µs  for Boyer-M Mineiro 1   result=977321
998 µs  for Boyer-M Mineiro 2   result=977321
835 µs  for MB Instr_()         result=977321

... using a 71 bytes substring [Duplicate include file windows.inc<crlf>echo ------------------------------]
11 ms   for CRT strstr          result=977321
2989 µs for BMHBinsearch        result=977321
1280 µs for BMBinSearch         result=977321
1177 µs for Boyer-M Mineiro 1   result=0
1185 µs for Boyer-M Mineiro 2   result=0
834 µs  for MB Instr_()         result=977321

... using a 60 bytes substring [Duplicate include file windows.inc<crlf>echo -------------------]
11 ms   for CRT strstr          result=977321
3095 µs for BMHBinsearch        result=977321
1620 µs for BMBinSearch         result=977321
1399 µs for Boyer-M Mineiro 1   result=0
1403 µs for Boyer-M Mineiro 2   result=0
834 µs  for MB Instr_()         result=977321

... using a 51 bytes substring [Duplicate include file windows.inc<crlf>echo ----------]
11 ms   for CRT strstr          result=977321
3022 µs for BMHBinsearch        result=977321
1891 µs for BMBinSearch         result=977321
1637 µs for Boyer-M Mineiro 1   result=977321
1665 µs for Boyer-M Mineiro 2   result=977321
843 µs  for MB Instr_()         result=977321

... using a 43 bytes substring [Duplicate include file windows.inc<crlf>echo --]
12 ms   for CRT strstr          result=977321
3486 µs for BMHBinsearch        result=977321
2676 µs for BMBinSearch         result=977321
2751 µs for Boyer-M Mineiro 1   result=977321
2615 µs for Boyer-M Mineiro 2   result=977321
837 µs  for MB Instr_()         result=977321

... using a 37 bytes substring [Duplicate include file windows.inc<crlf>e]
11 ms   for CRT strstr          result=977321
4709 µs for BMHBinsearch        result=977321
2301 µs for BMBinSearch         result=977321
2152 µs for Boyer-M Mineiro 1   result=977321
2377 µs for Boyer-M Mineiro 2   result=977321
1082 µs for MB Instr_()         result=977321

... using a 31 bytes substring [Duplicate include file windows.]
11 ms   for CRT strstr          result=977321
5671 µs for BMHBinsearch        result=977321
2737 µs for BMBinSearch         result=977321
2609 µs for Boyer-M Mineiro 1   result=977321
2715 µs for Boyer-M Mineiro 2   result=977321
1113 µs for MB Instr_()         result=977321

... using a 26 bytes substring [Duplicate include file win]
11 ms   for CRT strstr          result=977321
6292 µs for BMHBinsearch        result=977321
4377 µs for BMBinSearch         result=977321
4338 µs for Boyer-M Mineiro 1   result=0
3937 µs for Boyer-M Mineiro 2   result=0
1188 µs for MB Instr_()         result=977321

... using a 22 bytes substring [Duplicate include file]
11 ms   for CRT strstr          result=977321
6236 µs for BMHBinsearch        result=977321
4743 µs for BMBinSearch         result=977321
4248 µs for Boyer-M Mineiro 1   result=0
4701 µs for Boyer-M Mineiro 2   result=0
1156 µs for MB Instr_()         result=977321

... using a 19 bytes substring [Duplicate include f]
11 ms   for CRT strstr          result=977321
7136 µs for BMHBinsearch        result=977321
10 ms   for BMBinSearch         result=977321
10 ms   for Boyer-M Mineiro 1   result=977321
9423 µs for Boyer-M Mineiro 2   result=977321
921 µs  for MB Instr_()         result=977321

... using a 16 bytes substring [Duplicate includ]
11 ms   for CRT strstr          result=977321
8211 µs for BMHBinsearch        result=977321
5034 µs for BMBinSearch         result=977321
3128 µs for Boyer-M Mineiro 1   result=0
3165 µs for Boyer-M Mineiro 2   result=0
935 µs  for MB Instr_()         result=977321

... using a 14 bytes substring [Duplicate incl]
11 ms   for CRT strstr          result=977321
8635 µs for BMHBinsearch        result=977321
6008 µs for BMBinSearch         result=977321
5635 µs for Boyer-M Mineiro 1   result=977321
5693 µs for Boyer-M Mineiro 2   result=977321
948 µs  for MB Instr_()         result=977321

... using a 12 bytes substring [Duplicate in]
11 ms   for CRT strstr          result=977321
9443 µs for BMHBinsearch        result=977321
8710 µs for BMBinSearch         result=977321
4186 µs for Boyer-M Mineiro 1   result=977321
4147 µs for Boyer-M Mineiro 2   result=977321
973 µs  for MB Instr_()         result=977321

... using a 10 bytes substring [Duplicate ]
11 ms   for CRT strstr          result=977321
6199 µs for BMHBinsearch        result=977321
3345 µs for BMBinSearch         result=0
7191 µs for Boyer-M Mineiro 1   result=977321
6984 µs for Boyer-M Mineiro 2   result=977321
1518 µs for MB Instr_()         result=977321

... using a 9 bytes substring [Duplicate]
11 ms   for CRT strstr          result=977321
3473 µs for BMHBinsearch        result=977321
2853 µs for BMBinSearch         result=977321
5692 µs for Boyer-M Mineiro 1   result=977321
5833 µs for Boyer-M Mineiro 2   result=977321
1111 µs for MB Instr_()         result=977321

... using a 8 bytes substring [Duplicat]
11 ms   for CRT strstr          result=977321
3303 µs for BMHBinsearch        result=977321
2778 µs for BMBinSearch         result=977321
6208 µs for Boyer-M Mineiro 1   result=0
6207 µs for Boyer-M Mineiro 2   result=0
1128 µs for MB Instr_()         result=977321

... using a 7 bytes substring [Duplica]
11 ms   for CRT strstr          result=977321
3625 µs for BMHBinsearch        result=977321
3216 µs for BMBinSearch         result=977321
7117 µs for Boyer-M Mineiro 1   result=977321
7055 µs for Boyer-M Mineiro 2   result=977321
1237 µs for MB Instr_()         result=977321

... using a 6 bytes substring [Duplic]
11 ms   for CRT strstr          result=977321
3689 µs for BMHBinsearch        result=977321
3376 µs for BMBinSearch         result=0
8198 µs for Boyer-M Mineiro 1   result=977321
8125 µs for Boyer-M Mineiro 2   result=977321
1222 µs for MB Instr_()         result=977321

... using a 5 bytes substring [Dupli]
11 ms   for CRT strstr          result=977321
4588 µs for BMHBinsearch        result=977321
4129 µs for BMBinSearch         result=0
9861 µs for Boyer-M Mineiro 1   result=977321
9925 µs for Boyer-M Mineiro 2   result=977321
954 µs  for MB Instr_()         result=977321

... using a 4 bytes substring [Dupl]
4551 µs for CRT strstr          result=391356
2000 µs for BMHBinsearch        result=391356
1821 µs for BMBinSearch         result=391356
12 ms   for Boyer-M Mineiro 1   result=0
12 ms   for Boyer-M Mineiro 2   result=0
455 µs  for MB Instr_()         result=391356
« Last Edit: May 10, 2022, 09:27:06 AM by jj2007 »

mineiro

  • Member
  • ****
  • Posts: 864
Re: Bug in Boyer-Moore BM.asm ?
« Reply #46 on: May 10, 2022, 08:22:38 PM »
Thanks sir jj2007;
I found where the errors sits in my code.
First was an optimization, used "test" instead of "cmp".
Second was an logic error optimization, added more displacement than need.
I have corrected previous posts.
This only occurs after I tried to optimize code.
I'd rather be this ambulant metamorphosis than to have that old opinion about everything

jj2007

  • Member
  • *****
  • Posts: 13336
  • Assembly is fun ;-)
    • MasmBasic
Re: Bug in Boyer-Moore BM.asm ?
« Reply #47 on: May 10, 2022, 08:44:03 PM »
I found where the errors sits in my code.

Thanks, mineiro :thumbsup:

So far no errors found in your code. I am now using a 100MB "haystack" based on Windows.inc with "Doplicate" added to the end for the search string:

Code: [Select]
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
Finding a string 5 times in a 100MB 'haystack' from\Masm32\include\Windows.inc

... using a 83 bytes substring [Doplicate include file windows.inc<crlf>echo -------------------

621 ms  for CRT strstr          result=105560497
618 ms  for M32 InString        result=105560497
137 ms  for BMHBinsearch        result=0
75 ms   for BMBinSearch         result=105560497
178 ms  for Boyer-M Mineiro 1   result=105560497
185 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 71 bytes substring [Doplicate include file windows.inc<crlf>echo -------------------
625 ms  for CRT strstr          result=105560497
588 ms  for M32 InString        result=105560497
167 ms  for BMHBinsearch        result=105560497
84 ms   for BMBinSearch         result=105560497
232 ms  for Boyer-M Mineiro 1   result=105560497
240 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 60 bytes substring [Doplicate include file windows.inc<crlf>echo -------------------
620 ms  for CRT strstr          result=105560497
590 ms  for M32 InString        result=105560497
177 ms  for BMHBinsearch        result=105560497
99 ms   for BMBinSearch         result=105560497
247 ms  for Boyer-M Mineiro 1   result=105560497
251 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 51 bytes substring [Doplicate include file windows.inc<crlf>echo ----------]
613 ms  for CRT strstr          result=105560497
576 ms  for M32 InString        result=105560497
176 ms  for BMHBinsearch        result=105560497
117 ms  for BMBinSearch         result=105560497
242 ms  for Boyer-M Mineiro 1   result=105560497
243 ms  for Boyer-M Mineiro 2   result=105560497
63 ms   for MB Instr_()         result=105560497

... using a 43 bytes substring [Doplicate include file windows.inc<crlf>echo --]
626 ms  for CRT strstr          result=105560497
578 ms  for M32 InString        result=105560497
181 ms  for BMHBinsearch        result=105560497
151 ms  for BMBinSearch         result=105560497
203 ms  for Boyer-M Mineiro 1   result=105560497
205 ms  for Boyer-M Mineiro 2   result=105560497
64 ms   for MB Instr_()         result=105560497

... using a 37 bytes substring [Doplicate include file windows.inc<crlf>e]
616 ms  for CRT strstr          result=105560497
586 ms  for M32 InString        result=105560497
262 ms  for BMHBinsearch        result=105560497
124 ms  for BMBinSearch         result=105560497
137 ms  for Boyer-M Mineiro 1   result=105560497
143 ms  for Boyer-M Mineiro 2   result=105560497
63 ms   for MB Instr_()         result=105560497

... using a 31 bytes substring [Doplicate include file windows.]
618 ms  for CRT strstr          result=105560497
582 ms  for M32 InString        result=105560497
310 ms  for BMHBinsearch        result=105560497
148 ms  for BMBinSearch         result=105560497
152 ms  for Boyer-M Mineiro 1   result=105560497
154 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 26 bytes substring [Doplicate include file win]
622 ms  for CRT strstr          result=105560497
582 ms  for M32 InString        result=105560497
344 ms  for BMHBinsearch        result=105560497
254 ms  for BMBinSearch         result=105560497
255 ms  for Boyer-M Mineiro 1   result=105560497
246 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 22 bytes substring [Doplicate include file]
624 ms  for CRT strstr          result=105560497
581 ms  for M32 InString        result=105560497
359 ms  for BMHBinsearch        result=105560497
259 ms  for BMBinSearch         result=105560497
252 ms  for Boyer-M Mineiro 1   result=105560497
254 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 19 bytes substring [Doplicate include f]
620 ms  for CRT strstr          result=105560497
582 ms  for M32 InString        result=105560497
387 ms  for BMHBinsearch        result=105560497
547 ms  for BMBinSearch         result=105560497
575 ms  for Boyer-M Mineiro 1   result=105560497
576 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 16 bytes substring [Doplicate includ]
624 ms  for CRT strstr          result=105560497
595 ms  for M32 InString        result=105560497
442 ms  for BMHBinsearch        result=105560497
265 ms  for BMBinSearch         result=105560497
253 ms  for Boyer-M Mineiro 1   result=105560497
259 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 14 bytes substring [Doplicate incl]
622 ms  for CRT strstr          result=105560497
585 ms  for M32 InString        result=105560497
463 ms  for BMHBinsearch        result=105560497
319 ms  for BMBinSearch         result=105560497
310 ms  for Boyer-M Mineiro 1   result=105560497
308 ms  for Boyer-M Mineiro 2   result=105560497
60 ms   for MB Instr_()         result=105560497

... using a 12 bytes substring [Doplicate in]
620 ms  for CRT strstr          result=105560497
594 ms  for M32 InString        result=105560497
506 ms  for BMHBinsearch        result=105560497
469 ms  for BMBinSearch         result=105560497
483 ms  for Boyer-M Mineiro 1   result=105560497
480 ms  for Boyer-M Mineiro 2   result=105560497
60 ms   for MB Instr_()         result=105560497

... using a 10 bytes substring [Doplicate ]
634 ms  for CRT strstr          result=105560497
609 ms  for M32 InString        result=105560497
307 ms  for BMHBinsearch        result=105560497
173 ms  for BMBinSearch         result=105560497
916 ms  for Boyer-M Mineiro 1   result=105560497
913 ms  for Boyer-M Mineiro 2   result=105560497
63 ms   for MB Instr_()         result=105560497

... using a 9 bytes substring [Doplicate]
613 ms  for CRT strstr          result=105560497
617 ms  for M32 InString        result=105560497
173 ms  for BMHBinsearch        result=105560497
142 ms  for BMBinSearch         result=105560497
312 ms  for Boyer-M Mineiro 1   result=105560497
310 ms  for Boyer-M Mineiro 2   result=105560497
60 ms   for MB Instr_()         result=105560497

... using a 8 bytes substring [Doplicat]
619 ms  for CRT strstr          result=105560497
579 ms  for M32 InString        result=105560497
161 ms  for BMHBinsearch        result=105560497
137 ms  for BMBinSearch         result=0
334 ms  for Boyer-M Mineiro 1   result=105560497
331 ms  for Boyer-M Mineiro 2   result=105560497
62 ms   for MB Instr_()         result=105560497

... using a 7 bytes substring [Doplica]
616 ms  for CRT strstr          result=105560497
576 ms  for M32 InString        result=105560497
174 ms  for BMHBinsearch        result=105560497
146 ms  for BMBinSearch         result=105560497
369 ms  for Boyer-M Mineiro 1   result=105560497
368 ms  for Boyer-M Mineiro 2   result=105560497
63 ms   for MB Instr_()         result=105560497

... using a 6 bytes substring [Doplic]
618 ms  for CRT strstr          result=105560497
580 ms  for M32 InString        result=105560497
185 ms  for BMHBinsearch        result=105560497
161 ms  for BMBinSearch         result=0
428 ms  for Boyer-M Mineiro 1   result=105560497
432 ms  for Boyer-M Mineiro 2   result=105560497
60 ms   for MB Instr_()         result=105560497