Author Topic: Problem with BMBinSearch  (Read 157 times)

prino

Problem with BMBinSearch
« on: October 31, 2020, 11:58:41 PM »
I'm using this code in a Virtual Pascal program, unaltered, except for

- adding an initial '@' to the labels,
- changing the two @F/@@ pairs into @F1 & @F2,
- doing a rep stosd for 257 elements of the shift_table below

and I just cannot get it to work: it fails, but only for some strings. My code:

function BMBinSearch(startpos: longint;
            lpsource: pointer;
            srcLngth: longint;
            lpSubStr: pointer;
            subLngth: longint): longint; assembler; {&uses ebx,esi,edi} {&frame+}

var cval       : longint;
var shift_table: array [0..256] of longint;

asm
code from BMBinSearch
end;

const _bigbuf  = 16777216;  {Use big buffers - less I/O                }

var ifile: file;

var ibuf : pointer;
var i    : longint;
var r    : longint;

const srch: string ='{Z+';

begin
  getmem(ibuf, _bigbuf);

  assign(ifile, 'd:\01-lift\01-data\lift.dat');   // File in liftdat.rar @ https://goo.gl/ZN3XAB
  reset(ifile, 1);

  blockread(ifile, ibuf^, _bigbuf, r);
  close(ifile);

  i:= BMBinSearch(0, ibuf, r, @srch[1], length(srch));
asm int 3;end;
end.

And it basically refuses to find the '{Z+' string, pointing me to a '{Z-' one instead, and not even the first one but one somewhere in the middle of the file. The file contains many of them...
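
One way to narrow this down is to compare the result against a plain byte-by-byte search called with the same arguments. The sketch below is only a cross-check, not the MASM32 routine: NaiveBinSearch, PByteBuf and TByteBuf are hypothetical names, and the return convention (0-based offset of the first match, -1 if there is none) is an assumption made for this sketch.

```pascal
{ Hypothetical cross-check helper, not part of MASM32 or Virtual Pascal. }
{ Assumed convention: returns the 0-based offset of the first match at   }
{ or after startpos, or -1 when there is no match.                       }
type
  TByteBuf = array [0..16777215] of byte;   { same size as _bigbuf }
  PByteBuf = ^TByteBuf;

function NaiveBinSearch(startpos: longint;
                        lpsource: pointer;
                        srcLngth: longint;
                        lpSubStr: pointer;
                        subLngth: longint): longint;
var
  src, pat: PByteBuf;
  i, j    : longint;
begin
  NaiveBinSearch := -1;

  { Reject empty patterns and patterns that cannot fit after startpos }
  if (subLngth <= 0) or (startpos < 0) or
     (subLngth > srcLngth - startpos) then
    exit;

  src := lpsource;
  pat := lpSubStr;

  for i := startpos to srcLngth - subLngth do
  begin
    { Count how many leading bytes of the pattern match at offset i }
    j := 0;
    while (j < subLngth) and (src^[i + j] = pat^[j]) do
      inc(j);

    if j = subLngth then
    begin
      NaiveBinSearch := i;
      exit;
    end;
  end;
end;
```

Calling NaiveBinSearch(0, ibuf, r, @srch[1], length(srch)) right after the BMBinSearch call and comparing the two results would show whether the first '{Z+' really lies before the offset that BMBinSearch reports.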

Should I make more changes to cater for C vs Pascal differences?
Robert AH Prins
robert.ah.prins @ the.15+Gb.Google thingy
No programming here :)

jj2007

Re: Problem with BMBinSearch
« Reply #1 on: November 01, 2020, 12:53:46 AM »
See the thread "Instr timings: Boyer-Moore not working". If you desperately need the function, I could prepare a DLL that does the job with Instr_(FAST, ...). Is this still for your "Converting HLL to assembler" project?
Quote
    - the FAST option is typically about twice as fast as CRT strstr, but 3..4 times as fast when used with
      string arrays (Intel Core i5 timings for counting a rare word in an 800 MB file with 6 million lines):
          232 ms for fast Instr_
          795 ms for "normal" Instr_
          999 ms for Masm32 InString
          929 ms for CRT strstr
    - using FAST, binary search in haystacks containing zeros is possible by assigning the buffer size to edx:
          mov edx, LastFileSize            ; any info on length of buffer can be used with edx