My instr() entry: a possible contender?

NoCforMe · July 10, 2022, 06:40:58 AM

After reading some threads here on timing and execution speed, I decided to put one of my creations to the test. I wrote what's basically an instr() function to look for strings in files for my FindIt! program. FindFileText() is an all-in-one deal that takes a file handle and does all the needed buffer-filling stuff, as well as searching the text.

I did the same test as JJ used in another thread here, looking for "WARNING duplicate" in windows.inc, and was quite surprised at the results. I certainly wasn't expecting to even be in the ballpark of a fast routine; I didn't really optimize my code for ... anything. But according to my calculations, my code ran in about 35 ms. Now it's possible that this is wrong (and if anyone is interested they can check my code). Who knows? maybe I'm off by a factor of 10.

Here's how I'm timing myself:

Code Select


; Prime the pump:
	INVOKE	FillBuffer, fileHandle

; Get the starting time:
	INVOKE	QueryPerformanceCounter, OFFSET QPCstartTime

; Do the test:
	INVOKE	FindFileText, fileHandle, ADDR searchBuffer

; Get the end time:
	PUSH	EAX			;Save search results.
	INVOKE	QueryPerformanceCounter, ADDR qPCendTime
	POP	EAX

	[display index value of match here]

; Calculate elapsed time:
	FILD	qPCendTime
	FILD	QPCstartTime
	FSUB
	FILD	QPCfreq
	FDIV
	FILD	OneThousand		;Convert to milliseconds.
	FMUL				;Leave result on FPU.

	INVOKE	FpuFLtoA, NULL, $FPUfl2aFlags, ADDR timeBuffer, SRC1_FPU or SRC2_DIMM or STR_REG

	[format and display elapsed time here]

It looks good to me, so is this accurate? Do I have a contender for a speed demon here?

Couple things about my code: I deviated from the standard instr() return values here. I return a 0-based index rather than a 1-based one, with an error value of -1 (seems more useful that way to me). You could easily handle this by just adding 1 to the return value.

Notice that I do an initial buffer fill before calling the function, to exclude file-system latency from the results. (The function still has to do buffer refills after that. My buffer size is 64kB.) Not sure what the rules of the game are here.

There's one thing I don't like about my code: it has a couple of dangling appendages that make for a less-than-neat interface. The caller has the responsibility of allocating a buffer. I use HeapAlloc() ) for the file buffer, and I use 2 global variables, FileBuffer and FileBufferSize, to pass this to the buffer-fill routine. But this seems less bad than having the function itself allocate (and then release) the buffer. Or passing in those variables to the function; I'd have to pass them below to 2 subroutines. If anyone has any good suggestions here, let me know. (I've long ago made my peace with the use of global variables, no matter what the "structured programming" tightasses say ...)

NoCforMe · July 10, 2022, 08:09:07 AM

BTW, I find that match at offset 966,017. Does that mean I have an out-of-date windows.inc?

hutch-- · July 10, 2022, 09:28:25 AM

Normally with a search algo, you load the entire source into memory in one buffer and put the pattern you are searching for in another that only has to be big enough to hold the pattern. The OS based search APIs are usually fast enough but generally tailored to the task they have to perform and are not always that easy to adapt to another task.

I don't have any problem using global scope variables and I select either local or global scope depending on the scope of the task to perform. The OS based search in richedit controls is plenty fast enough but its tailored for a rich edit control and little use otherwise.

NoCforMe · July 10, 2022, 09:31:48 AM

So I take it from your reply that my searcher may be a contender? Because it kind of has one arm tied behind its back since it has to do file accesses, whereas you're saying most already have the entire file loaded into memory. (Unless, of course, my timings are off!)

hutch-- · July 10, 2022, 09:40:49 AM

You need to look at what you are competing with, the last search algo I posted was running at over 2 gig/second and I think JJ had a faster one. If you are doing multiple file accesses, you are throwing away a lot of speed as file IO is far slower than direct memory search. There is a technique using memory mapped files with a sliding window but its no joy to code and mainly useful when you have to search files that are much larger than available memory.

jj2007 · July 10, 2022, 10:14:23 AM

Quote from: NoCforMe on July 10, 2022, 08:09:07 AM
BTW, I find that match at offset 966,017. Does that mean I have an out-of-date windows.inc?

I'm afraid it's 966,017 on my machine, too - and that's not the correct value.

Re timings, 99.x% is inside FillBuffer. An Instr() algo is so much faster than file I/O.

The interface is well done btw

NoCforMe · July 10, 2022, 10:22:52 AM

OK, I'll rewrite it to use one humongous buffer (easily done). But how do I do that? Can I use HeapAlloc to get a 1Mb+ buffer? Not familiar working with big objects here.

My windows.inc has a timestamp of 10/15/2020, so maybe that's the right answer? What's the latest?

jj2007 · July 10, 2022, 10:28:25 AM

Quote from: NoCforMe on July 10, 2022, 10:22:52 AMCan I use HeapAlloc to get a 1Mb+ buffer?

Sure. If you have enough RAM, you can allocate a gigabyte and more.

NoCforMe · July 10, 2022, 11:31:42 AM

Another question about instr() functions: am I right in assuming that this function works with ASCIIZ text, and that a zero in the text being searched indicates a terminator? That's the way my instr() function is set up (but not the FindFileText() function used here, which uses the count of bytes in the buffer to know where the end is).

It's possible to set up an instr() function that'll search for text (or any string of characters) within data that contains zeroes, like binary data. I've done that to find strings in executables, for instance. But I'm assuming that the str part of the name here indicates ASCIIZ, amiright?

hutch-- · July 10, 2022, 12:10:42 PM

They are two different tasks, zero termination for text and length specified for binary search. You can get the second to do the first but its not as convenient as zero terminated string search.

In 32 bit, a megabyte is trivial, depending on the memory size in your computer, 100 meg is no big deal, it starts to get fussy once you allocate over 1 gig but it can be done. Either HeapAlloc() or GlobalAlloc() using the GMEM_FIXED flag will do the job with no problems.

NoCforMe · July 10, 2022, 12:21:19 PM

Thanks for the encouragement, Hutch.

OK, here's another test candidate. This one uses a function I like to call instri(), the trailing 'i' meaning "case insensitive". It gives the right answer for the offset of "WARNING duplicate" (977,313). And the execution time bounces between ~35 and 43 ms. This time the entire file is in the buffer, so the only time I'm measuring is the string search time.

So can someone tell me already if I'm even in the running? I want to know if I can expect a prize anytime soon ...

hutch-- · July 10, 2022, 12:38:44 PM

Well, I don't want to discourage you but the FindStr algo I posted does not beat the bare granularity of GetTickCount, on a single pass it returns 0 ms. I would try to narrow down how the search is being done and see if you can tweak it a bit further. Make sure you are only timing the search, not the file load time or any other code.

The MASM Forum

News:

My instr() entry: a possible contender?

NoCforMe

NoCforMe

hutch--

NoCforMe

hutch--

jj2007

NoCforMe

jj2007

NoCforMe

hutch--

NoCforMe

hutch--