After reading some threads here on timing and execution speed, I decided to put one of my creations to the test. I wrote what's basically an instr() function to look for strings in files for my FindIt! program. FindFileText() is an all-in-one deal that takes a file handle and does all the needed buffer-filling stuff, as well as searching the text.
I ran the same test JJ used in another thread here, looking for "WARNING duplicate" in windows.inc, and was quite surprised at the results. I certainly wasn't expecting to even be in the ballpark of a fast routine; I didn't really optimize my code for ... anything. But according to my calculations, my code ran in about 35 ms. Now it's possible that this is wrong (and if anyone is interested they can check my code). Who knows? Maybe I'm off by a factor of 10.
Here's how I'm timing myself:
; Prime the pump:
INVOKE FillBuffer, fileHandle
; Get the starting time:
INVOKE QueryPerformanceCounter, OFFSET QPCstartTime
; Do the test:
INVOKE FindFileText, fileHandle, ADDR searchBuffer
; Get the end time:
PUSH EAX ;Save search results.
INVOKE QueryPerformanceCounter, ADDR qPCendTime
POP EAX
[display index value of match here]
; Calculate elapsed time:
FILD qPCendTime
FILD QPCstartTime
FSUB
FILD QPCfreq
FDIV
FILD OneThousand ;Convert to milliseconds.
FMUL ;Leave result on FPU.
INVOKE FpuFLtoA, NULL, $FPUfl2aFlags, ADDR timeBuffer, SRC1_FPU or SRC2_DIMM or STR_REG
[format and display elapsed time here]
It looks good to me, so is this accurate? Do I have a contender for a speed demon here?
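As a cross-check on the arithmetic, the FPU sequence above computes (end - start) / frequency * 1000. Here is the same calculation as a small C sketch. The function name ticks_to_ms is made up for illustration, and it assumes QPCfreq was filled in by QueryPerformanceFrequency before the test (that call isn't shown in the snippet).

```c
#include <stdint.h>

/* Convert two performance-counter readings to elapsed milliseconds.
   Same order of operations as the FPU code: subtract, divide by the
   counter frequency (counts per second), then multiply by 1000.
   ticks_to_ms is a hypothetical name, not part of any library. */
double ticks_to_ms(int64_t start, int64_t end, int64_t freq)
{
    double elapsed_counts = (double)(end - start);
    double seconds = elapsed_counts / (double)freq;
    return seconds * 1000.0;
}
```

For instance, with a hypothetical frequency of 10,000,000 counts per second, a difference of 350,000 counts works out to 35 ms. Printing the raw counter difference next to the converted value is a quick way to spot a factor-of-10 slip.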
A couple of things about my code: I deviated from the standard instr() return values here. I return a 0-based index rather than a 1-based one, with an error value of -1 (seems more useful that way to me). You could easily get the standard behavior back by just adding 1 to the return value.
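To make the return-value mapping concrete, here is a minimal C sketch. Both function names are hypothetical stand-ins (find_text is not FindFileText, it just wraps strstr), and it assumes the usual instr() convention where 0 means "not found".

```c
#include <string.h>

/* Hypothetical stand-in for the search routine: returns the 0-based
   index of the first match of needle in haystack, or -1 if absent. */
long find_text(const char *haystack, const char *needle)
{
    const char *hit = strstr(haystack, needle);
    return hit ? (long)(hit - haystack) : -1L;
}

/* Map the 0-based / -1 convention onto an instr()-style result:
   1-based position on success, 0 when nothing was found. */
long to_instr_result(long zero_based)
{
    return zero_based + 1;
}
```

Adding 1 covers both cases at once: a match at index 2 becomes position 3, and the -1 error value becomes 0.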
Notice that I do an initial buffer fill before calling the function, to exclude file-system latency from the results. (The function still has to do buffer refills after that. My buffer size is 64kB.) Not sure what the rules of the game are here.
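One thing any refill-based search has to deal with is a match that straddles two buffer fills. The sketch below shows one common way to handle that in C: keep the last (pattern length - 1) bytes when refilling. This is not the FindFileText code, which isn't shown here; find_in_stream and its parameters are invented for illustration, and it reads through a C FILE* rather than a Win32 file handle.

```c
#include <stdio.h>
#include <string.h>

/* Search the stream fp for pat (m bytes) using a caller-supplied buffer.
   Returns the 0-based file offset of the first match, or -1.
   Hypothetical sketch: requires m > 0 and bufsize >= m. */
long find_in_stream(FILE *fp, const char *pat, size_t m,
                    char *buf, size_t bufsize)
{
    size_t kept = 0;   /* bytes carried over from the previous fill */
    long base = 0;     /* file offset corresponding to buf[0] */

    if (m == 0 || bufsize < m)
        return -1;

    for (;;) {
        /* Refill behind the carried-over bytes. */
        size_t got = fread(buf + kept, 1, bufsize - kept, fp);
        size_t len = kept + got;

        /* Scan every start position that has m bytes available. */
        for (size_t i = 0; i + m <= len; i++) {
            if (buf[i] == pat[0] && memcmp(buf + i, pat, m) == 0)
                return base + (long)i;
        }

        if (got == 0)
            return -1;  /* end of file (or read error): nothing left */

        /* Keep the last m-1 bytes so a match that spans the refill
           boundary is complete after the next read. */
        size_t keep = (len < m - 1) ? len : m - 1;
        memmove(buf, buf + len - keep, keep);
        base += (long)(len - keep);
        kept = keep;
    }
}
```

The carried-over bytes are exactly the start positions that could not be checked in the previous pass, so nothing is scanned twice. The buffer is passed in by the caller, which matches the caller-allocates arrangement described in this post.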
There's one thing I don't like about my code: it has a couple of dangling appendages that make for a less-than-neat interface. The caller has the responsibility of allocating a buffer. I use HeapAlloc() for the file buffer, and I use 2 global variables, FileBuffer and FileBufferSize, to pass this to the buffer-fill routine. But this seems less bad than having the function itself allocate (and then release) the buffer, or passing those variables in to the function, since I'd then have to pass them down to 2 subroutines. If anyone has any good suggestions here, let me know. (I long ago made my peace with the use of global variables, no matter what the "structured programming" tightasses say ...)