invoke InstrJB, "The haystack: does it really contain a needle?", "needle"
conout "The needle is at pos ", str$(rdx), lf
invoke InstrCrt, "The haystack: does it really contain a needle?", "needle"
conout "The needle is at pos ", str$(rdx), lf
invoke FindStr, 1, "The haystack: does it really contain a needle?", "needle"
conout "The needle is at pos ", str$(rax), lf
The attached exe needs the DLL (with JBasic Instr + crt strstr) in the same folder. Thanks for some timings (it looks for the string "WARNING Duplicate" in Windows.inc)
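For readers without the DLL, the crt variant of the needle test can be approximated in plain C. This is only a sketch: `instr_pos` is a hypothetical helper name, and the 1-based position convention is an assumption inferred from the "pos 40" output shown further down. It says nothing about how InstrJB or FindStr work internally.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 1-based position of needle in haystack, 0 if absent.
   The 1-based convention is assumed from the thread's "pos 40" output. */
size_t instr_pos(const char *haystack, const char *needle)
{
    const char *hit = strstr(haystack, needle);   /* crt byte search */
    return hit ? (size_t)(hit - haystack) + 1 : 0;
}
```

Called with the haystack and needle from the three invoke lines above, this returns 40, matching the posted output.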
jj,
I can download from the link but the file is 0 length.
Strange, for me it works fine. 4 files in the archive, all ok...
Downloaded it with Chrome.
A:\jj\test>jjapp
The needle is at pos 40
The needle is at pos 40
The needle is at pos 40
WARNING at pos 977313, and it took 46 ms with InstrJB
WARNING at pos 977313, and it took 844 ms with InstrCrt
WARNING at pos 977313, and it took 844 ms with FindStr
bye
The needle is at pos 40
The needle is at pos 40
The needle is at pos 40
WARNING at pos 978042, and it took 47 ms with InstrJB
WARNING at pos 978042, and it took 828 ms with InstrCrt
WARNING at pos 978042, and it took 828 ms with FindStr
bye
Quote from: HSE on July 05, 2022, 07:36:16 AM
WARNING at pos 978042
You fumbled your Windows.inc, Hector :biggrin:
Thanks, guys :thumbsup:
Quote from: jj2007 on July 05, 2022, 07:38:25 AM
You fumbled your Windows.inc
I have 27 Windows.inc files :biggrin:
jj,
Something is weird with your demo. I have timed FindStr, which is an ordinary byte scanner, on the win32 windows.inc file, and on my old Haswell I get the result below: 0 ms, which is too fast for a file of that length.
len = 977313 bytes
timing in ms = 0
thats all folks ....
; «»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»
include \masm64\include64\masm64rt.inc
.code
; «»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»
entry_point proc
USING r12
LOCAL pMem :QWORD
LOCAL rslt :QWORD
LOCAL tcnt :QWORD
SaveRegs
mov pMem, loadfile("windows.inc")
rcall GetTickCount
mov r12, rax
mov rslt, rvcall(FindStr,1,pMem,"WARNING Duplicate include file windows.inc")
rcall GetTickCount
sub rax, r12
mov r12, rax
conout " len = ",str$(rslt)," bytes",lf
conout " timing in ms = ",str$(r12),lf,lf
mfree pMem
waitkey " thats all folks ...."
RestoreRegs
.exit
entry_point endp
; «»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»
end
Hutch,
I don't think it's too fast, but there is indeed a glitch - I forgot to re-initialise the counter for FindStr (see attached new version):
iterations=1000
mov r12d, iterations
mov r13, rv(GetTickCount)
@@: invoke FindStr, 1, pMem, "WARNING Duplicate"
dec r12d
jge @B	; note: dec/jge exits only below zero, so the body runs iterations+1 times
conout "WARNING at pos ", str$(rax)
sub rv(GetTickCount), r13
conout ", and it took ", str$(rax), " ms with FindStr", lf
I had already wondered why the CRT and FindStr timings were always identical :rolleyes: :sad:
I have timed the example correctly, and with a file of less than 1 meg it did not get above the base granularity of GetTickCount. The algo is fast enough for general-purpose search; there may be a trick with one of the new SSE4.2 instructions, but I have not seen it done yet.
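The granularity point can be sketched in C. GetTickCount typically resolves to about 10 to 16 ms, so a single pass over a file of just under 1 meg can easily read as 0 ms, while the total over many passes is measurable. The sketch below is an assumption-laden stand-in, not the code from either test piece: `time_search_ms` is a hypothetical name, `strstr` stands in for the algos under test, and the portable `clock()` replaces GetTickCount.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: runs the same search `iterations` times and returns
   the total elapsed time in ms. The 1-based position of the last hit
   (0 if not found) is stored in *pos_out when pos_out is not NULL. */
double time_search_ms(const char *text, const char *pattern,
                      int iterations, size_t *pos_out)
{
    /* Calling through a volatile function pointer keeps the compiler
       from collapsing the repeated, identical calls into one. */
    char *(*volatile search)(const char *, const char *) = strstr;
    const char *hit = NULL;

    clock_t start = clock();
    for (int i = 0; i < iterations; i++)
        hit = search(text, pattern);
    clock_t stop = clock();

    if (pos_out)
        *pos_out = hit ? (size_t)(hit - text) + 1 : 0;
    return 1000.0 * (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Dividing the returned total by `iterations` gives the per-pass time. Note that `clock()` reports wall time with the Microsoft crt and CPU time on POSIX systems, so absolute figures are not comparable across platforms.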
Quote from: hutch-- on July 06, 2022, 09:18:44 PM
I have timed the example correctly
With iterations=1000?
Your results do not make sense; this is what I get.
The needle is at pos 40
The needle is at pos 40
The needle is at pos 40
WARNING at pos 977313, and it took 46 ms with InstrJB
WARNING at pos 977313, and it took 891 ms with InstrCrt
WARNING at pos 977313, and it took 703 ms with FindStr
bye
My timing has no errors in it, and it does not get over the base granularity of GetTickCount. I have no idea of what you have done with the timing as your DLL source is not available.
It does 1000 iterations in the loop:
mov r12d, iterations
mov r13, rv(GetTickCount)
@@: invoke FindStr, 1, pMem, "WARNING Duplicate"
dec r12d
jge @B	; note: dec/jge exits only below zero, so the body runs iterations+1 times
conout "WARNING at pos ", str$(rax)
sub rv(GetTickCount), r13
Your timings look entirely plausible - your machine is just a lot faster than mine:
WARNING at pos 977313, and it took 94 ms with InstrJB
WARNING at pos 977313, and it took 1888 ms with InstrCrt
WARNING at pos 977313, and it took 1170 ms with FindStr
Here is the 1000 loop test piece. On my old Haswell I get 609 ms. I have no idea of what you are doing with the other two algos as your source is closed.
; «»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»
include \masm64\include64\masm64rt.inc
.code
; «»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»
entry_point proc
USING r12,r13
LOCAL pMem :QWORD
LOCAL rslt :QWORD
LOCAL tcnt :QWORD
SaveRegs
mov pMem, loadfile("windows.inc")
; ------------------------------------------------
mov r13, 1000
rcall GetTickCount
mov r12, rax
lbl:
mov rslt, rvcall(FindStr,1,pMem,"WARNING Duplicate include file windows.inc")
sub r13, 1
jnz lbl
rcall GetTickCount
sub rax, r12
mov r12, rax
; ------------------------------------------------
conout " len = ",str$(rslt)," bytes",lf
conout " timing in ms = ",str$(r12),lf,lf
mfree pMem
waitkey " thats all folks ...."
RestoreRegs
.exit
entry_point endp
; «»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»«»
end
Quote from: hutch-- on July 06, 2022, 09:42:54 PM
as your source is closed
I am sorry for that. The point is that there are some individuals here, mainly Germans, who do hardly any coding but love to make fun of MasmBasic. One of them insults me frequently. Another one has 40 posts but never posted a single line of code.
I have always shared my code in the past, but I can't do so any more, knowing that these individuals would not hesitate to grab my sources, modify a line or two and then declare it their own work.
Quote from: hutch-- on July 06, 2022, 09:26:07 PM
WARNING at pos 977313, and it took 703 ms with FindStr
Quote from: hutch-- on July 06, 2022, 09:42:54 PM
On my old Haswell I get 609 ms
That's quite a difference between your first and second test - same cpu?
The first was a single pass, the next was 1000 iterations. Just under 1 gig (1000 passes over a file of just under 1 meg) in just over 600 ms, probably fast enough. :biggrin:
Here is a tweaked version: I used the Agner Fog StrLen version and unrolled the scan loop by 4, and it's about 30% faster. On my old Haswell, I keep getting 438 ms.
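The idea of unrolling a byte scanner by 4 can be illustrated in C. This is a hedged sketch of the general technique only, not the tweaked FindStr itself (which is not shown in the thread): `find_unrolled4` is a hypothetical name, the text length is passed in rather than computed with an Agner Fog style StrLen, and the 1-based return value is assumed to match the positions printed above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 1-based position of pat in text[0..n), 0 if absent.
   An empty pattern is treated as "not found" (a choice made for this sketch). */
size_t find_unrolled4(const char *text, size_t n, const char *pat)
{
    size_t m = strlen(pat);
    if (m == 0 || m > n)
        return 0;

    size_t last = n - m;      /* last start index where pat still fits */
    char c = pat[0];
    size_t i = 0;

    /* Main loop: test four candidate start positions per iteration,
       so the loop counter and branch are paid once per four bytes. */
    while (i + 3 <= last) {
        if (text[i]     == c && memcmp(text + i,     pat, m) == 0) return i + 1;
        if (text[i + 1] == c && memcmp(text + i + 1, pat, m) == 0) return i + 2;
        if (text[i + 2] == c && memcmp(text + i + 2, pat, m) == 0) return i + 3;
        if (text[i + 3] == c && memcmp(text + i + 3, pat, m) == 0) return i + 4;
        i += 4;
    }

    /* Tail: up to three remaining start positions, one at a time. */
    for (; i <= last; i++)
        if (text[i] == c && memcmp(text + i, pat, m) == 0)
            return i + 1;
    return 0;
}
```

Whether this beats a simple one-byte loop depends on the compiler and cpu; the 30% figure above belongs to the hand-written assembly version, not to this sketch.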
Very good :thumbsup:
In case you need more inspiration (http://masm32.com/board/index.php?topic=4631.0)