Quora (https://www.quora.com/In-Python-how-do-you-search-for-a-substring-in-a-giant-string):
QuoteIn Python, how do you search for a substring in a giant string?
...
I made a 1GB+ string and timed searching for a substring which appears at the very end of the string:
Under 1 second on my MacBook Pro
Obviously, that tempted me ;-)
include \masm32\MasmBasic\MasmBasic.inc ; download (http://masm32.com/board/index.php?topic=94.0)
bytes=160000000 ; adapt to your HeapAlloc limits
uselib ShlWapi
Init
PrintCpu 0
Let esi=New$(bytes) ; create a fat buffer
Let edi=String$(20000,"x") ; and a small one with xxx
mov ebx, esi
For_ ecx=1 To bytes/20000-1
invoke MbCopy, ebx, edi, 20000 ; fill the buffer
xchg eax, ebx
Next
invoke MbCopyz, ebx, Chr$("There it is") ; add the string to search for
NanoTimer() ; time it
mov edx, bytes ; pass total length of string in edx
Print Str$("fast Instr_ at\t%i: ", Instr_(FAST, esi, "it is", 64))
PrintLine NanoTimer$()
NanoTimer()
Print Str$("slow Instr_ at\t%i: ", Instr_(esi, "it is")) ; slow Instr_() (http://www.webalice.it/jj2006/MasmBasicQuickReference.htm#Mb1153)
PrintLine NanoTimer$()
NanoTimer()
Print Str$("Masm32 find$ at\t%i: ", find$(1, esi, "it is")) ; slow
PrintLine NanoTimer$()
NanoTimer()
invoke crt_strstr, esi, Chr$("it is")
sub eax, esi
Print Str$("CRT strstr at\t%i: ", eax) ; slow
PrintLine NanoTimer$(), CrLf$, CrLf$, "case-insensitive:"
NanoTimer() ; time it
mov edx, bytes ; pass total length of string in edx
Print Str$("fast Instr_ at\t%i: ", Instr_(FAST, esi, "there it is", 64+2))
PrintLine NanoTimer$(), " (only first character case-insensitive)"
NanoTimer()
Print Str$("slow Instr_ at\t%i: ", Instr_(esi, "there it is", 1)) ; slow Instr_() (http://www.webalice.it/jj2006/MasmBasicQuickReference.htm#Mb1153)
PrintLine NanoTimer$(), " (full case-insensitive search)"
; NanoTimer()
; Print Str$("Masm32 match at\t%i: ", find$(1, esi, "there it is")) ; no such function
; PrintLine NanoTimer$()
NanoTimer()
lea edx, [esi+bytes-1000000]
push edx
invoke StrStrIA, edx, Chr$("there it is")
pop edx
sub eax, edx
Print Str$("StrStrIA at\t%i: ", eax)
Print NanoTimer$(), " (shortened because it's utterly slow)"
EndOfCodeOutput:
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
fast Instr_ at 1599980007: 171 ms
slow Instr_ at 1599980007: 1504 ms
Masm32 find$ at 1599980007: 1748 ms
CRT strstr at 1599980006: 1331 ms
case-insensitive:
fast Instr_ at 1599980001: 164 ms (only first character case-insensitive)
slow Instr_ at 1599980001: 1492 ms (full case-insensitive search)
StrStrIA at 980000: 223 ms (shortened because it's utterly slow)
P.S.: Recently, I upgraded my RAM from 4 to 6 GB. Before, I could HeapAlloc about 800MB, not it's about 1.6GB, sometimes even 1.8 :t
8 GB ram installed
AMD A4-7210 APU with AMD Radeon R3 Graphics
fast Instr_ at 1599980007: 290 ms
slow Instr_ at 1599980007: 2817 ms
Masm32 find$ at 1599980007: 2277 ms
CRT strstr at 1599980006: 2625 ms
case-insensitive:
fast Instr_ at 1599980001: 291 ms (only first character case-insensitive)
slow Instr_ at 1599980001: 2823 ms (full case-insensitive search)
StrStrIA at 980000: 377 ms (shortened because it's utterly slow)
Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
fast Instr_ at 1599980007: 110 ms
slow Instr_ at 1599980007: 677 ms
Masm32 find$ at 1599980007: 674 ms
CRT strstr at 1599980006: 710 ms
case-insensitive:
fast Instr_ at 1599980001: 107 ms (only first character case-insensitive)
slow Instr_ at 1599980001: 678 ms (full case-insensitive search)
StrStrIA at 980000: 98 ms (shortened because it's utterly slow)
OK, I'll weigh in..
AMD A6-9220e RADEON R4, 5 COMPUTE CORES 2C+3G
fast Instr_ at 1599980007: 288 ms
slow Instr_ at 1599980007: 1856 ms
Masm32 find$ at 1599980007: 1864 ms
CRT strstr at 1599980006: 1948 ms
case-insensitive:
fast Instr_ at 1599980001: 289 ms (only first character case-insensitive)
slow Instr_ at 1599980001: 1851 ms (full case-insensitive search)
StrStrIA at 980000: 266 ms (shortened because it's utterly slow)
Windows 7 Pro, 32 bit btw.
These AMD's seem to be really slow. :(
32 GB ram installed
Intel(R) Core(TM) i7-4810MQ CPU @ 2.80GHz
fast Instr_ at 1599980007: 136 ms
slow Instr_ at 1599980007: 795 ms
Masm32 find$ at 1599980007: 788 ms
CRT strstr at 1599980006: 822 ms
case-insensitive:
fast Instr_ at 1599980001: 114 ms (only first character case-insensitive)
slow Instr_ at 1599980001: 798 ms (full case-insensitive search)
StrStrIA at 980000: 121 ms (shortened because it's utterly slow)
Quote from: LiaoMi on January 26, 2019, 10:17:36 PM
32 GB ram installed
That's quite a lot! Can you test where is the limit? Run the exe from a DOS prompt as
PythonSearch.exe 2.0 meaning "try to allocate 2GB". Without a commandline arg, 1GB is assumed.
:biggrin:
Quotefrom a DOS prompt as PythonSearch.exe 2.0
Intel(R) Core(TM) i7-4810MQ CPU @ 2.80GHz
trying with 1.00 GB
fast Instr_ at 1073720007: 109 ms
slow Instr_ at 1073720007: 1038 ms
Masm32 find$ at 1073720007: 805 ms
CRT strstr at 1073720006: 842 ms
case-insensitive:
fast Instr_ at 1073720001: 105 ms (only first character case-insensitive)
slow Instr_ at 1073720001: 908 ms (full case-insensitive search)
StrStrIA at 978176: 187 ms (shortened because it's utterly slow)
no matter how I enter, always displays a message "trying with 1.00 GB"
You are running it with the default 1GB. Can you go higher, e.g. *.exe 1.99? I wonder if HeapAlloc can accept values >2.0 GB (it's 32-bit code) (see also Raymond Chen on A 32-bit application can allocate more than 4GB of memory, and you don't need 64-bit Windows to do it (https://blogs.msdn.microsoft.com/oldnewthing/20090706-00/?p=17623))
On my 12GB machine, 1.67 GB is the max:
Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz
trying with 1.67 GB
fast Instr_ at 1793120007: 199 ms
slow Instr_ at 1793120007: 1688 ms
Masm32 find$ at 1793120007: 1937 ms
CRT strstr at 1793120006: 1508 ms
case-insensitive:
fast Instr_ at 1793120001: 185 ms (only first character case-insensitive)
slow Instr_ at 1793120001: 1665 ms (full case-insensitive search)
StrStrIA at 971154: 220 ms (shortened because it's utterly slow)
1.99 - :( Fatal Error - HeapAlloc
PythonSearch 1.52 - is the max
Intel(R) Core(TM) i7-4810MQ CPU @ 2.80GHz
trying with 1.52 GB
fast Instr_ at 1632060007: 117 ms
slow Instr_ at 1632060007: 819 ms
Masm32 find$ at 1632060007: 791 ms
CRT strstr at 1632060006: 833 ms
case-insensitive:
fast Instr_ at 1632060001: 122 ms (only first character case-insensitive)
slow Instr_ at 1632060001: 825 ms (full case-insensitive search)
StrStrIA at 972428: 132 ms (shortened because it's utterly slow)
Hutch will be happy to see that even with 32GB of RAM, you can allocate only 1.5GB in a 32-bit process :P
Windows 7 x64 8 GBPythonSearch2.exe 1.99
AMD Athlon(tm) II X2 220 Processor
trying with 1.99 GB
fast Instr_ at 2136720007: 396 ms
slow Instr_ at 2136720007: 2712 ms
Masm32 find$ at 2136720007: 2949 ms
CRT strstr at 2136720006: 2370 ms
case-insensitive:
fast Instr_ at 2136720001: 398 ms (only first character case-insensitive)
slow Instr_ at 2136720001: 2720 ms (full case-insensitive search)
StrStrIA at 973770: 227 ms (shortened because it's utterly slow)
Wow! What you get with 2.01?
After 1.99 it fallback to 1 GB.
Forgot to mention of using a editbin.exe /LARGEADDRESSAWARE PythonSearch2.exe
without it 1.78 GB
Quote from: jj2007 on January 27, 2019, 04:42:08 AM
Quote from: TimoVJL on January 27, 2019, 02:52:07 AM
After 1.99 it fallback to 1 GB.
Forgot to mention of using a editbin.exe /LARGEADDRESSAWARE PythonSearch2.exe
without it 1.78 GB
Sorry, forgot to allow >2GB :icon_redface:
This one allows >2GB and is linked with /LARGEADDRESSAWARE
Intel(R) Core(TM) i7-4810MQ CPU @ 2.80GHz
trying with 1.99 GB
fast Instr_ at 2136720007: 154 ms
slow Instr_ at 2136720007: 1093 ms
Masm32 find$ at 2136720007: 1053 ms
CRT strstr at 2136720006: 1091 ms
case-insensitive:
fast Instr_ at 2136720001: 171 ms (only first character case-insensitive)
slow Instr_ at 2136720001: 1062 ms (full case-insensitive search)
StrStrIA at 973770: 122 ms (shortened because it's utterly slow)
> 2 Gb
---------------------------
Fatal error:
---------------------------
Insufficient buffer or New$(<=0)
---------------------------
OK
---------------------------
> Hutch will be happy to see that even with 32GB of RAM, you can allocate only 1.5GB in a 32-bit process
I test with 64 gig of memory but with /LARGEADDRESSAWARE in 32 bit you should be able to allocate up to 2 gig and still have about 700 meg free. Depends on the video card.
@JJ
I'm not sure why Python is mentioned or whether why you are trying to challenge Python, but anything in assembly will be monumentally faster. The Python method is find('substring', start, finish). In Ruby, it is include?, but Ruby has a bajillion versions all of them are syntactic sugar for the same thing...
A while back in a Data Structures class we had to implement string search algorithms, so I made one from scratch in Java, Ruby and Python, and tested them. The preferred algorithm to search for a substring in an array of characters is some version of rabin-karp or knuth-pratt algorithms. I believe that all languages use some form of these, optimized for the language platform.
I attached a file done in Ruby that is based on the rabin-karp, it can search any string and find the pattern and offset, and match it.
Anywho, I will add this to my list of scripts to make in assembly, this will be fun, since it requires a good random num generator, prime generator, regex parser, and hash generator.
Python is slow. Except when it is fast.
We must keep separate the language (often slow) and its functions, which may have been coded in excellent assembly. Python, for example, has the fastest sort functions of all hi level languages.