Dear all,
Can I have some timings please? I am testing a new GetFiles (https://www.jj2007.eu/MasmBasicQuickReference.htm#Mb1056) implementation.
Thanks, J
AMD Athlon Gold 3150U with Radeon Graphics
\Masm32
267 ms for new algo, 17400 files, bufsize 4096
2394 ms for old algo, 17398 files, ratio 8.937
231 ms for new algo, 17400 files, bufsize 4096
2333 ms for old algo, 17398 files, ratio 10.07
242 ms for new algo, 17400 files, bufsize 4096
2342 ms for old algo, 17398 files, ratio 9.640
It yields only *.as?, *.dll, *.rc, *.inf, *.txu and *.etl files in your \Masm32 folder. Start the exe from a DOS prompt on your \Masm32 drive.
Not sure how much use this is :biggrin:
5 ms for new algo, 936 files, bufsize 4096
44 msng for old algo, 936 files, ratio 7.390
5 ms for new algo, 936 files, bufsize 4096
36 msng for old algo, 936 files, ratio 7.192
5 ms for new algo, 936 files, bufsize 4096
35 msng for old algo, 936 files, ratio 6.330
msng?
Quote from: sinsi on May 14, 2024, 08:50:14 PM
Not sure how much use this is
You get a complete folder tree in a Files$(index) array. For me it's occasionally useful :biggrin:
OK, I was thinking that 936 files was a small sample size timing-wise compared to 17400.
No problem, the ratios are pretty clear: about a factor of 7. Btw the commandline accepts any folder path, in case you want to test it, at your own risk, with C:\Windows :badgrin:
Quote from: sinsi on May 14, 2024, 09:13:48 PM
OK, I was thinking that 936 files was a small sample size timing-wise compared to 17400.
Me too. From my E: drive
C:\Users\Administrator\Downloads\GetFilesNew>GetFilesNew.exe
Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
E:\
1873 ms for new algo, 36843 files, bufsize 4096
7735 ms for old algo, 36859 files, ratio 4.129
1095 ms for new algo, 36843 files, bufsize 4096
7886 ms for old algo, 36859 files, ratio 7.197
1181 ms for new algo, 36843 files, bufsize 4096
7807 ms for old algo, 36859 files, ratio 6.609
C:\Users\Administrator\Downloads\GetFilesNew>pause
Press any key to continue . . .
Hmmm..
The new algo is missing some files, in my test.
The old algo is missing a couple in your test, jj.
Not good for the case where those missing files were the ones I might have been searching for.
It seems that the two algos do not behave in the same manner, so their results are not equal. There are hiccups in both, it appears.
If the results are not identical between the two algos, the test is invalid: results must be verified for accuracy before the timings mean anything.
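One way to check that the two algos agree is to diff their output lists. A minimal sketch in Python, assuming each listing holds one path per line; the helper name diff_listings and the sample paths are hypothetical, not taken from the test program:

```python
def diff_listings(new_lines, old_lines):
    """Return (only_in_new, only_in_old) as sorted lists, ignoring case."""
    # Windows paths are case-insensitive, so compare lower-cased copies.
    new_set = {line.strip().lower() for line in new_lines if line.strip()}
    old_set = {line.strip().lower() for line in old_lines if line.strip()}
    return sorted(new_set - old_set), sorted(old_set - new_set)


if __name__ == "__main__":
    # Hypothetical sample data standing in for the two listings.
    new = ["E:\\a\\one.asm", "E:\\a\\two.rc"]
    old = ["E:\\a\\one.asm", "E:\\a\\three.inf_loc"]
    only_new, only_old = diff_listings(new, old)
    print("only in new:", only_new)
    print("only in old:", only_old)
```

If both returned lists are empty, the two listings contain the same files and the timing comparison is fair.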
Quote from: sudoku on May 14, 2024, 11:13:07 PM
The new algo is missing some files, in my test.
The old algo is missing a couple in your test, jj.
Can you give me an example? The old one finds more files because it accepted xx.inf_loc (frequent in WinXSX) as *.inf, which was wrong imho, so I eliminated that behaviour. The new one should not miss any files, in theory, so I'd be grateful if you could give me examples of missing files.
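The difference between the two behaviours can be illustrated with strict whole-name wildcard matching. This is only a sketch in Python using the standard fnmatch module, not the MasmBasic code; the pattern list is the one named earlier in the thread, and the file names are made up:

```python
from fnmatch import fnmatchcase

# Pattern list as described for the test program.
PATTERNS = "*.as?|*.dll|*.rc|*.inf|*.txu|*.etl".split("|")


def matches(filename, patterns=PATTERNS):
    """True if the whole file name matches one of the wildcard patterns."""
    name = filename.lower()  # Windows file names are case-insensitive
    return any(fnmatchcase(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for name in ("setup.inf", "xx.inf_loc", "test.asm", "Test.ASC"):
        print(name, matches(name))
```

With whole-name matching, xx.inf_loc is rejected for *.inf because the name does not end in .inf.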
Quote from: jj2007 on May 15, 2024, 12:24:37 AM
The new one should not miss any files, in theory, so I'd be grateful if you could give me examples of missing files.
The first test had the drive listed as E:\ I changed it to E: as there were two backslashes in the first round.
Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
E:
2042 ms for new algo, 36843 files, bufsize 4096
Running GetFiles
8474 ms for old algo, 36859 files, ratio 4.150
1150 ms for new algo, 36843 files, bufsize 4096
Running GetFiles
8545 ms for old algo, 36859 files, ratio 7.425
1131 ms for new algo, 36843 files, bufsize 4096
Running GetFiles
8491 ms for old algo, 36859 files, ratio 7.508
Attachment removed as the intended recipient has received it.
Thanks, very useful :thumbsup:
NewF.txt, line 3024:
E:\archive...scripts\pass
A file with no extension; I'll have to check how to deal with that :cool:
GetF.txt, line 6076:
E:\archive...test\.asm
Not found by the new algo - hmmmm... new version attached, should solve this problem.
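Both edge cases (a name with no extension, and a name that is only an extension) are easy to get wrong. As an illustration in Python, not the actual fix: the standard splitext treats a leading dot as part of the name, so a filter built on it would also skip ".asm". The helper names and paths below are hypothetical:

```python
import ntpath  # Windows path rules, usable on any platform


def naive_ext(path):
    """Extension as reported by splitext; a leading dot counts as the name."""
    return ntpath.splitext(path)[1].lower()


def suffix_after_last_dot(path):
    """Everything from the last dot of the file name, or '' if there is none."""
    name = ntpath.basename(path)
    dot = name.rfind(".")
    return name[dot:].lower() if dot >= 0 else ""


if __name__ == "__main__":
    for p in ("E:\\demo\\pass", "E:\\demo\\.asm", "E:\\demo\\prog.rc"):
        print(p, repr(naive_ext(p)), repr(suffix_after_last_dot(p)))
```

Which behaviour is right depends on whether a file named only ".asm" should count as an *.asm file; the thread treats it as one.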
File was produced by an errant program. :biggrin: "I've been through the desert on a horse file with no name"...
Yes, you have plenty of them. Version 2 ^^^ should find them.
Is that where all the 'missing' files were not found?
I'll run the latest version when I get back in front of my computer.
Hmmm... using drive E: as an argument produces this now:
Image removed.
I really don't need a list of the masm32 examples folder. :smiley:
But if you insist...
D:\>GetFilesNewV2.exe
Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
\Masm32\Examples
6 ms for new algo, 474 files, bufsize 4096
47 ms for old algo, 474 files, ratio 6.919
D:\>pause
Press any key to continue . . .
btw hard coding the E: drive (as search path) in olly produces results similar to the png image above, and no file list files are produced in either case.
Anyway, I cannot change the path to recreate the original search on my "E:" drive... and successfully create the file listings with any of the methods I had tried.
Is there any way that you could use a dialog box for the user to select the path to search, similar to my file lister Here (https://masm32.com/board/index.php?msg=129508) ??? That would simplify it a bit. (At least for the user)
Weird, that dialog with E:...
AMD Athlon Gold 3150U with Radeon Graphics
E:
54 ms for new algo, 368 files, bufsize 4096
374 ms for old algo, 368 files, ratio 6.914
Is your E: drive extremely full?
Tried with C:\ (C: gives a different result, are you using relative paths?)
(3e68.280c): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=c0000005 ebx=0018af28 ecx=00000058 edx=0018add8 esi=0018aee8 edi=0018c247
eip=00007300 esp=0018adf8 ebp=65006c00 iopl=0 nv up ei ng nz na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010286
00007300 ?? ???
Call Stack Return Address = GetFilesNewV2+0x8078
00408078 61 73 3f 20 64 6c 6c 20-72 63 20 20 69 6e 66 20 as? dll rc inf
00408088 74 78 75 20 65 74 6c 20-00 5c 2a 2e 61 73 3f 7c txu etl .\*.as?|
00408098 2a 2e 64 6c 6c 7c 2a 2e-72 63 7c 2a 2e 69 6e 66 *.dll|*.rc|*.inf
004080a8 7c 2a 2e 74 78 75 7c 2a-2e 65 74 6c 00 00 00 00 |*.txu|*.etl....
004080b8 5c 6e 25 69 20 6d 73 00-5c 74 66 6f 72 20 6e 65 \n%i ms.\tfor ne
004080c8 77 20 61 6c 67 6f 2c 20-25 69 20 66 69 6c 65 73 w algo, %i files
004080d8 00 00 00 00 2c 20 62 75-66 73 69 7a 65 20 25 69 ...., bufsize %i
004080e8 00 00 00 00 4e 65 77 46-2e 74 78 74 00 00 00 00 ....NewF.txt....
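On the C:\ versus C: question: on Windows, "C:" without a backslash is a drive-relative path (the current directory on drive C), while "C:\" is the root. A small Python sketch using the standard ntpath module shows the distinction; the describe helper is hypothetical:

```python
import ntpath  # Windows path rules, usable on any platform


def describe(path):
    """Return (drive, rest, is_absolute) for a Windows-style path."""
    drive, rest = ntpath.splitdrive(path)
    return drive, rest, ntpath.isabs(path)


if __name__ == "__main__":
    for p in ("C:", "C:\\", "E:", "E:\\"):
        print(p, describe(p))
    # Joining onto a bare drive letter stays relative to that drive's
    # current directory; joining onto the root does not double the backslash.
    print(ntpath.join("C:", "Windows"))
    print(ntpath.join("C:\\", "Windows"))
```

So a scan started with "C:" may cover only whatever folder happens to be current on that drive, which could explain differing results.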
Quote from: jj2007 on May 15, 2024, 02:31:40 AM
Is your E: drive extremely full?
Nope. But there are 433607 total files there, which shouldn't make a difference.
Image removed.
Memory usage keeps climbing until that dialog box pops up again.
Several images removed.
Quote from: sudoku on May 15, 2024, 02:48:22 AM
Memory usage keeps climbing until that dialog box pops up again.
Interesting. The only HeapAlloc in this program is the buildup of the string array. Assuming a path has 100 bytes on average, you would need 11 million files to arrive there.
It seems there is an endless loop going on. The question is why... I will see if I can cook up a debugging version, thanks a lot for your feedback :thup:
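The arithmetic behind that estimate, as a rough Python sketch. The 100-byte average is the assumption stated above; array overhead per string is ignored:

```python
AVG_PATH_BYTES = 100  # average path length assumed in the post above


def array_bytes(n_files, avg=AVG_PATH_BYTES):
    """Rough size of the string data for n_files paths."""
    return n_files * avg


if __name__ == "__main__":
    for n in (433_607, 11_000_000):
        print(f"{n:>10} files -> {array_bytes(n) / 2**20:8.1f} MiB")
```

Under that assumption the 433607 files on E: need only a few dozen MiB, so memory climbing far beyond that points to a loop rather than to the sheer number of files.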
Quote from: jj2007 on May 15, 2024, 03:49:42 AM
It seems there is an endless loop going on. The question is why... I will see if I can cook up a debugging version, thanks a lot for your feedback :thup:
The problem is not exclusive to my E: drive. Using C: as a parameter, I have the same results. I get the "low on memory" dialog box again.
OK, here is a test version that displays memory use and #files every 256 paths created. I am curious...
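The idea of such a debug build can be sketched in Python as follows. This is not the MasmBasic code: the function name is hypothetical, and the byte counter only approximates memory use by summing path lengths:

```python
import os
import tempfile


def collect_files(root, every=256, report=print):
    """Collect all file paths under root, reporting every `every` paths."""
    paths = []
    total_bytes = 0  # rough size of the string data only
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if len(paths) % every == 0:
                report(f"#{len(paths):6d}, {total_bytes // 2**20} MB  {full}")
            paths.append(full)
            total_bytes += len(full)
    return paths


if __name__ == "__main__":
    # Tiny throwaway tree so the demo is harmless to run.
    with tempfile.TemporaryDirectory() as root:
        for i in range(5):
            open(os.path.join(root, f"f{i}.asm"), "w").close()
        found = collect_files(root, every=2)
        print(len(found), "files")
```

A steadily rising counter with repeating paths would be the signature of an endless loop.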
Okay, on the first run with C:, it completed successfully. I did not save the results that I had in an open txt file on the desktop. There were 17000 odd files there. (I had rebooted - read further)
On subsequent runs...
E: drive:
Image removed.
C: drive:
Image removed.
I did not let this run finish, I knew the outcome. I rebooted to see if that would make a difference, but alas, every run after the first fails.
On the first run (drive C:), it did not list the files as you see in the pngs above, but exited gracefully after displaying the results and pressing "any" key.
Just for kicks, I reran the program for C: drive
C:\Users\Administrator\Downloads\GetFilesNewDebug>GetFilesNewDebug.exe C:
Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
Scanning C:
0 ms for new algo, 1 files, bufsize 4096
20856 ms for old algo, 17246 files, ratio 5.794e+05
hit any key
new algo 1 file? But nothing is in that txt file
ratio 5.794e+05???
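A possible reading of that ratio, assuming it is simply old time divided by new time computed from unrounded timings (an assumption, the source does not say): the new algo must have returned almost immediately. In Python:

```python
def implied_new_ms(old_ms, ratio):
    """Time the new algo must have taken if ratio = old_ms / new_ms."""
    return old_ms / ratio


if __name__ == "__main__":
    # Values from the run above: 20856 ms, ratio 5.794e+05.
    print(f"{implied_new_ms(20856, 5.794e5):.4f} ms")
```

That works out to a few hundredths of a millisecond, which fits "1 files" and an empty output file: the new algo gave up right away instead of scanning.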
Over 6 Million files? Almost impossible...
There are also many \.\.\.\. sequences in your screenshot, which don't make much sense.
Here's what I see:
Scanning D:\Masm32
# 0, 0 MB D:\Masm32\asmc-master\li..ter\lib\amd64\import.asm
# 1024, 0 MB D:\Masm32\asmc-master\so..lib\src\conio\dlshow.asm
# 2048, 0 MB D:\Masm32\asmc-master\so..\lib64\string\memcpy.asm
# 3072, 0 MB D:\Masm32\CmdGUI\LibTmp\LibTmpFT.asm
# 4096, 0 MB D:\Masm32\Gfa2Masm\DimRe..l\DimRecallGetFilesC.asc
# 5120, 0 MB D:\Masm32\GmpBigNum\gmp-..ltrasparct3\addmul_1.asm
# 6144, 0 MB D:\Masm32\Jwasm\Samples\..Ctrl\Release\AsmCtrl.dll
# 7168, 0 MB D:\Masm32\MasmBasic\AscU..YourManifestDualBuild.rc
# 8192, 0 MB D:\Masm32\MasmBasic\AscU..\TmpOut\PellesC_stub.asc
# 9216, 0 MB D:\Masm32\MasmBasic\BugT..sts\InstrFast4Arrays.asc
#10240, 0 MB D:\Masm32\MasmBasic\BugT..ol\GuiTableControlOld.rc
#11264, 0 MB D:\Masm32\MasmBasic\Misc..HighestFrequencySort.asc
#12288, 0 MB D:\Masm32\MasmBasic\Res\MbGui30J.asc
#13312, 0 MB D:\Masm32\MasmBasic\Timi..mings\Shift_vs_movsx.asc
#14336, 0 MB D:\Masm32\Members\kkurki..urkiewicz\Dialog\prog.rc
#15360, 0 MB D:\Masm32\ObjAsm32\Code\..ode\COM\COM_Dispatch.asm
#16384, 0 MB D:\Masm32\RichMasm\Db32\DbMap4APo.asc
484 ms for new algo, 17401 files, bufsize 4096
2396 ms for old algo, 17399 files, ratio 4.946
Now I run it on my C: drive... finished:
42061 ms for new algo, 68082 files, bufsize 4096
113748 ms for old algo, 70073 files, ratio 2.704
I piped the output to a text file... this is for E:
I tried one more time. Memory kept going up and up, so I aborted
All the filenames have 20h padding after the (truncated) filenames??
Quote from: jj2007 on May 15, 2024, 05:23:46 AM
Over 6 Million files? Almost impossible...
Yes indeedy. On E:. I only have 433607 files! On C: much less than that
Obviously we need more testers here, to ensure that this is not just another one of those ME issues. :tongue: (it's a curse I have had once or thrice before)
I attached some of the output in the post above.
Quote from: sudoku on May 15, 2024, 05:28:53 AM
All the filenames have 20h padding after the (truncated) filenames??
Yes, for cosmetic reasons only.
What do these sequences mean? Do you also get them e.g. with a dir E:\archive\desktops\*.rc?
#1024, 0 MB E:\archive\desktops\2023..ix
\.\.\.\.\.\rsrc.old.rc
Btw both the new and the old algo write all files to disk as NewF.txt and GetF.txt
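For the cosmetic truncation and 20h padding mentioned above, a Python sketch of the general idea. The width of 50 and the ".." filler are guesses from the sample output, and the function name is hypothetical:

```python
def shorten(path, width=50, filler=".."):
    """Pad short paths with spaces; cut the middle out of long ones."""
    if width <= len(filler):
        raise ValueError("width must exceed the filler length")
    if len(path) <= width:
        return path.ljust(width)  # pad with spaces (20h)
    keep = width - len(filler)
    head = keep // 2
    tail = keep - head
    return path[:head] + filler + path[-tail:]


if __name__ == "__main__":
    # Hypothetical paths, not taken from the thread.
    print("[" + shorten("D:\\Masm32\\short.asm") + "]")
    print("[" + shorten("D:\\Masm32\\" + "sub\\" * 20 + "file.asm") + "]")
```

Only the displayed text is shortened this way; the stored path stays complete.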
Quote from: jj2007 on May 15, 2024, 05:53:00 AM
Do you also get them e.g. with a dir E:\archive\desktops\*.rc?
I can check that later... I have other things to attend to at the moment.
Quote from: jj2007 on May 15, 2024, 05:53:00 AM
Do you also get them e.g. with a dir E:\archive\desktops\*.rc?
No.
I'll be back later.
Quote from: sudoku on May 15, 2024, 06:07:09 AM
attached as rc.zip
Thanks. Weird, I don't have the faintest idea what's going on. Can I come over to debug it on your machine? :biggrin:
Quote from: jj2007 on May 15, 2024, 06:44:38 AM
Thanks. Weird, I don't have the faintest idea what's going on. Can I come over to debug it on your machine?
lol
I just think we need more testers. I know some of my files were an issue (those with only an extension), but other than that, who knows???
I have been playing with this for over four hours now, time for someone else to step up and give you some other results. :smiley:
Quote from: jj2007 on May 15, 2024, 05:23:46 AM
There are also many \.\.\.\. sequences in your screenshot, which don't make much sense.
I had tons of those in a previous Windows 7 installation, apparently some kind of link or virtual folder; gives you "access denied" if you try to click on the path in Explorer. Also very difficult to get rid of! (My current Windows 7 has none of these.)
Quote from: NoCforMe on May 15, 2024, 07:25:28 AM
Quote from: jj2007 on May 15, 2024, 05:23:46 AM
There are also many \.\.\.\. sequences in your screenshot, which don't make much sense.
I had tons of those in a previous Windows 7 installation, apparently some kind of link or virtual folder; gives you "access denied" if you try to click on the path in Explorer. Also very difficult to get rid of! (My current Windows 7 has none of these.)
Just so you know NoCforMe, none of those show up using 'dir', or my own 'file lister program' (linked above in #12) to get the file listings. Some other oddity happening here. And all files listed by those two methods are accessible.
But just for kicks, I will reinstall Windows 7 later tonight. jj, don't take down the attachments. I will be testing those with a clean install later tonight. (It's 5:ish PM here right now) :biggrin:
When you run it with this path, what happens? Can you post the NewF.txt please?
#512, 0 MB E:\archive\desktops\2023.. 2024\.\.\.\SFSudoku.asm
#768, 0 MB E:\archive\desktops\2023..\solver\.\.\.\solver.asm
Quote from: jj2007 on May 15, 2024, 07:43:38 AM
When you run it with this path, what happens? Can you post the NewF.txt please?
#512, 0 MB E:\archive\desktops\2023.. 2024\.\.\.\SFSudoku.asm
#768, 0 MB E:\archive\desktops\2023..\solver\.\.\.\solver.asm
The debug version or the original version?
From the debug version:
C:\Users\Administrator\Downloads\GetFilesNewDebug>GetFilesNewDebug.exe E:\archiv
e\desktops\2023.. 2024\.\.\.\SFSudoku.asm
Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
Scanning E:\archive\desktops\2023.. 2024\.\.\.\SFSudoku.asm
0 ms for new algo, 1 files, bufsize 4096
0 ms for old algo, 0 files, ratio 3.104
hit any key
C:\Users\Administrator\Downloads\GetFilesNewDebug>GetFilesNewDebug.exe E:\archiv
e\desktops\2023..\solver\.\.\.\solver.asm
Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
Scanning E:\archive\desktops\2023..\solver\.\.\.\solver.asm
0 ms for new algo, 1 files, bufsize 4096
0 ms for old algo, 0 files, ratio 3.474
hit any key
both files (GetF.txt and NewF.txt) remain at zero bytes, even though the 'new' version reports '1 files'.
The file "E:\archive\desktops\2023.. 2024\.\.\.\SFSudoku.asm" should be
"E:\archive\desktops\2023\05 may\5-19-2023 desktop 1\sudoku 2024\SFSudoku.asm", or similar. As shown here...
Image removed.
I can navigate there just fine with windows explorer, btw. And none of them are 0 bytes.
Image removed.
Quote from: jj2007 on May 15, 2024, 07:43:38 AM
When you run it with this path, what happens?
Sorry, I meant with the first part of this path, i.e. E:\archive\desktops\2023 (or similar, I can't see the complete path)
Quote from: jj2007 on May 15, 2024, 08:07:02 AM
Sorry, I meant with the first part of this path, i.e. E:\archive\desktops\2023
C:\Users\Administrator\Downloads\GetFilesNewDebug>GetFilesNewDebug.exe E:\archiv
e\desktops\2023
Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
Scanning E:\archive\desktops\2023
0 ms for new algo, 1 files, bufsize 4096
0 ms for old algo, 0 files, ratio 0.6126
hit any key
both files remain zero bytes.
I am going to do a clean install later this evening. I will run both tests afterwards.
From a clean install version2 on drive C:
C:\Users\Administrator\Downloads\GetFilesNew2>GetFilesNewV2.exe C:
Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
C:
0 ms for new algo, 1 files, bufsize 4096
21477 ms for old algo, 16137 files, ratio 6.136e+05
C:\Users\Administrator\Downloads\GetFilesNew2>pause
Press any key to continue . . .
from clean install debug version drive C:
C:\Users\Administrator\Downloads\GetFilesNewDebug>GetFilesNewDebug.exe C:
Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
Scanning C:
0 ms for new algo, 1 files, bufsize 4096
4318 ms for old algo, 16137 files, ratio 7.080e+04
hit any key
In both cases, NewF.txt is zero bytes, GetF.txt is appropriately sized.
For drive E:, it had the memory creep issues. So, I did not let them finish.
I am now officially done here. :smiley:
Regarding .\.\.\ ...:
The term I was looking for is "junction point" or "reparse point". More info here (https://www.geeksforgeeks.org/creating-junction-points/#).
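If reparse points were involved (not confirmed in this thread), a folder that links back to one of its parents could make a recursive scan run forever. A hedged Python sketch of one common guard, which remembers the resolved location of every folder already visited; the function name is hypothetical and this is not how GetFiles works:

```python
import os


def walk_no_cycles(root):
    """List files under root, visiting each real folder only once."""
    seen = set()
    stack = [root]
    files = []
    while stack:
        folder = stack.pop()
        real = os.path.realpath(folder)  # resolves links to the real location
        if real in seen:
            continue  # already scanned via another route: skip, no loop
        seen.add(real)
        try:
            with os.scandir(folder) as it:
                entries = list(it)
        except OSError:
            continue  # e.g. access denied
        for entry in entries:
            if entry.is_dir():
                stack.append(entry.path)
            elif entry.is_file():
                files.append(entry.path)
    return files


if __name__ == "__main__":
    print(len(walk_no_cycles(".")), "files under the current folder")
```

The cost is one set entry per folder, which is small compared with the file list itself.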
Quote from: sudoku on May 15, 2024, 08:10:35 AM
E:\archiv
e\desktops\2023
Is that the full folder path? Or could it be E:\archive\desktops\2023 whatever?
If there is a space in the path, one would need to use GetFilesNewDebug.exe "E:\archive\desktops\2023 whatever" (if you drag a file with spaces around, Windows adds the quotes).
I have no other explanation for the 0 files result :sad:
Even if there were spaces in the path, your program should be able to handle that. Especially if the search path given is only the drive letter, i.e., C:, D:, E:, .....
Anyway, I am copying the entire contents of my E: partition, file by file, folder by folder, to another drive.
Image removed.
I am thinking that the file system is compromised on the E: partition somehow. Probably a good time to delete some old test code and other junk (unusable projects, etc.) that will probably never be used again. :biggrin:
As a side note, I am quite sure that there are no junction points, reparse points, symbolic links, etc. on my E: partition. Windows does that with some files/folders though, on the OS partition (winsxs for instance).
476,197 files, wow. OK, that perhaps explains the memory use. So now I am running into a conceptual issue: a million files is not a problem for the dir command, which just dumps them all to the console. In contrast, GetFiles() (https://www.jj2007.eu/MasmBasicQuickReference.htm#Mb1056) creates an array of strings for further processing in memory, which implies there will be a limit...
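The contrast between dumping and collecting can be sketched in Python: a generator hands out one path at a time (like dir writing to the console), while building a list keeps every path in memory (like the string array). The function names are hypothetical:

```python
import os
import tempfile


def iter_files(root):
    """Yield file paths one by one; nothing is kept in memory."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def count_files(root):
    """Streaming use: memory stays flat regardless of the file count."""
    return sum(1 for _ in iter_files(root))


def all_files(root):
    """Array use: memory grows with the number of files."""
    return list(iter_files(root))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for i in range(3):
            open(os.path.join(root, f"f{i}.rc"), "w").close()
        print(count_files(root), len(all_files(root)))
```

The array form is what allows further processing such as sorting or indexing, at the price of a memory ceiling.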
Quote from: sudoku on May 15, 2024, 09:00:51 AM
Even if there were spaces in the path, your program should be able to handle that.
Actually, it can handle spaces, I just checked, and it turns out it uses the complete commandline.
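Using the complete command line as the path can be mimicked in Python by re-joining the arguments. A sketch only, with a hypothetical helper name; note that runs of several spaces in an unquoted path would not survive this:

```python
import sys


def path_from_args(argv):
    """Treat everything after the program name as one path."""
    # Re-join the pieces so an unquoted path with spaces stays whole,
    # then drop surrounding quotes if any were passed through.
    return " ".join(argv[1:]).strip().strip('"')


if __name__ == "__main__":
    print(path_from_args(sys.argv) or "(no path given)")
```

With this, both the quoted and the unquoted form of a path containing a single space give the same result.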
Because you use an HD, shouldn't each tester give drive info?
RAM drive, SSD, physical HD, old HD or brand new HD
Would a fragmented or non-fragmented drive have an effect on speed?
Yes, drive info could be useful, but it's not essential. What I really want to see is the speed difference between the old version (relying on FindFirstFile) and the new one.
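A timing harness of this kind can be sketched in Python. The two listers below are stand-ins only: they are not the MasmBasic algos and FindFirstFile is not called here. The point is the shape of the test, namely time both, check that the results agree, then report the ratio:

```python
import os
import tempfile
import time


def list_old(root):
    """Baseline stand-in: os.listdir plus a separate isdir call per entry."""
    found = []
    for name in os.listdir(root):
        full = os.path.join(root, name)
        if os.path.isdir(full):
            found.extend(list_old(full))
        else:
            found.append(full)
    return found


def list_new(root):
    """Candidate stand-in: os.scandir, which delivers the entry type too."""
    found = []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir():
                found.extend(list_new(entry.path))
            else:
                found.append(entry.path)
    return found


def time_ms(func, root):
    start = time.perf_counter()
    result = func(root)
    return (time.perf_counter() - start) * 1000.0, result


def compare(root):
    new_ms, new_files = time_ms(list_new, root)
    old_ms, old_files = time_ms(list_old, root)
    # Timings mean little unless both listings agree.
    same = sorted(new_files) == sorted(old_files)
    ratio = old_ms / new_ms if new_ms > 0 else float("inf")
    return new_ms, old_ms, ratio, same


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        for i in range(20):
            open(os.path.join(root, "sub", f"f{i}.asm"), "w").close()
        new_ms, old_ms, ratio, same = compare(root)
        print(f"{new_ms:.2f} ms new, {old_ms:.2f} ms old, "
              f"ratio {ratio:.3f}, same={same}")
```

Running each variant several times, as the test program does, helps separate cold-cache from warm-cache timings.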
Less than half a million files is a small number. You wanted to test the speed difference between two algos; I chose that partition for a reason. You never know how the end user will use the function, presuming you are including it in MasmBasic. So, you can't really cherry-pick the easiest search path with far fewer files. I have an external drive with over 4 million files. But that would be extremely slow to search, as it is a USB2 connection, which inherently makes it slower than an internal hard drive or partition. :biggrin:
If it is going to be "production" code, it should be put through the wringer with various tests to ensure it is bullet proof. :cool:.
Anyway, it is clear that your new function is much faster than the old. :thumbsup: about 7x
[detour]
The upshot of my part in this testing is that I have cleaned up that partition. Removed the errant ".asm" files and associated files, got rid of some old test code, got rid of several folders that were extracted from large .zip files (I left the .zip files of course, for use when needed), and some other housekeeping chores on a couple other partitions. :biggrin: So far, I have not found any corrupted files.
Now doing housekeeping on my external drives is another matter. Too big of a task to do in one day, I'll have to put that on the back burner. lol
Side note: The partition that I cleaned had an MFT of almost 800 MB. After the cleanup and formatting, and moving the files back, the MFT is now only 73 MB. Not that space is an issue these days, but thought it was worth a mention. The MFT continually grows, but doesn't shrink when files are deleted (without using external tools). Moving the files en masse to another partition, formatting, then moving them back also works to reduce the MFT size. Sorry for the detour.[/detour]