FindFirstFile and the question mark

Started by jj2007, December 03, 2023, 12:23:02 PM

jj2007

I always believed the "?" in a dir command indicated "one character". I was wrong.

Here are the two DLLs in my Python folder:
D:\Python\python3.dll
D:\Python\python313.dll

D:\Python>dir python3?.dll
13/10/2023  10:12            68.888 python3.dll
               1 File         68.888 byte
Oops :sad:

D:\Python>dir python3??.dll
13/10/2023  10:12            68.888 python3.dll
13/10/2023  10:12         6.092.568 python313.dll
               2 File      6.161.456 byte
Oops again :rolleyes:

FindFirstFile does the same. I learned something today :smiley:

This is on Windows 10. See also

Strange Windows DIR command behavior

glob: The other common wildcard is the question mark (?), which stands for one character.

Using the Question Mark Wildcard: You could say "goals?.txt" because this is going to match every file that starts with goals, then has one single character, and then is followed by .txt.
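For comparison, here is a small sketch of the "exactly one character" meaning that the two pages above describe. It uses Python's standard fnmatch module as a stand-in for a Unix-style glob. The file names are made up for illustration, and this is not what dir or FindFirstFile do.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, chosen only to illustrate the glob rule.
names = ["goals.txt", "goals1.txt", "goals12.txt"]

# In a Unix-style glob, "?" must consume exactly one character.
matches = [n for n in names if fnmatchcase(n, "goals?.txt")]
print(matches)
```

Under glob rules only goals1.txt qualifies; goals.txt is rejected because the question mark has nothing to consume.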

NoCforMe

Sloppy, sloppy coding on Micro$oft's part. I'm sure this behavior has been in there since the good old DOS days.

Too bad Windoze doesn't use regular expressions like *nixes do, whose behavior is well-defined and reliable. (I suppose there are shells for Windows out there that do this?)
Assembly language programming should be fun. That's why I do it.

adeyblue

Quote from: NoCforMe on December 03, 2023, 12:38:32 PM
Too bad Windoze doesn't use regular expressions
...he says straight after a post demonstrating that the question mark works exactly like it does in a regular expression.
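A rough illustration of that comparison, using Python's re module: in a regular expression the question mark makes the preceding element optional, so the closest regex spelling of the wildcard python3?.dll would be something like python3.?\.dll. The name python31.dll is hypothetical and only there to show the one-character case.

```python
import re

# ".?" = zero or one arbitrary character; "\." = a literal dot.
pattern = re.compile(r"python3.?\.dll")

# python31.dll is a made-up name for illustration.
names = ["python3.dll", "python31.dll", "python313.dll"]
regex_hits = [n for n in names if pattern.fullmatch(n)]
print(regex_hits)
```

This is only an analogy: the regex accepts zero or one character, much like the dir output above, but it is not how Windows implements the match.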

Biterider

Quote from: NoCforMe on December 03, 2023, 12:38:32 PM
Too bad Windoze doesn't use regular expressions
Hi NoCforMe
Don't forget how old the Windows shell is and how many times it has been updated.
The new shell version, PowerShell, supports regular expressions right from the beginning.
 (https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_regular_expressions?view=powershell-7.4)

Biterider

jj2007

Raymond Chen: Clarifying the documentation on wildcards accepted by FindFirstFile/FindFirstFileEx

This is the only Microsoft page that appears for a findfirstfile wildcards search - incredible. And Chen does not address the question mark issue.

Then there is the official FindFirstFileA doc:
Quote: [in] lpFileName

The directory or path, and the file name. The file name can include wildcard characters, for example, an asterisk (*) or a question mark (?).

This parameter should not be NULL, an invalid string (for example, an empty string or a string that is missing the terminating null character), or end in a trailing backslash (\).

If the string ends with a wildcard, period (.), or directory name, the user must have access permissions to the root and all subdirectories on the path.

Again, silence on the question mark's behaviour - do they feel a little bit ashamed? :cool:

Then, digging deeper, one may find the dir command in M$ Learn - and it becomes interesting:
Quote: You can use the question mark (?) as a substitute for a single character in a name. For example, typing dir read???.txt lists any files in the current directory with the .txt extension that begin with read and are followed by up to three characters. This includes Read.txt, Read1.txt, Read12.txt, Read123.txt, and Readme1.txt, but not Readme12.txt.

So, in short: Some???.dll doesn't mean "three characters", it means up to three characters, i.e. ZERO to THREE characters.
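The rule can be sketched in a few lines of Python. This is a simplified model of the behaviour seen in this thread, not Microsoft's actual algorithm. The function name dos_wildcard_match is made up. The sketch assumes that a question mark consumes one character unless the name is at a period or at its end, in which case it matches nothing. It ignores short 8.3 names and any other special cases.

```python
def dos_wildcard_match(pattern: str, name: str) -> bool:
    """Simplified model: '?' matches one character, or nothing when the
    name is at a '.' or at its end. '*' matches any run of characters."""
    p, n = pattern.lower(), name.lower()  # Windows names are case-insensitive

    def match(i: int, j: int) -> bool:
        if i == len(p):                   # pattern used up: name must be too
            return j == len(n)
        c = p[i]
        if c == "*":                      # try every possible length
            return any(match(i + 1, k) for k in range(j, len(n) + 1))
        if c == "?":
            if j == len(n) or n[j] == ".":
                return match(i + 1, j)    # assumption: '?' matches zero chars here
            return match(i + 1, j + 1)    # otherwise consume one character
        return j < len(n) and n[j] == c and match(i + 1, j + 1)

    return match(0, 0)


if __name__ == "__main__":
    for pat in ("python3?.dll", "python3??.dll"):
        for fname in ("python3.dll", "python313.dll"):
            print(pat, fname, dos_wildcard_match(pat, fname))
```

With this model, python3?.dll finds python3.dll but not python313.dll, and python3??.dll finds both, which agrees with the dir output in the first post and with the read???.txt examples in the documentation.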

The rest of the web has a different opinion, but oh well, it seems that M$ eventually adapted the documentation to the behaviour of their software :badgrin:

NoCforMe

Quote from: adeyblue on December 03, 2023, 05:29:52 PM
Quote from: NoCforMe on December 03, 2023, 12:38:32 PM
Too bad Windoze doesn't use regular expressions
...he says straight after a post demonstrating that the question mark works exactly like it does in a regular expression.

True; it's just the 95% of the other stuff that doesn't work ...

(except in that new shell that Biterider wrote about, which I have no knowledge of)

KSS

Maybe use FindFirstFileEx with fInfoLevelId param FindExInfoBasic?
Quote: The FindFirstFileEx function does not query the short file name, improving overall enumeration speed.

jj2007

GetFiles does use FindFirstFileExW under the hood, with FindExInfoBasic and FIND_FIRST_EX_LARGE_FETCH, and it shows the same behaviour as the dir command. The weird interpretation of the question mark by Microsoft is to be found deep in the guts of FindFirstFile*

Re "improving overall enumeration speed", that is an interesting question. On the web you find this:
Quote: Best Solution
In my tests using FindFirstFileEx with FindExInfoBasic and FIND_FIRST_EX_LARGE_FETCH is much faster than the plain FindFirstFile.

Scanning 20 folders with ~300,000 files took 661 seconds with FindFirstFile and 11 seconds with FindFirstFileEx. Subsequent calls to the same folders took less than a second.

So he tried once with FFF, then with FFFx, and in run #3 it was 660 times faster. Wow, so much ignorance :biggrin:

Truth is that yes, when you search a folder, the first time it's significantly slower:
  For_ ecx=0 To 9
    NanoTimer()
    GetFiles \Masm32\*.asc
    Print NanoTimer$(), Str$(" for finding %i sources\n", Files$(?))
  Next
20 s for finding 5522 sources  <<<<<< first run
839 ms for finding 5522 sources
831 ms for finding 5522 sources
790 ms for finding 5522 sources
778 ms for finding 5522 sources
776 ms for finding 5522 sources
792 ms for finding 5522 sources
791 ms for finding 5522 sources
771 ms for finding 5522 sources
787 ms for finding 5522 sources

In the following runs, you get everything from the cache, which is about 25 times faster...

Which tells us nothing about the relative performance improvement with FindExInfoBasic. In my tests, it was 7% slower with that flag (but I keep it anyway) :cool:
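The same cold versus warm effect can be checked with a small, portable sketch in Python. It is not equivalent to the MasmBasic loop above: it walks the tree with os.walk rather than calling FindFirstFileExW directly, the function name time_enumerations is made up, and the recursive search is an assumption. The point is only that the first run should be reported separately from the cached ones.

```python
import os
import time


def time_enumerations(folder, suffix, runs=5):
    """Scan `folder` recursively `runs` times.
    Returns a list of (seconds, number_of_matching_files) per run."""
    results = []
    for _ in range(runs):
        start = time.perf_counter()
        count = 0
        for _root, _dirs, files in os.walk(folder):
            count += sum(1 for f in files if f.lower().endswith(suffix))
        results.append((time.perf_counter() - start, count))
    return results


if __name__ == "__main__":
    # The first line may include disk access; later runs usually hit the cache.
    for seconds, count in time_enumerations(os.curdir, ".asc"):
        print(f"{seconds * 1000:.1f} ms for finding {count} sources")
```

To compare two enumeration methods fairly, each would need its own cold run (for example after a reboot) plus several warm runs, compared like for like.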

See also SOF:
Quote: My measurements (W10 x64, SSD disk drive) show that FindFirstFileEx is marginally (~14%) faster than FindFirstFile. Test folder had 900K files. Enumerations took typically 1.5 sec. Except the very first enumeration, when the enumeration takes 10x longer. (For both methods, of course.)
(Jan Slodicka, Apr 30, 2019 at 10:48)
Quote: Have to correct myself regarding the very first enumeration. After many benchmark tests I can confirm that the first enumeration (after clearing the disk cache) is a) ~3x slower in case of FindFirstFileEx()(...FindExInfoBasic, ...FIND_FIRST_EX_LARGE_FETCH), b) ~20x slower in case of FindFirstFile().
(Jan Slodicka)

TimoVJL

Good to read that there are differences in normal cases too  :thumbsup:
May the source be with you