Win7 opening a unicode file name using the 8dot3 (short) name

Started by MtheK, July 13, 2015, 02:51:11 AM

MtheK

FYI

  A while back, I encountered Unicode (non-Latin) entries in the
registry via my MASM program. I finally got it to work, but only
w/much butchering of my code, and it was horribly slow (10x slower
when using the W calls, which I made optional), so I vowed never to
butcher any of my other programs w/this.

   Well, never has arrived. My FIND MASM program (ASCII build)
encountered a few DIRs that have some Unicode file names, and it
couldn't open them, getting rc=123 (ERROR_INVALID_NAME).
An 'INVOKE FindNextFile' returns this (8dot3 name first, then the
long path):


3569~1.AST   C:\PROGRA~1\Corel\CORELH~1\TEMPLA~1\RU\???????? ???????? ???????.ast


and a DOS Prompt (cmd.exe) 'DIR /x' shows the same:


03/11/2009  05:11 PM           180,736 3569~1.AST   ???????? ???????? ???????.ast


I found that if I re-constructed the PATH using the 8dot3 names
instead of the long names, then the files opened and FIND was able to
read thru them successfully. I reported the file names w/the 8dot3
names, as the names made of ? marks weren't unique.
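
A minimal sketch of the name choice described above, not the actual
FIND code. It assumes the MASM32 SDK, that the find data came from the
ANSI FindFirstFile/FindNextFile, and that MASM32's windows.inc names
the short-name field cAlternate (the Windows SDK calls it
cAlternateFileName). PickName is a hypothetical helper name.

```asm
include \masm32\include\masm32rt.inc

.code

; PickName (hypothetical helper): returns in eax the address of the
; name to append to the PATH being rebuilt.
; lpfd is assumed to point to a WIN32_FIND_DATA filled in by
; FindFirstFile/FindNextFile (ANSI versions).
PickName PROC lpfd:DWORD
    mov edx, lpfd
    ; assumption: windows.inc calls the 8dot3 field cAlternate
    lea eax, [edx].WIN32_FIND_DATA.cAlternate
    cmp BYTE PTR [eax], 0       ; empty = no separate short name
    jne @F
    lea eax, [edx].WIN32_FIND_DATA.cFileName
@@:
    ret
PickName ENDP

end
```

The short-name field is empty when the long name is already a valid
8dot3 name or when the volume keeps no short names, so the sketch falls
back to cFileName in that case.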

  But, in my CLONE MASM program (for incremental backups), I do
'INVOKE CopyFileEx', which failed just as FIND did before the fix.
And when using the re-constructed PATH, the copied files got the 8dot3
names as their long names instead of the un-printable names (Explorer
shows them graphically).

  Playing around w/DOS Prompt, I found that a 'COPY' DID keep the
unicode file name intact, so I figured that MASM could do it as well;
I just had to find it. I found that if I did this:


         manually converted the re-constructed PATH to WCHARs
           (maybe use 'INVOKE MultiByteToWideChar'?)
         INVOKE GetLongPathNameW,...
         INVOKE CopyFileExW,...


then the copied file names from 'GetLongPathNameW' remained intact
(unicode names).
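
The three steps above might look roughly like this. It is a sketch
under assumptions, not the CLONE code: the MASM32 SDK is assumed,
CopyByShortName is a hypothetical name, paths are assumed to fit in
MAX_PATH, and lpwDest is assumed to be a wide destination path that the
caller has already built (for example from the last component of the
long source path).

```asm
include \masm32\include\masm32rt.inc

.code

; CopyByShortName (hypothetical): lpShortSrc = zero-terminated ANSI
; path made of 8dot3 names, lpwDest = caller-built wide destination.
; Returns nonzero in eax on success, 0 on failure.
CopyByShortName PROC lpShortSrc:DWORD, lpwDest:DWORD
    LOCAL wShort[MAX_PATH]:WORD     ; 8dot3 path as WCHARs
    LOCAL wLong[MAX_PATH]:WORD      ; long path with the Unicode names

    ; step 1: ANSI -> WCHARs; -1 means the source is zero-terminated
    invoke MultiByteToWideChar, CP_ACP, 0, lpShortSrc, -1, ADDR wShort, MAX_PATH
    test eax, eax
    jz failed

    ; step 2: short wide path -> long wide path
    invoke GetLongPathNameW, ADDR wShort, ADDR wLong, MAX_PATH
    test eax, eax
    jz failed
    cmp eax, MAX_PATH
    jae failed                      ; buffer too small for the long path

    ; step 3: copy using wide names; no progress routine, no flags
    invoke CopyFileExW, ADDR wLong, lpwDest, NULL, NULL, NULL, 0
    ret

failed:
    xor eax, eax
    ret
CopyByShortName ENDP

end
```

On failure, GetLastError should still hold the code set by whichever
call failed, except in the buffer-too-small case.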

  I really like this trick of using the 8dot3 names to access these
files from my normal ASCII-built MASM programs. As long as they only
read them, this works great, and the Web has articles saying the same
thing. I think, but am not sure, that every file has an 8dot3 name
(in my case they do), tho I know that 'fsutil file setshortname' can
set/erase them, and, I think, a volume/system can be set to not
create them (NtfsDisable8dot3NameCreation).

  For me, this is WAY BETTER than what I had to do for that 1
registry program (I don't think the registry has "8dot3" names).
Knowing what I now know about Unicode, it could probably be written
better, but really it should be rewritten from scratch, as I have SO
MANY CHECKS to make it work correctly. And that's assuming the 10-fold
response time goes away w/a re-write! In any case, I'm really glad
that I don't have to do ANY of that for ANY of my other MASM programs
that use files.

dedndave

i like to use qWord's "tchr" macro to create UNICODE strings
strings created with this macro return the correct values when LENGTHOF and SIZEOF operators are used

;tchr macro by qWord

tchr    MACRO   lbl,args:VARARG
    IFDEF __UNICODE__
        UCSTR lbl,args
    ELSE
        lbl db args
    ENDIF
        ENDM


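so - to create a file (or open one) with a UNICODE name, pass a wide string to the W version of the API; prepending the path with "\\?\" is optional (it lifts the MAX_PATH limit and turns off path parsing)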

    tchr szPath,"\\?\C:\UNICODEFILENAME",0

now, szPath may be used with CreateFileW (UNICODE version of CreateFile)

MtheK

  UCSTR seems to be static only? Is there a corresponding
execution-time routine? That is, after I dynamically re-construct the
PATH in ASCII w/the 8dot3 names, can I feed something that address and
length? In my CLONE program, 'INVOKE MultiByteToWideChar' works
w/codepage=CP_ACP (0; the system default ANSI codepage), tho there
seem to be many warnings about its use.

dedndave

i think UCSTR is a macro from the masm32 package
it creates the string at assembly-time

multibytetowide seems like the right one to use at run-time
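
a rough sketch of feeding it an address and a length at run-time, assuming the masm32 package; AnsiToWide is a made-up name, not something from the SDK. when an explicit byte count is passed (instead of -1), the function does not add a terminator unless the count already covers one, so the sketch adds its own

```asm
include \masm32\include\masm32rt.inc

.code

; AnsiToWide (hypothetical helper)
; lpSrc/cbSrc = address and byte length of the ANSI string
; lpDst/cchDst = output buffer and its size in WCHARs
; returns WCHARs written (not counting the added 0), or 0 on failure
AnsiToWide PROC lpSrc:DWORD, cbSrc:DWORD, lpDst:DWORD, cchDst:DWORD
    mov ecx, cchDst
    cmp ecx, 2
    jb  fail                    ; need room for 1 char + terminator
    dec ecx                     ; reserve one WCHAR for the terminator
    invoke MultiByteToWideChar, CP_ACP, 0, lpSrc, cbSrc, lpDst, ecx
    test eax, eax
    jz  fail                    ; 0 = conversion failed
    mov edx, lpDst
    mov WORD PTR [edx+eax*2], 0 ; terminate the wide string
    ret
fail:
    xor eax, eax
    ret
AnsiToWide ENDP

end
```

CP_ACP converts with the system default ANSI codepage, which is fine for 8dot3 paths that came from the ANSI find functions on the same machine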

there are some functions that will go from long to short and short to long, but i haven't used them much

hutch--

A macro like UCSTR is for creating a Unicode string at assembly time; manipulating and working with Unicode strings at run time is done the normal way, with memory and the Unicode API functions.

Oliver Scantleberry

MtheK:

Have you ever heard of a New York City Disk Jockey from the '60s named "Murray The K"?

MtheK

 :icon_cool:

  Like CLARK the "K"! in "Wapshire", England in Superman #182 1966 ?
"It's smashing! You, CLARK the "K", will be England's answer to
America's "Murray the 'K'"!".   "...AND I'VE GOT IT!!".

rrr314159

Marvel Comics was much better than DC. DC comics (Superman, Batman, Flash, Green Lantern etc) was hot stuff up to the age of 10 or so, but we didn't outgrow Marvel (Spider-Man, Fantastic 4, Dr. Strange etc) until about 14. Today of course Marvel's too intellectual for typical adults; DC is just right. I wish I had a 1966 Superman in good condition (my oldest is 10 years later), but what a disappointment it is to read it - you can't believe you used to love this dreck.
I am NaN ;)