Topic: Unicode file names

jj2007

  • Member
  • *****
  • Posts: 9912
  • Assembler is fun ;-)
    • MasmBasic
Unicode file names
« on: September 10, 2012, 09:06:37 AM »
While fixing a minor bug in the way MasmBasic's CL$() and wCL$() handle command-line args, I stumbled over a problem with Unicode file names. For example, you can open Notepad and save an empty file as Добро пожаловать.txt

Afterwards, when you drag it over the attached TestCL.exe, you can see the filename correctly in the console window:
Arg 0=  ..path\TestCL.exe
Arg 1=  ..path\Добро пожаловать.txt
Arg 2=  [empty]

But apparently neither WinZip 9.0 (2004) nor the latest 7-Zip can add this file to an archive; both complain about an invalid filename. How do you guys in Russia, China and the Arabic countries zip your stuff?
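
For comparison, here is a minimal Python sketch of how a ZIP writer can store such a name. Later revisions of the ZIP format define general purpose bit 11 to mark an entry name as UTF-8; archivers that predate it write names in a legacy code page, which may be why an old tool rejects the file. The helper names `zip_single_file` and `read_names` are made up for this illustration, and the sketch says nothing about how WinZip or 7-Zip work internally.

```python
import io
import zipfile

# General purpose bit 11 of a ZIP entry: name and comment are UTF-8 encoded.
UTF8_NAME_FLAG = 0x800


def zip_single_file(name: str, data: bytes) -> bytes:
    """Build an in-memory ZIP archive holding one entry (hypothetical helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, data)
    return buf.getvalue()


def read_names(archive: bytes):
    """Return (entry name, UTF-8 flag set?) for every entry in the archive."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return [(info.filename, bool(info.flag_bits & UTF8_NAME_FLAG))
                for info in zf.infolist()]


if __name__ == "__main__":
    archive = zip_single_file("Добро пожаловать.txt", b"")
    for name, is_utf8 in read_names(archive):
        print(name, "UTF-8 flag set" if is_utf8 else "no UTF-8 flag")
```

Python's `zipfile` sets the flag only when the name cannot be encoded as ASCII, so plain English names stay readable by older tools.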

Gunner

  • Regular Member
  • *
  • Posts: 48
    • Gunners Software
Re: Unicode file names
« Reply #1 on: September 10, 2012, 09:49:35 AM »
WinRar has no problems creating a zip or rar with that filename.
~Rob

jj2007

Re: Unicode file names
« Reply #2 on: September 10, 2012, 08:22:42 PM »
Yes, there must be some archivers which can deal with Unicode. But it is astonishing that the market leader introduced Unicode only in 2008 (Kaplan: WinZip, the (long awaited) Unicode edition!!!)...

After all, Unicode has been around for a little while; see "Unicode support in Windows 95 and Windows 98".

Unicode file names have been possible since NTFS, introduced in 1993 (FAT long file names, also stored as Unicode, followed with Windows 95)...

jj2007

Re: Unicode file names
« Reply #3 on: October 05, 2012, 05:47:53 AM »
I have played a bit more with Unicode file names. Attached is a sample program that does the following in English, Russian, Arabic, Chinese, Greek, Hindi and Japanese:
- wOpen a file called Welcome.txu in its language equivalent, e.g. as Добро пожаловать.txu
- write a Unicode text, i.e. "Enter text here" in the seven languages covered
- close the file and re-read the content into a text buffer
- display the file content, e.g. "あなたのテキストをここに入力してください" aka "Enter text here" in Japanese
- display a FileOpen dialog where you can double-click on one of the files and show its boring content in a wMsgBox.
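
The write and re-read steps above can be sketched in Python as follows. This is not the attached MasmBasic program: the helpers `round_trip` and `run_demo` are hypothetical, only two of the seven languages are covered, the dialog and message box steps are left out, and UTF-16 with a byte order mark is an assumption about what a .txu file contains.

```python
import os
import tempfile
import unicodedata

# (file name, file content) pairs taken from the post: English and Russian only.
SAMPLES = [
    ("Welcome.txu", "Enter text here"),
    ("Добро пожаловать.txu", "Введите текст здесь"),
]


def round_trip(folder: str, filename: str, text: str) -> str:
    """Write text under a Unicode file name, then read it back."""
    path = os.path.join(folder, filename)
    # Assumption: UTF-16 with BOM; the real .txu format is not specified.
    with open(path, "w", encoding="utf-16") as f:
        f.write(text)
    with open(path, "r", encoding="utf-16") as f:
        return f.read()


def run_demo():
    """Round-trip every sample and list the names the file system reports."""
    results = {}
    with tempfile.TemporaryDirectory() as folder:
        for filename, text in SAMPLES:
            results[filename] = round_trip(folder, filename, text)
        # Some file systems return decomposed names, so normalize to NFC.
        listed = sorted(unicodedata.normalize("NFC", n)
                        for n in os.listdir(folder))
    return results, listed


if __name__ == "__main__":
    contents, names = run_demo()
    for name in names:
        print(name, "->", contents[name])
```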

Here is my console output:
Content ID 1: [Enter text here  21:43:26]
Content ID 401: [Введите текст здесь    21:43:26]
Content ID 801: [أدخل النص هنا  21:43:26]
Content ID 1201: [在這裡輸入文字       21:43:26]
Content ID 1601: [Πληκτρολογήστε το κείμενο εδώ 21:43:26]
Content ID 2001: [पाठ यहाँ टाइप करें    21:43:26]
Content ID 2401: [あなたのテキストをここに入力してください  21:43:26]

Feedback welcome - in particular, shout foul if the (Google) translation is no good.
If you use \masm32\RichMasm\RichMasm.exe, you can edit the Unicode text directly in the resource section at the bottom of the file; then press F6, and RichMasm will first generate the *.rc file in Unicode format, then assemble & link the application.