Author Topic: WriteConsoleW, CRT printf and Unicode  (Read 318 times)

jj2007

  • Member
  • *****
  • Posts: 10020
  • Assembler is fun ;-)
    • MasmBasic
WriteConsoleW, CRT printf and Unicode
« on: February 15, 2020, 12:42:53 AM »
No MasmBasic required here: I needed a testbed for the wMsgBox macro that did not require clicking all the time. So I tried to write the Unicode strings to the console without using MasmBasic's functions.

I tried crt_printf, crt_wprintf and WriteConsoleW, after reading Raymond Chen's OldNewThing article. Here is the CRT bit:
Code: [Select]
  .if useCrt==1
    invoke crt_printf, chr$("TITLE:", 9, "%ls", 13, 10, "TEXT:", 9, "%ls", 13, 10, 10), pTitle, pText
  .elseif useCrt==2
    invoke crt_wprintf, uc$("TITLE:", 9, "%ls", 13, 10, "TEXT:", 9, "%ls", 13, 10, 10), pTitle, pText
  .else ; use WriteConsoleW

And here is the output: fine for WriteConsoleW, but simply wrong for the printf variants as soon as non-ANSI UTF-16 characters are involved. They only work if you pass plain English UTF-16 strings:
Code: [Select]
--- using WriteConsoleW: ----

TITLE:  Title
TEXT:   Моя первая программа is running right now

TITLE:  Title
TEXT:   Pure Ansi text works fine

--- using crt_printf: ----

TITLE:  Title
TEXT:   TITLE:  Title
TEXT:   Pure Ansi text works fine


--- using crt_wprintf: ----

TITLE:  Title
TEXT:      is running right now

TITLE:  Title
TEXT:   Pure Ansi text works fine

Source & exe attached - pure Masm32. The question is: why does wprintf() ignore the valid Russian UTF-16 part?

What the incredibly knowledgeable Internet has to offer, e.g. when googling "wprintf" "UTF16", is hilarious, so I am asking here :badgrin:

LiaoMi

  • Member
  • ****
  • Posts: 649

jj2007

Re: WriteConsoleW, CRT printf and Unicode
« Reply #2 on: February 15, 2020, 02:51:40 AM »
Thanks, Liaomi :thumbsup:

Interesting but, as Erol noted, this does not solve the issue.
Code: [Select]
  invoke SetConsoleOutputCP, CP_UTF8
  cls

LiaoMi

Re: WriteConsoleW, CRT printf and Unicode
« Reply #3 on: February 15, 2020, 04:28:24 AM »
Quote from: jj2007
Thanks, Liaomi :thumbsup:

Interesting but, as Erol noted, this does not solve the issue.
Code: [Select]
  invoke SetConsoleOutputCP, CP_UTF8
  cls

I found a logical answer to the question why :biggrin:
http://archives.miloush.net/michkap/archive/2008/03/18/8306597.html

UnicodeStandard-12.0 - http://www.unicode.org/versions/Unicode12.0.0/UnicodeStandard-12.0.pdf and https://unicode.org/versions/Unicode12.1.0/
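
A possible reading of the linked explanation, sketched in portable C: while stdout is in the default text mode, the CRT seems to convert every wide character to a multibyte character using the current locale. In the startup "C" locale that conversion fails for anything like Cyrillic, so wprintf drops those characters. The helper names below are made up for illustration, and the claim that the Microsoft CRT takes this path for text-mode streams is an assumption based on the article, not on the CRT source.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Returns 1 if the wide character can be converted to a multibyte
   sequence in the currently active locale, 0 otherwise.
   Hypothetical helper for illustration; not part of any CRT. */
int convertible_in_current_locale(wchar_t wc)
{
    char buf[MB_LEN_MAX];
    mbstate_t state;
    memset(&state, 0, sizeof state);   /* start from the initial shift state */
    return wcrtomb(buf, wc, &state) != (size_t)-1;
}

/* Usage: a program that never calls setlocale() runs in the "C" locale,
   where plain ASCII converts and Cyrillic capital EM (U+041C) does not. */
void check_startup_locale(void)
{
    assert(convertible_in_current_locale(L'A'));
    assert(!convertible_in_current_locale((wchar_t)0x041C));
}
```

If this reading is right, _setmode with _O_U16TEXT helps because it bypasses that conversion step and hands the UTF-16 text to the console as it is.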

Project1
Code: [Select]
// crt_setmodeunicode.c
// This program uses _setmode to change
// stdout to Unicode. Cyrillic and Ideographic
// characters will appear on the console (if
// your console font supports those character sets).

#include <fcntl.h>
#include <io.h>
#include <stdio.h>

int main(void) {
    _setmode(_fileno(stdout), _O_U16TEXT);
    wprintf(L"\x043a\x043e\x0448\x043a\x0430\x65e5\x672c\x56fd\n");
    return 0;
}

https://godbolt.org/
Code: [Select]
$SG5528 DB        ':', 04H, '>', 04H, 'H', 04H, ':', 04H, '0', 04H, 0e5H, 'e'
        DB      ',g', 0fdH, 'V', 0aH, 00H, 00H, 00H
unsigned __int64 `__local_stdio_printf_options'::`2'::_OptionsStorage DQ 01H DUP (?) ; `__local_stdio_printf_options'::`2'::_OptionsStorage

main    PROC
$LN3:
        sub     rsp, 40                             ; 00000028H
        mov     ecx, 1
        call    __acrt_iob_func
        mov     rcx, rax
        call    _fileno
        mov     edx, 131072                         ; 00020000H
        mov     ecx, eax
        call    _setmode
        lea     rcx, OFFSET FLAT:$SG5528
        call    wprintf
        xor     eax, eax
        add     rsp, 40                             ; 00000028H
        ret     0
main    ENDP
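
For anyone puzzled by the $SG5528 bytes: they appear to be simply the wide string literal stored as UTF-16 little-endian, low byte first, and 131072 is 00020000h, i.e. the _O_U16TEXT value. A minimal sketch in C that reproduces the byte order (utf16le_bytes and O_U16TEXT_VALUE are names invented here, not CRT identifiers):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value the listing moves into edx before calling _setmode:
   131072 decimal = 00020000h. */
enum { O_U16TEXT_VALUE = 0x00020000 };

/* Writes n UTF-16 code units to out as little-endian byte pairs and
   returns the number of bytes written (always 2 * n).
   out must have room for 2 * n bytes. Hypothetical helper. */
size_t utf16le_bytes(const uint16_t *units, size_t n, uint8_t *out)
{
    assert(units != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        out[2 * i]     = (uint8_t)(units[i] & 0xFF);  /* low byte first */
        out[2 * i + 1] = (uint8_t)(units[i] >> 8);    /* then high byte */
    }
    return 2 * n;
}
```

Feeding it 043Ah, 043Eh, 0448h gives ':', 04h, '>', 04h, 'H', 04h, which matches the start of the DB lines; 65E5h becomes 0E5h, 'e'.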

TWell - already submitted this answer ..
« Last Edit: February 15, 2020, 08:21:49 AM by LiaoMi »

daydreamer

  • Member
  • *****
  • Posts: 1095
  • I also want a stargate
Re: WriteConsoleW, CRT printf and Unicode
« Reply #4 on: February 15, 2020, 07:44:54 AM »
Jochen, it works here,
so it should work with Japanese and Chinese too, if you have the language packs installed.
Quote from Flashdance
Nick  :  When you give up your dream, you die
*wears a flameproof asbestos suit*
Gone serverside programming p:  :D
I love assembly,because its legal to write
princess:lea eax,luke
:)

jj2007

Re: WriteConsoleW, CRT printf and Unicode
« Reply #5 on: February 15, 2020, 12:23:19 PM »
Quote from: daydreamer
Jochen, it works here

Which one, the C code by LiaoMi or my attachment on top of this thread?

Only slightly related: Does the attached exe work on a non-Ansi system, such as Russian or Chinese or Turkish?

My output on Win7-64 looks like this:

Code: [Select]
Program_files:  C:\Program Files (x86)
Personal:       C:\Users\Jochen\Documents
Pictures:       C:\Users\Jochen\Pictures
Admintools:     C:\Users\Jochen\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Administrative Tools
Appdata:        C:\Users\Jochen\AppData\Roaming
Common_docs:    C:\Users\Public\Documents
Cookies:        C:\Users\Jochen\AppData\Roaming\Microsoft\Windows\Cookies
History:        C:\Users\Jochen\AppData\Local\Microsoft\Windows\History
Internet_tmp:   C:\Users\Jochen\AppData\Local\Microsoft\Windows\Temporary Internet Files
Common progs:   C:\Program Files (x86)\Common Files
System:         C:\Windows\system32
BitBucket:      ???
Admin tools:    C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Administrative Tools
CD burn area:   C:\Users\Jochen\AppData\Local\Microsoft\Windows\Burn\Burn1
Startup c:      C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Startup
Desktop c:      C:\Users\Public\Desktop
« Last Edit: February 16, 2020, 02:37:25 AM by jj2007 »

jj2007

Re: WriteConsoleW, CRT printf and Unicode
« Reply #6 on: February 16, 2020, 02:39:21 AM »
Attached two versions - please, if you have a non-English OS (Russian, Turkish, Chinese, Spanish...), let me know if there is any difference between SpecialFolderA.exe and SpecialFolderB.exe

AW

  • Member
  • *****
  • Posts: 2549
  • Let's Make ASM Great Again!
Re: WriteConsoleW, CRT printf and Unicode
« Reply #7 on: February 16, 2020, 05:15:15 AM »


Code: [Select]
include \masm32\include\masm32rt.inc
   
 includelib _msvcrt.lib
 __iob_func proto C
 _O_U16TEXT equ 00020000h
 
.data
txFirst dw 1052, 1086, 1103, 32, 1087, 1077, 1088, 1074, 1072, 1103, 32, 1087, 1088, 1086, 1075, 1088, 1072, 1084, 1084, 1072, 32, 105, 115, 32, 114, 117, 110, 110, 105, 110, 103, 32, 114, 105, 103, 104, 116, 32, 110, 111, 119, 0
txAnsi dw 80, 117, 114, 101, 32, 65, 110, 115, 105, 32, 116, 101, 120, 116, 32, 119, 111, 114, 107, 115, 32, 102, 105, 110, 101, 0

.code

main proc
LOCAL stdout : dword

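call __iob_func                      ; eax -> _iob[0] (stdin)
add eax, 20h                         ; stdout = &_iob[1]; 20h = sizeof(FILE) in this 32-bit CRT
mov stdout, eax

invoke crt__fileno, stdout           ; file descriptor of stdout
mov edx, eax
invoke crt__setmode, edx, _O_U16TEXT ; switch stdout to UTF-16 text mode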
invoke crt_wprintf, uc$("TITLE:", 9, "%ls", 13, 10, "TEXT:", 9, "%ls", 13, 10, 10), uc$("Title"), offset txFirst
invoke crt_wprintf, uc$("TITLE:", 9, "%s", 13, 10, "TEXT:", 9, "%s", 13, 10, 10), uc$("Title"), offset txAnsi
ret
main endp

end


I include a msvcrt.lib that contains __iob_func
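
The `add eax, 20h` deserves a note. As far as I know, the FILE structure of the old 32-bit msvcrt (struct _iobuf) has eight 4-byte members, so each entry of the _iob array is 20h bytes long and stdout, the second entry, sits 20h bytes past the start. A sketch in C under that assumption (the struct and function names are hypothetical, and pointers are modelled as 32-bit integers so the arithmetic holds on any host):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the legacy 32-bit msvcrt FILE (struct _iobuf).
   Pointer members are modelled as uint32_t to keep the size host-independent. */
struct legacy_iobuf32 {
    uint32_t ptr;       /* char *_ptr      */
    int32_t  cnt;       /* int   _cnt      */
    uint32_t base;      /* char *_base     */
    int32_t  flag;      /* int   _flag     */
    int32_t  file;      /* int   _file     */
    int32_t  charbuf;   /* int   _charbuf  */
    int32_t  bufsiz;    /* int   _bufsiz   */
    uint32_t tmpfname;  /* char *_tmpfname */
};

/* Byte offset of stream number `index` inside the _iob array:
   0 = stdin, 1 = stdout, 2 = stderr. */
size_t iob_offset(unsigned index)
{
    assert(sizeof(struct legacy_iobuf32) == 0x20);  /* eight 4-byte members */
    return (size_t)index * sizeof(struct legacy_iobuf32);
}
```

So iob_offset(1) is 20h. The offset is tied to that CRT layout; the newer CRT in the godbolt listing does not expose the array and asks __acrt_iob_func for stream 1 instead.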

Mikl__

  • Member
  • ****
  • Posts: 794
Re: WriteConsoleW, CRT printf and Unicode
« Reply #8 on: February 16, 2020, 01:42:12 PM »
Hi, Jochen!
These are the results of running your programs:
SpecialFolderA
Code: [Select]
test 1: [C:\Program Files (x86)] is here: 2
test: C:\Users\Mikl\Documents
test: C:\Users\Mikl\Documents
test: C:\Users\Mikl\Pictures
Program_files: C:\Program Files (x86)
Personal: C:\Users\Mikl\Documents
Pictures: C:\Users\Mikl\Pictures
Admintools: C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Administrative Tools
Appdata: C:\Users\Mikl\AppData\Roaming
Common_docs: C:\Users\Public\Documents
Cookies: C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Cookies
History: C:\Users\Mikl\AppData\Local\Microsoft\Windows\History
Internet_tmp: C:\Users\Mikl\AppData\Local\Microsoft\Windows\Temporary Internet Files
Common progs: C:\Program Files (x86)\Common Files
System: C:\Windows\system32
BitBucket: ???
Admin tools: C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Administrative Tools
CD burn area: C:\Users\Mikl\AppData\Local\Microsoft\Windows\Burn\Burn
Startup c: C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Startup
Desktop c: C:\Users\Public\Desktop
0 C:\Users\Mikl\Desktop
2 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Start Menu\Programs
5 C:\Users\Mikl\Documents
6 C:\Users\Mikl\Favorites
7 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup
8 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Recent
9 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\SendTo
11 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Start Menu
13 C:\Users\Mikl\Music
14 C:\Users\Mikl\Videos
16 C:\Users\Mikl\Desktop
19 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Network Shortcuts
20 C:\Windows\Fonts
21 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Templates
22 C:\ProgramData\Microsoft\Windows\Start Menu
23 C:\ProgramData\Microsoft\Windows\Start Menu\Programs
24 C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Startup
25 C:\Users\Public\Desktop
26 C:\Users\Mikl\AppData\Roaming
27 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Printer Shortcuts
28 C:\Users\Mikl\AppData\Local
29 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup
30 C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Startup
31 C:\Users\Mikl\Favorites
32 C:\Users\Mikl\AppData\Local\Microsoft\Windows\Temporary Internet Files
33 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Cookies
34 C:\Users\Mikl\AppData\Local\Microsoft\Windows\History
35 C:\ProgramData
36 C:\Windows
37 C:\Windows\system32
38 C:\Program Files (x86)
39 C:\Users\Mikl\Pictures
40 C:\Users\Mikl
41 C:\Windows\SysWOW64
42 C:\Program Files (x86)
43 C:\Program Files (x86)\Common Files
44 C:\Program Files (x86)\Common Files
45 C:\ProgramData\Microsoft\Windows\Templates
46 C:\Users\Public\Documents
47 C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Administrative Tools
48 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Administrative Tools
53 C:\Users\Public\Music
54 C:\Users\Public\Pictures
55 C:\Users\Public\Videos
56 C:\Windows\resources
59 C:\Users\Mikl\AppData\Local\Microsoft\Windows\Burn\Burn
hit any key
SpecialFolderB
Code: [Select]
Program_files: C:\Program Files (x86)
Personal: C:\Users\Mikl\Documents
Pictures: C:\Users\Mikl\Pictures
Admintools: C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Administrative Tools
Appdata: C:\Users\Mikl\AppData\Roaming
Common_docs: C:\Users\Public\Documents
Cookies: C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Cookies
History: C:\Users\Mikl\AppData\Local\Microsoft\Windows\History
Internet_tmp: C:\Users\Mikl\AppData\Local\Microsoft\Windows\Temporary Internet Files
Common progs: C:\Program Files (x86)\Common Files
System: C:\Windows\system32
BitBucket: ?
Admin tools: C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Administrative Tools
CD burn area: C:\Users\Mikl\AppData\Local\Microsoft\Windows\Burn\Burn
Startup c: C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Startup
Desktop c: C:\Users\Public\Desktop
0 C:\Users\Mikl\Desktop
2 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Start Menu\Programs
5 C:\Users\Mikl\Documents
6 C:\Users\Mikl\Favorites
7 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup
8 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Recent
9 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\SendTo
11 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Start Menu
13 C:\Users\Mikl\Music
14 C:\Users\Mikl\Videos
16 C:\Users\Mikl\Desktop
19 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Network Shortcuts
20 C:\Windows\Fonts
21 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Templates
22 C:\ProgramData\Microsoft\Windows\Start Menu
23 C:\ProgramData\Microsoft\Windows\Start Menu\Programs
24 C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Startup
25 C:\Users\Public\Desktop
26 C:\Users\Mikl\AppData\Roaming
27 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Printer Shortcuts
28 C:\Users\Mikl\AppData\Local
29 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup
30 C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Startup
31 C:\Users\Mikl\Favorites
32 C:\Users\Mikl\AppData\Local\Microsoft\Windows\Temporary Internet Files
33 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Cookies
34 C:\Users\Mikl\AppData\Local\Microsoft\Windows\History
35 C:\ProgramData
36 C:\Windows
37 C:\Windows\system32
38 C:\Program Files (x86)
39 C:\Users\Mikl\Pictures
40 C:\Users\Mikl
41 C:\Windows\SysWOW64
42 C:\Program Files (x86)
43 C:\Program Files (x86)\Common Files
44 C:\Program Files (x86)\Common Files
45 C:\ProgramData\Microsoft\Windows\Templates
46 C:\Users\Public\Documents
47 C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Administrative Tools
48 C:\Users\Mikl\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Administrative Tools
53 C:\Users\Public\Music
54 C:\Users\Public\Pictures
55 C:\Users\Public\Videos
56 C:\Windows\resources
59 C:\Users\Mikl\AppData\Local\Microsoft\Windows\Burn\Burn
hit any key
IMHO it should be "Press any button/key"

AW

Re: WriteConsoleW, CRT printf and Unicode
« Reply #9 on: February 16, 2020, 03:49:16 PM »
Actually, there is no need to use the more modern (10 years old) msvcrt.lib I attached in my last post; we can do it just as well with the msvcrt.lib distributed with MASM32, like this:

Code: [Select]
include \masm32\include\masm32rt.inc
   
externdef _imp___iob : ptr
 _O_U16TEXT equ 00020000h
 
.data
; To view, select NSimSun font in console properties. Included in latest Windows.
txFirst dw 96beh,5f97h,7ccah,6d82h,0
txAnsi dw 80, 117, 114, 101, 32, 65, 110, 115, 105, 32, 116, 101, 120, 116, 32, 119, 111, 114, 107, 115, 32, 102, 105, 110, 101, 0

.code

main proc
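mov eax, dword ptr _imp___iob        ; import slot holds the address of _iob[0] (stdin)
add eax, 20h                         ; advance to _iob[1] (stdout)
invoke crt__fileno, eax              ; file descriptor of stdout
invoke crt__setmode, eax, _O_U16TEXT ; switch stdout to UTF-16 text mode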
invoke crt_wprintf, uc$("TITLE:", 9, "%ls", 13, 10, "TEXT:", 9, "%ls", 13, 10, 10), uc$("Title"), offset txFirst
invoke crt_wprintf, uc$("TITLE:", 9, "%s", 13, 10, "TEXT:", 9, "%s", 13, 10, 10), uc$("Title"), offset txAnsi
ret
main endp

end


Ignorance is bliss

jj2007

Re: WriteConsoleW, CRT printf and Unicode
« Reply #10 on: February 17, 2020, 02:11:37 AM »
Quote from: Mikl__
Hi, Jochen!
These are the results of running your programs

Thank you, Mikl :thup:

Your output looks identical to my Italian version. What is your OS code page? I had hoped to see something like
Code: [Select]
C:\Пользователи\Mikl\документы...

My OS produces, depending on I don't know what, both C:\Program Files and C:\Programmi. Mysteries of Windows ;-)

Mikl__

Re: WriteConsoleW, CRT printf and Unicode
« Reply #11 on: February 17, 2020, 03:26:33 AM »
Jochen, if I look at it in Windows Explorer, the folder "c:\Users\Mikl\Documents" is shown as "c:\Пользователи\Mikl\Мои документы". Most likely you need to remake the program.

jj2007

Re: WriteConsoleW, CRT printf and Unicode
« Reply #12 on: February 17, 2020, 06:58:13 PM »
Quote from: Mikl__
Most likely you need to remake the program

Hi Mikl,

The program is OK, it just prints out what SHGetFolderPathW delivers. But now I understand what happens here: Explorer translates what it gets from SHGetFolderPathW. I am using the FreeCommander file manager, which uses both the original version and the translation, see screenshot below - the treeview to the left is translated to Italian, the main view is English.

So apparently Windows uses only English names for the special folders.

Hans Passant:
Quote
The names of system folders like C:\Program Files are recorded by the language specific version of Windows when you first install it.  They do *not* get translated to the culture that is selected, not even if you have MUI language packs installed.  That would be a massive appcompat breaking change, there is a ton of software out there that hard-codes the path names of system folders when they get installed.  To see this for yourself, start Regedit.exe and search for "C:\Program Files".  Yes, that includes Microsoft software.

Interesting, the official Micros**t doc says the contrary:
Quote
The following example uses SHGetFolderPath, which will return the correct localized version of the path

Wrong, folks - it always returns the English path! Now the next question is how to get the translated names :cool:
Code: [Select]
.DATA?
sfFileInfo SHFILEINFOW <>
.CODE
invoke SHGetFileInfoW, edi, 0, addr sfFileInfo, SHFILEINFOW, SHGFI_DISPLAYNAME

Yep, that works! Or at least, it seemed to work: It returned "Documenti". The problem is that it returns only "Documenti", whatever source file name you throw at it. GetLastError is zero, no problem, but this W function invariably returns the ANSI string "Documenti". Micros**t, anybody awake at Redmond???

AW

Re: WriteConsoleW, CRT printf and Unicode
« Reply #13 on: February 18, 2020, 03:34:44 AM »
@Mikl__,




You should have your localized name for string ID 21781 in:

C:\Windows\System32\ru-RU\shell32.dll.mui

AW

Re: WriteConsoleW, CRT printf and Unicode
« Reply #14 on: February 18, 2020, 03:57:55 AM »
Russian Documents:



String ID 21770.  :thumbsup:

Some system folders keep the same name across languages, to prevent a big mess.
I think Program Files does not change; at least in this MUI file it does not.
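
If you want to look these IDs up in a resource editor: Win32 string tables are stored in bundles of 16 strings, so the editor typically lists a bundle number rather than the string ID itself. A small C sketch of the mapping (locate_string is a hypothetical helper, not a Windows API):

```c
#include <assert.h>
#include <stddef.h>

/* Maps a Win32 string ID to the RT_STRING bundle that contains it
   and the slot (0..15) inside that bundle. Hypothetical helper. */
void locate_string(unsigned string_id, unsigned *bundle, unsigned *slot)
{
    assert(bundle != NULL && slot != NULL);
    *bundle = (string_id >> 4) + 1;   /* bundles are numbered from 1 */
    *slot   = string_id & 15u;        /* position inside the bundle */
}
```

With this mapping, ID 21781 lands in bundle 1362, slot 5, and ID 21770 in bundle 1361, slot 10.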