Command line parsing modules

Started by Vortex, May 02, 2017, 05:31:12 AM


Vortex

Hello,

Attached is a set of modules to parse ANSI and UNICODE command lines.
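For readers without the attachment, here is a rough C sketch of what splitting a wide command line into arguments can look like. It is not Vortex's code: the function name `split_command_line` is hypothetical and the rules are deliberately simplified (spaces and tabs separate arguments, double quotes group text, no backslash escaping).

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper, NOT part of the attached modules.
   Splits cmdline in place: NUL terminators are written into the buffer and
   pointers to at most max_args arguments are stored in argv.
   Returns the number of arguments found.
   Assumed rules: space/tab separate arguments, double quotes group text and
   are removed, backslash escapes are not handled. */
size_t split_command_line(wchar_t *cmdline, wchar_t **argv, size_t max_args)
{
    size_t argc = 0;
    wchar_t *src = cmdline;

    while (*src != L'\0' && argc < max_args) {
        /* Skip separators in front of the next argument. */
        while (*src == L' ' || *src == L'\t')
            ++src;
        if (*src == L'\0')
            break;

        wchar_t *dst = src;   /* the argument is compacted in place */
        int in_quotes = 0;
        argv[argc++] = dst;

        while (*src != L'\0') {
            if (*src == L'"') {
                in_quotes = !in_quotes;   /* toggle and drop the quote */
                ++src;
            } else if (!in_quotes && (*src == L' ' || *src == L'\t')) {
                break;                    /* unquoted separator ends the argument */
            } else {
                *dst++ = *src++;
            }
        }

        /* Step past the separator before terminating the argument. */
        if (*src != L'\0')
            ++src;
        *dst = L'\0';
    }
    return argc;
}
```

Called on a writable copy of the command line, it fills `argv` with pointers into that same buffer, so no allocation is needed.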

Siekmanski

Creative coders use backward thinking techniques as a strategy.

jj2007

Hi Erol,
What's wrong with this test?

C:\Masm32\MasmBasic\Members\Erol\ParseCL>TestUnicode.exe Arg1 Введите текст здесь Arg5
Command line parameter 0 = TestUnicode.exe
Command line parameter 1 = Arg1
Command line parameter 2 =
Command line parameter 3 =
Command line parameter 4 =
Command line parameter 5 = Arg5


Same with wCL$():

J:\Masm32\MasmBasic\Members\Erol\ParseCL>TestUcMb.exe Arg1 Введите текст здесь Arg5
TestUcMb.exe
Arg1
Введите
текст
здесь
Arg5

Vortex

Hi Jochen,

If I am not mistaken, the Windows command line does not provide full Unicode support. Here is my test:

https://ufile.io/gicwi

Edit : The code below did not help either :

#include <windows.h>
#include <wchar.h>
#include <stdio.h>
#include <locale.h>

int _cdecl main(int argc,char *argv[])
{

int i;

setlocale(LC_ALL, "Russian");

for(i=0 ; i<argc ; ++i)
{
    wprintf(L"Command line parameter %d = %s\n",i+1,argv[i]);
}
}


https://gist.github.com/pkorotkov/8087262

jj2007

Hi Erol,

Interesting - it works fine for me. My pre-set codepage is 850, but I can paste Russian (and ä ö ü ç) into the console while codepage is 850. The MasmBasic snippet sets the codepage to 65001 aka UTF-8.

So it seems that Windows pastes and recognises UTF-8, although the codepage is different. It then passes full Unicode to the application: MB takes the Unicode commandline, and displays it as UTF-8 in the console.
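The UTF-16 to UTF-8 step can be sketched in portable C as below. This is not MasmBasic's code: `utf16_to_utf8` is a hypothetical name, and on Windows the same job is normally done with WideCharToMultiByte and CP_UTF8. It only shows why Cyrillic text survives the trip: every UTF-16 code unit has a UTF-8 encoding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts a NUL-terminated UTF-16 string to UTF-8.
   Returns the number of bytes written (excluding the NUL), or (size_t)-1
   if the output buffer is too small. Unpaired surrogates become U+FFFD. */
size_t utf16_to_utf8(const uint16_t *in, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return (size_t)-1;

    while (*in != 0) {
        uint32_t cp = *in++;

        if (cp >= 0xD800 && cp <= 0xDBFF && *in >= 0xDC00 && *in <= 0xDFFF) {
            /* Valid surrogate pair: combine into one code point. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*in++ - 0xDC00);
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;   /* lone surrogate */
        }

        unsigned char buf[4];
        size_t len;
        if (cp < 0x80) {
            buf[0] = (unsigned char)cp;
            len = 1;
        } else if (cp < 0x800) {
            buf[0] = (unsigned char)(0xC0 | (cp >> 6));
            buf[1] = (unsigned char)(0x80 | (cp & 0x3F));
            len = 2;
        } else if (cp < 0x10000) {
            buf[0] = (unsigned char)(0xE0 | (cp >> 12));
            buf[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            buf[2] = (unsigned char)(0x80 | (cp & 0x3F));
            len = 3;
        } else {
            buf[0] = (unsigned char)(0xF0 | (cp >> 18));
            buf[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            buf[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            buf[3] = (unsigned char)(0x80 | (cp & 0x3F));
            len = 4;
        }

        /* Keep one byte in reserve for the terminating NUL. */
        if (n + len + 1 > out_size)
            return (size_t)-1;
        for (size_t i = 0; i < len; ++i)
            out[n++] = (char)buf[i];
    }

    out[n] = '\0';
    return n;
}
```

Each Cyrillic letter in the test command line takes two bytes in UTF-8, so a five-letter word such as "текст" becomes ten bytes.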

Everything clear? No? This is Windows, my friend ::)

Btw which font have you set in the console? Lucida Console? There are only a handful of fonts that work properly, notably LC and Consolas.

P.S.: The C code by pkorotkov doesn't print anything on my console. Really weird: I have put printf statements before and after the wprintf, they display their strings, but there is nothing for wprintf() ::)

TWell

msvcrxx.dll xx => 80 _setmode(_fileno(stdout), 0x20000); // set console mode to unicode _O_U16TEXT

jj2007

Quote from: TWell on May 03, 2017, 10:21:26 AM
msvcrxx.dll xx => 80 _setmode(_fileno(stdout), 0x20000); // set console mode to unicode _O_U16TEXT

No effect. It compiles fine with Pelles C, but wprintf doesn't display anything. ANSI printf works fine.

TWell

The Pelles C CRT doesn't support that feature; it is only available in msvcr80.dll and later.

.model flat,stdcall

INCLUDELIB msvcr100.lib
exit PROTO C :DWORD
wprintf PROTO C :VARARG
__iob_func PROTO C
_fileno PROTO C :PTR
_setmode PROTO C :DWORD, :DWORD

.data
msg dw "ÄÖÜ",13,10,0

.code
mainCRTStartup PROC C
INVOKE __iob_func
add eax, 20h ; sizeof FILE stdout
INVOKE _fileno, eax
INVOKE _setmode, eax, 20000h
INVOKE wprintf, ADDR msg
INVOKE exit, eax
mainCRTStartup ENDP
END

jj2007

include \masm32\MasmBasic\MasmBasic.inc      ; download
  Init
  Inkey "Я не понимаю, почему C такой сложный язык"      ; "I don't understand why C is such a complicated language"
EndOfCode

Vortex

Hi Jochen,

Thanks for the help. The Lucida Console setting works fine with your application. It looks like I need to stay away from the msvcrt routines to display UNICODE strings.

Hi Twell,

Thanks for your support. I need to do some more tests.

Vortex

Hello,

This time, I used the stdoutw function from the Masm32 package to print the UNICODE strings. It seems to work. Thanks again Jochen, now I can display correctly the Russian characters.
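For comparison, bypassing the CRT can be sketched in C roughly as follows. This is not the source of stdoutw; `print_wide` is a hypothetical name. The Windows branch assumes stdout is a real console, since WriteConsoleW fails when output is redirected to a file or pipe. The non-Windows branch is only there so the sketch compiles elsewhere.

```c
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper, NOT the Masm32 stdoutw implementation.
   Writes a NUL-terminated wide string to standard output.
   Returns 0 on success, -1 on failure. */
int print_wide(const wchar_t *text)
{
#ifdef _WIN32
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD written = 0;

    if (out == NULL || out == INVALID_HANDLE_VALUE)
        return -1;

    /* WriteConsoleW takes UTF-16 directly, so no code page conversion is
       involved. It fails if stdout is not a console (assumption above). */
    if (!WriteConsoleW(out, text, (DWORD)wcslen(text), &written, NULL))
        return -1;
    return 0;
#else
    /* Fallback for other platforms: plain wide-character stdio. */
    return fputws(text, stdout) < 0 ? -1 : 0;
#endif
}
```

A console font that covers the characters is still needed, as noted earlier in the thread.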

jj2007

Quote from: Vortex on May 04, 2017, 05:04:41 AM
now I can display correctly the Russian characters.

Works like a charm, Erol :t

Vortex

Another version built with my version of wprintf :


npnw

An explanation from Stack Overflow:

"

The stdout stream can be redirected and therefore always operates in 8-bit mode. The Unicode string you pass to wprintf() gets converted from utf-16 to the 8-bit code page that's selected for the console. By default that's the olden 437 OEM code page. That's where the buck stops, that code page doesn't support the character.

You'll need to switch to another 8-bit code page, one that does support that character. A good choice is 65001, the code page for utf-8. Fix:

SetConsoleOutputCP(CP_UTF8);

Or use SetConsoleCP() if you want stdin to use utf-8 as well.
"     
answered Apr 5 '13 at 8:05
Hans Passant


Vortex

Hi npnw,

Thanks for the reply. SetConsoleOutputCP(CP_UTF8) does not solve the printf issue. This is why I prefer not to use printf in this case.