The MASM Forum

Projects => Poasm => Topic started by: Vortex on May 02, 2017, 05:31:12 AM

Title: Command line parsing modules
Post by: Vortex on May 02, 2017, 05:31:12 AM
Hello,

Attached is a set of modules to parse ANSI and UNICODE command lines.
Title: Re: Command line parsing modules
Post by: Siekmanski on May 02, 2017, 06:31:53 AM
Thanks.
Title: Re: Command line parsing modules
Post by: jj2007 on May 02, 2017, 07:04:03 AM
Hi Erol,
What's wrong with this test?

C:\Masm32\MasmBasic\Members\Erol\ParseCL>TestUnicode.exe Arg1 Введите текст здесь Arg5
Command line parameter 0 = TestUnicode.exe
Command line parameter 1 = Arg1
Command line parameter 2 =
Command line parameter 3 =
Command line parameter 4 =
Command line parameter 5 = Arg5


The same test with wCL$() (http://www.webalice.it/jj2006/MasmBasicQuickReference.htm#Mb1218):

J:\Masm32\MasmBasic\Members\Erol\ParseCL>TestUcMb.exe Arg1 Введите текст здесь Arg5
TestUcMb.exe
Arg1
Введите
текст
здесь
Arg5
Title: Re: Command line parsing modules
Post by: Vortex on May 03, 2017, 04:29:36 AM
Hi Jochen,

If I am not mistaken, the Windows command line does not provide full Unicode support. Here is my test:

https://ufile.io/gicwi

Edit : The code below did not help either :

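#include <windows.h>
#include <wchar.h>
#include <stdio.h>
#include <locale.h>

int _cdecl main(int argc, char *argv[])
{
    int i;

    setlocale(LC_ALL, "Russian");

    /* Caution: argv holds narrow (char) strings here. With the Microsoft
       CRT, %s inside wprintf expects a wide string, so %hs or a wmain
       entry point with wchar_t *argv[] would be needed. */
    for (i = 0; i < argc; ++i)
    {
        wprintf(L"Command line parameter %d = %s\n", i + 1, argv[i]);
    }
    return 0;
}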


https://gist.github.com/pkorotkov/8087262
Title: Re: Command line parsing modules
Post by: jj2007 on May 03, 2017, 08:11:18 AM
Hi Erol,

Interesting - it works fine for me. My pre-set codepage is 850, but I can paste Russian (and ä ö ü ç) into the console while the codepage is 850. The MasmBasic snippet sets the codepage to 65001, aka UTF-8.

So it seems that Windows pastes and recognises UTF-8, although the codepage is different. It then passes full Unicode to the application: MB takes the Unicode commandline, and displays it as UTF-8 in the console.

Everything clear? No? This is Windows, my friend ::)

Btw which font have you set in the console? Lucida Console? There are only a handful of fonts that work properly, notably LC and Consolas.

P.S.: The C code by pkorotkov doesn't print anything on my console. Really weird: I have put printf statements before and after the wprintf, they display their strings, but there is nothing for wprintf() ::)
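
The "take the Unicode command line, display it as UTF-8" step described above can be sketched in plain C. This is only an illustration of the encoding step, not MasmBasic's actual code: utf16_to_utf8 is a hypothetical helper name, and the buffer-size convention (return 0 when the output does not fit) is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from MasmBasic or the attached modules):
   encode a NUL-terminated UTF-16 string as NUL-terminated UTF-8.
   Returns the number of bytes written, excluding the NUL,
   or 0 if dst (cap bytes) is too small. */
size_t utf16_to_utf8(const uint16_t *src, char *dst, size_t cap)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (cap == 0)
        return 0;

    while (*src) {
        uint32_t cp = *src++;
        size_t need;

        if (cp >= 0xD800 && cp <= 0xDBFF && *src >= 0xDC00 && *src <= 0xDFFF)
            /* high + low surrogate: combine into one code point */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*src++ - 0xDC00);
        else if (cp >= 0xD800 && cp <= 0xDFFF)
            cp = 0xFFFD;    /* unpaired surrogate: replacement character */

        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + need >= cap)
            return 0;       /* keep one byte free for the NUL */

        if (need == 1) {
            dst[n++] = (char)cp;
        } else if (need == 2) {
            dst[n++] = (char)(0xC0 | (cp >> 6));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[n++] = (char)(0xE0 | (cp >> 12));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[n++] = (char)(0xF0 | (cp >> 18));
            dst[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    dst[n] = '\0';
    return n;
}
```

On Windows the UTF-16 input would typically come from GetCommandLineW or CommandLineToArgvW, and the resulting bytes only display correctly if the console codepage is 65001 and the font has the glyphs.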
Title: Re: Command line parsing modules
Post by: TWell on May 03, 2017, 10:21:26 AM
With msvcrXX.dll (XX >= 80):
_setmode(_fileno(stdout), 0x20000); // 0x20000 = _O_U16TEXT: switch stdout to UTF-16 text mode
Title: Re: Command line parsing modules
Post by: jj2007 on May 03, 2017, 05:51:24 PM
Quote from: TWell on May 03, 2017, 10:21:26 AM
With msvcrXX.dll (XX >= 80):
_setmode(_fileno(stdout), 0x20000); // 0x20000 = _O_U16TEXT: switch stdout to UTF-16 text mode

No effect. It compiles fine with Pelles C, but wprintf doesn't display anything. ANSI printf works fine.
Title: Re: Command line parsing modules
Post by: TWell on May 03, 2017, 06:06:47 PM
The Pelles C CRT doesn't support that feature; it is only available in msvcr80.dll and later.

.model flat,stdcall

INCLUDELIB msvcr100.lib
exit PROTO C :DWORD
wprintf PROTO C :VARARG
__iob_func PROTO C
_fileno PROTO C :PTR
_setmode PROTO C :DWORD, :DWORD

.data
msg dw "ÄÖÜ",13,10,0

.code
mainCRTStartup PROC C
INVOKE __iob_func
add eax, 20h ; sizeof FILE stdout
INVOKE _fileno, eax
INVOKE _setmode, eax, 20000h
INVOKE wprintf, ADDR msg
INVOKE exit, eax
mainCRTStartup ENDP
END
Title: Re: Command line parsing modules
Post by: jj2007 on May 03, 2017, 06:21:16 PM
include \masm32\MasmBasic\MasmBasic.inc      ; download (http://masm32.com/board/index.php?topic=94.0)
  Init
  Inkey "Я не понимаю, почему C такой сложный язык"      ; "I don't understand why C is such a complicated language"
EndOfCode
Title: Re: Command line parsing modules
Post by: Vortex on May 04, 2017, 03:48:37 AM
Hi Jochen,

Thanks for the help. The Lucida Console setting works fine with your application. It looks like I need to stay away from the msvcrt routines to display UNICODE strings.

Hi TWell,

Thanks for your support. I need to do some more tests.
Title: Re: Command line parsing modules
Post by: Vortex on May 04, 2017, 05:04:41 AM
Hello,

This time, I used the stdoutw function from the Masm32 package to print the UNICODE strings. It seems to work. Thanks again Jochen, now I can display the Russian characters correctly.
Title: Re: Command line parsing modules
Post by: jj2007 on May 04, 2017, 08:53:38 AM
Quote from: Vortex on May 04, 2017, 05:04:41 AM
now I can display the Russian characters correctly.

Works like a charm, Erol :t
Title: Re: Command line parsing modules
Post by: Vortex on May 08, 2017, 12:26:21 AM
Another version built with my version of wprintf :

(http://masm32.com/board/index.php?topic=5830.msg66224#msg66224)
Title: Re: Command line parsing modules
Post by: npnw on February 03, 2018, 02:38:30 PM
An explanation from Stack Overflow:

"
The stdout stream can be redirected and therefore always operates in 8-bit mode. The Unicode string you pass to wprintf() gets converted from utf-16 to the 8-bit code page that's selected for the console. By default that's the olden 437 OEM code page. That's where the buck stops, that code page doesn't support the character.

You'll need to switch to another 8-bit code page, one that does support that character. A good choice is 65001, the code page for utf-8. Fix:

SetConsoleOutputCP(CP_UTF8);

Or use SetConsoleCP() if you want stdin to use utf-8 as well.
"     
answered Apr 5 '13 at 8:05
Hans Passant

Title: Re: Command line parsing modules
Post by: Vortex on February 03, 2018, 08:17:11 PM
Hi npnw,

Thanks for the reply. SetConsoleOutputCP(CP_UTF8) does not solve the printf issue. This is why I prefer not to use printf in this case.
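
For reference, staying away from printf as described above can be sketched in C by handing the wide string straight to the console API. This is a minimal sketch under assumptions: console_write_w is a hypothetical name, I have not checked whether stdoutw from the Masm32 package works this way, and the non-Windows branch exists only so the sketch compiles elsewhere.

```c
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: write a NUL-terminated wide string to the console
   without going through the CRT's wprintf.
   Returns the number of wide characters written, or -1 on failure. */
long console_write_w(const wchar_t *text)
{
    size_t len = wcslen(text);
#ifdef _WIN32
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD written = 0;

    if (out == NULL || out == INVALID_HANDLE_VALUE)
        return -1;
    /* WriteConsoleW takes UTF-16 directly, so no codepage conversion is
       involved. It fails when stdout is redirected to a file or pipe. */
    if (!WriteConsoleW(out, text, (DWORD)len, &written, NULL))
        return -1;
    return (long)written;
#else
    /* Fallback for other platforms (assumption, not part of the thread):
       let stdio convert the wide string using the current locale. */
    if (printf("%ls", text) < 0)
        return -1;
    return (long)len;
#endif
}
```

Calling console_write_w(L"Введите текст здесь\r\n") should then depend only on the console font having the glyphs, not on the active codepage. Redirected output would still need a separate path, for example converting to UTF-8 and writing the bytes.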