Author Topic: Command line parsing modules  (Read 3245 times)

Vortex

  • Moderator
  • Member
  • *****
  • Posts: 1821
Command line parsing modules
« on: May 02, 2017, 05:31:12 AM »
Hello,

Attached is a set of modules to parse ANSI and UNICODE command lines.

Siekmanski

  • Member
  • *****
  • Posts: 1588
Re: Command line parsing modules
« Reply #1 on: May 02, 2017, 06:31:53 AM »
Thanks.
Creative coders use backward thinking techniques as their strategy.

jj2007

  • Member
  • *****
  • Posts: 8590
  • Assembler is fun ;-)
    • MasmBasic
Re: Command line parsing modules
« Reply #2 on: May 02, 2017, 07:04:03 AM »
Hi Erol,
What's wrong with this test?

Code:
C:\Masm32\MasmBasic\Members\Erol\ParseCL>TestUnicode.exe Arg1 Введите текст здесь Arg5
Command line parameter 0 = TestUnicode.exe
Command line parameter 1 = Arg1
Command line parameter 2 =
Command line parameter 3 =
Command line parameter 4 =
Command line parameter 5 = Arg5

Same with wCL$():
Code:
J:\Masm32\MasmBasic\Members\Erol\ParseCL>TestUcMb.exe Arg1 Введите текст здесь Arg5
TestUcMb.exe
Arg1
Введите
текст
здесь
Arg5

Vortex

  • Moderator
  • Member
  • *****
  • Posts: 1821
Re: Command line parsing modules
« Reply #3 on: May 03, 2017, 04:29:36 AM »
Hi Jochen,

If I am not mistaken, the Windows command line does not provide full Unicode support. Here is my test:

https://ufile.io/gicwi

Edit: The code below did not help either:

Code:
#include <windows.h>
#include <wchar.h>
#include <stdio.h>
#include <locale.h>

int _cdecl main(int argc,char *argv[])
{
    int i;

    setlocale(LC_ALL, "Russian");

    for (i = 0; i < argc; ++i)
    {
        wprintf(L"Command line parameter %d = %s\n", i+1, argv[i]);
    }

    return 0;
}

https://gist.github.com/pkorotkov/8087262

jj2007

  • Member
  • *****
  • Posts: 8590
  • Assembler is fun ;-)
    • MasmBasic
Re: Command line parsing modules
« Reply #4 on: May 03, 2017, 08:11:18 AM »
Hi Erol,

Interesting - it works fine for me. My default codepage is 850, yet I can paste Russian (and ä ö ü ç) into the console without changing it. The MasmBasic snippet sets the codepage to 65001 aka UTF-8.

So it seems that Windows pastes and recognises UTF-8, although the codepage is different. It then passes full Unicode to the application: MB takes the Unicode commandline, and displays it as UTF-8 in the console.
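
The step described here (a UTF-16 command line coming in, UTF-8 bytes going out to a console on codepage 65001) can be sketched in portable C. This is not MasmBasic's code: `utf16_to_utf8` is a hypothetical helper written only for illustration, and on Windows the same job is normally left to `WideCharToMultiByte` with `CP_UTF8`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   Converts a zero-terminated UTF-16 string to zero-terminated UTF-8.
   Returns the number of bytes written (terminator not counted), or
   (size_t)-1 if dst is too small or a surrogate is unpaired. */
size_t utf16_to_utf8(const uint16_t *src, char *dst, size_t cap)
{
    size_t n = 0;

    if (cap == 0)
        return (size_t)-1;

    while (*src) {
        uint32_t cp = *src++;
        size_t need;

        if (cp >= 0xD800 && cp <= 0xDBFF) {           /* high surrogate */
            if (*src < 0xDC00 || *src > 0xDFFF)
                return (size_t)-1;                    /* no low surrogate follows */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*src++ - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {    /* stray low surrogate */
            return (size_t)-1;
        }

        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + need >= cap)                          /* keep room for '\0' */
            return (size_t)-1;

        if (need == 1) {
            dst[n++] = (char)cp;
        } else if (need == 2) {
            dst[n++] = (char)(0xC0 | (cp >> 6));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[n++] = (char)(0xE0 | (cp >> 12));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[n++] = (char)(0xF0 | (cp >> 18));
            dst[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }

    dst[n] = '\0';
    return n;
}
```

For example, the Cyrillic letter Я (U+042F) is one UTF-16 unit and becomes the two UTF-8 bytes D0 AF, which is why a console set to an 8-bit codepage other than 65001 shows it as two unrelated characters.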

Everything clear? No? This is Windows, my friend ::)

Btw which font have you set in the console? Lucida Console? There are only a handful of fonts that work properly, notably LC and Consolas.

P.S.: The C code by pkorotkov doesn't print anything on my console. Really weird: I have put printf statements before and after the wprintf, they display their strings, but there is nothing for wprintf() ::)

TWell

  • Member
  • ****
  • Posts: 748
Re: Command line parsing modules
« Reply #5 on: May 03, 2017, 10:21:26 AM »
This needs msvcrXX.dll with XX >= 80 (msvcr80.dll or later):
Code:
_setmode(_fileno(stdout), 0x20000); // set console mode to unicode _O_U16TEXT
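
A slightly fuller sketch of the same idea, assuming a Microsoft CRT that supports `_O_U16TEXT`. The function name `print_args_wide` is hypothetical, and the non-Windows branch exists only so the sketch compiles elsewhere.

```c
#include <stdio.h>
#include <wchar.h>
#ifdef _WIN32
#include <fcntl.h>   /* _O_U16TEXT */
#include <io.h>      /* _setmode, _fileno */
#endif

/* Hypothetical helper: print each wide argument on its own line.
   Returns argc on success, -1 on failure. */
int print_args_wide(int argc, wchar_t **argv)
{
    int i;

#ifdef _WIN32
    /* Switch stdout to UTF-16 text mode (assumes a CRT that supports it).
       After this call only wide output functions should be used on stdout. */
    if (_setmode(_fileno(stdout), _O_U16TEXT) == -1)
        return -1;
#endif

    for (i = 0; i < argc; ++i) {
        /* %ls means "wide string" in both the Microsoft CRT and standard C. */
        if (wprintf(L"Command line parameter %d = %ls\n", i, argv[i]) < 0)
            return -1;
    }
    return argc;
}
```

With a Microsoft toolchain this would typically be called from `wmain(int argc, wchar_t *argv[])`, so that the arguments arrive as UTF-16 rather than being converted to the ANSI codepage first.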

jj2007

  • Member
  • *****
  • Posts: 8590
  • Assembler is fun ;-)
    • MasmBasic
Re: Command line parsing modules
« Reply #6 on: May 03, 2017, 05:51:24 PM »
This needs msvcrXX.dll with XX >= 80 (msvcr80.dll or later):
Code:
_setmode(_fileno(stdout), 0x20000); // set console mode to unicode _O_U16TEXT

No effect. It compiles fine with Pelles C, but wprintf doesn't display anything. ANSI printf works fine.

TWell

  • Member
  • ****
  • Posts: 748
Re: Command line parsing modules
« Reply #7 on: May 03, 2017, 06:06:47 PM »
The Pelles C CRT doesn't support that feature; it is only available in msvcr80.dll and later.
Code:
.model flat,stdcall

INCLUDELIB msvcr100.lib
exit PROTO C :DWORD
wprintf PROTO C :VARARG
__iob_func PROTO C
_fileno PROTO C :PTR
_setmode PROTO C :DWORD, :DWORD

.data
msg dw "ÄÖÜ",13,10,0

.code
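mainCRTStartup PROC C
INVOKE __iob_func            ; eax -> CRT array of FILE structures (stdin, stdout, stderr)
add eax, 20h                 ; skip one 32-byte FILE entry to reach stdout
INVOKE _fileno, eax          ; eax = file descriptor of stdout
INVOKE _setmode, eax, 20000h ; 20000h = _O_U16TEXT
INVOKE wprintf, ADDR msg     ; print the wide string
INVOKE exit, eax             ; exit code = return value of wprintf
mainCRTStartup ENDP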
END

jj2007

  • Member
  • *****
  • Posts: 8590
  • Assembler is fun ;-)
    • MasmBasic
Re: Command line parsing modules
« Reply #8 on: May 03, 2017, 06:21:16 PM »
include \masm32\MasmBasic\MasmBasic.inc      ; download
  Init
  Inkey "Я не понимаю, почему C такой сложный язык"
EndOfCode

Vortex

  • Moderator
  • Member
  • *****
  • Posts: 1821
Re: Command line parsing modules
« Reply #9 on: May 04, 2017, 03:48:37 AM »
Hi Jochen,

Thanks for the help. The Lucida Console setting works fine with your application. It looks like I need to stay away from the msvcrt routines to display UNICODE strings.

Hi TWell,

Thanks for your support. I need to do some more tests.

Vortex

  • Moderator
  • Member
  • *****
  • Posts: 1821
Re: Command line parsing modules
« Reply #10 on: May 04, 2017, 05:04:41 AM »
Hello,

This time, I used the stdoutw function from the Masm32 package to print the UNICODE strings. It seems to work. Thanks again Jochen, now I can display the Russian characters correctly.
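
For comparison, here is a minimal C sketch of writing UTF-16 straight to the console with `WriteConsoleW`, which bypasses the CRT and its codepage conversion. It is an assumption that this resembles what stdoutw does internally; `write_wide` is a hypothetical name, and the non-Windows branch is only there so the sketch compiles elsewhere.

```c
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: write a zero-terminated wide string to standard output.
   Returns 1 on success, 0 on failure. */
int write_wide(const wchar_t *text)
{
#ifdef _WIN32
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD written = 0;

    /* WriteConsoleW takes UTF-16 directly, so no 8-bit codepage is involved.
       It fails when stdout is redirected to a file or a pipe. */
    return WriteConsoleW(out, text, (DWORD)wcslen(text), &written, NULL) != 0;
#else
    /* Fallback for other platforms: plain wide-character stream output. */
    return fputws(text, stdout) >= 0;
#endif
}
```

Whether the characters actually render still depends on the console font, as noted earlier in the thread.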

jj2007

  • Member
  • *****
  • Posts: 8590
  • Assembler is fun ;-)
    • MasmBasic
Re: Command line parsing modules
« Reply #11 on: May 04, 2017, 08:53:38 AM »
now I can display the Russian characters correctly.

Works like a charm, Erol :t

Vortex

  • Moderator
  • Member
  • *****
  • Posts: 1821
Re: Command line parsing modules
« Reply #12 on: May 08, 2017, 12:26:21 AM »
Another version, built with my own version of wprintf:


npnw

  • Member
  • **
  • Posts: 152
Re: Command line parsing modules
« Reply #13 on: February 03, 2018, 02:38:30 PM »
An explanation from Stack Overflow:

"

The stdout stream can be redirected and therefore always operates in 8-bit mode. The Unicode string you pass to wprintf() gets converted from utf-16 to the 8-bit code page that's selected for the console. By default that's the olden 437 OEM code page. That's where the buck stops, that code page doesn't support the character.

You'll need to switch to another 8-bit code page, one that does support that character. A good choice is 65001, the code page for utf-8. Fix:

 SetConsoleOutputCP(CP_UTF8);

Or use SetConsoleCP() if you want stdin to use utf-8 as well.
"     
answered Apr 5 '13 at 8:05
Hans Passant
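
A minimal sketch of the fix suggested in the quoted answer, using narrow UTF-8 output rather than wprintf. `print_utf8` is a hypothetical helper, the input is assumed to be UTF-8 encoded already, and the non-Windows branch simply skips the codepage switch.

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: select UTF-8 for console output, then print
   a string that is assumed to be UTF-8 encoded already.
   Returns 1 on success, 0 on failure. */
int print_utf8(const char *utf8)
{
#ifdef _WIN32
    /* Codepage 65001 is UTF-8. */
    if (!SetConsoleOutputCP(CP_UTF8))
        return 0;
#endif
    return fputs(utf8, stdout) >= 0;
}
```

Calling `print_utf8("\xD0\xAF\n")` sends the two UTF-8 bytes for Я to the console. Results may differ between CRTs and console fonts, which matches the mixed experiences reported in this thread.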


Vortex

  • Moderator
  • Member
  • *****
  • Posts: 1821
Re: Command line parsing modules
« Reply #14 on: February 03, 2018, 08:17:11 PM »
Hi npnw,

Thanks for the reply. SetConsoleOutputCP(CP_UTF8) does not solve the printf issue. This is why I prefer not to use printf in this case.