Author Topic: Re: How to generate an Unicode string under MASM 6.15?  (Read 3763 times)

nidud

  • Member
  • *****
  • Posts: 1408
    • https://github.com/nidud/asmc
Re: How to generate an Unicode string under MASM 6.15?
« on: May 05, 2017, 11:15:13 PM »
Yes, thanks, after adding "__UNICODE__ equ 1" ML began to write unicode builds, but it also now gives errors despite the fact that I declare the strings through TCHAR:

ML allows a string initializer only up to the byte size of the data type:
Code: [Select]
db "large byte array"
dw "ab"
dd "abcd"

The last two are converted to little-endian numbers, so you can't use TCHAR to define WORD-sized string arrays in MASM. Asmc allows declarations of Unicode and ASCII string arrays in some cases, but this is restricted to keep the code compatible with MASM.
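
To make the difference visible, here is a rough Python sketch (not MASM) of the bytes involved. It assumes MASM packs the quoted text into a single number with the first character as the most significant byte; the helper names are made up for illustration.

```python
import struct

def masm_dw_literal(text: str) -> bytes:
    """Bytes that `dw "ab"` would store, ASSUMING MASM treats the text as one number."""
    value = 0
    for ch in text.encode("ascii"):
        value = (value << 8) | ch      # first character becomes the high byte
    return struct.pack("<H", value)    # x86 stores a WORD low byte first

def utf16_string(text: str) -> bytes:
    """Bytes a real WORD-per-character (UTF-16LE) string would need."""
    return text.encode("utf-16-le")

print(masm_dw_literal("ab").hex(" "))  # the two characters end up swapped in memory
print(utf16_string("ab").hex(" "))     # one WORD per character, twice as long
```

So `dw "ab"` gives one WORD holding both characters, not two WORDs holding one character each, which is why a TCHAR (WORD) declaration cannot hold a Unicode string literal in plain MASM.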

Code: [Select]
invoke strchr,eax,"c" ; MASM compatible: number
strcat(eax, "c") ; converted to string

dw "ab" ; MASM compatible: number

option wstring:on
dw "ab" ; converted to Unicode string

Example

Code: [Select]
include stdio.inc
include stdlib.inc
include tchar.inc

.data
string TCHAR "TCHAR string",10,0

.code

_tmain proc _CDecl

_tprintf(addr string)
ret

_tmain endp

end _tstart

This will work with Unicode/ASCII, 32/64-bit
Code: [Select]
asmc -pe -D__PE__ test.asm
asmc -pe -ws -D_UNICODE -D__PE__ test.asm
asmc -pe -D_WIN64 -D__PE__ test.asm
asmc -pe -D_WIN64 -ws -D_UNICODE -D__PE__ test.asm

aw27

  • Member
  • ****
  • Posts: 852
  • Let's Make ASM Great Again!
Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #1 on: May 06, 2017, 01:13:35 AM »
Another example with a mix of various languages. All you need is an editor that saves in UTF-8; most will do. Tested with JWASM/HJWASM.

Code: [Select]
.386

.MODEL FLAT, C
option casemap:none

CP_UTF8 equ 65001
MB_OK equ 0
NULL equ 0
option dllimport:<kernel32.dll>
MultiByteToWideChar PROTO STDCALL :DWORD,:DWORD,:DWORD,:DWORD,:DWORD,:DWORD
ExitProcess   PROTO STDCALL :DWORD
option dllimport:<user32.dll>
MessageBoxW PROTO STDCALL :DWORD,:DWORD,:DWORD,:DWORD

.data


sKorean db "Korean: 한자"," "
sJapanese db "Japanese: 漢字"," "
sChinese db "Chinese: 汉字"," "
sRussian db "Russian: Прощай", 0,0
sCaption dw "L","a","n","g","u","a","g","e"," ","S","a","l","a","d",0,0

.data?
myBuffer      dw 256 dup(?)   ; 256 wide chars, matching the cchWideChar argument below
   
.code
     
start proc
invoke MultiByteToWideChar, CP_UTF8, 0, offset sKorean, -1, offset myBuffer, 256
invoke MessageBoxW, NULL, addr myBuffer, addr sCaption, MB_OK
invoke  ExitProcess, 0
ret
start endp

end start
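
A note on sizing: the last argument of MultiByteToWideChar (cchWideChar) counts wide characters, not bytes, so the output buffer needs two bytes per unit. A small Python sketch to estimate what a given text needs; `wide_units` is a hypothetical helper, and the string is the concatenation from the example above.

```python
def wide_units(text: str) -> int:
    """UTF-16 code units needed for text, plus one for the terminating NUL."""
    return len(text.encode("utf-16-le")) // 2 + 1

salad = "Korean: 한자 Japanese: 漢字 Chinese: 汉字 Russian: Прощай"

utf8_bytes = len(salad.encode("utf-8")) + 1   # source size in bytes, with NUL
units = wide_units(salad)                     # value to compare against cchWideChar
print(utf8_bytes, units, units * 2)           # UTF-8 bytes, wide chars, buffer bytes
```

Characters outside the Basic Multilingual Plane take two units each, which is why the helper counts encoded units rather than Python characters.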

TWell

  • Member
  • ****
  • Posts: 748
Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #2 on: May 06, 2017, 02:58:51 AM »
W0LF,
npp defaults to UTF-8 and your examples contained Cyrillic text, so was it really 8-bit ANSI text in KOI8 or Windows cp1251?
In the linked picture I wasn't able to see any Cyrillic text, so do I have to buy better eyeglasses or just more beer :biggrin:

Others:
jwasm and others should accept the fact that UTF-8 is the default for source files these days, as comments need the local language.
How about just warning about the BOM?

When will we see a real Unicode-capable assembler?
Will we have to wait another 20 years?

ml[64] is just a vintage product, a barrier for further development ;)

jj2007

  • Member
  • *****
  • Posts: 7740
  • Assembler is fun ;-)
    • MasmBasic
Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #3 on: May 06, 2017, 04:45:25 AM »
My code works even without __UNICODE__ with Russian symbols.

It works with ANSI if your machine's codepage is Cyrillic. My example works on all machines because it uses UTF-8.
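
A quick way to see the codepage dependence, sketched in Python rather than assembler. The choice of cp1251 and cp1252 as the two "machine codepages" is an assumption for illustration only.

```python
text = "Заголовок"

ansi_bytes = text.encode("cp1251")   # what an ANSI editor on a Cyrillic machine saves
utf8_bytes = text.encode("utf-8")    # what a UTF-8 editor saves

def view_as(data: bytes, codepage: str) -> str:
    """Interpret the same bytes under a given codepage."""
    return data.decode(codepage)

print(view_as(ansi_bytes, "cp1251"))  # readable on a Cyrillic machine
print(view_as(ansi_bytes, "cp1252"))  # same bytes on a Western machine: wrong letters
print(view_as(utf8_bytes, "utf-8"))   # UTF-8 carries no dependence on the local codepage
```

The ANSI bytes are only meaningful together with the codepage they were written in, while the UTF-8 bytes can be converted explicitly with CP_UTF8 on any machine.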

newrobert

  • Regular Member
  • *
  • Posts: 38
Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #4 on: May 06, 2017, 10:59:03 AM »
Quote from: TWell
When will we see a real Unicode-capable assembler?
Will we have to wait another 20 years?

20 years too long.

jj2007

  • Member
  • *****
  • Posts: 7740
  • Assembler is fun ;-)
    • MasmBasic
Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #5 on: May 06, 2017, 11:41:02 AM »
jwasm and others should accept the fact that UTF-8 is the default for source files these days, as comments need the local language.
How about just warning about the BOM?

My examples above work because all Masm-compatible assemblers accept UTF-8. Open the attached *.asm source in a hex editor and check for yourself.

Code: [Select]
.data
sCaption db "Заголовок", 0
sText db "Текст на русском!", 0

Btw it also builds with qEditor, although the text looks like garbage there. In RichMasm you can actually read the Russian text, and you can save edits.

aw27

  • Member
  • ****
  • Posts: 852
  • Let's Make ASM Great Again!
Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #6 on: May 06, 2017, 05:00:38 PM »
My examples above work because all Masm-compatible assemblers accept UTF-8.
The assemblers simply don't know, they believe they are dealing with ASCII and it works in almost every case. If the UTF-8 file has a BOM, that alone is enough for the assembler to reject the file.
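
The reason it works so often can be checked directly: in UTF-8, every byte of a non-ASCII character is 0x80 or higher, so a byte-oriented parser never sees a stray quote, comma or line break inside a multi-byte character. A minimal Python sketch, with a hypothetical helper name:

```python
def ascii_safe(text: str) -> bool:
    """True if no non-ASCII character contributes a byte below 0x80 in UTF-8."""
    return all(
        b >= 0x80
        for ch in text if ord(ch) > 0x7F
        for b in ch.encode("utf-8")
    )

line = 'sText db "Текст на русском!", 0'
raw = line.encode("utf-8")

# Positions of the double-quote byte (0x22) as an ASCII-only parser would see them.
quote_positions = [i for i, b in enumerate(raw) if b == 0x22]

print(ascii_safe(line), quote_positions)
```

The parser still finds exactly the two real quotes, and everything between them is passed through as opaque bytes. UTF-16 does not have this property, since its code units contain zero bytes and bytes in the ASCII range.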

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 4925
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #7 on: May 06, 2017, 06:12:58 PM »
 :biggrin:

> Btw it builds also with qEditor, although it looks like garbage.

That just says that the method you use does not work in Quick Editor. Look in the example code for working UNICODE applications using the orthodox Microsoft method. Then there is the QE accessory "MultiTool" that will do conversions for you and if all else fails, UniEdit lets you write anything you like in UNICODE which you then place in a UNICODE RC script.

QE is a pure ASCII editor, it does not pretend to write UTF8/16 or UNICODE.
hutch at movsd dot com
http://www.masm32.com    :biggrin:  :biggrin:

jj2007

  • Member
  • *****
  • Posts: 7740
  • Assembler is fun ;-)
    • MasmBasic
Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #8 on: May 06, 2017, 07:19:50 PM »
The assemblers simply don't know, they believe they are dealing with ASCII and it works in almost every case.

Exactly. And I have yet to see a case where it didn't work.
Code: [Select]
include \masm32\include\masm32rt.inc
include uChrMacro.inc

.code
start:
  push MB_OK
  push uChr$("Заголовок!")
  push uChr$("Текст на русском!")
  push 0
  call MessageBoxW
  exit
end start

Pure Masm32. It looks simple enough, right? Of course, there are more complicated solutions.

aw27

  • Member
  • ****
  • Posts: 852
  • Let's Make ASM Great Again!
Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #9 on: May 06, 2017, 09:10:08 PM »
Exactly. And I have yet to see a case where it didn't work.

Your MasmBasic is amazing. Congratulations  :t

nidud

  • Member
  • *****
  • Posts: 1408
    • https://github.com/nidud/asmc
Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #10 on: May 06, 2017, 10:18:26 PM »
The assemblers simply don't know, they believe they are dealing with ASCII and it works in almost every case.

Exactly. And I have yet to see a case where it didn't work.
Code: [Select]
include \masm32\include\masm32rt.inc
include uChrMacro.inc

.code
start:
  push MB_OK
  push uChr$("Заголовок!")
  push uChr$("Текст на русском!")
  push 0
  call MessageBoxW
  exit
end start

Pure Masm32. It looks simple enough, right? Of course, there are more complicated solutions.

 :biggrin:

Copy and paste to Notepad, Save...
Quote
This file contains characters in Unicode format which will be lost if you save this file as an ANSI encoded text file. To keep the Unicode information, click Cancel below and select one of the Unicode options from the Encoding drop down list. Continue?

My examples above work because all Masm-compatible assemblers accept UTF-8.

save it as UTF-8 then...
Code: [Select]
я╗┐include \masm32\include\masm32rt.inc
 error A2008: syntax error : я╗┐include
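
The three odd characters in front of "include" are just the UTF-8 BOM bytes shown in a DOS codepage. A short Python check, assuming the console uses OEM codepage 866:

```python
import codecs

bom = codecs.BOM_UTF8          # the three bytes Notepad writes at the start of the file
shown = bom.decode("cp866")    # how a cp866 console renders those bytes

print(bom.hex(" "), shown)
```

The assembler reads those bytes as part of the first token, so the line no longer starts with a valid "include".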

aw27

  • Member
  • ****
  • Posts: 852
  • Let's Make ASM Great Again!
Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #11 on: May 06, 2017, 10:32:43 PM »
error A2008: syntax error : я╗┐include

The BOMs suck  :badgrin:
Intelligent editors don't need a BOM to find out that the text is UTF8.

jj2007

  • Member
  • *****
  • Posts: 7740
  • Assembler is fun ;-)
    • MasmBasic
Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #12 on: May 06, 2017, 11:16:44 PM »
Intelligent editors don't need a BOM to find out that the text is UTF8.

I would even go one step further: Intelligent editors know when to put a BOM and when to save it as UTF-8 without BOM :bgrin:

(more precisely: ml & clones can't stand the BOM, rc.exe likes it)
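
If the editor cannot be told what to do, the BOM can be added or removed as a build step. A minimal Python sketch; the function names are hypothetical and it works on raw bytes, so the rest of the file is left untouched.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return the file content without a leading UTF-8 BOM (for ml and clones)."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

def add_utf8_bom(data: bytes) -> bytes:
    """Return the file content with exactly one leading UTF-8 BOM (for rc.exe)."""
    return codecs.BOM_UTF8 + strip_utf8_bom(data)

source = codecs.BOM_UTF8 + 'sCaption db "Заголовок", 0\n'.encode("utf-8")
print(strip_utf8_bom(source)[:11])
```

Both functions are safe to run repeatedly, so they can sit in a batch file ahead of the assembler or the resource compiler.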

hutch--

  • Administrator
  • Member
  • ******
  • Posts: 4925
  • Mnemonic Driven API Grinder
    • The MASM32 SDK
Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #13 on: May 06, 2017, 11:36:56 PM »
Bottom line is the UNICODE spec does not require it; it's there for text that may be used on other hardware that uses a different byte order. The UNICODE editor I supply specifically does not write it, as it's designed for Windows UNICODE only.
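
For UTF-16 the byte-order role of the BOM is easy to demonstrate. A small Python sketch with a hypothetical helper; without the mark, a reader has to assume an order, and on Windows that assumption is little-endian.

```python
import codecs

def utf16_byte_order(data: bytes) -> str:
    """Report the byte order announced by a leading UTF-16 BOM, if any."""
    if data.startswith(codecs.BOM_UTF16_LE):
        return "little-endian"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "big-endian"
    return "unmarked"

letter = "Я"                                  # U+042F
print(letter.encode("utf-16-le").hex(" "))    # low byte first
print(letter.encode("utf-16-be").hex(" "))    # high byte first
print(utf16_byte_order(codecs.BOM_UTF16_LE + letter.encode("utf-16-le")))
print(utf16_byte_order(letter.encode("utf-16-le")))
```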

nidud

  • Member
  • *****
  • Posts: 1408
    • https://github.com/nidud/asmc
Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #14 on: May 06, 2017, 11:42:03 PM »
Intelligent editors don't need a BOM to find out that the text is UTF8.

 :biggrin:

Maybe intelligent editors don't need UTF-8.

Quote from: WOLF
I use Notepad++ with ansi codepage.

Quote
Pure Masm32. It looks simple enough, right? Of course, there are more complicated solutions.

 :biggrin:

Quote
How to generate an Unicode string under MASM 6.15?
So in real MASM:
Code: [Select]
; Build: ml /c /coff test.asm
; link /subsystem:console /libpath:\masm32\lib kernel32.lib user32.lib test.obj
;
.386
.model flat, stdcall

ExitProcess proto :dword
MultiByteToWideChar proto :dword, :dword, :ptr, :dword, :ptr, :dword
MessageBoxW proto :ptr, :ptr, :ptr, :dword

.data

str_ru db "‡a£o«o¢oª",0
buffer db 128 dup(0)

.code
start:
invoke MultiByteToWideChar, 866, 0, addr str_ru,
lengthof str_ru, addr buffer, lengthof str_ru
invoke MessageBoxW, 0, addr buffer, addr buffer, 0
invoke ExitProcess, 0

end start

Keep in mind that the "text" is displayed correctly in Notepad++ (even in Doszip  :lol:), the forum however needs the Unicode string.
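
What the call with codepage 866 amounts to can be sketched in Python. This assumes the source text was "Заголовок" saved in OEM codepage 866; the garbled literal shown in the post appears to be those bytes viewed in a Western codepage, but that is a guess based on the consonants matching.

```python
text = "Заголовок"

oem_bytes = text.encode("cp866")            # bytes ml sees between the quotes
misread = oem_bytes.decode("cp1252")        # same bytes viewed as Western ANSI
wide = oem_bytes.decode("cp866").encode("utf-16-le")   # roughly what MultiByteToWideChar(866, ...) produces

print(oem_bytes.hex(" "))
print(misread)
print(wide.decode("utf-16-le"))
```

The assembler only copies the bytes; the codepage number passed to MultiByteToWideChar is what gives them their meaning, so it has to match the codepage the editor saved in.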