### Re: How to generate an Unicode string under MASM 6.15?

#### nidud

• Member
• Posts: 1424
##### Re: How to generate an Unicode string under MASM 6.15?
« on: May 05, 2017, 11:15:13 PM »
> Yes, thanks, after adding "__UNICODE__ equ 1" ML began to produce Unicode builds, but it now also gives errors even though I declare the strings through TCHAR:

ML accepts a string of any length after DB, but after a larger type only a string up to the byte size of that type:
```
 db "large byte array"
 dw "ab"
 dd "abcd"
```
The last two are converted to little-endian numbers, so you can't use TCHAR to define WORD-sized string arrays in MASM. Asmc allows declarations of Unicode and ASCII string arrays in some cases, but this is restricted to keep the code compatible with MASM.

```
 invoke strchr,eax,"c" ; MASM compatible: number
 strcat(eax, "c")      ; converted to string

 dw "ab"               ; MASM compatible: number
 option wstring:on
 dw "ab"               ; converted to Unicode string
```
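The difference between the two `dw "ab"` results is easiest to see as raw bytes. Here is a minimal sketch in Python, used only because it makes the bytes easy to print. The helper names are made up for this illustration. It assumes the usual MASM rule that a string in a `dw` initializer is read as one number with the first character in the high byte.

```python
import struct

def dw_as_number(text: str) -> bytes:
    """Bytes emitted for `dw "ab"` when the string is read as one 16-bit number."""
    value = int.from_bytes(text.encode("ascii"), "big")  # "ab" -> 0x6162
    return struct.pack("<H", value)                      # stored little-endian

def dw_as_wide_string(text: str) -> bytes:
    """Bytes for a UTF-16LE string: one 16-bit unit per character."""
    return text.encode("utf-16-le")

print(dw_as_number("ab").hex(" "))       # 62 61
print(dw_as_wide_string("ab").hex(" "))  # 61 00 62 00
```

So the "number" form puts the characters in memory in reverse order, which is why it cannot stand in for a wide string.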
Example

```
include stdio.inc
include stdlib.inc
include tchar.inc

 .data
string TCHAR "TCHAR string",10,0

 .code

_tmain proc _CDecl

 _tprintf(addr string)
 ret

_tmain endp

 end _tstart
```
This will work with Unicode/ASCII, 32/64-bit
```
asmc -pe -D__PE__ test.asm
asmc -pe -ws -D_UNICODE -D__PE__ test.asm
asmc -pe -D_WIN64 -D__PE__ test.asm
asmc -pe -D_WIN64 -ws -D_UNICODE -D__PE__ test.asm
```

#### aw27

• Member
• Posts: 1036
• Let's Make ASM Great Again!
##### Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #1 on: May 06, 2017, 01:13:35 AM »
Another example with a mix of various languages. All you need is an editor that saves in UTF-8; most will do. Tested with JWASM/HJWASM.

```
.386
.MODEL FLAT, C
option casemap:none

CP_UTF8 equ 65001
MB_OK equ 0
NULL equ 0

option dllimport:<kernel32.dll>
MultiByteToWideChar PROTO STDCALL :DWORD,:DWORD,:DWORD,:DWORD,:DWORD,:DWORD
ExitProcess   proto :dword
option dllimport:<user32.dll>
MessageBoxW PROTO STDCALL :DWORD,:DWORD,:DWORD,:DWORD

.data
; The four strings are contiguous; only the last one is terminated,
; so the conversion below picks up all of them in one go.
sKorean db "Korean: 한자"," "
sJapanese db "Japanese: 漢字"," "
sChinese db "Chinese: 汉字"," "
sRussian db "Russian: Прощай", 0,0
sCaption dw "L","a","n","g","u","a","g","e"," ","S","a","l","a","d",0,0

.data?
myBuffer      dw 256 dup(?)     ; 256 wide chars, matching the size passed below

.code

start proc
 ; UTF-8 source bytes -> UTF-16 for the W API; the last argument counts wide chars
 invoke MultiByteToWideChar, CP_UTF8, 0, offset sKorean, -1, offset myBuffer, 256
 invoke MessageBoxW, NULL, addr myBuffer, addr sCaption, MB_OK
 invoke  ExitProcess, 0
 ret
start endp

end start
```
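For anyone wondering what sizes are involved in that conversion, the encoding step can be mimicked in Python. This is only a sketch of the re-encoding, not of the API itself, and `utf8_to_wide` is a name made up here. The point it illustrates is that the last argument of `MultiByteToWideChar` counts 16-bit characters, not bytes.

```python
def utf8_to_wide(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as UTF-16LE, the layout MessageBoxW expects."""
    return data.decode("utf-8").encode("utf-16-le")

# What `db "Korean: 한자"` holds when the source file is saved as UTF-8
source = "Korean: 한자".encode("utf-8")
wide = utf8_to_wide(source)

print(len(source))     # bytes in the UTF-8 string
print(len(wide) // 2)  # 16-bit units the output buffer must hold
print(len(wide))       # bytes the output buffer must hold
```

The byte count of the output is twice the character count for text like this, so a buffer declared in bytes holds only half as many wide characters as its size suggests.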

#### TWell

• Member
• Posts: 748
##### Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #2 on: May 06, 2017, 02:58:51 AM »
W0LF,
npp defaults to UTF-8 and your examples contained Cyrillic text, so was it really 8-bit ANSI text in KOI8 or Windows cp1251?
In the linked picture I wasn't able to see any Cyrillic text, so do I have to buy better eyeglasses or just more beer?

Others:
jwasm and others should accept the fact that UTF-8 is the default for source files these days, as comments need the local language.

When will we see a real UNICODE-capable assembler?
Do we have to wait another 20 years?

ml[64] is just a vintage product, a barrier to further development.

#### jj2007

• Member
• Posts: 7990
• Assembler is fun ;-)
##### Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #3 on: May 06, 2017, 04:45:25 AM »
> My code works even without __UNICODE__ with Russian symbols.

It works with ANSI if your machine's codepage is Cyrillic. My example works on all machines because it uses UTF-8.

#### newrobert

• Regular Member
• Posts: 38
##### Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #4 on: May 06, 2017, 10:59:03 AM »
> W0LF,
> npp defaults to UTF-8 and your examples contained Cyrillic text, so was it really 8-bit ANSI text in KOI8 or Windows cp1251?
> In the linked picture I wasn't able to see any Cyrillic text, so do I have to buy better eyeglasses or just more beer?
>
> Others:
> jwasm and others should accept the fact that UTF-8 is the default for source files these days, as comments need the local language.
>
> When will we see a real UNICODE-capable assembler?
> Do we have to wait another 20 years?
>
> ml[64] is just a vintage product, a barrier to further development.

20 years too long.

#### jj2007

• Member
• Posts: 7990
• Assembler is fun ;-)
##### Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #5 on: May 06, 2017, 11:41:02 AM »
> jwasm and others should accept the fact that UTF-8 is default in source files in these days, as comments needs local language.

My examples above work because all Masm-compatible assemblers accept UTF-8. Open the attached *.asm source in a hex editor and check for yourself.

```
.data
sCaption db "Заголовок", 0
sText db "Текст на русском!", 0
```
Btw it builds also with qEditor, although it looks like garbage. In RichMasm you can actually read the Russian text, and you can save edits.

#### aw27

• Member
• Posts: 1036
• Let's Make ASM Great Again!
##### Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #6 on: May 06, 2017, 05:00:38 PM »
> My examples above work because all Masm-compatible assemblers accept UTF-8.

The assemblers simply don't know; they believe they are dealing with ASCII, and it works in almost every case. If the UTF-8 file has a BOM, that alone is enough for the assembler to reject the file.
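The reason a byte-oriented assembler gets away with this: UTF-8 encodes every non-ASCII character using only bytes in the range 0x80 to 0xFF, so no stray quote, comma, or line break can appear inside a `db "..."` string. A small Python check, offered as a sketch (the function name `assembler_safe` is made up for this illustration):

```python
def assembler_safe(text: str) -> bool:
    """True if no non-ASCII character yields a byte in the ASCII range (< 0x80)."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1 and any(b < 0x80 for b in encoded):
            return False
    return True

caption = "Заголовок"
print(caption.encode("utf-8").hex(" "))  # two bytes per Cyrillic letter, all 0x80 or above
print(assembler_safe(caption))
```

The assembler just copies those bytes into the data section; whether they end up readable depends on what the program does with them at run time.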

#### hutch--

• Member
• Posts: 5095
• Mnemonic Driven API Grinder
##### Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #7 on: May 06, 2017, 06:12:58 PM »

> Btw it builds also with qEditor, although it looks like garbage.

That just says that the method you use does not work in Quick Editor. Look in the example code for working UNICODE applications using the orthodox Microsoft method. Then there is the QE accessory "MultiTool" that will do conversions for you, and if all else fails, UniEdit lets you write anything you like in UNICODE, which you then place in a UNICODE RC script.

QE is a pure ASCII editor; it does not pretend to write UTF-8/16 or UNICODE.
hutch at movsd dot com
http://www.masm32.com

#### jj2007

• Member
• Posts: 7990
• Assembler is fun ;-)
##### Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #8 on: May 06, 2017, 07:19:50 PM »
> The assemblers simply don't know, they believe they are dealing with ASCII and it works in almost every case.

Exactly. And I have yet to see a case where it didn't work.
```
include \masm32\include\masm32rt.inc
include uChrMacro.inc

.code
start:
  push MB_OK
  push uChr$("Заголовок!")
  push uChr$("Текст на русском!")
  push 0
  call MessageBoxW
  exit
end start
```
Pure Masm32. It looks simple enough, right? Of course, there are more complicated solutions.

#### aw27

• Member
• Posts: 1036
• Let's Make ASM Great Again!
##### Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #9 on: May 06, 2017, 09:10:08 PM »
> Exactly. And I have yet to see a case where it didn't work.

#### nidud

• Member
• Posts: 1424
##### Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #10 on: May 06, 2017, 10:18:26 PM »
> The assemblers simply don't know, they believe they are dealing with ASCII and it works in almost every case.
>
> Exactly. And I have yet to see a case where it didn't work.
> ```
> include \masm32\include\masm32rt.inc
> include uChrMacro.inc
>
> .code
> start:
>   push MB_OK
>   push uChr$("Заголовок!")
>   push uChr$("Текст на русском!")
>   push 0
>   call MessageBoxW
>   exit
> end start
> ```
> Pure Masm32. It looks simple enough, right? Of course, there are more complicated solutions.

Copy and paste into Notepad, Save...

> This file contains characters in Unicode format which will be lost if you save this file as an ANSI encoded text file. To keep the Unicode information, click Cancel below and select one of the Unicode options from the Encoding drop down list. Continue?

> My examples above work because all Masm-compatible assemblers accept UTF-8.

save it as UTF-8 then...
```
я╗┐include \masm32\include\masm32rt.inc
 error A2008: syntax error : я╗┐include
```
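Those three odd characters in front of `include` are the UTF-8 byte order mark (bytes EF BB BF) as a console using code page 866 displays them. A short Python sketch shows both that and a way to strip the mark before assembling. It assumes the console code page is 866, and `strip_utf8_bom` is a helper name made up for this illustration.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM (EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# First line of the file as Notepad writes it with the "UTF-8" encoding option
saved_by_notepad = codecs.BOM_UTF8 + b"include \\masm32\\include\\masm32rt.inc\r\n"

print(codecs.BOM_UTF8.hex(" "))          # ef bb bf
print(codecs.BOM_UTF8.decode("cp866"))   # the BOM bytes as a cp866 console shows them
print(strip_utf8_bom(saved_by_notepad))  # the line starts with b'include' again
```

Saving as UTF-8 without a BOM in the editor achieves the same thing without a separate step.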

#### aw27

• Member
• Posts: 1036
• Let's Make ASM Great Again!
##### Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #11 on: May 06, 2017, 10:32:43 PM »
> error A2008: syntax error : я╗┐include

The BOMs suck
Intelligent editors don't need a BOM to find out that the text is UTF8.

#### jj2007

• Member
• Posts: 7990
• Assembler is fun ;-)
##### Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #12 on: May 06, 2017, 11:16:44 PM »
> Intelligent editors don't need a BOM to find out that the text is UTF8.

I would even go one step further: Intelligent editors know when to put a BOM and when to save it as UTF-8 without BOM

(more precisely: ml & clones can't stand the BOM, rc.exe likes it)

#### hutch--

• Member
• Posts: 5095
• Mnemonic Driven API Grinder
##### Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #13 on: May 06, 2017, 11:36:56 PM »
Bottom line is the UNICODE spec does not require it; it's there for text that may be used on other hardware that uses a different byte order. The UNICODE editor I supply specifically does not have it, as it's designed for Windows UNICODE only.
hutch at movsd dot com
http://www.masm32.com

#### nidud

• Member
• Posts: 1424
##### Re: Re: How to generate an Unicode string under MASM 6.15?
« Reply #14 on: May 06, 2017, 11:42:03 PM »
> Intelligent editors don't need a BOM to find out that the text is UTF8.

Maybe intelligent editors don't need UTF-8.

> Quote from: WOLF
> I use Notepad++ with ansi codepage.

> Pure Masm32. It looks simple enough, right? Of course, there are more complicated solutions.

> How to generate an Unicode string under MASM 6.15?

So in real MASM:
```
; Build: ml /c /coff test.asm
; link /subsystem:console /libpath:\masm32\lib kernel32.lib user32.lib test.obj
;
 .386
 .model flat, stdcall

ExitProcess proto :dword
MultiByteToWideChar proto :dword, :dword, :ptr, :dword, :ptr, :dword
MessageBoxW proto :ptr, :ptr, :ptr, :dword

 .data
 str_ru db "‡a£o«o¢oª",0
 buffer db 128 dup(0)

 .code
start:
 invoke MultiByteToWideChar, 866, 0, addr str_ru, lengthof str_ru, addr buffer, lengthof str_ru
 invoke MessageBoxW, 0, addr buffer, addr buffer, 0
 invoke ExitProcess, 0

 end start
```
Keep in mind that the "text" is displayed correctly in Notepad++ (even in Doszip); the forum, however, needs the Unicode string.
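To see why the string looks like garbage here but is fine in an OEM editor, the same bytes can be viewed under different code pages. The following Python sketch assumes the source file was saved in OEM code page 866 and that the intended word is "Заголовок", as in the earlier examples in this thread.

```python
title = "Заголовок"

# One byte per letter: what the .asm file holds when saved in code page 866
oem_bytes = title.encode("cp866")
print(oem_bytes.hex(" "))

# The same bytes interpreted as Windows-1252, roughly what a Western web page shows
print(oem_bytes.decode("cp1252"))

# The UTF-16LE result a conversion from code page 866 should produce
print(oem_bytes.decode("cp866").encode("utf-16-le").hex(" "))
```

The bytes in the file never change; only the code page used to interpret them does, which is why the code page number passed to `MultiByteToWideChar` has to match the one the editor saved in.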