Re: How to generate an Unicode string under MASM 6.15?

nidud · May 05, 2017, 11:15:13 PM

deleted

aw27 · May 06, 2017, 01:13:35 AM

Another example with a mix of various languages. All you need is an editor that saves in UTF-8; most will do. Tested with JWASM/HJWASM.

Code Select


.386

.MODEL FLAT, C
option casemap:none

CP_UTF8 equ 65001
MB_OK equ 0
NULL equ 0
option dllimport:<kernel32.dll>
MultiByteToWideChar PROTO STDCALL :DWORD,:DWORD,:DWORD,:DWORD,:DWORD,:DWORD
ExitProcess PROTO STDCALL :DWORD
option dllimport:<user32.dll>
MessageBoxW PROTO STDCALL :DWORD,:DWORD,:DWORD,:DWORD

.data


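; The four db lines below form one contiguous UTF-8 string; only the last one is NUL-terminated.
sKorean db "Korean: 한자"," "
sJapanese db "Japanese: 漢字"," "
sChinese db "Chinese: 汉字"," "
sRussian db "Russian: Прощай", 0,0
; The caption is stored directly as UTF-16, one 16-bit word per ASCII character.
sCaption dw "L","a","n","g","u","a","g","e"," ","S","a","l","a","d",0,0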

.data?
myBuffer      dw 256 dup(?)   ; 256 UTF-16 code units: the last MultiByteToWideChar argument counts characters, not bytes
    
.code 
     
start proc
	invoke MultiByteToWideChar, CP_UTF8, 0, offset sKorean, -1, offset myBuffer, 256
	invoke MessageBoxW, NULL, addr myBuffer, addr sCaption, MB_OK
	invoke  ExitProcess, 0
	ret
start endp

end start
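
For anyone wondering what the MultiByteToWideChar call with CP_UTF8 does to those bytes, here is a rough portable C sketch of the same kind of conversion. It is not the Windows implementation: `utf8_to_utf16` is a made-up helper name, and it assumes well-formed input (it does not reject overlong forms or encoded surrogates). Note that, as with the API, the capacity is counted in 16-bit units, not bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a Windows API: converts a NUL-terminated UTF-8
   string to UTF-16 code units. Returns the number of units written,
   including the terminating 0, or 0 if the input is malformed or the
   output buffer (capacity in 16-bit units, not bytes) is too small. */
size_t utf8_to_utf16(const unsigned char *src, uint16_t *dst, size_t capacity)
{
    size_t out = 0;
    assert(src != NULL && dst != NULL);

    for (;;) {
        uint32_t cp;
        int extra;                      /* continuation bytes still expected */
        unsigned char b = *src++;

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return 0;                  /* stray continuation or invalid lead byte */

        while (extra-- > 0) {
            unsigned char c = *src++;
            if ((c & 0xC0) != 0x80) return 0;   /* also catches an early NUL */
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp > 0x10FFFF) return 0;

        if (cp >= 0x10000) {            /* outside the BMP: emit a surrogate pair */
            if (out + 2 > capacity) return 0;
            cp -= 0x10000;
            dst[out++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[out++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            if (out + 1 > capacity) return 0;
            dst[out++] = (uint16_t)cp;
        }
        if (b == 0) return out;         /* the terminator has just been copied */
    }
}
```

Feeding it the bytes of "AП한" (41, D0 9F, ED 95 9C) gives the four units 0041, 041F, D55C, 0000, which is the layout MessageBoxW expects.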

TWell · May 06, 2017, 02:58:51 AM

W0LF,
npp defaults to UTF-8 and your examples contained Cyrillic text, so was it really 8-bit ANSI text in KOI8 or Windows cp 1251?
I wasn't able to see any Cyrillic text in the linked picture, so do I have to buy better eyeglasses or just more beer?

Others:
JWasm and the others should accept the fact that UTF-8 is the default for source files these days, as comments need the local language.
How about just warning about the BOM?

When will we see a real Unicode-capable assembler?
Do we have to wait another 20 years?

ml[64] is just a vintage product, a barrier for further development ;)

jj2007 · May 06, 2017, 04:45:25 AM

Quote from: W0LF on May 05, 2017, 10:15:09 PM
My code works even without __UNICODE__ with russian symbols.

It works with ANSI if your machine's code page is Cyrillic. My example works on all machines because it uses UTF-8.
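
To make the code page point concrete, here is a small C sketch of why ANSI text depends on the machine: the same two bytes decode to different characters under Windows-1251 and Windows-1252. The function and constant names are invented for illustration, and only ASCII plus the basic Cyrillic letters (1251) and the Latin-1 range (1252) are covered; everything else is mapped to U+FFFD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CP_1251 = 1251, CP_1252 = 1252 };   /* illustrative constants */

/* Hypothetical helper: decodes len bytes into code points under the given
   single-byte code page. Bytes this sketch does not cover become U+FFFD. */
void decode_single_byte(const unsigned char *src, size_t len,
                        int code_page, uint32_t *dst)
{
    size_t i;
    assert(src != NULL && dst != NULL);

    for (i = 0; i < len; i++) {
        unsigned char b = src[i];
        if (b < 0x80)
            dst[i] = b;                               /* ASCII is shared */
        else if (code_page == CP_1251 && b >= 0xC0)
            dst[i] = 0x0410u + (uint32_t)(b - 0xC0);  /* U+0410..U+044F */
        else if (code_page == CP_1252 && b >= 0xA0)
            dst[i] = b;                               /* same as Latin-1 */
        else
            dst[i] = 0xFFFD;
    }
}
```

The bytes CF F0 come out as U+041F U+0440 ("Пр") under 1251 but as U+00CF U+00F0 ("Ïð") under 1252. The UTF-8 form D0 9F D1 80 means the same thing on every machine.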

newrobert · May 06, 2017, 10:59:03 AM

Quote from: TWell on May 06, 2017, 02:58:51 AM
W0LF,
npp defaults to UTF-8 and your examples contained Cyrillic text, so was it really 8-bit ANSI text in KOI8 or Windows cp 1251?
I wasn't able to see any Cyrillic text in the linked picture, so do I have to buy better eyeglasses or just more beer?

Others:
JWasm and the others should accept the fact that UTF-8 is the default for source files these days, as comments need the local language.
How about just warning about the BOM?

When will we see a real Unicode-capable assembler?
Do we have to wait another 20 years?

ml[64] is just a vintage product, a barrier for further development ;)

20 years too long.

jj2007 · May 06, 2017, 11:41:02 AM

Quote from: TWell on May 06, 2017, 02:58:51 AM
JWasm and the others should accept the fact that UTF-8 is the default for source files these days, as comments need the local language.
How about just warning about the BOM?

My examples above work because all Masm-compatible assemblers accept UTF-8. Open the attached *.asm source in a hex editor and check for yourself.

Code Select

.data
sCaption	db "Заголовок", 0
sText		db "Текст на русском!", 0

Btw it builds also with qEditor, although it looks like garbage. In RichMasm you can actually read the Russian text, and you can save edits.

aw27 · May 06, 2017, 05:00:38 PM

Quote from: jj2007 on May 06, 2017, 11:41:02 AM
My examples above work because all Masm-compatible assemblers accept UTF-8.

The assemblers simply don't know, they believe they are dealing with ASCII and it works in almost every case. If the UTF-8 file has a BOM, that alone is enough for the assembler to reject the file.
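
If your editor insists on writing a BOM, one workaround is to strip it in a pre-build step. A minimal C sketch of the core of such a step follows; `strip_utf8_bom` is a hypothetical name, and reading and writing the file around it is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes a leading UTF-8 BOM (EF BB BF) from buf
   in place and returns the new length. Buffers without a BOM are left
   untouched. */
size_t strip_utf8_bom(unsigned char *buf, size_t len)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };
    assert(buf != NULL);

    if (len >= 3 && memcmp(buf, bom, 3) == 0) {
        memmove(buf, buf + 3, len - 3);   /* shift the text over the BOM */
        return len - 3;
    }
    return len;
}
```

After this the first bytes the assembler sees are plain ASCII again, so the remaining UTF-8 bytes inside string literals pass through as opaque data.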

hutch-- · May 06, 2017, 06:12:58 PM

> Btw it builds also with qEditor, although it looks like garbage.

That just says that the method you use does not work in Quick Editor. Look in the example code for working UNICODE applications using the orthodox Microsoft method. Then there is the QE accessory "MultiTool" that will do conversions for you, and if all else fails, UniEdit lets you write anything you like in UNICODE, which you then place in a UNICODE RC script.

QE is a pure ASCII editor; it does not pretend to write UTF8/16 or UNICODE.

jj2007 · May 06, 2017, 07:19:50 PM

Quote from: aw27 on May 06, 2017, 05:00:38 PM
The assemblers simply don't know, they believe they are dealing with ASCII and it works in almost every case.

Exactly. And I have yet to see a case where it didn't work.

Code Select

include \masm32\include\masm32rt.inc
include uChrMacro.inc

.code
start:
  push MB_OK
  push uChr$("Заголовок!")
  push uChr$("Текст на русском!")
  push 0
  call MessageBoxW
  exit
end start

Pure Masm32. It looks simple enough, right? Of course, there are more complicated solutions.

aw27 · May 06, 2017, 09:10:08 PM

Quote from: jj2007 on May 06, 2017, 07:19:50 PM
Exactly. And I have yet to see a case where it didn't work.

Your MasmBasic is amazing. Congratulations :t

nidud · May 06, 2017, 10:18:26 PM

deleted

aw27 · May 06, 2017, 10:32:43 PM

Quote from: nidud on May 06, 2017, 10:18:26 PM
error A2008: syntax error : я╗┐include

The BOMs suck

Intelligent editors don't need a BOM to find out that the text is UTF8.
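
How an editor might guess this without a BOM: text in a legacy code page almost never happens to be well-formed UTF-8, so a structural check is a usable heuristic. The C sketch below is one way to do that check and is not taken from any particular editor; `looks_like_utf8` is an invented name. Pure ASCII passes too, since ASCII is valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if buf[0..len) is well-formed UTF-8,
   else 0. Rejects truncated sequences, overlong forms, encoded
   surrogates and values above U+10FFFF. */
int looks_like_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    assert(buf != NULL);

    while (i < len) {
        unsigned char b = buf[i];
        size_t extra, k;
        unsigned long cp, min;

        if (b < 0x80) { i++; continue; }                /* ASCII */
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return 0;                                  /* invalid lead byte */

        if (len - i <= extra) return 0;                 /* sequence is cut off */
        for (k = 1; k <= extra; k++) {
            unsigned char c = buf[i + k];
            if ((c & 0xC0) != 0x80) return 0;           /* not a continuation */
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF) return 0;        /* overlong or too large */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;     /* surrogate range */
        i += extra + 1;
    }
    return 1;
}
```

For example, the cp1251 bytes CF F0 EE fail immediately, because CF announces a two-byte sequence and F0 is not a continuation byte.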

jj2007 · May 06, 2017, 11:16:44 PM

Quote from: aw27 on May 06, 2017, 10:32:43 PM
Intelligent editors don't need a BOM to find out that the text is UTF8.

I would even go one step further: intelligent editors know when to put a BOM and when to save as UTF-8 without a BOM.

(more precisely: ml & clones can't stand the BOM, rc.exe likes it)

hutch-- · May 06, 2017, 11:36:56 PM

Bottom line is the UNICODE spec does not require it; it's there for text that may be used on other hardware that uses a different byte order. The UNICODE editor I supply specifically does not have it, as it's designed for Windows UNICODE only.
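
As a sketch of the byte-order role described here: U+FEFF written as UTF-16 comes out as FF FE on little-endian hardware and FE FF on big-endian, so a reader can tell the two apart from the first two bytes. The enum and function names below are made up for illustration, and UTF-32 BOMs are deliberately ignored (FF FE 00 00 would be reported as UTF-16 LE).

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative result codes, not from any real library. */
enum bom_kind { BOM_NONE, BOM_UTF8, BOM_UTF16_LE, BOM_UTF16_BE };

/* Hypothetical helper: classifies the BOM, if any, at the start of buf. */
enum bom_kind detect_bom(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;        /* carries no byte-order information */
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16_LE;    /* low byte first, as on x86 Windows */
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16_BE;    /* high byte first */
    return BOM_NONE;
}
```

In UTF-8 the byte order is fixed, so the EF BB BF prefix acts only as a signature.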

nidud · May 06, 2017, 11:42:03 PM

deleted

The MASM Forum
