Re: How to generate an Unicode string under MASM 6.15?

jj2007 · May 07, 2017, 01:20:24 AM

Quote from: hutch-- on May 06, 2017, 11:36:56 PM
> the UNICODE spec does not require it

Actually, there is no safe way to distinguish UTF-8 from "pure" ANSI. Most of the sources posted here would work exactly the same way if they were saved as UTF-8, simply because they don't contain any "exotic" characters. This is why BOMs make sense. What is really, really hard to understand is that neither MASM nor the Watcom clones take a few cycles to test whether there is a BOM to skip at the beginning of the source. These things have been around for several decades now...
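To illustrate the ambiguity, here is a minimal sketch in Python (not MASM), assuming that "ANSI" means Windows code page 1252. The sample bytes are made up for the demonstration. Pure ASCII source decodes identically either way, while a two-byte sequence can be valid in both encodings and still mean different things.

```python
# Pure ASCII source text: identical under UTF-8 and code page 1252.
ascii_only = b'db "Hello", 0'
same_text = ascii_only.decode("utf-8") == ascii_only.decode("cp1252")

# Two bytes that are valid in both encodings but read differently.
sample = b"\xc3\xa9"
as_utf8 = sample.decode("utf-8")    # one character: e with acute accent
as_ansi = sample.decode("cp1252")   # two characters: A with tilde, copyright sign

print(same_text, len(as_utf8), len(as_ansi))
```

Without a BOM, nothing in the bytes themselves says which reading was intended.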

TWell · May 07, 2017, 02:01:42 AM

Quote
> UTF-8 representation of the BOM is the (hexadecimal) byte sequence 0xEF,0xBB,0xBF

I think many people would like the idea of having that handled in hjwasm and asmc.
After that, Notepad is not evil anymore, and comments in a native language are not a problem.
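A rough sketch of the check being asked for, written in Python rather than inside an assembler. `strip_utf8_bom` is a hypothetical helper name, and this is not how hjwasm or asmc actually read their input; it only shows that the test is a three-byte prefix comparison.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return the source bytes without a leading UTF-8 BOM (EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# A file saved by Notepad as "UTF-8" would start with the BOM.
with_bom = b"\xef\xbb\xbf.data\r\n"
print(strip_utf8_bom(with_bom))
```

Files without a BOM pass through unchanged, so plain ASCII sources are not affected.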

nidud · May 07, 2017, 02:16:02 AM

deleted

hutch-- · May 07, 2017, 02:39:30 AM

Until there is wide acceptance of writing compilers and assemblers to read UNICODE, the methods that have been around since at least WinNT4 will keep doing the job. Code is still generally ASCII, and most assemblers/compilers even restrict the upper 128 characters. The exception, which has been this way since the earliest Win32, is RC.EXE, which has always been capable of UNICODE.

jj2007 · May 07, 2017, 02:48:05 AM

Quote from: hutch-- on May 07, 2017, 02:39:30 AM
> Until there is wide acceptance of writing compilers and assemblers to read UNICODE, the methods that have been around since at least WinNT4 will keep doing the job.

Absolutely.

And the handful of non-English coders who want to write their strings or comments in exotic alphabets can use RichMasm.

nidud · May 07, 2017, 03:02:52 AM

deleted

TWell · May 07, 2017, 03:11:52 AM

M$ C++ is not scared of UTF-8 with a BOM.
So this problem is inherited from ml.exe.

hutch-- · May 07, 2017, 03:14:36 AM

> And the handful of non-English coders who want to write their strings or comments in exotic alphabets can use RichMasm.

Or anything else that can write UNICODE to an RC script. With an ASCII editor you can do this.

.data
align 4
[rename me] \
dw "T","h","i","s"," ","i","s"," ","a"," ","t","e","s","t",0,0
.code

Or this,

; ANSI string of 16 bytes converted to UNICODE
; at 34 bytes using MultiByteToWideChar

[rename me] \
db 84,0,104,0,105,0,115,0,32,0,105,0,115,0,32,0
db 97,0,32,0,116,0,101,0,115,0,116,0,13,0,10,0
db 0,0
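For reference, a table like the one above can be generated rather than typed by hand. This is a minimal sketch in Python that uses Python's own UTF-16LE encoder instead of MultiByteToWideChar; `utf16_db_lines` is a hypothetical helper name.

```python
def utf16_db_lines(text, per_line=16):
    """Encode text as UTF-16LE plus a 16-bit terminator and format it as db lines."""
    data = text.encode("utf-16-le") + b"\x00\x00"
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("db " + ",".join(str(b) for b in chunk))
    return lines

# 16 characters in, 34 bytes out (32 data bytes plus the terminator).
for line in utf16_db_lines("This is a test\r\n"):
    print(line)
```

Paste the printed lines after a label of your choice in the .data section.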

Or this in a UNICODE RC script.

STRINGTABLE
BEGIN
250, "早上好计算机程序员。\0"
251, "おはようのコンピュータのプログラマー。\0"
252, "Хороший программист утром.\0"
253, "Καλή προγραμματιστής ηλεκτρονικών υπολογιστών πρωί.\0"
254, "सुप्रभात कंप्यूटर प्रोग्रामर.\0"
255, "Chào buổi sáng lập trình máy tính.\0"
256, "დილა მშვიდობისა, კომპიუტერული პროგრამისტი.\0"
257, "Добро јутро компјутерски програмер.\0"
258, "Բարի լույս ծրագրավորող.\0"
259, "안녕하세요 컴퓨터 프로그래머.\0"
END
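If your editor cannot save UNICODE, the script bytes can be produced programmatically. A minimal sketch in Python, assuming the resource compiler is given UTF-16LE with a leading BOM; `encode_rc_script` is a hypothetical helper name and the one-entry string table is only an illustration.

```python
import codecs

def encode_rc_script(text):
    """Return the script as UTF-16LE bytes preceded by the FF FE byte order mark."""
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")

script = (
    "STRINGTABLE\r\n"
    "BEGIN\r\n"
    '  250, "早上好计算机程序员。\\0"\r\n'
    "END\r\n"
)
data = encode_rc_script(script)
print(len(data), data[:2])
```

Writing `data` to a file opened in binary mode gives a script that can be passed to the resource compiler.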

jj2007 · May 07, 2017, 03:51:32 AM

Quote from: nidud on May 07, 2017, 03:02:52 AM
Quote from: jj2007 on May 07, 2017, 02:48:05 AM
And the handful of non-English coders who want to write their strings or comments in exotic alphabets can use RichMasm.

Why do they have to use RichMasm?

Oops, that is a misunderstanding - see attachment.

include \masm32\MasmBasic\MasmBasic.inc  ; Вы знаете, так, где же найти эту библиотеку.
  Init
  uMsgBox 0, "真的，没有人是被迫使用高级编辑。", "Important message:", MB_OK
EndOfCode

(if you can't see the text correctly, try an advanced browser like Firefox, Edge, MSIE, Safari, Opera, Chrome or Vivaldi - there are a few that can display non-Latin alphabets)

nidud · May 07, 2017, 04:31:14 AM

deleted

jj2007 · May 07, 2017, 04:37:49 AM

Why would any sane person want to work with such "text"?
What would you do if your browser showed you such gibberish?

nidud · May 07, 2017, 04:59:42 AM

deleted

jj2007 · May 07, 2017, 05:04:06 AM

Quote from: nidud on May 07, 2017, 04:59:42 AM
> The question is then: why do YOU use it?

The question is: Why do ALL browsers display all kinds of exotic languages correctly? Wouldn't it be so much easier if your browser displayed only the Norwegian subset correctly, and showed you gibberish if, for whatever strange reason, you visited French or German or Russian websites?

Hey, it's 2017. Unicode was invented in the 20th Century.

nidud · May 07, 2017, 05:51:45 AM

deleted

nidud · May 07, 2017, 06:55:26 AM

deleted

The MASM Forum
