I think the big problem with full UTF-8 / Unicode support is that internally the string comparisons, symbol table, tokenizer and parser have all been designed around ASCII only. Changing it to make it fully portable would probably require a rather major rewrite.
It is not a big problem, simply because we need to distinguish two entirely different things:
1. Do we need non-Latin chars in symbols, labels and commands?
2. Do we need non-Latin chars in strings?
Version 1 is this:
Печать "Hello World"
Version 2 is this:
print "Привет Мир"
1. With current assemblers, Печать "Hello World" is impossible. No problem, nobody needs that. Russian programmers do not use Russian commands in their code (if you don't agree, gimme a link to at least one example); they use English, because that is the language of their compiler. Same for Chinese, Arabic, Japanese and North Korean coders.
2. With current assemblers, print "Привет Мир" is possible. Even the old Masm 6.14 that comes along with the Masm32 SDK can flawlessly produce code that displays Привет Мир a) in the editor, and b) in the console.
It is not a problem of the assembler: that is just a dumb tool, and it has never taken any interest in the stupid things that coders put "inside quotes". To the assembler they are Chinese anyway (no insult intended, I like Chinese characters).
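To make the "dumb tool" point concrete, here is a rough sketch in Python of how an ASCII-only tokenizer can still carry UTF-8 text inside quotes. This is not how Masm actually works internally; `tokenize` and `ASCII_IDENT` are made-up names for illustration, and the only assumption is that the source file is saved as UTF-8.

```python
# Hypothetical toy tokenizer, NOT Masm's real implementation.
# Identifiers must be plain ASCII; whatever sits between quotes is copied as raw bytes.
ASCII_IDENT = set(b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_")

def tokenize(line: bytes):
    """Split one source line into ('ident', bytes) and ('string', bytes) tokens."""
    tokens = []
    i = 0
    while i < len(line):
        c = line[i]
        if c in b" \t":
            i += 1                                  # skip whitespace
        elif c == ord('"'):
            end = line.index(b'"', i + 1)           # ValueError if the quote never closes
            tokens.append(("string", line[i + 1:end]))  # bytes copied untouched
            i = end + 1
        elif c in ASCII_IDENT:
            j = i
            while j < len(line) and line[j] in ASCII_IDENT:
                j += 1
            tokens.append(("ident", line[i:j]))
            i = j
        else:
            # Any byte >= 0x80 (e.g. the start of a Cyrillic letter) ends up here.
            raise ValueError(f"byte 0x{c:02X} is not allowed outside quotes")
    return tokens

# Version 2 passes: the tokenizer never looks inside the quotes.
tokens = tokenize('print "Привет Мир"'.encode("utf-8"))
print(tokens[1][1].decode("utf-8"))

# Version 1 fails on the very first byte of the Cyrillic command.
try:
    tokenize('Печать "Hello World"'.encode("utf-8"))
except ValueError as err:
    print("rejected:", err)
```

The sketch works because every byte of a multi-byte UTF-8 character is 0x80 or higher, so the ASCII quote byte 0x22 can never show up in the middle of one; scanning for the closing quote byte by byte is safe.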
Conclusion: THERE IS NO PROBLEM, except perhaps that very few editors can handle UTF-8 correctly 8)