How important is Unicode?

sinsi · January 15, 2019, 06:48:57 PM

For our non-English speaking members: is EN-US good enough, or would you prefer your native language?
I am considering moving to all Unicode, but it is a bit more effort. Is the effort justified?

One problem is getting a good translation. Forget Google et al.; it needs a native speaker, IMHO.

aw27 · January 15, 2019, 07:36:01 PM

In my opinion, the people that matter understand enough English and prefer using the original versions, because translations are commonly bad.
However, another sort of problem may arise; one example is access to folders with native names. :(

jj2007 · January 15, 2019, 08:27:50 PM

For various reasons, I've invested quite a bit of time in this (see e.g. the discussions with our Norwegian friend). IMHO the answer is not simple. If you are going non-US-EN for a good reason, you already have three choices:
1. the user's ANSI subset
2. UTF-16
3. UTF-8

Option 1 may often work, but mixed sets as in the MsgBox below aren't possible, and communicating in a multilingual environment (like our forum, or a company) is difficult.
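To illustrate why mixed sets fail with option 1, here is a minimal Python sketch. It assumes Python's `cp1251` and `cp1253` codecs as stand-ins for the Windows Cyrillic and Greek ANSI code pages; `fits_codepage` is a hypothetical helper, not something from Masm32 or MasmBasic.

```python
# Sketch: a single ANSI code page cannot hold text from two different scripts.
# cp1251 / cp1253 are Python's codecs for Windows-1251 (Cyrillic) and Windows-1253 (Greek).

def fits_codepage(text: str, codepage: str) -> bool:
    """Return True if every character of text can be encoded in the given code page."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False

russian = "Привет"
greek = "Γειά"
mixed = russian + " " + greek

for label, text in (("russian", russian), ("greek", greek), ("mixed", mixed)):
    # Each single-script string fits its own code page; the mixed one fits neither.
    print(label, fits_codepage(text, "cp1251"), fits_codepage(text, "cp1253"))

# UTF-8 and UTF-16 can both encode the mixed string without error.
print(len(mixed.encode("utf-8")), len(mixed.encode("utf-16-le")))
```

The mixed string only becomes representable once you leave the single code page behind, which is the step options 2 and 3 take.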

Option 2, full Unicode, has been adopted by M$ for Windows but is not very common anywhere else

Option 3 is most frequently used on the web etc., but in the biggest single software market, China, UTF-8 takes more space than UTF-16: a typical Chinese character needs three bytes in UTF-8 versus two in UTF-16.
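The size trade-off between options 2 and 3 can be checked with a few lines of Python. This is only a sketch; `encoded_sizes` is a hypothetical helper, and the sample strings are illustrative.

```python
# Sketch: compare the byte size of the same text in UTF-8 and UTF-16.

def encoded_sizes(text: str) -> dict:
    """Return the character count and the encoded byte counts (no BOM, no terminator)."""
    return {
        "chars": len(text),
        "utf8": len(text.encode("utf-8")),
        "utf16": len(text.encode("utf-16-le")),
    }

chinese = "你好世界"       # four CJK characters from the Basic Multilingual Plane
english = "Hello, world"   # plain ASCII

print(encoded_sizes(chinese))  # {'chars': 4, 'utf8': 12, 'utf16': 8}
print(encoded_sizes(english))  # {'chars': 12, 'utf8': 12, 'utf16': 24}
```

For Chinese text UTF-8 is 1.5 times the size of UTF-16; for ASCII text UTF-16 is twice the size of UTF-8.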

My own choice was to keep all options open; for all programming stuff, commands like MsgBox, Print etc use Ansi; for user interaction, there are uXX and wXX versions. Another question is what the IDE supports, of course.

TimoVJL · January 15, 2019, 08:48:52 PM

If the text comes from resources, isn't UNICODE the better choice, since it is the native format there?

jj2007 · January 15, 2019, 11:08:39 PM

Sure, you can put it into resources, but why perform such acrobatics if a simple Print "Привет, Мир" does the job? It's a macro assembler after all.

aw27 · January 15, 2019, 11:20:02 PM

Always trolling, misinforming and deceiving. As far as I know, Print is a macro that calls some crappy undocumented MasmBasic library function.

jj2007 · January 16, 2019, 02:56:00 AM

I am so sorry, it should have been "print" with lowercase:

print "Вы никогда не должны кормить Хосе Паскоа"

Since this is plain Masm32, you must
a) use an advanced IDE to build the source (attached)
b) launch it from a command prompt
c) issue a chcp 65001, then
d) run donotfeed.exe (1024 bytes, this is purest 100% Real Men™ assembly code!)

... and then you see Вы никогда не должны кормить Хосе Паскоа.
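Why step c) matters can be sketched in Python. The sketch assumes that the string sits in the executable as UTF-8 bytes and reaches the console unchanged, and that the console's default code page is the OEM page 437; both are assumptions about the setup, not something verified against the attached source. `console_view` is a hypothetical helper.

```python
# Sketch: the same UTF-8 bytes, interpreted under two console code pages.

text = "Привет, Мир"
raw = text.encode("utf-8")   # assumed: these are the bytes written to the console

def console_view(raw_bytes: bytes, codepage: str) -> str:
    """Decode the bytes the way a console set to that code page would interpret them."""
    return raw_bytes.decode(codepage, errors="replace")

print(len(text), len(raw))           # 11 characters, 20 bytes
print(console_view(raw, "utf-8"))    # after chcp 65001: Привет, Мир
print(console_view(raw, "cp437"))    # assumed default OEM page: unreadable mojibake
```

The bytes never change; chcp 65001 only changes how the console interprets them.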

aw27 · January 16, 2019, 03:54:00 AM

Sure, but print is for console output, and it is not usual to have string resources in console applications. This implies that the suggestion was for a Windows application. So, your reply was not helpful. :(

jj2007 · January 16, 2019, 05:23:30 AM

You are confused, José. This is a console application, it doesn't have resources, and it works nonetheless. Take it easy. Open a bottle of good red wine. Listen to Louis Armstrong and relax.

Raistlin · January 16, 2019, 05:32:32 AM

Why are you guys bolding? 😁 Ermmmm, me too. But let's look again at the problem. UTF-8 works a higher percentage of the time. Let me translate... the world has an international language, and it is ASM.

felipe · January 16, 2019, 05:40:10 AM

:redface: And I was thinking that Unicode was as simple as putting something like __UNICODE__ equ 1 above your includes :idea:

Oh, I think I get your point, sinsi. Mmm, yes, a good translator is needed indeed. :idea:

jj2007 · January 16, 2019, 05:44:13 AM

Quote from: felipe on January 16, 2019, 05:40:10 AM
:redface: And I was thinking that Unicode was as simple as putting something like __UNICODE__ equ 1 above your includes :idea:

Maybe? Show me your version of print "Вы никогда не должны кормить Хосе Паскоа" 8)

felipe · January 16, 2019, 09:14:21 AM

well i don't use print yet...or unicode...

but maybe someday :idea:

hutch-- · January 16, 2019, 10:56:23 AM

I imagine it has something to do with your target. If you are pointing your software at a Chinese market, you use the text format that best suits the character display you want. If you are a European speaker of a language other than English, your OS language version will allow you to display text in your native character set. UNICODE is for cross character set display and while it does that job fine, the obvious trap is that it is twice the size.

It is easy enough to process (add 2 versus add 1, and allocate the character count x 2 for memory), but when big files are involved, UNICODE is a problem. If you have to process very large log files, the double size can run you out of memory.
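The "character count x 2" rule can be sketched in Python as follows. The helper names are hypothetical, and the sketch assumes a zero-terminated UTF-16 buffer. Note one caveat: characters outside the Basic Multilingual Plane (most emoji, for instance) take two 16-bit units, so the allocation has to count code units rather than characters.

```python
# Sketch: sizing a zero-terminated UTF-16 buffer.

def utf16_units(text: str) -> int:
    """Number of 16-bit code units the text occupies in UTF-16."""
    return len(text.encode("utf-16-le")) // 2

def utf16_buffer_bytes(text: str) -> int:
    """Bytes to allocate: two per code unit, plus a two-byte terminator."""
    return (utf16_units(text) + 1) * 2

log_line = "2019-01-16 10:56:23 INFO request served"

# For ASCII log text, UTF-16 doubles the payload compared with one byte per character.
print(len(log_line), len(log_line.encode("utf-8")), utf16_buffer_bytes(log_line))

# One character, but two code units (a surrogate pair) in UTF-16.
print(len("😁"), utf16_units("😁"))
```

For ASCII-only data such as the log line, the UTF-16 payload is exactly twice the one-byte-per-character size.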

jj2007 · January 16, 2019, 02:13:17 PM

Quote from: hutch-- on January 16, 2019, 10:56:23 AM
the obvious trap is that it is twice the size

For UTF16, yes. That's why the UTF8 representation of Unicode is so popular. Even the big log files on Chinese servers are mostly in English.

The MASM Forum
