Hi,
I am having difficulty working out how to declare a Unicode string in MASM 6.15 with a simple directive (like DUS in GoAsm, for example). I know the classic approach of a lodsb/stosw loop (with ah=0), but I would rather not have to generate code. I read a post from ragdog where he discussed WSTR, but I did not understand it, even after consulting the Microsoft programming manual. Can someone give me a solution?
Thank you in advance!
PS: I'm afraid my English is disastrous...
Hi Iznogoode,
There are many solutions, see e.g. Unicode and displaying non-Latin alphabets (http://masm32.com/board/index.php?topic=4891.0).
The lodsb+stosw trick works only for pseudo-Unicode (text whose characters all fit in one byte, such as plain ASCII), as you certainly know. This is what sits under the hood of the Masm32 macros uc$() and uni$(), and of the MasmBasic macro wChr$() (http://www.webalice.it/jj2006/MasmBasicQuickReference.htm#Mb1250).
Even this works: Print (http://www.webalice.it/jj2006/MasmBasicQuickReference.htm#Mb1244) "Введите текст здесь", CrLf$
What is your specific objective here? Short code, simple code, fast code, ...?
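For readers who have not seen the lodsb+stosw trick mentioned above, here is a minimal sketch of it. It assumes the MASM32 SDK is installed in \masm32 and that its include files declare MessageBoxW; the names szAnsi, wszBuf and widen are made up for this example. It only produces correct text for plain ASCII input.

```asm
include \masm32\include\masm32rt.inc

.data
szAnsi  db "Hello", 0          ; 8-bit source string (ASCII only)

.data?
wszBuf  dw 32 dup(?)           ; destination: room for 32 16-bit characters

.code
start:
    mov esi, offset szAnsi     ; source pointer for lodsb
    mov edi, offset wszBuf     ; destination pointer for stosw
    xor eax, eax               ; ah = 0, so every stored word is 00xxh
    cld                        ; make lodsb/stosw move forward
widen:
    lodsb                      ; al = next source byte
    stosw                      ; store ax (the zero-extended byte) as one word
    test al, al
    jnz widen                  ; stop after the zero terminator has been copied
    invoke MessageBoxW, NULL, addr wszBuf, addr wszBuf, MB_OK
    invoke ExitProcess, 0
end start
```

Because each byte is simply zero-extended, a Cyrillic string stored in an ANSI or UTF-8 source would come out wrong, which is why this counts as pseudo-Unicode only.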
I remember that MASM can call the Windows API function MultiByteTo... to convert an ASCII string to a Unicode string.
Thank you for your responses!
@jj2007: I am a beginner in assembler under Windows, after using MASM a lot under DOS in the early 90s. While studying the DLGTEMPLATE and DLGITEMTEMPLATE structures, I discovered that the strings have to be specified in Unicode format. So I wondered about the possibility of declaring these strings directly in Unicode. GoAsm (and probably NASM too) can do it, but not MASM 6.15 which is, if I'm not wrong, the last version that runs standalone (later versions require Visual C++). So I do not have a specific project. Unfortunately, I don't have a PC at the moment, and I will study your valuable suggestions in a little more than 2 weeks.
@newrobert: thank you very much for the info. I do know the function MultiByteToWideChar, which performs this conversion, but I'm more interested in a direct declaration of the string in Unicode.
The old way:
msg dw 'H','e','l','l','o',0
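The dw list above can be generated by a small macro so that the text is typed only once. The following is a sketch under assumptions: WIDESTR is a hypothetical helper written for this example (it is not part of MASM or the MASM32 SDK, and it is not ragdog's WSTR), the SDK is assumed to be in \masm32, and its include files are assumed to declare MessageBoxW. It relies on the FORC directive and handles plain ASCII text only.

```asm
include \masm32\include\masm32rt.inc

; Hypothetical helper: emits each character of txt as a 16-bit word.
WIDESTR MACRO lbl:REQ, txt:REQ
    lbl LABEL WORD              ;; name the start of the string
    FORC chr, <txt>             ;; visit each character of the argument
        dw "&chr"               ;; one zero-extended 16-bit word per character
    ENDM
    dw 0                        ;; 16-bit terminator
ENDM

.data
WIDESTR wMsg, <Hello>           ; same data as: wMsg dw 'H','e','l','l','o',0

.code
start:
    invoke MessageBoxW, NULL, addr wMsg, addr wMsg, MB_OK
    invoke ExitProcess, 0
end start
```

Characters that are special to the macro processor (such as <, >, !, ;, & and quotes) would need extra handling, and non-Latin text still has to be written as explicit code points.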
There are a number of ways to do Unicode in MASM, which does not directly support Unicode. You can put Unicode strings in a resource file and load them that way. MASM32 has a tool for converting Unicode text to DW sequences that you put in the initialised data section. The last way is to use the API that converts ASCII to Unicode.
Some editors also support Unicode, such as Notepad++ and UltraEdit. You can choose the encoding from the Encoding menu,
and there you can select assic,utf-8 or usc2.
The MASM32 SDK already provides a Unicode editor for exactly the purpose of editing Unicode RC scripts, so that character sets from around the world can be used in Windows applications created with MASM.
Quote from: newrobert on May 04, 2017, 12:16:06 PM
Some editors also support Unicode
One editor also supports Formatted UniCode:
include \masm32\MasmBasic\MasmBasic.inc
Init
Inkey "Formatted UniCode: Введите текст здесь"
EndOfCode
> assic,utf-8 or usc2
Btw, I've heard of utf-8, but what are the others? assic, usc2?
Which source file formats does ml.exe 6.15 accept?
The snippet above builds and runs fine with ML 6.15 (and all other assemblers except ML 6.14). RichMasm exports it as UTF-8.
Even ML 10.0, AsmC and HJWasm cannot digest Unicode.
Attached three sources for testing:
- utf8
- unicode with BOM
- unicode without BOM
You can open them with RichMasm (http://masm32.com/board/index.php?topic=5314.0) or Notepad. Qeditor opens them but the display will look a bit garbled.
Hi all! May I ask about Unicode in this thread? Thx :)
In a lot of MASM .inc files I see constructs like: ifdef __UNICODE__ then bla-bla_1 else bla-bla_2.
So HOW can I define that "__UNICODE__" variable?
WOLF,
Have a read of the help files as they explain how to use the __UNICODE__ equate. When you define the equate you get the UNICODE API functions instead of the ANSI ones.
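As a rough illustration of how such an equate can switch between the ANSI and Unicode API functions, here is a minimal sketch. It only shows the general IFDEF pattern; the actual MASM32 include files may be organised differently. MyMsgBox is a hypothetical alias invented for this example, and the SDK is assumed to be in \masm32 with both MessageBoxA and MessageBoxW declared.

```asm
__UNICODE__ equ 1                   ; define before the include; remove for an ANSI build
include \masm32\include\masm32rt.inc

IFDEF __UNICODE__
    MyMsgBox equ <MessageBoxW>      ; hypothetical alias: wide version
ELSE
    MyMsgBox equ <MessageBoxA>      ; hypothetical alias: ANSI version
ENDIF

.data
IFDEF __UNICODE__
    sText dw 'O','K',0              ; 16-bit characters for the W function
ELSE
    sText db "OK",0                 ; 8-bit characters for the A function
ENDIF

.code
start:
    invoke MyMsgBox, NULL, addr sText, addr sText, MB_OK
    invoke ExitProcess, 0
end start
```

The point of the pattern is that the call site stays the same; only the equate decides which function and which string format are used.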
Yes, thanks. After adding "__UNICODE__ equ 1", ML started producing Unicode builds, but it now also gives errors even though I declare the strings as TCHAR:
D:\Archives\Dropbox\src>ml /c /coff 1.asm
Microsoft (R) Macro Assembler Version 6.14.8444
Copyright (C) Microsoft Corp 1981-1997. All rig
Assembling: 1.asm
*************
UNICODE Build
*************
1.asm(8) : error A2084: constant value too large
1.asm(9) : error A2084: constant value too large
This is my prog:
; Just an example!
__UNICODE__ equ 1
include D:\Programming\Masm32\include\masm32rt.inc
.data
sCaption TCHAR "Заголовок", 0, 0
sText TCHAR "Bla-bla-bla!", 0, 0
.code
start proc
invoke MessageBox, NULL, addr sText, addr sCaption, MB_OK
invoke ExitProcess, 0
start endp
end start
Quote from: hutch-- on May 05, 2017, 05:42:59 AM
WOLF,
Have a read of the help files as they explain how to use the __UNICODE__ equate. When you define the equate you get the UNICODE API functions instead of the ANSI ones.
Do you mean the help files in "\Masm32\help\"?
More specifically, \Masm32\help\hlhelp.chm
If you don't find a solution in the Masm32 SDK, try this:
include \masm32\MasmBasic\MasmBasic.inc ; download (http://masm32.com/board/index.php?topic=94.0)
Init
uMsgBox 0, "Заголовок", "Russian is cute:", MB_OK
EndOfCode
Source & exe attached; the MasmBasic library (http://www.webalice.it/jj2006/MasmBasicQuickReference.htm) is needed to build it. All commands starting with w (like wide) are Unicode.
Quote from: jj2007 on May 04, 2017, 04:57:03 PM
Quote from: newrobert on May 04, 2017, 12:16:06 PM
Some editors also support Unicode
One editor also supports Formatted UniCode:
include \masm32\MasmBasic\MasmBasic.inc
Init
Inkey "Formatted UniCode: Введите текст здесь"
EndOfCode
> assic,utf-8 or usc2
Btw, I've heard of utf-8, but what are the others? assic, usc2?
Sorry for the misspelling; it should be ANSI and UCS-2. ANSI means an 8-bit code page (extended ASCII), and for UCS-2 see the following link:
https://en.wikipedia.org/wiki/Universal_Coded_Character_Set
There is an editor in the MASM32 directory called "Uniedit" that writes Unicode text. It is supplied exactly for the purpose of writing Unicode data to a Unicode RC script.
@W0LF,
This version works:
; Just an example!
__UNICODE__ equ 1
include \Masm32\include\masm32rt.inc
.data
;sCaption TCHAR "Заголовок", 0, 0
;UCSTR sCaption, "Заголовок", 0 ; UTF-8 :(
sCaption dw 0417h,0430h,0433h,043Eh,043Bh,043Eh,0432h,043Eh,043Ah,0
;sText TCHAR "Bla-bla-bla!", 0, 0
UCSTR sText, "Bla-bla-bla!", 0 ; OK
.code
start proc
invoke MessageBox, NULL, addr sText, addr sCaption, MB_OK
invoke ExitProcess, 0
start endp
end start
PS: I used Notepad2 as the editor; it supports UTF-8 without BOM.
So, in order to use Unicode in the program, do I have to use the Unicode editor? I think not.
I use Notepad++ with an ANSI code page.
It works, but...
http://savepic.net/9286853.htm
; Just an example!
__UNICODE__ equ 1
include D:\Programming\Masm32\include\masm32rt.inc
.data
UCSTR sCaption, "Заголовок", 0, 0
UCSTR sText, "Текст на русском!", 0, 0
.code
start proc
invoke MessageBox, NULL, addr sText, addr sCaption, MB_OK
invoke ExitProcess, 0
start endp
end start
Thx all for help! 8)
Quote from: W0LF on May 05, 2017, 08:32:40 PM
It works, but...
Does the exe in Reply #17 work?
Quote: include D:\Programming\Masm32\include\masm32rt.inc
I am surprised that you can build something with a non-standard folder. The Masm32 SDK is built on the assumption that there is a folder \Masm32, which may look old-fashioned but has some advantages. Most of us use
include \Masm32\include\masm32rt.inc
P.S.: Your code can work, but you need the RichMasm editor to do that (note it doesn't use MasmBasic):
; Just an example!
__UNICODE__ equ 1
include \Masm32\include\masm32rt.inc ; Open in RichMasm (http://masm32.com/board/index.php?topic=5314.0), hit F6
.data
UCSTR sCaption, "Заголовок", 0, 0
; UCSTR sText, "Текст на русском!", 0, 0
sText db "Текст на русском!", 0, 0
.data?
buffer db 1000 dup(?)
.code
start proc
invoke MultiByteToWideChar, CP_UTF8, 0, offset sText, -1, offset buffer, 100
invoke MessageBox, NULL, addr buffer, addr sCaption, MB_OK
invoke ExitProcess, 0
start endp
end start
I thought both texts should be in Russian.
;Заголовок
sCaption dw 0417h,0430h,0433h,043Eh,043Bh,043Eh,0432h,043Eh,043Ah,0
;Текст на русском!
sText dw 0422h,0435h,043Ah,0441h,0442h,20h,043Dh,0430h,20h,0440h,0443h,0441h,0441h,043Ah,043Eh,043Ch,21h,0
EDIT: picture removed, as Steve doesn't like it ;)
Using unicode in win32 means an ASCII/ANSI text editor for code and a unicode editor to create and/or edit a unicode RC script.
Quote from: jj2007 on May 05, 2017, 08:44:09 PM
Does the exe in Reply #17 work?
Yes, it works.
Quote: I am surprised that you can build something with a non-standard folder. The Masm32 SDK is built on the assumption that there is a folder \Masm32, which may look old-fashioned but has some advantages. Most of us use include \Masm32\include\masm32rt.inc
I changed all the paths in masm32rt.inc. I don't want MASM and my sources in the root folder, so I placed them in subfolders.
Quote: P.S.: Your code can work, but you need the RichMasm editor to do that
My code works with Russian characters even without __UNICODE__. I just wanted to see how it would look.
But I'm still grateful to all of you for your help and, I hope, you will help further.
I move the discussion to the Workshop.