Does MessageBoxW need a BOM?

Started by jj2007, October 31, 2017, 10:51:31 AM

jj2007

Hi everybody,

Just stumbled over a weird problem:
  mov esi, wRec$("Please report if the caption displays correctly")
  mov edi, wRec$("xВведите текст") ; the x is a placeholder for 2 bytes
  invoke MessageBoxW, 0, esi, edi, MB_OK
  mov word ptr [edi],  0FEFFh ; Unicode BOM
  invoke MessageBoxW, 0, esi, edi, MB_OK


The strings look OK in the debugger, but the first MessageBoxW displays a bad caption - just x plus squares.

With the added BOM, the second MessageBoxW displays correctly. MSDN does not mention the need for a BOM. Furthermore, Russian text displays fine without the BOM for the second arg, i.e. the text.

This is on Windows 7-64. Can you please tell me what you see?

P.S.:
- on WinXP (VM), the first MsgBox displays correctly as xВведите текст, the second one (with BOM) shows a strange dot instead of the BOM, i.e. *Введите текст
- on Win10, the first MsgBox displays correctly as xВведите текст, the second shows Введите текст, i.e. the correct result
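
For readers without a MASM setup, the effect of `mov word ptr [edi], 0FEFFh` on the caption buffer can be reproduced in a few lines of Python. This is only a sketch, assuming that wRec$ yields a zero-terminated UTF-16LE string; the names `buf`, `first_two` and `patched` are illustrative and not part of MasmBasic.

```python
import struct

# Assumption: the caption is stored as UTF-16LE with a two-byte terminator.
buf = bytearray("xВведите текст".encode("utf-16-le") + b"\x00\x00")

# Equivalent of: mov word ptr [edi], 0FEFFh (x86 stores the low byte first)
struct.pack_into("<H", buf, 0, 0xFEFF)

first_two = bytes(buf[:2])               # the BOM as it sits in memory
patched = buf[:-2].decode("utf-16-le")   # drop the terminator before decoding

print(first_two.hex(" "), repr(patched))
```

The placeholder x occupies exactly one 16-bit code unit, so the write replaces it with U+FEFF and leaves the Russian text untouched.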

LiaoMi


jj2007

Thanks, LiaoMi, that is also what I see on WinXP. The first one is Win10, I suppose? Below is what I see on Win7-64 (I launched two instances to show the two boxes simultaneously). I tried a reboot just now, but still get the same result. Interesting that your Win10 console output translates the BOM to an extra space ::)

So far it seems that XP and 10 handle it better than Win7. Anybody else, Win7 especially? It might be a hiccup on my machine only...

hutch--

US edition Win 10 Professional.

caption, text
$$edi           xВведите текст
$$esi           Please report if the caption displays correctly


Caption on MessageBox same as $$edi.

felipe

Quote from: jj2007 on October 31, 2017, 10:51:31 AM
- on Win10, the first MsgBox displays correctly as xВведите текст, the second shows Введите текст, i.e. the correct result

That's exactly what I get in Windows 8.1.

jj2007

Thanks. Anybody with Windows 7? I want to be sure that it's not just a broken feature on my machine...

Note that this works fine:

include \masm32\include\masm32rt.inc ; plain Masm32 for the fans of pure assembler

__UNICODE__=1
.code
start: invoke MessageBoxW, 0, chr$("Hello World"), chr$("Masm32:"), MB_OK
exit

end start


The bug shows only with non-Latin text. Here is a plain Masm32 example:

include \masm32\include\masm32rt.inc ; plain Masm32 for the fans of pure assembler

__UNICODE__=1

.code
szCaptionWbom db 0FFh, 0FEh
szCaptionW db 012h, 004h, 032h, 004h, 035h, 004h, 034h, 004h, 038h, 004h, 042h, 004h, 035h, 004h, 0, 0
szCaptionW2z db  012h, 004h, 032h, 004h, 035h, 004h, 034h, 004h, 038h, 004h, 042h, 004h, 035h, 004h, 0, 41h, 41h

start:
invoke MessageBoxW, 0, chr$("without BOM"), addr szCaptionW, MB_OK
invoke MessageBoxW, 0, chr$("with BOM"), addr szCaptionWbom, MB_OK
invoke MessageBoxW, 0, chr$("with one nullbyte"), addr szCaptionW2z, MB_OK
exit

end start


The third box shows the correct text (Введите) followed by some Chinese characters.
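
The Chinese characters in the third box are consistent with how UTF-16 strings are terminated: a single zero byte does not end the string, because the terminator is a full 16-bit zero. A short Python sketch, decoding the same bytes as `szCaptionW2z` (the variable names are illustrative only):

```python
# Bytes of szCaptionW2z as declared above: seven letters, one zero byte, then 41h 41h.
raw = bytes([0x12, 0x04, 0x32, 0x04, 0x35, 0x04, 0x34, 0x04,
             0x38, 0x04, 0x42, 0x04, 0x35, 0x04, 0x00, 0x41, 0x41])

# The first 14 bytes are the seven Cyrillic letters in UTF-16LE.
word = raw[:14].decode("utf-16-le")

# The lone zero byte pairs with the following 41h to form code unit 4100h,
# which is a CJK ideograph rather than a terminator.
stray = raw[14:16].decode("utf-16-le")

print(word, hex(ord(stray)))
```

What appears after that first stray character depends on whatever bytes happen to follow the last 41h in memory, up to the next aligned pair of zero bytes.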

Another test with Chinese and Arabic text worked fine. Apparently, only Cyrillic text is affected.

P.S.: For "MessageBoxW" "cyrillic" Google found something: Unicode in MessageBoxW (GameDev.net, same problem, no solution)

Another good read (but not directly related to this Win7 bug): Should UTF-16 be considered harmful?

Only the caption is affected, argument #3 (the text) always displays fine. Btw MessageBoxW has been around for a while: it is mentioned in Petzold 1998.

sinsi


jj2007

Thanks, John. That leaves two options:
1. My machine is broken
2. The bug is limited to the Italian version of Win7-64

I've tested this also with FreeBasic and C/C++, same behaviour :(

aw27

This should spell Введите everywhere:


.386

.model flat, stdcall

includelib \masm32\lib\user32.lib
MessageBoxW PROTO :ptr, :ptr, :ptr, :dword

.data ; Введите
format0 dw 0412h, 0432h, 0435h, 0434h, 0438h, 0442h, 0435h,00
format1 dw 0412h, 0432h, 0435h, 0434h, 0438h, 0442h, 0435h,00

.code

main proc
invoke MessageBoxW,0, offset format0, offset format1, 0
ret
main endp

end main

jj2007

Quote from: aw27 on October 31, 2017, 08:55:41 PM
This should spell Введите everywhere

I agree, it should. But it doesn't :(

Just found out that all captions are apparently affected. RichMasm used to open files with Russian names just fine, and display their names in the caption. Now it still opens, for example, "\Masm32\MasmBasic\Res\ucTemp\Моя первая программа Hello World.asc", but the caption has the little squares - unless I prepend a BOM to the filename, then it works perfectly.

And it is system-wide: with a right-click in Explorer, open with, Notepad or MS Word, I get the same problem; and again, only Russian is affected.

Since I have no other explanation, I suspect it was the major Windows update that they did a few days ago.

aw27

JJ,
It works fine in all operating systems, except yours I guess. I hope you can fix it soon.