Defining UNICODE string longer than 240 chars

Started by bluedevil, August 28, 2018, 10:08:59 PM

bluedevil

Hello everyone, how are you doing?
I am sorry to open another topic about UNICODE.

But I searched Google and the forums:
Quote
http://masm32.com/board/index.php?topic=6206.msg66114#msg66114 -> nice topic, I learnt a lot
http://masm32.com/board/index.php?topic=2785.msg29334 -> ascoder shared a really nice string.inc file in this topic
http://masm32.com/board/index.php?topic=6397.msg68563 -> this topic is about Unicode chars in the console
http://masm32.com/board/index.php?topic=2054.0 -> a very nice topic about using qWord's TCHR macro


Under c:\masm32\help I found hlhelp.chm and had a look at it, of course. It says:
Quote
MASM does not natively support quoted unicode string data but its macro engine is capable of providing quoted UNICODE strings up to 240 characters long which is enough in most instances for quoted text in code.
My problem starts here:
I want to write Unicode text in quotes and split my long sentences with escape characters or carriage return/line feed pairs. But is 240 chars really all I can do?

I have tried several macros from our forum and from Google, but none of them helped me.
nidud and jj2007 have awesome solutions, but I just want to code "masm".

1. Is it possible to define a variable with Unicode strings longer than 240 chars?
2. I can't make the "A2WDAT" macro run. But this macro also only allows 240 chars, right?
3. How can I allocate memory or define an empty buffer for Unicode strings? These are not working:
.data?
TCHR szBuff1, 128 DUP(0) ;->not working
UCSTR szBuff2, 128 DUP(0) ;->not working

4. jj2007 always shares awesome code, thanks. But jj2007, your code doesn't give the right output on my machine; the output is Chinese chars instead of Turkish.
And jj2007, I wonder whether using the MultiByteToWideChar API solves my 240 chars problem?
__UNICODE__ equ 1
include c:\masm32\include\masm32rt.inc
include String.inc

;tchr macro by qWord
;e.g. tchr szFileName,'\\?\C:\Users\A\Desktop\calc.exe',0
TCHR    MACRO   lbl,args:VARARG
    IFDEF __UNICODE__
        UCSTR lbl,args
    ELSE
        lbl db args
    ENDIF
        ENDM

    .DATA


.const
CAP_G equ 011Eh; Ğ LATIN CAPITAL LETTER G WITH BREVE
SML_G equ 011Fh; ğ LATIN SMALL LETTER G WITH BREVE
CAP_i equ 0130h; İ LATIN CAPITAL LETTER I WITH DOT ABOVE
SML_I equ 0131h; ı LATIN SMALL LETTER DOTLESS I
CAP_S equ 015Eh; Ş LATIN CAPITAL LETTER S WITH CEDILLA
SML_S equ 015Fh; ş LATIN SMALL LETTER S WITH CEDILLA

.data
TCHR msgTITLE, ".title: ",SML_G,0
UCSTR msgBODY , ".body:",13,10,\
"the purpose is to write something",13,10,\
"longer than 240 charachers in unicode",13,10,\
"strings. Because: why not?",13,10,13,10,\
"Some Turkish unicode characters:",SML_S," ",CAP_S,0
msgUNIBODY db "Try Line:ŞşĞğıİ",0

.data?
szBuff db 1024 DUP(?)

.code
start:
invoke MultiByteToWideChar,CP_UTF8,0,offset msgUNIBODY,-1,offset szBuff,100
invoke MessageBox,NULL,addr msgUNIBODY,addr msgTITLE,MB_OK
invoke ExitProcess,0

end start
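
A possible explanation for the Chinese characters, offered as a guess rather than a confirmed diagnosis: the code above converts msgUNIBODY into szBuff, but MessageBox is then handed msgUNIBODY itself. With __UNICODE__ defined, the MASM32 includes are expected to map MessageBox to MessageBoxW, which reads every pair of bytes as one UTF-16 character, and pairs of ASCII bytes such as "Tr" land in the CJK range. The sketch below passes the converted buffer instead. The names msgUtf8, titleW and wideBuf are made up for this example, and the Turkish letters are written as raw UTF-8 bytes so the result does not depend on how the editor saves the file.

```asm
include \masm32\include\masm32rt.inc

.data
titleW  dw 'T','e','s','t',0            ; UTF-16 caption
; "Try Line:" followed by the UTF-8 bytes of the six Turkish letters
msgUtf8 db "Try Line:"
        db 0C5h,9Eh, 0C5h,9Fh           ; U+015E, U+015F (S with cedilla)
        db 0C4h,9Eh, 0C4h,9Fh           ; U+011E, U+011F (G with breve)
        db 0C4h,0B1h, 0C4h,0B0h         ; U+0131 dotless i, U+0130 dotted I
        db 0                            ; terminator

.data?
wideBuf dw 512 dup(?)                   ; room for 512 UTF-16 code units

.code
start:
    ; the last argument is a count of WCHARs, matching the dw buffer above
    invoke MultiByteToWideChar, CP_UTF8, 0, addr msgUtf8, -1, addr wideBuf, 512
    ; show the converted text (wideBuf), not the UTF-8 source bytes
    invoke MessageBoxW, NULL, addr wideBuf, addr titleW, MB_OK
    invoke ExitProcess, 0
end start
```

If the quoted text in the source file is saved in an ANSI code page rather than UTF-8, CP_ACP (or the specific code page number) would presumably be the matching first argument instead of CP_UTF8.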


Thanks a lot
..Dreams make the future
But the past never lies..
BlueDeviL // SCT
My Code Site:
BlueDeviL Github

fearless

You can define it manually if needed. Something like the following gives you 512 spaces plus a couple of CRLFs, ending in a double null:
.CONST
WCRLF                   EQU 13,0,10,0 ; Wide CRLF pair
WNULL                   EQU 0,0,0,0 ; Wide NULL
.DATA
TestW                   DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,WCRLF
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,WCRLF
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB ' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0,' ',0
                        DB WNULL


Just place the chars you require in the spaces between the quotes. For any sentence that requires a CRLF, add ',WCRLF' at the end of the line, and finish the whole string with the terminating null, WNULL.
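
The same idea can be written more compactly with dw, since each dw operand already occupies one 16-bit UTF-16 code unit. The following is only a sketch under that assumption; LongW, CaptionW, WideBuf and W_SML_G are names invented for the example. One label can be continued over as many short dw lines as needed, so the total length is not tied to the 240-character macro limit, and an uninitialised wide buffer is simply dw n dup(?) in .data?.

```asm
include \masm32\include\masm32rt.inc

.const
W_SML_G  equ 011Fh                      ; small g with breve

.data
CaptionW dw 'D','e','m','o',0
; one label, continued over several short dw lines
LongW    dw 'L','i','n','e',' ','o','n','e',13,10
         dw 'd','a',W_SML_G,13,10       ; non-ASCII letter given by code point
         dw 0                           ; single wide terminator

.data?
WideBuf  dw 128 dup(?)                  ; empty buffer for 128 UTF-16 code units

.code
start:
    invoke lstrcpyW, addr WideBuf, addr LongW   ; copy the wide string into the buffer
    invoke MessageBoxW, 0, addr WideBuf, addr CaptionW, MB_OK
    invoke ExitProcess, 0
end start
```

Writing non-ASCII letters as numeric code points keeps the source file pure ASCII, so the result should not depend on the editor's encoding.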

HSE

Quote
Is it possible to define a variable with unicode strings longer than 240 chars?
Assemblers can only handle source lines of limited length. I think you should try AsmC, whose standard release handles 4k-character lines. For longer lines you must rebuild AsmC or UAsm... but that means giving up on writing better code.
Equations in Assembly: SmplMath

hutch--

Have a look at some of the tools in the MASM32 SDK.

Multitool.exe will convert ASCII to a number of formats.
Uniedit.exe will read and write UNICODE. You can save it as a binary resource and include it in your EXE.

Now which form of Chinese do you want to use, traditional or simplified?

jj2007

Example attached, in pure Masm32 (i.e. no MasmBasic). The only problem is that you need an editor that understands UTF8. I recommend RichMasm ;)

include \masm32\include\masm32rt.inc

.data?
buffer  db 1000 dup(?)

.data
title$  dw "H", "e", "l", "l", "o", 0
MyString        db "Now following 1700 chars: ", 13, 10
        REPEAT 100
                db "Добро пожаловать "
        ENDM
        db 0    ; terminate MyString so print knows where to stop
.code
start:
  invoke SetConsoleOutputCP, CP_UTF8
  print "simple print: Добро пожаловать", 13, 10
  print offset MyString
  ; the last argument counts WCHARs, not bytes: the 1000-byte buffer holds 500 of them
  invoke MultiByteToWideChar, CP_UTF8, 0, chr$("Welcome - 歡迎 - مرحبا بكم - Добро пожаловать"), -1, addr buffer, 500
  invoke MessageBoxW, 0, addr buffer, addr title$, MB_OK
  exit

end start
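
When the text length is not known in advance, MultiByteToWideChar can first be asked how many wide characters it needs by passing 0 as the output size. A minimal sketch, assuming a heap buffer from GlobalAlloc is acceptable; Utf8Text, CaptionW, cchNeeded and pWide are illustrative names, and the sample text is ASCII-only so that the file encoding does not matter.

```asm
include \masm32\include\masm32rt.inc

.data
Utf8Text  db "Merhaba", 0               ; UTF-8 input (plain ASCII here)
CaptionW  dw 'D','e','m','o',0          ; UTF-16 caption

.data?
cchNeeded dd ?                          ; required size in WCHARs
pWide     dd ?                          ; pointer to the allocated wide buffer

.code
start:
    ; pass 1: no output buffer, returns the needed WCHAR count incl. terminator
    invoke MultiByteToWideChar, CP_UTF8, 0, addr Utf8Text, -1, NULL, 0
    test eax, eax
    jz done                             ; conversion not possible
    mov cchNeeded, eax
    shl eax, 1                          ; WCHARs -> bytes
    invoke GlobalAlloc, GPTR, eax
    test eax, eax
    jz done                             ; allocation failed
    mov pWide, eax
    ; pass 2: convert into the buffer that is now large enough
    invoke MultiByteToWideChar, CP_UTF8, 0, addr Utf8Text, -1, pWide, cchNeeded
    invoke MessageBoxW, 0, pWide, addr CaptionW, MB_OK
    invoke GlobalFree, pWide
done:
    invoke ExitProcess, 0
end start
```

Because the buffer is sized from the first call, neither the 240-character macro limit nor a fixed buffer size constrains the string.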


P.S. If you want to try UniEdit, here is a converter:

include \masm32\MasmBasic\MasmBasic.inc         ; download
  Init
  Let esi=FileRead$("\Masm32\MasmBasic\AscUser\UnicodeInPureMasm32.asm")
  Let esi=wRec$(esi)
  lea ecx, [2*wLen(esi)]
  FileWrite "\Masm32\MasmBasic\AscUser\UnicodeInPureMasm32_uc.asm", esi, ecx
  Launch "\Masm32\uniedit.exe \Masm32\MasmBasic\AscUser\UnicodeInPureMasm32_uc.asm"
EndOfCode


P.P.S: Please PM me regarding the Turkish vs Chinese problem, I am curious what went wrong there.

hutch--

See below for the Turkish Unicode version.

bluedevil

First things first:
1. I have been using RadASM for a long time and I love it. But I understand that it does not have UTF-8 support, so I can't get the output I want. @mrfearless, are we going to bring UTF-8 support to RadASM?

So I changed my editor to Notepad2 with UTF-8 encoding!

2. jj2007, thank you very much, the code below works like a charm. I even added my Turkish characters, İıĞğŞş:
.data?
buffer  db 1000 dup(?)

.data
title$  dw "H", "e", "l", "l", "o", 0
MyString        db "Now following 2300 chars: ", 13, 10
        REPEAT 100
                db "Добро пожаловать İıĞğŞş"
        ENDM
        db 0    ; terminate MyString so print knows where to stop
.code
start:
  invoke SetConsoleOutputCP, CP_UTF8
  print "simple print: Добро пожаловать", 13, 10
  print offset MyString
  ; the last argument counts WCHARs, not bytes: the 1000-byte buffer holds 500 of them
  invoke MultiByteToWideChar, CP_UTF8, 0, chr$("Welcome - 歡迎 - مرحبا بكم - Добро пожаловать"), -1, addr buffer, 500
  invoke MessageBoxW, 0, addr buffer, addr title$, MB_OK
  exit

end start

IMPORTANT: I used Notepad2 with UTF-8 encoding when saving the asm file!

3. hutch--, I don't need Chinese. I need Turkish chars to print on dialogs, message boxes and the console, and also to read them. Can you share your chinese.zip source code with Turkish content? Here are some words for you:
Quote
SAMPLE TURKISH WORDS:
AĞAÇLANDIRILMAK, AĞDALAŞTIRILMAK, AĞIRLAŞTIRILMAK, AŞAĞILANABİLMEK, AŞAĞILAŞABİLMEK, AŞAĞILAYABİLMEK, ATKUYRUĞUGİLLER, BAĞDAŞTIRABİLME, BAĞDAŞTIRICILIK, BAĞIMLILAŞTIRMA, BAĞIŞLANABİLMEK, BAĞIŞLATABİLMEK, BAĞIŞLAYABİLMEK, BAĞIŞLAYIVERMEK, BAĞITLANABİLMEK, BAĞITLAYABİLMEK, BAĞNAZLAŞTIRMAK, BAYAĞILAŞTIRMAK, BEĞENDİREBİLMEK, BOĞAZLANABİLMEK, BOĞAZLATABİLMEK, BOĞAZLAYABİLMEK, BOĞAZLAYIVERMEK, BOĞUKLAŞABİLMEK, BOĞUMLANABİLMEK, BUKAĞILAYABİLME, ÇAĞCILLAŞTIRMAK, ÇAĞDAŞLAŞABİLME, ÇAĞDAŞLAŞTIRMAK, ÇAĞRIŞTIRABİLME, ÇAĞRIŞTIRIVERME, ÇOĞALTILABİLMEK, ÇOĞULLAŞTIRILMA, DARMADAĞINIKLIK, DEĞDİRİLEBİLMEK, DEĞERLENDİRİLİŞ, DEĞERLENDİRİLME, DEĞERLENEBİLMEK, DEĞERLENİVERMEK, DEĞİŞTİRİVERMEK, AĞAÇLANDIRIŞ, AĞAÇLANDIRMA, AĞARTABİLMEK, AĞBENEKLİLİK, AĞDALAŞTIRMA, AĞILANDIRMAK, AĞIRBAŞLILIK, AĞIRCANLILIK, AĞIRKANLILIK, AĞIRLAŞTIRMA, AĞIRŞAKLANMA, AĞIZLIKÇILIK, AĞLATABİLMEK, AĞLAYABİLMEK, AĞLAYIVERMEK, AĞRITABİLMEK, AĞRIYABİLMEK, AKCİĞERLİLER, ALABİLDİĞİNE, ARACILIĞIYLA, ASLANKUYRUĞU, BABAYİĞİTLİK, BAĞDAŞABİLME, BAĞDAŞMAZLIK, BAĞDAŞTIRICI, BAĞDAŞTIRMAK, BAĞIMLILAŞMA, BAĞINTICILIK, BAĞINTILILIK, BAĞIRABİLMEK, BAĞIRIVERMEK, BAĞIRTABİLME, BAĞIŞLAMAMAK, BAĞIŞLATILMA, BAĞLAMACILIK, BAĞLANABİLME, BAĞLANIVERME, BAĞLATABİLME, BAĞLAYABİLME, BAĞLAYICILIK

4. @mrfearless, I was really excited about your reply, but I can only print the first character of your TestW string buffer. Look:
include C:\masm32\include\masm32rt.inc
.const
WCRLF                   EQU 13,0,10,0 ; Wide CRLF pair
WNULL                   EQU 0,0,0,0 ; Wide NULL
.data?
buffer  db 1000 dup(?)

.data
title$  dw "H", "e", "l", "l", "o", 0
TestW                   DB 'İ',0,'İ',0,'İ',0,'İ',0,'İ',0,'İ',0,'İ',0,'İ',0,'İ',0,'İ',0,'İ',0,'İ',0,'İ',0,'İ',0,'İ',0,'İ',0,WCRLF
                        DB 'Ş',0,'Ş',0,'Ş',0,'Ş',0,'Ş',0,'Ş',0,'Ş',0,'Ş',0,'Ş',0,'Ş',0,'Ş',0,'Ş',0,'Ş',0,'Ş',0,'Ş',0,'Ş',0,WCRLF
                        DB 'Ğ',0,'Ğ',0,'Ğ',0,'Ğ',0,'Ğ',0,'Ğ',0,'Ğ',0,'Ğ',0,'Ğ',0,'Ğ',0,'Ğ',0,'Ğ',0,'Ğ',0,'Ğ',0,'Ğ',0,'Ğ',0,WCRLF
                        DB 'ı',0,'ı',0,'ı',0,'ı',0,'ı',0,'ı',0,'ı',0,'ı',0,'ı',0,'ı',0,'ı',0,'ı',0,'ı',0,'ı',0,'ı',0,'ı',0,WCRLF
                        DB 'ş',0,'ş',0,'ş',0,'ş',0,'ş',0,'ş',0,'ş',0,'ş',0,'ş',0,'ş',0,'ş',0,'ş',0,'ş',0,'ş',0,'ş',0,'ş',0,WCRLF
                        DB 'ğ',0,'ğ',0,'ğ',0,'ğ',0,'ğ',0,'ğ',0,'ğ',0,'ğ',0,'ğ',0,'ğ',0,'ğ',0,'ğ',0,'ğ',0,'ğ',0,'ğ',0,'ğ',0,WCRLF
                        DB WNULL
.code
start:
  invoke SetConsoleOutputCP, CP_UTF8
  print "simple print: Добро пожаловать", 13, 10
  print offset TestW                                                        ; only prints first char
  invoke MultiByteToWideChar, CP_UTF8, 0,addr TestW ,-1, addr buffer, 1000  ; only prints first char
  invoke MessageBoxW, 0, addr buffer, addr title$, MB_OK
  exit

end start
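
One likely reason only the first character appears, stated as an assumption about how the file was saved: in a UTF-8 source, a quoted 'İ' is already two bytes (C4h, B0h), so DB 'İ',0 produces neither UTF-16 nor zero-free UTF-8. Both print and MultiByteToWideChar with a length of -1 stop at the first zero byte, which comes right after the first letter. The sketch below sidesteps the source encoding by giving the UTF-16 code points directly. CaptionW is a made-up name; the code point values are the ones from the equates in the first post.

```asm
include \masm32\include\masm32rt.inc

.data
CaptionW dw 'T','e','s','t',0
; each dw operand is one UTF-16 code unit; dup repeats it 16 times
TestW    dw 16 dup(0130h), 13, 10       ; capital I with dot above
         dw 16 dup(015Eh), 13, 10       ; capital S with cedilla
         dw 16 dup(011Eh), 13, 10       ; capital G with breve
         dw 16 dup(0131h), 13, 10       ; small dotless i
         dw 16 dup(015Fh), 13, 10       ; small s with cedilla
         dw 16 dup(011Fh), 13, 10       ; small g with breve
         dw 0                           ; wide terminator

.code
start:
    ; the data is already UTF-16, so no MultiByteToWideChar call is needed
    invoke MessageBoxW, 0, addr TestW, addr CaptionW, MB_OK
    invoke ExitProcess, 0
end start
```

For console output the wide text would still have to be converted or written with a wide-character API, since print expects a zero-terminated byte string.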


Thanks to every one of you for your great replies.

hutch--

 :biggrin:

> hutch-- i don't need Chinese. I need Turkish chars to print on dialogs messageboxes and console.

That's easy, produce the UNICODE in Turkish characters and use the technique I have shown you in the demo.

jj2007

Quote from: blue_devil on August 29, 2018, 08:04:16 AM
But i use notepad2 with UTF8 encoding while saving the asm file!

Just tested it, seems a nice Notepad replacement, but where is the build command?

Quote
SAMPLE TURKISH WORDS

I made a quick test:

include \masm32\MasmBasic\MasmBasic.inc         ; download

$Data AĞAÇLANDIRILMAK, AĞDALAŞTIRILMAK, AĞIRLAŞTIRILMAK, AŞAĞILANABİLMEK, AŞAĞILAŞABİLMEK, AŞAĞILAYABİLMEK, ATKUYRUĞUGİLLER, BAĞDAŞTIRABİLME, BAĞDAŞTIRICILIK, BAĞIMLILAŞTIRMA, BAĞIŞLANABİLMEK, BAĞIŞLATABİLMEK, BAĞIŞLAYABİLMEK, BAĞIŞLAYIVERMEK, BAĞITLANABİLMEK, BAĞITLAYABİLMEK, BAĞNAZLAŞTIRMAK, BAYAĞILAŞTIRMAK, BEĞENDİREBİLMEK, BOĞAZLANABİLMEK, BOĞAZLATABİLMEK, BOĞAZLAYABİLMEK, BOĞAZLAYIVERMEK, BOĞUKLAŞABİLMEK, BOĞUMLANABİLMEK, BUKAĞILAYABİLME, ÇAĞCILLAŞTIRMAK, ÇAĞDAŞLAŞABİLME, ÇAĞDAŞLAŞTIRMAK, ÇAĞRIŞTIRABİLME, ÇAĞRIŞTIRIVERME, ÇOĞALTILABİLMEK, ÇOĞULLAŞTIRILMA, DARMADAĞINIKLIK, DEĞDİRİLEBİLMEK, DEĞERLENDİRİLİŞ
$Data DEĞERLENDİRİLME, DEĞERLENEBİLMEK, DEĞERLENİVERMEK, DEĞİŞTİRİVERMEK, AĞAÇLANDIRIŞ, AĞAÇLANDIRMA, AĞARTABİLMEK, AĞBENEKLİLİK, AĞDALAŞTIRMA, AĞILANDIRMAK, AĞIRBAŞLILIK, AĞIRCANLILIK, AĞIRKANLILIK, AĞIRLAŞTIRMA, AĞIRŞAKLANMA, AĞIZLIKÇILIK, AĞLATABİLMEK, AĞLAYABİLMEK, AĞLAYIVERMEK, AĞRITABİLMEK, AĞRIYABİLMEK, AKCİĞERLİLER, ALABİLDİĞİNE, ARACILIĞIYLA, ASLANKUYRUĞU, BABAYİĞİTLİK, BAĞDAŞABİLME, BAĞDAŞMAZLIK, BAĞDAŞTIRICI, BAĞDAŞTIRMAK, BAĞIMLILAŞMA, BAĞINTICILIK, BAĞINTILILIK, BAĞIRABİLMEK, BAĞIRIVERMEK, BAĞIRTABİLME, BAĞIŞLAMAMAK, BAĞIŞLATILMA, BAĞLAMACILIK, BAĞLANABİLME, BAĞLANIVERME, BAĞLATABİLME, BAĞLAYABİLME, BAĞLAYICILIK

  Init
  Read Turkish$()       ; blue_devil thread
  For_ ecx=0 To eax-1
        PrintLine Turkish$(ecx), Tb$, Lower$(Turkish$(ecx))
  Next
  Inkey
EndOfCode


Does the output for the Lower$() look correct for you?
AGAÇLANDIRILMAK agaçlandirilmak
AGDALASTIRILMAK agdalastirilmak
...
BAGLAYABILME    baglayabilme
BAGLAYICILIK    baglayicilik


Full project attached (the *.asc is the rtf version for use with RichMasm, but I also added a plain *.asm for testing with Notepad2). Btw I had to split the $Data line above once because of the line length limit (still, you need UAsm or AsmC to build it; Masm doesn't accept lines that long).

P.S.: I sneaked in a test of the Instr_() function, it seems to work, although I get the impression that my console font does not handle the accents properly:
        If_ Instr_(Turkish$(ecx), "ĞIŞ") Then Print "* "      ; just for fun - testing Instr_()

This is odd: The text displays with Lucida Console, but it seems to lack some accents. When I try to set it to the Consolas font, the OS refuses that attempt and uses raster fonts instead ::)

hutch--

Same demo as before but with Turkish UNICODE text. Example is PURE MASM. You should be able to use the identical technique in RadAsm.

Press the "1" button to display the text in wordwrap form.

bluedevil

@hutch--, I have attached the "unicode-utf8-cpPNG" image for you. If you copy and paste Turkish characters, it is OK. But there are problems when opening Unicode and UTF-8 encoded txt files.

@jj2007
1. My build options are so simple:

c:\masm32\bin\ml.exe /c /coff Turkish.asm
c:\masm32\bin\link.exe /subsystem:console /release Turkish.obj


2. I also attached Mb-result-jj.png for you. I installed MasmBasic and tried to build your last sample, but unfortunately the output is wrong. Please check the PNG image.

Thanks.

P.S.
@jj2007, yesterday I assembled and linked the source and said it worked like a charm. But we didn't add the line:
__UNICODE__ equ 1
Yet our code worked and printed Turkish characters. How?

jj2007

Quote from: blue_devil on August 29, 2018, 08:28:54 PM
2. I also attached Mb-result-jj.png for you. I installed Masmbasic and tried to compile your last sample. But the output is unfortunately wrong. Please check the PNG image

That looks more like a problem with the console font. I attach a Gui version that draws to an edit control - please check. Note the *.asc file (to be opened in \Masm32\MasmBasic\RichMasm.exe) contains two sources; F6 builds the upper one, unless you selected the Init (then it builds the console version).

Quote
@jj2007 yesterday i assemble and link the source and i said i worked like a charm. But we didn't add the line:
__UNICODE__ equ 1
But our code worked and print Turkish characters, how??

__UNICODE__ gets happily ignored by MasmBasic. Most string functions have an Ansi and a Unicode version preceded by w:
Print "Hello World (Ansi)"
wPrint "Hello World (wide)"

The reason why a Print "AĞAÇLANDIRILMAK", Tb$, Lower$("AĞAÇLANDIRILMAK"), CrLf$ works is that the console does indeed understand Utf8.
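
For the pure Masm32 source, a plausible reading is that __UNICODE__ was not needed because that code never relies on the generic API names: it calls MessageBoxW directly and does the conversion itself with MultiByteToWideChar. It appears that the equate only decides whether a generic name such as MessageBox resolves to the A or the W function. The sketch below, with made-up labels AnsiText and WideText, calls both variants explicitly in a build that does not define __UNICODE__.

```asm
include \masm32\include\masm32rt.inc    ; no __UNICODE__ equate anywhere

.data
AnsiText db "ANSI call", 0              ; one byte per character
; UTF-16 text, one dw operand per character, ending with a dotless i (U+0131)
WideText dw 'W','i','d','e',' ','c','a','l','l',' ',0131h,0

.code
start:
    invoke MessageBoxA, 0, addr AnsiText, addr AnsiText, MB_OK   ; byte strings
    invoke MessageBoxW, 0, addr WideText, addr WideText, MB_OK   ; UTF-16 strings
    invoke ExitProcess, 0
end start
```

Naming the A or W function explicitly makes each call independent of the equate, at the cost of having to supply the matching string type yourself.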

hutch--

> @hutch--, i have attached "unicode-utf8-cpPNG" image for you. If you copy and paste Turkish characters, it is ok. But there are problems while opening unicode and utf8 encoded txt files.

The test piece addressed your original question, how to store large strings (any size) which it does correctly from the DB data in the .DATA section. What you have mentioned is file IO which is a different matter.

The test piece can be set up with the following equates.

    __UNICODE__ equ 1           ; uncomment to enable UNICODE build
    UNICODE_EDIT equ 1

This will load and save unicode text.

The equate affects the following code in the richedit.asm file.

    IFNDEF UNICODE_EDIT
      invoke SendMessage,edit,EM_STREAMIN,SF_TEXT,ADDR est
    ELSE
      invoke SendMessage,edit,EM_STREAMIN,SF_TEXT or SF_UNICODE,ADDR est
    ENDIF


If you want to write your code in MASMbasic, let me know and I won't waste my time with pure MASM.