The MASM Forum

Projects => Poasm => Pelle's C compiler and tools => Topic started by: CommonTater on March 10, 2013, 02:13:26 AM

Title: UTF8 disk I/O library ....
Post by: CommonTater on March 10, 2013, 02:13:26 AM
DELETED .... because some things just aren't worth the trouble.

Moderator ... please delete this entire thread. 
Title: Re: UTF8 disk I/O library ....
Post by: jj2007 on March 10, 2013, 02:41:27 AM
Interesting. Can you give an example of how to use it in assembler, similar to this one, which reads a Unicode text file into a string array and displays it in the console? Testfile attached.

include \masm32\MasmBasic\MasmBasic.inc        ; download (http://masm32.com/board/index.php?topic=94.0)
        Init
        Recall "Wide.txt", L$()
        For_ ebx=0 To eax-1
                wPrint wRec$(L$(ebx)), wCrLf$
        Next
        Inkey
        Exit
end start

Output:
Введите текст  здесь
Нажмите на эту кнопку
Добро пожаловать
أدخل النص هنا
دفع هذا الزر
مرحبا بكم
在這裡輸入文字
按一下這個按鈕
歡迎
Title: Re: UTF8 disk I/O library ....
Post by: CommonTater on March 10, 2013, 03:00:16 AM
Quote from: jj2007 on March 10, 2013, 02:41:27 AM
Interesting. Can you give an example of how to use it in assembler

1) No I can't ... and I don't even pretend that I can. 
Unlike you, on Pelles Forums,  I don't climb into areas where I have little or no experience and pretend to know more than the experienced programmers do. Just because you code in ASM does not mean you're smart.

2) Do you even know what UTF8 is? 
Seriously, do you understand its importance in communications and text file storage?  UTF8 is a variable length text format (1 to 4 bytes per character) that behaves like ANSI text for English and uses the longer forms (usually 2 or 3 bytes) only for non-English characters. It can uniquely encode over a million characters. For mostly-English text, using it in preference to wide strings can cut communication time (almost) in half and reduce disk storage by similar amounts.

3) I posted it here, in a C language sub-forum, because one of the moderators asked me to.

So, no, I'm not going to play your infantile "Wigger Waggling" games.
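Editor's note on point 2 above: the byte counts are easy to check by hand. The following is a minimal sketch of a UTF-8 encoder in portable C; the function name utf8_encode is hypothetical and is not taken from the UTF8 library discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 (hypothetical helper).
   Writes up to 4 bytes into out and returns the byte count,
   or 0 if cp is not a valid Unicode scalar value. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                      /* ASCII: 1 byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* e.g. Cyrillic, Arabic: 2 bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* rest of the BMP, e.g. Chinese: 3 bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF) /* surrogates cannot be encoded */
            return 0;
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* supplementary planes: 4 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For example, 'A' (U+0041) takes 1 byte, the Cyrillic letter U+0412 takes 2 bytes (D0 92), and the Chinese character U+5728 takes 3 bytes (E5 9C A8). Compared with 2-byte wide characters, the saving therefore applies mainly to text that is mostly ASCII.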
Title: Re: UTF8 disk I/O library ....
Post by: jj2007 on March 10, 2013, 03:34:50 AM
Quote from: CommonTater on March 10, 2013, 03:00:16 AM
2) Do you even know what UTF8 is?

No, Tater, I have no idea what UTF8 is. My code is based on black magic.

Seriously: Why do most of your posts inevitably end in insulting others? I asked a simple question, assuming you were able to give an example, and assuming that you were aware this is an assembler forum. I would even accept a C example, but your library doesn't contain any test case...
Title: Re: UTF8 disk I/O library ....
Post by: Vortex on March 10, 2013, 04:33:42 AM
Hi Jochen,

Quote
I would even accept a C example, but your library doesn't contain any test case...

Did you download and open the attachment? There is already an example in the zip file. You can check the test folder.

CommonTater is a very good and experienced coder. He is not aiming to insult people. Take it easy Jochen.

Attached is a Poasm example using the UTF8 library.
Title: Re: UTF8 disk I/O library ....
Post by: jj2007 on March 10, 2013, 05:18:48 AM
Quote from: Vortex on March 10, 2013, 04:33:42 AM
Did you download and open the attachment?
Yes, I had downloaded and opened the attachment. I haven't seen any asm file in there, and the C files lack documentation. Overall it's not impressive.

Quote
CommonTater ... is not aiming to insult people.

Tater started insulting others (Farabi, me, qWord, ...) as soon as he had joined the forum, see this thread (http://masm32.com/board/index.php?topic=1079.msg11297#msg11297). We generally have a good atmosphere here: we can discuss politics without insulting each other, and we tease each other but never use words like "idiot", "stupidity", "jackoffs" and "bullshit attitudes" (http://masm32.com/board/index.php?topic=1618.0); we even tolerate a certain level of arrogance provided the poster otherwise contributes constructive advice and code. Tater is just bloat and insult - even in his "own" forum he got kicked out, and not by me.
Title: Re: UTF8 disk I/O library ....
Post by: Vortex on March 10, 2013, 05:32:33 AM
Hi Jochen,

Quote
I would even accept a C example,

Quote
I haven't seen any asm file in there.

There is a C example in the test folder. The utf8io.h file provides explanation and it's easy to understand.

CommonTater is also a member of Pelles C forum. Why would he need to attack people as soon as he joined the forum?
Title: Re: UTF8 disk I/O library ....
Post by: jj2007 on March 10, 2013, 05:34:33 AM
Quote from: Vortex on March 10, 2013, 05:32:33 AM
Why would he need to attack people as soon as he joined the forum?

Good question, Erol.
Title: Re: UTF8 disk I/O library ....
Post by: Vortex on March 10, 2013, 05:38:56 AM
Hi Jochen,

Everything will be fine. No problem at all. We are like a family here. The motto of FIDE - Fédération internationale des échecs, or World Chess Federation - can be applied to our forum:

Quote
Gens una sumus - We are one people.
Title: Re: UTF8 disk I/O library ....
Post by: CommonTater on March 10, 2013, 06:22:22 AM
Quote from: jj2007 on March 10, 2013, 03:34:50 AM
Quote from: CommonTater on March 10, 2013, 03:00:16 AM
2) Do you even know what UTF8 is?

No, Tater, I have no idea what UTF8 is. My code is based on black magic.

Seriously: Why do most of your posts inevitably end in insulting others?

Because you just beg for it.

Quote
I asked a simple question, assuming you were able to give an example, and assuming that you were aware this is an assembler forum. I would even accept a C example, but your library doesn't contain any test case...

Ok, now I know you didn't even bother to look in the zip file... Not only are there 32 bit and 64 bit test programs included with it... I gave you the source code for both!

Now kindly be a useful idiot and Smeg off!
Title: Re: UTF8 disk I/O library ....
Post by: CommonTater on March 10, 2013, 06:25:28 AM
Quote from: Vortex on March 10, 2013, 05:32:33 AM
Quote
I would even accept a C example,
Quote
I haven't seen any asm file in there.
There is a C example in the test folder. The utf8io.h file provides explanation and it's easy to understand.

Hi Vortex ... is it not obvious he never even bothered to look?

Quote
CommonTater is also a member of Pelles C forum. Why would he need to attack people as soon as he joined the forum?

Since 2004 ....
Title: Re: UTF8 disk I/O library ....
Post by: CommonTater on March 10, 2013, 06:39:35 AM
Quote from: jj2007 on March 10, 2013, 05:18:48 AM
Tater started insulting others (Farabi, me, qWord, ...) as soon as he had joined the forum, see this thread (http://masm32.com/board/index.php?topic=1079.msg11297#msg11297).

Get your facts straight.  I came here --by invitation-- to give you guys access to some software I was working on (Easy Build)... about 3 months into sharing that freely without question on the forums I ran into a software situation where I felt a) a little out of my depth and b) probably should be written in ASM for optimal performance. 

I asked very nicely if anyone was up to a challenge and would consider writing this code for me... Share and share alike... So, what did I get back?  Some guy goes "I can do that" only to admit 3 posts later that he'd lied; someone else tells me they'll do it for a fee; lots of people refer me elsewhere.... Not one offer to help out...

And there I am staring at my screen in utter disbelief...
This is NOT how this is supposed to work... and I said so.

Were the situation reversed I would have offered to help someone sharing with me,  free of charge. 

Now I've got you on my case in two forums....
And exactly what do you think you're trying to accomplish?
(Other than making a complete ass of yourself, that is)

FWIW ... I finally did get the needed DShow object... it came in around 1350 bytes and works a treat.
Title: Re: UTF8 disk I/O library ....
Post by: jj2007 on March 10, 2013, 07:17:25 AM
Quote from: CommonTater on March 10, 2013, 06:39:35 AM
Some guy goes "I can do that" only to admit 3 posts later that he'd lied

"Some guy" was our young friend Farabi, and nobody else here would ever call him a liar.

Back to content: I have compiled your test case, no error messages, fine, but when I feed a real UTF-8 file with Russian, Arabic and Chinese characters, I get this on screen:

Reading Utf8.txt
File open...
File content...
↕2548B5 B5:AB  745AL
↔06<8B5 =0 MBC :=>?:C
¶>1@> ?>60;>20BL
#/.D 'DF5 GF'
/A9 G0' 'D21
E1-(' (CE
(↓ß8eçW
         ♂↓♂    §
a╬
(null)
File closed...


Expected outcome is this:
UTF-8 text file created, now reading & printing
Введите текст здесь
Нажмите на эту кнопку
Добро пожаловать
أدخل النص هنا
دفع هذا الزر
مرحبا بكم
在這裡輸入文字
按一下這個按鈕
歡迎


I attach my version (the *.asc source opens best in RichMasm, but WordPad and MS Word work, too).

Note that I tried it from the commandline, too, using chcp 65001 and Lucida Console. Inside the same console window, my proggie WideReadWrite.exe displays the exotic languages just fine, yours produces garbage. It's a mystery. Could it be the wprintf(L... ?

P.S.: With English UTF-8 text, your library works just fine :t
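
Editor's note: one possible reading of the garbage output above, offered as an assumption and not as a confirmed diagnosis of the library or of wprintf. If each 16-bit wide character were cut down to its low byte on the way to the console, the first word of the test file would produce exactly the bytes shown in the first garbage line. The sketch below uses hypothetical names (low_bytes, vvedite) to reproduce that arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Keep only the low byte of each 16-bit code unit. This models a lossy
   wide-to-narrow conversion; that such a conversion happens in the test
   program is an assumption, not something confirmed in the thread. */
void low_bytes(const unsigned short *units, size_t n, unsigned char *out)
{
    assert(units != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (unsigned char)(units[i] & 0xFF);
}

/* UTF-16 code units of the first word in the test file, "Введите" */
const unsigned short vvedite[7] = {
    0x0412, 0x0432, 0x0435, 0x0434, 0x0438, 0x0442, 0x0435
};
```

Feeding vvedite through low_bytes gives the bytes 12 32 35 34 38 42 35: the control character 0x12 (drawn as ↕ in OEM code page 437) followed by the ASCII text "2548B5". The same arithmetic explains why English-only text is unaffected, since ASCII code points already fit in one byte.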
Title: Re: UTF8 disk I/O library ....
Post by: CommonTater on March 10, 2013, 07:33:44 AM
Quote from: jj2007 on March 10, 2013, 07:17:25 AM
Note that I tried it from the commandline, too, using chcp 65001 and Lucida Console. Inside the same console window, my proggie WideReadWrite.exe displays the exotic languages just fine, yours produces garbage. It's a mystery. Could it be the wprintf(L... ?

P.S.: With English UTF-8 text, your library works just fine :t

You feed it a word processor file and then try to claim my library doesn't work when it can't sort out the formatting marks and starts producing errors...



Title: Re: UTF8 disk I/O library ....
Post by: jj2007 on March 10, 2013, 07:45:04 AM
Quote from: CommonTater on March 10, 2013, 07:33:44 AM
You feed it a word processor file and then try to claim my library doesn't work when it can't sort out the formatting marks

No, Utf8.txt is a plain UTF-8 text file, of the type you claim to handle with your library. You can open it in Notepad, \Masm32\qeditor.exe, poide.exe, and it will display 9 lines of Russian, Arabic and Chinese text.
Title: Re: UTF8 disk I/O library ....
Post by: CommonTater on March 10, 2013, 08:09:03 AM
Quote from: jj2007 on March 10, 2013, 07:45:04 AM
Quote from: CommonTater on March 10, 2013, 07:33:44 AM
You feed it a word processor file and then try to claim my library doesn't work when it can't sort out the formatting marks

No, Utf8.txt is a plain UTF-8 text file, of the type you claim to handle with your library. You can open it in Notepad, \Masm32\qeditor.exe, poide.exe, and it will display 9 lines of Russian, Arabic and Chinese text.

 The demo program was to show you how to use the function calls.  It's NOT a  word processor... it reads and writes wide strings in CONSOLE mode which does not support multiple code pages as you're asking it to.  There was NO attempt to provide anything except an example to show how to use the functions.

GOT THAT?

Make a GUI mode program... a little edit window and use the functions to save and load the edit window... and watch what happens...  OH MY GOD... look at that... it actually works!  WOW....

The actual conversions are done with Windows API calls that are known to work.... world wide... and are in use in literally thousands of programs.  If you'd even bothered to do anything except finding a way to foul it up then claiming it doesn't work you would know that. 



Title: Re: UTF8 disk I/O library ....
Post by: dedndave on March 10, 2013, 08:29:07 AM
the differences may be due to the absence or presence of a BOM at the beginning of the file
while not required by the UTF-8 spec, it is usually present in windows unicode text files
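
Editor's note: the UTF-8 BOM is the three-byte sequence EF BB BF at the very start of a file. A minimal sketch of detecting it in a buffer follows; the function name utf8_bom_length is hypothetical and this is not the check used inside the library under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the number of leading bytes to skip: 3 if the buffer starts
   with the UTF-8 BOM (EF BB BF), otherwise 0. Hypothetical helper. */
size_t utf8_bom_length(const unsigned char *buf, size_t len)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };
    assert(buf != NULL || len == 0);
    if (len >= sizeof bom && memcmp(buf, bom, sizeof bom) == 0)
        return sizeof bom;
    return 0;
}
```

A reader would typically call this on the first bytes of the file and start decoding at buf + utf8_bom_length(buf, len), so that files with and without a BOM are handled the same way.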
Title: Re: UTF8 disk I/O library ....
Post by: jj2007 on March 10, 2013, 08:34:52 AM
Quote from: dedndave on March 10, 2013, 08:29:07 AM
the differences may be due to the absence or presence of a BOM at the beginning of the file
while not required by the UTF-8 spec, it is usually present in windows unicode text files

That's what I also suspected but (Main.c):

    // opening for read
    wprintf(L"Reading Utf8.txt\n");
    file = OpenUTF8(L"Utf8.txt", FM_READ);
    if(GetLastError() == ERR_BOM)
      wprintf(L"No BOM! ... ");


The Utf8.txt file has a BOM, and besides, Tater's code works if the UTF-8 strings are English only.
Title: Re: UTF8 disk I/O library ....
Post by: Gunther on March 10, 2013, 08:51:59 AM
Hi CommonTater,

I've read the entire thread and I'm not the referee nor the moderator here, but:

Quote from: CommonTater on March 10, 2013, 08:09:03 AM
Listen up ASSHOLE .... Now take your petty STUPID vendettas and ram them right up your FAT HOMO ASS!

that is not our way of speaking here!

Gunther
Title: Re: UTF8 disk I/O library ....
Post by: CommonTater on March 10, 2013, 11:39:55 AM
Quote from: Gunther on March 10, 2013, 08:51:59 AM
that is not our way of speaking here!

Obviously... and I apologize to everyone but JJ.
AND... if he starts his bullshit up again, he'll get exactly the same reaction from me every time!

It's got nothing to do with the Byte Order Mark ... My code consumes that without Microsoft's text conversion routines ever seeing it... in fact they would be confused by it.

It's got nothing to do with main.c .... the test program does exactly what it's supposed to do... it shows you how to call the functions.

It is about the limitations on windows console (cmd.exe).  The test??.exe files were used for debugging purposes and were provided with the archive as demonstrations. They show you how to call the library's 4 functions... It's NOT a word processor. It's NOT a multilanguage display system... it's a freaking demo program and nothing more.

The reason he thinks it doesn't work is he is trying to do multiple foreign languages in WINDOWS CONSOLE which is bound to a specific code page and after the first text output can only correctly display text within that one --singular-- codepage. 

Let me say this one more time for perfect clarity ...

It's the console window that can't display the text
The library works... or it would not have been online.



Windows GUI, on the other hand uses a different system and can and does work correctly with the UTF8 library.  And most importantly it works BECAUSE, I've used the windows native conversion apis... you know the ones in use all around the world with no problems at all. 

Don't believe me?  Well the source code is in the zip file... look up the functions used...  Nothing is hidden, it's all right there for anyone to examine...

This is a 100% manufactured problem that exists only in JJ's dementia and noplace else.

And, despite the people telling me to grow a thicker skin or buy a sense of humour....

NO I do not have to take that crap from anyone.
I never have in the past and I'm not going to start now.

Do we understand one another now?

NB: Edited severely by the Admin
Title: Re: UTF8 disk I/O library ....
Post by: Dubby on March 10, 2013, 06:26:51 PM
whoa... what happened, guys...??

first, I don't want to insult anyone.. and I would like to apologize if I do so...

OK here's my comment..
the CMD window, or console as you call it, simply uses its selected font to do font rendering... without any font replacement, which GUI dialog boxes do...
for a simple test, just open up cmd and copy and paste some Unicode characters (not the English ones).
Now choose a different available font; on my win7 system the "consolas" font is available, choose that one.. now copy and paste some Russian text.. it will be displayed correctly...

why? simply... because the font supports it...
now paste some Chinese characters, it will display nothing but garbage...
then what about choosing another font?
but hey MS doesn't even let me choose another font..
just go here: http://blogs.msdn.com/b/oldnewthing/archive/2007/05/16/2659903.aspx
and here: http://support.microsoft.com/kb/247815
you'll see why...
but it will be another case if you direct the output to another window....


Some people might be better at one thing but not at another...

I'm not against anyone here, but the only thing I support is the truth... (oops, is it too rhetorical..?)
again, my apologies to anyone who might have been insulted...
Title: Re: UTF8 disk I/O library ....
Post by: jj2007 on March 10, 2013, 07:15:57 PM
Quote from: CommonTater on March 10, 2013, 11:39:55 AM
The reason he thinks it doesn't work is that  the dumb shit is trying to do multiple foreign languages in WINDOWS CONSOLE which is bound to a specific code page and after the first text output can only correctly display text within that one --singular-- codepage.

Tater, you are confused. The Windows console is indeed able to display several languages simultaneously, using one codepage - UTF8. My examples demonstrate that very clearly. It is a bit challenging, yes, but it is feasible. Otherwise there would be no console windows in a country that has three times the population of the U.S.

Now if you are truly convinced that it's only a console issue, you are free to post a GUI example with a simple edit control. I am really curious if it works, honestly :biggrin:
Title: Re: UTF8 disk I/O library ....
Post by: jj2007 on March 10, 2013, 08:00:13 PM
Quote from: Dubby on March 10, 2013, 06:26:51 PM
for a simple test, just open up cmd and copy and paste some Unicode characters (not the English ones).
Now choose a different available font; on my win7 system the "consolas" font is available, choose that one.. now copy and paste some Russian text.. it will be displayed correctly...

why? simply... because the font supports it...
now paste some Chinese characters, it will display nothing but garbage...

Hi Dubby,

The copy & paste test works fine with Russian, but it is a bit trickier with the more exotic fonts like Arabic or Chinese. Even if the fonts are installed (like on my machines), pasting doesn't work - but you can print them to the console. Ask a Billion Chinese, they can confirm that ;-)
Title: Re: UTF8 disk I/O library ....
Post by: Dubby on March 10, 2013, 09:20:32 PM
I think I need to correct something in my previous post...
"it's not about the font but the locale..."

Isn't it because their codepage is already set to Chinese?

simply change the system locale...
I guess almost all non-English folks have their default locale set to their language...
the attachment below contains 2 images.. one is in English locale and one in Chinese locale; both of them were using utf-8 set for the console output..
Title: Re: UTF8 disk I/O library ....
Post by: jj2007 on March 10, 2013, 09:26:59 PM
Quote from: Dubby on March 10, 2013, 09:20:32 PM
"it's not about the font but the locale..."

Isn't it because their codepage is already set to Chinese?

simply change the system locale...

Well, mine is set to Italian - and I can display simultaneously Italian, English, Russian, Chinese, Arabic and a number of others. The reason is simple: Codepage UTF-8 alias 65001 is meant for that. It can display every language for which fonts are installed.

It is actually a bit trickier if you look at the details, but it works :biggrin:
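
Editor's note: for readers following along in C, here is a rough sketch of the same idea: switch the console output code page to 65001, then write the UTF-8 bytes unchanged. SetConsoleOutputCP and CP_UTF8 are real Win32 names; the two wrapper functions are hypothetical, and this is not a description of how MasmBasic or the UTF8 library does it. As noted in the thread, the console font must still contain the glyphs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Switch the console output code page to UTF-8 (65001) on Windows.
   Returns 1 on success, 0 on failure. On other systems this is a
   no-op that returns 1 (assumes the terminal already expects UTF-8). */
int console_use_utf8(void)
{
#ifdef _WIN32
    return SetConsoleOutputCP(CP_UTF8) ? 1 : 0;
#else
    return 1;
#endif
}

/* Write a UTF-8 encoded string plus a newline to stdout as raw bytes.
   Returns the number of bytes written. */
size_t print_utf8_line(const char *utf8)
{
    assert(utf8 != NULL);
    size_t n = fwrite(utf8, 1, strlen(utf8), stdout);
    n += fwrite("\n", 1, 1, stdout);
    return n;
}
```

A caller would run console_use_utf8() once at startup and then pass each line read from the file to print_utf8_line(). Writing narrow bytes avoids any wide-to-narrow conversion on output, which is a design choice of this sketch and not something prescribed in the thread.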
Title: Re: UTF8 disk I/O library ....
Post by: Dubby on March 10, 2013, 09:45:04 PM
okay... would you be kind enough to provide an example... :D
or at least how to achieve it...
Title: Re: UTF8 disk I/O library ....
Post by: jj2007 on March 10, 2013, 09:48:22 PM
Quote from: Dubby on March 10, 2013, 09:45:04 PM
okay... would you be kind enough to provide an example... :D
or at least how to achieve it...

See reply #1.

Just saw that the example there uses "true" Unicode. Here is the UTF-8 version (Utf8.txt attached):

include \masm32\MasmBasic\MasmBasic.inc        ; download (http://masm32.com/board/index.php?topic=94.0)
  Init
  Recall "Utf8.txt", MyStrings$()        ; read strings from file into an array
  ; remember that a) user needs to set console font to Lucida, b) Chinese & Arabic fonts must be installed
  SetCpUtf8                ; set the codepage
  For_ ebx=0 To eax-1
        Print MyStrings$(ebx), CrLf$        ; print to console
  Next
  Inkey CrLf$, "--- hit any key ---"        ; wait for a keypress
  Exit
end start