Ascii chart

aw27 · March 05, 2019, 11:23:36 PM

Quote from: HSE on March 05, 2019, 11:12:47 PM
:t Indeed, I was thinking of the "console output code page" (apparently the "input console code page" is the same by default), but Jimg's chart uses the "system code page". In any case, you usually only access one code page.

I don't know what you mean by "console output code page", "input console code page" and "system code page".
There are two groups of code pages in Windows systems: OEM and ANSI code pages. Why invent new names when we already have names?

HSE · March 05, 2019, 11:27:00 PM

Quote from: AW on March 05, 2019, 11:23:36 PM
There are two groups of code pages in Windows systems: OEM and ANSI code pages

Those pages are used in different ways.
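
For anyone following along, the names in this exchange map onto four separate Win32 queries: `GetACP` (ANSI), `GetOEMCP` (OEM), `GetConsoleCP` (console input) and `GetConsoleOutputCP` (console output). Below is a minimal Python sketch, not taken from the thread, that reads all four through `ctypes`. The helper name `query_code_pages` is made up for illustration, and the code assumes a Windows machine (elsewhere it just returns `None`).

```python
import ctypes
import sys


def query_code_pages():
    """Return the four code page numbers discussed above, or None off Windows."""
    if sys.platform != "win32":
        return None  # the kernel32 calls below only exist on Windows
    kernel32 = ctypes.windll.kernel32
    return {
        "ANSI (GetACP)": kernel32.GetACP(),
        "OEM (GetOEMCP)": kernel32.GetOEMCP(),
        # Both console calls return 0 when no console is attached.
        "console input (GetConsoleCP)": kernel32.GetConsoleCP(),
        "console output (GetConsoleOutputCP)": kernel32.GetConsoleOutputCP(),
    }


if __name__ == "__main__":
    pages = query_code_pages()
    if pages is None:
        print("Not running on Windows; nothing to query.")
    else:
        for name, number in pages.items():
            print(f"{name}: {number}")
```

Running `chcp` in a console changes the console values only; what `GetACP` and `GetOEMCP` return stays the same.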

TimoVJL · March 05, 2019, 11:49:55 PM

>chcp 437
Active code page: 437

>фысшш

word ascii from keyboard ::)

aw27 · March 06, 2019, 12:00:21 AM

Marketing people would say that all code pages cater to the same market.

I wrote some Portuguese text in Notepad++ ("Anões e cães é um caso ímpar disse o júri."), saved it with code page 860, then I went to the console:

>chcp 860
Active code page: 860

>type lol.txt
Anões e cães é um caso ímpar disse o júri.

>chcp 1252
Active code page: 1252

>type lol.txt
An"es e c,,es , um caso ¡mpar disse o j£ri.

>chcp 437
Active code page: 437

>type lol.txt
Anöes e cäes é um caso ímpar disse o júri.
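
The experiment above can be repeated without a console. The following Python sketch is an illustration, not aw27's method: it encodes the same sentence with the `cp860` codec and then decodes the identical bytes under each of the three code pages. `view_as`, `TEXT` and `RAW` are hypothetical names.

```python
TEXT = "Anões e cães é um caso ímpar disse o júri."

# The bytes a file holds when the sentence is saved in OEM code page 860.
RAW = TEXT.encode("cp860")


def view_as(data, code_page):
    """Decode the same bytes under a given code page number."""
    return data.decode(f"cp{code_page}")


if __name__ == "__main__":
    for cp in (860, 1252, 437):
        print(cp, view_as(RAW, cp))
```

Every accented letter is a single byte in the file, and only the interpretation changes. 437 and 860 agree on é, í and ú but differ on the bytes for ã and õ, which is why the 437 output is almost right. Under 1252 the same bytes decode to the quotation marks U+201D, U+201E and U+201A plus ¡ and £; the plain `"` and `,` shown in the console listing are presumably look-alike substitutions made for display.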

aw27 · March 06, 2019, 09:22:09 PM

Quote from: jj2007 on March 05, 2019, 10:22:02 PM
Quote from: HSE on March 05, 2019, 09:25:51 PM
The program shows the code page set on your machine. Just that 8)

There is more than one code page in your system. Check HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage

I have 138, obtained with TCC (ex 4DOS):
10000 (MAC - Roman)
10001 (MAC - Japanese)
10002 (MAC - Traditional Chinese Big5)
10003 (MAC - Korean)
10004 (MAC - Arabic)
10005 (MAC - Hebrew)
10006 (MAC - Greek I)
10007 (MAC - Cyrillic)
10008 (MAC - Simplified Chinese GB 2312)
10010 (MAC - Romania)
10017 (MAC - Ukraine)
10021 (MAC - Thai)
10029 (MAC - Latin II)
10079 (MAC - Icelandic)
10081 (MAC - Turkish)
10082 (MAC - Croatia)
1026 (IBM EBCDIC - Turkish (Latin-5))
1047 (IBM EBCDIC - Latin-1/Open System)
1140 (IBM EBCDIC - U.S./Canada (37 + Euro))
1141 (IBM EBCDIC - Germany (20273 + Euro))
1142 (IBM EBCDIC - Denmark/Norway (20277 + Euro))
1143 (IBM EBCDIC - Finland/Sweden (20278 + Euro))
1144 (IBM EBCDIC - Italy (20280 + Euro))
1145 (IBM EBCDIC - Latin America/Spain (20284 + Euro))
1146 (IBM EBCDIC - United Kingdom (20285 + Euro))
1148 (IBM EBCDIC - International (500 + Euro))
1149 (IBM EBCDIC - Icelandic (20871 + Euro))
1250 (ANSI - Central Europe)
1251 (ANSI - Cyrillic)
1252 (ANSI - Latin I)
1253 (ANSI - Greek)
1254 (ANSI - Turkish)
1255 (ANSI - Hebrew)
1256 (ANSI - Arabic)
1257 (ANSI - Baltic)
1258 (ANSI/OEM - Viet Nam)
1361 (Korean - Johab)
20000 (CNS - Taiwan)
20001 (TCA - Taiwan)
20002 (Eten - Taiwan)
20003 (IBM5550 - Taiwan)
20004 (TeleText - Taiwan)
20005 (Wang - Taiwan)
20105 (IA5 IRV International Alphabet No.5)
20106 (IA5 German)
20107 (IA5 Swedish)
20108 (IA5 Norwegian)
20127 (US-ASCII)
20261 (T.61)
20269 (ISO 6937 Non-Spacing Accent)
20273 (IBM EBCDIC - Germany)
20277 (IBM EBCDIC - Denmark/Norway)
20278 (IBM EBCDIC - Finland/Sweden)
20280 (IBM EBCDIC - Italy)
20284 (IBM EBCDIC - Latin America/Spain)
20285 (IBM EBCDIC - United Kingdom)
20290 (IBM EBCDIC - Japanese Katakana Extended)
20297 (IBM EBCDIC - France)
20420 (IBM EBCDIC - Arabic)
20423 (IBM EBCDIC - Greek)
20424 (IBM EBCDIC - Hebrew)
20833 (IBM EBCDIC - Korean Extended)
20838 (IBM EBCDIC - Thai)
20866 (Russian - KOI8)
20871 (IBM EBCDIC - Icelandic)
20880 (IBM EBCDIC - Cyrillic (Russian))
20905 (IBM EBCDIC - Turkish)
20924 (IBM EBCDIC - Latin-1/Open System (1047 + Euro))
20932 (JIS X 0208-1990 & 0212-1990)
20936 (Simplified Chinese GB2312)
21025 (IBM EBCDIC - Cyrillic (Serbian, Bulgarian))
21027 (Ext Alpha Lowercase)
21866 (Ukrainian - KOI8-U)
28591 (ISO 8859-1 Latin I)
28592 (ISO 8859-2 Central Europe)
28593 (ISO 8859-3 Latin 3)
28594 (ISO 8859-4 Baltic)
28595 (ISO 8859-5 Cyrillic)
28596 (ISO 8859-6 Arabic)
28597 (ISO 8859-7 Greek)
28598 (ISO 8859-8 Hebrew: Visual Ordering)
28599 (ISO 8859-9 Latin 5)
28603 (ISO 8859-13 Latin 7)
28605 (ISO 8859-15 Latin 9)
37 (IBM EBCDIC - U.S./Canada)
38598 (ISO 8859-8 Hebrew: Logical Ordering)
437 (OEM - United States)
500 (IBM EBCDIC - International)
50220 (ISO-2022 Japanese with no halfwidth Katakana)
50221 (ISO-2022 Japanese with halfwidth Katakana)
50222 (ISO-2022 Japanese JIS X 0201-1989)
50225 (ISO-2022 Korean)
50227 (ISO-2022 Simplified Chinese)
50229 (ISO-2022 Traditional Chinese)
51949 (EUC-Korean)
52936 (HZ-GB2312 Simplified Chinese)
54936 (GB18030 Simplified Chinese)
55000 (SMS GSM 7bit)
55001 (SMS GSM 7bit Spanish)
55002 (SMS GSM 7bit Portuguese)
55003 (SMS GSM 7bit Turkish)
55004 (SMS GSM 7bit Greek)
57002 (ISCII - Devanagari)
57003 (ISCII - Bengali)
57004 (ISCII - Tamil)
57005 (ISCII - Telugu)
57006 (ISCII - Assamese)
57007 (ISCII - Odia (Oriya))
57008 (ISCII - Kannada)
57009 (ISCII - Malayalam)
57010 (ISCII - Gujarati)
57011 (ISCII - Punjabi (Gurmukhi))
708 (Arabic - ASMO)
720 (Arabic - Transparent ASMO)
737 (OEM - Greek 437G)
775 (OEM - Baltic)
850 (OEM - Multilingual Latin I)
852 (OEM - Latin II)
855 (OEM - Cyrillic)
857 (OEM - Turkish)
858 (OEM - Multilingual Latin I + Euro)
860 (OEM - Portuguese)
861 (OEM - Icelandic)
862 (OEM - Hebrew)
863 (OEM - Canadian French)
864 (OEM - Arabic)
865 (OEM - Nordic)
866 (OEM - Russian)
869 (OEM - Modern Greek)
870 (IBM EBCDIC - Multilingual/ROECE (Latin-2))
874 (ANSI/OEM - Thai)
875 (IBM EBCDIC - Modern Greek)
932 (ANSI/OEM - Japanese Shift-JIS)
936 (ANSI/OEM - Simplified Chinese GBK)
949 (ANSI/OEM - Korean)
950 (ANSI/OEM - Traditional Chinese Big5)
65000 (UTF-7)
65001 (UTF-8)

jj2007 · March 06, 2019, 11:22:07 PM

include \masm32\MasmBasic\MasmBasic.inc ; download
Init
GetRegArray "HKLM\SYSTEM\CurrentControlSet\Control\Nls\CodePage", MyCP$(), MyData$()
For_ each esi in MyCP$(): <PrintLine esi, Tb$, MyData$(ForNextCounter)>
Inkey Str$("%i entries found", MyCP$(?))
EndOfCode

...
20261   c_20261.nls
50229   c_is2022.dll
ACP     1252
OEMCP   850
MACCP   10000
137 entries found
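
For readers without MasmBasic, roughly the same registry listing can be sketched in Python with the standard `winreg` module. This is an assumption-laden equivalent, not a translation of the code above: `list_code_page_entries` is a hypothetical helper, and it returns an empty list when not run on Windows.

```python
import sys

KEY_PATH = r"SYSTEM\CurrentControlSet\Control\Nls\CodePage"


def list_code_page_entries():
    """Return (name, data) pairs from the CodePage key; empty list off Windows."""
    if sys.platform != "win32":
        return []
    import winreg  # standard library, available on Windows only

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
        value_count = winreg.QueryInfoKey(key)[1]  # number of values under the key
        # EnumValue returns (name, data, type); keep only name and data.
        return [winreg.EnumValue(key, i)[:2] for i in range(value_count)]


if __name__ == "__main__":
    entries = list_code_page_entries()
    for name, data in entries:
        print(f"{name}\t{data}")
    print(f"{len(entries)} entries found")
```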

jimg · March 08, 2019, 11:40:59 AM

Added quickie code pages dump. Unfortunately printing different code pages in a rich edit changes something I haven't found (unprintable characters 129, 141, 143, and 144 have incorrect widths) so next print of ascii table does not line up. The only way I found to fix it is to close the rich edit and make a new one, but I'm still looking.

If anyone knows any other way to print characters from multiple code pages at the same time, other than with a rich edit, please let me know.

edit:
combined with original in first post.
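
One possibly related observation, offered as an assumption rather than a diagnosis of the rich edit behaviour: 129, 141, 143 and 144 are among the byte values that have no character assigned in code page 1252. The Python sketch below lists the bytes a single-byte codec cannot decode; `unassigned_bytes` is a hypothetical helper, and Python's codec tables are used here only as a convenient stand-in for the Windows ones.

```python
def unassigned_bytes(codec):
    """List byte values that the given single-byte codec cannot decode."""
    missing = []
    for value in range(256):
        try:
            bytes([value]).decode(codec)
        except UnicodeDecodeError:
            missing.append(value)
    return missing


if __name__ == "__main__":
    for codec in ("cp1252", "cp437"):
        print(codec, unassigned_bytes(codec))
```

With Python's tables, `cp1252` reports 129, 141, 143, 144 and 157, while `cp437` reports none, so a chart that walks all 256 values hits undefined slots in some code pages but not in others.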

jimg · May 26, 2019, 06:28:43 AM

I have often needed a way to throw up a quick image to look at while working on another program. There are many programs out there to do that, but I wanted as little impact on the screen as possible, i.e. no title, no frame, just the image, made topmost. So I added the capability to my Ascii chart program for convenience. You can load any of the normal image types that gdiplus handles (bmp, png, tif, gif, and jpg). The program also accepts drag and drop.

Download from the first post in this thread.

jimg · May 27, 2019, 03:50:00 AM

added image types .emf and .ico
other minor bug fixes

Tedd · June 01, 2019, 11:18:39 PM

Quote from: jimg on March 08, 2019, 11:40:59 AM
If anyone knows any other way to print characters from multiple code pages at the same time, other than with a rich edit, please let me know.

Codepages are a way to squeeze a subset of characters into a 'page' of 8-bit sized values; as you've found, they're not meant to be mixed.
If only there were some way to take all of the world's characters and put them into one giant 'page' -- of course, there is. Windows uses "wide characters", which are 16 bits each, instead of trying to squeeze them into 8 bits (note: these are UTF-16 code units, so a wide character is not always a whole Unicode character; code points above U+FFFF take two of them).

So, what you need to do is convert from each codepage into wide characters, and only actually use wide characters for everything (you'll also need to use the ...W versions of functions, where appropriate).

Luckily, the hard part is already done for you: MultiByteToWideChar
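
The idea can be tried out in a few lines of Python before writing the assembly. In this sketch, `bytes.decode` stands in for `MultiByteToWideChar` (it is an analogue, not the same call), and `to_wide`, `wide_bytes`, `SAME_BYTE`, `CODE_PAGES` and `MIXED` are hypothetical names. It takes one byte value, interprets it under four code pages, and keeps all four results in a single string.

```python
def to_wide(data, code_page):
    """Decode code-page bytes to a Unicode string (rough analogue of MultiByteToWideChar)."""
    return data.decode(f"cp{code_page}")


def wide_bytes(text):
    """Encode as UTF-16LE, the 16-bit layout that the ...W functions expect."""
    return text.encode("utf-16-le")


SAME_BYTE = b"\x94"
CODE_PAGES = (437, 860, 866, 1252)

# One byte value, four different characters, all held in one string.
MIXED = "".join(to_wide(SAME_BYTE, cp) for cp in CODE_PAGES)

if __name__ == "__main__":
    for cp, ch in zip(CODE_PAGES, MIXED):
        print(f"cp{cp}: byte 0x94 -> U+{ord(ch):04X}")
    print(len(wide_bytes(MIXED)), "bytes as UTF-16LE")
```

Once converted, the four characters no longer depend on any code page, so they can sit side by side in the same output, which is what the chart needs.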

jimg · June 02, 2019, 01:21:54 AM

Thanks Tedd :)

The MASM Forum
