ISO/IEC 8859-2 - Wikipedia

2003-06-27

ISO/IEC 8859-2

| MIME / IANA | ISO-8859-2 |
|---|---|
| Alias(es) | iso-ir-101, csISOLatin2, latin2, l2, IBM1111 |
| Language | (see below) |
| Standard | ECMA-94:1986, ISO/IEC 8859 |
| Classification | Extended ASCII, ISO/IEC 8859 |
| Extends | US-ASCII |
| Based on | ISO-8859-1 |
| Other related encodings | Windows-1250, MacCroatian |

ISO/IEC 8859-2:1999, Information technology -- 8-bit single-byte coded graphic character sets -- Part 2: Latin alphabet No. 2, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987. It is informally referred to as "Latin-2". It is generally intended for Central[1] or "Eastern European" languages that are written in the Latin script. Note that ISO/IEC 8859-2 is very different from code page 852 (MS-DOS Latin 2, PC Latin 2), which is also referred to as "Latin-2" in Czech and Slovak regions.[2] Almost half of the encoding's use is for Polish, for which it is the main legacy encoding; on the web, however, virtually all of its use has been replaced by UTF-8.

ISO-8859-2 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. Fewer than 0.04% of all web pages used ISO-8859-2 as of October 2022.[3][4] Microsoft has assigned code page 28592, a.k.a. Windows-28592, to ISO-8859-2 in Windows. IBM assigned code page 912 to ISO 8859-2,[5] until that code page was extended in 1999.[6] Code page 1111 is similar, but replaces byte B0 ° (degree sign) with U+02DA ˚ (ring above).

Windows-1250 is similar to ISO-8859-2 and contains all of its printable characters and more. However, a few of them are rearranged (unlike Windows-1252, which keeps all printable characters from ISO-8859-1 in the same place).

Language coverage

Notes on the languages these code values can be used for:

- Finnish: The missing letter Å is officially a part of the Finnish alphabet; however, it has no native use and its usage is limited to foreign names only.
- German: In 2017, the Council for German Orthography officially added a capital ẞ, but it is not actually required, as SS can be used instead.
- Romanian: This character set unifies Ș and Ț (S, T with commas below) with Ş and Ţ (S, T with cedillas), as did virtually all other character sets, including Microsoft's Windows-1250 and the first version of Unicode. However, Unicode subsequently disunified them, which complicates processing of Romanian data, as pre-existing data and input methods still contain the older cedilla code points.[citation needed]

Code page layout

Cells that differ from ISO-8859-1 show the Unicode code point number after the character. Rows 0x, 1x, 8x and 9x are left empty in this chart.

| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0x | | | | | | | | | | | | | | | | |
| 1x | | | | | | | | | | | | | | | | |
| 2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
| 3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
| 4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
| 5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
| 6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
| 7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | |
| 8x | | | | | | | | | | | | | | | | |
| 9x | | | | | | | | | | | | | | | | |
| Ax | NBSP | Ą 0104 | ˘ 02D8 | Ł 0141 | ¤ | Ľ 013D | Ś 015A | § | ¨ | Š 0160 | Ş 015E | Ť 0164 | Ź 0179 | SHY | Ž 017D | Ż 017B |
| Bx | ° | ą 0105 | ˛ 02DB | ł 0142 | ´ | ľ 013E | ś 015B | ˇ 02C7 | ¸ | š 0161 | ş 015F | ť 0165 | ź 017A | ˝ 02DD | ž 017E | ż 017C |
| Cx | Ŕ 0154 | Á | Â | Ă 0102 | Ä | Ĺ 0139 | Ć 0106 | Ç | Č 010C | É | Ę 0118 | Ë | Ě 011A | Í | Î | Ď 010E |
| Dx | Đ 0110 | Ń 0143 | Ň 0147 | Ó | Ô | Ő 0150 | Ö | × | Ř 0158 | Ů 016E | Ú | Ű 0170 | Ü | Ý | Ţ 0162 | ß |
| Ex | ŕ 0155 | á | â | ă 0103 | ä | ĺ 013A | ć 0107 | ç | č 010D | é | ę 0119 | ë | ě 011B | í | î | ď 010F |
| Fx | đ 0111 | ń 0144 | ň 0148 | ó | ô | ő 0151 | ö | ÷ | ř 0159 | ů 016F | ú | ű 0171 | ü | ý | ţ 0163 | ˙ 02D9 |

References

1. "Microsoft Outlook Message Encodings". 10 January 2017.
2. "The Czech and Slovak Character Encoding Mess Explained". luki.sdf-eu.org. Retrieved 2022-02-27.
3. "Usage Statistics and Market Share of ISO-8859-2 for Websites, October 2022". w3techs.com. Retrieved 2022-10-23.
4. "Historical trends in the usage statistics of character encodings for websites, February 2022".
5. "Icu-data/Charset/Data/XML/Ibm-912_P100-1995.XML at main · unicode-org/Icu-data". GitHub.
6. "Icu-data/Charset/Data/Ucm/Ibm-912_P100-1999.ucm at main · unicode-org/Icu-data". GitHub.

External links

- ISO/IEC 8859-2:1999
- Standard ECMA-94: 8-Bit Single Byte Coded Graphic Character Sets - Latin Alphabets No. 1 to No. 4, 2nd edition (June 1986)
- ISO-IR 101 Right-Hand Part of Latin Alphabet No.2 (February 1, 1986)
- ISO 8859-2 (Latin 2) Resources
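
Example

The relationships described in this article (one byte per character, the positions that differ from ISO-8859-1, and the rearrangement in Windows-1250) can be explored with a short script. The following is a minimal sketch in Python, assuming the codec names `iso8859_2`, `latin_1` and `cp1250` from CPython's standard library; the function names and the sample sentence are illustrative and do not come from the standard.

```python
# Codec names as shipped with CPython's standard library (an assumption of
# this sketch; other platforms may use different labels for the same tables).
LATIN2 = "iso8859_2"
LATIN1 = "latin_1"
CP1250 = "cp1250"


def differences_from_latin1():
    """Map each byte in A0..FF whose Latin-2 character differs from Latin-1."""
    diffs = {}
    for byte in range(0xA0, 0x100):
        raw = bytes([byte])
        ch = raw.decode(LATIN2)
        if ch != raw.decode(LATIN1):
            diffs[byte] = ch
    return diffs


def moved_in_cp1250():
    """Return {char: (latin2_byte, cp1250_byte)} for rearranged characters."""
    moved = {}
    for byte in range(0xA0, 0x100):
        ch = bytes([byte]).decode(LATIN2)
        # encode() would raise UnicodeEncodeError if cp1250 lacked the character
        other = ch.encode(CP1250)[0]
        if other != byte:
            moved[ch] = (byte, other)
    return moved


# Hypothetical sample text: a Polish phrase using several Latin-2 letters.
text = "Zażółć gęślą jaźń"
encoded = text.encode(LATIN2)
assert encoded.decode(LATIN2) == text   # round-trips without loss
assert len(encoded) == len(text)        # single-byte encoding

for byte, ch in sorted(differences_from_latin1().items()):
    print(f"{byte:02X} {ch} U+{ord(ch):04X}")

for ch, (old, new) in moved_in_cp1250().items():
    print(f"{ch}: ISO-8859-2 {old:02X} -> Windows-1250 {new:02X}")
```

The first loop prints the cells that carry a code point number in the chart above; the second lists the characters that Windows-1250 places at a different byte value. Decoding the same bytes with the wrong one of these codecs produces no error, only different letters, which is why the encoding label has to travel with the data.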

Source: en.wikipedia.org