KOI8-U

From Wikipedia, the free encyclopedia
Character encoding for Ukrainian Cyrillic
KOI8-U
Languages: Ukrainian, Russian, Bulgarian
Classification: 8-bit KOI, extended ASCII
Extends: KOI8-B
Based on: KOI8-R
Other related encodings: KOI8-RU, KOI8-F

KOI8-U (RFC 2319) is an 8-bit character encoding designed to cover Ukrainian, which uses a Cyrillic alphabet. It is based on KOI8-R, which covers Russian and Bulgarian, but replaces eight box-drawing characters with the four Ukrainian letters Ґ, Є, І, and Ї in both upper case and lower case.
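The eight reassigned positions can be located by comparing the two encodings byte by byte. The following is a minimal sketch, assuming that Python's standard `koi8_r` and `koi8_u` codecs follow the mappings described in this article; the helper name `differing_bytes` is illustrative and not part of any standard.

```python
def differing_bytes(enc_a: str = "koi8_r", enc_b: str = "koi8_u") -> dict[int, tuple[str, str]]:
    """Map each byte value that decodes differently to its two characters."""
    diffs = {}
    for value in range(256):
        raw = bytes([value])
        # Both codecs are assumed to define all 256 byte values.
        char_a, char_b = raw.decode(enc_a), raw.decode(enc_b)
        if char_a != char_b:
            diffs[value] = (char_a, char_b)
    return diffs


if __name__ == "__main__":
    for value, (in_r, in_u) in differing_bytes().items():
        print(f"0x{value:02X}: KOI8-R U+{ord(in_r):04X} -> KOI8-U U+{ord(in_u):04X}")
```

Under that assumption, the differing bytes all fall in rows Ax and Bx of the table below, four lower-case letters in the first and their upper-case forms in the second.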

KOI8-RU is closely related, but adds Ў for Belarusian. In both, the letter allocations match those in KOI8-E, except for Ґ, which is added to KOI8-F.

In Microsoft Windows, KOI8-U is assigned the code page number 21866. In IBM, KOI8-U is assigned code page/CCSID 1168.[1][2][3]

KOI8 remains much more commonly used than ISO 8859-5, which never really caught on.[citation needed] Another common Cyrillic character encoding is Windows-1251. Both may eventually give way to Unicode.

KOI8 stands for Kod Obmena Informatsiey, 8 bit (Russian: Код Обмена Информацией, 8 бит), which means "Code for Information Exchange, 8 bit".

The KOI8 character sets have the property that the Cyrillic letters are in pseudo-Latin alphabetical order rather than Cyrillic alphabetical order as in ISO 8859-5. This has the useful effect that if the eighth bit is stripped and the text is presented in any character set based on ASCII, including the KOI8 sets themselves, the text is still reasonably human-readable as a case-reversed transliteration. For instance, "Код Обмена Информацией", the phrase behind the "KOI" acronym, becomes kOD oBMENA iNFORMACIEJ.
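The bit-stripping effect can be reproduced in a few lines. This is a minimal sketch that assumes Python's standard `koi8_u` codec; the function name `strip_eighth_bit` is illustrative.

```python
def strip_eighth_bit(text: str, encoding: str = "koi8_u") -> str:
    """Encode text, clear the high bit of every byte, and read the result as ASCII."""
    raw = text.encode(encoding)
    # Masking with 0x7F drops bit 7, moving each Cyrillic letter onto
    # the Latin letter of opposite case in the same column.
    return bytes(b & 0x7F for b in raw).decode("ascii")


print(strip_eighth_bit("Код Обмена Информацией"))  # kOD oBMENA iNFORMACIEJ
```

Characters that are already ASCII, such as the space, have the high bit clear and pass through unchanged.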

Character set

The following table shows the KOI8-U encoding.[1][4] Each character is shown with its equivalent Unicode code point.

KOI8-U
0 1 2 3 4 5 6 7 8 9 A B C D E F
0x
1x
2x SP ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~
8x  ─ 2500  │ 2502  ┌ 250C  ┐ 2510  └ 2514  ┘ 2518  ├ 251C  ┤ 2524  ┬ 252C  ┴ 2534  ┼ 253C  ▀ 2580  ▄ 2584  █ 2588  ▌ 258C  ▐ 2590
9x  ░ 2591  ▒ 2592  ▓ 2593  ⌠ 2320  ■ 25A0  ∙ 2219  √ 221A  ≈ 2248  ≤ 2264  ≥ 2265  NBSP 00A0  ⌡ 2321  ° 00B0  ² 00B2  · 00B7  ÷ 00F7
Ax  ═ 2550  ║ 2551  ╒ 2552  ё 0451  є 0454*  ╔ 2554  і 0456*  ї 0457*  ╗ 2557  ╘ 2558  ╙ 2559  ╚ 255A  ╛ 255B  ґ 0491*  ╝ 255D  ╞ 255E
Bx  ╟ 255F  ╠ 2560  ╡ 2561  Ё 0401  Є 0404*  ╣ 2563  І 0406*  Ї 0407*  ╦ 2566  ╧ 2567  ╨ 2568  ╩ 2569  ╪ 256A  Ґ 0490*  ╬ 256C  © 00A9
Cx  ю 044E  а 0430  б 0431  ц 0446  д 0434  е 0435  ф 0444  г 0433  х 0445  и 0438  й 0439  к 043A  л 043B  м 043C  н 043D  о 043E
Dx  п 043F  я 044F  р 0440  с 0441  т 0442  у 0443  ж 0436  в 0432  ь 044C  ы 044B  з 0437  ш 0448  э 044D  щ 0449  ч 0447  ъ 044A
Ex  Ю 042E  А 0410  Б 0411  Ц 0426  Д 0414  Е 0415  Ф 0424  Г 0413  Х 0425  И 0418  Й 0419  К 041A  Л 041B  М 041C  Н 041D  О 041E
Fx  П 041F  Я 042F  Р 0420  С 0421  Т 0422  У 0423  Ж 0416  В 0412  Ь 042C  Ы 042B  З 0417  Ш 0428  Э 042D  Щ 0429  Ч 0427  Ъ 042A

* Differences with KOI8-R (non-Russian letters)
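A row of code points like those in the table can be regenerated from a codec rather than copied by hand. The sketch below assumes Python's standard `koi8_u` codec matches the table; `table_row` is a hypothetical helper name.

```python
def table_row(high_nibble: int, encoding: str = "koi8_u") -> list[str]:
    """Return the 16 Unicode code points (4-digit hex) for bytes 0xN0 to 0xNF."""
    start = high_nibble << 4
    return [
        f"{ord(bytes([value]).decode(encoding)):04X}"
        for value in range(start, start + 16)
    ]


print("Cx", " ".join(table_row(0xC)))
```

Passing another codec name, for example `koi8_r`, yields the corresponding row for that encoding, which makes the rows easy to compare side by side.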

Although RFC 2319 says that character 0x95 should be U+2219 (∙), it may also be U+2022 (•) to match the bullet character in Windows-1251.
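A decoder that needs to support both readings of 0x95 could expose the choice as a flag. This is a minimal sketch, assuming Python's standard `koi8_u` codec follows RFC 2319 for this byte; the function name `decode_koi8u` and its `bullet_variant` parameter are hypothetical.

```python
def decode_koi8u(data: bytes, bullet_variant: bool = False) -> str:
    """Decode KOI8-U bytes, optionally mapping 0x95 to U+2022 instead of U+2219."""
    text = data.decode("koi8_u")
    if bullet_variant:
        # The mapping is one-to-one, so U+2219 can only have come from byte 0x95
        # and a plain replacement is enough.
        text = text.replace("\u2219", "\u2022")
    return text
```

Calling `decode_koi8u(data)` keeps the RFC 2319 mapping, while `decode_koi8u(data, bullet_variant=True)` produces the Windows-1251-style bullet.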

Some references have a typo and incorrectly state that character 0xB4 is U+0403, rather than the correct U+0404. This typo is present in Appendix A of RFC 2319 (but the table in the main text of the RFC gives the correct mapping).
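A mapping table taken from an external reference can be checked for this typo before use. The following is a minimal sketch; `has_correct_ie` is a hypothetical helper, and the table built from Python's standard `koi8_u` codec is only an assumed example of a correct source.

```python
def has_correct_ie(mapping: dict[int, int]) -> bool:
    """Return True if a byte-to-code-point table maps 0xB4 to U+0404 (Є)."""
    return mapping.get(0xB4) == 0x0404


# Example table derived from Python's built-in codec (assumed to be correct here).
builtin_table = {
    value: ord(bytes([value]).decode("koi8_u")) for value in range(0x80, 0x100)
}
print(has_correct_ie(builtin_table))

# A table carrying the typo from Appendix A would fail the check.
print(has_correct_ie({0xB4: 0x0403}))
```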

References
  1. "SBCS code page information - CPGID: 01168 / Name: Ukrainian KOI8-U". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. IBM. C-H 3-3220-050. Archived from the original on 2017-02-18. Retrieved 2017-02-18.
  2. "CCSID information document; CCSID 1168; KOI8-U". IBM. Archived from the original on 2017-02-18. Retrieved 2017-02-18.
  3. International Components for Unicode (ICU), ibm-1168_P100-2002.ucm, 2002-12-03.
  4. Verdy, Philippe; Richter, Helmut (2016-01-04) [2008-10-13]. "KOI8-U.TXT". 2.0. Retrieved 2016-12-09.
