In Thailand, the Thai Character Set standard, TIS 620-2533, is a
national standard for a primary set of graphic characters for Thai
information interchange. It was defined by the Thai Industrial
Standards Institute (TISI), Ministry of Industry, Royal Thai Government
in 1986 (Buddhist year 2529) and was revised in 1990 (Buddhist year
2533).
TIS 620 defines an eight-bit character environment. Assigned character
values are given below in a character table originally published by the National Electronics and Computer
Technology Center (NECTEC):
Thai character classification in TIS 620 is meant to ease computer
processing, such as display and input sequence checking, and
is not related to Thai linguistics.
The Thai characters are classified into six classes:
Control characters are nondisplayable characters used as control codes for output display or data communication. There are 66 control characters:
(0x00) to (0x1F), (0x7F), (0x80) to (0x9F), and (0xFF).
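The ranges above can be checked with a few lines of Python. This is only a sketch; the function name `is_control` is illustrative and does not come from the standard.

```python
def is_control(byte: int) -> bool:
    """Return True if a byte value falls in the control ranges listed above."""
    return byte <= 0x1F or byte == 0x7F or 0x80 <= byte <= 0x9F or byte == 0xFF

# Count the control codes across the whole eight-bit range.
control_count = sum(is_control(b) for b in range(256))
print(control_count)  # 66
```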
The TIS 620 character set contains 44 consonants in the range (0xA1) to (0xCE), excluding (0xC4) and (0xC6), as shown in the table:
Hexadecimal | Character Name | Thai Character |
---|---|---|
A1 | KO KAI | ก |
A2 | KHO KHAI | ข |
A3 | KHO KHUAT | ฃ |
A4 | KHO KHWAI | ค |
A5 | KHO KHON | ฅ |
A6 | KHO RAKHANG | ฆ |
A7 | NGO NGU | ง |
A8 | CHO CHAN | จ |
A9 | CHO CHING | ฉ |
AA | CHO CHANG | ช |
AB | SO SO | ซ |
AC | CHO CHOE | ฌ |
AD | YO YING | ญ |
AE | DO CHADA | ฎ |
AF | TO PATAK | ฏ |
B0 | THO THOTHAN | ฐ |
B1 | THO NANGMONTHO | ฑ |
B2 | THO PHOO THAO | ฒ |
B3 | NOR NANE | ณ |
B4 | DOR DEK | ด |
B5 | TO TAO | ต |
B6 | THO THUNG | ถ |
B7 | THO THAHAN | ท |
B8 | THO THONG | ธ |
B9 | NO NU | น |
BA | BO BAIMAI | บ |
BB | PO PLA | ป |
BC | PHO PHERNG | ผ |
BD | FO FA | ฝ |
BE | PO PAN | พ |
BF | FO FAN | ฟ |
C0 | PO SAMPOW | ภ |
C1 | MO MA | ม |
C2 | YO YAK | ย |
C3 | RO RUA | ร |
C5 | LO LING | ล |
C7 | WO WAEN | ว |
C8 | SO SALA | ศ |
C9 | SO RUSI | ษ |
CA | SO SUA | ส |
CB | HO HEEP | ห |
CC | LO CHULA | ฬ |
CD | O ANG | อ |
CE | HO NOKHUK | ฮ |
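As a minimal sketch of how these codes relate to Unicode: the Thai part of TIS 620 maps onto the Unicode Thai block at a constant offset (byte 0xA1 corresponds to U+0E01). The helper name `tis620_to_unicode` is hypothetical; the result is cross-checked against Python's built-in `tis_620` codec.

```python
def tis620_to_unicode(byte: int) -> str:
    """Map a TIS 620 Thai byte (0xA1-0xDA, 0xDF-0xFB) to its Unicode character."""
    return chr(0x0E00 + (byte - 0xA0))

# The 44 consonant codes: 0xA1-0xCE without 0xC4 and 0xC6.
consonant_bytes = [b for b in range(0xA1, 0xCF) if b not in (0xC4, 0xC6)]
consonants = "".join(tis620_to_unicode(b) for b in consonant_bytes)

# Cross-check the offset rule against Python's own TIS 620 codec.
assert bytes(consonant_bytes).decode("tis_620") == consonants
print(len(consonants), consonants[0], consonants[-1])  # 44 ก ฮ
```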
The TIS 620 character set contains 18 vowels, divided into four groups.
(1) Leading vowels (LV):
These vowels are placed before consonants. There are 5 leading vowels, as shown in the table below:
Hexadecimal | Character Name | Thai Character |
---|---|---|
E0 | SARA E | เ |
E1 | SARA AE | แ |
E2 | SARA O | โ |
E3 | SARA AI MAIMUAN | ใ |
E4 | SARA AI MAIMALAI | ไ |
(2) Following vowels (FV):
These vowels are placed after consonants. There are 6 following vowels, which are further divided into two groups.
Normal following vowels:
Hexadecimal | Character Name | Thai Character |
---|---|---|
D0 | SARA A | ะ |
D2 | SARA AA | า |
D3 | SARA AM | ำ |
E5 | LAKKHANGYAO | ๅ |
Special following vowels:
Hexadecimal | Character Name | Thai Character |
---|---|---|
C4 | RU | ฤ |
C6 | LU | ฦ |
(3) Below vowels (BV):
These vowels are placed below consonants. There are 2 below vowels, as shown in the table below:
Hexadecimal | Character Name | Thai Character |
---|---|---|
D8 | SARA U | ุ |
D9 | SARA UU | ู |
(4) Above vowels (AV):
These vowels are placed above consonants. There are 5 above vowels, as
shown in the table below:
Hexadecimal | Character Name | Thai Character |
---|---|---|
D1 | MAI HAN-AKAT | ั |
D4 | SARA I | ิ |
D5 | SARA II | ี |
D6 | SARA UR | ึ |
D7 | SARA UUR | ื |
The TIS 620 character set contains 4 tone marks:
Hexadecimal | Character Name | Thai Character |
---|---|---|
E8 | MAI EK | ่ |
E9 | MAI THO | ้ |
EA | MAI TRIE | ๊ |
EB | MAI CHATTAWA | ๋ |
The TIS 620 character set contains 5 diacritics divided into two groups.
(1) Above diacritics (AD).
These diacritics are placed above initial or final consonants. There are 4 above diacritics, as shown in the table below:
Hexadecimal | Character Name | Thai Character |
---|---|---|
E7 | MAITAIKHU | ็ |
EC | THANTHAKHAT | ์ |
ED | NIKHAHIT | ํ |
EE | YAMAKKAN | ๎ |
(2) Below diacritic (BD).
The below diacritic is placed below final or clustered consonants. There is only one below diacritic, as shown in the table below:
Hexadecimal | Character Name | Thai Character |
---|---|---|
DA | PHINTHU | ฺ |
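The TIS 620 character set contains 119 noncomposable characters in total; 18 of them are Thai-specific (the no-break space, ten Thai digits, six Thai special characters, and the word separator).
These characters cannot be composed with above vowels, below vowels,
tone marks, above diacritics, or the below diacritic. Noncomposable
characters are divided into seven groups.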
(1) Graphic characters.
There are 94 graphic characters, subdivided into 52 English alphabetic characters (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z), 10 digits (0 1 2 3 4 5 6 7 8 9), and 32 special characters (! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~).
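The 52 + 10 + 32 breakdown can be confirmed with Python's `string` module, whose constants cover the same ASCII groups. This is only a quick check, not part of the standard.

```python
import string

letters = string.ascii_letters   # A-Z and a-z
digits = string.digits           # 0-9
specials = string.punctuation    # the remaining printable ASCII symbols

graphic = letters + digits + specials
print(len(letters), len(digits), len(specials), len(graphic))  # 52 10 32 94

# Together they cover exactly the code range 0x21-0x7E.
assert sorted(graphic) == [chr(c) for c in range(0x21, 0x7F)]
```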
(2) Space (0x20).
(3) One no-break space:
Hexadecimal | Character Name | Thai Character |
---|---|---|
A0 | NO-BREAK SPACE | (no visible glyph) |
(4) Ten Thai digits:
Hexadecimal | Character Name | Thai Character |
---|---|---|
F0 | THAI ZERO | ๐ |
F1 | THAI ONE | ๑ |
F2 | THAI TWO | ๒ |
F3 | THAI THREE | ๓ |
F4 | THAI FOUR | ๔ |
F5 | THAI FIVE | ๕ |
F6 | THAI SIX | ๖ |
F7 | THAI SEVEN | ๗ |
F8 | THAI EIGHT | ๘ |
F9 | THAI NINE | ๙ |
(5) Six Thai special characters:
Hexadecimal | Character Name | Thai Character |
---|---|---|
CF | PAYANGNOI | ฯ |
DF | BAHT (Thai currency sign) | ฿ |
E6 | MAIYAMOK | ๆ |
EF | FONGMAN | ๏ |
FA | ANGKHANKHU | ๚ |
FB | KHOMUT | ๛ |
(6) One word separator:
Hexadecimal | Character Name | Thai Character |
---|---|---|
DC | WORD SEPARATOR | (no visible glyph) |
This character is a nonprintable character. It is used for separating words in Thai sentences. Applications can make use of it to simplify Thai word processing.
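A minimal sketch of how an application might use the separator, assuming the text arrives as raw TIS 620 bytes with (0xDC) already inserted between words. The sample phrase and variable names are illustrative only. The bytes are split before decoding because (0xDC) has no Unicode counterpart in Python's `tis_620` codec.

```python
WORD_SEPARATOR = b"\xdc"

# Build sample data: two Thai words joined by the word separator byte.
data = "ภาษา".encode("tis_620") + WORD_SEPARATOR + "ไทย".encode("tis_620")

# Split on the separator first, then decode each word on its own.
words = [chunk.decode("tis_620") for chunk in data.split(WORD_SEPARATOR)]
print(words)  # ['ภาษา', 'ไทย']
```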
(7) Reserved Characters.
There are 6 reserved characters: (0xDB), (0xDD), (0xDE),
(0xFC), (0xFD), and (0xFE).
In order to describe Thai input/output methods, some classes
(FV, BV, AD, and AV) have been further divided into
subclasses such as FV1, FV2, and FV3. This gives a total of
17 subclasses (Thaweesak et al. (1991)), detailed in the table
below:
Class | Number | Description |
---|---|---|
CTRL | 66 | Control characters: (0x00) - (0x1F), (0x7F), (0x80) - (0x9F), (0xFF) |
NON | 119 | Non-composable characters, including: (1) all printable ASCII characters, (0x20) - (0x7E); (2) TIS 620-2533 characters (0xA0), (0xDC), (0xCF), (0xDF), (0xE6), (0xEF), (0xF0) - (0xF9), (0xFA), (0xFB); (3) reserved characters (0xDB), (0xDD), (0xDE), (0xFC), (0xFD), (0xFE) |
CONS | 44 | Thai consonants, (0xA1) - (0xC3), (0xC5), (0xC7) - (0xCE) |
LV | 5 | (0xE0), (0xE1), (0xE2), (0xE3), (0xE4) |
FV1 | 3 | (0xD0), (0xD2), (0xD3) |
FV2 | 1 | (0xE5) |
FV3 | 2 | (0xC4), (0xC6) |
BV1 | 1 | (0xD8) |
BV2 | 1 | (0xD9) |
BD | 1 | (0xDA) |
TONE | 4 | (0xE8), (0xE9), (0xEA), (0xEB) |
AD1 | 2 | (0xED), (0xEC) |
AD2 | 1 | (0xE7) |
AD3 | 1 | (0xEE) |
AV1 | 1 | (0xD4) |
AV2 | 2 | (0xD1), (0xD6) |
AV3 | 2 | (0xD5), (0xD7) |
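The table can be turned into a 256-entry lookup, which also confirms that the 17 subclasses account for every byte value. This is a sketch; the names `SUBCLASSES`, `build_class_table`, and `CLASS_OF` are illustrative and not taken from WTT or TIS 620.

```python
from collections import Counter

# Explicitly listed subclasses, transcribed from the table above.
SUBCLASSES = {
    "LV": [0xE0, 0xE1, 0xE2, 0xE3, 0xE4],
    "FV1": [0xD0, 0xD2, 0xD3],
    "FV2": [0xE5],
    "FV3": [0xC4, 0xC6],
    "BV1": [0xD8],
    "BV2": [0xD9],
    "BD": [0xDA],
    "TONE": [0xE8, 0xE9, 0xEA, 0xEB],
    "AD1": [0xED, 0xEC],
    "AD2": [0xE7],
    "AD3": [0xEE],
    "AV1": [0xD4],
    "AV2": [0xD1, 0xD6],
    "AV3": [0xD5, 0xD7],
}

def build_class_table():
    """Return a 256-entry list mapping each byte value to its class name."""
    table = ["NON"] * 256  # anything not assigned below is non-composable
    control = list(range(0x00, 0x20)) + [0x7F] + list(range(0x80, 0xA0)) + [0xFF]
    for b in control:
        table[b] = "CTRL"
    for b in range(0xA1, 0xCF):  # consonant range; 0xC4 and 0xC6 are overwritten below
        table[b] = "CONS"
    for name, codes in SUBCLASSES.items():
        for b in codes:
            table[b] = name
    return table

CLASS_OF = build_class_table()
counts = Counter(CLASS_OF)
print(counts["CTRL"], counts["NON"], counts["CONS"])  # 66 119 44
```

A flat list indexed by byte value keeps classification to a single lookup per character, which suits per-keystroke input checking.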
Thai characters can also be classified according to character levels. There are five character levels:
Prior to Unicode, there was a common convention agreed upon by vendors
for implementing Thai, called WTT 2.0, based on the TIS 620 eight-bit
character set. (WTT, pronounced Wor Thor Thor, is a Thai
abbreviation of Wing Thook Thee, which means "Runs Everywhere".)
It comprises 3 parts, defining the general facilities, the Thai
input/output method, and the printer identification number, respectively.
According to WTT 2.0, there are some basic rules concerning the Thai input sequence:
WTT 2.0 defines 3 levels of syntactic strictness for the input method, as follows:
For strict check mode, the input sequence checking rules follow the
syntax diagram below:
A single table, shared by the input method and output method, is
defined for describing the character sequence conditions:
Previous / Following | CTRL | NON | CONS | LV | FV1 | FV2 | FV3 | BV1 | BV2 | BD | TONE | AD1 | AD2 | AD3 | AV1 | AV2 | AV3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CTRL | X | A | A | A | A | A | A | R | R | R | R | R | R | R | R | R | R |
NON | X | A | A | A | S | S | A | R | R | R | R | R | R | R | R | R | R |
CONS | X | A | A | A | A | S | A | C | C | C | C | C | C | C | C | C | C |
LV | X | S | A | S | S | S | S | R | R | R | R | R | R | R | R | R | R |
FV1 | X | S | A | S | A | S | A | R | R | R | R | R | R | R | R | R | R |
FV2 | X | A | A | A | A | S | A | R | R | R | R | R | R | R | R | R | R |
FV3 | X | A | A | A | S | A | S | R | R | R | R | R | R | R | R | R | R |
BV1 | X | A | A | A | A | S | A | R | R | R | C | C | R | R | R | R | R |
BV2 | X | A | A | A | S | S | A | R | R | R | C | R | R | R | R | R | R |
BD | X | A | A | A | S | S | A | R | R | R | R | R | R | R | R | R | R |
TONE | X | A | A | A | A | A | A | R | R | R | R | R | R | R | R | R | R |
AD1 | X | A | A | A | S | S | A | R | R | R | R | R | R | R | R | R | R |
AD2 | X | A | A | A | S | S | A | R | R | R | R | R | R | R | R | R | R |
AD3 | X | A | A | A | S | S | A | R | R | R | R | R | R | R | R | R | R |
AV1 | X | A | A | A | S | S | A | R | R | R | C | C | R | R | R | R | R |
AV2 | X | A | A | A | S | S | A | R | R | R | C | R | R | R | R | R | R |
AV3 | X | A | A | A | S | S | A | R | R | R | C | R | C | R | R | R | R |
The rows give the type of the previous character, and the columns give the type of the following character. The code in each table cell determines the condition for that ordering:
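The table lends itself to a direct lookup. The sketch below transcribes the rows as strings and checks a pair of character classes. The reading of the cell codes is an assumption, since their definitions are not reproduced here: A is taken as accept, C as compose, S as rejected only in strict mode, R as always rejected, and X as nondisplayable (passed through). The names `lookup` and `is_accepted` are illustrative.

```python
CLASSES = ["CTRL", "NON", "CONS", "LV", "FV1", "FV2", "FV3", "BV1", "BV2",
           "BD", "TONE", "AD1", "AD2", "AD3", "AV1", "AV2", "AV3"]

# One string per previous-character class; one letter per following class,
# in the column order of CLASSES (transcribed from the table above).
ROWS = {
    "CTRL": "XAAAAAA" + "RRRRRRRRRR",
    "NON":  "XAAASSA" + "RRRRRRRRRR",
    "CONS": "XAAAASA" + "CCCCCCCCCC",
    "LV":   "XSASSSS" + "RRRRRRRRRR",
    "FV1":  "XSASASA" + "RRRRRRRRRR",
    "FV2":  "XAAAASA" + "RRRRRRRRRR",
    "FV3":  "XAAASAS" + "RRRRRRRRRR",
    "BV1":  "XAAAASA" + "RRRCCRRRRR",
    "BV2":  "XAAASSA" + "RRRCRRRRRR",
    "BD":   "XAAASSA" + "RRRRRRRRRR",
    "TONE": "XAAAAAA" + "RRRRRRRRRR",
    "AD1":  "XAAASSA" + "RRRRRRRRRR",
    "AD2":  "XAAASSA" + "RRRRRRRRRR",
    "AD3":  "XAAASSA" + "RRRRRRRRRR",
    "AV1":  "XAAASSA" + "RRRCCRRRRR",
    "AV2":  "XAAASSA" + "RRRCRRRRRR",
    "AV3":  "XAAASSA" + "RRRCRCRRRR",
}

def lookup(prev_cls: str, next_cls: str) -> str:
    """Return the table code for a (previous, following) class pair."""
    return ROWS[prev_cls][CLASSES.index(next_cls)]

def is_accepted(prev_cls: str, next_cls: str, strict: bool = False) -> bool:
    """Assumed semantics: R always rejects, S rejects only in strict mode."""
    code = lookup(prev_cls, next_cls)
    if code == "R":
        return False
    if code == "S":
        return not strict
    return True  # A, C, and X are let through in this sketch

print(lookup("CONS", "TONE"))                  # C
print(is_accepted("TONE", "TONE"))             # False
print(is_accepted("LV", "LV", strict=True))    # False
```

For example, a tone mark directly after a consonant composes with it (C), while a second tone mark after a tone mark is rejected (R).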
The original Thai keyboard layout closely follows the layout of
the popular Thai typewriter.
There are several keyboard layouts popular in Thailand:
(1) Kedmanee (TIS 820-2531) keyboard layout
In 1986, the Thai Industrial Standards Institute (TISI) announced TIS 620-2529, the Thai standard character code for computers. Two years later, TISI announced the Kedmanee layout as the standard layout for computers (TIS 820-2531).
The TIS 820-2531 (1988) keyboard layout is shown below:
This "Kedmanee" keyboard layout was designed for typewriters. Because
typewriters have a limited number of keys, some Thai special characters
were left out.
(2) TIS 820-2538 (1995) keyboard layout
The TIS 820-2538 (1995) keyboard layout is shown below:
(3) Pattachote keyboard layout:
The Pattachote keyboard was also designed for typewriters, but with better finger-load distribution. Pattachote used the statistics of Thai keystroke distributions to design a new keyboard layout using the following principles: