##Adobe File Version: 1.000
#=======================================================================
#   FTP file name:  THAI.TXT
#
#   Contents:       Map (external version) from Mac OS Thai
#                   character set to Unicode 2.1
#
#   Copyright:      (c) 1995-1999 by Apple Computer, Inc., all rights
#                   reserved.
#
#   Contact:        charsets@apple.com
#
#   Changes:
#
#       b02  1999-Sep-22    Update contact e-mail address. Matches
#                           internal utom<b1>, ufrm<b2>, and Text
#                           Encoding Converter version 1.5.
#       n07  1998-Feb-05    Update to match internal utom<n5>, ufrm<n13>
#                           and Text Encoding Converter version 1.3:
#                           Use standard Unicodes plus transcoding hints
#                           instead of single corporate characters; see
#                           details below. Also update header comments
#                           to new format.
#       n04  1995-Nov-17    First version (after fixing some typos).
#                           Matches internal ufrm<n6>.
#
# Standard header:
# ----------------
#
#   Apple, the Apple logo, and Macintosh are trademarks of Apple
#   Computer, Inc., registered in the United States and other countries.
#   Unicode is a trademark of Unicode Inc. For the sake of brevity,
#   throughout this document, "Macintosh" can be used to refer to
#   Macintosh computers and "Unicode" can be used to refer to the
#   Unicode standard.
#
#   Apple makes no warranty or representation, either express or
#   implied, with respect to these tables, their quality, accuracy, or
#   fitness for a particular purpose. In no event will Apple be liable
#   for direct, indirect, special, incidental, or consequential damages 
#   resulting from any defect or inaccuracy in this document or the
#   accompanying tables.
#
#   These mapping tables and character lists are subject to change.
#   The latest tables should be available from the following:
#
#   <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
#   <ftp://dev.apple.com/devworld/Technical_Documentation/Misc._Standards/>
#
#   For general information about Mac OS encodings and these mapping
#   tables, see the file "README.TXT".
#
# Format:
# -------
#
#   Three tab-separated columns;
#   '#' begins a comment which continues to the end of the line.
#     Column #1 is the Mac OS Thai code (in hex as 0xNN)
#     Column #2 is the corresponding Unicode or Unicode sequence
#       (in hex as 0xNNNN or 0xNNNN+0xNNNN).
#     Column #3 is a comment containing the Unicode name
#
#   The entries are in Mac OS Thai code order.
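#
#   A minimal sketch, not part of Apple's tables, of reading one data
#   line in the format described above. It is written in Python; the
#   function name parse_line is hypothetical, and the sketch assumes
#   well-formed lines.
#
```python
def parse_line(line):
    """Return (mac_code, unicode_string) for a data line, or None."""
    # '#' begins a comment that continues to the end of the line.
    data = line.split("#", 1)[0].strip()
    if not data:
        return None  # blank line or comment-only line
    columns = data.split("\t")
    mac_code = int(columns[0], 16)  # column 1: 0xNN
    # Column 2: 0xNNNN, or a sequence written as 0xNNNN+0xNNNN.
    text = "".join(chr(int(part, 16)) for part in columns[1].split("+"))
    return mac_code, text


print(parse_line("0x41\t0x0041\t# LATIN CAPITAL LETTER A"))
```
#
#   Column 3 is discarded with the comment, since the Unicode name is
#   informational only.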
#
#   Some of these mappings require the use of corporate characters.
#   See the file "CORPCHAR.TXT" and notes below.
#
#   Control character mappings are not shown in this table, following
#   the conventions of the standard UTC mapping tables. However, the
#   Mac OS Thai character set uses the standard control characters at
#   0x00-0x1F and 0x7F.
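#
#   Because the control characters are omitted from the table, a
#   decoder has to supply them itself. The Python sketch below assumes
#   they map to the Unicode code points with the same values; the
#   function name control_mappings is hypothetical.
#
```python
def control_mappings():
    """Identity mappings for the control codes 0x00-0x1F and 0x7F."""
    # Assumption: each control code maps to the same-valued code point.
    codes = list(range(0x00, 0x20)) + [0x7F]
    return {code: chr(code) for code in codes}


table = control_mappings()
print(len(table))  # 32 codes in 0x00-0x1F plus 0x7F
```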
#
# Notes on Mac OS Thai:
# ---------------------
#
#   Codes 0xA1-0xDA and 0xDF-0xFB are the character set from Thai
#   standard TIS 620-2533, except that the following changes are
#   made:
#     0xEE is TRADE MARK SIGN (instead of THAI CHARACTER YAMAKKAN)
#     0xFA is REGISTERED SIGN (instead of THAI CHARACTER ANGKHANKHU)
#     0xFB is COPYRIGHT SIGN (instead of THAI CHARACTER KHOMUT)
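#
#   The relationship to TIS 620-2533 can be sketched in Python as a
#   fixed offset plus the three replacements listed above. This is an
#   illustration only: it assumes the usual TIS 620 to Unicode layout
#   (0xA1 at U+0E01), and the names TIS_OFFSET, MAC_OVERRIDES, and
#   decode_tis_range are hypothetical.
#
```python
# Assumption: TIS 620-2533 codes sit at a fixed offset from the
# Unicode Thai block, so that 0xA1 corresponds to U+0E01.
TIS_OFFSET = 0x0E01 - 0xA1

# The three Mac OS Thai replacements listed above.
MAC_OVERRIDES = {
    0xEE: "\u2122",  # TRADE MARK SIGN
    0xFA: "\u00AE",  # REGISTERED SIGN
    0xFB: "\u00A9",  # COPYRIGHT SIGN
}


def decode_tis_range(code):
    """Decode a Mac OS Thai code in 0xA1-0xDA or 0xDF-0xFB."""
    if not (0xA1 <= code <= 0xDA or 0xDF <= code <= 0xFB):
        raise ValueError(f"0x{code:02X} is outside the TIS 620 ranges")
    if code in MAC_OVERRIDES:
        return MAC_OVERRIDES[code]
    return chr(code + TIS_OFFSET)
```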
#
#   Codes 0x80-0x82, 0x8D-0x8E, 0x91, 0x9D-0x9E, and 0xDB-0xDE are
#   various additional punctuation marks (e.g. curly quotes,
#   ellipsis), no-break space, and two special characters "word join"
#   and "word break".
#
#   Codes 0x83-0x8C, 0x8F, and 0x92-0x9C are for positional variants
#   of the upper vowels, tone marks, and other signs at 0xD1,
#   0xD4-0xD7, and 0xE7-0xED. The positional variants would normally
#   be considered presentation forms only and not characters. In most
#   cases they are not typed directly; they are selected automatically
#   at display time by the WorldScript software. However, using the
#   Thai-DTP keyboard, the presentation forms can in fact be typed
#   directly using dead keys. Thus they must be treated as real
#   characters in the Mac OS Thai encoding. They are mapped using
#   variant tags; see below.
#
#   Several code points are undefined and unused (they cannot be
#   typed using any of the Mac OS Thai keyboard layouts): 0x90, 0x9F,
#   0xFC-0xFE. These are not shown in the table below.
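#
#   A converter may want to detect these codes in its input. A small
#   Python sketch follows; the names UNDEFINED_CODES and
#   find_undefined are hypothetical, and what to do with a hit
#   (reject, skip, or substitute) is left to the caller.
#
```python
# The undefined code points named above: 0x90, 0x9F, 0xFC-0xFE.
UNDEFINED_CODES = frozenset({0x90, 0x9F, 0xFC, 0xFD, 0xFE})


def find_undefined(data):
    """Return the offsets in `data` that hold an undefined code."""
    return [i for i, byte in enumerate(data) if byte in UNDEFINED_CODES]


print(find_undefined(b"A\x90B\xfe"))  # [1, 3]
```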
#
# Unicode mapping issues and notes:
# ---------------------------------
#
#   The goals in the Apple mappings provided here are:
#   - Ensure roundtrip mapping from every character in the Mac OS Thai
#   character set to Unicode and back
#   - Use standard Unicode characters as much as possible, to maximize
#   interchangeability of the resulting Unicode text. Whenever possible,
#   avoid having content carried by private-use characters.
#
#   To satisfy both goals, we use private-use characters to mark variants
#   that are similar to a sequence of one or more standard Unicode
#   characters.
#
#   Apple has defined a block of 32 corporate characters as "transcoding
#   hints." These are used in combination with standard Unicode characters
#   to force them to be treated in a special way for mapping to other
#   encodings; they have no other effect. Sixteen of these transcoding
#   hints are "grouping hints" - they indicate that the next 2-4 Unicode
#   characters should be treated as a single entity for transcoding. The
#   other sixteen transcoding hints are "variant tags" - they are like
#   combining characters, and can follow a standard Unicode (or a sequence
#   consisting of a base character and other combining characters) to
#   cause it to be treated in a special way for transcoding. These always
#   terminate a combining-character sequence.
#
#   The transcoding hints used in this mapping table are three
#   variant tags in the range 0xF873-0xF875. Since these are combined with
#   standard Unicode characters, some characters in the Mac OS Thai
#   character set map to a sequence of two Unicodes instead of a single
#   Unicode character. For example, the Mac OS Thai character at 0x83 is a
#   low-left positional variant of THAI CHARACTER MAI EK (the standard
#   mapping is for the abstract character at 0xE8). So 0x83 is mapped to
#   0x0E48 (THAI CHARACTER MAI EK) + 0xF875 (a variant tag).
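#
#   The example above can be sketched in Python to show how a base
#   character plus a variant tag still round-trips. Only the two
#   entries from the example are included, and the names TO_UNICODE,
#   FROM_UNICODE, encode, and decode are hypothetical. The encoder
#   tries the longest sequence first, which is one possible strategy
#   and not necessarily what the Text Encoding Converter does.
#
```python
# Two entries taken from the example above: the abstract MAI EK at
# 0xE8 and its low-left positional variant at 0x83.
TO_UNICODE = {
    0xE8: "\u0E48",        # THAI CHARACTER MAI EK
    0x83: "\u0E48\uF875",  # MAI EK + variant tag
}
FROM_UNICODE = {text: code for code, text in TO_UNICODE.items()}
LONGEST = max(len(text) for text in FROM_UNICODE)


def decode(data):
    """Map each Mac OS Thai byte to its Unicode string."""
    return "".join(TO_UNICODE[byte] for byte in data)


def encode(text):
    """Encode text, preferring the longest matching sequence."""
    out = bytearray()
    i = 0
    while i < len(text):
        for size in range(LONGEST, 0, -1):
            chunk = text[i:i + size]
            code = FROM_UNICODE.get(chunk)
            if code is not None:
                out.append(code)
                i += len(chunk)
                break
        else:
            raise ValueError(f"no mapping for U+{ord(text[i]):04X}")
    return bytes(out)


# Both forms of MAI EK survive a trip through Unicode and back.
assert encode(decode(b"\x83\xe8")) == b"\x83\xe8"
```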
#
# Details of mapping changes in each version:
# -------------------------------------------
#
#   Changes from version n04 to version n07:
#
#   - Changed mappings of the positional variants to use standard
#   Unicodes + transcoding hint, instead of using single corporate
#   zone characters. This affected the mappings for the following:
#   0x83-0x8C, 0x8F, 0x92-0x9C
#
#   - Just comment out unused code points in the table, instead
#   of mapping them to U+FFFD.
#
##################

0x20	0x0020	# SPACE
0x21	0x0021	# EXCLAMATION MARK
0x22	0x0022	# QUOTATION MARK
0x23	0x0023	# NUMBER SIGN
0x24	0x0024	# DOLLAR SIGN
0x25	0x0025	# PERCENT SIGN
0x26	0x0026	# AMPERSAND
0x27	0x0027	# APOSTROPHE
0x28	0x0028	# LEFT PARENTHESIS
0x29	0x0029	# RIGHT PARENTHESIS
0x2A	0x002A	# ASTERISK
0x2B	0x002B	# PLUS SIGN
0x2C	0x002C	# COMMA
0x2D	0x002D	# HYPHEN-MINUS
0x2E	0x002E	# FULL STOP
0x2F	0x002F	# SOLIDUS
0x30	0x0030	# DIGIT ZERO
0x31	0x0031	# DIGIT ONE
0x32	0x0032	# DIGIT TWO
0x33	0x0033	# DIGIT THREE
0x34	0x0034	# DIGIT FOUR
0x35	0x0035	# DIGIT FIVE
0x36	0x0036	# DIGIT SIX
0x37	0x0037	# DIGIT SEVEN
0x38	0x0038	# DIGIT EIGHT
0x39	0x0039	# DIGIT NINE
0x3A	0x003A	# COLON
0x3B	0x003B	# SEMICOLON
0x3C	0x003C	# LESS-THAN SIGN
0x3D	0x003D	# EQUALS SIGN
0x3E	0x003E	# GREATER-THAN SIGN
0x3F	0x003F	# QUESTION MARK
0x40	0x0040	# COMMERCIAL AT
0x41	0x0041	# LATIN CAPITAL LETTER A
0x42	0x0042	# LATIN CAPITAL LETTER B
0x43	0x0043	# LATIN CAPITAL LETTER C
0x44	0x0044	# LATIN CAPITAL LETTER D
0x45	0x0045	# LATIN CAPITAL LETTER E
0x46	0x0046	# LATIN CAPITAL LETTER F
0x47	0x0047	# LATIN CAPITAL LETTER G
0x48	0x0048	# LATIN CAPITAL LETTER H
0x49	0x0049	# LATIN CAPITAL LETTER I
0x4A	0x004A	# LATIN CAPITAL LETTER J
0x4B	0x004B	# LATIN CAPITAL LETTER K
0x4C	0x004C	# LATIN CAPITAL LETTER L
0x4D	0x004D	# LATIN CAPITAL LETTER M
0x4E	0x004E	# LATIN CAPITAL LETTER N
0x4F	0x004F	# LATIN CAPITAL LETTER O
0x50	0x0050	# LATIN CAPITAL LETTER P
0x51	0x0051	# LATIN CAPITAL LETTER Q
0x52	0x0052	# LATIN CAPITAL LETTER R
0x53	0x0053	# LATIN CAPITAL LETTER S
0x54	0x0054	# LATIN CAPITAL LETTER T
0x55	0x0055	# LATIN CAPITAL LETTER U
0x56	0x0056	# LATIN CAPITAL LETTER V
0x57	0x0057	# LATIN CAPITAL LETTER W
0x58	0x0058	# LATIN CAPITAL LETTER X
0x59	0x0059	# LATIN CAPITAL LETTER Y
0x5A	0x005A	# LATIN CAPITAL LETTER Z
0x5B	0x005B	# LEFT SQUARE BRACKET
0x5C	0x005C	# REVERSE SOLIDUS
0x5D	0x005D	# RIGHT SQUARE BRACKET
0x5E	0x005E	# CIRCUMFLEX ACCENT
0x5F	0x005F	# LOW LINE
0x60	0x0060	# GRAVE ACCENT
0x61	0x0061	# LATIN SMALL LETTER A
0x62	0x0062	# LATIN SMALL LETTER B
0x63	0x0063	# LATIN SMALL LETTER C
0x64	0x0064	# LATIN SMALL LETTER D
0x65	0x0065	# LATIN SMALL LETTER E
0x66	0x0066	# LATIN SMALL LETTER F
0x67	0x0067	# LATIN SMALL LETTER G
0x68	0x0068	# LATIN SMALL LETTER H
0x69	0x0069	# LATIN SMALL LETTER I
0x6A	0x006A	# LATIN SMALL LETTER J
0x6B	0x006B	# LATIN SMALL LETTER K
0x6C	0x006C	# LATIN SMALL LETTER L
0x6D	0x006D	# LATIN SMALL LETTER M
0x6E	0x006E	# LATIN SMALL LETTER N
0x6F	0x006F	# LATIN SMALL LETTER O
0x70	0x0070	# LATIN SMALL LETTER P
0x71	0x0071	# LATIN SMALL LETTER Q
0x72	0x0072	# LATIN SMALL LETTER R
0x73	0x0073	# LATIN SMALL LETTER S
0x74	0x0074	# ...