##Adobe File Version: 1.000
#=======================================================================
#   FTP file name:  KOREAN.TXT
#
#   Contents:       Map (external version) from Mac OS Korean
#                   encoding to Unicode 2.1
#
#   Copyright:      (c) 1996-1999 by Apple Computer, Inc., all rights
#                   reserved.
#
#   Contact:        charsets@apple.com
#
#   Changes:
#
#       b02  1999-Sep-22    Update contact e-mail address. Matches
#                           internal utom<b1>, ufrm<b2>, and Text
#                           Encoding Converter version 1.5.
#       n04  1998-Feb-05    Update to match internal utom<n9>, ufrm<n11>
#                           and Text Encoding Converter version 1.3:
#                           Use single variant tags instead of multiple
#                           tags and add mappings for many more
#                           characters; see details below. Also delete
#                           the Unicode 1.1 mappings, reorder into a
#                           single list, and rewrite the initial
#                           comments.
#       n01  1996-Sep-24    Before internal ufrm.
#
# Standard header:
# ----------------
#
#   Apple, the Apple logo, and Macintosh are trademarks of Apple
#   Computer, Inc., registered in the United States and other countries.
#   Unicode is a trademark of Unicode Inc. For the sake of brevity,
#   throughout this document, "Macintosh" can be used to refer to
#   Macintosh computers and "Unicode" can be used to refer to the
#   Unicode standard.
#
#   Apple makes no warranty or representation, either express or
#   implied, with respect to these tables, their quality, accuracy, or
#   fitness for a particular purpose. In no event will Apple be liable
#   for direct, indirect, special, incidental, or consequential damages 
#   resulting from any defect or inaccuracy in this document or the
#   accompanying tables.
#
#   These mapping tables and character lists are subject to change.
#   The latest tables should be available from the following:
#
#   <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
#   <ftp://dev.apple.com/devworld/Technical_Documentation/Misc._Standards/>
#
#   For general information about Mac OS encodings and these mapping
#   tables, see the file "README.TXT".
#
# Format:
# -------
#
#   Three tab-separated columns;
#   '#' begins a comment which continues to the end of the line.
#     Column #1 is the Mac OS Korean code (in hex as 0xNN or 0xNNNN)
#     Column #2 is the corresponding Unicode or Unicode sequence (in
#       hex as 0xNNNN, 0xNNNN+0xNNNN, etc.). Sequences of up to 5
#       Unicode characters are used here.
#     Column #3 is a comment containing the Unicode name.
#       In some cases an additional comment follows the Unicode name.
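#
#   As an illustration of the format above, the following Python sketch
#   parses one table line. It is not part of Apple's tooling: the names
#   Mapping and parse_line are hypothetical, and the sample line is built
#   from the 0xA369 example given later in this header, with a made-up
#   comment field.
#
```python
from typing import NamedTuple, Optional, Tuple


class Mapping(NamedTuple):
    mac_code: int                 # column 1, e.g. 0x81 or 0xA1A9
    unicode_seq: Tuple[int, ...]  # column 2, one or more code points
    comment: str                  # column 3 without the leading '#'


def parse_line(line: str) -> Optional[Mapping]:
    """Parse one table line; return None for comment and blank lines."""
    line = line.rstrip("\r\n")
    if not line.strip() or line.lstrip().startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) < 2:
        raise ValueError(f"expected tab-separated columns: {line!r}")
    mac_code = int(fields[0], 16)  # int() accepts the 0x prefix in base 16
    unicode_seq = tuple(int(part, 16) for part in fields[1].strip().split("+"))
    comment = fields[2].lstrip("#").strip() if len(fields) > 2 else ""
    return Mapping(mac_code, unicode_seq, comment)


# Sample line (comment text is invented for this sketch).
sample = "0xA369\t0xF861+0x0028+0x0041+0x0029\t# parenthesized capital A"
entry = parse_line(sample)
```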
#
#   The entries are in Mac OS Korean code order. All one-byte
#   characters are at the beginning. The mappings are not complete
#   (about 140 characters are still unmapped); see below.
#
#   Some of these mappings require the use of corporate characters.
#   See the file "CORPCHAR.TXT" and notes below.
#
#   Control character mappings are not shown in this table, following
#   the conventions of the standard UTC mapping tables. However, the
#   Mac OS Korean encoding uses the standard control characters at
#   0x00-0x1F and 0x7F.
#
# Notes on Mac OS Korean:
# -----------------------
#
#   This table covers the standard Mac OS Korean encoding used in Mac OS
#   versions 7.1 and later, including the Korean Language Kit. The Mac OS
#   Korean encoding is based on EUC-KR, but it extends the low-byte range
#   and adds about 1140 characters using code points that are unassigned
#   in EUC-KR and code points using the extended low-byte range.
#
#   For Mac OS Korean, two-byte characters have first/lead/high byte in the
#   range 0xA1-0xFE, and second/trail/low byte in the range 0x41-0x7D or
#   0x81-0xFE (low bytes in the range 0x41-0x7D and 0x81-0xA0 are only used
#   with high bytes in the range 0xA1-0xAD).
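#
#   The byte ranges above can be sketched as a small Python check. This is
#   an illustration only; the function names are hypothetical, and a byte
#   outside the lead-byte range is simply treated as a one-byte code
#   without checking whether that code is assigned.
#
```python
from typing import List


def is_lead_byte(b: int) -> bool:
    """First byte of a two-byte character: 0xA1-0xFE."""
    return 0xA1 <= b <= 0xFE


def is_valid_trail(lead: int, trail: int) -> bool:
    """Check a trail byte against the ranges allowed for this lead byte."""
    if not is_lead_byte(lead):
        return False
    if 0xA1 <= trail <= 0xFE:          # standard EUC-KR trail range
        return True
    extended = 0x41 <= trail <= 0x7D or 0x81 <= trail <= 0xA0
    return extended and 0xA1 <= lead <= 0xAD  # extended range, Apple rows only


def split_codes(data: bytes) -> List[int]:
    """Split a byte string into one-byte and two-byte code values."""
    codes = []
    i = 0
    while i < len(data):
        b = data[i]
        if is_lead_byte(b):
            if i + 1 >= len(data) or not is_valid_trail(b, data[i + 1]):
                raise ValueError(f"bad two-byte sequence at offset {i}")
            codes.append((b << 8) | data[i + 1])
            i += 2
        else:
            codes.append(b)
            i += 1
    return codes
```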
#
# 1. Standard EUC-KR
#
#   This includes one-byte characters, which are usually the ASCII set. In
#   addition, it includes two-byte characters with both bytes in the range
#   0xA1-0xFE. The two-byte characters are from KSC 5601, but their code
#   points are transformed from the KSC 5601 range 0x2121-0x7E7E to the
#   EUC-KR range 0xA1A1-0xFEFE by adding 0x8080.
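#
#   The 0x8080 offset can be sketched in Python as follows. The function
#   name is hypothetical, and the sketch assumes the usual 94x94 KSC 5601
#   layout in which each byte lies in 0x21-0x7E before the offset is
#   added. Python's built-in "euc_kr" codec is used only as a cross-check
#   for one standard character.
#
```python
def ksc5601_to_euckr(code: int) -> int:
    """Convert a two-byte KSC 5601 code point to its EUC-KR form."""
    hi, lo = code >> 8, code & 0xFF
    if not (0x21 <= hi <= 0x7E and 0x21 <= lo <= 0x7E):
        raise ValueError(f"not a KSC 5601 code point: 0x{code:04X}")
    return code + 0x8080  # sets the high bit of both bytes


# KSC 5601 0x3021 is the first Hangul syllable (U+AC00).
euc = ksc5601_to_euckr(0x3021)
first_hangul = euc.to_bytes(2, "big").decode("euc_kr")
```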
#   
# 2. Mac OS Korean additions
#
#    a)  One-byte additions
#
#          0x80  NO-BREAK SPACE
#          0x81  WON SIGN
#          0x82  EN DASH alternate version; standard at 0xA1A9
#          0x83  COPYRIGHT SIGN
#          0x84  FULLWIDTH LOW LINE alternate version; standard at 0xA3DF
#          0xFF  HORIZONTAL ELLIPSIS alternate version; standard at 0xA1A6
#
#    b)  Two-byte additions
#
#        These include various symbols and dingbat-like number and letter
#        forms. For all of these, the high byte is in the range 0xA1-0xAD.
#        Most of them use code points in the extended low-byte range
#        0x41-0x7D or 0x81-0xA0, although some use unassigned code points
#        in the standard EUC-KR range.
#
#        Many of these additional characters do not correspond to any
#        standard single Unicode character. See mapping issues, below.
#     
# Unicode mapping issues and notes:
# ---------------------------------
#
# 1. Mapping the Apple two-byte additions
#
#    The goals in the mappings provided here are:
#    - Ensure roundtrip mapping from every character in the Mac OS Korean
#    encoding to Unicode and back
#    - Use standard Unicode characters as much as possible, to maximize
#    interchangeability of the resulting Unicode text. Whenever possible,
#    avoid having content carried by private-use characters.
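#
#    The roundtrip goal can be checked mechanically. The Python sketch
#    below builds forward and reverse tables and rejects any Unicode
#    sequence claimed by two Mac OS Korean codes. The function name is
#    hypothetical, and the tagged sequence for 0x82 is an assumed
#    illustration of an alternate form, not a value copied from the table.
#
```python
from typing import Dict, Iterable, Tuple

Seq = Tuple[int, ...]


def build_tables(pairs: Iterable[Tuple[int, Seq]]) -> Tuple[Dict[int, Seq], Dict[Seq, int]]:
    """Build Mac->Unicode and Unicode->Mac tables, enforcing roundtrip."""
    to_unicode: Dict[int, Seq] = {}
    from_unicode: Dict[Seq, int] = {}
    for mac_code, seq in pairs:
        if mac_code in to_unicode:
            raise ValueError(f"duplicate Mac OS Korean code 0x{mac_code:X}")
        if seq in from_unicode:
            raise ValueError(f"sequence for 0x{mac_code:X} is already in use")
        to_unicode[mac_code] = seq
        from_unicode[seq] = mac_code
    return to_unicode, from_unicode


# Two encodings of EN DASH stay distinct only if one carries a tag.
pairs = [
    (0xA1A9, (0x2013,)),         # standard EN DASH
    (0x82, (0x2013, 0xF87F)),    # alternate form, assumed tagged sequence
]
to_unicode, from_unicode = build_tables(pairs)
roundtrip_ok = all(from_unicode[seq] == code for code, seq in to_unicode.items())
```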
#
#    Since not all of the Mac OS Korean characters correspond to
#    distinct, single Unicode characters, we employ various strategies.
#
#    a)  Map a single Mac OS Korean character to a sequence of Unicode
#    characters
#
#    For example, the character 0xAA41 in the Apple additions is a
#    square Hangul dingbat. There is no single Unicode character for
#    this. However, it can be mapped to 0xC6B4+0x20DE, a Hangul syllable
#    + COMBINING ENCLOSING SQUARE.
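#
#    To make the one-to-many mapping concrete, this Python sketch turns the
#    sequence quoted above into a string and looks up the character names
#    with the standard unicodedata module. The helper name is hypothetical.
#
```python
import unicodedata


def sequence_to_str(field: str) -> str:
    """Convert a '+'-joined hex sequence such as '0xC6B4+0x20DE' to a str."""
    return "".join(chr(int(part, 16)) for part in field.split("+"))


# One Mac OS Korean character (0xAA41) becomes two Unicode characters.
square_hangul = sequence_to_str("0xC6B4+0x20DE")
names = [unicodedata.name(ch) for ch in square_hangul]
```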
#
#    b)  Use private use characters to mark variants or groupings that
#    are similar to a sequence of one or more standard Unicode
#    characters.
#
#    Apple has defined a block of 32 corporate characters as "transcoding
#    hints." These are used in combination with standard Unicode characters
#    to force them to be treated in a special way for mapping to other
#    encodings; they have no other effect. Sixteen of these transcoding
#    hints are "grouping hints" - they indicate that the next 2-4 Unicode
#    characters should be treated as a single entity for transcoding. The
#    other sixteen transcoding hints are "variant tags" - they are like
#    combining characters, and can follow a standard Unicode character (or
#    a sequence consisting of a base character and other combining
#    characters) to cause it to be treated in a special way for transcoding.
#    These always terminate a combining-character sequence.
#
#    The transcoding hints used in this mapping table are:
#
#    0xF860  group next 2 characters
#    0xF861  group next 3 characters
#    0xF862  group next 4 characters
#    0xF863  group next 4 characters, variant 1
#    0xF864  group next 4 characters, variant 2
#    0xF865  group next 4 characters, variant 3
#    0xF866  group next 4 characters, variant 4
#    0xF867  group next 2 characters, variant 1
#    0xF868  group next 2 characters, variant 2
#    0xF869  group next 2 characters, variant 3
#    0xF870-71  variant tags
#    0xF873-7D  variant tags
#    0xF87F  variant tag for other alternate forms
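#
#    The list above can be written as lookup data. This Python sketch is
#    illustrative only; the names are hypothetical, and it covers just the
#    hints listed for this table, so any other code point (including
#    hints not used here) is reported as "other".
#
```python
# Grouping hints used in this table: code point -> characters grouped.
GROUPING_HINTS = {
    0xF860: 2, 0xF861: 3, 0xF862: 4,
    0xF863: 4, 0xF864: 4, 0xF865: 4, 0xF866: 4,  # 4 characters, variants 1-4
    0xF867: 2, 0xF868: 2, 0xF869: 2,              # 2 characters, variants 1-3
}

# Variant tags used in this table: 0xF870-71, 0xF873-7D, and 0xF87F.
VARIANT_TAGS = set(range(0xF870, 0xF872)) | set(range(0xF873, 0xF87E)) | {0xF87F}


def classify(cp: int) -> str:
    """Return 'group', 'variant', or 'other' for a code point."""
    if cp in GROUPING_HINTS:
        return "group"
    if cp in VARIANT_TAGS:
        return "variant"
    return "other"
```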
#
#    For example, the Apple addition character 0xA369 is a parenthesized
#    capital A. There is no single Unicode character for this (although
#    there are single Unicode characters for parenthesized small letters).
#    Using the grouping hint 0xF861 in combination with standard Unicode
#    characters, we can map this as 0xF861+0x0028+0x0041+0x0029,
#    i.e. ( + A + ).
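#
#    Because the hints have no effect other than guiding transcoding, a
#    consumer that only wants readable text could drop them. The Python
#    sketch below does that for the 0xA369 example. The function name is
#    hypothetical, and the range 0xF860-0xF87F is an assumption about
#    where the block of 32 transcoding hints sits, inferred from the list
#    above.
#
```python
from typing import Iterable

HINT_FIRST, HINT_LAST = 0xF860, 0xF87F  # assumed extent of the hint block


def strip_hints(seq: Iterable[int]) -> str:
    """Drop transcoding hints and return the remaining characters."""
    return "".join(chr(cp) for cp in seq if not HINT_FIRST <= cp <= HINT_LAST)


mapped = (0xF861, 0x0028, 0x0041, 0x0029)  # mapping for 0xA369
plain = strip_hints(mapped)
```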
#
#    NOTE: About 140 of the Apple two-byte additions are still unmapped
#    (the mappings are being worked out). These are shown as comment lines
#    with the Mac OS Korean code point followed by a Unicode mapping of
#    0xNNNN (or by a probable Unicode mapping in a few cases).
#
# 2. Mapping the basic EUC-KR characters
#
#    The mappings for KSC 5601-1987 Hangul to Unicode 2.1 are based on
#    the KSC5601.TXT mapping table provided by the Unicode Consortium (UTC),
#    dated 24 July 1995, which was created by Lori Hoerth and K.D.Chang.
#
#    The mappings for KSC 5601-1987 non-Hangul characters are based on the
#    OLD5601.TXT mapping table provided by the Unicode Consortium (UTC),
#    dated 6 December 1993, which was created by Glenn Adams and John
#    Jenkins. That table is Copyright 1991-1994 by Unicode, Inc.
#
#    Some of the non-Hangul mappings were changed from the UTC mappings.
#    There were two reasons for this:
#    - To better match the meaning of the KSC 5601 character as described
#    in the KSC 5601 spec.
#    - If the UTC table mapped the KSC character to a "fullwidth" version
#    but there was no mapping to the "basic" version, then the mapping was
#    changed to the "basic" version. This is more consistent with the other
#    UTC mapping tables, which only map to a compatibility character (such
#    as a fullwidth version) to preserve roundtrip fidelity - i.e. when
#    there is another character in the source encoding that i...