HKSCS
The Hong Kong Supplementary Character Set (HKSCS) is a set of Chinese characters, 4,702 in total in the initial release, used in Cantonese, as well as when writing the names of some places in Hong Kong (whether in written Cantonese or standard written Chinese sentences). It evolved from the preceding Government Chinese Character Set (GCCS), a set of supplementary Chinese characters coded in the user-defined areas of the Big5 character set. GCCS was originally used within the Hong Kong Government and later by the public. It evolved into the Hong Kong Supplementary Character Set when the characters in the set were submitted to ISO 10646 for coding.

Development History

Due to the inherent differences between written Mandarin and written Cantonese, the Hong Kong Government recognized the need for a standardized set of proprietary characters that would streamline electronic communication. At the time, the Big5 Chinese encoding scheme lacked the vast majority of these characters (some were erroneously cross-listed with similar characters).

The Government Chinese Character Set (GCCS) was thus developed by the government. The character set consists of Chinese characters commonly used in Hong Kong. Some characters are Cantonese-specific, while others are alternative forms of existing characters. The set was not well organised, and its characters were not closely examined.

Subsequently, the HKSCS-1999 (HKSCS 1999 specification) was developed. Following its acceptance, newer revisions were released in 2001 (adding 116 new characters) and in 2004 (adding 123 new characters), totalling 4,941 characters.

The HKSCS is encoded in Big5 and ISO 10646. Starting from HKSCS-2004, all characters previously mapped to the Private Use Area (PUA) of Unicode are remapped to standard code points, with many of them reassigned to the Extension B block or the compatibility block of the Supplementary Ideographic Plane. However, to preserve compatibility with programs that generated PUA code points, the previously allocated PUA code points remain reserved, and no new characters will be mapped to the PUA.
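As a rough illustration of the remapping described above, the following Python sketch labels a code point by the Unicode region it falls in. The block boundaries come from the Unicode standard. The names `classify` and `needs_conversion` and the label strings are invented for this example and do not belong to any HKSCS tool. A PUA code point is only a hint that text may predate HKSCS-2004, since other software also uses the PUA.

```python
# Illustrative sketch only: classify() and needs_conversion() are
# hypothetical helper names, not part of any HKSCS utility.
# Ranges are inclusive Unicode block boundaries.
PUA = (0xE000, 0xF8FF)            # Private Use Area in the BMP
EXT_B = (0x20000, 0x2A6DF)        # CJK Unified Ideographs Extension B
COMPAT_SUPP = (0x2F800, 0x2FA1F)  # CJK Compatibility Ideographs Supplement


def classify(code_point):
    """Return a coarse label for the region a code point falls in."""
    if PUA[0] <= code_point <= PUA[1]:
        return "private use"
    if EXT_B[0] <= code_point <= EXT_B[1]:
        return "extension b"
    if COMPAT_SUPP[0] <= code_point <= COMPAT_SUPP[1]:
        return "compatibility supplement"
    if code_point <= 0xFFFF:
        return "other bmp"
    return "other supplementary"


def needs_conversion(text):
    """True if the text contains any PUA code point (possible legacy data)."""
    return any(classify(ord(ch)) == "private use" for ch in text)
```

For example, `classify(0x20000)` returns "extension b", and `needs_conversion("abc\uE000")` returns True because the last character lies in the PUA.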

Version history

Version                   Total characters   Publish date
GCCS                      3,049              1995-?
HKSCS (now HKSCS-1999)    4,702              1999-9
HKSCS-2001                4,818              2001-12
HKSCS-2004                4,941              2005-5
HKSCS-2008                5,009              2009-12


Microsoft Windows

In Microsoft Windows 98, NT 4.0, 2000 and XP, HKSCS support can be enabled using Microsoft's patch. In Microsoft's implementation, applications using code page 950 automatically use a hidden code page 951 table for the Big5 encoding of the HKSCS extensions. The table supports all code points in HKSCS-2001, except for the compatibility code points specified by the standard. In addition, the patch alters the MingLiU font. The patch is known to create conflicts in applications such as Microsoft Office, or any application using fonts that support simplified Chinese characters (e.g. SimSun). If the target environment contains custom fonts mapped to the code points affected by Microsoft's patch, the custom fonts can undo the patch. Furthermore, the patch breaks the EUDC Editor supplied with the affected versions of Windows.

Starting with Windows Vista, HKSCS-2004 characters are supported only as Unicode 4.1 or later. All characters are assigned standard, non-PUA code points. The characters are displayed with the MingLiU font and can be entered via the keyboard. The patch that provides Big5 encoding of HKSCS is unsupported in Windows Vista and later. A utility provided by Microsoft is available to convert HKSCS and Unicode PUA-encoded characters to their Unicode 4.1 equivalents.

In 2010, Microsoft published an HKSCS-2004 patch for Windows XP and Windows Server 2003. It replaces the Windows XP versions of MingLiU, PMingLiU, and MingLiU_HKSCS (if the HKSCS-2001 patch was applied) with the Windows 7 versions of those fonts. In addition, the MingLiU-ExtB, MingLiU_HKSCS-ExtB and PMingLiU-ExtB fonts are added to the target system. However, the IME is not updated as it was by the HKSCS-2001 patch, and the fonts are from a pre-release of Windows 7.

For earlier versions of the OS, HKSCS support requires the use of Microsoft's patch or the utilities from the Hong Kong government's Digital 21.

Linux

HKSCS support was added to glibc in 2000, but it has not been updated since then. HKSCS-2004 support is handled as Unicode 4.1 and later.

For freedesktop.org setups, the AR PL ShanHeiSun Uni font has fully supported HKSCS-2004 since version 0.1-0.dot.1, with the latest revision of HKSCS-2004 supported in version 0.1.20060903-1.

Mac OS

Mac OS X 10.0-10.2 supports HKSCS-1999, and 10.3-10.4 supports HKSCS-2001. Some of the characters added in HKSCS-2004 are supported via the Unicode PUA in OS X 10.4. Starting with OS X 10.5, all HKSCS-2004 characters are supported via standard Unicode 4.1 code points.

Applications

Mozilla 1.5 and above support HKSCS, with HKSCS-2004 support added to the Gecko 1.8.1 code base. Unlike the Microsoft patch described above, Mozilla uses its own code page table. However, the fix for bug 343129 does not support characters mapped to code points above the Basic Multilingual Plane.

Qt 3.x-based applications (e.g. KDE) only support characters mapped to code points U+FFFF or lower. In Qt 4, characters outside the BMP are supported via surrogates. The Big5-HKSCS text codec supported HKSCS-1999 as early as Qt 2.3.x, but it came too late in the development schedule to be officially included in the Qt 2.3.x series, so official support began with Qt 3.0.1. HKSCS-2001 support was added in Qt 3.0.5.
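The surrogate mechanism mentioned above is standard UTF-16 arithmetic, and a small Python sketch may make it concrete. It shows how a code point above U+FFFF splits into a high and a low 16-bit unit. The function names `to_surrogates` and `from_surrogates` are chosen for this example only and are not Qt APIs.

```python
# Illustrative sketch of UTF-16 surrogate arithmetic; the function names
# are hypothetical and unrelated to Qt's own implementation.

def to_surrogates(code_point):
    """Split a supplementary-plane code point into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above the BMP use surrogate pairs")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def from_surrogates(high, low):
    """Recombine a (high, low) surrogate pair into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

For U+20000, the first code point of the Supplementary Ideographic Plane, `to_surrogates(0x20000)` gives `(0xD840, 0xDC00)`. Software that handles only 16-bit units without this pairing step cannot represent such characters, which matches the limitation described for Qt 3.x.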

GNOME supports HKSCS characters in Unicode ranges, except those mapped to the Basic Multilingual Plane compatibility block. Patches to support characters mapped above the Basic Multilingual Plane were introduced during Pango 1.1.

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 