![]() |
|
|
| Home | Research | CRL Staff | Publications | Resources | Employment | CRL Internal | ||
| Chinese | Resources |
The Chinese language is spoken by the major ethnic groups of China including both the
People's Republic of China and Taiwan. Significant Chinese-speaking populations are also found in Singapore,
Indonesia, Malaysia, and Thailand. Chinese language has features that distinguishes it from most Western
languages: it is monosyllabic, has little inflection, and is tonal, i.e. a tone is assigned to each word in order to
indicate differences in meaning between words similar in sound. Words are composed of single or multiple
characters. Although around 56,000 characters have been accumulated in Chinese, only a few thousand are needed to
write Modern Chinese.
Spoken Chinese comprises many regional dialect groups. Each dialect group consists of a large number of dialects, Most Chinese speak one of the dialects. Mandarin is an official language in both Mainland China and Taiwan. Regardless of the variants in spoken-Chinese, written Chinese uses the same script with two variants: traditional and simplified writing systems. The simplified writing system differs in two ways from the traditional writing system: (1) a reduction of the number of strokes per character; (2) the reduction of the number of characters in common use. In mainland China a simplified writing system is used, whereas in Hong Kong, Taiwan, and Southeast Asia overseas regions the traditional script is being used. For further information about Chinese, please visit the Wikipedia. |
Dictionaries: Chinese-English dictionary contains 44299 entries including proper names. Chinese Segmenter: NMSU Chinese segmenter is a Chinese language tool recognizing words in Chinese corpus. it also can segment Chinese corpus mixed with English words. Over the years the system has been used by many research institute and companies nation-wide with good comments. input text: 李坚持又把同厂的陈小冬、范思学和周文林介绍进来,形成一个四五个人的小 圈子。这其中,还有范思学的妹妹范婷,有一段时间她把东北知青的一些诗带 给大家看(当时她在东北下乡)。其中一首写嫩江平原风雪的诗给大家留下深 刻印象。李坚持也带来一些手抄本的小说、诗歌。后来,雷明、陈乐平等人也 出入于这个圈子。 The output text: 李坚持┆又┆把┆同┆厂┆的┆陈小冬┆、┆范思学┆和┆周文林┆介绍┆ 进来┆,┆形成┆一┆个┆四五┆个┆人┆的┆小┆圈子┆。┆这┆其中┆, ┆还┆有┆范思学┆的┆妹妹┆范婷┆,┆有┆一┆段┆时间┆她┆把┆东北┆ 知青┆的┆一┆些┆诗┆带┆给┆大家┆看┆(┆当时┆她┆在┆东北┆下乡┆ )┆。┆其中┆一┆首┆写┆嫩江┆平原┆风雪┆的┆诗┆ 给┆大家┆留下┆深刻┆印象┆。┆李坚持┆也┆带来┆一┆些┆手抄本┆的┆ 小说┆、┆诗歌┆。┆后来┆,┆雷明┆、┆陈乐平┆等┆人┆也┆出入┆于┆ 这┆个┆圈子┆。
Chinese Segmenter for Linux
Chinese Segmenter for Solaris
Chinese Segmenter Source |