Preserving the literary past, looking to the future: the first Hong Kong Literature Database

  • Journal: Journal of Zhejiang University A (English edition)
  • Authors: MA Leo F.H., WONG Rita, LAU Paul
  • Affiliation: University Library System
  • Updated: 2022-04-06
Abstract

In the last two decades of the 20th century, there was increasing interest in, and emphasis on, the study of Hong Kong literature among both academics and the general public in Hong Kong. Recognizing the emerging need for resources on Hong Kong literature, the University Library System of the Chinese University of Hong Kong set up the Hong Kong Literature Database (the "Database") in 2000; it was the first Chinese literature database on the Internet. This paper examines how the Database is constructed using XML technology and a metadata schema. The Database also employs Unicode UTF-8 as its internal encoding. A mapping table for traditional and simplified Chinese characters was created based on Unihan and is used behind the scenes, so that a user can input either traditional or simplified Chinese characters and retrieval will return results in both. Currently, 65% of the journals have been processed with OCR technology so that full-text searching is possible. The Chinese OCR technology is examined in greater detail. Special features of the Database, such as the page-by-page browse mode, position highlighting for full-page newspapers, and links to tables of contents and book jackets from the Library catalogue, are described. The paper also raises the problem of massive downloading and compares state-of-the-art technologies for dealing with it, along with their shortcomings. Finally, the paper shows how the Hong Kong Literature Database facilitates future collaboration and data exchange by using open standards, a shareable structure and the latest technology.
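
The traditional/simplified retrieval described in the abstract can be illustrated with a minimal sketch. This is not the Database's actual implementation: the three-entry `TRAD_TO_SIMP` table and the helper names `build_variant_groups` and `expand_query` are hypothetical, chosen only for illustration. A real table would presumably be derived from the Unihan variant data (for example the `kSimplifiedVariant` and `kTraditionalVariant` fields). The idea is to expand a query into every traditional/simplified spelling before searching, so that either input form matches both.

```python
from itertools import product

# Hypothetical, tiny mapping table for illustration only; a production table
# would be generated from the Unihan variant data rather than typed by hand.
TRAD_TO_SIMP = {
    "學": "学",
    "書": "书",
    "館": "馆",
}


def build_variant_groups(trad_to_simp):
    """Map every character to the set of its variants (itself included)."""
    groups = {}
    for trad, simp in trad_to_simp.items():
        # Merge any groups already known for either character.
        group = groups.get(trad, set()) | groups.get(simp, set()) | {trad, simp}
        for ch in group:
            groups[ch] = group
    return groups


def expand_query(query, groups):
    """Return all traditional/simplified spellings of the query, sorted."""
    # Characters without a known variant simply map to themselves.
    choices = [sorted(groups.get(ch, {ch})) for ch in query]
    return sorted("".join(chars) for chars in product(*choices))


GROUPS = build_variant_groups(TRAD_TO_SIMP)
```

With this sketch, `expand_query("文学", GROUPS)` and `expand_query("文學", GROUPS)` yield the same two spellings, so a search over the expanded terms finds documents written in either script. Expanding the query rather than rewriting the stored text is only one possible design; normalizing both the index and the query to a single form would be an alternative, and the abstract does not say which approach the Database takes.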
