Wiki saga: an approach for the digitisation, processing and visualisation of historical documents

Tan, Daniel Yong Wen (2015) Wiki saga: an approach for the digitisation, processing and visualisation of historical documents. Masters thesis, Universiti Malaysia Sarawak (UNIMAS).

PDF (Please request the password by email to repository@unimas.my, or call ext: 082-583914/3973/3933)
Daniel.pdf
Restricted to Registered users only

Download (3MB)

Abstract

A historical document contains information about past events and can serve as a source of reference. The historical document selected for this research is the Sarawak Gazette, a monthly newspaper that reported on events in Sarawak. With one hundred and forty-four years of reports since its first publication on Friday, August 26, 1870, the Sarawak Gazette is one of the most important historical documents for information on the history of Sarawak. Gleaning information by laboriously going through printed pages is arduous in terms of time and effort. This research focuses on enabling a semantic search on the Sarawak Gazette, as a case study, for visualising a summary of what actually happened in Sarawak during a certain period. It proposes a pipeline that involves digitising the Sarawak Gazette, a natural language processing step that extracts named entities, and a timeline generator that displays events as reported. Due to the difficulty of the task, the current state-of-the-art approach makes use of human power as part of a mass digitisation project by Google. A prototype system, Wiki Saga, visualises the digitised documents in conjunction with the generated timeline. Through Wiki Saga, researchers who use the Sarawak Gazette can search for specific information on an event that happened in Sarawak during a certain timeframe by using the timeline display. Because named entities are extracted and displayed within events in a timeline, researchers can obtain a summary of each event. Visualising events in a timeline allows semantic patterns to be recognised and related events to be identified. Through this research, Wiki Saga, a new archival and retrieval system, has been produced. In the process, a semi-automated approach for digitising all the documents has also been made available to researchers.
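
The last two stages of the pipeline described in the abstract (named entity extraction followed by timeline generation) can be illustrated with a minimal sketch. This is not the thesis's implementation: the sample texts are invented, the function names extract_entities and build_timeline are hypothetical, and the capitalised-word heuristic is only a naive stand-in for a real named entity recogniser.

```python
import re
from collections import defaultdict
from datetime import date

# Invented sample records (issue date, digitised text). Only the first
# publication date, 26 August 1870, comes from the abstract.
ARTICLES = [
    (date(1870, 8, 26), "The Rajah arrived at Kuching from Singapore."),
    (date(1871, 1, 24), "A new court house was opened at Kuching."),
    (date(1870, 8, 26), "Trade with Singapore increased this month."),
]


def extract_entities(text):
    """Naive stand-in for named entity recognition (an assumption).

    Collects runs of capitalised words that do not start a sentence.
    """
    entities = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        run = []
        for i, word in enumerate(sentence.split()):
            token = word.strip(".,;:!?")
            if i > 0 and token[:1].isupper():
                run.append(token)
                if token == word:
                    continue  # no trailing punctuation, the run may go on
            if run:
                entities.append(" ".join(run))
                run = []
        if run:
            entities.append(" ".join(run))
    return entities


def build_timeline(articles):
    """Group extracted entities by issue date, in chronological order."""
    by_date = defaultdict(set)
    for issue_date, text in articles:
        by_date[issue_date].update(extract_entities(text))
    return [(d, sorted(by_date[d])) for d in sorted(by_date)]


if __name__ == "__main__":
    for issue_date, entities in build_timeline(ARTICLES):
        print(issue_date.isoformat(), ", ".join(entities))
```

Grouping by issue date keeps the sketch close to the idea of displaying entities within dated events. A real system would need proper event detection and a trained recogniser, neither of which is attempted here.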

Item Type: Thesis (Masters)
Additional Information: Thesis (M.Sc.) -- Universiti Malaysia Sarawak, 2015.
Uncontrolled Keywords: Wiki Saga, Visualisation of Historical Documents, Semantics, unimas, university, universiti, Borneo, Malaysia, Sarawak, Kuching, Samarahan, ipta, education, Postgraduate, research, Universiti Malaysia Sarawak
Subjects: T Technology > T Technology (General)
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Depositing User: Karen Kornalius
Date Deposited: 03 Mar 2016 04:27
Last Modified: 24 Aug 2023 02:12
URI: http://ir.unimas.my/id/eprint/10769
