Saga: Encoding The Structures Of Historical Documents Using Tei-Xml Schema

Muhammad Adib Fikri, Johari (2023) Saga: Encoding The Structures Of Historical Documents Using Tei-Xml Schema. [Final Year Project Report] (Unpublished)

Abstract

The Sarawak Gazette is the oldest newspaper published in Sarawak. It contains rich historical information and is an essential source on Sarawak affairs, particularly from 1870 to 1941. A previous project, "e-Sarawak Gazette", created a website as an initiative to preserve the Sarawak Gazette collection digitally. However, the current digital version of the collection is stored as image-based PDF files, which limits exploration of the content of these historical documents. This project therefore aims to retrieve information from the Sarawak Gazette collection and to annotate the data so that the meaningful context contained in the documents can be discovered. The project also uses the output of digitization and data annotation to create TEI XML documents for the Sarawak Gazette, structuring the XML data so that an index can be constructed and search functionality implemented, contributing to the development of a dynamic web portal for the Sarawak Gazette. The documents are structured in compliance with the XML standard and the Text Encoding Initiative's P5 recommendations. This effort aims to achieve explicit, semantic markup of historical information, which is intended to provide several benefits, including the ability to validate the structure of the Sarawak Gazette documents and to enable advanced data processing, such as index construction and integrated search features.
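
As an illustration of the approach the abstract describes, the following is a minimal sketch of how a TEI-encoded issue might be parsed and indexed for search. It is not the project's actual schema or code: the choice of elements (`div type="article"`, `placeName`, `date`), the sample text, and the function names `build_index` and `search` are all assumptions made for this example. It uses only the Python standard library and does not validate the document against the TEI P5 schema.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# TEI P5 namespace and the attribute key ElementTree uses for xml:id.
NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: titles, dates, and article text are invented
# placeholders, not content taken from the Sarawak Gazette.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample gazette issue (placeholder)</title></titleStmt>
      <publicationStmt><p>Illustrative example only.</p></publicationStmt>
      <sourceDesc><p>Invented text for demonstration.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="article" xml:id="a1">
        <head>Shipping notice</head>
        <p>The steamer arrived at <placeName>Kuching</placeName> on
           <date when="1900-01-01">1 January 1900</date>.</p>
      </div>
      <div type="article" xml:id="a2">
        <head>Market report</head>
        <p>Pepper prices rose at <placeName>Sibu</placeName> and
           <placeName>Kuching</placeName>.</p>
      </div>
    </body>
  </text>
</TEI>
"""


def build_index(tei_source):
    """Return (word_index, place_index), each mapping a key to article ids."""
    root = ET.fromstring(tei_source)
    word_index = defaultdict(set)
    place_index = defaultdict(set)
    for div in root.iterfind(".//tei:div[@type='article']", NS):
        article_id = div.get(XML_ID)
        # Full-text index: every lowercase word in the article, tags ignored.
        text = "".join(div.itertext())
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            word_index[word].add(article_id)
        # Semantic index: only text explicitly marked up as a place name.
        for place in div.iterfind(".//tei:placeName", NS):
            name = "".join(place.itertext()).strip()
            place_index[name].add(article_id)
    return word_index, place_index


def search(word_index, query):
    """Return sorted ids of articles that contain every word in the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return []
    hits = set.intersection(*(word_index.get(t, set()) for t in terms))
    return sorted(hits)


word_index, place_index = build_index(SAMPLE_TEI)
print(search(word_index, "pepper kuching"))  # ['a2']
print(sorted(place_index["Kuching"]))        # ['a1', 'a2']
```

The sketch keeps the two indexes separate to show what semantic markup adds over plain text extraction: the word index supports keyword search, while the place index relies on the explicit `placeName` annotation and would not be possible with an image-based PDF.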

Item Type: Final Year Project Report
Additional Information: Project report (B.Sc.) -- Universiti Malaysia Sarawak, 2023.
Uncontrolled Keywords: historical information, Sarawak Gazette collection, digitization
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Faculties, Institutes, Centres > Faculty of Computer Science and Information Technology
Depositing User: Patrick
Date Deposited: 11 Jan 2024 09:38
Last Modified: 11 Jan 2024 09:38
URI: http://ir.unimas.my/id/eprint/44087
