LSI-based semantic characterisation for automated text categorisation

Tan, Ping Ping (2009) LSI-based semantic characterisation for automated text categorisation. Masters thesis, Universiti Malaysia Sarawak.

PDF (Please get the password from Digital Collection Development Unit, ext: 3932 / 3914)
LSI-based semantic characterization for automated text categorization (fulltext).pdf
Restricted to Registered users only

Download (17MB)

Abstract

As knowledge acquisition remains a bottleneck, incorporating human judgement within intelligent systems is still a challenge. Supervised learning methods have been shown to assist humans in automated text categorization (ATC). However, the performance of such systems depends largely on the characteristics of the datasets. Without an understanding of why a classifier works well for certain datasets, it is difficult to generalise its application across domains. Furthermore, most training sets used in supervised ATC have category labels provided by human experts. The expert knowledge used in categorization is often not captured by the mere process of manipulating category labels. This results in a loss of intended meaning when performing supervised ATC. In addition, large text datasets often contain a great deal of noise.

Item Type: E-Thesis (Masters)
Uncontrolled Keywords: LSI-based semantic characterization, automated text categorization, Universiti Malaysia Sarawak, UNIMAS, research, postgraduate, IPTA, education, sarawak, kuching, malaysia, samarahan, universiti, university
Subjects: Q Science > QA Mathematics > QA76 Computer software
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Depositing User: Karen Kornalius
Date Deposited: 04 Dec 2013 07:44
Last Modified: 03 Mar 2020 06:04
URI: http://ir.unimas.my/id/eprint/167
