Evaluation of Visual Network Algorithms on Historical Documents

Khairunnisa, Binti Ibrahim (2020) Evaluation of Visual Network Algorithms on Historical Documents. Masters thesis, Universiti Malaysia Sarawak (UNIMAS).

Abstract

A visual network is a special type of graph that represents real-life systems: its vertices carry attributes and its edges represent the relationships between them. Network visualisation facilitates the comprehension of texts, especially historical documents, in which important events, facts and relationships are recorded. This study proposes a generic framework for evaluating visual network algorithms in order to find the best network representation of a document. The framework recommends evaluating both the graph layout algorithm and the clustering algorithm in order to produce a good network. The framework was used to evaluate three graph layout algorithms and three graph clustering algorithms on the historical SAGA dataset. The evaluation found that the FA2 algorithm, when combined with the MC algorithm, produces the best network representation for SAGA. The evaluation also demonstrates that the scores given by evaluation metrics can disagree with one another, as each metric is based on a different view of what indicates a good cluster. The proposed framework was also applied to the Biotext and dBPedia datasets, and the findings imply that the performance of an algorithm, be it a layout or a clustering algorithm, depends on the structure of the document itself. Therefore, for a new document, evaluation of the algorithms is unavoidable. The study also proposes a simple but reliable cluster evaluation metric called the NPL-C metric. The metric rates both the internal and external structure of clusters in a given network by using the concepts of average path length and conductance.
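
The abstract names average path length and conductance as the two concepts behind the NPL-C metric but does not give its formula. As an illustration only, the sketch below computes those two ingredients for one cluster of a small toy graph. It assumes the common textbook definitions (conductance as cut edges divided by the smaller volume, and average path length as the mean shortest-path length between vertices of the cluster), which may differ from the definitions used in the thesis. The function names and the toy graph are hypothetical, and the sketch does not attempt to combine the two values into NPL-C.

```python
from collections import deque


def conductance(adj, cluster):
    """Cut edges divided by the smaller of the two volumes (sums of degrees).

    Assumed textbook definition; lower values mean the cluster is better
    separated from the rest of the network (external structure).
    """
    cluster = set(cluster)
    cut = sum(1 for u in cluster for v in adj[u] if v not in cluster)
    vol_in = sum(len(adj[u]) for u in cluster)
    vol_out = sum(len(adj[u]) for u in adj if u not in cluster)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


def average_path_length(adj, cluster):
    """Mean shortest-path length over reachable vertex pairs in the cluster.

    Only edges with both endpoints inside the cluster are followed, so the
    value describes the internal structure of the cluster.
    """
    cluster = set(cluster)
    total, pairs = 0, 0
    for source in cluster:
        # Breadth-first search restricted to the cluster.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in cluster and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical toy graph: two triangles joined by the single edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"), ("d", "e"), ("e", "f"), ("d", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

cluster = {"a", "b", "c"}
print("conductance:", conductance(adj, cluster))
print("average path length:", average_path_length(adj, cluster))
```

Under these assumed definitions, a tight and well-separated cluster has both a low conductance and a short average path length, which is the kind of internal and external structure the abstract says the metric rates.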

Item Type: Thesis (Masters)
Additional Information: Thesis (MSc.) - Universiti Malaysia Sarawak, 2020.
Uncontrolled Keywords: Generic evaluation framework, reliable cluster evaluation metric, network visualisation, historical network, graph algorithm evaluation, unimas, university, universiti, Borneo, Malaysia, Sarawak, Kuching, Samarahan, ipta, education, Postgraduate, research, Universiti Malaysia Sarawak.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Depositing User: KHAIRUNNISA BINTI IBRAHIM
Date Deposited: 17 Jun 2020 05:01
Last Modified: 19 Nov 2020 08:20
URI: http://ir.unimas.my/id/eprint/29965
