Classifying good and bad websites

Koo, Ee Woon (2015) Classifying good and bad websites. [Final Year Project Report] (Unpublished)

Abstract

Website classification has become an important problem as websites are increasingly used as platforms for various applications. Web pages often contain semi-structured content, which makes classification challenging. This paper applies machine learning techniques to classify good and bad websites. The classification process is simplified by using a set of features generated from HTML code. The performance of the 21 features was evaluated using three machine learning techniques: support vector machine (SVM), naive Bayes, and nearest neighbor classifiers. Good and bad websites were distinguished by features obtained by counting HTML tags. A total of 200 websites were collected for the machine learning task. The results indicate that the features are useful for classification, with an average accuracy of 80.50% for the SVM classifier, 77.00% for the naive Bayes classifier, and 72.50% for the nearest neighbor classifier; the SVM classifier thus achieved the highest accuracy of the three. This project illustrates that it is possible to classify websites as good or bad using their underlying tags together with machine learning algorithms.
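The tag-counting idea described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the tag list `TAGS` is hypothetical (the abstract does not name the 21 features), the helper names `TagCounter`, `tag_features`, and `nearest_neighbor` are invented for illustration, a plain 1-nearest-neighbor rule with Euclidean distance stands in for the nearest neighbor classifier, and the two training pages and their labels are made-up toy data.

```python
from collections import Counter
from html.parser import HTMLParser
import math

# Hypothetical tag list; the abstract does not say which 21 features were used.
TAGS = ["a", "img", "script", "iframe", "form", "p"]


class TagCounter(HTMLParser):
    """Counts the opening tags seen while parsing an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        self.counts[tag] += 1


def tag_features(html):
    """Return a vector of tag counts in the order given by TAGS."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return [parser.counts[t] for t in TAGS]


def nearest_neighbor(train, query):
    """Return the label of the training vector closest to query (1-NN)."""
    _, label = min(train, key=lambda item: math.dist(item[0], query))
    return label


# Toy training set: the pages and the "good"/"bad" labels are invented.
train = [
    (tag_features("<p>Hello</p><a href='#'>x</a><img src='a.png'>"), "good"),
    (tag_features("<script></script><script></script><iframe></iframe>"), "bad"),
]

query = tag_features("<script>x()</script><iframe></iframe>")
print(nearest_neighbor(train, query))
```

In practice the count vectors would be computed for every collected page and passed to a library classifier; the sketch only shows how raw HTML becomes a numeric feature vector that a distance-based classifier can compare.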

Item Type: Final Year Project Report
Additional Information: Project Report (B.Sc.) -- Universiti Malaysia Sarawak, 2015.
Uncontrolled Keywords: web classification, machine learning techniques, HTML tags, SVM, Gaussian naive Bayes, k-nearest neighbor, supervised machine learning approach, World Wide Web, unimas, university, universiti, Borneo, Malaysia, Sarawak, Kuching, Samarahan, ipta, education, undergraduate, research, Universiti Malaysia Sarawak
Subjects: T Technology > T Technology (General)
Divisions: Academic Faculties, Institutes and Centres > Faculty of Cognitive Sciences and Human Development
Depositing User: Karen Kornalius
Date Deposited: 20 May 2016 07:18
Last Modified: 08 Aug 2023 03:42
URI: http://ir.unimas.my/id/eprint/12117
