A graph-theoretic approach for the detection of phishing webpages

Tan, Choon Lin and Chiew, Kang Leng and Yong, Kelvin S.C. and Sze, San Nah and Abdullah, Johari and Sebastian, Yakub (2020) A graph-theoretic approach for the detection of phishing webpages. Computers & Security, 95. p. 101793. ISSN 0167-4048

Full text not available from this repository.
Official URL: https://www.sciencedirect.com/science/article/pii/...

Abstract

Over the years, various technical means have been developed to protect Internet users from phishing attacks. To strengthen these anti-phishing efforts, we draw on concepts from graph theory and propose a set of novel graph features to improve phishing detection accuracy. The initial phase of the proposed technique involves extracting the hyperlinks in the webpage under scrutiny and fetching the corresponding neighbourhood webpages. During this process, the page linking data are collected and used to construct a web graph that models the overall hyperlink and network structure of the webpage. From the web graph, graph measures are computed and extracted as graph features to derive a classifier for detecting phishing webpages. Experimental results show that the proposed graph features achieve an improved overall accuracy of 97.8% when C4.5 is used as the classifier, outperforming existing conventional features derived from the same data samples. Unlike conventional features, the proposed graph features leverage inherent phishing patterns that are only visible at a higher level of abstraction, making them robust and difficult to evade through direct manipulation of the webpage contents. The proposed graph-based technique also shows promising results when benchmarked against a prominent phishing detection technique. Hence, the proposed technique is an important contribution to existing anti-phishing research aimed at improving detection performance.
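
The pipeline described in the abstract (extract hyperlinks, fetch neighbourhood pages, build a web graph, compute graph measures) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not list the actual graph features, so the measures below (node count, edge count, density, and the root page's in-degree and out-degree) are illustrative assumptions. The `PAGES` dictionary, the `example.test` and `other.test` URLs, and all function names are hypothetical, and an in-memory lookup stands in for real HTTP fetching.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collect href targets of <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any #fragment part.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.append(absolute)


def extract_links(url, html):
    """Return the set of absolute link targets on a page, excluding self-links."""
    parser = LinkExtractor(url)
    parser.feed(html)
    return set(parser.links) - {url}


def build_web_graph(root_url, fetch):
    """Build {url: set(targets)} for the root page and its direct neighbours.

    `fetch` is a hypothetical callable mapping a URL to an HTML string,
    or to None when the page is unavailable.
    """
    graph = {root_url: extract_links(root_url, fetch(root_url) or "")}
    for neighbour in sorted(graph[root_url]):
        html = fetch(neighbour)
        graph[neighbour] = extract_links(neighbour, html) if html else set()
    # Ensure every link target also appears as a node.
    for targets in list(graph.values()):
        for target in targets:
            graph.setdefault(target, set())
    return graph


def graph_features(graph, root_url):
    """Compute a few simple graph measures (illustrative, not the paper's set)."""
    nodes = len(graph)
    edges = sum(len(targets) for targets in graph.values())
    return {
        "nodes": nodes,
        "edges": edges,
        "density": edges / (nodes * (nodes - 1)) if nodes > 1 else 0.0,
        "root_out_degree": len(graph[root_url]),
        "root_in_degree": sum(1 for t in graph.values() if root_url in t),
    }


# Hypothetical in-memory "web" used in place of real HTTP requests.
PAGES = {
    "http://example.test/": '<a href="/about">About</a>'
                            '<a href="http://other.test/">Other</a>',
    "http://example.test/about": '<a href="/">Home</a>',
    "http://other.test/": '<a href="http://other.test/x">X</a>',
}

ROOT = "http://example.test/"
features = graph_features(build_web_graph(ROOT, PAGES.get), ROOT)
print(features)
```

In a setup like the one the abstract describes, a feature dictionary of this kind would be computed per webpage and passed to a classifier such as a C4.5-style decision tree; that training step is omitted here.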

Item Type: Article
Additional Information: Information, Communication and Creative Technology
Uncontrolled Keywords: Phishing detection, Hyperlinks, Web graph, Graph features, Machine learning
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Depositing User: COLIN TAN CHOON LIN
Date Deposited: 18 Aug 2020 06:32
Last Modified: 19 Jan 2021 04:00
URI: http://ir.unimas.my/id/eprint/31278
