Improving Support Vector Machine Performance using Modified Similarity Distance Plotting-Data Reduction

Abdul Muqtasid, Rushdi and Mohammad, Hossin and Norita, Norwawi (2025) Improving Support Vector Machine Performance using Modified Similarity Distance Plotting-Data Reduction. International Journal of Computing and Digital Systems, 18 (1). pp. 1-12. ISSN 2210-142X

Official URL: https://journal.uob.edu.bh/items/d9950900-7e0b-4bc...

Abstract

The Support Vector Machine (SVM) is well-regarded for its high classification accuracy, but its computational efficiency is often challenged by large datasets. To address this, we introduce the Similarity Distance Plotting-Data Reduction (SDP-DR) method, a novel instance selection technique aimed at enhancing SVM's efficiency and generalization. SDP-DR utilizes similarity distances within the dataset to reduce the number of training instances, thereby lowering both the time and space complexities of the SVM algorithm. With a linear time complexity, which scales proportionally with the size of the dataset, SDP-DR is well-suited for large-scale datasets as it ensures faster processing times and lower computational costs. Our evaluation, conducted on 30 diverse datasets, compares the performance of SDP-DR with standard SVM, Edited Nearest Neighbour (ENN) + SVM, and Condensed Nearest Neighbour (CNN) + SVM models. The results reveal that SDP-DR, especially its Mean of Each Column (MEC) variant, achieves higher accuracy and competitive reduction rates while maintaining reasonable classification times compared to other benchmark algorithms. This balance positions SDP-DR as a promising approach for improving SVM performance, particularly in resource-constrained environments and large-scale datasets. By effectively reducing training instances, SDP-DR offers a pathway to more efficient machine learning models, making it a valuable contribution to instance selection research and advancing the development of scalable SVM classifiers. Its potential applications extend to large-scale data analysis, big data environments, and real-time machine learning systems where computational efficiency is critical.

Item Type: Article
Uncontrolled Keywords: Data Reduction, Instance Selection, Support Vector Machine, Euclidean Distance, Data Classification.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Academic Faculties, Institutes and Centres > Faculty of Computer Science and Information Technology
Faculties, Institutes, Centres > Faculty of Computer Science and Information Technology
Depositing User: Hossin
Date Deposited: 07 Dec 2025 23:30
Last Modified: 07 Dec 2025 23:30
URI: http://ir.unimas.my/id/eprint/50777
