Improving Support Vector Machine for Imbalanced Big Data Classification
Alaa Abdulazeez Qanbar1, Zakariya Yahya Algamal2
1 Department of Statistics and Informatics, University of Mosul, Mosul, Iraq
2 Department of Statistics and Informatics, University of Mosul, Mosul, Iraq
Emails: alaa.22csp59@student.uomosul.edu.iq; zakariya.algamal@uomosul.edu.iq
*Corresponding Author: alaa.22csp59@student.uomosul.edu.iq
Abstract: Many real-world data sets are imbalanced: one class of patterns accounts for a large proportion of the observations, while another is represented by relatively few. In addition, diagnostic analysis is an effective way to identify significant observations and exclude influential ones. Support vector machines (SVM), a common classification technique, perform poorly on imbalanced data sets and when influential observations are present. In this research, the pigeon optimization algorithm, a metaheuristic algorithm, is employed to address the problem of influential observations in SVM. Experiments are conducted on three real data sets. The proposed approach achieves higher classification accuracy than other widely used algorithms. The approach could also be applied to other biological, chemical, and medical data sets.
Received: August 17, 2023 Revised: November 11, 2023 Accepted: January 11, 2024
Keywords: Pigeon optimization algorithm; meta-heuristic algorithm; imbalanced data; support vector machine.