Full Length Article
Journal of Artificial Intelligence and Metaheuristics
Volume 4, Issue 2, PP: 08-17, 2023

Title

A Comparative Analysis of Methods for Detecting and Diagnosing Breast Cancer Based on Data Mining

Ahmed T. Alhasani 1 *, Hussein Alkattan 2, Alhumaima Ali Subhi 3, El-Sayed M. El-Kenawy 4, Marwa M. Eid 5

1  Al-Furat Al-Awsat Technical University Computer Center Administrator, Najaf, Iraq
    (ahmed.alhasani@atu.edu.iq)

2  Department of System Programming, South Ural State University, Chelyabinsk 454080, Russia
    (alkattan.hussein92@gmail.com)

3  Electronic Computer Center University of Diyala, Diyala, Iraq
    (alhumaimaali@uodiyala.edu.iq)

4  Faculty of Artificial Intelligence, Delta University for Science and Technology, Mansoura, Egypt
    (mmm@ieee.org)

5  Faculty of Artificial Intelligence, Delta University for Science and Technology, Mansoura, Egypt
    (mmm@ieee.org)


DOI: https://doi.org/10.54216/JAIM.040201

Received: October 28, 2022 Revised: April 12, 2023 Accepted: June 24, 2023

Abstract :

Breast cancer is a significant public health concern worldwide, and early detection is crucial for its treatment. Although breast cancer has been studied extensively, its classification accuracy can still be improved. This study aims to improve that accuracy by applying information gain feature selection and machine learning techniques to the Wisconsin Diagnostic Breast Cancer (WDBC) dataset. Information gain is used to reduce the number of features, and three machine learning algorithms, support vector machine (SVM), naive Bayes (NB), and the C4.5 decision tree, are used to classify the cases. The classifiers are then compared on accuracy. The C4.5 decision tree achieves the highest classification accuracy (100%), with a weighted average precision of 100% and recall of 100%. The SVM achieves an accuracy of 98.42%, with a weighted average precision of 98.17% and recall of 98.58%. The NB algorithm attains an accuracy of 96%, with a weighted average precision of 18.57% and recall of 50%. Compared with similar studies, these results represent significant progress and indicate new opportunities for breast cancer detection.
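
The information gain step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the midpoint thresholding of numeric features, and the toy values are all assumptions made for illustration, and the data are invented rather than taken from WDBC. The sketch scores each feature by the largest entropy reduction obtainable from a single binary split and ranks features by that score.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting a numeric feature at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    total = len(labels)
    remainder = (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)
    return entropy(labels) - remainder


def best_gain(values, labels):
    """Highest gain over midpoints between consecutive distinct feature values."""
    points = sorted(set(values))
    thresholds = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return max((information_gain(values, labels, t) for t in thresholds), default=0.0)


def rank_features(columns, labels):
    """Return feature names sorted by decreasing information gain."""
    gains = {name: best_gain(vals, labels) for name, vals in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)


# Toy data, invented for illustration only (not WDBC measurements).
labels = ["M", "M", "M", "B", "B", "B"]
columns = {
    "feature_a": [20.1, 18.4, 19.7, 12.3, 11.8, 13.0],  # separates the classes cleanly
    "feature_b": [0.10, 0.08, 0.11, 0.09, 0.10, 0.12],  # overlaps across classes
}
ranking = rank_features(columns, labels)
```

In a pipeline of the kind the abstract describes, only the top-ranked features would be passed on to the SVM, NB, and C4.5 classifiers; how many features to keep is a choice this sketch does not make.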

Keywords :

Information gain feature selection; Machine learning; Support vector machine classifier; Naïve Bayes classifier; C4.5 decision tree classifier; Performance evaluation tests


