Full Length Article
Fusion: Practice and Applications
Volume 9, Issue 2, PP: 19-26, 2022


Ensemble of Machine Learning Fusion Models for Breast Cancer Detection Based on the Regression Model

Authors Names :  Hamzah A. Alsayadi 1*, Abdelaziz A. Abdelhamid 2, El-Sayed M. El-Kenawy 3, Abdelhameed Ibrahim 4, Marwa M. Eid 5

1  Affiliation :  Computer Science Department, Faculty of Sciences, Ibb University, Yemen

    Email :  hamzah.sayadi@cis.asu.edu.eg

2  Affiliation :  Computer Science Department, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, 11566, Egypt

    Email :  abdelaziz@cis.asu.edu.eg

3  Affiliation :  Department of Communications and Electronics, Delta Higher Institute of Engineering and Technology, Mansoura, 35111, Egypt

    Email :  skenawy@ieee.org

4  Affiliation :  Computer Engineering and Control Systems Department, Faculty of Engineering, Mansoura University, Mansoura, 35516, Egypt

    Email :  afai79@mans.edu.eg

5  Affiliation :  Faculty of Artificial Intelligence, Delta University for Science and Technology, Mansoura, Egypt

    Email :  mmm@ieee.org

Doi   :   https://doi.org/10.54216/FPA.090202

Received: May 16, 2022 Accepted: October 12, 2022

Abstract :

Breast cancer is one of the deadliest cancers among women worldwide and one of the main causes of mortality for women in the United States. Detecting breast cancer earlier and more accurately can extend life expectancy at a lower cost. The efficiency and precision of early detection can be increased by analyzing the large volume of data now available with technologies such as machine learning fusion-based decision support systems. In this paper, we investigate the prediction performance of various regression models and of a decision support system based on these models that provides the predicted category along with a prediction confidence measure. The machine learning (ML) algorithms applied include the decision tree regressor, MLP regressor, SVR, random forest regressor, and K-Neighbors regressor. The models are enhanced by an average ensemble and by an ensemble using the K-Neighbors regressor. We used the Breast Cancer Wisconsin Dataset from Wisconsin Prognostic Breast Cancer (WPBC), with 569 digitized images of fine needle aspirates (FNA) of breast masses and 10 real-valued features. Among the five machine learning methods, the K-Neighbors regressor had the best performance, and the ensemble using the K-Neighbors regressor gave the best accuracy. The results show an improvement in RMSE, MAE, MBE, R, R2, RRMSE, NSE, and WI when compared to the traditional methods.
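
The two ensembling strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give model settings, the neighbor count, or the data split, so the function names, the toy predictions, and k are all hypothetical. The sketch assumes the average ensemble takes the per-sample mean of the base regressors' outputs, and that the K-Neighbors ensemble fits a nearest-neighbor regressor on those outputs. Three of the listed error measures (RMSE, MAE, MBE) are included in their standard forms.

```python
import math

def average_ensemble(base_preds):
    """Per-sample mean of several base models' predictions.

    base_preds: list of per-model prediction lists, all the same length.
    """
    n_models = len(base_preds)
    return [sum(col) / n_models for col in zip(*base_preds)]

def knn_ensemble(train_meta, train_y, query_meta, k=3):
    """k-nearest-neighbor regression over base-model outputs.

    train_meta / query_meta: one row per sample, one column per base model.
    Returns, for each query row, the mean target of its k closest training rows.
    """
    results = []
    for q in query_meta:
        nearest = sorted(
            range(len(train_meta)),
            key=lambda i: math.dist(train_meta[i], q),
        )[:k]
        results.append(sum(train_y[i] for i in nearest) / k)
    return results

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)

def mbe(y_true, y_pred):
    """Mean bias error (signed; positive means over-prediction on average)."""
    return sum(p - t for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical outputs of three base regressors on four samples
# (0.0 = benign, 1.0 = malignant); not data from the paper.
y_true = [0.0, 1.0, 1.0, 0.0]
base_preds = [
    [0.1, 0.8, 0.9, 0.3],
    [0.2, 0.7, 1.0, 0.1],
    [0.0, 0.9, 0.8, 0.2],
]

avg_pred = average_ensemble(base_preds)

# Rows of base-model outputs act as meta-features for the k-NN ensemble.
meta_rows = list(zip(*base_preds))
knn_pred = knn_ensemble(meta_rows, y_true, [(0.85, 0.85, 0.85)], k=2)

print("average ensemble:", avg_pred)
print("RMSE:", rmse(y_true, avg_pred), "MAE:", mae(y_true, avg_pred))
print("k-NN ensemble for one new sample:", knn_pred)
```

In practice the meta-level regressor would be fitted on predictions from held-out data rather than on the same samples it is scored on. The toy example skips that step for brevity.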

Keywords :

Breast Cancer; Ensemble model; Machine learning Fusion; Regression model.

Cite this Article as :
Hamzah A. Alsayadi, Abdelaziz A. Abdelhamid, El-Sayed M. El-Kenawy, Abdelhameed Ibrahim, Marwa M. Eid, Ensemble of Machine Learning Fusion Models for Breast Cancer Detection Based on the Regression Model, Fusion: Practice and Applications, Vol. 9, No. 2, (2022): 19-26 (Doi: https://doi.org/10.54216/FPA.090202)