Full Length Article
Journal of Artificial Intelligence and Metaheuristics
Volume 7, Issue 1, PP: 19-37, 2024

Title

Optimizing Student Performance Prediction Using Binary Waterwheel Plant Algorithm for Feature Selection and Machine Learning

Faris H. Rizk 1,*, Mahmoud Elshabrawy 2, Basant Sameh 3, Karim Mohamed 4, Ahmed Mohamed Zaki 5

1  Computer Science and Intelligent Systems Research Center, Blacksburg 24060, Virginia, USA
    (faris.rizk@jcsis.org)

2  Department of Communications and Electronics, Delta Higher Institute of Engineering and Technology, Mansoura, 35111, Egypt
    (CH1900052@dhiet.edu.eg)

3  Department of Communications and Electronics, Delta Higher Institute of Engineering and Technology, Mansoura, 35111, Egypt
    (CH1900072@dhiet.edu.eg)

4  Department of Communications and Electronics, Delta Higher Institute of Engineering and Technology, Mansoura, 35111, Egypt
    (CH1900193@dhiet.edu.eg)

5  Computer Science and Intelligent Systems Research Center, Blacksburg 24060, Virginia, USA
    (azaki@jcsis.org)


DOI: https://doi.org/10.54216/JAIM.070102

Received: April 27, 2023 Revised: August 11, 2023 Accepted: January 01, 2024

Abstract:

This paper addresses a pivotal part of educational data analytics: increasing the accuracy and interpretability of student performance prediction models. The cornerstone of our method is the application of the binary waterwheel plant algorithm (bWWPA) to feature selection. The features supplied to a model largely determine its behavior, so choosing them well is essential. We begin with thorough data pre-processing, which covers data cleaning, handling of missing values, duplicate removal, and data transformation, so that the model input is as well prepared as possible. Before applying bWWPA, we employ an ensemble of regression machine learning models to set up a baseline for predictive capability, obtaining initial outcomes with an average Mean Squared Error (MSE) of 0.064. The feature selection phase follows and shows the algorithm's ability to recognize important features and, as a result, improve model effectiveness and explanatory power. Comparative analyses after feature selection show gains in performance, with the refined models reporting a lower MSE of 0.032. Methodologically, these findings contribute to student performance prediction and emphasize the decisive role of feature selection in improving models. The paper's significance extends to teachers, institutions, and researchers, offering insights into more precise and relevant interventions that support student success.
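The wrapper-style feature selection described in the abstract can be illustrated with a small sketch. It is not the authors' bWWPA: an exhaustive search over binary masks stands in for the metaheuristic, a simple k-nearest-neighbor regressor stands in for the paper's regression models, and the data are synthetic. The names `fitness`, `knn_predict`, `make_data`, and the weight `alpha` are hypothetical and chosen only for illustration. The point is the shape of the objective: a binary mask selects features, and the mask is scored by validation MSE plus a small penalty on the number of features kept.

```python
import itertools
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def knn_predict(train_X, train_y, x, mask, k=3):
    """Average the targets of the k nearest training rows.

    Distances use only the features whose mask entry is 1.
    """
    scored = sorted(
        (sum((a - b) ** 2 for a, b, m in zip(row, x, mask) if m), y)
        for row, y in zip(train_X, train_y)
    )
    nearest = scored[:k]
    return sum(y for _, y in nearest) / len(nearest)


def fitness(mask, train, valid, alpha=0.99):
    """Score a binary feature mask (lower is better).

    alpha is an assumed weight trading validation MSE against the
    fraction of features kept; it is not a value from the paper.
    """
    if not any(mask):
        return float("inf")  # an empty feature set is not a valid solution
    (train_X, train_y), (valid_X, valid_y) = train, valid
    preds = [knn_predict(train_X, train_y, x, mask) for x in valid_X]
    return alpha * mse(valid_y, preds) + (1 - alpha) * sum(mask) / len(mask)


def make_data(n, rng, n_features=4):
    """Synthetic data: the target equals feature 0, the rest are noise."""
    X = [[rng.random() for _ in range(n_features)] for _ in range(n)]
    y = [row[0] for row in X]
    return X, y


rng = random.Random(0)
train = make_data(60, rng)
valid = make_data(30, rng)

# Exhaustive search over all 2^4 masks, used here in place of bWWPA.
candidates = list(itertools.product([0, 1], repeat=4))
best_mask = min(candidates, key=lambda m: fitness(m, train, valid))
baseline = fitness((1, 1, 1, 1), train, valid)

print("best mask:", best_mask)
print("fitness with all features:", baseline)
print("fitness with best mask:", fitness(best_mask, train, valid))
```

Exhaustive search is only feasible for a handful of features; a metaheuristic such as bWWPA would replace the `min(...)` line by proposing and refining candidate masks while reusing the same kind of fitness function.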

Keywords:

feature selection; student performance prediction; optimization; educational data analysis; regression models; Waterwheel Plant Optimization Algorithm


Cite this article as:
Rizk, F. H., Elshabrawy, M., Sameh, B., Mohamed, K., & Zaki, A. M. (2024). Optimizing Student Performance Prediction Using Binary Waterwheel Plant Algorithm for Feature Selection and Machine Learning. Journal of Artificial Intelligence and Metaheuristics, 7(1), 19-37. https://doi.org/10.54216/JAIM.070102