Solving Unconstrained Minimization Problems and Training Neural Networks via Enhanced Conjugate Gradient Algorithms
Bassim A. Hassan1, Issam A. R. Moghrabi2, Talal M. Alharbi3,*, Alaa Luqman Ibrahim4
1College of Computer Science and Mathematics, University of Mosul, Mosul, Iraq
2Department of Information Systems and Technology, Kuwait Technical College, Kuwait City, Kuwait
3Department of Mathematics, College of Science, Buraydah, Qassim University, Saudi Arabia
4Department of Mathematics, College of Science, University of Zakho, Zakho, Kurdistan Region, Iraq
Emails: basimah@uomosul.edu.iq; i.moghrabi@ktech.edu.kw; ta.alharbi@qu.edu.sa; alaa.ibrahim@uoz.edu.krd
Abstract
Artificial neural networks have become a cornerstone of modern artificial intelligence, driving progress in a wide range of fields. Their effective training depends heavily on techniques from unconstrained optimization, with gradient-based iterative methods being especially common. This study presents a new variant of the conjugate gradient method for unconstrained optimization. The method is designed to satisfy the sufficient descent condition and ensures global convergence. Numerical tests across a variety of problem sizes show that it outperforms traditional conjugate gradient techniques in iteration counts, function evaluations, and overall computational time. The new approach has also been applied to neural network training, where experimental results show faster convergence and better accuracy, with fewer training iterations and lower mean squared error than standard methods. Overall, this work contributes to optimization strategies for neural network training and demonstrates the method's potential for tackling the complex optimization problems often encountered in machine learning.
Keywords: Optimization; Conjugate Gradient; Quasi-Newton; Conjugacy Condition; Neural Networks
1. Introduction