Novel Prediction on Breast Cancer through Lazy Learning Approach by Linear Neural Network Search with Distance with Euclidean
S. Amsavalli1,*, Vetripriya M.2, R. Sivasankari2, Vetri Selvan M.3, Vijayakumar K.3
1Dept. of Computer Application, B.S.Abdur Rahman Crescent Institute of Science and Technology, India
2Dept. of Computer Science and Engineering, B.S.Abdur Rahman Crescent Institute of Science and Technology, India
3Dept. of Artificial Intelligence and Data Science, Panimalar Engineering College, Tamil Nadu, India
Email: amsavalli@crescent.education; vetripriya@crescent.education; sivasankari.rp@crescent.education; vetrinelson7@gmail.com; vijayakumarkadumbadi23@gmail.com
Abstract
Breast cancer is the most prevalent cancer affecting women worldwide and remains a major cause of mortality. Early detection and accurate prognosis are critical to improving survival outcomes. This study introduces a novel predictive model for breast cancer diagnosis that integrates a lazy learning paradigm with the K-Nearest Neighbors (KNN) algorithm, using a Linear Nearest Neighbor (NN) Search technique with Euclidean distance as the distance metric. The dataset, comprising 4,024 patient records with 15 clinical and demographic attributes, was obtained from a public repository and underwent preprocessing that included handling of missing values, normalization, and categorical encoding. The classification model was trained and evaluated using 1:9 cross-validation, with K values ranging from 1 to 9 and a constant batch size of 100, to identify the optimal configuration. Among the configurations tested, the model with K=5 performed best, achieving an accuracy of 88.02%, a precision of 0.87, and a recall of 0.88. Additional performance metrics, namely F-measure, Matthews Correlation Coefficient (MCC), and Kappa statistic, further confirmed the robustness of the selected configuration. The proposed model shows superior predictive capability compared to traditional settings and can serve as a decision-support tool for clinicians. The findings suggest that the combination of lazy learning, an effective neighbor search strategy, and a robust distance metric can substantially enhance the predictive accuracy of breast cancer diagnosis. This study highlights the potential of machine learning-based tools in clinical oncology, offering a data-driven approach for early intervention and improved patient outcomes.
Keywords: Breast Cancer; Distance with Euclidean; Batch size; KNN; Linear NN Search
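The approach summarized in the abstract (lazy learning, linear nearest-neighbor search, Euclidean distance, majority vote over K neighbors) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `euclidean` and `knn_predict`, the two-feature toy records, and the class labels "A" and "B" are all hypothetical and chosen only for illustration. The sketch assumes the features have already been normalized and encoded.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=5):
    """Classify `query` by majority vote among its k nearest training points.

    Lazy learning: nothing is fitted in advance. The training records are
    simply stored, and all work happens at query time.
    Linear search: the distance from the query to every stored record is
    computed, then the k smallest distances are kept.
    """
    distances = sorted(
        (euclidean(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already-normalized toy records (not from the study's dataset).
train_X = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
    (0.80, 0.90), (0.90, 0.80), (0.85, 0.75),
]
train_y = ["A", "A", "A", "B", "B", "B"]

prediction = knn_predict(train_X, train_y, (0.12, 0.18), k=3)
```

Because the linear search visits every stored record for each query, the cost of one prediction grows in proportion to the number of training records. That is the usual trade-off of a lazy learner: no training phase, but all the work is deferred to prediction time.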