Feature Selection and Stability Analysis using Ensemble Techniques

 

Dipti Theng1,*, K. K. Bhoyar2, Prashant Pawade3

1Department of Computer Science and Engineering, SIT Pune, India

2Department of Computer Science and Engineering, YCCE Nagpur, India

3Department of Civil Engineering, GHRCE Nagpur, India

Abstract
Selecting the most relevant feature subset for a task is recommended for achieving high accuracy and reducing model training time. Ensemble learning has shown superior results in classification; hence, we propose an ensemble method for feature selection and present a stability analysis of the selected feature set. The research question is whether ensemble methods are effective at selecting informative features in a dataset and whether the selected features are more stable than those chosen by other feature selection methods. This paper presents a tree-based ensemble learning approach for feature selection. Our approach includes function perturbation with a voting ensemble, an ensemble with a fixed number of features, and an ensemble with a contiguous number of features. The ensemble learning algorithms are implemented on two high-dimensional microarray biomedical datasets. In our experimental study, the voting ensemble outperforms the other ensemble techniques, reducing feature subset size while achieving higher accuracy. Stability analysis of all the algorithms shows that every ensemble technique has higher stability than the traditional feature selection methods. Thus, ensemble learning proves to be a superior technique for feature selection compared with traditional algorithms. Our results demonstrate that the proposed method is effective in identifying relevant and stable features and can improve the performance of machine learning models.
Emails: deepti.theng@gmail.com; kkbhoyar@gmail.com; prashant.pawade@raisoni.net  


 


Received: October 24, 2024; Revised: January 02, 2025; Accepted: January 31, 2025

 

Keywords: Feature selection; Ensemble technique; Stability; Microarray dataset; Biomarker selection
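
To make the two ideas named in the abstract concrete, the following minimal sketch shows how a voting ensemble could combine feature subsets from several base selectors and how the stability of those subsets could be scored. It is an illustration under stated assumptions, not the implementation used in this paper: the function names `vote_features` and `jaccard_stability`, the `min_votes` threshold, the gene labels, and the choice of mean pairwise Jaccard similarity as the stability measure are all hypothetical.

```python
from itertools import combinations


def vote_features(subsets, min_votes):
    """Keep features chosen by at least `min_votes` of the base selectors.

    Hypothetical helper: the voting rule and threshold are assumptions.
    """
    counts = {}
    for subset in subsets:
        # Count each feature at most once per selector.
        for feature in set(subset):
            counts[feature] = counts.get(feature, 0) + 1
    return sorted(f for f, c in counts.items() if c >= min_votes)


def jaccard_stability(subsets):
    """Mean pairwise Jaccard similarity between selected feature subsets.

    Assumed stability measure: 1.0 means identical subsets, 0.0 means
    no overlap between any pair.
    """
    pairs = list(combinations([set(s) for s in subsets], 2))
    if not pairs:
        return 1.0  # a single subset is trivially stable
    total = 0.0
    for a, b in pairs:
        union = a | b
        total += len(a & b) / len(union) if union else 1.0
    return total / len(pairs)


# Hypothetical subsets returned by three different base selectors.
selected = [
    ["g1", "g2", "g3"],
    ["g2", "g3", "g4"],
    ["g1", "g3", "g5"],
]
consensus = vote_features(selected, min_votes=2)
stability = jaccard_stability(selected)
print(consensus, round(stability, 3))
```

In this toy input, three features receive at least two votes, so the consensus subset is smaller than the union of all selected features, which mirrors the subset-size reduction described in the abstract.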