A Reinforcement Learning Framework for Adaptive Detection of Phishing Attack

Sharvari Patil¹,*, Narendra M. Shekokar¹, Aditya Surve¹, Priyanka Ramchandran¹

¹Dwarkadas J. Sanghvi College of Engineering, Mumbai, India

Emails: sharvarichorghe@gmail.com; narendra.shekokar@djsce.ac.in; surveaditya521@gmail.com; priyankaar25@gmail.com

Abstract

Phishing is one of the most dominant forms of cybercrime, with over half a billion incidents occurring annually, and it remains one of the most insidious forms of fraud because it is so effective. Attacks are on the rise and use increasingly deceptive tactics, often leading unwitting victims to divulge personal information. Phishing fraud also includes website phishing, in which a fraudulent site mimics a legitimate one. Even with user training and good security practices, people still fall for these frauds. Blacklist-based detection has not been very effective because phishing URLs are active only for a limited period. Machine learning methods were therefore adopted to detect phishing attempts, but these solutions do not adapt to changes in the attackers' approach and remain biased towards the solution as it was originally developed. A solution is therefore needed that keeps pace with this constantly evolving attack. The proposed system uses reinforcement learning to detect phishing. An adaptive, intelligent agent is trained with the Q-learning algorithm to learn from previous experience and to dynamically select the relevant features and the classification model. In contrast to static methods, the proposed system continuously adapts its combination of feature subsets and classification models as a defense against rapidly evolving attacks. The system aims to supplement existing cybersecurity measures with an adaptable tool capable of countering sophisticated phishing schemes. Experimental analysis shows that the proposed methodology attained an accuracy of 99.25%, demonstrating its high performance in phishing detection.

Keywords: Reinforcement Learning; Phishing Detection; Internet Security; Feature Engineering; Website Classification; Q-Learning; Adaptive Learning
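
The selection mechanism summarized in the abstract, a Q-learning agent that chooses a feature subset together with a classification model, can be illustrated with a minimal tabular sketch. This is not the authors' implementation. The feature-subset names, the classifier names, the single-state formulation, the hyperparameters, and the `simulated_accuracy` reward are all hypothetical stand-ins chosen for illustration; a real system would train and validate the chosen classifier on phishing data to obtain the reward.

```python
import itertools
import random

# Hypothetical action space: every (feature subset, classifier) pair.
FEATURE_SUBSETS = ["url_lexical", "url_lexical+domain", "url_lexical+domain+content"]
CLASSIFIERS = ["decision_tree", "random_forest", "svm"]
ACTIONS = list(itertools.product(FEATURE_SUBSETS, CLASSIFIERS))

# Assumption: a single environment state, so the agent only learns action values.
STATE = "s0"


def simulated_accuracy(action, rng):
    """Stand-in reward: made-up validation accuracy with small Gaussian noise."""
    subset, clf = action
    base = 0.80 + 0.05 * FEATURE_SUBSETS.index(subset) + 0.03 * CLASSIFIERS.index(clf)
    return min(1.0, max(0.0, base + rng.gauss(0, 0.01)))


def train_agent(steps=5000, alpha=0.2, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {STATE: {a: 0.0 for a in ACTIONS}}
    state = STATE
    for _ in range(steps):
        # Explore with probability epsilon, otherwise exploit the best-known action.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(q[state], key=q[state].get)
        reward = simulated_accuracy(action, rng)
        next_state = STATE  # single-state assumption
        # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
        target = reward + gamma * max(q[next_state].values())
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


q_table = train_agent()
best_action = max(q_table[STATE], key=q_table[STATE].get)
print("Selected (feature subset, classifier):", best_action)
```

Because the agent keeps exploring with probability `epsilon`, a change in the reward of any pair (for example, after attackers change tactics) would eventually be reflected in the Q-table, which is the adaptive behavior the abstract contrasts with static methods.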