Detection of Fake News on Twitter Using a Novel Data-Mining Algorithm

Dena Kadhim Muhsen¹,*, Azhar F. Al-zubidi², Gheed Tawfeeq Waleed³

¹Computer Science College, University of Technology - Iraq, 10066 Baghdad, Iraq

²Computer Science Department, College of Sciences, AL Nahrain University, Jadriya, Baghdad, Iraq

³Federal Public Service Council, Baghdad, Iraq

Emails: dena.k.muhsen@uotechnology.edu.iq; azhar.flaih@nahrainuniv.edu.iq; gheed94@gmail.com

Abstract

Social media has supplanted conventional media as one of the most important venues for information exchange. Because the internet is accessible and simple to use, news on social media tends to spread faster and more easily than through conventional news sources. However, not all information shared on social media is true, and some of it comes from untrustworthy sources. Fake news can readily be manufactured and disseminated on social media, where it has the potential to mislead or misinform readers. Although several manual fact-checking websites have been built to determine whether news is reliable, they cannot keep up with the volume of rapidly circulating online information, particularly on social media. Twitter, one of the best-known sources of continuous news, is also one of the dominant news-disseminating media. Topic models facilitate the detection of the most relevant vocabulary and concepts within a text corpus. This paper proposes a model for recognizing fake news messages in Twitter posts using a novel data-mining algorithm. First, the Twitter dataset is collected and preprocessed using word embedding. Features are then extracted using Term Frequency-Inverse Document Frequency (TF-IDF) and Latent Semantic Analysis (LSA), and feature selection is based on the Adaptive Whale Optimized Wrapper (AWOW) method. We propose a Fine-tuned Weighted Probabilistic Bayesian Neural Network (FWP-BNN) to classify news as normal or fake. The proposed method is compared with existing approaches, and evaluation metrics are reported. Test results on a large dataset of miscellaneous events demonstrate the efficacy of the proposed technique in recognizing fake tweets.

Keywords: Social media; Twitter; Fake news; Term Frequency-Inverse Document Frequency; Latent Semantic Analysis; Adaptive Whale Optimized Wrapper method
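
To make the TF-IDF weighting named in the abstract concrete, the following is a minimal sketch in plain Python and not the authors' implementation. The helper names `tokenize` and `tfidf`, the three sample tweets, and the specific variant used (relative term frequency multiplied by an unsmoothed logarithmic inverse document frequency) are assumptions made for illustration only. The LSA, AWOW, and FWP-BNN stages are not shown.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, keep alphabetic tokens (assumed scheme)."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def tfidf(docs):
    """Return (vocabulary, matrix) where matrix[i][j] is the TF-IDF weight
    of vocabulary[j] in docs[i].

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vocab = sorted(df)
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        rows.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, rows


# Hypothetical tweets, invented for this example.
tweets = [
    "breaking news aliens landed in paris",
    "local news council approves budget",
    "breaking news council denies rumor",
]

vocab, rows = tfidf(tweets)
for term, weight in zip(vocab, rows[0]):
    if weight > 0:
        print(f"{term}: {weight:.4f}")
```

Under this variant, a term that occurs in every tweet (here "news") receives a weight of zero, while terms confined to a single tweet receive the largest weights. LSA would then typically be applied as a truncated singular value decomposition of the resulting matrix.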