A Transfer Learning Framework for Sentiment Analysis in Indian Vernaculars

 

Kumal Kumar*1, Shivam Kumar2

 

1 Mizoram University, Aizawl-796004, India

2 Mizoram University, Aizawl-796004, India

 

Emails: kunal9900fice@gmail.com; kingshivam854@gmail.com

 

Abstract

This paper explores sentiment analysis in Indian languages using deep learning and natural language processing (NLP). Three neural network architectures (CNN, LSTM, and GRU) are used to build sentiment analysis models, and transfer learning is applied through FastText, MuRIL, and IndicBERT embeddings. The models are trained and evaluated on a translated version of the Sentiment140 dataset obtained from Kaggle, with performance measured by accuracy, precision, recall, and F1-score. The study uses deep learning techniques to address the challenges of sentiment analysis in linguistically diverse Indian languages, providing insights into sentiment analysis across diverse languages and cultures. The analysis also covers Gujarati, Marathi, and Sindhi, contributing to the understanding of sentiment analysis across a broader range of Indian languages.

Keywords: Sentiment Analysis; Convolutional Neural Network (CNN); Long Short-Term Memory (LSTM); Gated Recurrent Unit (GRU); Deep Learning