Full Length Article
Fusion: Practice and Applications
Volume 7, Issue 2, PP: 100-109, 2022

Title

Vocal Analysis and Sentiment Discernment using AI

Authors Names : Praveen Singh 1*, Preeti Nagrath 2

1  Affiliation :  Bharati Vidyapeeth's College of Engineering, India

    Email :  praveensingh3129@gmail.com


2  Affiliation :  Bharati Vidyapeeth's College of Engineering, India

    Email :  preeti.nagrath@bharatividyapeeth.edu



Doi   :   https://doi.org/10.54216/FPA.070204

Received: January 19, 2022 Accepted: April 15, 2022

Abstract :

Understanding human emotions is a major factor in personal development and growth, and it therefore plays an important role in imitating human intelligence. Vocal and sentiment analysis are major focus points for advancement in Artificial Intelligence (AI). Sentiment analysis helps data analysts at large enterprises measure public opinion, conduct market research, understand customer experience and monitor brand and product reputation. Emotion recognition provides an opportunity to grasp the general public's sentiments about social events, marketing strategies, political views and product preferences. In this paper, we apply several AI models to a variety of audio datasets to recognise and analyse the sentiments of the speaker. Our dataset includes audio songs sung by several singers and audio clips of a few actors. We trained CNN and LSTM models on our dataset and evaluated their accuracy. The ever-growing need for sentiment analysis coincides with the expansion of social media, such as forum discussions and social networks like Facebook, Twitter, Instagram and many other similar platforms.

Keywords :

Vocal Analysis; Sentiment Discernment; Artificial Intelligence; Personal development


Cite this Article as :
Praveen Singh, Preeti Nagrath, Vocal Analysis and Sentiment Discernment using AI, Fusion: Practice and Applications, Vol. 7, No. 2, (2022): 100-109 (Doi: https://doi.org/10.54216/FPA.070204)