Emotion Classification from Speech Waveform Using Machine Learning and Deep Learning Techniques

Original Research Article | Oct 16, 2024 | Vol. 25 No. 1 (2025) | DOI: 10.55003/cast.2024.257184

Abstract

Emotions play a key role in determining the human mental state and indirectly express an individual’s well-being. A speech emotion recognition system can extract a person’s emotions from his/her speech inputs. There are some universal emotions such as anger, disgust, fear, happiness, pleasantness, sadness and neutral. These emotions are of particular significance in situations like the COVID-19 pandemic, when the aged or sick are vulnerable to depression. In the current paper, we examined various classification models under limited computational power and resources in order to determine the emotion of a person from his/her speech. Prosodic features of speech such as pitch, loudness, and tone, and spectral features such as Mel-frequency cepstral coefficients (MFCCs) of the voice were used to analyze the emotions of a person. Although state-of-the-art sequence-to-sequence models for speech detection that offer high levels of accuracy and precision are currently in use, such approaches are computationally demanding. Therefore, in this work, we emphasised the analysis and comparison of different classification algorithms such as multi-layer perceptron (MLP), decision tree, and support vector machine, and deep neural networks such as convolutional neural network and long short-term memory. Given an audio file, the emotions exhibited by the speaker were recognized using machine learning and deep learning techniques. A comparative study was performed to identify the most appropriate algorithms for recognizing emotions. Based on the experimental results, the MLP classifier and the convolutional neural network model offered better accuracy with smaller variations than the other models used in the study.
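
To make the prosodic features named in the abstract concrete, the following is a minimal sketch of how loudness and pitch might be estimated from a single speech frame. It is not the authors' pipeline: the abstract does not say how these features were computed, so root-mean-square energy (for loudness) and a plain autocorrelation peak search (for pitch) are assumed here as common textbook choices. The function names, the 80-400 Hz search range, and the synthetic 200 Hz tone standing in for a voiced frame are all hypothetical and chosen only for illustration.

```python
import math


def synth_tone(freq_hz, sample_rate, n_samples, amplitude=0.5):
    """Generate a pure sine tone as a stand-in for one voiced speech frame."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def rms_loudness(frame):
    """Root-mean-square energy of the frame, a simple loudness proxy."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def estimate_pitch_hz(frame, sample_rate, fmin=80.0, fmax=400.0):
    """Estimate pitch as sample_rate / lag, where lag maximizes the
    autocorrelation within the assumed fmin..fmax search range."""
    min_lag = int(sample_rate / fmax)   # shortest period considered
    max_lag = int(sample_rate / fmin)   # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


if __name__ == "__main__":
    sample_rate = 8000                              # Hz, assumed for the demo
    frame = synth_tone(200.0, sample_rate, 400)     # 50 ms synthetic frame
    print("loudness (RMS):", rms_loudness(frame))
    print("pitch (Hz):", estimate_pitch_hz(frame, sample_rate))
```

In a real system these per-frame values would typically be summarized over an utterance (for example by mean and variance) and combined with spectral features such as MFCCs before being passed to a classifier. Real speech is noisier than a pure tone, so a practical pitch tracker would also need voicing detection and smoothing.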

Author Information

Smitha Narendra Pai

Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India

Punnath Balakrishnan Shanthi

Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India

Shivaprasad Hegde

Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India

About this Article

Current Journal

Vol. 25 No. 1 (2025)

Type of Manuscript

Original Research Article

Keywords

speech emotion detection
support vector machine
decision tree
multi-layer perceptron
convolutional neural network
long short-term memory

Published

16 October 2024

DOI

10.55003/cast.2024.257184
