Enhancements in Immediate Speech Emotion Detection: Harnessing Prosodic and Spectral Characteristics

Z. B. M. D. Shah; SHAN Zhiyong; Adnan

doi:10.38124/ijisrt/ijisrt24apr872

Abstract

Speech is central to how humans express and understand emotion. Emotional speech processing faces challenges in expert data sampling, dataset organization, and the computational cost of large-scale analysis. This study introduces a new speech emotion recognition system intended to reduce data redundancy and high dimensionality. The system uses a Diffusion Map for dimensionality reduction together with ensemble classifiers built from Decision Trees and K-Nearest Neighbors (KNN), a combination proposed to increase recognition accuracy. Speech emotion recognition is gaining popularity in affective computing, with applications in medicine, industry, and academia. The goal of this work is an efficient and robust framework for real-time emotion identification. To identify emotions with supervised machine learning models, the work uses paralinguistic features such as intensity, pitch, and Mel-frequency cepstral coefficients (MFCCs). The experimental analysis combines prosodic and spectral information and classifies it with Random Forest, Multilayer Perceptron (MLP), Support Vector Machine (SVM), KNN, and Gaussian Naïve Bayes models. Their short training times make these models well suited to real-time applications. SVM and MLP achieve the highest accuracies, at 70.86% and 79.52%, respectively. Comparisons with benchmarks show significant improvements over earlier models.
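
The feature-then-classify idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only two prosodic features (RMS intensity and an autocorrelation pitch estimate), omits MFCCs and dimensionality reduction, and applies a plain KNN vote, one of the classifiers the abstract names. The synthetic tones, the "calm"/"excited" labels, the sample rate, and every function name are assumptions made for illustration only.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz; assumed value for this synthetic demo


def synth_tone(freq_hz, amplitude, duration_s=0.1, seed=0):
    """Noisy sine wave used as a stand-in for a voiced speech frame."""
    rng = random.Random(seed)
    n = int(SAMPLE_RATE * duration_s)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        + rng.gauss(0.0, 0.01)
        for i in range(n)
    ]


def rms_intensity(frame):
    """Root-mean-square amplitude, a simple intensity measure."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def autocorr_pitch(frame, fmin=75.0, fmax=400.0):
    """Estimate pitch (Hz) from the lag with the largest autocorrelation."""
    min_lag = int(SAMPLE_RATE / fmax)
    max_lag = int(SAMPLE_RATE / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return SAMPLE_RATE / best_lag


def extract_features(frame):
    """Two prosodic features per frame: [pitch in Hz, RMS intensity]."""
    return [autocorr_pitch(frame), rms_intensity(frame)]


def fit_scaler(rows):
    """Per-feature mean and standard deviation for z-score scaling."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return means, stds


def scale(row, means, stds):
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def knn_predict(train, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def make_dataset(seed_offset):
    """Hypothetical classes: low/quiet tones vs. high/loud tones."""
    data = []
    for j in range(5):
        calm = synth_tone(100 + 5 * j, 0.1, seed=seed_offset + j)
        excited = synth_tone(220 + 10 * j, 0.4, seed=seed_offset + 100 + j)
        data.append((extract_features(calm), "calm"))
        data.append((extract_features(excited), "excited"))
    return data


train_raw = make_dataset(0)
test_raw = make_dataset(1000)
means, stds = fit_scaler([features for features, _ in train_raw])
train = [(scale(f, means, stds), label) for f, label in train_raw]
correct = sum(
    knn_predict(train, scale(f, means, stds)) == label for f, label in test_raw
)
accuracy = correct / len(test_raw)
print(f"held-out accuracy on synthetic frames: {accuracy:.2f}")
```

Scaling the features before the distance computation matters here because pitch in Hz is numerically far larger than RMS intensity and would otherwise dominate the KNN vote. Real recordings would need framing, voicing detection, and spectral features before a comparison with the reported accuracies would be meaningful.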

Keywords

Speech recognition; Computer science; Emotion detection; Psychology; Emotion recognition

Publication Info

Year: 2024
Type: article
Pages: 1526-1534
Citations: 1659
Access: Closed

Cite This

APA Style

Z. B. M. D. Shah, SHAN Zhiyong, Adnan (2024). Enhancements in Immediate Speech Emotion Detection: Harnessing Prosodic and Spectral Characteristics. International Journal of Innovative Science and Research Technology (IJISRT), 1526-1534. https://doi.org/10.38124/ijisrt/ijisrt24apr872
