Abstract
Machine Learning (ML) algorithms play a pivotal role in Speech Emotion Recognition (SER), yet they face a formidable obstacle in accurately discerning a speaker's emotional state. Examining speakers' emotional states is important in a range of real-time applications, including but not limited to virtual reality, human-robot interaction, emergency call centers, and human behavior assessment. Accurate emotion identification in the SER process relies on extracting relevant information from audio inputs. Previous studies on SER have predominantly used short-time characteristics such as Mel Frequency Cepstral Coefficients (MFCCs) because they capture the periodic nature of audio signals effectively. Although such features can improve the representation of emotional content, MFCCs have limitations. This study therefore addresses that issue by systematically selecting multiple audio cues, enhancing the classifier model's ability to discern human emotions accurately. The dataset is taken from the EMO-DB database. The input speech is preprocessed with a 2D Convolutional Neural Network (CNN) that applies convolutional operations to spectrograms, which afford a visual representation of how the frequency content of the audio signal changes over time. The spectrogram data are then normalized, a step that is crucial for Neural Network (NN) training because it aids faster convergence. Next, five auditory features, MFCCs, Chroma, Mel-Spectrogram, Contrast, and Tonnetz, are extracted from the spectrogram sequentially. The aim of feature selection is to retain only the dominant features and exclude irrelevant ones; in this paper, the Sequential Forward Selection (SFS) and Sequential Backward Selection (SBS) techniques are employed to select among the multiple audio cues. Finally, the feature sets composed from the hybrid feature extraction methods are fed into a deep Bidirectional Long Short-Term Memory (Bi-LSTM) network to discern emotions.
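The abstract states only that the spectrogram data are normalized before training; the exact scheme is not given, so the sketch below assumes a simple per-utterance z-score normalization as one common choice.

```python
import numpy as np

def normalize_spectrogram(S: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Z-score normalize a (frequency x time) spectrogram.

    The abstract only says the spectrograms are normalized, so this
    per-utterance zero-mean / unit-variance scheme is an assumption.
    """
    return (S - S.mean()) / (S.std() + eps)
```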
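The five named feature sets map directly onto standard librosa functions, so a minimal extraction sketch is shown below; the `n_mfcc=40` setting and the mean-pooling over time are illustrative assumptions, not values reported by the paper.

```python
import numpy as np
import librosa

def extract_features(path: str, n_mfcc: int = 40) -> np.ndarray:
    """Extract MFCCs, Chroma, Mel-Spectrogram, Spectral Contrast, and
    Tonnetz, then mean-pool each over time into one fixed-length vector."""
    y, sr = librosa.load(path, sr=None)  # keep the file's native sample rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)    # 40 rows
    chroma = librosa.feature.chroma_stft(y=y, sr=sr)          # 12 rows
    mel = librosa.feature.melspectrogram(y=y, sr=sr)          # 128 rows
    contrast = librosa.feature.spectral_contrast(y=y, sr=sr)  # 7 rows
    # Tonnetz is conventionally computed on the harmonic component
    tonnetz = librosa.feature.tonnetz(y=librosa.effects.harmonic(y), sr=sr)  # 6 rows
    return np.concatenate(
        [f.mean(axis=1) for f in (mfcc, chroma, mel, contrast, tonnetz)]
    )  # 193-dimensional vector under these defaults
```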
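The abstract names SFS and SBS but not the wrapper classifier or the target feature count, so the sketch below uses scikit-learn's `SequentialFeatureSelector` with a stand-in k-NN estimator and placeholder data; every concrete value here is an assumption.

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 193))   # placeholder pooled feature matrix
y = rng.integers(0, 7, size=100)  # placeholder labels (EMO-DB has seven emotions)

est = KNeighborsClassifier(n_neighbors=5)  # stand-in wrapper classifier (assumption)

# Forward selection grows the feature set one feature at a time;
# backward selection starts from all features and prunes.
sfs = SequentialFeatureSelector(est, n_features_to_select=40, direction="forward", cv=3)
sbs = SequentialFeatureSelector(est, n_features_to_select=40, direction="backward", cv=3)

X_fwd = sfs.fit_transform(X, y)  # shape (100, 40)
X_bwd = sbs.fit_transform(X, y)  # shape (100, 40)
```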
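The abstract specifies a deep Bi-LSTM classifier but no architecture details; the Keras sketch below shows one plausible configuration, with the layer widths, dropout rate, and per-frame feature dimension chosen as illustrative assumptions (the seven output classes correspond to EMO-DB's emotion categories).

```python
import tensorflow as tf

NUM_CLASSES = 7   # EMO-DB's seven emotion categories
N_FEATURES = 40   # assumed per-frame feature dimension

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, N_FEATURES)),  # variable-length frame sequences
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dropout(0.3),                     # illustrative regularization
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Stacking two `Bidirectional(LSTM)` layers (the first returning full sequences) is one standard way to realize a "deep" Bi-LSTM; the actual depth used in the study is not stated in the abstract.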