Abstract
Feature selection is a crucial step in developing a system for identifying emotions in speech. The interaction between features generated from the same audio source has rarely been considered, which may produce redundant features and increase computational cost. To solve this problem, a feature selection method based on correlation analysis and the Fisher criterion is proposed, which removes redundant features that are closely correlated with each other. To further improve the recognition performance of the feature subset obtained by the proposed feature selection, an emotion recognition method based on an extreme learning machine (ELM) decision tree is proposed according to the degree of confusion among different basic emotions. A framework for speech emotion recognition is proposed, and classification experiments using the proposed classification method are performed on the Chinese speech database from the Institute of Automation of the Chinese Academy of Sciences (CASIA). The experimental results show that the proposal achieves an average recognition rate of 89.6%. With the proposal, the emotional states of different speakers could be discriminated from speech quickly and efficiently, which would make speaker-independent interaction between humans and computers or robots possible in the future.
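To make the idea of combining the Fisher criterion with correlation analysis concrete, the following is a minimal sketch and not the authors' implementation. It assumes a common multi-class form of the Fisher score (between-class scatter over within-class scatter per feature) and a greedy rule that skips any feature whose absolute Pearson correlation with an already selected feature reaches a threshold. The function names `fisher_scores` and `select_features` and the parameter `corr_threshold` are hypothetical.

```python
import numpy as np


def fisher_scores(X, y):
    """Per-feature Fisher score: between-class over within-class scatter (assumed form)."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += len(Xc) * Xc.var(axis=0)
    # Small constant avoids division by zero for features with no within-class spread.
    return between / (within + 1e-12)


def select_features(X, y, corr_threshold=0.9):
    """Greedy selection: visit features by descending Fisher score and
    keep a feature only if it is not strongly correlated with any kept feature."""
    scores = fisher_scores(X, y)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = []
    for j in np.argsort(scores)[::-1]:
        if all(corr[j, k] < corr_threshold for k in selected):
            selected.append(int(j))
    return selected
```

Calling `select_features(X, y)` on a samples-by-features matrix returns the indices of the retained features, ordered from most to least discriminative. The threshold value of 0.9 is an illustrative choice, not one taken from the paper.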
7. Conclusion
In this paper, a framework of speech emotion recognition, from feature extraction to emotion classification, was proposed, motivated by the fact that speech emotion recognition is difficult to apply in human-computer interaction. Within this framework, a feature selection method based on correlation analysis and the Fisher criterion and an ELM decision tree recognition method based on the degree of confusion among basic emotions were proposed. The validity of the proposal was verified through a series of contrast experiments comprising four groups of experiments under different experimental conditions. Comparisons of recognition rate and recognition time under these conditions indicate that the ELM is the more suitable classifier for the decision tree algorithm. The utility of feature selection based on correlation analysis and the Fisher criterion was verified by comparing recognition rates across different emotional feature sets.
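For readers unfamiliar with the extreme learning machine, the following is a minimal sketch of a basic ELM classifier, assuming the standard formulation: a single hidden layer with random, untrained input weights and output weights solved in closed form by a pseudoinverse. It is not the authors' code, it does not reproduce the decision tree built from emotion confusion degrees, and the class name `ELMClassifier` and its parameters are hypothetical.

```python
import numpy as np


class ELMClassifier:
    """Basic extreme learning machine: random hidden layer, least-squares output layer."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # One-hot targets, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Input weights and biases are drawn once and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights from the Moore-Penrose pseudoinverse of H.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        return self.classes_[np.argmax(self._hidden(X) @ self.beta, axis=1)]
```

Because training reduces to one pseudoinverse, fitting is fast. In a decision tree arrangement, one such classifier could sit at each node and separate groups of emotions, although how the paper builds those nodes is not shown here.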
In future research, several aspects of the experiments will be improved. First, novel speech emotional features that contain sufficient emotional information, such as the Teager energy operator feature [44], will be extracted, and other feature extraction methods, e.g., deep learning [45], will be adopted. Second, we will study feature selection based on evolutionary computation, by which the information in the emotional labels of the feature set can be fully utilized. Third, different speech databases will be considered to verify the practicability of the proposed method.