Abstract

In speech emotion recognition, the training and test data used for system development usually fit each other well, but further 'similar' data may be available. Transfer learning helps to exploit such similar data for training, despite the inherent dissimilarities, in order to boost a recogniser's performance. In this context, this paper presents a sparse autoencoder method for feature transfer learning in speech emotion recognition. In our proposed method, a common emotion-specific mapping rule is learnt from a small set of labelled data in a target domain. Newly reconstructed data are then obtained by applying this rule to the emotion-specific data in a different domain. Experimental results on six standard databases show that our approach significantly improves performance relative to learning each source domain independently.
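The idea in the abstract can be sketched in code: train a sparse autoencoder on a small set of labelled target-domain features, then pass source-domain features of the same emotion class through the learnt encoder/decoder to obtain reconstructed, target-adapted data. The following is a minimal numpy sketch, not the authors' implementation; the single hidden layer, sigmoid units, sparsity target `rho`, penalty weight `beta`, and learning rate are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class SparseAutoencoder:
    """Single-hidden-layer sparse autoencoder (illustrative sketch).

    Hyperparameters are assumptions for the example, not values from the paper.
    """
    def __init__(self, n_in, n_hidden, rho=0.05, beta=0.1, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.b2 = np.zeros(n_in)
        self.rho, self.beta, self.lr = rho, beta, lr

    def fit(self, X, epochs=200):
        n = X.shape[0]
        for _ in range(epochs):
            H = sigmoid(X @ self.W1 + self.b1)   # hidden activations
            R = sigmoid(H @ self.W2 + self.b2)   # reconstruction of X
            rho_hat = H.mean(axis=0)             # mean activation per hidden unit
            # output-layer error (squared loss through sigmoid output)
            d2 = (R - X) * R * (1 - R)
            # KL-divergence sparsity gradient added to the hidden-layer error
            sparse = self.beta * (-self.rho / rho_hat
                                  + (1 - self.rho) / (1 - rho_hat))
            d1 = (d2 @ self.W2.T + sparse) * H * (1 - H)
            # batch gradient-descent updates
            self.W2 -= self.lr * (H.T @ d2) / n
            self.b2 -= self.lr * d2.mean(axis=0)
            self.W1 -= self.lr * (X.T @ d1) / n
            self.b1 -= self.lr * d1.mean(axis=0)
        return self

    def reconstruct(self, X):
        """Apply the learnt mapping to new (e.g. source-domain) features."""
        return sigmoid(sigmoid(X @ self.W1 + self.b1) @ self.W2 + self.b2)
```

Usage would follow the transfer scheme: `ae = SparseAutoencoder(d, k).fit(target_features)`, then `adapted = ae.reconstruct(source_features)` for each emotion class, with the adapted data used to train the recogniser. Feature normalisation to [0, 1] is assumed for the sigmoid output layer.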

Keywords

Autoencoder, Computer science, Transfer of learning, Artificial intelligence, Feature (linguistics), Speech recognition, Domain (mathematical analysis), Exploit, Context (archaeology), Encoder, Test data, Pattern recognition (psychology), Training set, Set (abstract data type), Data set, Machine learning, Deep learning, Mathematics

Related Publications

Universal Sentence Encoder

We present models for encoding sentences into embedding vectors that specifically target transfer learning to other NLP tasks. The models are efficient and result in accurate pe...

2018 arXiv (Cornell University) 1289 citations

Publication Info

Year: 2013
Type: article
Pages: 511-516
Citations: 358
Access: Closed

Citation Metrics

OpenAlex: 358
Influential: 12
CrossRef: 252

Cite This

Jun Deng, Zixing Zhang, Erik Marchi et al. (2013). Sparse Autoencoder-Based Feature Transfer Learning for Speech Emotion Recognition. 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, 511-516. https://doi.org/10.1109/acii.2013.90

Identifiers

DOI
10.1109/acii.2013.90
