Abstract

This paper introduces the sparse multilayer perceptron (SMLP), which jointly learns a sparse feature representation and nonlinear classifier boundaries to optimally discriminate multiple output classes. SMLP learns the transformation from the inputs to the targets as in a multilayer perceptron (MLP), while the outputs of one of the internal hidden layers are forced to be sparse. This is achieved by adding a sparse regularization term to the cross-entropy cost and updating the parameters of the network to minimize the joint cost. On the TIMIT phoneme recognition task, SMLP-based systems trained on individual speech recognition feature streams perform significantly better than the corresponding MLP-based systems. A phoneme error rate of 19.6% is achieved using the combination of SMLP-based systems, a relative improvement of 3.0% over the combination of MLP-based systems.
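The joint cost described above, cross-entropy plus a sparsity term on one hidden layer's outputs, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the network dimensions, the tanh nonlinearity, the L1 form of the sparsity penalty, and the regularization weight `lam` are all assumptions made here for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical, chosen only for illustration).
n_in, n_hidden, n_classes = 4, 8, 3
x = rng.normal(size=n_in)
target = 1  # index of the correct class

# Randomly initialised weights of a one-hidden-layer MLP.
W1 = rng.normal(scale=0.1, size=(n_hidden, n_in))
W2 = rng.normal(scale=0.1, size=(n_classes, n_hidden))

# Forward pass; h is the hidden layer whose outputs are regularised.
h = np.tanh(W1 @ x)              # hidden activations
logits = W2 @ h
p = np.exp(logits - logits.max())
p /= p.sum()                     # softmax posteriors over classes

# Joint cost = cross-entropy + sparse penalty on hidden outputs.
lam = 0.01                       # regularisation weight (assumed value)
cross_entropy = -np.log(p[target])
sparsity_penalty = lam * np.abs(h).sum()  # L1 penalty (assumed form)
joint_cost = cross_entropy + sparsity_penalty
print(float(joint_cost))
```

In training, the network parameters would be updated by backpropagating through this joint cost, so that the hidden representation becomes sparse while classification accuracy is preserved.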

Keywords

Computer science, Pattern recognition (psychology), TIMIT, Artificial intelligence, Multilayer perceptron, Sparse approximation, Speech recognition, Classifier (UML), Perceptron, Word error rate, Feature vector, Artificial neural network, Hidden Markov model

Publication Info

Year
2011
Type
article
Volume
20
Issue
1
Pages
23-29
Citations
65
Access
Closed

Cite This

G. S. V. S. Sivaram, Hynek Heřmanský (2011). Sparse Multilayer Perceptron for Phoneme Recognition. IEEE Transactions on Audio, Speech, and Language Processing, 20(1), 23-29. https://doi.org/10.1109/tasl.2011.2129510

Identifiers

DOI
10.1109/tasl.2011.2129510