Abstract
This paper introduces the sparse multilayer perceptron (SMLP), which jointly learns a sparse feature representation and nonlinear classifier boundaries to optimally discriminate multiple output classes. SMLP learns the transformation from the inputs to the targets as in a multilayer perceptron (MLP), while the outputs of one of the internal hidden layers are forced to be sparse. This is achieved by adding a sparse regularization term to the cross-entropy cost and updating the parameters of the network to minimize the joint cost. On the TIMIT phoneme recognition task, SMLP-based systems trained on individual speech recognition feature streams perform significantly better than the corresponding MLP-based systems. A phoneme error rate of 19.6% is achieved using the combination of SMLP-based systems, a relative improvement of 3.0% over the combination of MLP-based systems.
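Conceptually, the joint cost is the usual cross-entropy objective plus a sparsity penalty on the outputs of the chosen hidden layer. The minimal NumPy sketch below illustrates that combination; the L1 penalty and the weight `lam` are illustrative assumptions, since the abstract does not specify the exact form of the sparse regularizer.

```python
import numpy as np

def smlp_joint_cost(hidden_activations, predictions, targets, lam=1e-3):
    """Sketch of the SMLP joint cost: multi-class cross-entropy plus a
    sparsity term on one internal hidden layer's outputs.

    Assumptions (not specified in the abstract): the sparsity term is an
    L1 penalty and `lam` is its weight.
    """
    eps = 1e-12
    # Cross-entropy between softmax outputs and one-hot targets.
    cross_entropy = -np.sum(targets * np.log(predictions + eps))
    # Sparsity-inducing penalty on the selected hidden layer's outputs.
    sparsity = np.sum(np.abs(hidden_activations))
    return cross_entropy + lam * sparsity
```

Both terms are differentiable almost everywhere, so the network parameters can be updated by backpropagating the combined cost, as described in the abstract.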
Publication Info
- Year: 2011
- Type: article
- Volume: 20
- Issue: 1
- Pages: 23-29
- Citations: 65
- Access: Closed
Identifiers
- DOI: 10.1109/tasl.2011.2129510