Speaker-independent phone recognition using hidden Markov models

Abstract

Hidden Markov modeling is extended to speaker-independent phone recognition. Using multiple codebooks of various linear-predictive-coding (LPC) parameters and discrete hidden Markov models (HMMs), the authors obtain a speaker-independent phone recognition accuracy of 58.8-73.8% on the TIMIT database, depending on the type of acoustic and language models used. In comparison, the performance of expert spectrogram readers is only 69% without use of higher level knowledge. The authors introduce the co-occurrence smoothing algorithm, which enables accurate recognition even with very limited training data. Since the results were evaluated on a standard database, they can be used as benchmarks to evaluate future systems.
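
As an illustration of what scoring with a discrete HMM over multiple codebooks can look like, here is a minimal sketch of the standard forward algorithm. The abstract does not say how the per-codebook scores are combined; the sketch assumes the codebooks are treated as independent, so their symbol probabilities are multiplied. The function name, the data layout, and the toy numbers are all hypothetical and are not taken from the paper.

```python
from math import prod


def forward_likelihood(initial, transition, emissions, observations):
    """Likelihood of an observation sequence under a discrete HMM.

    initial[s]          probability of starting in state s
    transition[p][s]    probability of moving from state p to state s
    emissions[c][s][k]  probability that state s emits symbol k of codebook c
    observations[t][c]  symbol index from codebook c observed at frame t
    """
    n_states = len(initial)

    def emit(state, frame):
        # Assumption: codebooks are independent given the state, so the
        # per-codebook symbol probabilities are multiplied.
        return prod(emissions[c][state][k] for c, k in enumerate(frame))

    # Initialisation: start in each state and emit the first frame.
    alpha = [initial[s] * emit(s, observations[0]) for s in range(n_states)]

    # Recursion: sum over predecessor states, then emit the current frame.
    for frame in observations[1:]:
        alpha = [
            sum(alpha[p] * transition[p][s] for p in range(n_states))
            * emit(s, frame)
            for s in range(n_states)
        ]

    # Termination: any state may be the final one in this toy model.
    return sum(alpha)


# Toy model (illustrative numbers only): 2 states, 2 codebooks of 2 symbols.
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.4, 0.6]]
emissions = [
    [[0.9, 0.1], [0.2, 0.8]],  # codebook 0, one distribution per state
    [[0.5, 0.5], [0.3, 0.7]],  # codebook 1, one distribution per state
]
observations = [(0, 1), (1, 1), (0, 0)]  # one symbol per codebook per frame

likelihood = forward_likelihood(initial, transition, emissions, observations)
print(likelihood)
```

In a recognizer, a likelihood of this kind would be computed (or approximated by Viterbi search) for competing phone models and combined with a language model score; that part is outside this sketch.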

Keywords

Hidden Markov model, Speech recognition, Computer science, Spectrogram, Phone, TIMIT, Smoothing, Pattern recognition (psychology), Linear predictive coding, Artificial intelligence, Coding (social sciences), Markov model, Markov chain, Machine learning, Speech coding, Mathematics, Statistics

Affiliated Institutions

Carnegie Mellon University (US)

Related Publications

Global optimization of a neural network-hidden Markov model hybrid

Yoshua Bengio, Renato De Mori, Giovanni Flammia +1 more

An original method for integrating artificial neural networks (ANN) with hidden Markov models (HMM) is proposed. ANNs are suitable for performing phonetic classification, wherea...

2002 18 citations

Phoneme recognition using time-delay neural networks

Alexander Waibel, Toshiyuki Hanazawa, Geoffrey E. Hinton +2 more

The authors present a time-delay neural network (TDNN) approach to phoneme recognition which is characterized by two important properties: (1) using a three-layer arrangement of...

1989 IEEE Transactions on Acoustics Speech... 2619 citations

Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition

Ossama Abdel‐Hamid, Abdelrahman Mohamed, Hui Jiang +1 more

Convolutional Neural Networks (CNN) have showed success in achieving translation invariance for many image processing tasks. The success is largely attributed to the use of loca...

2012 885 citations

Hidden Markov models for character recognition

J.A. Vlontzos, Sun‐Yuan Kung

A hierarchical system for character recognition with hidden Markov model knowledge sources which solve both the context sensitivity problem and the character instantiation probl...

1992 IEEE Transactions on Image Processing 55 citations

Acoustic Modeling Using Deep Belief Networks

Abdelrahman Mohamed, George E. Dahl, Geoffrey E. Hinton

Gaussian mixture models are currently the dominant technique for modeling the emission distribution of hidden Markov models for speech recognition. We show that better phone rec...

2011 IEEE Transactions on Audio Speech and... 1732 citations

Publication Info

Year: 1989
Type: article
Volume: 37
Issue: 11
Pages: 1641-1648
Citations: 931
Access: Closed

Cite This

APA Style

                            
Lee, K.-F., & Hon, H.-W. (1989). Speaker-independent phone recognition using hidden Markov models. IEEE Transactions on Acoustics, Speech, and Signal Processing, 37(11), 1641-1648. https://doi.org/10.1109/29.46546

Identifiers

DOI: 10.1109/29.46546