Automatic evaluation of machine translation quality using longest common subsequence and skip-bigram statistics

Chin-Yew Lin; Franz Josef Och

doi:10.3115/1218955.1219032

Abstract

In this paper we describe two new objective automatic evaluation methods for machine translation. The first method is based on longest common subsequence between a candidate translation and a set of reference translations. Longest common subsequence takes into account sentence level structure similarity naturally and identifies longest co-occurring in-sequence n-grams automatically. The second method relaxes strict n-gram matching to skip-bigram matching. Skip-bigram is any pair of words in their sentence order. Skip-bigram co-occurrence statistics measure the overlap of skip-bigrams between a candidate translation and a set of reference translations. The empirical results show that both methods correlate with human judgments very well in both adequacy and fluency.
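The two statistics described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the scoring formulas, so several things here are assumptions made for illustration only: whitespace tokenization, a single reference rather than a set, and combining precision and recall into an F-measure with a hypothetical `beta` weight. The function names (`lcs_length`, `skip_bigrams`, `f_measure`, `lcs_score`, `skip_bigram_score`) are also invented for this example and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            # Extend a match diagonally, otherwise carry the best neighbour.
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def skip_bigrams(tokens):
    """Multiset of all word pairs (i < j) kept in sentence order."""
    return Counter(combinations(tokens, 2))


def f_measure(matches, cand_total, ref_total, beta=1.0):
    """Assumed combination of precision and recall; beta is illustrative."""
    if matches == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = matches / cand_total
    recall = matches / ref_total
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


def lcs_score(candidate, reference, beta=1.0):
    """LCS-based similarity between one candidate and one reference."""
    c, r = candidate.split(), reference.split()
    return f_measure(lcs_length(c, r), len(c), len(r), beta)


def skip_bigram_score(candidate, reference, beta=1.0):
    """Skip-bigram overlap between one candidate and one reference."""
    c = skip_bigrams(candidate.split())
    r = skip_bigrams(reference.split())
    overlap = sum((c & r).values())  # multiset intersection
    return f_measure(overlap, sum(c.values()), sum(r.values()), beta)


reference = "police killed the gunman"
candidate = "police kill the gunman"
print(lcs_score(candidate, reference))          # 0.75
print(skip_bigram_score(candidate, reference))  # 0.5
```

In this example the longest common subsequence is "police the gunman" (3 of 4 words), while only 3 of the 6 skip-bigrams in each sentence match, so the skip-bigram score is lower. Extending the sketch to a set of references would need a further design choice (for example, taking the best score over references), which the abstract does not specify.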

Keywords

Bigram; Computer science; Machine translation; Longest common subsequence problem; Artificial intelligence; Set (abstract data type); Natural language processing; Subsequence; Evaluation of machine translation; Translation (biology); Matching (statistics); Similarity (geometry); Speech recognition; Pattern matching; Pattern recognition (psychology); Algorithm; Trigram; Statistics; Mathematics; Example-based machine translation

Related Publications

BERTScore: Evaluating Text Generation with BERT

Tianyi Zhang, Varsha Kishore, Felix Wu +2 more

We propose BERTScore, an automatic evaluation metric for text generation. Analogously to common metrics, BERTScore computes a similarity score for each token in the candidate se...

2020 arXiv (Cornell University) 603 citations

BERTScore: Evaluating Text Generation with BERT

Tianyi Zhang, Varsha Kishore, Felix Wu +2 more

We propose BERTScore, an automatic evaluation metric for text generation. Analogously to common metrics, BERTScore computes a similarity score for each token in the candidate se...

2019 arXiv (Cornell University) 2001 citations

Publication Info

Year: 2004
Type: article
Pages: 605-es
Citations: 708
Access: Closed


Cite This

APA Style

Chin-Yew Lin, Franz Josef Och (2004). Automatic evaluation of machine translation quality using longest common subsequence and skip-bigram statistics, 605-es. https://doi.org/10.3115/1218955.1219032

Identifiers

DOI: 10.3115/1218955.1219032