
[Paper Reading] A Study of Translation Edit Rate with Targeted Human Annotation

2014-04-17 15:23 549 views

A Study of Translation Edit Rate with Targeted Human Annotation

Matthew Snover and Bonnie Dorr

Institute for Advanced Computer Studies

University of Maryland

College Park, MD 20742

{snover,bonnie}@umiacs.umd.edu

Summary of key points from the paper:

1、Translation Edit Rate (TER) measures the amount of editing that a human would have to perform to change a system output so it exactly matches a reference translation.

2、Automatic evaluation metrics for machine translation include BLEU, METEOR, NIST, and TER, among others.

3、We define a new, more intuitive measure of “goodness” of MT output—specifically, the number of edits needed to fix the output so that it semantically matches a correct translation.

4、The GALE (Global Autonomous Language Exploitation) research program (Olive, 2005) recently introduced a new error measure called Translation Edit Rate (TER), originally designed to count the number of edits (including phrasal shifts) that a human performs to change a hypothesis so that it is both fluent and has the correct meaning. This was then decomposed into two steps: defining a new reference, and finding the minimum number of edits so that the hypothesis exactly matches one of the references. The measure was defined such that all edits, including shifts, have a cost of one. Finding only the minimum number of edits, without generating a new reference, is the measure defined as TER; finding the minimum number of edits to a new targeted reference is defined as human-targeted TER (HTER).

5、BLEU (Papineni et al., 2002) calculates the score of a translation by measuring the number of n-grams, of varying length, of the system output that occur within the set of references.

6、METEOR (Banerjee and Lavie, 2005) is an evaluation measure that counts the number of exact word matches between the system output and the reference. Unmatched words are then stemmed and matched. Additional penalties are assessed for reordering the words between the hypothesis and the reference. This method has been shown to correlate very well with human judgments.

7、TER is defined as the minimum number of edits needed to change a hypothesis so that it exactly matches one of the references, normalized by the average length of the references.

8、Possible edits include the insertion, deletion, and substitution of single words as well as shifts of word sequences.

9、TER = (number of edits) / (average number of reference words)

10、The number of insertions, deletions, and substitutions is calculated using dynamic programming. A greedy search is used to find the set of shifts, by repeatedly selecting the shift that most reduces the number of insertions, deletions, and substitutions, until no more beneficial shifts remain.

11、

12、In both TER and HTER, the majority of the edits were substitutions and deletions.

13、In an analysis of shift size and distance, we found that most shifts are short (1 word) and move the shifted words by fewer than 7 positions.
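
To make the TER definition in items 7, 8 and 10 concrete, here is a minimal Python sketch, not the authors' implementation. It computes word-level insertions, deletions, and substitutions with dynamic programming, then greedily applies the shift that most reduces that count, charging one edit per shift. Several choices are assumptions made for illustration: the function names (`edit_distance`, `best_shift`, `count_edits`, `ter`) are hypothetical, tokenization is plain whitespace splitting, a shift counts as "beneficial" whenever it lowers the insertion/deletion/substitution count, and the shift search is brute force over every word block and position rather than the constrained search a real TER tool would use.

```python
from typing import List, Sequence, Tuple


def edit_distance(hyp: Sequence[str], ref: Sequence[str]) -> int:
    """Word-level Levenshtein distance; each edit costs 1."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            curr.append(min(
                prev[j] + 1,             # delete h from the hypothesis
                curr[j - 1] + 1,         # insert r into the hypothesis
                prev[j - 1] + (h != r),  # match (0) or substitute (1)
            ))
        prev = curr
    return prev[-1]


def best_shift(hyp: List[str], ref: Sequence[str]) -> Tuple[int, List[str]]:
    """Try every block move; return the lowest distance and its hypothesis.

    Assumption: brute-force search, fine for short sentences only.
    """
    best_dist, best_hyp = edit_distance(hyp, ref), hyp
    n = len(hyp)
    for start in range(n):
        for end in range(start + 1, n + 1):
            block, rest = hyp[start:end], hyp[:start] + hyp[end:]
            for pos in range(len(rest) + 1):
                if pos == start:
                    continue  # re-inserting here recreates the original order
                cand = rest[:pos] + block + rest[pos:]
                dist = edit_distance(cand, ref)
                if dist < best_dist:
                    best_dist, best_hyp = dist, cand
    return best_dist, best_hyp


def count_edits(hyp: Sequence[str], ref: Sequence[str]) -> int:
    """Greedy shifts plus remaining insertions/deletions/substitutions."""
    current = list(hyp)
    dist = edit_distance(current, ref)
    shifts = 0
    while True:
        new_dist, new_hyp = best_shift(current, ref)
        if new_dist >= dist:  # no beneficial shift remains
            break
        current, dist, shifts = new_hyp, new_dist, shifts + 1
    return dist + shifts  # every shift also costs one edit


def ter(hyp: str, refs: Sequence[str]) -> float:
    """Minimum edits over all references / average reference length."""
    hyp_words = hyp.split()  # assumption: whitespace tokenization
    ref_words = [r.split() for r in refs]
    avg_len = sum(len(r) for r in ref_words) / len(ref_words)
    return min(count_edits(hyp_words, r) for r in ref_words) / avg_len


if __name__ == "__main__":
    # "a" is shifted to the front: one shift instead of an insert plus a delete.
    print(ter("b c a", ["a b c"]))
```

In the usage example, the plain edit distance between "b c a" and "a b c" is 2 (insert "a" at the front, delete it at the end), while a single shift of "a" makes the two strings identical, so the sketch counts 1 edit over 3 reference words.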


Tags: machine translation evaluation
