Detecting Machine Translation with BLEU (the Python NLTK BLEU Scoring Method)
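Bilingual Evaluation Understudy (BLEU) is a metric for evaluating generated sentences. A perfect match scores 1.0 and a complete mismatch scores 0.0. The metric was developed to evaluate the output of automatic machine translation systems, and it has the following advantages:
1. It is fast and cheap to compute.
2. It is easy to understand.
3. It is language independent.
4. It is widely adopted.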
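BLEU was proposed by Kishore Papineni et al. in their 2002 paper "BLEU: a Method for Automatic Evaluation of Machine Translation". It measures how close a candidate translation is to one or more reference translations. Closeness is an average (a geometric mean) of the n-gram precisions between the texts, typically for n = 1 to 4, which is also the default in NLTK's sentence_bleu; higher orders are rarely used. A brevity penalty lowers the score of candidates that are shorter than the reference. In other words, if the candidate shares many 2-grams (consecutive word pairs), 3-grams and so on with the reference translations, it scores higher.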
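To make the n-gram matching concrete, here is a minimal pure-Python 3 sketch of an unsmoothed sentence-level BLEU, assuming uniform weights over 1-grams to 4-grams. The helper names (`ngrams`, `modified_precision`, `brevity_penalty`, `simple_bleu`) are illustrative and are not part of NLTK; NLTK's own implementation handles smoothing and edge cases differently, so its scores may not match this sketch exactly.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(references, candidate, n):
    """Clipped n-gram precision of the candidate against all references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, remember the highest count seen in any single reference.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    # A candidate n-gram is credited at most as often as it occurs in a reference.
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def brevity_penalty(references, candidate):
    """Penalise candidates shorter than the closest reference length."""
    c = len(candidate)
    if c == 0:
        return 0.0
    # Closest reference length; ties go to the shorter reference.
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    return 1.0 if c > r else math.exp(1 - r / c)


def simple_bleu(references, candidate, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform weights over 1..max_n."""
    precisions = [modified_precision(references, candidate, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # any zero precision makes the geometric mean zero
    log_mean = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty(references, candidate) * math.exp(log_mean)


# Toy data (made up for illustration): repeating "the" is not rewarded,
# because only two occurrences survive clipping, giving 2/7.
refs = [
    ["the", "cat", "is", "on", "the", "mat"],
    ["there", "is", "a", "cat", "on", "the", "mat"],
]
print(modified_precision(refs, ["the"] * 7, 1))
print(simple_bleu(refs, refs[0]))
```

Clipping is what keeps a candidate from gaining credit by repeating a common word, and the brevity penalty keeps very short candidates from scoring well on precision alone.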
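We run a crowdsourced translation business, and in our setting it matters whether a translator has relied on a machine translation engine. The basic approach I proposed is:
1. Translate the source text on several translation websites to obtain a machine translation evaluation set. In the example below, one source passage was translated with Baidu and Youdao, and the two outputs form the evaluation set.
2. Treat the translator's translation as the candidate and compute its BLEU score against the machine translation evaluation set (using the BLEU scoring method provided by NLTK).
3. The higher the score, the closer the match, and the more likely it is that the translator consulted or directly copied the machine translation. At that point a project manager needs to step in.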
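The three steps above can be sketched as a small helper. This is only an illustration under stated assumptions: `needs_review`, `unigram_precision`, the toy sentences, and the 0.8 threshold are all hypothetical. The stand-in scorer is plain unigram precision, not BLEU, and is used only so the snippet runs without NLTK; in practice you would pass NLTK's `sentence_bleu`, which takes `(references, hypothesis)` in the same order.

```python
def unigram_precision(references, candidate):
    """Stand-in scorer (not BLEU): share of candidate tokens found in any reference."""
    if not candidate:
        return 0.0
    vocabulary = {token for ref in references for token in ref}
    return sum(token in vocabulary for token in candidate) / len(candidate)


def needs_review(mt_outputs, translator_tokens, score_fn, threshold):
    """Score a submission against the MT outputs and compare with a threshold.

    mt_outputs: dict mapping engine name -> token list (step 1).
    translator_tokens: token list of the translator's submission (step 2).
    score_fn: callable (references, candidate) -> float in [0, 1].
    threshold: hypothetical cut-off that must be tuned on real projects (step 3).
    """
    references = list(mt_outputs.values())
    score = score_fn(references, translator_tokens)
    return score, score >= threshold


# Toy data, made up for illustration only.
mt_outputs = {
    "baidu": "we will introduce the business".split(),
    "youdao": "we will present the services".split(),
}
score, flagged = needs_review(
    mt_outputs,
    "we will introduce the services".split(),
    score_fn=unigram_precision,
    threshold=0.8,
)
print(score, flagged)
```

Passing the scorer in as a parameter keeps the review logic independent of how the score is computed, so the threshold can be tuned separately.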
An example follows:
1. Source text
新译星将代表四达时代集团在展览会上闪亮登场,届时我们将从新译星所开展的业务、具备的优势、成功案例等多个维度进行介绍,让您更加全面的了解新译星。我们拥有稳定的全职国际化团队,能够确保守时、高效的完成翻译和配音,并通过至臻
2. Human translation
New Transtar will present itself at the Exhibition on behalf of StarTimes, and we will give a comprehensive introduction of ourselves, including the current services we offer, the advantages we hold, and the projects we have completed, to help yo
3. Baidu Translate
The new translator will stand on the exhibition on behalf of the four times group at the exhibition. We will introduce the new star's business, the advantages and the successful cases, so that you can understand the new translator more comprehe
4. Youdao Translate
The new translator star will represent sida times group in the exhibition, when we will introduce the new translator star's business, advantages, successful cases and other dimensions, so that you can have a more comprehensive understanding o
5. Build the machine translation evaluation set from the Baidu and Youdao translations
[['The', 'new', 'translator', 'will', 'stand', 'on', 'the', 'exhibition', 'on', 'behalf', 'of', 'the', 'four', 'times', 'group', 'at', 'the', 'exhibition', 'We', 'will', 'introduce', 'the', 'new', 'star`s', 'business', 'the', 'advantages', 'and', 'the', 'successful', 'cases', 'so', 'that
6. Build the candidate data from the human translation
['New', 'Transtar', 'will', 'present', 'itself', 'at', 'the', 'Exhibition', 'on', 'behalf', 'of', 'StarTimes', 'and', 'we',
'will', 'give', 'a', 'comprehensive', 'introduction', 'of', 'ourselves', 'including', 'the', 'current', 'services', 'we', 'offer', 'the', 'advantages', 'we',
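The token lists in steps 5 and 6 can be produced from the raw sentences rather than typed by hand. Below is a minimal Python 3 sketch, assuming simple regex tokenization that keeps words and drops punctuation; the `tokenize` helper is hypothetical, and the sentences are short excerpts of the translations shown above. Unlike the lists above, which write the possessive as star`s, this sketch keeps the apostrophe as it appears in the text.

```python
import re


def tokenize(sentence):
    """Split a sentence into word tokens, dropping punctuation.

    Apostrophes or backticks inside a word (e.g. "star's") are kept.
    """
    return re.findall(r"[A-Za-z0-9]+(?:['`][A-Za-z0-9]+)*", sentence)


# Short excerpts of the machine translations shown above.
baidu = ("The new translator will stand on the exhibition on behalf of "
         "the four times group at the exhibition.")
youdao = "The new translator star will represent sida times group in the exhibition"

# Short excerpt of the human translation shown above.
human = "New Transtar will present itself at the Exhibition on behalf of StarTimes"

# The evaluation set is a list of token lists, one per MT engine.
reference = [tokenize(baidu), tokenize(youdao)]
# The candidate is a single token list.
candidate = tokenize(human)

print(reference)
print(candidate)
```

Whatever tokenization is chosen, the references and the candidate should go through the same function, since BLEU compares tokens literally, including case.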
7. First, compute the BLEU score between the human translation and the machine translation evaluation set. The result is 0.119115465241, as shown below:
[root@host-10-0-251-156 ~]# python
Python 2.7.5 (default, Apr 11 2018, 07:36:10)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-28)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from nltk.translate.bleu_score import sentence_bleu
>>>
>>> reference=[['The', 'new', 'translator', 'will', 'stand', 'on', 'the', 'exhibition', 'on', 'behalf', 'of', 'the', 'four', 'times', 'group', 'at', 'the', 'exhibition', 'We', 'will', 'introduce', 'the', 'new', 'star`s', 'business', 'the', 'advantages', 'and', 'the', 'successful'
>>>
>>> candidate=['New', 'Transtar', 'will', 'present', 'itself', 'at', 'the', 'Exhibition', 'on', 'behalf', 'of', 'StarTimes', 'and', 'we', 'will', 'give', 'a', 'comprehensive', 'introduction', 'of', 'ourselves', 'including', 'the', 'current', 'services', 'we', 'offer', 'the', 'advantages
>>>
>>> score = sentence_bleu(reference, candidate)
>>> print score
0.119115465241
>>>
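8. Next, we slightly modify the Baidu translation and compute its BLEU score against the machine translation evaluation set. The result is 0.875629670466, as shown below:
8.1 The slightly modified Baidu translation
New Transtar will stand on the exhibition on behalf of the four times group at the exhibition. We will introduce the new star's business, the advantages and the successful cases, so that you can understand the new translator more comprehensive
8.2 Use the modified Baidu translation as the candidate data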
['New', 'Transtar', 'will', 'stand', 'on', 'the', 'exhibition', 'on', 'behalf', 'of', 'the', 'four', 'times', 'group', 'at', 'the', 'exhibition', 'We', 'will', 'introduce', 'the', 'new', 'star`s', 'business', 'the', 'advantages', 'and', 'the', 'successful', 'cases', 'so', 'that', 'you
8.3 BLEU computation
>>> candidate_baidu=['New', 'Transtar', 'will', 'stand', 'on', 'the', 'exhibition', 'on', 'behalf', 'of', 'the', 'four', 'times', 'group', 'at', 'the', 'exhibition', 'We', 'will', 'introduce', 'the', 'new', 'star`s', 'business', 'the', 'advantages', 'and', 'the', 'successful',
>>> score_baidu = sentence_bleu(reference, candidate_baidu)
>>> print score_baidu
0.875629670466
>>>
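9. As the examples above show, the BLEU score rises when the candidate translation is very close to the data in the machine translation evaluation set, that is, when the translator consulted or directly copied the machine translation. How high the score must be before a project manager needs to step in has to be worked out through experience on real projects.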