inc-eng

opus-2020-06-28.zip

  • dataset: opus
  • model: transformer
  • source language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom sin snd_Arab urd
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k); see the sketch after this list
  • download: opus-2020-06-28.zip
  • test set translations: opus-2020-06-28.test.txt
  • test set scores: opus-2020-06-28.eval.txt
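
The pre-processing listed above (normalization followed by SentencePiece segmentation with shared 32k models) can be reproduced roughly as follows. This is a minimal sketch, assuming the zip ships the source-side SentencePiece model as source.spm; that file name is a guess about the package layout, not something documented here.

```python
import sentencepiece as spm

# Load the 32k source-side SentencePiece model shipped in the package.
# NOTE: "source.spm" is an assumed file name inside the unpacked zip.
sp = spm.SentencePieceProcessor(model_file="source.spm")

sentence = "यह एक परीक्षण वाक्य है।"  # an already-normalized Hindi source sentence
pieces = sp.encode(sentence, out_type=str)

# Space-joined subword pieces are what the transformer actually consumes.
print(" ".join(pieces))
```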

Benchmarks

testset                          BLEU   chr-F
Tatoeba-test.asm-eng.asm.eng     17.3   0.357
Tatoeba-test.awa-eng.awa.eng      6.9   0.224
Tatoeba-test.ben-eng.ben.eng     46.3   0.606
Tatoeba-test.bho-eng.bho.eng     30.6   0.456
Tatoeba-test.guj-eng.guj.eng     19.0   0.367
Tatoeba-test.hif-eng.hif.eng      4.2   0.240
Tatoeba-test.hin-eng.hin.eng     38.9   0.568
Tatoeba-test.kok-eng.kok.eng      4.8   0.238
Tatoeba-test.lah-eng.lah.eng     17.6   0.284
Tatoeba-test.mai-eng.mai.eng     47.6   0.699
Tatoeba-test.mar-eng.mar.eng     23.0   0.475
Tatoeba-test.multi.eng           27.6   0.490
Tatoeba-test.nep-eng.nep.eng      1.4   0.189
Tatoeba-test.ori-eng.ori.eng      2.0   0.207
Tatoeba-test.pan-eng.pan.eng     15.5   0.349
Tatoeba-test.rom-eng.rom.eng      3.2   0.174
Tatoeba-test.sin-eng.sin.eng     30.5   0.526
Tatoeba-test.snd-eng.snd.eng     10.0   0.330
Tatoeba-test.urd-eng.urd.eng     28.0   0.476

opus-2020-07-26.zip

  • dataset: opus
  • model: transformer
  • source language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom san_Deva sin snd_Arab urd
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • download: opus-2020-07-26.zip
  • test set translations: opus-2020-07-26.test.txt
  • test set scores: opus-2020-07-26.eval.txt

Benchmarks

testset                            BLEU   chr-F
newsdev2014-hineng.hin.eng          8.7   0.335
newsdev2019-engu-gujeng.guj.eng     8.3   0.308
newstest2014-hien-hineng.hin.eng   12.7   0.389
newstest2019-guen-gujeng.guj.eng    5.9   0.280
Tatoeba-test.asm-eng.asm.eng       18.0   0.360
Tatoeba-test.awa-eng.awa.eng        6.8   0.217
Tatoeba-test.ben-eng.ben.eng       44.6   0.594
Tatoeba-test.bho-eng.bho.eng       28.1   0.462
Tatoeba-test.guj-eng.guj.eng       16.6   0.362
Tatoeba-test.hif-eng.hif.eng        4.4   0.235
Tatoeba-test.hin-eng.hin.eng       38.0   0.556
Tatoeba-test.kok-eng.kok.eng        1.4   0.153
Tatoeba-test.lah-eng.lah.eng       15.3   0.266
Tatoeba-test.mai-eng.mai.eng       51.8   0.661
Tatoeba-test.mar-eng.mar.eng       22.6   0.470
Tatoeba-test.multi.eng             26.8   0.484
Tatoeba-test.nep-eng.nep.eng        2.8   0.180
Tatoeba-test.ori-eng.ori.eng        3.4   0.219
Tatoeba-test.pan-eng.pan.eng       15.2   0.373
Tatoeba-test.rom-eng.rom.eng        1.3   0.166
Tatoeba-test.san-eng.san.eng        3.1   0.167
Tatoeba-test.sin-eng.sin.eng       28.2   0.507
Tatoeba-test.snd-eng.snd.eng       38.5   0.500
Tatoeba-test.urd-eng.urd.eng       25.2   0.451

opus2m-2020-08-01.zip

  • dataset: opus2m
  • model: transformer
  • source language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom san_Deva sin snd_Arab urd
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • download: opus2m-2020-08-01.zip
  • test set translations: opus2m-2020-08-01.test.txt
  • test set scores: opus2m-2020-08-01.eval.txt

Benchmarks

testset                            BLEU   chr-F
newsdev2014-hineng.hin.eng          8.9   0.341
newsdev2019-engu-gujeng.guj.eng     8.7   0.321
newstest2014-hien-hineng.hin.eng   13.1   0.396
newstest2019-guen-gujeng.guj.eng    6.5   0.290
Tatoeba-test.asm-eng.asm.eng       18.1   0.363
Tatoeba-test.awa-eng.awa.eng        6.2   0.222
Tatoeba-test.ben-eng.ben.eng       44.7   0.595
Tatoeba-test.bho-eng.bho.eng       29.4   0.458
Tatoeba-test.guj-eng.guj.eng       19.3   0.383
Tatoeba-test.hif-eng.hif.eng        3.7   0.220
Tatoeba-test.hin-eng.hin.eng       38.6   0.564
Tatoeba-test.kok-eng.kok.eng        6.6   0.287
Tatoeba-test.lah-eng.lah.eng       16.0   0.272
Tatoeba-test.mai-eng.mai.eng       75.6   0.796
Tatoeba-test.mar-eng.mar.eng       25.9   0.497
Tatoeba-test.multi.eng             29.0   0.502
Tatoeba-test.nep-eng.nep.eng        4.5   0.198
Tatoeba-test.ori-eng.ori.eng        5.0   0.226
Tatoeba-test.pan-eng.pan.eng       17.4   0.375
Tatoeba-test.rom-eng.rom.eng        1.7   0.174
Tatoeba-test.san-eng.san.eng        5.0   0.173
Tatoeba-test.sin-eng.sin.eng       31.2   0.511
Tatoeba-test.snd-eng.snd.eng       45.7   0.670
Tatoeba-test.urd-eng.urd.eng       25.6   0.456

opus4m-2020-08-12.zip

  • dataset: opus4m
  • model: transformer
  • source language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom san_Deva sin snd_Arab urd
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • download: opus4m-2020-08-12.zip
  • test set translations: opus4m-2020-08-12.test.txt
  • test set scores: opus4m-2020-08-12.eval.txt
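
The downloads above are native Marian-NMT checkpoints, normally run with marian-decoder after the SentencePiece step. For a quick test from Python, a converted Hugging Face checkpoint can be used instead; a minimal sketch, where the hub id Helsinki-NLP/opus-mt-inc-en is an assumption about where the converted model is published:

```python
from transformers import MarianMTModel, MarianTokenizer

# NOTE: the hub id below is an assumption; point it at whichever converted
# checkpoint corresponds to the zip you downloaded.
model_name = "Helsinki-NLP/opus-mt-inc-en"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

batch = tokenizer(["यह एक परीक्षण है।"], return_tensors="pt", padding=True)
outputs = model.generate(**batch)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```

On this path the tokenizer applies the bundled SentencePiece models itself, so no manual pre-processing step should be needed.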

Benchmarks

testset                            BLEU   chr-F
newsdev2014-hineng.hin.eng          9.2   0.350
newsdev2019-engu-gujeng.guj.eng    10.1   0.339
newstest2014-hien-hineng.hin.eng   13.8   0.410
newstest2019-guen-gujeng.guj.eng    6.9   0.297
Tatoeba-test.asm-eng.asm.eng       19.8   0.382
Tatoeba-test.awa-eng.awa.eng        8.8   0.234
Tatoeba-test.ben-eng.ben.eng       45.1   0.601
Tatoeba-test.bho-eng.bho.eng       25.7   0.411
Tatoeba-test.guj-eng.guj.eng       21.8   0.386
Tatoeba-test.hif-eng.hif.eng        9.0   0.288
Tatoeba-test.hin-eng.hin.eng       39.2   0.570
Tatoeba-test.kok-eng.kok.eng        1.8   0.147
Tatoeba-test.lah-eng.lah.eng       17.5   0.315
Tatoeba-test.mai-eng.mai.eng       53.2   0.713
Tatoeba-test.mar-eng.mar.eng       26.6   0.504
Tatoeba-test.multi.eng             30.0   0.510
Tatoeba-test.nep-eng.nep.eng        3.8   0.206
Tatoeba-test.ori-eng.ori.eng        5.8   0.229
Tatoeba-test.pan-eng.pan.eng       17.3   0.370
Tatoeba-test.rom-eng.rom.eng        1.8   0.172
Tatoeba-test.san-eng.san.eng        4.8   0.173
Tatoeba-test.sin-eng.sin.eng       32.0   0.525
Tatoeba-test.snd-eng.snd.eng       38.5   0.500
Tatoeba-test.urd-eng.urd.eng       26.6   0.468
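
The BLEU and chr-F columns in the tables can be recomputed from the released test set translations with sacrebleu. A toy sketch of the scoring calls; the exact layout of the *.test.txt files is not documented here, so reading the real hypothesis/reference pairs out of them is left to the reader:

```python
import sacrebleu

# Toy data: in practice, read the hypothesis/reference pairs from the
# opus*.test.txt file shipped with each package.
hypotheses = ["This is a small test sentence."]
references = [["This is a small test sentence."]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)

# Recent sacrebleu versions report chrF on a 0-100 scale; the tables
# above use 0-1, so divide by 100 before comparing.
print(f"BLEU = {bleu.score:.1f}  chrF = {chrf.score:.3f}")
```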