Machine Translation on WMT2014 English-German
Evaluation Metrics
BLEU score (a minimal computation sketch follows below)
Hardware Burden
Operations per network pass
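BLEU is the headline metric on this leaderboard. As a minimal sketch of how a corpus-level BLEU score is typically computed, the snippet below uses the sacrebleu package, which is commonly used for WMT-style evaluations; the hypothesis and reference sentences are made-up placeholders, not WMT14 data.

```python
# Minimal sketch: corpus-level BLEU with sacrebleu.
# The sentences below are illustrative placeholders, not WMT14 En-De data.
import sacrebleu

# Detokenized system outputs, one string per test sentence.
hypotheses = [
    "The cat sat on the mat.",
    "There is a dog in the garden.",
]

# One reference stream; sacrebleu accepts several streams for multi-reference BLEU.
references = [
    "The cat is sitting on the mat.",
    "A dog is in the garden.",
]

# corpus_bleu takes the list of hypotheses and a list of reference streams.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.2f}")
```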
Results
Performance of each model on this benchmark (the rows below are the entries shown on this page; the full leaderboard contains 91 entries).
| Model | BLEU score | Hardware Burden | Operations per network pass | Paper Title | Repository |
|---|---|---|---|---|---|
| Transformer Big + adversarial MLE | 29.52 | - | - | Improving Neural Language Modeling via Adversarial Training | - |
| MAT | - | - | - | Multi-branch Attentive Transformer | - |
| AdvAug (aut+adv) | 29.57 | - | - | AdvAug: Robust Adversarial Augmentation for Neural Machine Translation | - |
| CMLM+LAT+4 iterations | 27.35 | - | - | Incorporating a Local Translation Mechanism into Non-autoregressive Translation | - |
| FlowSeq-large (IWD n = 15) | 22.94 | - | - | FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow | - |
| Transformer (ADMIN init) | 30.1 | - | - | Very Deep Transformers for Neural Machine Translation | - |
| MUSE (Parallel Multi-scale Attention) | 29.9 | - | - | MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning | - |
| Transformer-DRILL Base | 28.1 | - | - | Deep Residual Output Layers for Neural Language Generation | - |
| Transformer Big with FRAGE | 29.11 | - | - | FRAGE: Frequency-Agnostic Word Representation | - |
| GLAT | 25.21 | - | - | Glancing Transformer for Non-Autoregressive Neural Machine Translation | - |
| PartialFormer | 29.56 | - | - | PartialFormer: Modeling Part Instead of Whole for Machine Translation | - |
| Bi-SimCut | 30.78 | - | - | Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation | - |
| Transformer + SRU | 28.4 | 34G | - | Simple Recurrent Units for Highly Parallelizable Recurrence | - |
| PBMT | 20.7 | - | - | - | - |
| Local Joint Self-attention | 29.7 | - | - | Joint Source-Target Self Attention with Locality Constraints | - |
| Lite Transformer | 26.5 | - | - | Lite Transformer with Long-Short Range Attention | - |
| Average Attention Network (w/o FFN) | 26.05 | - | - | Accelerating Neural Transformer via an Average Attention Network | - |
| Unsupervised NMT + Transformer | 17.16 | - | - | Phrase-Based & Neural Unsupervised Machine Translation | - |
| KERMIT | 28.7 | - | - | KERMIT: Generative Insertion-Based Modeling for Sequences | - |
| T2R + Pretrain | 28.7 | - | - | Finetuning Pretrained Transformers into RNNs | - |