Machine Translation on WMT2014 English-German
Evaluation Metrics
BLEU score (a minimal computation sketch follows below)
Hardware Burden
Operations per network pass
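BLEU is the headline metric on this leaderboard. As a minimal sketch of how a corpus-level BLEU score is typically computed, the snippet below uses the sacrebleu package, which is commonly used for WMT-style evaluations; the hypothesis and reference sentences are made-up placeholders, not WMT14 data.

```python
# Minimal sketch: corpus-level BLEU with sacrebleu.
# The sentences below are illustrative placeholders, not WMT14 En-De data.
import sacrebleu

# Detokenized system outputs, one string per test sentence.
hypotheses = [
    "The cat sat on the mat.",
    "There is a dog in the garden.",
]

# One reference stream; sacrebleu accepts several streams for multi-reference BLEU.
references = [
    "The cat is sitting on the mat.",
    "A dog is in the garden.",
]

# corpus_bleu takes the list of hypotheses and a list of reference streams.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.2f}")
```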
Results
Performance of each model on this benchmark (the rows below are the entries shown on this page; the full leaderboard contains 91 entries).
| Model | BLEU score | Hardware Burden | Operations per network pass | Paper Title | Repository |
|---|---|---|---|---|---|
| Transformer Big + adversarial MLE | 29.52 | - | - | Improving Neural Language Modeling via Adversarial Training | - |
| MAT | - | - | - | Multi-branch Attentive Transformer | - |
| AdvAug (aut+adv) | 29.57 | - | - | AdvAug: Robust Adversarial Augmentation for Neural Machine Translation | - |
| CMLM+LAT+4 iterations | 27.35 | - | - | Incorporating a Local Translation Mechanism into Non-autoregressive Translation | - |
| FlowSeq-large (IWD n = 15) | 22.94 | - | - | FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow | - |
| Transformer (ADMIN init) | 30.1 | - | - | Very Deep Transformers for Neural Machine Translation | - |
| MUSE (Parallel Multi-scale Attention) | 29.9 | - | - | MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning | - |
| Transformer-DRILL Base | 28.1 | - | - | Deep Residual Output Layers for Neural Language Generation | - |
| Transformer Big with FRAGE | 29.11 | - | - | FRAGE: Frequency-Agnostic Word Representation | - |
| GLAT | 25.21 | - | - | Glancing Transformer for Non-Autoregressive Neural Machine Translation | - |
| PartialFormer | 29.56 | - | - | PartialFormer: Modeling Part Instead of Whole for Machine Translation | - |
| Bi-SimCut | 30.78 | - | - | Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation | - |
| Transformer + SRU | 28.4 | 34G | - | Simple Recurrent Units for Highly Parallelizable Recurrence | - |
| PBMT | 20.7 | - | - | - | - |
| Local Joint Self-attention | 29.7 | - | - | Joint Source-Target Self Attention with Locality Constraints | - |
| Lite Transformer | 26.5 | - | - | Lite Transformer with Long-Short Range Attention | - |
| Average Attention Network (w/o FFN) | 26.05 | - | - | Accelerating Neural Transformer via an Average Attention Network | - |
| Unsupervised NMT + Transformer | 17.16 | - | - | Phrase-Based & Neural Unsupervised Machine Translation | - |
| KERMIT | 28.7 | - | - | KERMIT: Generative Insertion-Based Modeling for Sequences | - |
| T2R + Pretrain | 28.7 | - | - | Finetuning Pretrained Transformers into RNNs | - |