Benchmarking Diffusion Models for Machine Translation

Yunus Demirag, Danni Liu, Jan Niehues


Abstract
Diffusion models have recently shown great potential in many generative tasks. In this work, we explore diffusion models for machine translation (MT). We adapt two prominent diffusion-based text generation models, Diffusion-LM and DiffuSeq, to perform machine translation. As the diffusion models generate non-autoregressively (NAR), we draw parallels to NAR machine translation models. Through a comparison to conventional Transformer-based translation models, as well as to the Levenshtein Transformer, an established NAR MT model, we show that the multimodality problem that limits NAR machine translation performance is also a challenge for diffusion models. We demonstrate that knowledge distillation from an autoregressive model improves the performance of diffusion-based MT. A thorough analysis of translation quality on inputs of different lengths shows that the diffusion models struggle more with long-range dependencies than the other models.
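
The distillation result mentioned in the abstract follows the standard sequence-level knowledge distillation recipe: translate the training sources with a trained autoregressive teacher, then train the NAR (here, diffusion) student on the teacher's outputs instead of the original references, so each source has a single consistent target. Below is a minimal sketch of that data-generation step; the Hugging Face transformers toolkit, the teacher checkpoint Helsinki-NLP/opus-mt-de-en, and the batch size are illustrative assumptions, not the paper's actual setup.

    # Hedged sketch of sequence-level knowledge distillation for NAR/diffusion MT.
    # Toolkit, checkpoint, and hyperparameters are assumptions for illustration only.
    from transformers import MarianMTModel, MarianTokenizer

    def distill_targets(src_sentences, teacher_name="Helsinki-NLP/opus-mt-de-en"):
        """Replace reference targets with beam-search outputs of an AR teacher.

        Training the student on (source, teacher translation) pairs gives it one
        consistent target per source, reducing the multimodality that hurts
        non-autoregressive (and diffusion-based) students.
        """
        tokenizer = MarianTokenizer.from_pretrained(teacher_name)
        teacher = MarianMTModel.from_pretrained(teacher_name)
        distilled = []
        for start in range(0, len(src_sentences), 32):  # assumed batch size of 32
            batch = src_sentences[start:start + 32]
            inputs = tokenizer(batch, return_tensors="pt",
                               padding=True, truncation=True)
            # Beam search gives the teacher's single most likely translation.
            outputs = teacher.generate(**inputs, num_beams=5, max_new_tokens=128)
            distilled.extend(tokenizer.batch_decode(outputs,
                                                    skip_special_tokens=True))
        return distilled

The diffusion student would then be trained on these distilled pairs exactly as on the original parallel data.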
Anthology ID:
2024.eacl-srw.25
Volume:
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Neele Falk, Sara Papi, Mike Zhang
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
313–324
URL:
https://aclanthology.org/2024.eacl-srw.25
Cite (ACL):
Yunus Demirag, Danni Liu, and Jan Niehues. 2024. Benchmarking Diffusion Models for Machine Translation. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 313–324, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
Benchmarking Diffusion Models for Machine Translation (Demirag et al., EACL 2024)
PDF:
https://aclanthology.org/2024.eacl-srw.25.pdf
Video:
https://aclanthology.org/2024.eacl-srw.25.mp4