SEMANTIC ERRORS IN MACHINE TRANSLATION SYSTEMS AND THEIR CAUSES
Keywords:
machine translation; semantic errors; adequacy; hallucination; word sense disambiguation; omissions; additions; domain shift; evaluation metrics; post-editing
Abstract
Semantic errors in machine translation (MT) occur when the output is fluent but fails to preserve the meaning of the source: a word sense is mistranslated, content is omitted or added, or the system generates “hallucinated” information that the source does not support. Such errors are especially problematic because the output can be grammatically flawless while being semantically wrong, which makes them hard to detect during post-editing. Research on neural machine translation (NMT) shows that adequacy problems such as omissions and additions can appear in otherwise fluent output, so that fluency masks the loss of meaning. Studies of hallucinations show that NMT can produce translations “untethered” from the input, sometimes triggered by rare tokens or distribution shifts. This article describes the main types of semantic MT errors, links them to their underlying causes (lexical ambiguity, domain shift, data noise, decoding behavior, and model uncertainty), and illustrates them with short examples followed by clarifying commentary. It also summarizes evaluation practices (MQM-style error categories and adequacy-focused learned metrics) and argues for targeted quality checks that go beyond surface fluency.
References
Agarwal, A., et al. (n.d.). Hallucinations in neural machine translation. Google Research.
Gupta, P., et al. (2021). Detecting over/under-translation errors for determining translation adequacy. arXiv.
Guerreiro, N. M., et al. (2024). xCOMET: Transparent machine translation evaluation. TACL/ACL Anthology.
Lee, K., et al. (2018). Hallucinations in neural machine translation. OpenReview (PDF).
Popović, M. (2021). On nature and causes of observed MT errors. ACL Anthology (MT Summit).
ScienceDirect Topics. (n.d.). Machine translation—overview (lexical ambiguity and WSD challenge).
Ustaszewski, M. (2019). Exploring adequacy errors in neural machine translation. ACL Anthology (PDF).
Vardaro, J., et al. (2019). Translation quality and error recognition in professional contexts (MQM categories). MDPI.
Wan, Y., et al. (2022). Challenges of neural machine translation for short texts. Computational Linguistics (MIT Press).
Xu, W., et al. (2023). Understanding and detecting hallucinations in neural machine translation. TACL (MIT Press).
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.