Preventing translation quality deterioration caused by beam search decoding in neural machine translation using statistical machine translation


Satir E., BULUT H.

INFORMATION SCIENCES, vol.581, pp.791-807, 2021 (Journal Indexed in SCI)

  • Publication Type: Article
  • Volume: 581
  • Publication Date: 2021
  • Doi Number: 10.1016/j.ins.2021.10.006
  • Title of Journal: INFORMATION SCIENCES
  • Page Numbers: pp.791-807
  • Keywords: Neural machine translation, Statistical machine translation, Decoding, Beam search

Abstract

Decoding is an important part of machine translation systems, and the most popular inference algorithm used for it is beam search. The beam search algorithm improves translation by allowing a larger search space to be traversed than greedy search. However, in neural machine translation (NMT), translation performance declines once the beam width increases beyond a certain point. This problem is usually not observed in statistical machine translation (SMT) because of its decoding method. This paper proposes a hybrid-system-based method that uses SMT predictions to prevent quality deterioration in the beam search algorithm used in NMT decoding. Our approach is based on reranking the NMT n-best list according to the translation produced by the SMT system. We propose two different algorithms for reranking NMT n-best lists. The first algorithm uses the length information of the SMT outputs. The second uses a word-based similarity approach with the Jaccard Index, Dice's Coefficient, and the Overlap Coefficient. Experiments on three different language pairs show that the proposed method prevents the decrease in translation quality and yields gains of 1.3 BLEU and 1.6 METEOR for different beam sizes, and of 1.8 BLEU and 2.1 METEOR in average scores, compared to the baseline results. © 2021 Published by Elsevier Inc.
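The two reranking ideas named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify tokenization, how the scores are combined with NMT model scores, or how ties are broken. The sketch assumes whitespace tokenization, lowercased word sets, and ties resolved in favor of the original NMT order. All function names (`jaccard`, `dice`, `overlap`, `rerank_by_similarity`, `rerank_by_length`) are hypothetical.

```python
from typing import Callable, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard Index: |A ∩ B| / |A ∪ B| (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a: Set[str], b: Set[str]) -> float:
    """Dice's Coefficient: 2|A ∩ B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def overlap(a: Set[str], b: Set[str]) -> float:
    """Overlap Coefficient: |A ∩ B| / min(|A|, |B|)."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


def rerank_by_similarity(
    nbest: List[str],
    smt_output: str,
    sim: Callable[[Set[str], Set[str]], float] = jaccard,
) -> List[str]:
    """Order NMT hypotheses by word-set similarity to the SMT translation.

    Assumption: whitespace tokenization and lowercasing. sorted() is
    stable, so equally similar hypotheses keep their NMT order.
    """
    ref = set(smt_output.lower().split())
    return sorted(
        nbest,
        key=lambda hyp: sim(set(hyp.lower().split()), ref),
        reverse=True,
    )


def rerank_by_length(nbest: List[str], smt_output: str) -> List[str]:
    """Order NMT hypotheses by how close their length is to the SMT output.

    Assumption: length is measured in whitespace-separated tokens.
    """
    ref_len = len(smt_output.split())
    return sorted(nbest, key=lambda hyp: abs(len(hyp.split()) - ref_len))


if __name__ == "__main__":
    # Toy n-best list in NMT score order, plus a toy SMT translation.
    nbest = ["the cat", "the cat sat on the mat", "a cat sat on a mat today"]
    smt = "the cat sat on the mat"
    print(rerank_by_similarity(nbest, smt, dice))
    print(rerank_by_length(nbest, smt))
```

In this toy example the very short first hypothesis, the kind of output wide beams tend to favor, drops to the end of the list under the length, Jaccard, and Dice criteria. The Overlap Coefficient scores it as a perfect match because all of its words appear in the SMT output, which shows why the choice of measure matters.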