Using binary classification to evaluate the quality of machine translators

10.48129/kjs.splml.19547

Authors

  • Ran Li School of Computer and Information Technology, Xinyang Normal University, Xinyang, China
  • Yihao Yang School of Computer and Information Technology, Xinyang Normal University, Xinyang, China
  • Kelin Shen School of Foreign Languages, Xinyang Agriculture and Forestry University, Xinyang, China
  • Mohammad Hijji Industrial Innovation and Robotic Center (IIRC), University of Tabuk, Tabuk 47711, Saudi Arabia

DOI:

https://doi.org/10.48129/kjs.splml.19547

Abstract

Machine translators have become increasingly popular and now play an important role in cross-cultural communication. However, they often produce unnatural text, so their output needs to be evaluated to avoid the misuse of machine-translated texts. This paper uses binary classification to evaluate the quality of machine translators without reference translations. First, we construct a large-scale dataset of human-generated and machine-translated texts. Second, we use the dataset to train multiple binary classifiers, such as decision tree, random forest, extreme gradient boosting, support vector machine, and logistic regression classifiers. Finally, the trained classifiers are combined into an ensemble model by majority voting, and this ensemble model is used to evaluate the quality of machine-translated texts. Experimental results show that the proposed evaluation method better measures the quality of texts translated by several commercial machine translators.
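
The majority-voting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three threshold rules stand in for trained classifiers (decision tree, random forest, and so on), the two numeric features per text are hypothetical, and the "human-like rate" used as a quality score is an assumption about how the ensemble's decisions might be aggregated, since the abstract does not specify it.

```python
from typing import Callable, List, Sequence

# Label convention (an assumption for this sketch):
# 1 = judged human-generated, 0 = judged machine-translated.
Classifier = Callable[[Sequence[float]], int]


def majority_vote(classifiers: List[Classifier], features: Sequence[float]) -> int:
    """Return 1 if a strict majority of classifiers vote 1, otherwise 0."""
    votes = [clf(features) for clf in classifiers]
    return 1 if 2 * sum(votes) > len(votes) else 0


def human_like_rate(classifiers: List[Classifier],
                    texts: List[Sequence[float]]) -> float:
    """Fraction of texts the ensemble labels as human-generated.

    Hypothetical quality proxy: a translator whose output is more often
    mistaken for human text receives a higher score.
    """
    labels = [majority_vote(classifiers, features) for features in texts]
    return sum(labels) / len(labels)


# Stand-ins for trained binary classifiers. Each text is represented by two
# hypothetical features: (fluency, repetition).
CLASSIFIERS: List[Classifier] = [
    lambda f: 1 if f[0] > 0.5 else 0,         # votes on fluency alone
    lambda f: 1 if f[1] < 0.3 else 0,         # votes on repetition alone
    lambda f: 1 if f[0] - f[1] > 0.2 else 0,  # votes on the gap between them
]

# Hypothetical feature vectors for four machine-translated texts.
TRANSLATED = [(0.9, 0.1), (0.6, 0.5), (0.4, 0.1), (0.2, 0.8)]

if __name__ == "__main__":
    for features in TRANSLATED:
        print(features, majority_vote(CLASSIFIERS, features))
    print("human-like rate:", human_like_rate(CLASSIFIERS, TRANSLATED))
```

Using an odd number of voters avoids ties; with an even number, this sketch resolves a tie in favor of the machine-translated label.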

Published

22-06-2022

Section

Special Issue on Machine Learning (CS)