Using binary classification to evaluate the quality of machine translators
DOI: https://doi.org/10.48129/kjs.splml.19547
Abstract
Machine translators have become increasingly popular and now play an important role in cross-cultural communication. However, they often produce unnatural text, so a way to evaluate machine translators is needed to avoid the misuse of machine-translated texts. This paper presents the use of binary classification to evaluate the quality of machine translators without reference translations. First, we construct a large-scale dataset of human-generated and machine-translated texts. Second, the dataset is used to train multiple binary classifiers, e.g., decision tree, random forest, extreme gradient boosting, support vector machine, and logistic regression. Finally, the trained classifiers are combined into an ensemble model by majority voting, and this ensemble model is used to evaluate the quality of machine-translated texts. Experimental results show that the proposed evaluation method better measures the quality of texts translated by several commercial machine translators.
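The majority-voting step can be illustrated with a minimal Python sketch. It is not the paper's implementation: the three rule-based functions are hypothetical stand-ins for the trained classifiers, the label convention (1 = human-generated, 0 = machine-translated) is an assumption, and so is the choice to score a translator by the fraction of its outputs that the ensemble labels as human-generated.

```python
from collections import Counter
from typing import Callable, Sequence

# Assumed label convention: 1 = human-generated, 0 = machine-translated.
Classifier = Callable[[str], int]


def long_enough(text: str) -> int:
    """Hypothetical stand-in classifier: votes 1 if the text has at least 4 words."""
    return 1 if len(text.split()) >= 4 else 0


def no_repeated_words(text: str) -> int:
    """Hypothetical stand-in classifier: votes 1 if no word is immediately repeated."""
    words = text.lower().split()
    return 0 if any(a == b for a, b in zip(words, words[1:])) else 1


def ends_with_punctuation(text: str) -> int:
    """Hypothetical stand-in classifier: votes 1 if the text ends with . ! or ?"""
    return 1 if text.strip().endswith((".", "!", "?")) else 0


# In practice these would be trained models (decision tree, random forest, ...).
TOY_CLASSIFIERS = [long_enough, no_repeated_words, ends_with_punctuation]


def majority_vote(classifiers: Sequence[Classifier], text: str) -> int:
    """Return the label chosen by most classifiers; ties go to 0 (machine-translated)."""
    votes = Counter(clf(text) for clf in classifiers)
    return 1 if votes[1] > votes[0] else 0


def human_likeness(classifiers: Sequence[Classifier], texts: Sequence[str]) -> float:
    """Assumed quality score: fraction of texts the ensemble labels as human-generated."""
    if not texts:
        raise ValueError("texts must not be empty")
    return sum(majority_vote(classifiers, t) for t in texts) / len(texts)


if __name__ == "__main__":
    outputs = ["The cat sat on the mat.", "cat cat the the"]
    print(human_likeness(TOY_CLASSIFIERS, outputs))
```

Using an odd number of voters, as here, avoids ties in a two-class setting. With real models, the same voting logic applies to their predicted labels, and no reference translation is required at scoring time.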