Some Propositions to Improve the Prediction Capability of Word Confidence Estimation for Machine Translation
Abstract
Word Confidence Estimation (WCE) is the task of predicting which words in machine translation (MT) output are correct and which are incorrect. To address this problem, this paper proposes several ideas for building a binary estimator and then enhancing its prediction capability. We integrate a number of features of various types (system-based, lexical, syntactic, and semantic) into the conventional feature set to build our classifier. After experimenting with all features, we apply a “Feature Selection” strategy to retain the best-performing ones. Next, we propose a method that combines multiple “weak” classifiers into a strong “composite” classifier by taking advantage of their complementarity. Experimental results show that our propositions yield better performance in terms of F-score. Finally, we test whether WCE output can help improve a sentence-level confidence estimation system.
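To make the idea of a "composite" classifier concrete, the following is a minimal sketch only. It assumes a simple majority vote over three toy rule-based classifiers; the abstract does not specify the combination method, so the voting rule, the feature names (`lm_prob`, `in_dict`, `aligned`), the thresholds, and the labels `"G"` (good) and `"B"` (bad) are all illustrative assumptions rather than the paper's actual setup. The sketch also shows how the F-score for the "bad" label might be computed.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical labels: "G" = correct word, "B" = incorrect word.
Features = Dict[str, object]
Classifier = Callable[[Features], str]


def majority_vote(classifiers: Sequence[Classifier], feats: Features) -> str:
    """Label a word "B" if more than half of the weak classifiers say "B"."""
    votes = [clf(feats) for clf in classifiers]
    bad_votes = sum(1 for v in votes if v == "B")
    return "B" if bad_votes * 2 > len(votes) else "G"


def f_score(gold: Sequence[str], pred: Sequence[str], target: str = "B") -> float:
    """F1 score (harmonic mean of precision and recall) for one label."""
    tp = sum(1 for g, p in zip(gold, pred) if g == target and p == target)
    fp = sum(1 for g, p in zip(gold, pred) if g != target and p == target)
    fn = sum(1 for g, p in zip(gold, pred) if g == target and p != target)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Three toy "weak" classifiers, each looking at a single (assumed) feature.
def by_lm(feats: Features) -> str:
    return "B" if feats["lm_prob"] < 0.5 else "G"


def by_dict(feats: Features) -> str:
    return "G" if feats["in_dict"] else "B"


def by_alignment(feats: Features) -> str:
    return "G" if feats["aligned"] else "B"


weak_classifiers: List[Classifier] = [by_lm, by_dict, by_alignment]

# Toy data: one feature dict per MT output word, with made-up gold labels.
words: List[Features] = [
    {"lm_prob": 0.9, "in_dict": True, "aligned": True},
    {"lm_prob": 0.1, "in_dict": False, "aligned": True},
    {"lm_prob": 0.2, "in_dict": True, "aligned": False},
    {"lm_prob": 0.8, "in_dict": True, "aligned": False},
]
gold = ["G", "B", "B", "G"]

composite_pred = [majority_vote(weak_classifiers, w) for w in words]
single_pred = [by_dict(w) for w in words]

print("composite F(B):", f_score(gold, composite_pred))
print("by_dict   F(B):", f_score(gold, single_pred))
```

On this toy data each weak classifier makes an error on a different word, so the vote recovers labels that no single rule gets right on its own; this is the sense of "complementarity" the sketch is meant to illustrate, not a reproduction of the reported results.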