Long Hai Trieu, Thai Phuong Nguyen

Abstract

The sentence alignment approach proposed by Moore (2002) (M-Align) is an effective method that achieves relatively high performance by combining length-based and word-correspondence information. Nevertheless, despite its high precision, M-Align usually has low recall, especially when dealing with the sparse data problem. We propose an algorithm that not only exploits the advantages of M-Align but also overcomes the weakness of this baseline method by using a new feature in sentence alignment: word clustering. Experimental results illustrate the effectiveness of this proposal, showing high recall together with reasonable precision.
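
The abstract does not say how word clustering enters the alignment model. Purely as an illustration, one way a cluster feature could ease data sparsity is a back-off: when a source word has no entry with a given target word in the bilingual lexicon, borrow the scores of other source words from the same cluster. The function name `translation_score`, the `lexicon` and `src_cluster` dictionaries, and the `floor` value are all hypothetical and are not taken from the paper or from M-Align.

```python
from typing import Dict, Tuple


def translation_score(
    src: str,
    tgt: str,
    lexicon: Dict[Tuple[str, str], float],
    src_cluster: Dict[str, int],
    floor: float = 1e-6,
) -> float:
    """Score the word pair (src, tgt), backing off to cluster mates of src.

    Assumption: `lexicon` maps (source word, target word) to a score and
    `src_cluster` maps source words to cluster ids. Both are illustrative.
    """
    # Direct evidence: the pair was seen in the lexicon.
    if (src, tgt) in lexicon:
        return lexicon[(src, tgt)]

    # No cluster known for this word, so there is nothing to back off to.
    cluster = src_cluster.get(src)
    if cluster is None:
        return floor

    # Back off: average the scores of same-cluster source words paired with tgt.
    mates = [
        score
        for (s, t), score in lexicon.items()
        if t == tgt and src_cluster.get(s) == cluster
    ]
    return sum(mates) / len(mates) if mates else floor
```

With a lexicon containing only `("house", "maison")` and a clustering that groups `"home"` with `"house"`, the unseen pair `("home", "maison")` receives the score of its cluster mate instead of the floor value. In that case a sentence pair that would otherwise look unrelated can still be matched.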