24th Signal Processing and Communication Application Conference (SIU), Zonguldak, Türkiye, 16 - 19 Mayıs 2016, ss.1393-1396
Identification of paraphrase sentence pairs becomes increasingly prominent in natural language processing area (e.g plagiarism detection, summarization, machine translation). In this study, it is proposed to employ information gain measure in determining the value-ranges of the paraphrase classification features on the renown paraphrase corpus of Microsoft Research (MSRP). The classification performances of value-ranges that are determined by information gain measure and an alternative heuristic method are compared by the use of Bayes classifier. The results show that the proposed method performs better than the heuristic method.