Sentence boundary detection in Turkish


DINCER B., Karaoglan B.

ADVANCES IN INFORMATION SYSTEMS, PROCEEDINGS, vol.3261, pp.255-262, 2004 (Journal Indexed in SCI)

  • Publication Type: Article
  • Volume: 3261
  • Publication Date: 2004
  • DOI Number: 10.1007/978-3-540-30198-1_26
  • Title of Journal: ADVANCES IN INFORMATION SYSTEMS, PROCEEDINGS
  • Page Numbers: pp.255-262

Abstract

In this paper, we describe a method for sentence boundary detection in Turkish. The method uses simple heuristic knowledge of Turkish syllabification and phonetic rules to disambiguate dots. The algorithm achieves a test accuracy of 96.02%. The main contribution of this study is a new lexicon-free method for distinguishing EOS (end-of-sentence) dots from dots used for other purposes.
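
The abstract does not spell out the rules themselves, so the following is only an illustrative sketch of what a lexicon-free dot disambiguator might look like, not the authors' algorithm. It relies on one well-known property of Turkish, that every syllable contains a vowel, so a token with no vowel (for example "Dr") cannot be syllabified and is probably an abbreviation. The function names, the rule set, and the capitalization check are all assumptions made for illustration.

```python
# Hypothetical sketch: NOT the rules from the paper, only an illustration
# of a lexicon-free heuristic based on Turkish syllable structure.

TURKISH_VOWELS = set("aeıioöuüAEIİOÖUÜ")


def has_vowel(token: str) -> bool:
    """Return True if the token contains at least one Turkish vowel."""
    return any(ch in TURKISH_VOWELS for ch in token)


def is_eos_dot(prev_token: str, next_token: str) -> bool:
    """Guess whether the dot after prev_token ends a sentence.

    prev_token is the token without its trailing dot; next_token is the
    following token, or "" at the end of the text. All rules are assumed.
    """
    if prev_token.isdigit():
        return False  # e.g. "3." read as an ordinal, not a sentence end
    if len(prev_token) == 1:
        return False  # a single letter is treated as an initial
    if not has_vowel(prev_token):
        return False  # no vowel means no syllable: likely an abbreviation
    if next_token and not next_token[0].isupper():
        return False  # a following lowercase word suggests continuation
    return True


def split_sentences(text: str) -> list[str]:
    """Split whitespace-tokenized text at dots judged to be EOS dots."""
    tokens = text.split()
    sentences, current = [], []
    for i, tok in enumerate(tokens):
        current.append(tok)
        if tok.endswith("."):
            nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
            if is_eos_dot(tok[:-1], nxt):
                sentences.append(" ".join(current))
                current = []
    if current:
        sentences.append(" ".join(current))  # keep any unterminated tail
    return sentences
```

For instance, split_sentences("Dr. Ahmet geldi. Toplantı saat 3. katta yapıldı.") keeps "Dr." and "3." inside their sentences and splits only after "geldi." and "yapıldı.". A vowel test alone would misjudge abbreviations that do contain vowels, which is presumably where the phonetic rules mentioned in the abstract come in.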