Towards language-independent sentence boundary detection

Do G. Lee, Hae-Chang Rim

Research output: Contribution to journal, Article

1 Citation (Scopus)

Abstract

We propose a machine learning approach to language-independent sentence boundary detection. The proposed method requires neither heuristic rules nor language-specific features such as part-of-speech (POS) information or a list of abbreviations or proper names. Using only language-independent features, we run experiments on both an inflectional language and an agglutinative language with fairly different characteristics (English and Korean, respectively, in this paper), and we obtain good performance in both languages.

Original language: English
Pages (from-to): 142-145
Number of pages: 4
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2945
Publication status: Published - 2004 Dec 1

Fingerprint

Boundary Detection
Learning systems
Language
Experiments
Abbreviation
Names
Machine Learning
Heuristics
Experiment

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Computer Science (all)
  • Theoretical Computer Science

Cite this

@article{7263dbe3b39043df89e82c75337187d7,
title = "Towards language-independent sentence boundary detection",
abstract = "We propose a machine learning approach for language-independent sentence boundary detection. The proposed method requires no heuristic rules and language-specific features, such as Part-of-Speech (POS) information, a list of abbreviations or proper names. With only the language-independent features, we perform experiments on not only an inflectional language but also an agglutinative language, having fairly different characteristics (in this paper, English and Korean, respectively). In addition, we obtain good performances in both languages.",
author = "Lee, {Do G.} and Rim, Hae-Chang",
year = "2004",
month = "12",
day = "1",
language = "English",
volume = "2945",
pages = "142--145",
journal = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
issn = "0302-9743",
publisher = "Springer Verlag",
}

TY  - JOUR
T1  - Towards language-independent sentence boundary detection
AU  - Lee, Do G.
AU  - Rim, Hae-Chang
PY  - 2004/12/1
Y1  - 2004/12/1
N2  - We propose a machine learning approach for language-independent sentence boundary detection. The proposed method requires no heuristic rules and language-specific features, such as Part-of-Speech (POS) information, a list of abbreviations or proper names. With only the language-independent features, we perform experiments on not only an inflectional language but also an agglutinative language, having fairly different characteristics (in this paper, English and Korean, respectively). In addition, we obtain good performances in both languages.
AB  - We propose a machine learning approach for language-independent sentence boundary detection. The proposed method requires no heuristic rules and language-specific features, such as Part-of-Speech (POS) information, a list of abbreviations or proper names. With only the language-independent features, we perform experiments on not only an inflectional language but also an agglutinative language, having fairly different characteristics (in this paper, English and Korean, respectively). In addition, we obtain good performances in both languages.
UR  - http://www.scopus.com/inward/record.url?scp=35048899332&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=35048899332&partnerID=8YFLogxK
M3  - Article
AN  - SCOPUS:35048899332
VL  - 2945
SP  - 142
EP  - 145
JO  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
JF  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SN  - 0302-9743
ER  -