Comparative analysis of term distributions in a sentence and in a document for sentence retrieval

Kyoung S. Han, Hae-Chang Rim

Research output: Contribution to journal › Article

Abstract

Most previous work on finding relevant sentences has applied document retrieval models to sentence retrieval, but the resulting performance has been very poor. This paper analyzes the reason for this poor performance by comparing term statistics in a document with those in a sentence. The analysis shows that the distributions of within-document and within-sentence term frequency are not similar, whereas the distribution of document frequency is similar to that of sentence frequency. Given this discrepancy between the term statistics, it is not appropriate to apply document retrieval models, as they stand, to sentence retrieval.
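The comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or data: the tokenizer, the naive sentence splitter, the toy corpus, and the helper names `tokenize`, `split_sentences`, `tf_distribution`, and `unit_frequency` are all assumptions introduced here for illustration. It computes, for documents and for sentences, how often a term occurs k times within a unit, and in how many units each term appears.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens; the paper's tokenizer is not given here.
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(document):
    # Assumption: naive split after '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", document.strip()) if s]


def tf_distribution(units):
    """Fraction of (term, unit) pairs in which the term occurs k times in the unit."""
    counts = Counter()
    for unit in units:
        for freq in Counter(tokenize(unit)).values():
            counts[freq] += 1
    total = sum(counts.values())
    return {k: v / total for k, v in sorted(counts.items())}


def unit_frequency(units):
    """Number of units (documents or sentences) that contain each term."""
    freq = Counter()
    for unit in units:
        freq.update(set(tokenize(unit)))
    return freq


# Toy corpus, invented purely for illustration.
documents = [
    "Sentence retrieval finds relevant sentences. Retrieval models score "
    "each sentence. Short sentences repeat few terms.",
    "Document retrieval models rely on term frequency. Term frequency "
    "varies within a document.",
]
sentences = [s for d in documents for s in split_sentences(d)]

doc_tf = tf_distribution(documents)    # within-document term frequency
sent_tf = tf_distribution(sentences)   # within-sentence term frequency
doc_df = unit_frequency(documents)     # document frequency
sent_sf = unit_frequency(sentences)    # sentence frequency

print("within-document TF distribution:", doc_tf)
print("within-sentence TF distribution:", sent_tf)
print("document vs. sentence frequency of 'retrieval':",
      doc_df["retrieval"], sent_sf["retrieval"])
```

In this toy corpus no term repeats within a single sentence, while several terms repeat within a document, which is the kind of mismatch in within-unit term frequency that the abstract points to. A real analysis would need a proper test collection and sentence segmenter.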

Original language: English
Pages (from-to): 484-487
Number of pages: 4
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2945
Publication status: Published - 2004 Dec 1

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology(all)
  • Computer Science(all)
  • Theoretical Computer Science
