A model for extracting keywords of document using term frequency and distribution

Jae W. Lee, Doo Kwon Baik

Research output: Contribution to journal (Article)

6 Citations (Scopus)

Abstract

In information retrieval systems, it is important that documents are indexed with appropriate terms. In this paper, we propose a simple retrieval model based on the distribution characteristics of terms in addition to their frequency in documents. We define the distribution characteristics of keywords using one statistic, the standard deviation. Terms with high frequency and a large standard deviation can be extracted as document keywords, while terms with high frequency and a small standard deviation can be defined as paragraph keywords. With the proposed retrieval model, many documents or knowledge can be searched using the document keywords and paragraph keywords.
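The classification rule in the abstract can be illustrated with a minimal Python sketch. The abstract does not say what the standard deviation is computed over, so this sketch assumes it is taken over the paragraph indices at which a term occurs. The function name `classify_keywords`, the whitespace tokenization, and the thresholds `min_freq` and `spread_threshold` are hypothetical choices for illustration, not values from the paper.

```python
from collections import defaultdict
from statistics import pstdev


def classify_keywords(paragraphs, min_freq=3, spread_threshold=1.0):
    """Split frequent terms into document keywords and paragraph keywords.

    Assumption (not from the paper): a term's distribution is measured as
    the population standard deviation of the paragraph indices where it
    occurs. Both thresholds are illustrative placeholders.
    """
    # Record the paragraph index of every occurrence of every term.
    positions = defaultdict(list)
    for index, paragraph in enumerate(paragraphs):
        for token in paragraph.lower().split():
            positions[token].append(index)

    document_keywords = []
    paragraph_keywords = []
    for term, indices in positions.items():
        # Infrequent terms are not keyword candidates at all.
        if len(indices) < min_freq:
            continue
        spread = pstdev(indices)
        if spread >= spread_threshold:
            # Frequent and spread across the text: document keyword.
            document_keywords.append(term)
        else:
            # Frequent but concentrated in one place: paragraph keyword.
            paragraph_keywords.append(term)
    return sorted(document_keywords), sorted(paragraph_keywords)


if __name__ == "__main__":
    sample = [
        "retrieval index index index",
        "retrieval model",
        "retrieval",
        "retrieval score",
    ]
    print(classify_keywords(sample))
```

In this toy input, "retrieval" occurs once in each of the four paragraphs and is classified as a document keyword, while "index" occurs three times in a single paragraph and is classified as a paragraph keyword. A real system would also need stop-word removal and thresholds tuned to the collection.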

Original language: English
Pages (from-to): 437-440
Number of pages: 4
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2945
Publication status: Published - 2004 Dec 1

Fingerprint

Information Systems
Information retrieval systems
Term
Standard deviation
Statistics
Retrieval
Model
Indexing
Information Retrieval
Model-based

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology(all)
  • Computer Science(all)
  • Theoretical Computer Science

Cite this

@article{ff3b9b4aac9e426dbbc0974e94dc3554,
title = "A model for extracting keywords of document using term frequency and distribution",
abstract = "In information retrieval systems, it is important that documents are indexed with appropriate terms. In this paper, we propose a simple retrieval model based on the distribution characteristics of terms in addition to their frequency in documents. We define the distribution characteristics of keywords using one statistic, the standard deviation. Terms with high frequency and a large standard deviation can be extracted as document keywords, while terms with high frequency and a small standard deviation can be defined as paragraph keywords. With the proposed retrieval model, many documents or knowledge can be searched using the document keywords and paragraph keywords.",
author = "Lee, {Jae W.} and Baik, {Doo Kwon}",
year = "2004",
month = "12",
day = "1",
language = "English",
volume = "2945",
pages = "437--440",
journal = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
issn = "0302-9743",
publisher = "Springer Verlag",

}

TY - JOUR

T1 - A model for extracting keywords of document using term frequency and distribution

AU - Lee, Jae W.

AU - Baik, Doo Kwon

PY - 2004/12/1

Y1 - 2004/12/1

N2 - In information retrieval systems, it is important that documents are indexed with appropriate terms. In this paper, we propose a simple retrieval model based on the distribution characteristics of terms in addition to their frequency in documents. We define the distribution characteristics of keywords using one statistic, the standard deviation. Terms with high frequency and a large standard deviation can be extracted as document keywords, while terms with high frequency and a small standard deviation can be defined as paragraph keywords. With the proposed retrieval model, many documents or knowledge can be searched using the document keywords and paragraph keywords.

AB - In information retrieval systems, it is important that documents are indexed with appropriate terms. In this paper, we propose a simple retrieval model based on the distribution characteristics of terms in addition to their frequency in documents. We define the distribution characteristics of keywords using one statistic, the standard deviation. Terms with high frequency and a large standard deviation can be extracted as document keywords, while terms with high frequency and a small standard deviation can be defined as paragraph keywords. With the proposed retrieval model, many documents or knowledge can be searched using the document keywords and paragraph keywords.

UR - http://www.scopus.com/inward/record.url?scp=35048902835&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=35048902835&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:35048902835

VL - 2945

SP - 437

EP - 440

JO - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

JF - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SN - 0302-9743

ER -