Simple weighting techniques for query expansion in biomedical document retrieval

Young In Song, Kyoung Soo Han, So Young Park, Sang Bum Kim, Hae Chang Rim

Research output: Contribution to journal › Article › peer-review

Abstract

In this paper, we propose two weighting techniques to improve the performance of query expansion in biomedical document retrieval, especially when a short biomedical term in a query is expanded with its synonymous multi-word terms. When a query contains synonymous terms of different lengths, a traditional IR model ranks documents containing the longer term more highly, because a longer term has more chances to match the query. However, such a preference is clearly inappropriate and often yields unsatisfactory results. To alleviate this biased weighting problem, we devise a method of normalizing the weights of query terms within a long multi-word biomedical term, and a method of discriminating terms by using inverse terminology frequency, a novel statistic estimated in a query domain. Experimental results on the MEDLINE corpus show that our two simple techniques improve retrieval performance by correcting the inappropriate preference for long multi-word terms in an expanded query.
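
The length bias described in the abstract can be illustrated with a small sketch. The abstract does not give the authors' formulas, so everything here is an assumption made for illustration: the even split of a term's weight across its words, the IDF-style definition of inverse terminology frequency (log of the number of terms in a terminology list divided by the number of terms containing the word), the toy scoring function, and all function names.

```python
import math
from collections import Counter


def expand_weights(terms, normalize=False):
    """Per-word query weights for a set of synonymous terms.

    Without normalization every word gets weight 1, so a three-word
    synonym carries three times the mass of a one-word synonym.
    With normalization (assumed scheme, not taken from the paper)
    each term's unit weight is split evenly over its words.
    """
    weights = Counter()
    for term in terms:
        words = term.lower().split()
        share = 1.0 / len(words) if normalize else 1.0
        for word in words:
            weights[word] += share
    return dict(weights)


def inverse_terminology_frequency(terminology):
    """Assumed ITF, by analogy with IDF: log(N / terms containing the word)."""
    n_terms = len(terminology)
    counts = Counter()
    for term in terminology:
        counts.update(set(term.lower().split()))
    return {word: math.log(n_terms / c) for word, c in counts.items()}


def score(document, weights):
    """Toy matching score: sum of the weights of query words in the document."""
    tokens = set(document.lower().split())
    return sum(wt for word, wt in weights.items() if word in tokens)


# A short term expanded with its multi-word synonym.
synonyms = ["BSE", "bovine spongiform encephalopathy"]
doc_short = "bse outbreak in cattle"
doc_long = "bovine spongiform encephalopathy outbreak in cattle"

raw = expand_weights(synonyms)
norm = expand_weights(synonyms, normalize=True)

# Hypothetical terminology list used only to estimate ITF.
itf = inverse_terminology_frequency([
    "bovine spongiform encephalopathy",
    "bovine viral diarrhea",
    "hepatic encephalopathy",
    "bse",
])
```

With unnormalized weights, `score(doc_long, raw)` exceeds `score(doc_short, raw)` even though both documents mention the same concept once. With the normalized weights the two scores are equal. In the ITF table, a word that occurs in few terms of the list, such as `spongiform`, receives a higher value than a word shared by several terms, such as `bovine`.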

Original language: English
Pages (from-to): 1873-1876
Number of pages: 4
Journal: IEICE Transactions on Information and Systems
Volume: E90-D
Issue number: 11
Publication status: Published - 2007 Nov

Keywords

  • Biomedical document retrieval
  • Biomedical terminology
  • Biomedical terminology weighting
  • Query expansion

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
  • Artificial Intelligence
