Bridging lexical gaps between queries and questions on large online Q&A collections with compact translation models

Jung Tae Lee, Sang Bum Kim, Young In Song, Hae-Chang Rim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

43 Citations (Scopus)

Abstract

Lexical gaps between queries and questions (documents) have been a major issue in question retrieval on large online question and answer (Q&A) collections. Previous studies address the issue by implicitly expanding queries with the help of translation models pre-constructed using statistical techniques. However, because unimportant words (e.g., non-topical words, common words) can be included in the translation models, a lack of noise control on the models can degrade retrieval performance. This paper investigates a number of empirical methods for eliminating unimportant words in order to construct compact translation models for retrieval purposes. Experiments conducted on a real-world Q&A collection show that substantial improvements in retrieval performance can be achieved by using compact translation models.
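The abstract does not spell out the elimination methods themselves, so the following is only a minimal sketch of the general idea, under stated assumptions: a word-to-word translation table is pruned of unimportant words and then used in a translation-based language model to score questions against a query. The table `TRANS`, the `STOPWORDS` list, the threshold `min_prob`, the smoothing weight `lam`, and the function names are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical toy table: TRANS[t][w] approximates P(w | t), the probability
# that question word t "translates" to query word w.
TRANS = {
    "laptop": {"laptop": 0.6, "notebook": 0.3, "the": 0.1},
    "fix": {"fix": 0.5, "repair": 0.4, "how": 0.1},
    "the": {"the": 0.9, "repair": 0.1},
}

# Assumed list of unimportant (non-topical, common) words.
STOPWORDS = {"the", "how"}


def compact(trans, stopwords, min_prob=0.05):
    """Drop stopword entries and low-probability pairs, then renormalize rows."""
    out = {}
    for t, row in trans.items():
        if t in stopwords:
            continue
        kept = {w: p for w, p in row.items()
                if w not in stopwords and p >= min_prob}
        total = sum(kept.values())
        if total > 0:
            out[t] = {w: p / total for w, p in kept.items()}
    return out


def score(query, question, trans, background, lam=0.2):
    """Log-likelihood of the query under a smoothed translation-based model."""
    counts = Counter(question)
    n = len(question)
    logp = 0.0
    for w in query:
        # Sum over question words t of P(w | t) * P(t | question).
        p_trans = sum(trans.get(t, {}).get(w, 0.0) * c / n
                      for t, c in counts.items())
        # Mix with a background probability so unseen words do not zero out.
        p = (1 - lam) * p_trans + lam * background.get(w, 1e-6)
        logp += math.log(p)
    return logp


compact_table = compact(TRANS, STOPWORDS)
background = {"notebook": 0.01, "repair": 0.01}
query = ["notebook", "repair"]
related = score(query, ["fix", "laptop"], compact_table, background)
unrelated = score(query, ["buy", "phone"], compact_table, background)
```

In this toy setting the query shares no surface word with the question "fix laptop", yet it scores higher than the unrelated question because the pruned table still links "notebook" to "laptop" and "repair" to "fix". Pruning by a stopword list and a probability threshold is just one simple stand-in for noise control; the paper's own criteria may differ.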

Original language: English
Title of host publication: EMNLP 2008 - 2008 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference: A Meeting of SIGDAT, a Special Interest Group of the ACL
Pages: 410-418
Number of pages: 9
Publication status: Published - 2008 Dec 1
Event: 2008 Conference on Empirical Methods in Natural Language Processing, EMNLP 2008, Co-located with AMTA 2008 and the International Workshop on Spoken Language Translation - Honolulu, HI, United States
Duration: 2008 Oct 25 to 2008 Oct 27

Other

Other: 2008 Conference on Empirical Methods in Natural Language Processing, EMNLP 2008, Co-located with AMTA 2008 and the International Workshop on Spoken Language Translation
Country: United States
City: Honolulu, HI
Period: 08/10/25 to 08/10/27

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems

Cite this

Lee, J. T., Kim, S. B., Song, Y. I., & Rim, H-C. (2008). Bridging lexical gaps between queries and questions on large online Q&A collections with compact translation models. In EMNLP 2008 - 2008 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference: A Meeting of SIGDAT, a Special Interest Group of the ACL (pp. 410-418).