Discovering high-quality threaded discussions in online forums

Jung Tae Lee, Min Chul Yang, Hae-Chang Rim

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Archives of threaded discussions generated by users in online forums and discussion boards contain valuable knowledge on various topics. However, not all threads are useful because of deliberate abuses, such as trolling and flaming, that are commonly observed in online conversations. The existence of various users with different levels of expertise also makes it difficult to assume that every discussion thread stored online contains high-quality content. Although finding high-quality threads automatically can help both users and search engines sift through huge thread archives and make effective use of these potentially useful resources, no previous work to our knowledge has studied such a task. In this paper, we propose an automatic method for distinguishing high-quality threads from low-quality ones in online discussion sites. We first suggest four different artificial measures for inducing the overall quality of a thread based on the ratings of its posts. We then propose two tasks involving prediction of thread quality without using post rating information. We adopt a popular machine learning framework to solve the two prediction tasks. Experimental results on a real-world forum archive demonstrate that our method can significantly improve the prediction performance across all four measures of thread quality on both tasks. We also compare how different types of features derived from various aspects of threads contribute to the overall performance and investigate key features that play a crucial role in discovering high-quality threads in online discussion sites.

Original language: English
Pages (from-to): 519-531
Number of pages: 13
Journal: Journal of Computer Science and Technology
Volume: 29
Issue number: 3
DOI: 10.1007/s11390-014-1446-5
Publication status: Published - 2014 Jan 1

Fingerprint

Thread
Search engines
Learning systems
Prediction
Performance Prediction
Expertise
Search Engine
Machine Learning
Resources
Experimental Results

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software
  • Computational Theory and Mathematics
  • Theoretical Computer Science
  • Computer Science Applications

Cite this

Discovering high-quality threaded discussions in online forums. / Lee, Jung Tae; Yang, Min Chul; Rim, Hae-Chang.

In: Journal of Computer Science and Technology, Vol. 29, No. 3, 01.01.2014, p. 519-531.

@article{353ddd9492a64c8f9ec2536dd2316fb3,
title = "Discovering high-quality threaded discussions in online forums",
abstract = "Archives of threaded discussions generated by users in online forums and discussion boards contain valuable knowledge on various topics. However, not all threads are useful because of deliberate abuses, such as trolling and flaming, that are commonly observed in online conversations. The existence of various users with different levels of expertise also makes it difficult to assume that every discussion thread stored online contains high-quality contents. Although finding high-quality threads automatically can help both users and search engines sift through a huge amount of thread archives and make use of these potentially useful resources effectively, no previous work to our knowledge has performed a study on such task. In this paper, we propose an automatic method for distinguishing high-quality threads from low-quality ones in online discussion sites. We first suggest four different artificial measures for inducing overall quality of a thread based on ratings of its posts. We then propose two tasks involving prediction of thread quality without using post rating information. We adopt a popular machine learning framework to solve the two prediction tasks. Experimental results on a real world forum archive demonstrate that our method can significantly improve the prediction performance across all four measures of thread quality on both tasks. We also compare how different types of features derived from various aspects of threads contribute to the overall performance and investigate key features that play a crucial role in discovering high-quality threads in online discussion sites.",
keywords = "discussion board, online forum, thread quality",
author = "Lee, {Jung Tae} and Yang, {Min Chul} and Rim, Hae-Chang",
year = "2014",
month = "1",
day = "1",
doi = "10.1007/s11390-014-1446-5",
language = "English",
volume = "29",
pages = "519--531",
journal = "Journal of Computer Science and Technology",
issn = "1000-9000",
publisher = "Springer New York",
number = "3"
}

TY - JOUR

T1 - Discovering high-quality threaded discussions in online forums

AU - Lee, Jung Tae

AU - Yang, Min Chul

AU - Rim, Hae-Chang

PY - 2014/1/1

Y1 - 2014/1/1

N2 - Archives of threaded discussions generated by users in online forums and discussion boards contain valuable knowledge on various topics. However, not all threads are useful because of deliberate abuses, such as trolling and flaming, that are commonly observed in online conversations. The existence of various users with different levels of expertise also makes it difficult to assume that every discussion thread stored online contains high-quality contents. Although finding high-quality threads automatically can help both users and search engines sift through a huge amount of thread archives and make use of these potentially useful resources effectively, no previous work to our knowledge has performed a study on such task. In this paper, we propose an automatic method for distinguishing high-quality threads from low-quality ones in online discussion sites. We first suggest four different artificial measures for inducing overall quality of a thread based on ratings of its posts. We then propose two tasks involving prediction of thread quality without using post rating information. We adopt a popular machine learning framework to solve the two prediction tasks. Experimental results on a real world forum archive demonstrate that our method can significantly improve the prediction performance across all four measures of thread quality on both tasks. We also compare how different types of features derived from various aspects of threads contribute to the overall performance and investigate key features that play a crucial role in discovering high-quality threads in online discussion sites.

AB - Archives of threaded discussions generated by users in online forums and discussion boards contain valuable knowledge on various topics. However, not all threads are useful because of deliberate abuses, such as trolling and flaming, that are commonly observed in online conversations. The existence of various users with different levels of expertise also makes it difficult to assume that every discussion thread stored online contains high-quality contents. Although finding high-quality threads automatically can help both users and search engines sift through a huge amount of thread archives and make use of these potentially useful resources effectively, no previous work to our knowledge has performed a study on such task. In this paper, we propose an automatic method for distinguishing high-quality threads from low-quality ones in online discussion sites. We first suggest four different artificial measures for inducing overall quality of a thread based on ratings of its posts. We then propose two tasks involving prediction of thread quality without using post rating information. We adopt a popular machine learning framework to solve the two prediction tasks. Experimental results on a real world forum archive demonstrate that our method can significantly improve the prediction performance across all four measures of thread quality on both tasks. We also compare how different types of features derived from various aspects of threads contribute to the overall performance and investigate key features that play a crucial role in discovering high-quality threads in online discussion sites.

KW - discussion board

KW - online forum

KW - thread quality

UR - http://www.scopus.com/inward/record.url?scp=84901683000&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84901683000&partnerID=8YFLogxK

U2 - 10.1007/s11390-014-1446-5

DO - 10.1007/s11390-014-1446-5

M3 - Article

VL - 29

SP - 519

EP - 531

JO - Journal of Computer Science and Technology

JF - Journal of Computer Science and Technology

SN - 1000-9000

IS - 3

ER -