Bringing bag-of-phrases to ODP-based text classification

Haeyong Shin, Byung Gul Ryu, Woo Jong Ryu, Geunjae Lee, Sang-Geun Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

The Open Directory Project (ODP) is a large-scale, high-quality, publicly available web directory. Many studies and real-world applications build on ODP-based classifiers. However, existing approaches rely on the traditional bag-of-words representation of text, and words alone do not always provide atomic units of semantic meaning. In this paper, we propose a novel framework that better captures the semantic meaning of text by bringing bag-of-phrases to ODP-based text classification. The proposed method employs a syntactic tree to extract phrases from ODP and applies a phrase selection method to alleviate the high dimensionality of the bag-of-phrases representation. Evaluation results demonstrate that our approach outperforms state-of-the-art methods in classification performance.
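The two steps named in the abstract (phrase extraction, then phrase selection to limit dimensionality) can be illustrated with a minimal sketch. This is not the paper's method: contiguous word n-grams stand in for syntactic-tree phrase extraction, and a document-frequency threshold stands in for the paper's phrase selection, whose details the abstract does not give. The function names, the stopword list, and the `min_df` and `max_len` parameters are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is"}


def extract_phrases(text, max_len=2):
    """Return word n-grams (1..max_len) that neither start nor end with a stopword.

    Stand-in for syntactic-tree phrase extraction, which would need a parser.
    """
    tokens = [t for t in text.lower().split() if t.isalpha()]
    phrases = []
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            phrases.append(" ".join(gram))
    return phrases


def select_phrases(docs, min_df=2, max_len=2):
    """Keep phrases that occur in at least min_df documents.

    A simple document-frequency filter, used here only to show how
    selection shrinks the phrase vocabulary.
    """
    df = Counter()
    for doc in docs:
        # set() so each document counts a phrase at most once
        df.update(set(extract_phrases(doc, max_len)))
    return {p for p, c in df.items() if c >= min_df}


def bag_of_phrases(text, vocabulary, max_len=2):
    """Count occurrences of the selected phrases in a text."""
    return Counter(p for p in extract_phrases(text, max_len) if p in vocabulary)


docs = [
    "machine learning for text classification",
    "text classification with machine learning",
    "open source web directory",
]
vocab = select_phrases(docs)
bag = bag_of_phrases("machine learning improves text classification", vocab)
print(sorted(vocab))
print(bag)
```

In this toy run, multi-word units such as "machine learning" and "text classification" survive selection alongside their component words, while phrases seen in only one document are dropped.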

Original language: English
Title of host publication: 2016 International Conference on Big Data and Smart Computing, BigComp 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 485-488
Number of pages: 4
ISBN (Print): 9781467387965
DOIs: https://doi.org/10.1109/BIGCOMP.2016.7425975
Publication status: Published - 2016 Mar 3
Event: International Conference on Big Data and Smart Computing, BigComp 2016 - Hong Kong, China
Duration: 2016 Jan 18 to 2016 Jan 20

Other

Other: International Conference on Big Data and Smart Computing, BigComp 2016
Country: China
City: Hong Kong
Period: 16/1/18 to 16/1/20

Keywords

  • open directory project
  • syntactic structure
  • text classification
  • text mining

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Information Systems and Management


  • Cite this

    Shin, H., Ryu, B. G., Ryu, W. J., Lee, G., & Lee, S-G. (2016). Bringing bag-of-phrases to ODP-based text classification. In 2016 International Conference on Big Data and Smart Computing, BigComp 2016 (pp. 485-488). [7425975] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BIGCOMP.2016.7425975