TY - GEN
T1 - Improving Open Directory Project-Based Text Classification with Hierarchical Category Embedding
AU - Lee, Ji Min
AU - Kim, Kang Min
AU - Kim, Yeachan
AU - Lee, Sang-Geun
N1 - Funding Information:
This research was supported in part by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT (number 2015R1A2A1A10052665). This research was also supported in part by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2018-2016-0-00464) supervised by the IITP (Institute for Information & communications Technology Promotion).
Publisher Copyright:
© 2018 IEEE.
PY - 2018/10/4
Y1 - 2018/10/4
N2 - Many works have used knowledge bases that contain a taxonomy of hierarchically structured categories for large-scale text classification. These works have utilized hierarchical taxonomies based on the explicit representation model, and they demonstrated that this model provides stable performance for large-scale text classification. However, this performance is limited by the knowledge base. In this paper, we integrate the implicit representation model, which can use external knowledge indirectly, with previous large-scale text classification. To this end, we first propose Hierarchical Category embedding (HC embedding) to generate distributed representations of hierarchical categories based on the implicit representation model. Second, we develop a new semantic similarity method to integrate HC embedding with large-scale text classification. To demonstrate its efficacy, we apply the proposed methodology to Open Directory Project (ODP)-based text classification, which has a hierarchical taxonomy. The evaluation results demonstrate that the proposed method outperforms the current state-of-the-art method by 7.4%, 7.0%, and 18% in terms of micro-averaging F1-score, macro-averaging F1-score, and precision at k, respectively.
KW - Artificial neural networks
KW - Embedding
KW - Knowledge manipulations
KW - Knowledge representation
UR - http://www.scopus.com/inward/record.url?scp=85056463040&partnerID=8YFLogxK
U2 - 10.1109/ICCI-CC.2018.8482008
DO - 10.1109/ICCI-CC.2018.8482008
M3 - Conference contribution
AN - SCOPUS:85056463040
T3 - Proceedings of 2018 IEEE 17th International Conference on Cognitive Informatics and Cognitive Computing, ICCI*CC 2018
SP - 246
EP - 253
BT - Proceedings of 2018 IEEE 17th International Conference on Cognitive Informatics and Cognitive Computing, ICCI*CC 2018
A2 - Howard, Newton
A2 - Kwong, Sam
A2 - Wang, Yingxu
A2 - Feldman, Jerome
A2 - Widrow, Bernard
A2 - Sheu, Phillip
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th IEEE International Conference on Cognitive Informatics and Cognitive Computing, ICCI*CC 2018
Y2 - 16 July 2018 through 18 July 2018
ER -