TY - GEN
T1 - From small-scale to large-scale text classification
AU - Kim, Kang Min
AU - Kim, Yeachan
AU - Lee, Jungho
AU - Lee, Ji Min
AU - Lee, Sang Keun
N1 - Publisher Copyright:
© 2019 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC-BY 4.0 License.
PY - 2019/5/13
Y1 - 2019/5/13
N2 - Neural network models have achieved impressive results in the field of text classification. However, existing approaches often suffer from insufficient training data in large-scale text classification involving a large number of categories (e.g., several thousand categories). Several neural network models have utilized multi-task learning to overcome the limited amount of training data. However, these approaches are also limited to small-scale text classification. In this paper, we propose a novel neural network-based multi-task learning framework for large-scale text classification. To this end, we first treat the different scales of text classification (i.e., large and small numbers of categories) as multiple, related tasks. Then, we train the proposed neural network, which learns the small- and large-scale text classification tasks simultaneously. In particular, we further enhance this multi-task learning architecture with a gate mechanism, which controls the flow of features between the small- and large-scale text classification tasks. Experimental results clearly show that the proposed model improves the performance of the large-scale text classification task with the help of the small-scale text classification task. The proposed scheme exhibits significant improvements of as much as 14% and 5% in micro-averaged and macro-averaged F1-scores, respectively, over state-of-the-art techniques.
AB - Neural network models have achieved impressive results in the field of text classification. However, existing approaches often suffer from insufficient training data in large-scale text classification involving a large number of categories (e.g., several thousand categories). Several neural network models have utilized multi-task learning to overcome the limited amount of training data. However, these approaches are also limited to small-scale text classification. In this paper, we propose a novel neural network-based multi-task learning framework for large-scale text classification. To this end, we first treat the different scales of text classification (i.e., large and small numbers of categories) as multiple, related tasks. Then, we train the proposed neural network, which learns the small- and large-scale text classification tasks simultaneously. In particular, we further enhance this multi-task learning architecture with a gate mechanism, which controls the flow of features between the small- and large-scale text classification tasks. Experimental results clearly show that the proposed model improves the performance of the large-scale text classification task with the help of the small-scale text classification task. The proposed scheme exhibits significant improvements of as much as 14% and 5% in micro-averaged and macro-averaged F1-scores, respectively, over state-of-the-art techniques.
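N1 - Editor's note: as a rough illustration of the gated multi-task idea the abstract describes (not the authors' released code), a minimal PyTorch sketch follows; the bag-of-embeddings encoder, layer sizes, and all names are illustrative assumptions.
N1 - import torch
     import torch.nn as nn

     class GatedMultiTaskClassifier(nn.Module):
         """Hypothetical sketch: one shared encoder feeds a small-scale and a
         large-scale classification head; a sigmoid gate controls how much of
         the small-scale task's features flow into the large-scale head."""
         def __init__(self, vocab_size=10000, embed_dim=128,
                      n_small=20, n_large=5000):
             super().__init__()
             self.embed = nn.EmbeddingBag(vocab_size, embed_dim)  # shared encoder (mean pooling)
             self.small_proj = nn.Linear(embed_dim, embed_dim)    # small-task feature transform
             self.gate = nn.Linear(2 * embed_dim, embed_dim)      # feature-flow gate
             self.small_head = nn.Linear(embed_dim, n_small)
             self.large_head = nn.Linear(embed_dim, n_large)

         def forward(self, token_ids):
             shared = self.embed(token_ids)                       # (batch, embed_dim)
             small_feat = torch.tanh(self.small_proj(shared))
             # Gate decides, per dimension, how much small-task signal
             # the large-scale head receives.
             g = torch.sigmoid(self.gate(torch.cat([shared, small_feat], dim=-1)))
             large_feat = shared + g * small_feat
             return self.small_head(small_feat), self.large_head(large_feat)

     model = GatedMultiTaskClassifier()
     tokens = torch.randint(0, 10000, (4, 50))                    # 4 documents, 50 token ids each
     small_logits, large_logits = model(tokens)
     # Joint training would sum cross-entropy losses from both heads,
     # so both tasks are learned simultaneously as the abstract states.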
KW - Deep Neural Networks
KW - Large-scale Text Classification
KW - Multi-task Learning
UR - http://www.scopus.com/inward/record.url?scp=85066910847&partnerID=8YFLogxK
U2 - 10.1145/3308558.3313563
DO - 10.1145/3308558.3313563
M3 - Conference contribution
AN - SCOPUS:85066910847
T3 - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
SP - 853
EP - 862
BT - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
PB - Association for Computing Machinery, Inc.
T2 - 2019 World Wide Web Conference, WWW 2019
Y2 - 13 May 2019 through 17 May 2019
ER -