Semantic Hashtag Relation Classification Using Co-occurrence Word Information

Sungwon Seo, Jong-Kook Kim, Sung Il Kim, Jeewoo Kim, Joongheon Kim

Research output: Contribution to journal › Article

Abstract

Users of social networking services (SNS) may express their thoughts and feelings using simple hashtags. Hashtags are related to other hashtags and images that are used together in the user’s other posts. Understanding the meaning of personal hashtags can be a way to learn the latent semantic expressions of personal words. Existing methods for learning and analyzing semantics, such as Latent Semantic Analysis, Latent Dirichlet Allocation and Word Embedding, need a large-scale corpus to construct an elaborate model. A large-scale corpus usually consists of words that many people already use, so existing methods are able to capture the latent meaning of words in general use. However, it is difficult for these methods to find the personal meanings of words used by a particular person, because the number of words that one person uses is usually very small compared to a large-scale corpus. Another reason for the difficulty is that existing methods rely on occurrence frequency or co-occurrence probability. The importance, frequency or probability of a personalized meaning may therefore disappear because of this large difference in the number of words. This research focuses on the classification of semantic words using a user’s hashtag data and the co-occurrence of these hashtags. In the evaluation, the proposed approach improves on previous work by 18% in Precision and by more than 70% in Recall.
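As a rough illustration of the ideas named in the abstract, the sketch below counts how often pairs of hashtags appear together in one user's posts and then computes Precision and Recall for a set of predicted related pairs. It is not the authors' method, which the abstract does not describe in detail. The function names, the sample posts, the threshold of 2 and the reference pairs are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(posts):
    """Count how often each unordered hashtag pair appears in the same post."""
    counts = Counter()
    for tags in posts:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique_tags = sorted(set(tags))
        for first, second in combinations(unique_tags, 2):
            counts[(first, second)] += 1
    return counts


def precision_recall(predicted, actual):
    """Return (precision, recall) for predicted pairs against reference pairs."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical posts from a single user (illustrative data only).
posts = [
    ["#coffee", "#monday", "#work"],
    ["#coffee", "#work"],
    ["#beach", "#holiday"],
]

counts = cooccurrence_counts(posts)

# Assumed rule for this sketch: a pair is "related" if it co-occurs at least twice.
related = [pair for pair, n in counts.items() if n >= 2]

# Hypothetical reference labels for the same user.
reference = [("#coffee", "#work"), ("#beach", "#holiday")]
precision, recall = precision_recall(related, reference)
```

With so few posts, a fixed count threshold misses pairs that occur only once, which is the kind of sparsity the abstract identifies as a problem for frequency-based methods applied to a single user's data.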

Original language: English
Pages (from-to): 1-11
Number of pages: 11
Journal: Wireless Personal Communications
DOIs: 10.1007/s11277-018-5745-y
Publication status: Accepted/In press - 2018 Apr 17

Keywords

  • Hashtag
  • Information retrieval
  • Personal word vector
  • Personalized meaning
  • Semantics
  • Social networking service

ASJC Scopus subject areas

  • Computer Science Applications
  • Electrical and Electronic Engineering

Cite this

Semantic Hashtag Relation Classification Using Co-occurrence Word Information. / Seo, Sungwon; Kim, Jong-Kook; Kim, Sung Il; Kim, Jeewoo; Kim, Joongheon.

In: Wireless Personal Communications, 17.04.2018, p. 1-11.

@article{06daa03fa5c64a90b5749ca4e87e903e,
title = "Semantic Hashtag Relation Classification Using Co-occurrence Word Information",
keywords = "Hashtag, Information retrieval, Personal word vector, Personalized meaning, Semantics, Social networking service",
author = "Sungwon Seo and Jong-Kook Kim and Kim, {Sung Il} and Jeewoo Kim and Joongheon Kim",
year = "2018",
month = "4",
day = "17",
doi = "10.1007/s11277-018-5745-y",
language = "English",
pages = "1--11",
journal = "Wireless Personal Communications",
issn = "0929-6212",
publisher = "Springer Netherlands",

}

TY - JOUR

T1 - Semantic Hashtag Relation Classification Using Co-occurrence Word Information

AU - Seo, Sungwon

AU - Kim, Jong-Kook

AU - Kim, Sung Il

AU - Kim, Jeewoo

AU - Kim, Joongheon

PY - 2018/4/17

Y1 - 2018/4/17

KW - Hashtag

KW - Information retrieval

KW - Personal word vector

KW - Personalized meaning

KW - Semantics

KW - Social networking service

UR - http://www.scopus.com/inward/record.url?scp=85045446754&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85045446754&partnerID=8YFLogxK

U2 - 10.1007/s11277-018-5745-y

DO - 10.1007/s11277-018-5745-y

M3 - Article

SP - 1

EP - 11

JO - Wireless Personal Communications

JF - Wireless Personal Communications

SN - 0929-6212

ER -