Deep semantic frame-based deceptive opinion spam analysis

Seongsoon Kim, Hyeokyoon Chang, Seongwoon Lee, Minhwan Yu, Jaewoo Kang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

18 Citations (Scopus)

Abstract

User-generated content is becoming increasingly valuable to both individuals and businesses due to its usefulness and influence in e-commerce markets. As consumers rely more on such information, posting deceptive opinions, which can be deliberately used for potential profit, is becoming more of an issue. Existing work on opinion spam detection focuses mainly on linguistic features such as n-grams, syntactic patterns, or LIWC. However, deep semantic analysis remains largely unstudied. In this paper, we propose a frame-based deep semantic analysis method for understanding rich characteristics of deceptive and truthful opinions written by various types of individuals including crowdsourcing workers, employees who have expert-level domain knowledge about local businesses, and online users who post on Yelp and TripAdvisor. Using our proposed semantic frame feature, we developed a classification model that outperforms the baseline model and achieves an accuracy of nearly 91%. Also, we performed qualitative analysis of deceptive and truthful review datasets and considered their semantic differences. Finally, we successfully found some interesting features that existing methods were unable to identify.
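
As a rough illustration of what a frame-based feature could look like (this is not the authors' implementation), the sketch below assumes each review has already been annotated with FrameNet frame names by some external semantic parser. It turns those annotations into relative-frequency vectors that a standard classifier could consume. The function names, the frame labels chosen, and the two sample reviews are illustrative assumptions, not data or code from the paper.

```python
from collections import Counter


def build_vocabulary(docs):
    """Return a sorted list of every frame name seen across the reviews."""
    return sorted({frame for frames in docs for frame in frames})


def frame_vector(frames, vocabulary):
    """Return the relative frequency of each vocabulary frame in one review."""
    counts = Counter(frames)
    total = len(frames) or 1  # avoid dividing by zero for an empty review
    return [counts[name] / total for name in vocabulary]


# Hypothetical, hand-written frame annotations (not taken from the paper).
reviews = [
    ["Desirability", "Buildings", "Desirability"],
    ["Food", "Commerce_buy"],
]

vocab = build_vocabulary(reviews)
vectors = [frame_vector(review, vocab) for review in reviews]
```

Each row of `vectors` lines up with `vocab`, so the rows can be passed, together with deceptive/truthful labels, to any off-the-shelf classifier. Which classifier and which additional features the paper actually uses is not stated in the abstract.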

Original language: English
Title of host publication: International Conference on Information and Knowledge Management, Proceedings
Publisher: Association for Computing Machinery
Pages: 1131-1140
Number of pages: 10
Volume: 19-23-Oct-2015
ISBN (Print): 9781450337946
DOI: https://doi.org/10.1145/2806416.2806551
Publication status: Published - 2015 Oct 17
Event: 24th ACM International Conference on Information and Knowledge Management, CIKM 2015 - Melbourne, Australia
Duration: 2015 Oct 19 → 2015 Oct 23

Other

Other: 24th ACM International Conference on Information and Knowledge Management, CIKM 2015
Country: Australia
City: Melbourne
Period: 15/10/19 → 15/10/23

Fingerprint

Spam
Electronic commerce
Workers
Profit
Qualitative analysis
Usefulness
User-generated content
Domain knowledge
Employees

Keywords

  • Deceptive opinion spam
  • FrameNet
  • Semantic analysis

ASJC Scopus subject areas

  • Business, Management and Accounting(all)
  • Decision Sciences(all)

Cite this

Kim, S., Chang, H., Lee, S., Yu, M., & Kang, J. (2015). Deep semantic frame-based deceptive opinion spam analysis. In International Conference on Information and Knowledge Management, Proceedings (Vol. 19-23-Oct-2015, pp. 1131-1140). Association for Computing Machinery. https://doi.org/10.1145/2806416.2806551

@inproceedings{c683f98af2e34de3bdfff3323b1abe83,
title = "Deep semantic frame-based deceptive opinion spam analysis",
abstract = "User-generated content is becoming increasingly valuable to both individuals and businesses due to its usefulness and influence in e-commerce markets. As consumers rely more on such information, posting deceptive opinions, which can be deliberately used for potential profit, is becoming more of an issue. Existing work on opinion spam detection focuses mainly on linguistic features such as n-grams, syntactic patterns, or LIWC. However, deep semantic analysis remains largely unstudied. In this paper, we propose a frame-based deep semantic analysis method for understanding rich characteristics of deceptive and truthful opinions written by various types of individuals including crowdsourcing workers, employees who have expert-level domain knowledge about local businesses, and online users who post on Yelp and TripAdvisor. Using our proposed semantic frame feature, we developed a classification model that outperforms the baseline model and achieves an accuracy of nearly 91{\%}. Also, we performed qualitative analysis of deceptive and truthful review datasets and considered their semantic differences. Finally, we successfully found some interesting features that existing methods were unable to identify.",
keywords = "Deceptive opinion spam, FrameNet, Semantic analysis",
author = "Seongsoon Kim and Hyeokyoon Chang and Seongwoon Lee and Minhwan Yu and Jaewoo Kang",
year = "2015",
month = "10",
day = "17",
doi = "10.1145/2806416.2806551",
language = "English",
isbn = "9781450337946",
volume = "19-23-Oct-2015",
pages = "1131--1140",
booktitle = "International Conference on Information and Knowledge Management, Proceedings",
publisher = "Association for Computing Machinery",

}

TY - GEN

T1 - Deep semantic frame-based deceptive opinion spam analysis

AU - Kim, Seongsoon

AU - Chang, Hyeokyoon

AU - Lee, Seongwoon

AU - Yu, Minhwan

AU - Kang, Jaewoo

PY - 2015/10/17

Y1 - 2015/10/17

AB - User-generated content is becoming increasingly valuable to both individuals and businesses due to its usefulness and influence in e-commerce markets. As consumers rely more on such information, posting deceptive opinions, which can be deliberately used for potential profit, is becoming more of an issue. Existing work on opinion spam detection focuses mainly on linguistic features such as n-grams, syntactic patterns, or LIWC. However, deep semantic analysis remains largely unstudied. In this paper, we propose a frame-based deep semantic analysis method for understanding rich characteristics of deceptive and truthful opinions written by various types of individuals including crowdsourcing workers, employees who have expert-level domain knowledge about local businesses, and online users who post on Yelp and TripAdvisor. Using our proposed semantic frame feature, we developed a classification model that outperforms the baseline model and achieves an accuracy of nearly 91%. Also, we performed qualitative analysis of deceptive and truthful review datasets and considered their semantic differences. Finally, we successfully found some interesting features that existing methods were unable to identify.

KW - Deceptive opinion spam

KW - FrameNet

KW - Semantic analysis

UR - http://www.scopus.com/inward/record.url?scp=84958246343&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84958246343&partnerID=8YFLogxK

U2 - 10.1145/2806416.2806551

DO - 10.1145/2806416.2806551

M3 - Conference contribution

AN - SCOPUS:84958246343

SN - 9781450337946

VL - 19-23-Oct-2015

SP - 1131

EP - 1140

BT - International Conference on Information and Knowledge Management, Proceedings

PB - Association for Computing Machinery

ER -