Noise removal using TF-IDF criterion for extracting patent keyword

Jongchan Kim, Dohan Choe, Gabjo Kim, Sangsung Park, Dong Sik Jang

Research output: Contribution to journal (article)

2 Citations (Scopus)

Abstract

These days, governments and enterprises analyze technology trends as part of their investment strategy and R&D planning. Technology trend analyses rely mainly on qualitative methods applied by experts. However, such methods are inefficient in terms of cost and time for large amounts of data. In this study, we quantitatively analyzed patent data using text mining with TF-IDF weights. Keywords and noise terms were classified using the TF-IDF weighting. In addition, we propose new criteria for removing noise more effectively, and we visualize the resulting keywords derived from the patent data using social network analysis (SNA).
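The abstract does not state the authors' removal criteria, so the following is only a minimal sketch of the general idea: weight terms with TF-IDF, then separate keywords from noise by a cutoff. It assumes the textbook weighting tf × log(N / df). The function names, the toy documents, the "best weight across documents" rule, and the threshold value are all hypothetical choices made for illustration, not the method proposed in the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def split_keywords_and_noise(docs, threshold):
    """Treat a term as a keyword if its best TF-IDF weight reaches threshold.

    The cutoff rule is an assumption for illustration only.
    """
    best = {}
    for doc_weights in tf_idf(docs):
        for term, weight in doc_weights.items():
            best[term] = max(best.get(term, 0.0), weight)
    keywords = {term for term, weight in best.items() if weight >= threshold}
    noise = set(best) - keywords
    return keywords, noise


# Toy stand-ins for tokenized patent abstracts (not data from the paper).
docs = [
    "the battery electrode comprises a lithium compound".split(),
    "the display panel comprises a pixel electrode".split(),
    "the method comprises charging a battery".split(),
]
keywords, noise = split_keywords_and_noise(docs, threshold=0.05)
```

In this toy corpus, terms that occur in every document get an inverse document frequency of log(1) = 0 and therefore fall below any positive threshold, which is the basic reason TF-IDF can flag boilerplate patent wording as noise.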

Original language: English
Pages (from-to): 61-69
Number of pages: 9
Journal: Advances in Intelligent Systems and Computing
Volume: 271
DOI: 10.1007/978-3-319-05527-5_7
Publication status: Published - 2014

Keywords

  • Extraction
  • Patent analysis
  • Text mining
  • TF-IDF

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Computer Science (all)

Cite this

Noise removal using TF-IDF criterion for extracting patent keyword. / Kim, Jongchan; Choe, Dohan; Kim, Gabjo; Park, Sangsung; Jang, Dong Sik.

In: Advances in Intelligent Systems and Computing, Vol. 271, 2014, p. 61-69.

@article{a2e372e96dd84c61acc87e75790d966b,
title = "Noise removal using TF-IDF criterion for extracting patent keyword",
abstract = "These days, governments and enterprises are analyzing trends in technology as a part of their investment strategy and R&D planning. Qualitative methods by experts are mainly used in technology trend analyses. However, such methods are inefficient in terms of cost and time for large amounts of data. In this study, we quantitatively analyzed patent data using text mining with TF-IDF used as weights. Keywords and noises were also classified using TF-IDF weighting. In addition, we propose new criteria for removing noises more effectively, and visualize the resulting keywords derived from patent data using social network analysis (SNA).",
keywords = "Extraction, Patent analysis, Text mining, TF-IDF",
author = "Jongchan Kim and Dohan Choe and Gabjo Kim and Sangsung Park and Jang, {Dong Sik}",
year = "2014",
doi = "10.1007/978-3-319-05527-5_7",
language = "English",
volume = "271",
pages = "61--69",
journal = "Advances in Intelligent Systems and Computing",
issn = "2194-5357",
publisher = "Springer Verlag",

}

TY  - JOUR
T1  - Noise removal using TF-IDF criterion for extracting patent keyword
AU  - Kim, Jongchan
AU  - Choe, Dohan
AU  - Kim, Gabjo
AU  - Park, Sangsung
AU  - Jang, Dong Sik
PY  - 2014
Y1  - 2014
AB  - These days, governments and enterprises are analyzing trends in technology as a part of their investment strategy and R&D planning. Qualitative methods by experts are mainly used in technology trend analyses. However, such methods are inefficient in terms of cost and time for large amounts of data. In this study, we quantitatively analyzed patent data using text mining with TF-IDF used as weights. Keywords and noises were also classified using TF-IDF weighting. In addition, we propose new criteria for removing noises more effectively, and visualize the resulting keywords derived from patent data using social network analysis (SNA).
KW  - Extraction
KW  - Patent analysis
KW  - Text mining
KW  - TF-IDF
UR  - http://www.scopus.com/inward/record.url?scp=84927618498&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84927618498&partnerID=8YFLogxK
U2  - 10.1007/978-3-319-05527-5_7
DO  - 10.1007/978-3-319-05527-5_7
M3  - Article
VL  - 271
SP  - 61
EP  - 69
JO  - Advances in Intelligent Systems and Computing
JF  - Advances in Intelligent Systems and Computing
SN  - 2194-5357
ER  -