TY - JOUR
T1 - AI for Patents
T2 - A Novel Yet Effective and Efficient Framework for Patent Analysis
AU - Son, Junyoung
AU - Moon, Hyeonseok
AU - Lee, Jeongwoo
AU - Lee, Seolhwa
AU - Park, Chanjun
AU - Jung, Wonkyung
AU - Lim, Heuiseok
N1 - Funding Information:
This work was supported in part by the Ministry of Science and ICT (MSIT), South Korea, under the Information Technology Research Center (ITRC) Support Program, supervised by the Institute of Information and Communications Technology Planning and Evaluation (IITP), under Grant IITP-2018-0-01405; in part by the Institute of Information and Communications Technology Planning and Evaluation (IITP) Grant through the Korea Government (MSIT), a Neural-Symbolic Model for Knowledge Acquisition and Inference Techniques, under Grant 2020-0-00368; and in part by the Basic Science Research Program through the National Research Foundation of Korea (NRF), Ministry of Education, under Grant NRF-2021R1A6A1A03045425.
Publisher Copyright:
© 2013 IEEE.
PY - 2022
Y1 - 2022
AB - Patents provide inventors exclusive rights to their inventions by protecting their intellectual property rights. However, analyzing patent documents generally requires knowledge of various fields, considerable human labor, and expertise. Recent studies that aim to alleviate this problem analyze only the claims and abstract, neglecting the description, which contains the essential technical core. Moreover, few studies use a deep learning approach to handle the entire patent analysis process, including preprocessing, summarization, and key-phrase generation. Therefore, we propose a novel multi-stage framework that aids patent analysis by applying deep learning to the description part of the patent rather than to the abstract or claims. The framework comprises two stages: key-sentence extraction and key-phrase generation. Both stages are based on the T5 model, a transformer-based architecture that uses a text-to-text approach. To further improve the framework's performance, we employed two key factors: i) post-training the model on a patent-related raw corpus to strengthen its comprehension of the patent domain, and ii) using the TextRank algorithm for efficient training based on the priority score of each sentence. We verified that the framework's key-phrase generation method outperforms other extraction methods in both superficial and semantic evaluation. In addition, we demonstrated the validity and effectiveness of our methods through quantitative and qualitative analysis, showing their practical functionality. We also make a practical contribution to patent analysis by releasing the framework as a demo system.
KW - Deep learning
KW - Key-sentence extraction
KW - Keyword extraction
KW - Patent
KW - Patent analysis
KW - Post training
UR - http://www.scopus.com/inward/record.url?scp=85130817030&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2022.3176877
DO - 10.1109/ACCESS.2022.3176877
M3 - Article
AN - SCOPUS:85130817030
SN - 2169-3536
VL - 10
SP - 59205
EP - 59218
JO - IEEE Access
JF - IEEE Access
ER -