TY - JOUR
T1 - Not from Scratch
T2 - Predicting Thermophysical Properties through Model-Based Transfer Learning Using Graph Convolutional Networks
AU - Hormazabal, Rodrigo S.
AU - Kang, Jeong Won
AU - Park, Kiho
AU - Yang, Dae Ryook
N1 - Funding Information:
This work is supported by the Korea University.
Publisher Copyright:
© 2022 American Chemical Society. All rights reserved.
PY - 2022/11/28
Y1 - 2022/11/28
N2 - In this study, a framework for the prediction of thermophysical properties based on transfer learning from existing estimation models is explored. The predictive capabilities of conventional group-contribution methods and traditional machine-learning approaches rely heavily on the availability of experimental datasets and their uncertainty. Through the use of a pretraining scheme, which leverages the knowledge established by other estimation methods, improved prediction models for thermophysical properties can be obtained after fine-tuning networks with more accurate experimental data. As our experiments show, for the case of critical properties of compounds, this pipeline not only improves the performance of the models on commonly found organic structures but can also help these models generalize to less explored areas of chemical space, where experimental data is scarce, such as inorganics and heavier organic compounds. Transfer learning from estimation models data also allows for graph-based deep learning models to create more flexible molecular features over a bigger chemical space, which leads to improved predictive capabilities and can give insights into the relationship between molecular structures and thermophysical properties. The generated molecular features can discriminate behavior discrepancy between isomers without the need of additional parameters. Also, this approach shows better robustness to outliers in experimental datasets.
UR - http://www.scopus.com/inward/record.url?scp=85141573701&partnerID=8YFLogxK
U2 - 10.1021/acs.jcim.2c00846
DO - 10.1021/acs.jcim.2c00846
M3 - Article
C2 - 36315416
AN - SCOPUS:85141573701
VL - 62
SP - 5411
EP - 5424
JO - Journal of Chemical Information and Modeling
JF - Journal of Chemical Information and Modeling
SN - 1549-9596
IS - 22
ER -