Abstract
This paper addresses the use of machine learning methods for causal estimation of treatment effects from observational data. Although randomized experimental trials are the gold standard for revealing potential causal relationships, observational studies are another rich source for investigating exposure effects, for example in research on the comparative effectiveness and safety of treatments, where the causal effect can be identified if the covariates contain all confounding variables. In this context, statistical regression models are often imposed for the expected outcome and for the probability of treatment, and the two can be combined to yield more efficient and robust causal estimators. Recently, targeted maximum likelihood estimation and causal random forests have been proposed and extensively studied as ways to use data-adaptive regression in the estimation of causal parameters. Machine learning methods are a natural choice in these settings to improve the quality of the final estimate of the treatment effect. We explore how the design and training of several machine learning algorithms can be adapted for causal inference, and we study their finite-sample performance through simulation experiments under various scenarios. An application to percutaneous coronary intervention (PCI) data shows that these adaptations can improve on simple linear regression-based methods.
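As an illustration of how an outcome model and a treatment (propensity score) model can be combined, the sketch below computes an augmented inverse probability weighted (AIPW) estimate of the average causal effect on simulated data. It is not taken from the paper: the data-generating process, the single covariate, the true effect of 2.0, and the helper names `fit_ols`, `fit_logistic`, and `aipw_ate` are all assumptions made for this example, and plain parametric regressions stand in for the machine learning methods the paper studies.

```python
import math
import random


def fit_ols(x, y):
    """One-covariate least squares; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_logistic(x, a, steps=25):
    """One-covariate logistic regression fitted by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ai in zip(x, a):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += ai - p            # gradient, intercept
            g1 += (ai - p) * xi     # gradient, slope
            h00 += w                # entries of X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def aipw_ate(x, a, y):
    """AIPW (doubly robust) estimate of the average causal effect."""
    # Outcome regressions fitted separately in each treatment arm.
    c1 = fit_ols([xi for xi, ai in zip(x, a) if ai == 1],
                 [yi for yi, ai in zip(y, a) if ai == 1])
    c0 = fit_ols([xi for xi, ai in zip(x, a) if ai == 0],
                 [yi for yi, ai in zip(y, a) if ai == 0])
    # Propensity score model for P(A = 1 | X).
    b0, b1 = fit_logistic(x, a)
    total = 0.0
    for xi, ai, yi in zip(x, a, y):
        e = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        m1 = c1[0] + c1[1] * xi
        m0 = c0[0] + c0[1] * xi
        # Regression prediction plus inverse-probability-weighted residuals.
        total += (m1 - m0
                  + ai * (yi - m1) / e
                  - (1 - ai) * (yi - m0) / (1.0 - e))
    return total / len(x)


# Simulated observational data (assumed setup): X confounds A and Y,
# and the true average causal effect is 2.0.
rng = random.Random(0)
n = 2000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
a = [1 if rng.random() < 1.0 / (1.0 + math.exp(-0.8 * xi)) else 0 for xi in x]
y = [1.0 + 2.0 * ai + 1.5 * xi + rng.gauss(0.0, 1.0) for xi, ai in zip(x, a)]

treated = [yi for yi, ai in zip(y, a) if ai == 1]
control = [yi for yi, ai in zip(y, a) if ai == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)
ate_hat = aipw_ate(x, a, y)
print("naive difference in means:", round(naive, 3))
print("AIPW estimate:", round(ate_hat, 3))
```

In this setup the naive difference in means is confounded by the covariate, while the AIPW estimate remains consistent if either the outcome model or the propensity model is correctly specified. Replacing `fit_ols` and `fit_logistic` with data-adaptive learners is the kind of adaptation the abstract refers to, although the paper's specific procedures may differ from this sketch.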
Original language | English |
---|---|
Pages (from-to) | 177-191 |
Number of pages | 15 |
Journal | Communications for Statistical Applications and Methods |
Volume | 29 |
Issue number | 2 |
Publication status | Published - 2022 Mar |
Keywords
- Average causal effect
- doubly-robust estimation
- inverse probability weighting
- propensity score
- random forest
- targeted learning
ASJC Scopus subject areas
- Statistics and Probability
- Modelling and Simulation
- Finance
- Statistics, Probability and Uncertainty
- Applied Mathematics