Robust model-based inference for incomplete data via penalized spline propensity prediction

Hyonggin An, Roderick J A Little

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Parametric model-based regression imputation is commonly applied to missing-data problems, but it is sensitive to misspecification of the imputation model. Little and An (2004) proposed a semiparametric approach called penalized spline propensity prediction (PSPP), in which the variable with missing values is modeled by a penalized spline (P-spline) of the response propensity score, which is the logit of the estimated probability of being missing given the observed variables. Variables other than the response propensity are included parametrically in the imputation model. However, they considered only point estimation based on single imputation with PSPP. We consider here three approaches to standard error estimation that incorporate the uncertainty due to non-response: (a) standard errors based on the asymptotic variance of the PSPP estimator, ignoring sampling error in estimating the response propensity; (b) standard errors based on the bootstrap method; and (c) multiple imputation-based standard errors using draws from the joint posterior predictive distribution of missing values under the PSPP model. Simulation studies suggest that the bootstrap and multiple imputation approaches yield good inferences under a range of simulation conditions, with multiple imputation showing some evidence of closer-to-nominal confidence interval coverage when the sample size is small.
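As an illustration of approach (c), the sketch below shows how multiple imputation-based standard errors are commonly obtained by pooling the estimates from several completed data sets. It assumes Rubin's standard combining rules; the abstract does not say which pooling formula the authors used, and the function name `rubin_combine` and the numbers are hypothetical, not taken from the paper.

```python
import math
import statistics


def rubin_combine(estimates, variances):
    """Pool M completed-data estimates using Rubin's combining rules.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    pooled = statistics.fmean(estimates)    # average of the M estimates
    within = statistics.fmean(variances)    # within-imputation variance
    between = statistics.variance(estimates)  # between-imputation variance (divisor M - 1)

    # Total variance: within + (1 + 1/M) * between
    total = within + (1.0 + 1.0 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical estimates of a mean from M = 5 imputed data sets.
estimates = [10.0, 12.0, 11.0, 9.0, 13.0]
variances = [1.0, 1.0, 1.0, 1.0, 1.0]
est, se = rubin_combine(estimates, variances)
print(est, se)
```

The between-imputation term is what carries the extra uncertainty due to non-response: if every imputed data set gave the same estimate, the pooled standard error would reduce to the complete-data standard error.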

Original language: English
Pages (from-to): 1718-1731
Number of pages: 14
Journal: Communications in Statistics: Simulation and Computation
Volume: 37
Issue number: 9
DOI: 10.1080/03610910802255840
Publication status: Published - 2008 Nov 1

Fingerprint

Penalized Splines
Incomplete Data
Splines
Imputation
Standard error
Multiple Imputation
Model-based
Prediction
Missing Values
Propensity Score
Non-response
Point Estimation
Predictive Distribution
Logit
Misspecification
Bootstrap Method
Asymptotic Variance
Error Estimation
Parametric Model
Posterior distribution

Keywords

  • Asymptotic variance
  • Bootstrap
  • Gibbs sampler
  • Missing data
  • Multiple imputation
  • Penalized spline
  • Response propensity

ASJC Scopus subject areas

  • Modelling and Simulation
  • Statistics and Probability

Cite this

Robust model-based inference for incomplete data via penalized spline propensity prediction. / An, Hyonggin; Little, Roderick J A.

In: Communications in Statistics: Simulation and Computation, Vol. 37, No. 9, 01.11.2008, p. 1718-1731.

@article{2c5fc17acbc7439f8dc646a58c887e19,
title = "Robust model-based inference for incomplete data via penalized spline propensity prediction",
abstract = "Parametric model-based regression imputation is commonly applied to missing-data problems, but it is sensitive to misspecification of the imputation model. Little and An (2004) proposed a semiparametric approach called penalized spline propensity prediction (PSPP), in which the variable with missing values is modeled by a penalized spline (P-spline) of the response propensity score, which is the logit of the estimated probability of being missing given the observed variables. Variables other than the response propensity are included parametrically in the imputation model. However, they considered only point estimation based on single imputation with PSPP. We consider here three approaches to standard error estimation that incorporate the uncertainty due to non-response: (a) standard errors based on the asymptotic variance of the PSPP estimator, ignoring sampling error in estimating the response propensity; (b) standard errors based on the bootstrap method; and (c) multiple imputation-based standard errors using draws from the joint posterior predictive distribution of missing values under the PSPP model. Simulation studies suggest that the bootstrap and multiple imputation approaches yield good inferences under a range of simulation conditions, with multiple imputation showing some evidence of closer-to-nominal confidence interval coverage when the sample size is small.",
keywords = "Asymptotic variance, Bootstrap, Gibbs sampler, Missing data, Multiple imputation, Penalized spline, Response propensity",
author = "An, Hyonggin and Little, {Roderick J. A.}",
year = "2008",
month = "11",
day = "1",
doi = "10.1080/03610910802255840",
language = "English",
volume = "37",
pages = "1718--1731",
journal = "Communications in Statistics Part B: Simulation and Computation",
issn = "0361-0918",
publisher = "Taylor and Francis Ltd.",
number = "9",
}

TY - JOUR

T1 - Robust model-based inference for incomplete data via penalized spline propensity prediction

AU - An, Hyonggin

AU - Little, Roderick J A

PY - 2008/11/1

Y1 - 2008/11/1

AB - Parametric model-based regression imputation is commonly applied to missing-data problems, but it is sensitive to misspecification of the imputation model. Little and An (2004) proposed a semiparametric approach called penalized spline propensity prediction (PSPP), in which the variable with missing values is modeled by a penalized spline (P-spline) of the response propensity score, which is the logit of the estimated probability of being missing given the observed variables. Variables other than the response propensity are included parametrically in the imputation model. However, they considered only point estimation based on single imputation with PSPP. We consider here three approaches to standard error estimation that incorporate the uncertainty due to non-response: (a) standard errors based on the asymptotic variance of the PSPP estimator, ignoring sampling error in estimating the response propensity; (b) standard errors based on the bootstrap method; and (c) multiple imputation-based standard errors using draws from the joint posterior predictive distribution of missing values under the PSPP model. Simulation studies suggest that the bootstrap and multiple imputation approaches yield good inferences under a range of simulation conditions, with multiple imputation showing some evidence of closer-to-nominal confidence interval coverage when the sample size is small.

KW - Asymptotic variance

KW - Bootstrap

KW - Gibbs sampler

KW - Missing data

KW - Multiple imputation

KW - Penalized spline

KW - Response propensity

UR - http://www.scopus.com/inward/record.url?scp=53249133182&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=53249133182&partnerID=8YFLogxK

U2 - 10.1080/03610910802255840

DO - 10.1080/03610910802255840

M3 - Article

VL - 37

SP - 1718

EP - 1731

JO - Communications in Statistics Part B: Simulation and Computation

JF - Communications in Statistics Part B: Simulation and Computation

SN - 0361-0918

IS - 9

ER -