Robust likelihood-based analysis of multivariate data with missing values

Roderick Little, Hyonggin An

Research output: Contribution to journal › Article

85 Citations (Scopus)

Abstract

The model-based approach to inference from multivariate data with missing values is reviewed. Regression prediction is most useful when the covariates are predictive of the missing values and the probability of being missing, and in these circumstances predictions are particularly sensitive to model misspecification. The use of penalized splines of the propensity score is proposed to yield robust model-based inference under the missing at random (MAR) assumption, assuming monotone missing data. Simulation comparisons with other methods suggest that the method works well in a wide range of populations, with little loss of efficiency relative to parametric models when the latter are correct. Extensions to more general patterns are outlined.
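The idea in the abstract, predicting missing values from a penalized spline of the estimated propensity score, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the spline basis, the knot placement, or how the smoothing parameter is estimated. The truncated linear basis, the fixed ridge penalty, and the function names `fit_logistic`, `spline_basis`, and `pspp_mean` are all assumptions made for illustration, and the sketch covers only a single incomplete variable, which is the simplest monotone pattern.

```python
import numpy as np


def fit_logistic(X, r, n_iter=25, ridge=1e-6):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        grad = X.T @ (r - p) - ridge * beta
        hess = (X * w[:, None]).T @ X + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def spline_basis(z, knots):
    """Truncated linear basis: [1, z, (z - k_1)_+, ..., (z - k_K)_+]."""
    trunc = np.maximum(z[:, None] - knots[None, :], 0.0)
    return np.column_stack([np.ones_like(z), z, trunc])


def pspp_mean(x, y, observed, n_knots=10, penalty=1.0):
    """Estimate the mean of y when y is missing wherever `observed` is False.

    Assumed steps (illustrative only):
    1. Estimate the response propensity from x by logistic regression.
    2. Regress observed y on a penalized spline of the logit propensity.
    3. Replace missing y by spline predictions and average.
    """
    X = np.column_stack([np.ones(len(x)), x])
    beta = fit_logistic(X, observed.astype(float))
    logit_p = X @ beta  # estimated propensity on the logit scale

    # Interior knots at quantiles of the logit propensity among respondents.
    probs = np.linspace(0.0, 1.0, n_knots + 2)[1:-1]
    knots = np.quantile(logit_p[observed], probs)
    B = spline_basis(logit_p, knots)

    # Ridge penalty on the truncated-term coefficients only; the penalty
    # value is fixed here, which is an assumption of this sketch.
    D = np.diag([0.0, 0.0] + [penalty] * n_knots)
    B_obs = B[observed]
    coef = np.linalg.solve(B_obs.T @ B_obs + D, B_obs.T @ y[observed])

    y_filled = np.where(observed, y, B @ coef)
    return y_filled.mean()


# Simulated MAR example: missingness depends on x only, and y is nonlinear in x.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = 1.0 + x + x**2 + rng.normal(size=n)
observed = rng.random(n) < 1.0 / (1.0 + np.exp(-x))
y_incomplete = np.where(observed, y, np.nan)

print("complete-case mean:", y_incomplete[observed].mean())
print("spline-of-propensity estimate:", pspp_mean(x, y_incomplete, observed))
print("full-data mean (unavailable in practice):", y.mean())
```

In this simulation the covariate predicts both the outcome and the probability of being missing, which is the setting the abstract singles out: the complete-case mean is biased, while the spline on the propensity score does not require the outcome model to be specified as a particular function of x.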

Original language: English
Pages (from-to): 949-968
Number of pages: 20
Journal: Statistica Sinica
Volume: 14
Issue number: 3
Publication status: Published - 2004 Jul 1
Externally published: Yes

Fingerprint

Missing Values
Multivariate Data
Likelihood
Model-based
Penalized Splines
Propensity Score
Missing at Random
Model Misspecification
Prediction
Relative Efficiency
Parametric Model
Missing Data
Covariates
Monotone
Regression
Range of data
Simulation
Inference

Keywords

  • Double robustness
  • Incomplete data
  • Penalized splines
  • Regression imputation
  • Weighting

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

Robust likelihood-based analysis of multivariate data with missing values. / Little, Roderick; An, Hyonggin.

In: Statistica Sinica, Vol. 14, No. 3, 01.07.2004, p. 949-968.


@article{03f538bc648842789189cb54029d2207,
title = "Robust likelihood-based analysis of multivariate data with missing values",
abstract = "The model-based approach to inference from multivariate data with missing values is reviewed. Regression prediction is most useful when the covariates are predictive of the missing values and the probability of being missing, and in these circumstances predictions are particularly sensitive to model misspecification. The use of penalized splines of the propensity score is proposed to yield robust model-based inference under the missing at random (MAR) assumption, assuming monotone missing data. Simulation comparisons with other methods suggest that the method works well in a wide range of populations, with little loss of efficiency relative to parametric models when the latter are correct. Extensions to more general patterns are outlined.",
keywords = "Double robustness, Incomplete data, Penalized splines, Regression imputation, Weighting",
author = "Roderick Little and Hyonggin An",
year = "2004",
month = "7",
day = "1",
language = "English",
volume = "14",
pages = "949--968",
journal = "Statistica Sinica",
issn = "1017-0405",
publisher = "Institute of Statistical Science",
number = "3",

}

TY - JOUR

T1 - Robust likelihood-based analysis of multivariate data with missing values

AU - Little, Roderick

AU - An, Hyonggin

PY - 2004/7/1

Y1 - 2004/7/1

N2 - The model-based approach to inference from multivariate data with missing values is reviewed. Regression prediction is most useful when the covariates are predictive of the missing values and the probability of being missing, and in these circumstances predictions are particularly sensitive to model misspecification. The use of penalized splines of the propensity score is proposed to yield robust model-based inference under the missing at random (MAR) assumption, assuming monotone missing data. Simulation comparisons with other methods suggest that the method works well in a wide range of populations, with little loss of efficiency relative to parametric models when the latter are correct. Extensions to more general patterns are outlined.


KW - Double robustness

KW - Incomplete data

KW - Penalized splines

KW - Regression imputation

KW - Weighting

UR - http://www.scopus.com/inward/record.url?scp=8644254410&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=8644254410&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:8644254410

VL - 14

SP - 949

EP - 968

JO - Statistica Sinica

JF - Statistica Sinica

SN - 1017-0405

IS - 3

ER -