### Abstract

The model-based approach to inference from multivariate data with missing values is reviewed. Regression prediction is most useful when the covariates are predictive of the missing values and the probability of being missing, and in these circumstances predictions are particularly sensitive to model misspecification. The use of penalized splines of the propensity score is proposed to yield robust model-based inference under the missing at random (MAR) assumption, assuming monotone missing data. Simulation comparisons with other methods suggest that the method works well in a wide range of populations, with little loss of efficiency relative to parametric models when the latter are correct. Extensions to more general patterns are outlined.
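
The idea of predicting missing values from a penalized spline of the estimated propensity score can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single covariate, one incomplete variable, a truncated-linear spline basis, and a fixed ridge penalty `lam` in place of a data-driven smoothing parameter. The function names (`fit_logistic`, `spline_basis`, `pspp_mean`) and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_logistic(X, r, n_iter=25):
    """Newton-Raphson logistic regression; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (r - p)
        hess = X.T @ (X * w[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta


def spline_basis(z, knots):
    """Truncated linear basis: [1, z, (z - k_1)_+, ..., (z - k_K)_+]."""
    trunc = np.maximum(z[:, None] - knots[None, :], 0.0)
    return np.column_stack([np.ones_like(z), z, trunc])


def pspp_mean(x, y, observed, n_knots=10, lam=1.0):
    """Estimate the mean of y, filling missing values with spline predictions.

    Assumption: fixed penalty `lam` on the truncated-line coefficients only
    (intercept and slope are unpenalized).
    """
    # Step 1: estimate the response propensity from the covariate.
    X = np.column_stack([np.ones_like(x), x])
    beta = fit_logistic(X, observed.astype(float))
    z = X @ beta  # logit of the estimated propensity

    # Step 2: penalized spline regression of y on z among the observed cases.
    knots = np.quantile(z[observed], np.linspace(0, 1, n_knots + 2)[1:-1])
    B = spline_basis(z, knots)
    penalty = np.diag([0.0, 0.0] + [lam] * n_knots)
    Bo = B[observed]
    coef = np.linalg.solve(Bo.T @ Bo + penalty, Bo.T @ y[observed])

    # Step 3: replace missing y with predictions and average.
    y_filled = np.where(observed, y, B @ coef)
    return y_filled.mean()


# Simulated MAR example (hypothetical data): missingness depends only on x.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y_full = 1.0 + 2.0 * x + x**2 + rng.normal(scale=0.5, size=n)
observed = rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-(0.5 + x)))
y = np.where(observed, y_full, np.nan)

print("complete-case mean:", y_full[observed].mean())
print("spline-of-propensity mean:", pspp_mean(x, y, observed))
print("full-data mean:", y_full.mean())
```

In this sketch the outcome depends nonlinearly on the covariate, so a linear regression of `y` on `x` would be misspecified, whereas the spline on the propensity scale can adapt to the curvature.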

| Original language | English |
|---|---|
| Pages (from-to) | 949-968 |
| Number of pages | 20 |
| Journal | Statistica Sinica |
| Volume | 14 |
| Issue number | 3 |
| Publication status | Published - 2004 Jul 1 |
| Externally published | Yes |

### Keywords

- Double robustness
- Incomplete data
- Penalized splines
- Regression imputation
- Weighting

### ASJC Scopus subject areas

- Statistics and Probability
- Statistics, Probability and Uncertainty

### Cite this

Little, R., & An, H. (2004). Robust likelihood-based analysis of multivariate data with missing values. *Statistica Sinica*, *14*(3), 949-968.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Robust likelihood-based analysis of multivariate data with missing values

AU - Little, Roderick

AU - An, Hyonggin

PY - 2004/7/1

Y1 - 2004/7/1

KW - Double robustness

KW - Incomplete data

KW - Penalized splines

KW - Regression imputation

KW - Weighting

UR - http://www.scopus.com/inward/record.url?scp=8644254410&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=8644254410&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:8644254410

VL - 14

SP - 949

EP - 968

JO - Statistica Sinica

JF - Statistica Sinica

SN - 1017-0405

IS - 3

ER -