Imputation of Missing Covariate Data Prior to Propensity Score Analysis: A Tutorial and Evaluation of the Robustness of Practical Approaches

Leite W. L., AYDIN B., Cetin-Berber D. D.

EVALUATION REVIEW, 2021 (Journal Indexed in SSCI)

  • Publication Type: Article
  • Volume:
  • Publication Date: 2021
  • DOI Number: 10.1177/0193841x211020245
  • Title of Journal: EVALUATION REVIEW


Background: Propensity score analysis (PSA) is a popular method to remove selection bias due to covariates in quasi-experimental designs, but it requires handling of missing data on covariates before propensity scores are estimated. Multiple imputation (MI) and single imputation (SI) are approaches to handle missing data in PSA.

Objectives: The objectives of this study are to review MI-within, MI-across, and SI approaches to handle missing data on covariates prior to PSA, investigate the robustness of MI-across and SI with a Monte Carlo simulation study, and demonstrate the analysis of missing data and PSA with a step-by-step illustrative example.

Research design: The Monte Carlo simulation study compared strategies to impute missing data in continuous and categorical covariates for estimation of propensity scores. Manipulated conditions included sample size, the number of covariates, the size of the treatment effect, missing data mechanism, and percentage of missing data. Imputation strategies included MI-across and SI by joint modeling or multivariate imputation by chained equations (MICE).

Results: The results indicated that the MI-across method performed well, and SI also performed adequately with smaller percentages of missing data. The illustrative example demonstrated MI and SI, propensity score estimation, calculation of propensity score weights, covariate balance evaluation, estimation of the average treatment effect on the treated, and sensitivity analysis using data from the National Longitudinal Survey of Youth.
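The workflow named in the abstract (imputation, propensity score estimation, weighting, balance evaluation, and estimation of the average treatment effect on the treated) can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the authors' procedure: the data are simulated rather than taken from the National Longitudinal Survey of Youth, the imputation is a simple stochastic regression draw standing in for joint modeling or MICE, and "MI-across" is assumed to mean averaging each unit's propensity score over the imputed data sets before a single effect estimate is computed. All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (an assumption for illustration): two covariates,
# treatment depends on both, and the true treatment effect is 1.0.
n = 4000
x_full = rng.normal(size=(n, 2))
p_true = 1 / (1 + np.exp(-(0.5 * x_full[:, 0] + 0.5 * x_full[:, 1])))
z = rng.binomial(1, p_true)
y = 1.0 * z + x_full[:, 0] + x_full[:, 1] + rng.normal(size=n)

# Make 20% of the second covariate missing completely at random.
x_obs = x_full.copy()
x_obs[rng.random(n) < 0.2, 1] = np.nan


def impute_once(x_obs, z):
    """Stochastic regression imputation of covariate 2 from covariate 1 and
    treatment. A simplified stand-in for MICE or joint modeling."""
    x = x_obs.copy()
    miss = np.isnan(x[:, 1])
    design = np.column_stack([np.ones(len(z)), x[:, 0], z])
    coef, *_ = np.linalg.lstsq(design[~miss], x[~miss, 1], rcond=None)
    resid_sd = np.std(x[~miss, 1] - design[~miss] @ coef)
    x[miss, 1] = design[miss] @ coef + rng.normal(scale=resid_sd, size=miss.sum())
    return x


def fit_propensity(x, z, iters=25):
    """Logistic regression fitted by Newton-Raphson; returns fitted scores."""
    design = np.column_stack([np.ones(len(x)), x])
    beta = np.zeros(design.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-design @ beta))
        grad = design.T @ (z - p)
        hess = design.T @ (design * (p * (1 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return 1 / (1 + np.exp(-design @ beta))


# MI-across (as assumed here): average propensity scores over m imputations.
m = 5
ps = np.mean([fit_propensity(impute_once(x_obs, z), z) for _ in range(m)], axis=0)

# ATT weights: 1 for treated units, odds of treatment for comparison units.
w = np.where(z == 1, 1.0, ps / (1 - ps))
treated, control = z == 1, z == 0


def smd(x, weights):
    """Standardized mean difference on one covariate, scaled by the treated SD."""
    diff = x[treated].mean() - np.average(x[control], weights=weights[control])
    return diff / x[treated].std()


# Balance on the fully observed covariate, before and after weighting.
smd_raw = smd(x_obs[:, 0], np.ones(n))
smd_w = smd(x_obs[:, 0], w)

# Unadjusted difference in means versus the weighted ATT estimate.
naive = y[treated].mean() - y[control].mean()
att = y[treated].mean() - np.average(y[control], weights=w[control])

print(f"SMD before: {smd_raw:.3f}, after: {smd_w:.3f}")
print(f"naive: {naive:.3f}, weighted ATT: {att:.3f}")
```

Setting m = 1 in the sketch gives a single-imputation analogue. In applied work a dedicated imputation package and a sensitivity analysis, as described in the abstract, would replace the simplified steps used here.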