
Robust and Efficient Methods for Bayesian Finite Population Inference.

by Xi Xia




Institution: University of Michigan
Department: Biostatistics
Degree: PhD
Year: 2015
Keywords: Finite Population Bayesian Inference; Weight Pooling; Weight Trimming; Laplace Prior; Weighted Dirichlet Process Mixture Model; Statistics and Numeric Data; Science
Record ID: 2062615
Full text PDF: http://hdl.handle.net/2027.42/111372


Abstract

Bayesian model-based approaches provide data-driven estimates of population quantities of interest from complex survey data, balancing bias correction against efficiency. We focus on accommodating sample weights equal to the inverse of the probabilities of inclusion. In settings with highly variable weights, weight "trimming" is often employed in an ad hoc manner to decrease variance, possibly at the cost of increased bias. We consider three model-based methods that provide principled bias-variance tradeoffs. Weighted estimators can be developed in a model-based framework by including interactions between the quantity of interest and the weights; weight pooling builds a variable selection model that drops interactions on various weight values, and estimation proceeds using the posterior distribution of model averages. Our extension is a weight pooling linear spline model that uses a linear spline to capture regression coefficient patterns across all strata and collapses together strata with minor differences. This model achieves robustness when weights are needed to guard against model misspecification, and efficiency when weight-coefficient interactions can be ignored. We also model interactions between the weights and the estimators of interest as random effects, reducing overall root mean squared error (RMSE) by shrinking interactions toward zero when such shrinkage is supported by the data. We adapt a flexible Laplace prior distribution to gain robustness against model misspecification. We find that, in the linear model setting, weight smoothing models with Laplace priors approximate unweighted estimates when weighting is not necessary and can greatly reduce the RMSE when a strong pattern exists in the data. Under logistic regression with the same sample size, the estimates remain robust but show less gain in efficiency.

Finally, we adapt a Dirichlet process mixture (DPM) model that can approximate highly skewed and multimodal distributions, often with few components. The extended weighted DPM version defines the DP prior as a mixture of DP random basis measures that is a function of covariates, extends applications to regression, and creates a natural link to survey weights. We also investigate its application as a new approach to quantile regression inference under complex survey designs. Simulation results suggest a substantial reduction in RMSE from the weighted DPM method under most scenarios.
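
As a point of reference for the ad hoc weight trimming that the abstract contrasts with model-based methods, the following is a minimal sketch of an inverse-probability-weighted mean and a capped-weight version of it. It is not the thesis's method: the function names, the toy data, and the cap value are all hypothetical, and the rescaling rule is only one simple choice among several used in practice.

```python
# Illustrative sketch only: design-weighted mean vs. an ad hoc trimmed version.
# All names, data, and the cap are assumptions made for this example.

def inverse_probability_weights(inclusion_probs):
    """Sample weights equal to the inverse of the inclusion probabilities."""
    return [1.0 / p for p in inclusion_probs]

def weighted_mean(y, w):
    """Ratio-type (Hajek) weighted mean: sum(w * y) / sum(w)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def trim_weights(w, cap):
    """Cap each weight at `cap`, then rescale so the weights keep their
    original total. After rescaling, some weights may exceed the cap;
    this single-pass rule is a simplification."""
    capped = [min(wi, cap) for wi in w]
    scale = sum(w) / sum(capped)
    return [c * scale for c in capped]

# Toy sample: the third unit has a small inclusion probability,
# hence a very large weight, and also an extreme outcome.
y = [1.0, 2.0, 10.0]
inclusion_probs = [0.5, 0.5, 0.05]

w = inverse_probability_weights(inclusion_probs)   # approximately [2, 2, 20]
w_trimmed = trim_weights(w, cap=5.0)

unweighted = sum(y) / len(y)
fully_weighted = weighted_mean(y, w)
trimmed = weighted_mean(y, w_trimmed)

print(unweighted, trimmed, fully_weighted)
```

In this toy sample the trimmed estimate falls between the unweighted and fully weighted estimates, which is the bias-variance compromise that trimming makes by fiat; the model-based methods in the abstract aim to let the data determine how far to move between those two extremes.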