
Semi-parametric analysis of failure time data from case-control family studies on candidate genes

by Lu Chen




Institution: University of Washington
Department:
Degree: PhD
Year: 2004
Keywords: Biostatistics
Record ID: 1755366
Full text PDF: http://hdl.handle.net/1773/9573


Abstract

Population-based case-control family studies are often conducted to investigate the increased risk of developing cancer that is associated with mutations in candidate genes. In these studies, families are sampled retrospectively through the cancer outcomes of the case-control probands, such as cancer status and age at diagnosis. The outcomes of family members can be viewed as failure time outcomes, and they are often dependent among family members beyond what the measured covariates explain. Analysis of failure time data from case-control family studies therefore needs to account for both the family ascertainment scheme and the possible residual dependence among family members. In this dissertation, we propose using a gamma frailty to account for the residual dependence among family members, while leaving the baseline hazard in the conditional proportional hazards model completely unspecified. We develop several semi-parametric approaches for simultaneous estimation of the regression parameters and the dependence parameter from case-control family data: the two-stage approach, which estimates the dependence parameter at the first stage and then estimates the regression parameters at the second stage; the expectation-conditional-maximization (ECM) approach, which iterates between calculating the expectation of frailty-related terms and estimating all the parameters via conditional maximization of a likelihood; and the iterative two-stage approach, which alternates between estimation of the dependence parameter and estimation of the regression parameters. Simulation studies show that the ECM approach is the most efficient, whereas the two-stage approach is the least efficient. The ECM approach is then further extended to accommodate missing genotypes in the relatives. The extended method appears to have reasonable finite-sample performance. Finally, we illustrate the ECM approach using a real data set on breast cancer and BRCA genes, where the genotypes are missing in the relatives by study design.
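
As a rough illustration of the shared gamma frailty idea described in the abstract, the following Python sketch simulates failure times for one family and computes the expected frailty given that family's outcomes, the kind of frailty-related expectation an EM-type algorithm would use. It is not the dissertation's method. The function names, the constant baseline hazard `baseline_rate`, the single binary genotype covariate, and the censoring age are all assumptions made for illustration. In particular, the dissertation leaves the baseline hazard unspecified, and the formula below assumes prospective sampling, so it ignores the case-control ascertainment correction.

```python
import math
import random


def simulate_family(rng, theta, beta, genotypes, baseline_rate=0.01, censor_age=80.0):
    """Simulate (time, event) pairs for one family under a shared gamma frailty.

    Assumption: the baseline hazard is constant in age (baseline_rate).
    """
    # Frailty with mean 1 and variance theta: shape 1/theta, scale theta.
    omega = rng.gammavariate(1.0 / theta, theta)
    outcomes = []
    for g in genotypes:
        # Conditional proportional hazards: omega * baseline * exp(beta * g).
        rate = omega * baseline_rate * math.exp(beta * g)
        t = rng.expovariate(rate)
        # Administrative censoring at censor_age (hypothetical value).
        outcomes.append((min(t, censor_age), t <= censor_age))
    return omega, outcomes


def frailty_posterior_mean(theta, beta, genotypes, outcomes, baseline_rate=0.01):
    """E[omega | family data], assuming prospective sampling.

    With a Gamma(shape=1/theta, rate=1/theta) frailty, the posterior is
    Gamma(1/theta + D, 1/theta + H), where D is the number of events and
    H is the summed cumulative hazard; no ascertainment correction is applied.
    """
    events = sum(1 for _, observed in outcomes if observed)
    cum_hazard = sum(
        baseline_rate * t * math.exp(beta * g)
        for g, (t, _) in zip(genotypes, outcomes)
    )
    return (1.0 / theta + events) / (1.0 / theta + cum_hazard)


# Example usage with made-up parameter values.
rng = random.Random(2004)
genotypes = [1, 0, 0, 1]  # hypothetical carrier indicators for four relatives
omega, outcomes = simulate_family(rng, theta=0.5, beta=1.0, genotypes=genotypes)
expected_omega = frailty_posterior_mean(0.5, 1.0, genotypes, outcomes)
```

A family with more events than its covariates and follow-up time would predict gets an expected frailty above 1, and a family with fewer gets one below 1. This is how the frailty absorbs residual dependence among relatives.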