Statistical methods in pedigree analysis and the reliability of pedigree data

by Gregory Mark Lathrop




Institution: University of Washington
Department:
Year: 1980
Keywords: Biostatistics
Record ID: 1508274
Full text PDF: http://hdl.handle.net/1773/9606


Abstract

This dissertation consists of three parts. In Chapter 1 a method is presented which allows the estimation of pedigree error from genetic marker data. The method uses a parametric model that incorporates several forms of inaccuracy that may be introduced into a putative pedigree as a result of errors in the collection of data, errors in record linkage, and non-paternity. Errors in the marker data are allowed for by the introduction of field error, the assignment of a marker sample to an incorrect individual, and system typing error, in which a laboratory or recording error occurs in the assignment of the phenotype at a single system. Estimates of error rates are made using a modification of an algorithm due to Elston and Stewart (1971) for general genetic models. Parameter estimates are used to classify family or pedigree units containing a parent-offspring exclusion by the most probable form of error they contain. The methods are applied to data from the Tokelau Islands.

Chapter 2 contains an analysis of the bias introduced by pedigree error into the maximum likelihood estimation of components of variance of a polygenic trait. Paternal (or maternal) pedigree reliability is defined to be the probability of no error occurring within a paternal (or maternal) parent-offspring link. For a simple polygenic trait it is shown that for some fixed cluster designs the bias in the estimation of both the additive and dominance components can be considerably greater than 1 - R, where R is either the paternal or maternal reliability. In nuclear families, the bias in the estimate of the additive component is of the order of 1 - R, but the bias in the estimate of the dominance component remains larger.

Chapter 3 provides a method of fitting the polygenic model to large pedigree data sets via the recursive computation of the partial derivatives needed to implement a scoring algorithm.
The direct implementation of the scoring algorithm requires on the order of n^3 operations in each iteration, where n is the size of the pedigree, and hence is inefficient or not feasible for large data sets. The algorithm given here requires on the order of n i^2 (1 + v) operations, where i is the average sibship size and v is the squared coefficient of variation in sibship size. Linear transformations of the data are used to form successive sets of variables which contain information that can be immediately cumulated into terms for the calculation of the scores and information matrix. The algorithm is similar to that given in Elston and Stewart (1971). However, the Elston-Stewart algorithm cannot be used to directly calculate the scores and information matrix, and general numerical methods must be used to determine the parameter estimates. The algorithm developed here combines the computational advantages of both previous methods.
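
As a rough illustration of the two operation counts quoted above, the following Python sketch evaluates n^3 and n i^2 (1 + v) for a given pedigree size and list of sibship sizes. It is not taken from the dissertation: the function name, the example sibship sizes, and the use of the population variance to compute v are assumptions made for illustration, and constant factors are ignored, so only the orders of magnitude are meaningful.

```python
from statistics import mean, pvariance


def operation_counts(n, sibship_sizes):
    """Return (direct, recursive) order-of-magnitude operation counts.

    n             -- pedigree size (number of individuals)
    sibship_sizes -- hypothetical list of sibship sizes in the pedigree

    Assumption: v is the population variance of sibship size divided by
    the squared mean, i.e. the squared coefficient of variation.
    """
    i_bar = mean(sibship_sizes)                 # average sibship size i
    v = pvariance(sibship_sizes) / i_bar ** 2   # squared coefficient of variation
    direct = n ** 3                             # direct scoring: order n^3
    recursive = n * i_bar ** 2 * (1 + v)        # recursive scheme: order n i^2 (1 + v)
    return direct, recursive


if __name__ == "__main__":
    # Hypothetical pedigree of 1000 individuals with small sibships.
    direct, recursive = operation_counts(1000, [2, 3, 4, 3])
    print(f"direct: {direct:.3g}, recursive: {recursive:.3g}")
```

Since i^2 (1 + v) equals the mean of the squared sibship sizes, the second count grows linearly in n for a fixed sibship-size distribution, whereas the direct count grows cubically.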