Organization and evolution of transcription factor occupancy in the human genome

by Jeffrey David Vierstra

Institution: University of Washington
Department:
Degree: PhD
Year: 2014
Keywords: chromatin; evolution; gene regulation; transcription factor; Genetics
Record ID: 2029673
Full text PDF: http://hdl.handle.net/1773/26140


Abstract

<italic>Cis</italic>-regulatory DNA encodes the circuitry that enables cell development and differentiation. It is densely populated by recognition sequences for transcription factors (TFs), and the cooperative binding of TFs to these sequences determines cell fate and function through the precise transcriptional regulation of their cognate genes. As such, a mechanistic understanding of gene regulation hinges on our ability to quantify TF occupancy. To map TF occupancy within the human genome, I took part in the development of digital genomic footprinting, a technique that leverages the endonuclease DNase I to enable the unbiased and simultaneous detection of TF occupancy genome-wide. We applied digital genomic footprinting to 41 diverse cell and tissue types to comprehensively map the human <italic>cis</italic>-regulatory lexicon. We show that this small genomic compartment contains an expansive repertoire of conserved recognition sequences for DNA-binding proteins, and that nuclease cleavage patterns within these sequences mirror nucleotide-level evolutionary conservation and track the crystallographic topography of protein-DNA interfaces. We also show that both genetic and epigenetic variants affecting chromatin states are concentrated within footprints. Finally, we describe a large collection of novel regulatory factor recognition motifs that are highly conserved in both sequence and function, and that exhibit cell-selective occupancy patterns that closely parallel major regulators of development, differentiation and pluripotency. These results provide, for the first time, an exhaustive map of TF occupancy within the human genome.
The architecture of individual <italic>cis</italic>-regulatory sites is critical for their function. While digital genomic footprinting provides rich information about the occupancy of TFs within individual <italic>cis</italic>-regulatory elements, it is currently not possible to resolve the genome-wide relationship between TFs and nucleosomes. To address this deficiency, I developed an extension to digital genomic footprinting that couples the detection of individual TF footprints to nucleosome occupancy. We find that TF occupancy is the major determinant of the positioning of <italic>cis</italic>-regulatory proximal nucleosomes, and that the positioning and occupancy of promoter-associated nucleosomes are related to transcription start site selection and output. The approach we describe provides a new view of the structure of <italic>cis</italic>-regulatory chromatin.
In the second part of this thesis, I used a comparative genomics approach to study the evolution of <italic>cis</italic>-regulatory DNA and protein occupancy. To do this, I mapped DNase I hypersensitive sites (DHSs) in 45 mouse cell types and primary tissues and systematically compared these with human DHS maps from orthologous cell and tissue compartments. While I uncovered a small set of core regulatory sequences that encode a…