Topics in cancer genomics

by Sachet Ashok Shukla




Institution: Iowa State University
Department:
Year: 2014
Keywords: cancer; HLA typing; microRNA; mutation; network inference; Bioinformatics
Record ID: 2035879
Full text PDF: http://lib.dr.iastate.edu/etd/13796


http://lib.dr.iastate.edu/cgi/viewcontent.cgi?article=4803&context=etd


Abstract

Large-scale projects such as The Cancer Genome Atlas (TCGA) have generated extensive exome libraries across several disease types and populations. Detection of somatic changes in HLA genes by whole-exome sequencing (WES) has been complicated by the highly polymorphic nature of these loci. We developed a method, POLYSOLVER (POLYmorphic loci reSOLVER), for accurate inference of class I HLA-A, -B and -C alleles from WES data, and achieved 97% accuracy at protein-level resolution when it was applied to 133 HapMap samples of known HLA type. By applying POLYSOLVER in conjunction with somatic change detection tools to 2688 TCGA tumor/normal pairs that were previously analyzed by conventional approaches, we re-discovered 37 of 56 (66%) HLA mutations, while further identifying 23 new events. An analysis of WES data from a larger set of 3768 tumor/normal pairs by POLYSOLVER revealed 131 class I mutations with an enrichment for potentially loss-of-function events. Overall, 3% of samples had at least one HLA event, and 95 of the 131 mutations fell in the T cell-interacting and peptide-binding domains. Recurrent hotspot sites of missense, nonsense and splice site mutations were discovered that suggest positive selection and support immune evasion as an important pathway in cancer.

Exome sequencing has also revealed a large number of shared and personal somatic mutations across human cancers. In principle, any genetic alteration affecting a protein-coding region has the potential to generate mutated peptides that are presented by surface HLA class I proteins and might be recognized by cytotoxic T cells. Utilizing POLYSOLVER in conjunction with knowledge of mutations in other genetic loci inferred from exome data, we developed a pipeline for the prediction and validation of such neoantigens derived from individual tumors and presented by patient-specific alleles of the HLA proteins.
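As an illustration of how a single missense mutation can give rise to several candidate peptides, the following minimal Python sketch enumerates every peptide window that spans a substituted residue. It is not the pipeline described in this abstract: the function name `mutant_peptides`, the toy protein sequence, and the choice of 9- and 10-residue windows are assumptions made for the example, and the HLA-binding prediction step that would follow is deliberately left out.

```python
def mutant_peptides(protein, pos, alt, lengths=(9, 10)):
    """Enumerate peptides from the mutated protein that contain the changed residue.

    protein : wild-type amino-acid sequence (string)
    pos     : 0-based index of the substituted residue
    alt     : the mutant amino acid
    lengths : peptide lengths to tile (9 and 10 are assumed here, not taken
              from the abstract)
    """
    if not 0 <= pos < len(protein):
        raise ValueError("pos is outside the protein sequence")

    # Apply the missense substitution.
    mutated = protein[:pos] + alt + protein[pos + 1:]

    peptides = []
    for k in lengths:
        # A window of length k covers `pos` when start <= pos <= start + k - 1,
        # and it must also fit inside the protein.
        first = max(0, pos - k + 1)
        last = min(pos, len(mutated) - k)
        for start in range(first, last + 1):
            peptides.append(mutated[start:start + k])
    return peptides


# Toy example: a made-up 20-residue protein with M -> A at index 10.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
candidates = mutant_peptides(toy_protein, pos=10, alt="A")
print(len(candidates), candidates[0])
```

Each returned peptide would then be scored against the patient's inferred HLA alleles with a binding-affinity predictor of the reader's choice. Because the windows overlap, one mutation yields several candidates, so the number of peptides to score is larger than the number of mutations.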
We applied our computational pipeline to 91 chronic lymphocytic leukemias (CLL) that had undergone whole-exome sequencing. We predicted ~22 mutated HLA-binding peptides per leukemia (derived from ~16 missense mutations), and experimentally confirmed HLA binding for ~55% of such peptides. Finally, we computationally predicted HLA-binding peptides arising from missense or frameshift mutations for several cancer types and found dozens to thousands of predicted neoantigens per individual tumor, suggesting that neoantigens are frequent in most tumors. The neoantigen prediction pipeline can also identify the neoantigens unique to a particular cancer patient and help in the design of personalized immune vaccines.

MicroRNAs (miRs) are a class of small non-coding RNAs that regulate gene expression by promoting mRNA degradation or by inhibiting mRNA translation. Context Likelihood of Relatedness (CLR) is a genetic network reconstruction method that considers the local network context in assessing the significance of connections while also allowing for detection of non-linear associations. Leveraging TCGA multidimensional data in glioblastoma, we inferred the putative regulatory network…
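The CLR idea mentioned above can be sketched in a few lines of Python. The version below follows the commonly described form of the method: pairwise mutual information is converted to a z-score against each gene's own background distribution, negative z-scores are clipped to zero, and the two z-scores for a pair are combined as the square root of the sum of their squares. This is a sketch under stated assumptions, not the thesis implementation: the histogram-based mutual information estimator, the bin count, the exclusion of the diagonal from the background, and all function names are choices made for this example, and the data are synthetic.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram estimate of mutual information (in nats) between two vectors."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def mi_matrix(data, bins=8):
    """Symmetric matrix of pairwise mutual information; rows of `data` are genes."""
    n = data.shape[0]
    mi = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            mi[i, j] = mi[j, i] = mutual_information(data[i], data[j], bins)
    return mi


def clr_scores(mi):
    """Combine row-wise and column-wise z-scores of a symmetric MI matrix."""
    mi = np.asarray(mi, dtype=float)
    n = mi.shape[0]
    off_diag = ~np.eye(n, dtype=bool)

    # Background distribution of each gene: its MI with every other gene.
    mean = np.array([mi[i, off_diag[i]].mean() for i in range(n)])
    std = np.array([mi[i, off_diag[i]].std() for i in range(n)])
    std[std == 0] = 1.0                   # guard against division by zero

    # z[i, j]: how unusual MI(i, j) is relative to gene i's background.
    z = np.maximum(0.0, (mi - mean[:, None]) / std[:, None])

    # z.T[i, j] is the same pair judged against gene j's background.
    scores = np.sqrt(z ** 2 + z.T ** 2)
    np.fill_diagonal(scores, 0.0)
    return scores


# Synthetic data: gene 1 depends on gene 0 non-linearly, the rest are independent.
rng = np.random.default_rng(0)
expr = rng.uniform(-1.0, 1.0, size=(6, 500))
expr[1] = expr[0] ** 2 + 0.05 * rng.normal(size=500)

scores = clr_scores(mi_matrix(expr))
print(np.round(scores, 2))
```

In this toy setting the relationship between the first two rows is quadratic, which is the kind of non-linear association a mutual-information-based score is meant to pick up. Judging each pair against both genes' backgrounds is what makes the score context-dependent: a pair only scores highly if its mutual information stands out for both partners.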