Computational Methods for the Identification and Characterization of Non-Coding RNAs in Bacteria

by Alexander Herbig

Institution: Universität Tübingen
Year: 2015
Record ID: 1107983
Full text PDF: http://hdl.handle.net/10900/59390


In recent years, the complexity of even bacterial transcriptomes has become increasingly evident. The important role of so-called non-coding RNAs (ncRNAs), which do not encode proteins, is increasingly recognized, as they fulfill a variety of functions, such as the regulation of cellular processes or catalysis of other molecules. The characterization of an organism's ncRNA repertoire has therefore become an essential part of systems biology studies. In this context, novel high-throughput technologies for DNA and RNA sequencing allow genomes and transcriptomes to be investigated in unprecedented detail. These methodologies produce vast amounts of data that have to be analysed comparatively in order to elucidate variations between different organisms or environmental conditions. These tasks require efficient computational methods that integrate genomic and transcriptomic data from multiple data sets in an automated and reproducible manner. In addition, these approaches have to facilitate the genomic localization of ncRNA elements, their detailed annotation (e.g. with respect to promoter regions or transcription start sites), and their functional characterization, such as the prediction of their regulatory targets. In this dissertation I have made a number of contributions that address these challenges. The computer program nocoRNAc was developed, which predicts ncRNAs in bacterial genomes and characterizes them with respect to multiple properties, such as transcription start and end points, secondary structure, and potential interaction partners. nocoRNAc has been applied in the context of a comprehensive time-series expression study of the antibiotic-producing bacterium Streptomyces coelicolor, which was cultivated under different environmental conditions. During this study, the importance of ncRNAs as potential regulators became evident.
For the analysis of high-resolution genomic and transcriptomic data from multiple organisms, the SuperGenome concept was developed. The approach was applied to whole-genome alignment visualization and served as the basis for an algorithm for the comparative detection of transcription start sites in bacterial genomes using RNA-seq data. Its application to multiple strains of the human pathogen Campylobacter jejuni allowed for the global characterization of this organism's transcriptome and led to the detection of several novel ncRNAs, among them a previously uncharacterized CRISPR locus, which represents an adaptive bacterial immune system. Studying pathogens can also be of historical relevance. The emerging field of paleogenetics focuses on the reconstruction and analysis of genomes of ancient organisms whose DNA has been extracted from archaeological samples, such as bones. In this dissertation I present computational methods for the reconstruction and characterization of ancient bacterial genomes, which have been applied to study the evolution of Mycobacterium leprae, the bacterium that causes leprosy. Overall, the algorithms…