Network-based contextualisation of LC-MS/MS proteomics data

by Armin Guntram Geiger




Institution: Stellenbosch University
Department: Institute for Wine Biotechnology
Degree: MSc
Year: 2014
Keywords: Wine biotechnology; Dissertations – Wine biotechnology
Record ID: 1475460
Full text PDF: http://hdl.handle.net/10019.1/96116


Abstract

ENGLISH ABSTRACT: This thesis explores the use of networks as a means to visualise, interpret and mine MS-based proteomics data. A network-based approach was applied to a quantitative, cross-species LC-MS/MS dataset derived from two yeast species, namely Saccharomyces cerevisiae strain VIN13 and Saccharomyces paradoxus strain RO88. To identify and quantify proteins from the mass spectra, a workflow consisting of both custom-built and existing programs was assembled. Networks that place the identified proteins in several biological contexts were then constructed. The contexts included sequence similarity to other proteins, ontological descriptions, protein-protein interactions, metabolic pathways and cellular location. The contextual, network-based representations of the proteins proved effective for identifying trends and patterns in the data that might otherwise have been obscured. Moreover, by bringing the experimentally derived data together with multiple extant biological resources, the networks presented the data in a manner that better reflects the interconnected biological system from which the samples were derived. Both existing and new hypotheses were investigated, based on proteins relating to the yeast cell wall and proteins of putative oenological potential. These proteins were examined in light of their differential expression between the two yeast species. The cell wall proteins examined included GGP1 and SCW4, and the proteins with putative oenological potential included haze protection factor proteins such as HPF2. Differences in capacity for maloethanolic fermentation between the two strains were also investigated in light of the protein data. Finally, the network-based representations allowed new hypotheses to be formed around proteins that were identified in the dataset but were of unknown function.
AFRIKAANS ABSTRACT: This study explores the use of networks to visualise, interpret and mine proteomics data. A network-based approach was followed to analyse a quantitative LC-MS/MS dataset derived from two yeast species, namely Saccharomyces cerevisiae strain VIN13 and Saccharomyces paradoxus strain RO88. The mass spectra were processed with existing and custom-written computer programs, assembled into a workflow for identifying and quantifying the proteins concerned. These proteins were then linked to existing biological databases to place them in biological context. The contextualised proteins were then used to build biological networks of the data. The contexts considered include, among others, the localisation of cellular activities, ontological descriptions, similarities in amino acid sequences and interactions with known proteins, as well as associations and links with metabolic pathways. This contextual, network-based representation of the proteins concerned effectively yielded clear trends and patterns in the data that would otherwise not have been noticeable…