|Full text PDF:||http://louisdl.louislibraries.org/u?/p16313coll12,4822|
Next-generation sequencing (NGS) is a relatively new technology that has revolutionized the way scientists discover and investigate pathogens. An estimated one in five cancers worldwide is linked to an infectious agent. A better understanding of pathogen biology and of pathogen-host interactions will lead to better therapies and outcomes for patients suffering from pathogen-associated malignancies. Despite the promise of NGS-based approaches for this work, sequence analysis is still in its infancy, and the full potential of NGS has yet to be realized. To facilitate data mining, an automated computational pipeline for the simultaneous analysis of pathogen and host transcripts, called RNA CoMPASS, was developed. Over the years, analysis of a variety of sequencing datasets with RNA CoMPASS has routinely identified substantial bacterial contamination in human-derived RNA-seq datasets, contamination that likely arose from environmental sources. This analysis shows that more stringent sequencing and analysis protocols are needed to investigate sequence-based microbial signatures in clinical samples. NGS-based approaches were used to investigate the role of Epstein-Barr virus (EBV) in the pathogenesis of gastric carcinoma. A comprehensive assessment of the virome of various brain tissue samples was also performed, on the premise that an NGS-based detection method would be unbiased, sensitive, specific, and accurate. Taken together, these studies provide a framework for using NGS technology to study oncogenic pathogens and raise awareness of contamination issues within sequencing datasets.