Evaluation and Optimization of Bioinformatic Tools for the Detection of Human Foodborne Pathogens in Complex Metagenomic Datasets

Sharp, Gretta Marie
Foodborne pathogens pose a significant risk to human health: each year, one in six Americans becomes sick from one of over 31 known human foodborne pathogens. Because these pathogens differ in their growth requirements, current detection assays can detect only one to a few of them per assay. Metagenomics, an emerging field, allows an entire community of organisms to be analyzed from DNA or RNA sequence data generated from a single sample, and therefore has the potential to detect any and all foodborne pathogens present in a single complex matrix. However, currently available bioinformatic pipelines for metagenomic sequence analysis require extensive time and computing power, and often give unreliable results. The objectives of this study are 1) to evaluate community profiling bioinformatic pipelines, mapping pipelines, and a novel pipeline created at Oklahoma State University, E-probe Diagnostic Nucleic-acid Analysis (EDNA), for the detection of Salmonella enterica (as a model foodborne pathogen) in metagenomic data; 2) to optimize the EDNA pipeline for sensitive detection of S. enterica in metagenomic data; and 3) to simultaneously detect multiple foodborne pathogens from a single metagenomic sample. EDNA detected S. enterica in metagenomic data in approximately five minutes, whereas the other pipelines took between 2 and 500 hours. Under the optimized parameters, the EDNA pipeline was limited to cleaned Illumina data with a read depth of one. The minimum BLAST E-value was set to 10^-3 for curation. For detection, the minimum percent identity was set to 95% and the minimum query coverage to 90%, with an E-probe length of 80 nt. These new parameters significantly improved the sensitivity of the assay 100-fold, from 10^3 S. enterica cells detected by the original EDNA pipeline to just 10 cells.
In the simultaneous detection of multiple foodborne pathogens, EDNA detected three additional pathogens (Listeria monocytogenes, Campylobacter jejuni, and Shiga toxin-producing Escherichia coli) at ten contamination levels in less than ten minutes and provided new insights into how read abundance corresponds to pathogen cell numbers.
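
The detection thresholds reported above (minimum percent identity of 95%, minimum query coverage of 90%, E-probe length of 80 nt) can be illustrated with a minimal sketch. This is not the EDNA implementation. It assumes that each E-probe is used as the BLAST query, that hits arrive as tab-separated lines in the default 12-column BLAST tabular layout, and that query coverage is the aligned span of the probe divided by the probe length. The names `Hit`, `parse_hit`, `query_coverage`, `passes_detection`, and `matched_probes` are hypothetical.

```python
from typing import Iterable, NamedTuple, Set

# Thresholds taken from the abstract; everything else here is an assumption.
MIN_PERCENT_IDENTITY = 95.0   # percent
MIN_QUERY_COVERAGE = 90.0     # percent
EPROBE_LENGTH = 80            # nucleotides


class Hit(NamedTuple):
    """One alignment between an E-probe (query) and a metagenomic read."""
    probe_id: str
    read_id: str
    percent_identity: float
    query_start: int
    query_end: int


def parse_hit(line: str) -> Hit:
    """Parse one line of default BLAST tabular output (assumed layout:
    qseqid sseqid pident length mismatch gapopen qstart qend ...)."""
    fields = line.rstrip("\n").split("\t")
    return Hit(fields[0], fields[1], float(fields[2]),
               int(fields[6]), int(fields[7]))


def query_coverage(hit: Hit, probe_length: int = EPROBE_LENGTH) -> float:
    """Percentage of the probe covered by the alignment."""
    aligned = abs(hit.query_end - hit.query_start) + 1
    return 100.0 * aligned / probe_length


def passes_detection(hit: Hit, probe_length: int = EPROBE_LENGTH) -> bool:
    """Apply both detection thresholds to a single hit."""
    return (hit.percent_identity >= MIN_PERCENT_IDENTITY
            and query_coverage(hit, probe_length) >= MIN_QUERY_COVERAGE)


def matched_probes(lines: Iterable[str]) -> Set[str]:
    """Return the IDs of probes with at least one hit passing both thresholds."""
    return {hit.probe_id
            for hit in map(parse_hit, lines)
            if passes_detection(hit)}
```

Counting the probes returned by `matched_probes` for each pathogen's probe set would give a simple per-pathogen tally. How EDNA turns such matches into a detection call is not described in the abstract and is not modeled here.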