Understanding prokaryotic diversity in the post-genomics era

Date of Award


Degree Type


Degree Name

Doctor of Philosophy (PhD)




Keywords

Functional genomics, Protein domains, Gammaproteobacteria

Subject Categories

Bioinformatics | Life Sciences | Microbiology


Prokaryotes, among nature's most pervasive organisms, are ubiquitous in the environment and affect human society as both harmful and beneficial agents. A challenge of microbial research is to understand prokaryotes within the context of their environment. Currently, assessing prokaryotic diversity is difficult because we cannot culture the majority of microbes from the environment. Therefore, new techniques must be developed not only to assess this diversity but also to understand how such diversity is linked to the genes in a prokaryote's genome. Access to the genome sequences of a large number of prokaryotes has radically changed microbial research, and the expected availability of thousands of genomes in the future has transitioned the field into the post-genomics era. This thesis presents the development and analysis of two post-genomics tools. The first is in functional genomics, which seeks to determine the genetic interactions that underlie prokaryotic traits. Phylogenomics predicts genetic interactions based on the incidence of coinherited proteins across evolutionary phyla and assigns function to a gene. This approach was tested experimentally in the model organism Myxococcus xanthus and further applied to the genome annotation of Sorangium cellulosum. Additional analysis was performed in the area of integrative functional genomics, where it is thought that combining multiple functional genomics approaches improves the accuracy of individual predictions. This assumption was tested by combining the predictions from phylogenomics and gene expression mapping for four prokaryotes. It was found that, while this assumption holds, the rate at which predictions converge upon a stable set of interactions is significantly slower than previously thought.
The second tool relates a prokaryote's genome to its niche. Prokaryotes with similar genetic content, as measured by protein domains, were clustered into mountains on a niche map. When compared to a second map based on 16S rRNA, the niche map better clustered prokaryotes according to an environment/function metric. Analysis of the mountains on the niche map showed that they correlate with niche, including the clustering of marine Gammaproteobacteria, obligate pathogens and symbionts, and prokaryotes that exist at the soil, plant, and human interface. Together, these tools can help us better understand prokaryotic diversity in the post-genomics era.
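The coinheritance idea behind the phylogenomics tool can be illustrated with a minimal sketch. It assumes each gene is represented by a presence/absence profile across a set of genomes, and it scores pairs of genes by Jaccard similarity. The similarity measure, the threshold, the function names, and the toy profiles are all illustrative assumptions, not the method or data used in the thesis.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two presence/absence profiles (sequences of 0/1)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def predict_links(profiles, threshold=0.75):
    """Return gene pairs whose profiles are at least `threshold` similar.

    `profiles` maps a gene name to its presence (1) or absence (0) in each
    genome, with genomes in the same order for every gene.
    """
    links = []
    for g1, g2 in combinations(sorted(profiles), 2):
        score = jaccard(profiles[g1], profiles[g2])
        if score >= threshold:
            links.append((g1, g2, score))
    return links


# Hypothetical toy profiles over six genomes (not real data).
profiles = {
    "geneA": (1, 1, 0, 1, 0, 1),
    "geneB": (1, 1, 0, 1, 0, 1),
    "geneC": (0, 1, 1, 0, 1, 0),
    "geneD": (1, 1, 0, 1, 0, 0),
}

for g1, g2, score in predict_links(profiles):
    print(f"{g1} - {g2}: {score:.2f}")
```

In this toy example, genes that tend to be present or absent in the same genomes are linked, while geneC, whose profile differs from the others, is not. A real analysis would need a much larger genome set and a correction for the evolutionary relatedness of the genomes.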