TY - JOUR T1 - Automated ensemble assembly and validation of microbial genomes. JF - BMC Bioinformatics Y1 - 2014 A1 - Koren, Sergey A1 - Treangen, Todd A1 - Hill, Christopher M A1 - Pop, Mihai A1 - Phillippy, Adam M KW - Genome, Bacterial KW - Genome, Microbial KW - Genomics KW - Mycobacterium tuberculosis KW - Rhodobacter sphaeroides KW - Sequence Analysis, DNA KW - software AB -

BACKGROUND: The continued democratization of DNA sequencing has sparked a new wave of development of genome assembly and assembly validation methods. As individual research labs, rather than centralized centers, begin to sequence the majority of new genomes, it is important to establish best practices for genome assembly. However, recent evaluations such as GAGE and the Assemblathon have concluded that there is no single best approach to genome assembly. Instead, it is preferable to generate multiple assemblies and validate them to determine which is most useful for the desired analysis; this is a labor-intensive process that is often impossible or unfeasible.

RESULTS: To encourage best practices supported by the community, we present iMetAMOS, an automated ensemble assembly pipeline; iMetAMOS encapsulates the process of running, validating, and selecting a single assembly from multiple assemblies. iMetAMOS packages several leading open-source tools into a single binary that automates parameter selection and execution of multiple assemblers, scores the resulting assemblies based on multiple validation metrics, and annotates the assemblies for genes and contaminants. We demonstrate the utility of the ensemble process on 225 previously unassembled Mycobacterium tuberculosis genomes as well as a Rhodobacter sphaeroides benchmark dataset. On these real data, iMetAMOS reliably produces validated assemblies and identifies potential contamination without user intervention. In addition, intelligent parameter selection produces assemblies of R. sphaeroides comparable to or exceeding the quality of those from the GAGE-B evaluation, affecting the relative ranking of some assemblers.

CONCLUSIONS: Ensemble assembly with iMetAMOS provides users with multiple, validated assemblies for each genome. Although computationally limited to small or mid-sized genomes, this approach is the most effective and reproducible means for generating high-quality assemblies and enables users to select an assembly best tailored to their specific needs.

VL - 15 M3 - 10.1186/1471-2105-15-126 ER - TY - JOUR T1 - TIGRFAMs and Genome Properties in 2013. JF - Nucleic Acids Res Y1 - 2013 A1 - Haft, Daniel H A1 - Selengut, Jeremy D A1 - Richter, Roland A A1 - Harkins, Derek A1 - Basu, Malay K A1 - Beck, Erin KW - Databases, Protein KW - Genome, Archaeal KW - Genome, Bacterial KW - Genomics KW - Internet KW - Markov chains KW - Molecular Sequence Annotation KW - Proteins KW - sequence alignment AB -

TIGRFAMs, available online at http://www.jcvi.org/tigrfams is a database of protein family definitions. Each entry features a seed alignment of trusted representative sequences, a hidden Markov model (HMM) built from that alignment, cutoff scores that let automated annotation pipelines decide which proteins are members, and annotations for transfer onto member proteins. Most TIGRFAMs models are designated equivalog, meaning they assign a specific name to proteins conserved in function from a common ancestral sequence. Models describing more functionally heterogeneous families are designated subfamily or domain, and assign less specific but more widely applicable annotations. The Genome Properties database, available at http://www.jcvi.org/genome-properties, specifies how computed evidence, including TIGRFAMs HMM results, should be used to judge whether an enzymatic pathway, a protein complex or another type of molecular subsystem is encoded in a genome. TIGRFAMs and Genome Properties content are developed in concert because subsystems reconstruction for large numbers of genomes guides selection of seed alignment sequences and cutoff values during protein family construction. Both databases specialize heavily in bacterial and archaeal subsystems. At present, 4284 models appear in TIGRFAMs, while 628 systems are described by Genome Properties. Content derives both from subsystem discovery work and from biocuration of the scientific literature.

VL - 41 CP - Database issue M3 - 10.1093/nar/gks1234 N1 - http://www.ncbi.nlm.nih.gov/pubmed/23197656?dopt=Abstract ER - TY - JOUR T1 - Genomic insights to SAR86, an abundant and uncultivated marine bacterial lineage. JF - ISME J Y1 - 2012 A1 - Dupont, Chris L A1 - Rusch, Douglas B A1 - Yooseph, Shibu A1 - Lombardo, Mary-Jane A1 - Richter, R Alexander A1 - Valas, Ruben A1 - Novotny, Mark A1 - Yee-Greenbaum, Joyclyn A1 - Selengut, Jeremy D A1 - Haft, Dan H A1 - Halpern, Aaron L A1 - Lasken, Roger S A1 - Nealson, Kenneth A1 - Friedman, Robert A1 - Venter, J Craig KW - Computational Biology KW - Gammaproteobacteria KW - Genome, Bacterial KW - Genomic Library KW - metagenomics KW - Oceans and Seas KW - Phylogeny KW - plankton KW - Rhodopsin KW - Rhodopsins, Microbial KW - RNA, Ribosomal, 16S KW - Seawater AB -

Bacteria in the 16S rRNA clade SAR86 are among the most abundant uncultivated constituents of microbial assemblages in the surface ocean for which little genomic information is currently available. Bioinformatic techniques were used to assemble two nearly complete genomes from marine metagenomes and single-cell sequencing provided two more partial genomes. Recruitment of metagenomic data shows that these SAR86 genomes substantially increase our knowledge of non-photosynthetic bacteria in the surface ocean. Phylogenomic analyses establish SAR86 as a basal and divergent lineage of γ-proteobacteria, and the individual genomes display a temperature-dependent distribution. Modestly sized at 1.25-1.7 Mbp, the SAR86 genomes lack several pathways for amino-acid and vitamin synthesis as well as sulfate reduction, trends commonly observed in other abundant marine microbes. SAR86 appears to be an aerobic chemoheterotroph with the potential for proteorhodopsin-based ATP generation, though the apparent lack of a retinal biosynthesis pathway may require it to scavenge exogenously-derived pigments to utilize proteorhodopsin. The genomes contain an expanded capacity for the degradation of lipids and carbohydrates acquired using a wealth of tonB-dependent outer membrane receptors. Like the abundant planktonic marine bacterial clade SAR11, SAR86 exhibits metabolic streamlining, but also a distinct carbon compound specialization, possibly avoiding competition.

VL - 6 CP - 6 M3 - 10.1038/ismej.2011.189 N1 - http://www.ncbi.nlm.nih.gov/pubmed/22170421?dopt=Abstract ER - TY - JOUR T1 - Whole genome analysis of Leptospira licerasiae provides insight into leptospiral evolution and pathogenicity. JF - PLoS Negl Trop Dis Y1 - 2012 A1 - Ricaldi, Jessica N A1 - Fouts, Derrick E A1 - Selengut, Jeremy D A1 - Harkins, Derek M A1 - Patra, Kailash P A1 - Moreno, Angelo A1 - Lehmann, Jason S A1 - Purushe, Janaki A1 - Sanka, Ravi A1 - Torres, Michael A1 - Webster, Nicholas J A1 - Vinetz, Joseph M A1 - Matthias, Michael A KW - DNA, Bacterial KW - Evolution, Molecular KW - Gene Transfer, Horizontal KW - Genome, Bacterial KW - Genomic islands KW - HUMANS KW - Leptospira KW - Molecular Sequence Data KW - Multigene Family KW - Prophages KW - Sequence Analysis, DNA KW - Virulence factors AB -

The whole genome analysis of two strains of the first intermediately pathogenic leptospiral species to be sequenced (Leptospira licerasiae strains VAR010 and MMD0835) provides insight into their pathogenic potential and deepens our understanding of leptospiral evolution. Comparative analysis of eight leptospiral genomes shows the existence of a core leptospiral genome comprising 1547 genes and 452 conserved genes restricted to infectious species (including L. licerasiae) that are likely to be pathogenicity-related. Comparisons of the functional content of the genomes suggest that L. licerasiae retains several proteins related to nitrogen, amino acid and carbohydrate metabolism which might help to explain why these Leptospira grow well in artificial media compared with pathogenic species. L. licerasiae strains VAR010(T) and MMD0835 possess two prophage elements. While one element is circular and shares homology with LE1 of L. biflexa, the second is cryptic and homologous to a previously identified but unnamed region in L. interrogans serovars Copenhageni and Lai. We also report a unique O-antigen locus in L. licerasiae comprised of a 6-gene cluster that is unexpectedly short compared with L. interrogans in which analogous regions may include >90 such genes. Sequence homology searches suggest that these genes were acquired by lateral gene transfer (LGT). Furthermore, seven putative genomic islands ranging in size from 5 to 36 kb are present, also suggestive of antecedent LGT. How Leptospira become naturally competent remains to be determined, but considering the phylogenetic origins of the genes comprising the O-antigen cluster and other putative laterally transferred genes, L. licerasiae must be able to exchange genetic material with non-invasive environmental bacteria. The data presented here demonstrate that L. licerasiae is genetically more closely related to pathogenic than to saprophytic Leptospira and provide insight into the genomic bases for its infectiousness and its unique antigenic characteristics.

VL - 6 CP - 10 M3 - 10.1371/journal.pntd.0001853 N1 - http://www.ncbi.nlm.nih.gov/pubmed/23145189?dopt=Abstract ER - TY - JOUR T1 - Unexpected abundance of coenzyme F(420)-dependent enzymes in Mycobacterium tuberculosis and other actinobacteria. JF - J Bacteriol Y1 - 2010 A1 - Selengut, Jeremy D A1 - Haft, Daniel H KW - Actinobacteria KW - Amino Acid Sequence KW - Binding Sites KW - Coenzymes KW - Flavonoids KW - Gene Expression Profiling KW - Gene Expression Regulation, Bacterial KW - Genome, Bacterial KW - molecular biology KW - Molecular Sequence Data KW - Molecular Structure KW - Mycobacterium tuberculosis KW - Phylogeny KW - Protein Conformation KW - Riboflavin AB -

Regimens targeting Mycobacterium tuberculosis, the causative agent of tuberculosis (TB), require long courses of treatment and a combination of three or more drugs. An increase in drug-resistant strains of M. tuberculosis demonstrates the need for additional TB-specific drugs. A notable feature of M. tuberculosis is coenzyme F(420), which is distributed sporadically and sparsely among prokaryotes. This distribution allows for comparative genomics-based investigations. Phylogenetic profiling (comparison of differential gene content) based on F(420) biosynthesis nominated many actinobacterial proteins as candidate F(420)-dependent enzymes. Three such families dominated the results: the luciferase-like monooxygenase (LLM), pyridoxamine 5'-phosphate oxidase (PPOX), and deazaflavin-dependent nitroreductase (DDN) families. The DDN family was determined to be limited to F(420)-producing species. The LLM and PPOX families were observed in F(420)-producing species as well as species lacking F(420) but were particularly numerous in many actinobacterial species, including M. tuberculosis. Partitioning the LLM and PPOX families based on an organism's ability to make F(420) allowed the application of the SIMBAL (sites inferred by metabolic background assertion labeling) profiling method to identify F(420)-correlated subsequences. These regions were found to correspond to flavonoid cofactor binding sites. Significantly, these results showed that M. tuberculosis carries at least 28 separate F(420)-dependent enzymes, most of unknown function, and a paucity of flavin mononucleotide (FMN)-dependent proteins in these families. While prevalent in mycobacteria, markers of F(420) biosynthesis appeared to be absent from the normal human gut flora. These findings suggest that M. tuberculosis relies heavily on coenzyme F(420) for its redox reactions. This dependence and the cofactor's rarity may make F(420)-related proteins promising drug targets.

VL - 192 CP - 21 M3 - 10.1128/JB.00425-10 N1 - http://www.ncbi.nlm.nih.gov/pubmed/20675471?dopt=Abstract ER - TY - JOUR T1 - Three genomes from the phylum Acidobacteria provide insight into the lifestyles of these microorganisms in soils. JF - Appl Environ Microbiol Y1 - 2009 A1 - Ward, Naomi L A1 - Challacombe, Jean F A1 - Janssen, Peter H A1 - Henrissat, Bernard A1 - Coutinho, Pedro M A1 - Wu, Martin A1 - Xie, Gary A1 - Haft, Daniel H A1 - Sait, Michelle A1 - Badger, Jonathan A1 - Barabote, Ravi D A1 - Bradley, Brent A1 - Brettin, Thomas S A1 - Brinkac, Lauren M A1 - Bruce, David A1 - Creasy, Todd A1 - Daugherty, Sean C A1 - Davidsen, Tanja M A1 - DeBoy, Robert T A1 - Detter, J Chris A1 - Dodson, Robert J A1 - Durkin, A Scott A1 - Ganapathy, Anuradha A1 - Gwinn-Giglio, Michelle A1 - Han, Cliff S A1 - Khouri, Hoda A1 - Kiss, Hajnalka A1 - Kothari, Sagar P A1 - Madupu, Ramana A1 - Nelson, Karen E A1 - Nelson, William C A1 - Paulsen, Ian A1 - Penn, Kevin A1 - Ren, Qinghu A1 - Rosovitz, M J A1 - Selengut, Jeremy D A1 - Shrivastava, Susmita A1 - Sullivan, Steven A A1 - Tapia, Roxanne A1 - Thompson, L Sue A1 - Watkins, Kisha L A1 - Yang, Qi A1 - Yu, Chunhui A1 - Zafar, Nikhat A1 - Zhou, Liwei A1 - Kuske, Cheryl R KW - Anti-Bacterial Agents KW - bacteria KW - Biological Transport KW - Carbohydrate Metabolism KW - Cyanobacteria KW - DNA, Bacterial KW - Fungi KW - Genome, Bacterial KW - Macrolides KW - Molecular Sequence Data KW - Nitrogen KW - Phylogeny KW - Proteobacteria KW - Sequence Analysis, DNA KW - Sequence Homology KW - Soil Microbiology AB -

The complete genomes of three strains from the phylum Acidobacteria were compared. Phylogenetic analysis placed them as a unique phylum. They share genomic traits with members of the Proteobacteria, the Cyanobacteria, and the Fungi. The three strains appear to be versatile heterotrophs. Genomic and culture traits indicate the use of carbon sources that span simple sugars to more complex substrates such as hemicellulose, cellulose, and chitin. The genomes encode low-specificity major facilitator superfamily transporters and high-affinity ABC transporters for sugars, suggesting that they are best suited to low-nutrient conditions. They appear capable of nitrate and nitrite reduction but not N(2) fixation or denitrification. The genomes contained numerous genes that encode siderophore receptors, but no evidence of siderophore production was found, suggesting that they may obtain iron via interaction with other microorganisms. The presence of cellulose synthesis genes and a large class of novel high-molecular-weight excreted proteins suggests potential traits for desiccation resistance, biofilm formation, and/or contribution to soil structure. Polyketide synthase and macrolide glycosylation genes suggest the production of novel antimicrobial compounds. Genes that encode a variety of novel proteins were also identified. The abundance of acidobacteria in soils worldwide and the breadth of potential carbon use by the sequenced strains suggest significant and previously unrecognized contributions to the terrestrial carbon cycle. Combining our genomic evidence with available culture traits, we postulate that cells of these isolates are long-lived, divide slowly, exhibit slow metabolic rates under low-nutrient conditions, and are well equipped to tolerate fluctuations in soil hydration.

VL - 75 CP - 7 M3 - 10.1128/AEM.02294-08 N1 - http://www.ncbi.nlm.nih.gov/pubmed/19201974?dopt=Abstract ER - TY - JOUR T1 - Genome sequence and identification of candidate vaccine antigens from the animal pathogen Dichelobacter nodosus.
JF - Nat Biotechnol Y1 - 2007 A1 - Myers, Garry S A A1 - Parker, Dane A1 - Al-Hasani, Keith A1 - Kennan, Ruth M A1 - Seemann, Torsten A1 - Ren, Qinghu A1 - Badger, Jonathan H A1 - Selengut, Jeremy D A1 - DeBoy, Robert T A1 - Tettelin, Hervé A1 - Boyce, John D A1 - McCarl, Victoria P A1 - Han, Xiaoyan A1 - Nelson, William C A1 - Madupu, Ramana A1 - Mohamoud, Yasmin A1 - Holley, Tara A1 - Fedorova, Nadia A1 - Khouri, Hoda A1 - Bottomley, Steven P A1 - Whittington, Richard J A1 - Adler, Ben A1 - Songer, J Glenn A1 - Rood, Julian I A1 - Paulsen, Ian T KW - Animals KW - Antigens KW - Chromosome mapping KW - Dichelobacter nodosus KW - Foot Rot KW - Genome, Bacterial KW - Sequence Analysis, DNA AB -

Dichelobacter nodosus causes ovine footrot, a disease that leads to severe economic losses in the wool and meat industries. We sequenced its 1.4-Mb genome, the smallest known genome of an anaerobe. It differs markedly from small genomes of intracellular bacteria, retaining greater biosynthetic capabilities and lacking any evidence of extensive ongoing genome reduction. Comparative genomic microarray studies and bioinformatic analysis suggested that, despite its small size, almost 20% of the genome is derived from lateral gene transfer. Most of these regions seem to be associated with virulence. Metabolic reconstruction indicated unsuspected capabilities, including carbohydrate utilization, electron transfer and several aerobic pathways. Global transcriptional profiling and bioinformatic analysis enabled the prediction of virulence factors and cell surface proteins. Screening of these proteins against ovine antisera identified eight immunogenic proteins that are candidate antigens for a cross-protective vaccine.

VL - 25 CP - 5 M3 - 10.1038/nbt1302 ER - TY - JOUR T1 - Genome sequence and identification of candidate vaccine antigens from the animal pathogen Dichelobacter nodosus JF - Nature biotechnologyNature biotechnology Y1 - 2007 A1 - Myers, Garry S. A. A1 - Parker, Dane A1 - Al-Hasani, Keith A1 - Kennan, Ruth M. A1 - Seemann, Torsten A1 - Ren, Qinghu A1 - Badger, Jonathan H. A1 - J. Selengut A1 - DeBoy, Robert T. A1 - Tettelin, Hervé A1 - Boyce, John D. A1 - McCarl, Victoria P. A1 - Han, Xiaoyan A1 - Nelson, William C. A1 - Madupu, Ramana A1 - Mohamoud, Yasmin A1 - Holley, Tara A1 - Fedorova, Nadia A1 - Khouri, Hoda A1 - Bottomley, Steven P. A1 - Whittington, Richard J. A1 - Adler, Ben A1 - Songer, J. Glenn A1 - Rood, Julian I. A1 - Paulsen, Ian T. KW - Animals KW - Antigens KW - Chromosome mapping KW - Dichelobacter nodosus KW - Foot Rot KW - Genome, Bacterial KW - Sequence Analysis, DNA AB - Dichelobacter nodosus causes ovine footrot, a disease that leads to severe economic losses in the wool and meat industries. We sequenced its 1.4-Mb genome, the smallest known genome of an anaerobe. It differs markedly from small genomes of intracellular bacteria, retaining greater biosynthetic capabilities and lacking any evidence of extensive ongoing genome reduction. Comparative genomic microarray studies and bioinformatic analysis suggested that, despite its small size, almost 20% of the genome is derived from lateral gene transfer. Most of these regions seem to be associated with virulence. Metabolic reconstruction indicated unsuspected capabilities, including carbohydrate utilization, electron transfer and several aerobic pathways. Global transcriptional profiling and bioinformatic analysis enabled the prediction of virulence factors and cell surface proteins. Screening of these proteins against ovine antisera identified eight immunogenic proteins that are candidate antigens for a cross-protective vaccine. 
VL - 25 N1 - http://www.ncbi.nlm.nih.gov/pubmed/17468768?dopt=Abstract ER - TY - JOUR T1 - TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes JF - Nucleic Acids Research Y1 - 2007 A1 - J. Selengut A1 - Haft, Daniel H. A1 - Davidsen, Tanja A1 - Ganapathy, Anurhada A1 - Gwinn-Giglio, Michelle A1 - Nelson, William C. A1 - Richter, R. Alexander A1 - White, Owen KW - Archaeal Proteins KW - Bacterial Proteins KW - Databases, Protein KW - Genome, Bacterial KW - Genomics KW - Internet KW - Phylogeny KW - software KW - User-Computer Interface AB - TIGRFAMs is a collection of protein family definitions built to aid in high-throughput annotation of specific protein functions. Each family is based on a hidden Markov model (HMM), where both cutoff scores and membership in the seed alignment are chosen so that the HMMs can classify numerous proteins according to their specific molecular functions. Most TIGRFAMs models describe 'equivalog' families, where both orthology and lateral gene transfer may be part of the evolutionary history, but where a single molecular function has been conserved. The Genome Properties system contains a queriable set of metabolic reconstructions, genome metrics and extractions of information from the scientific literature. Its genome-by-genome assertions of whether or not specific structures, pathways or systems are present provide high-level conceptual descriptions of genomic content. These assertions enable comparative genomics, provide a meaningful biological context to aid in manual annotation, support assignments of Gene Ontology (GO) biological process terms and help validate HMM-based predictions of protein function. The Genome Properties system is particularly useful as a generator of phylogenetic profiles, through which new protein family functions may be discovered.
The TIGRFAMs and Genome Properties systems can be accessed at http://www.tigr.org/TIGRFAMs and http://www.tigr.org/Genome_Properties. VL - 35 N1 - http://www.ncbi.nlm.nih.gov/pubmed/17151080?dopt=Abstract ER - TY - JOUR T1 - Comparative genomic evidence for a close relationship between the dimorphic prosthecate bacteria Hyphomonas neptunium and Caulobacter crescentus JF - Journal of bacteriology Y1 - 2006 A1 - Badger, Jonathan H. A1 - Hoover, Timothy R. A1 - Brun, Yves V. A1 - Weiner, Ronald M. A1 - Laub, Michael T. A1 - Alexandre, Gladys A1 - Mrázek, Jan A1 - Ren, Qinghu A1 - Paulsen, Ian T. A1 - Nelson, Karen E. A1 - Khouri, Hoda M. A1 - Radune, Diana A1 - Sosa, Julia A1 - Dodson, Robert J. A1 - Sullivan, Steven A. A1 - Rosovitz, M. J. A1 - Madupu, Ramana A1 - Brinkac, Lauren M. A1 - Durkin, A. Scott A1 - Daugherty, Sean C. A1 - Kothari, Sagar P. A1 - Giglio, Michelle Gwinn A1 - Zhou, Liwei A1 - Haft, Daniel H. A1 - J. Selengut A1 - Davidsen, Tanja M. A1 - Yang, Qi A1 - Zafar, Nikhat A1 - Ward, Naomi L. KW - Alphaproteobacteria KW - Bacterial Outer Membrane Proteins KW - Caulobacter crescentus KW - cell cycle KW - Chemotaxis KW - DNA, Bacterial KW - Flagella KW - Genome, Bacterial KW - Microbial Viability KW - Molecular Sequence Data KW - Movement KW - Sequence Analysis, DNA KW - Sequence Homology KW - signal transduction AB - The dimorphic prosthecate bacteria (DPB) are alpha-proteobacteria that reproduce in an asymmetric manner rather than by binary fission and are of interest as simple models of development. Prior to this work, the only member of this group for which genome sequence was available was the model freshwater organism Caulobacter crescentus. Here we describe the genome sequence of Hyphomonas neptunium, a marine member of the DPB that differs from C. crescentus in that H. neptunium uses its stalk as a reproductive structure.
Genome analysis indicates that this organism shares more genes with C. crescentus than it does with Silicibacter pomeroyi (a closer relative according to 16S rRNA phylogeny), that it relies upon a heterotrophic strategy utilizing a wide range of substrates, that its cell cycle is likely to be regulated in a similar manner to that of C. crescentus, and that the outer membrane complements of H. neptunium and C. crescentus are remarkably similar. H. neptunium swarmer cells are highly motile via a single polar flagellum. With the exception of cheY and cheR, genes required for chemotaxis were absent in the H. neptunium genome. Consistent with this observation, H. neptunium swarmer cells did not respond to any chemotactic stimuli that were tested, which suggests that H. neptunium motility is a random dispersal mechanism for swarmer cells rather than a stimulus-controlled navigation system for locating specific environments. In addition to providing insights into bacterial development, the H. neptunium genome will provide an important resource for the study of other interesting biological processes including chromosome segregation, polar growth, and cell aging. VL - 188 N1 - http://www.ncbi.nlm.nih.gov/pubmed/16980487?dopt=Abstract ER - TY - JOUR T1 - Exopolysaccharide-associated protein sorting in environmental organisms: the PEP-CTERM/EpsH system. Application of a novel phylogenetic profiling heuristic JF - BMC biology Y1 - 2006 A1 - Haft, Daniel H. A1 - Paulsen, Ian T. A1 - Ward, Naomi A1 - J. Selengut KW - Amino Acid Motifs KW - Amino Acid Sequence KW - bacteria KW - Bacterial Proteins KW - Biofilms KW - Genome, Bacterial KW - Markov chains KW - Molecular Sequence Data KW - Phylogeny KW - Polysaccharides, Bacterial KW - Protein Sorting Signals KW - Protein Transport KW - Seawater KW - sequence alignment KW - Soil Microbiology AB - BACKGROUND: Protein translocation to the proper cellular destination may be guided by various classes of sorting signals recognizable in the primary sequence.
Detection in some genomes, but not others, may reveal sorting system components by comparison of the phylogenetic profile of the class of sorting signal to that of various protein families. RESULTS: We describe a short C-terminal homology domain, sporadically distributed in bacteria, with several key characteristics of protein sorting signals. The domain includes a near-invariant motif Pro-Glu-Pro (PEP). This possible recognition or processing site is followed by a predicted transmembrane helix and a cluster rich in basic amino acids. We designate this domain PEP-CTERM. It tends to occur multiple times in a genome if it occurs at all, with a median count of eight instances; Verrucomicrobium spinosum has sixty-five. PEP-CTERM-containing proteins generally contain an N-terminal signal peptide and exhibit high diversity and little homology to known proteins. All bacteria with PEP-CTERM have both an outer membrane and exopolysaccharide (EPS) production genes. By a simple heuristic for screening phylogenetic profiles in the absence of pre-formed protein families, we discovered that a homolog of the membrane protein EpsH (exopolysaccharide locus protein H) occurs in a species when PEP-CTERM domains are found. The EpsH family contains invariant residues consistent with a transpeptidase function. Most PEP-CTERM proteins are encoded by single-gene operons preceded by large intergenic regions. In the Proteobacteria, most of these upstream regions share a DNA sequence, a probable cis-regulatory site that contains a sigma-54 binding motif. The phylogenetic profile for this DNA sequence exactly matches that of three proteins: a sigma-54-interacting response regulator (PrsR), a transmembrane histidine kinase (PrsK), and a TPR protein (PrsT). CONCLUSION: These findings are consistent with the hypothesis that PEP-CTERM and EpsH form a protein export sorting system, analogous to the LPXTG/sortase system of Gram-positive bacteria, and correlated to EPS expression. 
It occurs preferentially in bacteria from sediments, soils, and biofilms. The novel method that led to these findings, partial phylogenetic profiling, requires neither global sequence clustering nor arbitrary similarity cutoffs and appears to be a rapid, effective alternative to other profiling methods. VL - 4 N1 - http://www.ncbi.nlm.nih.gov/pubmed/16930487?dopt=Abstract ER - TY - JOUR T1 - Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial "pan-genome" JF - Proceedings of the National Academy of Sciences of the United States of America Y1 - 2005 A1 - Tettelin, Hervé A1 - Masignani, Vega A1 - Cieslewicz, Michael J. A1 - Donati, Claudio A1 - Medini, Duccio A1 - Ward, Naomi L. A1 - Angiuoli, Samuel V. A1 - Crabtree, Jonathan A1 - Jones, Amanda L. A1 - Durkin, A. Scott A1 - DeBoy, Robert T. A1 - Davidsen, Tanja M. A1 - Mora, Marirosa A1 - Scarselli, Maria A1 - Margarit y Ros, Immaculada A1 - Peterson, Jeremy D. A1 - Hauser, Christopher R. A1 - Sundaram, Jaideep P. A1 - Nelson, William C. A1 - Madupu, Ramana A1 - Brinkac, Lauren M. A1 - Dodson, Robert J. A1 - Rosovitz, Mary J. A1 - Sullivan, Steven A. A1 - Daugherty, Sean C. A1 - Haft, Daniel H. A1 - J. Selengut A1 - Gwinn, Michelle L. A1 - Zhou, Liwei A1 - Zafar, Nikhat A1 - Khouri, Hoda A1 - Radune, Diana A1 - Dimitrov, George A1 - Watkins, Kisha A1 - O'Connor, Kevin J. B. A1 - Smith, Shannon A1 - Utterback, Teresa R. A1 - White, Owen A1 - Rubens, Craig E. A1 - Grandi, Guido A1 - Madoff, Lawrence C. A1 - Kasper, Dennis L. A1 - Telford, John L. A1 - Wessels, Michael R. A1 - Rappuoli, Rino A1 - Fraser, Claire M.
KW - Amino Acid Sequence KW - Bacterial Capsules KW - Base Sequence KW - Gene expression KW - Genes, Bacterial KW - Genetic Variation KW - Genome, Bacterial KW - Molecular Sequence Data KW - Phylogeny KW - sequence alignment KW - Sequence Analysis, DNA KW - Streptococcus agalactiae KW - virulence AB - The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and also limits genome-wide screens for vaccine candidates or for antimicrobial targets. We have generated the genomic sequence of six strains representing the five major disease-causing serotypes of Streptococcus agalactiae, the main cause of neonatal infection in humans. Analysis of these genomes and those available in databases showed that the S. agalactiae species can be described by a pan-genome consisting of a core genome shared by all isolates, accounting for approximately 80% of any single genome, plus a dispensable genome consisting of partially shared and strain-specific genes. Mathematical extrapolation of the data suggests that the gene reservoir available for inclusion in the S. agalactiae pan-genome is vast and that unique genes will continue to be identified even after sequencing hundreds of genomes. VL - 102 N1 - http://www.ncbi.nlm.nih.gov/pubmed/16172379?dopt=Abstract ER - TY - JOUR T1 - A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes JF - PLOS Computational Biology Y1 - 2005 A1 - Haft, Daniel H. A1 - J. Selengut A1 - Mongodin, Emmanuel F. A1 - Nelson, Karen E.
KW - Genes, Archaeal KW - Genes, Bacterial KW - Genes, Fungal KW - Genome KW - Genome, Bacterial KW - Haloarcula marismortui KW - Markov chains KW - Multigene Family KW - Oligonucleotide Array Sequence Analysis KW - Phylogeny KW - Prokaryotic Cells KW - Proteins KW - Repetitive Sequences, Nucleic Acid KW - Yersinia pestis AB - Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. Repeats of 21-37 bp typically show weak dyad symmetry and are separated by regularly sized, nonrepetitive spacer sequences. Four CRISPR-associated (Cas) protein families, designated Cas1 to Cas4, are strictly associated with CRISPR elements and always occur near a repeat cluster. Some spacers originate from mobile genetic elements and are thought to confer "immunity" against the elements that harbor these sequences. In the present study, we have systematically investigated uncharacterized proteins encoded in the vicinity of these CRISPRs and found many additional protein families that are strictly associated with CRISPR loci across multiple prokaryotic species. Multiple sequence alignments and hidden Markov models have been built for 45 Cas protein families. These models identify family members with high sensitivity and selectivity and classify key regulators of development, DevR and DevS, in Myxococcus xanthus as Cas proteins. These identifications show that CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a repeat cluster or filling the region between two repeat clusters. Distinctive subsets of the collection of Cas proteins recur in phylogenetically distant species and correlate with characteristic repeat periodicity. 
The analyses presented here support initial proposals of mobility of these units, along with the likelihood that loci of different subtypes interact with one another as well as with host cell defensive, replicative, and regulatory systems. It is evident from this analysis that CRISPR/cas loci are larger, more complex, and more heterogeneous than previously appreciated. VL - 1 N1 - http://www.ncbi.nlm.nih.gov/pubmed/16292354?dopt=Abstract ER - TY - JOUR T1 - Whole-genome sequence analysis of Pseudomonas syringae pv. phaseolicola 1448A reveals divergence among pathovars in genes involved in virulence and transposition JF - Journal of bacteriology Y1 - 2005 A1 - Joardar, Vinita A1 - Lindeberg, Magdalen A1 - Jackson, Robert W. A1 - J. Selengut A1 - Dodson, Robert A1 - Brinkac, Lauren M. A1 - Daugherty, Sean C. A1 - Deboy, Robert A1 - Durkin, A. Scott A1 - Giglio, Michelle Gwinn A1 - Madupu, Ramana A1 - Nelson, William C. A1 - Rosovitz, M. J. A1 - Sullivan, Steven A1 - Crabtree, Jonathan A1 - Creasy, Todd A1 - Davidsen, Tanja A1 - Haft, Dan H. A1 - Zafar, Nikhat A1 - Zhou, Liwei A1 - Halpin, Rebecca A1 - Holley, Tara A1 - Khouri, Hoda A1 - Feldblyum, Tamara A1 - White, Owen A1 - Fraser, Claire M. A1 - Chatterjee, Arun K. A1 - Cartinhour, Sam A1 - Schneider, David J. A1 - Mansfield, John A1 - Collmer, Alan A1 - Buell, C. Robin KW - Bacterial Proteins KW - DNA, Bacterial KW - Genes, Bacterial KW - Genome, Bacterial KW - Molecular Sequence Data KW - Pseudomonas syringae KW - Species Specificity KW - virulence AB - Pseudomonas syringae pv. phaseolicola, a gram-negative bacterial plant pathogen, is the causal agent of halo blight of bean. In this study, we report on the genome sequence of P. syringae pv. phaseolicola isolate 1448A, which encodes 5,353 open reading frames (ORFs) on one circular chromosome (5,928,787 bp) and two plasmids (131,950 bp and 51,711 bp).
Comparative analyses with a phylogenetically divergent pathovar, P. syringae pv. tomato DC3000, revealed a strong degree of conservation at the gene and genome levels. In total, 4,133 ORFs were identified as putative orthologs in these two pathovars using a reciprocal best-hit method, with 3,941 ORFs present in conserved, syntenic blocks. Although these two pathovars are highly similar at the physiological level, they have distinct host ranges; 1448A causes disease in beans, and DC3000 is pathogenic on tomato and Arabidopsis. Examination of the complement of ORFs encoding virulence, fitness, and survival factors revealed a substantial, but not complete, overlap between these two pathovars. Another distinguishing feature between the two pathovars is their distinctive sets of transposable elements. With access to a fifth complete pseudomonad genome sequence, we were able to identify 3,567 ORFs that likely comprise the core Pseudomonas genome and 365 ORFs that are P. syringae specific. VL - 187 N1 - http://www.ncbi.nlm.nih.gov/pubmed/16159782?dopt=Abstract ER - TY - JOUR T1 - Comparison of the genome of the oral pathogen Treponema denticola with other spirochete genomes JF - Proceedings of the National Academy of Sciences of the United States of America Y1 - 2004 A1 - Seshadri, Rekha A1 - Myers, Garry S. A. A1 - Tettelin, Hervé A1 - Eisen, Jonathan A. A1 - Heidelberg, John F. A1 - Dodson, Robert J. A1 - Davidsen, Tanja M. A1 - DeBoy, Robert T. A1 - Fouts, Derrick E. A1 - Haft, Dan H. A1 - J. Selengut A1 - Ren, Qinghu A1 - Brinkac, Lauren M. A1 - Madupu, Ramana A1 - Kolonay, Jamie A1 - Durkin, A. Scott A1 - Daugherty, Sean C. A1 - Shetty, Jyoti A1 - Shvartsbeyn, Alla A1 - Gebregeorgis, Elizabeth A1 - Geer, Keita A1 - Tsegaye, Getahun A1 - Malek, Joel A1 - Ayodeji, Bola A1 - Shatsman, Sofiya A1 - McLeod, Michael P. A1 - Smajs, David A1 - Howell, Jerrilyn K. A1 - Pal, Sangita A1 - Amin, Anita A1 - Vashisth, Pankaj A1 - McNeill, Thomas Z.
A1 - Xiang, Qin A1 - Sodergren, Erica A1 - Baca, Ernesto A1 - Weinstock, George M. A1 - Norris, Steven J. A1 - Fraser, Claire M. A1 - Paulsen, Ian T. KW - ATP-Binding Cassette Transporters KW - Bacterial Proteins KW - Base Sequence KW - Borrelia burgdorferi KW - Genes, Bacterial KW - Genome, Bacterial KW - Leptospira interrogans KW - Models, Genetic KW - Molecular Sequence Data KW - Mouth KW - Sequence Homology, Amino Acid KW - Treponema KW - Treponema pallidum AB - We present the complete 2,843,201-bp genome sequence of Treponema denticola (ATCC 35405) an oral spirochete associated with periodontal disease. Analysis of the T. denticola genome reveals factors mediating coaggregation, cell signaling, stress protection, and other competitive and cooperative measures, consistent with its pathogenic nature and lifestyle within the mixed-species environment of subgingival dental plaque. Comparisons with previously sequenced spirochete genomes revealed specific factors contributing to differences and similarities in spirochete physiology as well as pathogenic potential. The T. denticola genome is considerably larger in size than the genome of the related syphilis-causing spirochete Treponema pallidum. The differences in gene content appear to be attributable to a combination of three phenomena: genome reduction, lineage-specific expansions, and horizontal gene transfer. Genes lost due to reductive evolution appear to be largely involved in metabolism and transport, whereas some of the genes that have arisen due to lineage-specific expansions are implicated in various pathogenic interactions, and genes acquired via horizontal gene transfer are largely phage-related or of unknown function. VL - 101 N1 - http://www.ncbi.nlm.nih.gov/pubmed/15064399?dopt=Abstract ER - TY - JOUR T1 - Genome sequence of Silicibacter pomeroyi reveals adaptations to the marine environment JF - Nature Y1 - 2004 A1 - Moran, Mary Ann A1 - Buchan, Alison A1 - González, José M.
A1 - Heidelberg, John F. A1 - Whitman, William B. A1 - Kiene, Ronald P. A1 - Henriksen, James R. A1 - King, Gary M. A1 - Belas, Robert A1 - Fuqua, Clay A1 - Brinkac, Lauren A1 - Lewis, Matt A1 - Johri, Shivani A1 - Weaver, Bruce A1 - Pai, Grace A1 - Eisen, Jonathan A. A1 - Rahe, Elisha A1 - Sheldon, Wade M. A1 - Ye, Wenying A1 - Miller, Todd R. A1 - Carlton, Jane A1 - Rasko, David A. A1 - Paulsen, Ian T. A1 - Ren, Qinghu A1 - Daugherty, Sean C. A1 - DeBoy, Robert T. A1 - Dodson, Robert J. A1 - Durkin, A. Scott A1 - Madupu, Ramana A1 - Nelson, William C. A1 - Sullivan, Steven A. A1 - Rosovitz, M. J. A1 - Haft, Daniel H. A1 - J. Selengut A1 - Ward, Naomi KW - Adaptation, Physiological KW - Carrier Proteins KW - Genes, Bacterial KW - Genome, Bacterial KW - marine biology KW - Molecular Sequence Data KW - Oceans and Seas KW - Phylogeny KW - plankton KW - RNA, Ribosomal, 16S KW - Roseobacter KW - Seawater AB - Since the recognition of prokaryotes as essential components of the oceanic food web, bacterioplankton have been acknowledged as catalysts of most major biogeochemical processes in the sea. Studying heterotrophic bacterioplankton has been challenging, however, as most major clades have never been cultured or have only been grown to low densities in sea water. Here we describe the genome sequence of Silicibacter pomeroyi, a member of the marine Roseobacter clade (Fig. 1), the relatives of which comprise approximately 10-20% of coastal and oceanic mixed-layer bacterioplankton. This first genome sequence from any major heterotrophic clade consists of a chromosome (4,109,442 base pairs) and megaplasmid (491,611 base pairs). Genome analysis indicates that this organism relies upon a lithoheterotrophic strategy that uses inorganic compounds (carbon monoxide and sulphide) to supplement heterotrophy. 
Silicibacter pomeroyi also has genes advantageous for associations with plankton and suspended particles, including genes for uptake of algal-derived compounds, use of metabolites from reducing microzones, rapid growth and cell-density-dependent regulation. This bacterium has a physiology distinct from that of marine oligotrophs, adding a new strategy to the recognized repertoire for coping with a nutrient-poor ocean. VL - 432 N1 - http://www.ncbi.nlm.nih.gov/pubmed/15602564?dopt=Abstract ER - TY - JOUR T1 - The genome sequence of the anaerobic, sulfate-reducing bacterium Desulfovibrio vulgaris Hildenborough JF - Nature biotechnology Y1 - 2004 A1 - Heidelberg, John F. A1 - Seshadri, Rekha A1 - Haveman, Shelley A. A1 - Hemme, Christopher L. A1 - Paulsen, Ian T. A1 - Kolonay, James F. A1 - Eisen, Jonathan A. A1 - Ward, Naomi A1 - Methe, Barbara A1 - Brinkac, Lauren M. A1 - Daugherty, Sean C. A1 - DeBoy, Robert T. A1 - Dodson, Robert J. A1 - Durkin, A. Scott A1 - Madupu, Ramana A1 - Nelson, William C. A1 - Sullivan, Steven A. A1 - Fouts, Derrick A1 - Haft, Daniel H. A1 - J. Selengut A1 - Peterson, Jeremy D. A1 - Davidsen, Tanja M. A1 - Zafar, Nikhat A1 - Zhou, Liwei A1 - Radune, Diana A1 - Dimitrov, George A1 - Hance, Mark A1 - Tran, Kevin A1 - Khouri, Hoda A1 - Gill, John A1 - Utterback, Terry R. A1 - Feldblyum, Tamara V. A1 - Wall, Judy D. A1 - Voordouw, Gerrit A1 - Fraser, Claire M. KW - Desulfovibrio vulgaris KW - Energy Metabolism KW - Genome, Bacterial KW - Molecular Sequence Data AB - Desulfovibrio vulgaris Hildenborough is a model organism for studying the energy metabolism of sulfate-reducing bacteria (SRB) and for understanding the economic impacts of SRB, including biocorrosion of metal infrastructure and bioremediation of toxic metal ions.
The 3,570,858 base pair (bp) genome sequence reveals a network of novel c-type cytochromes, connecting multiple periplasmic hydrogenases and formate dehydrogenases, as a key feature of its energy metabolism. The relative arrangement of genes encoding enzymes for energy transduction, together with inferred cellular location of the enzymes, provides a basis for proposing an expansion to the 'hydrogen-cycling' model for increasing energy efficiency in this bacterium. Plasmid-encoded functions include modification of cell surface components, nitrogen fixation and a type-III protein secretion system. This genome sequence represents a substantial step toward the elucidation of pathways for reduction (and bioremediation) of pollutants such as uranium and chromium and offers a new starting point for defining this organism's complex anaerobic respiration. VL - 22 N1 - http://www.ncbi.nlm.nih.gov/pubmed/15077118?dopt=Abstract ER - TY - JOUR T1 - Structural flexibility in the Burkholderia mallei genome JF - Proceedings of the National Academy of Sciences of the United States of America Y1 - 2004 A1 - Nierman, William C. A1 - DeShazer, David A1 - Kim, H. Stanley A1 - Tettelin, Hervé A1 - Nelson, Karen E. A1 - Feldblyum, Tamara A1 - Ulrich, Ricky L. A1 - Ronning, Catherine M. A1 - Brinkac, Lauren M. A1 - Daugherty, Sean C. A1 - Davidsen, Tanja D. A1 - DeBoy, Robert T. A1 - Dimitrov, George A1 - Dodson, Robert J. A1 - Durkin, A. Scott A1 - Gwinn, Michelle L. A1 - Haft, Daniel H. A1 - Khouri, Hoda A1 - Kolonay, James F. A1 - Madupu, Ramana A1 - Mohammoud, Yasmin A1 - Nelson, William C. A1 - Radune, Diana A1 - Romero, Claudia M. A1 - Sarria, Saul A1 - J. Selengut A1 - Shamblin, Christine A1 - Sullivan, Steven A. A1 - White, Owen A1 - Yu, Yan A1 - Zafar, Nikhat A1 - Zhou, Liwei A1 - Fraser, Claire M.
KW - Animals KW - Base Composition KW - Base Sequence KW - Burkholderia mallei KW - Chromosomes, Bacterial KW - Cricetinae KW - Genome, Bacterial KW - Glanders KW - Liver KW - Mesocricetus KW - Molecular Sequence Data KW - Multigene Family KW - Oligonucleotide Array Sequence Analysis KW - Open Reading Frames KW - virulence AB - The complete genome sequence of Burkholderia mallei ATCC 23344 provides insight into this highly infectious bacterium's pathogenicity and evolutionary history. B. mallei, the etiologic agent of glanders, has come under renewed scientific investigation as a result of recent concerns about its past and potential future use as a biological weapon. Genome analysis identified a number of putative virulence factors whose function was supported by comparative genome hybridization and expression profiling of the bacterium in hamster liver in vivo. The genome contains numerous insertion sequence elements that have mediated extensive deletions and rearrangements of the genome relative to Burkholderia pseudomallei. The genome also contains a vast number (>12,000) of simple sequence repeats. Variation in simple sequence repeats in key genes can provide a mechanism for generating antigenic variation that may account for the mammalian host's inability to mount a durable adaptive immune response to a B. mallei infection. VL - 101 N1 - http://www.ncbi.nlm.nih.gov/pubmed/15377793?dopt=Abstract ER - TY - JOUR T1 - Whole genome comparisons of serotype 4b and 1/2a strains of the food-borne pathogen Listeria monocytogenes reveal new insights into the core genome components of this species JF - Nucleic Acids Research Y1 - 2004 A1 - Nelson, Karen E. A1 - Fouts, Derrick E. A1 - Mongodin, Emmanuel F. A1 - Ravel, Jacques A1 - DeBoy, Robert T. A1 - Kolonay, James F. A1 - Rasko, David A. A1 - Angiuoli, Samuel V. A1 - Gill, Steven R. A1 - Paulsen, Ian T. A1 - Peterson, Jeremy A1 - White, Owen A1 - Nelson, William C.
A1 - Nierman, William A1 - Beanan, Maureen J. A1 - Brinkac, Lauren M. A1 - Daugherty, Sean C. A1 - Dodson, Robert J. A1 - Durkin, A. Scott A1 - Madupu, Ramana A1 - Haft, Daniel H. A1 - J. Selengut A1 - Van Aken, Susan A1 - Khouri, Hoda A1 - Fedorova, Nadia A1 - Forberger, Heather A1 - Tran, Bao A1 - Kathariou, Sophia A1 - Wonderling, Laura D. A1 - Uhlich, Gaylen A. A1 - Bayles, Darrell O. A1 - Luchansky, John B. A1 - Fraser, Claire M. KW - Base Composition KW - Chromosomes, Bacterial KW - DNA Transposable Elements KW - Food Microbiology KW - Genes, Bacterial KW - Genome, Bacterial KW - Genomics KW - Listeria monocytogenes KW - Meat KW - Open Reading Frames KW - Physical Chromosome Mapping KW - Polymorphism, Single Nucleotide KW - Prophages KW - Serotyping KW - Species Specificity KW - Synteny KW - virulence AB - The genomes of three strains of Listeria monocytogenes that have been associated with food-borne illness in the USA were subjected to whole genome comparative analysis. A total of 51, 97 and 69 strain-specific genes were identified in L.monocytogenes strains F2365 (serotype 4b, cheese isolate), F6854 (serotype 1/2a, frankfurter isolate) and H7858 (serotype 4b, meat isolate), respectively. Eighty-three genes were restricted to serotype 1/2a and 51 to serotype 4b strains. These strain- and serotype-specific genes probably contribute to observed differences in pathogenicity, and the ability of the organisms to survive and grow in their respective environmental niches. The serotype 1/2a-specific genes include an operon that encodes the rhamnose biosynthetic pathway that is associated with teichoic acid biosynthesis, as well as operons for five glycosyl transferases and an adenine-specific DNA methyltransferase. A total of 8603 and 105 050 high quality single nucleotide polymorphisms (SNPs) were found on the draft genome sequences of strain H7858 and strain F6854, respectively, when compared with strain F2365. 
Whole genome comparative analyses revealed that the L.monocytogenes genomes are essentially syntenic, with the majority of genomic differences consisting of phage insertions, transposable elements and SNPs. VL - 32 N1 - http://www.ncbi.nlm.nih.gov/pubmed/15115801?dopt=Abstract ER - TY - JOUR T1 - The complete genome sequence of the Arabidopsis and tomato pathogen Pseudomonas syringae pv. tomato DC3000 JF - Proceedings of the National Academy of Sciences of the United States of America Y1 - 2003 A1 - Buell, C. Robin A1 - Joardar, Vinita A1 - Lindeberg, Magdalen A1 - J. Selengut A1 - Paulsen, Ian T. A1 - Gwinn, Michelle L. A1 - Dodson, Robert J. A1 - DeBoy, Robert T. A1 - Durkin, A. Scott A1 - Kolonay, James F. A1 - Madupu, Ramana A1 - Daugherty, Sean A1 - Brinkac, Lauren A1 - Beanan, Maureen J. A1 - Haft, Daniel H. A1 - Nelson, William C. A1 - Davidsen, Tanja A1 - Zafar, Nikhat A1 - Zhou, Liwei A1 - Liu, Jia A1 - Yuan, Qiaoping A1 - Khouri, Hoda A1 - Fedorova, Nadia A1 - Tran, Bao A1 - Russell, Daniel A1 - Berry, Kristi A1 - Utterback, Teresa A1 - Aken, Susan E. van A1 - Feldblyum, Tamara V. A1 - D'Ascenzo, Mark A1 - Deng, Wen-Ling A1 - Ramos, Adela R. A1 - Alfano, James R. A1 - Cartinhour, Samuel A1 - Chatterjee, Arun K. A1 - Delaney, Terrence P. A1 - Lazarowitz, Sondra G. A1 - Martin, Gregory B. A1 - Schneider, David J. A1 - Tang, Xiaoyan A1 - Bender, Carol L. A1 - White, Owen A1 - Fraser, Claire M. A1 - Collmer, Alan KW - Arabidopsis KW - Base Sequence KW - Biological Transport KW - Genome, Bacterial KW - Lycopersicon esculentum KW - Molecular Sequence Data KW - Plant Growth Regulators KW - Plasmids KW - Pseudomonas KW - Reactive Oxygen Species KW - Siderophores KW - virulence AB - We report the complete genome sequence of the model bacterial pathogen Pseudomonas syringae pathovar tomato DC3000 (DC3000), which is pathogenic on tomato and Arabidopsis thaliana.
The DC3000 genome (6.5 megabases) contains a circular chromosome and two plasmids, which collectively encode 5,763 ORFs. We identified 298 established and putative virulence genes, including several clusters of genes encoding 31 confirmed and 19 predicted type III secretion system effector proteins. Many of the virulence genes were members of paralogous families and also were proximal to mobile elements, which collectively comprise 7% of the DC3000 genome. The bacterium possesses a large repertoire of transporters for the acquisition of nutrients, particularly sugars, as well as genes implicated in attachment to plant surfaces. Over 12% of the genes are dedicated to regulation, which may reflect the need for rapid adaptation to the diverse environments encountered during epiphytic growth and pathogenesis. Comparative analyses confirmed a high degree of similarity with two sequenced pseudomonads, Pseudomonas putida and Pseudomonas aeruginosa, yet revealed 1,159 genes unique to DC3000, of which 811 lack a known function. VL - 100 N1 - http://www.ncbi.nlm.nih.gov/pubmed/12928499?dopt=Abstract ER - TY - JOUR T1 - Genome of Geobacter sulfurreducens: metal reduction in subsurface environments JF - Science (New York, N.Y.) Y1 - 2003 A1 - Methé, B. A. A1 - Nelson, K. E. A1 - Eisen, J. A. A1 - Paulsen, I. T. A1 - Nelson, W. A1 - Heidelberg, J. F. A1 - Wu, D. A1 - Wu, M. A1 - Ward, N. A1 - Beanan, M. J. A1 - Dodson, R. J. A1 - Madupu, R. A1 - Brinkac, L. M. A1 - Daugherty, S. C. A1 - DeBoy, R. T. A1 - Durkin, A. S. A1 - Gwinn, M. A1 - Kolonay, J. F. A1 - Sullivan, S. A. A1 - Haft, D. H. A1 - J. Selengut A1 - Davidsen, T. M. A1 - Zafar, N. A1 - White, O. A1 - Tran, B. A1 - Romero, C. A1 - Forberger, H. A. A1 - Weidman, J. A1 - Khouri, H. A1 - Feldblyum, T. V. A1 - Utterback, T. R. A1 - Van Aken, S. E. A1 - Lovley, D. R. A1 - Fraser, C. M.
KW - Acetates KW - Acetyl Coenzyme A KW - Aerobiosis KW - Anaerobiosis KW - Bacterial Proteins KW - Carbon KW - Chemotaxis KW - Chromosomes, Bacterial KW - Cytochromes c KW - Electron Transport KW - Energy Metabolism KW - Genes, Bacterial KW - Genes, Regulator KW - Genome, Bacterial KW - Geobacter KW - Hydrogen KW - Metals KW - Movement KW - Open Reading Frames KW - Oxidation-Reduction KW - Phylogeny AB - The complete genome sequence of Geobacter sulfurreducens, a delta-proteobacterium, reveals unsuspected capabilities, including evidence of aerobic metabolism, one-carbon and complex carbon metabolism, motility, and chemotactic behavior. These characteristics, coupled with the possession of many two-component sensors and many c-type cytochromes, reveal an ability to create alternative, redundant, electron transport networks and offer insights into the process of metal ion reduction in subsurface environments. As well as playing roles in the global cycling of metals and carbon, this organism clearly has the potential for use in bioremediation of radioactive metals and in the generation of electricity. VL - 302 N1 - http://www.ncbi.nlm.nih.gov/pubmed/14671304?dopt=Abstract ER -