TY - Generic
T1 - Developmental expression of chicken FOXN1 and putative target genes during feather development.
Y1 - 2014
A1 - Darnell, Diana K
A1 - Zhang, Li S
A1 - Hannenhalli, Sridhar
A1 - Yaklichkin, Sergey Y
KW - Amino Acid Sequence
KW - Animals
KW - Biological Evolution
KW - Blotting, Western
KW - Cell Differentiation
KW - Cells, Cultured
KW - Chick Embryo
KW - Chickens
KW - Cloning, Molecular
KW - Embryo, Nonmammalian
KW - Epidermis
KW - Feathers
KW - Forkhead Transcription Factors
KW - Gene Expression Regulation, Developmental
KW - In Situ Hybridization
KW - Molecular Sequence Data
KW - Morphogenesis
KW - Phylogeny
KW - Real-Time Polymerase Chain Reaction
KW - Reverse Transcriptase Polymerase Chain Reaction
KW - RNA, Messenger
KW - Sequence Homology, Amino Acid
AB - FOXN1 is a member of the forkhead box family of transcription factors. FOXN1 is crucial for hair outgrowth and thymus differentiation in mammals. Unlike the thymus, which is found in all amniotes, hair is an epidermal appendage that arose after the last shared common ancestor between mammals and birds, and hair and feathers differ markedly in their differentiation and gene expression. Here, we show that FOXN1 is expressed in embryonic chicken feathers, nails and thymus, demonstrating an evolutionary conservation that goes beyond obvious homology. At embryonic day (ED) 12, FOXN1 is expressed in some feather buds and at ED13 expression extends along the length of the feather filament. At ED14 FOXN1 mRNA is restricted to the proximal feather filament and is not detectable in distal feather shafts. At the base of the feather, FOXN1 is expressed in the epithelium of the feather sheath and distal barb and marginal plate, whereas in the midsection FOXN1 transcripts are mainly detected in the barb plates of the feather filament. FOXN1 is also expressed in claws; however, no expression was detected in skin or scales. Despite expression of FOXN1 in developing feathers, examination of chick homologs of five putative mammalian FOXN1 target genes shows that, while these genes are expressed in feathers, there is little similarity to the FOXN1 expression pattern, suggesting that some gene regulatory networks may have diverged during evolution of epidermal appendages.
JA - Int J Dev Biol
VL - 58
CP - 1
M3 - 10.1387/ijdb.130023sy
ER -
TY - JOUR
T1 - A large-scale, higher-level, molecular phylogenetic study of the insect order Lepidoptera (moths and butterflies)
JF - PLoS One
Y1 - 2013
A1 - Regier, Jerome C.
A1 - Mitter, Charles
A1 - Zwick, Andreas
A1 - Bazinet, Adam L.
A1 - Cummings, Michael P.
A1 - Kawahara, Akito Y.
A1 - Sohn, Jae-Cheon
A1 - Zwickl, Derrick J.
A1 - Cho, Soowon
A1 - Davis, Donald R.
A1 - Baixeras, Joaquin
A1 - Brown, John
A1 - Parr, Cynthia
A1 - Weller, Susan
A1 - Lees, David C.
A1 - Mitter, Kim T.
KW - Animals
KW - Butterflies
KW - Moths
KW - Phylogeny
AB - BACKGROUND: Higher-level relationships within the Lepidoptera, and particularly within the species-rich subclade Ditrysia, are generally not well understood, although recent studies have yielded progress. We present the most comprehensive molecular analysis of lepidopteran phylogeny to date, focusing on relationships among superfamilies.
METHODOLOGY/PRINCIPAL FINDINGS: 483 taxa spanning 115 of 124 families were sampled for 19 protein-coding nuclear genes, from which maximum likelihood tree estimates and bootstrap percentages were obtained using GARLI. Assessment of heuristic search effectiveness showed that better trees and higher bootstrap percentages probably remain to be discovered even after 1000 or more search replicates, but further search proved impractical even with grid computing. Other analyses explored the effects of sampling nonsynonymous change only versus partitioned and unpartitioned total nucleotide change; deletion of rogue taxa; and compositional heterogeneity. Relationships among the non-ditrysian lineages previously inferred from morphology were largely confirmed, plus some new ones, with strong support. Robust support was also found for divergences among non-apoditrysian lineages of Ditrysia, but only rarely so within Apoditrysia. Paraphyly for Tineoidea is strongly supported by analysis of nonsynonymous-only signal; conflicting, strong support for tineoid monophyly when synonymous signal was added back is shown to result from compositional heterogeneity.
CONCLUSIONS/SIGNIFICANCE: Support for among-superfamily relationships outside the Apoditrysia is now generally strong. Comparable support is mostly lacking within Apoditrysia, but dramatically increased bootstrap percentages for some nodes after rogue taxon removal, and concordance with other evidence, strongly suggest that our picture of apoditrysian phylogeny is approximately correct. This study highlights the challenge of finding optimal topologies when analyzing hundreds of taxa. It also shows that some nodes get strong support only when analysis is restricted to nonsynonymous change, while total change is necessary for strong support of others. Thus, multiple types of analyses will be necessary to fully resolve lepidopteran phylogeny.
VL - 8
ER -
TY - JOUR
T1 - Genomic insights to SAR86, an abundant and uncultivated marine bacterial lineage.
JF - ISME J
Y1 - 2012
A1 - Dupont, Chris L
A1 - Rusch, Douglas B
A1 - Yooseph, Shibu
A1 - Lombardo, Mary-Jane
A1 - Richter, R Alexander
A1 - Valas, Ruben
A1 - Novotny, Mark
A1 - Yee-Greenbaum, Joyclyn
A1 - Selengut, Jeremy D
A1 - Haft, Dan H
A1 - Halpern, Aaron L
A1 - Lasken, Roger S
A1 - Nealson, Kenneth
A1 - Friedman, Robert
A1 - Venter, J Craig
KW - Computational Biology
KW - Gammaproteobacteria
KW - Genome, Bacterial
KW - Genomic Library
KW - metagenomics
KW - Oceans and Seas
KW - Phylogeny
KW - plankton
KW - Rhodopsin
KW - Rhodopsins, Microbial
KW - RNA, Ribosomal, 16S
KW - Seawater
AB - Bacteria in the 16S rRNA clade SAR86 are among the most abundant uncultivated constituents of microbial assemblages in the surface ocean for which little genomic information is currently available. Bioinformatic techniques were used to assemble two nearly complete genomes from marine metagenomes and single-cell sequencing provided two more partial genomes. Recruitment of metagenomic data shows that these SAR86 genomes substantially increase our knowledge of non-photosynthetic bacteria in the surface ocean. Phylogenomic analyses establish SAR86 as a basal and divergent lineage of γ-proteobacteria, and the individual genomes display a temperature-dependent distribution. Modestly sized at 1.25-1.7 Mbp, the SAR86 genomes lack several pathways for amino-acid and vitamin synthesis as well as sulfate reduction, trends commonly observed in other abundant marine microbes. SAR86 appears to be an aerobic chemoheterotroph with the potential for proteorhodopsin-based ATP generation, though the apparent lack of a retinal biosynthesis pathway may require it to scavenge exogenously-derived pigments to utilize proteorhodopsin. The genomes contain an expanded capacity for the degradation of lipids and carbohydrates acquired using a wealth of tonB-dependent outer membrane receptors. Like the abundant planktonic marine bacterial clade SAR11, SAR86 exhibits metabolic streamlining, but also a distinct carbon compound specialization, possibly avoiding competition.
VL - 6
CP - 6
M3 - 10.1038/ismej.2011.189
ER -
TY - JOUR
T1 - Genomic insights to SAR86, an abundant and uncultivated marine bacterial lineage
JF - The ISME journal
Y1 - 2012
A1 - Dupont, Chris L.
A1 - Rusch, Douglas B.
A1 - Yooseph, Shibu
A1 - Lombardo, Mary-Jane
A1 - Richter, R. Alexander
A1 - Valas, Ruben
A1 - Novotny, Mark
A1 - Yee-Greenbaum, Joyclyn
A1 - Selengut, Jeremy D.
A1 - Haft, Dan H.
A1 - Halpern, Aaron L.
A1 - Lasken, Roger S.
A1 - Nealson, Kenneth
A1 - Friedman, Robert
A1 - Venter, J. Craig
KW - Computational Biology
KW - Gammaproteobacteria
KW - Genome, Bacterial
KW - Genomic Library
KW - metagenomics
KW - Oceans and Seas
KW - Phylogeny
KW - plankton
KW - Rhodopsin
KW - RNA, Ribosomal, 16S
KW - Seawater
AB - Bacteria in the 16S rRNA clade SAR86 are among the most abundant uncultivated constituents of microbial assemblages in the surface ocean for which little genomic information is currently available. Bioinformatic techniques were used to assemble two nearly complete genomes from marine metagenomes and single-cell sequencing provided two more partial genomes. Recruitment of metagenomic data shows that these SAR86 genomes substantially increase our knowledge of non-photosynthetic bacteria in the surface ocean. Phylogenomic analyses establish SAR86 as a basal and divergent lineage of γ-proteobacteria, and the individual genomes display a temperature-dependent distribution. Modestly sized at 1.25-1.7 Mbp, the SAR86 genomes lack several pathways for amino-acid and vitamin synthesis as well as sulfate reduction, trends commonly observed in other abundant marine microbes. SAR86 appears to be an aerobic chemoheterotroph with the potential for proteorhodopsin-based ATP generation, though the apparent lack of a retinal biosynthesis pathway may require it to scavenge exogenously-derived pigments to utilize proteorhodopsin. The genomes contain an expanded capacity for the degradation of lipids and carbohydrates acquired using a wealth of tonB-dependent outer membrane receptors. Like the abundant planktonic marine bacterial clade SAR11, SAR86 exhibits metabolic streamlining, but also a distinct carbon compound specialization, possibly avoiding competition.
VL - 6
N1 - http://www.ncbi.nlm.nih.gov/pubmed/22170421?dopt=Abstract
ER -
TY - JOUR
T1 - ProPhylo: partial phylogenetic profiling to guide protein family construction and assignment of biological process.
JF - BMC Bioinformatics
Y1 - 2011
A1 - Basu, Malay K
A1 - Selengut, Jeremy D
A1 - Haft, Daniel H
KW - algorithms
KW - Archaea
KW - Archaeal Proteins
KW - DNA
KW - Methane
KW - Phylogeny
KW - software
AB - BACKGROUND: Phylogenetic profiling is a technique of scoring co-occurrence between a protein family and some other trait, usually another protein family, across a set of taxonomic groups. In spite of several refinements in recent years, the technique still invites significant improvement. To be its most effective, a phylogenetic profiling algorithm must be able to examine co-occurrences among protein families whose boundaries are uncertain within large homologous protein superfamilies.
RESULTS: Partial Phylogenetic Profiling (PPP) is an iterative algorithm that scores a given taxonomic profile against the taxonomic distribution of families for all proteins in a genome. The method works through optimizing the boundary of each protein family, rather than by relying on prebuilt protein families or fixed sequence similarity thresholds. Double Partial Phylogenetic Profiling (DPPP) is a related procedure that begins with a single sequence and searches for optimal granularities for its surrounding protein family in order to generate the best query profiles for PPP. We present ProPhylo, a high-performance software package for phylogenetic profiling studies through creating individually optimized protein family boundaries. ProPhylo provides precomputed databases for immediate use and tools for manipulating the taxonomic profiles used as queries.
CONCLUSION: ProPhylo results show universal markers of methanogenesis, a new DNA phosphorothioation-dependent restriction enzyme, and efficacy in guiding protein family construction. The software and the associated databases are freely available under the open source Perl Artistic License from ftp://ftp.jcvi.org/pub/data/ppp/.
VL - 12
M3 - 10.1186/1471-2105-12-434
ER -
TY - JOUR
T1 - ProPhylo: partial phylogenetic profiling to guide protein family construction and assignment of biological process
JF - BMC Bioinformatics
Y1 - 2011
A1 - Basu, Malay K.
A1 - Selengut, Jeremy D.
A1 - Haft, Daniel H.
KW - algorithms
KW - Archaea
KW - Archaeal Proteins
KW - DNA
KW - Methane
KW - Phylogeny
KW - software
AB - BACKGROUND: Phylogenetic profiling is a technique of scoring co-occurrence between a protein family and some other trait, usually another protein family, across a set of taxonomic groups. In spite of several refinements in recent years, the technique still invites significant improvement. To be its most effective, a phylogenetic profiling algorithm must be able to examine co-occurrences among protein families whose boundaries are uncertain within large homologous protein superfamilies. RESULTS: Partial Phylogenetic Profiling (PPP) is an iterative algorithm that scores a given taxonomic profile against the taxonomic distribution of families for all proteins in a genome. The method works through optimizing the boundary of each protein family, rather than by relying on prebuilt protein families or fixed sequence similarity thresholds. Double Partial Phylogenetic Profiling (DPPP) is a related procedure that begins with a single sequence and searches for optimal granularities for its surrounding protein family in order to generate the best query profiles for PPP. We present ProPhylo, a high-performance software package for phylogenetic profiling studies through creating individually optimized protein family boundaries. ProPhylo provides precomputed databases for immediate use and tools for manipulating the taxonomic profiles used as queries. CONCLUSION: ProPhylo results show universal markers of methanogenesis, a new DNA phosphorothioation-dependent restriction enzyme, and efficacy in guiding protein family construction. The software and the associated databases are freely available under the open source Perl Artistic License from ftp://ftp.jcvi.org/pub/data/ppp/.
VL - 12
N1 - http://www.ncbi.nlm.nih.gov/pubmed/22070167?dopt=Abstract
ER -
TY - JOUR
T1 - The Alveolate Perkinsus marinus: biological insights from EST gene discovery.
JF - BMC Genomics
Y1 - 2010
A1 - Joseph, Sandeep J
A1 - Fernández-Robledo, José A
A1 - Gardner, Malcolm J
A1 - El-Sayed, Najib M
A1 - Kuo, Chih-Horng
A1 - Schott, Eric J
A1 - Wang, Haiming
A1 - Kissinger, Jessica C
A1 - Vasta, Gerardo R
KW - Alveolata
KW - Animals
KW - Expressed Sequence Tags
KW - Ostreidae
KW - Phylogeny
AB - BACKGROUND: Perkinsus marinus, a protozoan parasite of the eastern oyster Crassostrea virginica, has devastated natural and farmed oyster populations along the Atlantic and Gulf coasts of the United States. It is classified as a member of the Perkinsozoa, a recently established phylum considered close to the ancestor of ciliates, dinoflagellates, and apicomplexans, and a key taxon for understanding unique adaptations (e.g. parasitism) within the Alveolata. Despite intense parasite pressure, no disease-resistant oysters have been identified and no effective therapies have been developed to date.
RESULTS: To gain insight into the biological basis of the parasite's virulence and pathogenesis mechanisms, and to identify genes encoding potential targets for intervention, we generated >31,000 5' expressed sequence tags (ESTs) derived from four trophozoite libraries generated from two P. marinus strains. Trimming and clustering of the sequence tags yielded 7,863 unique sequences, some of which carry a spliced leader. Similarity searches revealed that 55% of these had hits in protein sequence databases, of which 1,729 had their best hit with proteins from the chromalveolates (E-value
CONCLUSIONS: Our transcriptome analysis of P. marinus, the first for any member of the Perkinsozoa, contributes new insight into its biology and taxonomic position. It provides a very informative, albeit preliminary, glimpse into the expression of genes encoding functionally relevant proteins as potential targets for chemotherapy, and evidence for the presence of a relict plastid. Further, although P. marinus sequences display significant similarity to those from both apicomplexans and dinoflagellates, the presence of trans-spliced transcripts confirms the previously established affinities with the latter. The EST analysis reported herein, together with the recently completed sequence of the P. marinus genome and the development of transfection methodology, should result in improved intervention strategies against dermo disease.
VL - 11
M3 - 10.1186/1471-2164-11-228
ER -
TY - Generic
T1 - MetaPhyler: Taxonomic profiling for metagenomic sequences
T2 - 2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Y1 - 2010
A1 - Liu, Bo
A1 - Gibbons, T.
A1 - Ghodsi, M.
A1 - Pop, M.
KW - Bioinformatics
KW - CARMA comparison
KW - Databases
KW - Genomics
KW - Linear regression
KW - marker genes
KW - matching length
KW - Megan comparison
KW - metagenomic sequences
KW - metagenomics
KW - MetaPhyler
KW - microbial diversity
KW - microorganisms
KW - molecular biophysics
KW - molecular configurations
KW - Pattern classification
KW - pattern matching
KW - phylogenetic classification
KW - Phylogeny
KW - PhymmBL comparison
KW - reference gene database
KW - Sensitivity
KW - sequence matching
KW - taxonomic classifier
KW - taxonomic level
KW - taxonomic profiling
KW - whole metagenome sequencing data
AB - A major goal of metagenomics is to characterize the microbial diversity of an environment. The most popular approach relies on 16S rRNA sequencing, however this approach can generate biased estimates due to differences in the copy number of the 16S rRNA gene between even closely related organisms, and due to PCR artifacts. The taxonomic composition can also be determined from whole-metagenome sequencing data by matching individual sequences against a database of reference genes. One major limitation of prior methods used for this purpose is the use of a universal classification threshold for all genes at all taxonomic levels. We propose that better classification results can be obtained by tuning the taxonomic classifier to each matching length, reference gene, and taxonomic level. We present a novel taxonomic profiler MetaPhyler, which uses marker genes as a taxonomic reference. Results on simulated datasets demonstrate that MetaPhyler outperforms other tools commonly used in this context (CARMA, Megan and PhymmBL). We also present interesting results obtained by applying MetaPhyler to a real metagenomic dataset.
JA - 2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
PB - IEEE
SN - 978-1-4244-8306-8
ER -
TY - JOUR
T1 - Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL): adapting the Partial Phylogenetic Profiling algorithm to scan sequences for signatures that predict protein function
JF - BMC Bioinformatics
Y1 - 2010
A1 - Selengut, Jeremy D.
A1 - Rusch, Douglas B.
A1 - Haft, Daniel H.
KW - algorithms
KW - Amino Acid Sequence
KW - Gene Expression Profiling
KW - Molecular Sequence Data
KW - Phylogeny
KW - Proteins
KW - Sequence Analysis, Protein
KW - Structure-Activity Relationship
AB - BACKGROUND: Comparative genomics methods such as phylogenetic profiling can mine powerful inferences from inherently noisy biological data sets. We introduce Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL), a method that applies the Partial Phylogenetic Profiling (PPP) approach locally within a protein sequence to discover short sequence signatures associated with functional sites. The approach is based on the basic scoring mechanism employed by PPP, namely the use of binomial distribution statistics to optimize sequence similarity cutoffs during searches of partitioned training sets.
RESULTS: Here we illustrate and validate the ability of the SIMBAL method to find functionally relevant short sequence signatures by application to two well-characterized protein families. In the first example, we partitioned a family of ABC permeases using a metabolic background property (urea utilization). Thus, the TRUE set for this family comprised members whose genome of origin encoded a urea utilization system. By moving a sliding window across the sequence of a permease, and searching each subsequence in turn against the full set of partitioned proteins, the method found which local sequence signatures best correlated with the urea utilization trait. Mapping of SIMBAL "hot spots" onto crystal structures of homologous permeases reveals that the significant sites are gating determinants on the cytosolic face rather than, say, docking sites for the substrate-binding protein on the extracellular face. In the second example, we partitioned a protein methyltransferase family using gene proximity as a criterion. In this case, the TRUE set comprised those methyltransferases encoded near the gene for the substrate RF-1. SIMBAL identifies sequence regions that map onto the substrate-binding interface while ignoring regions involved in the methyltransferase reaction mechanism in general. Neither method for training set construction requires any prior experimental characterization.
CONCLUSIONS: SIMBAL shows that, in functionally divergent protein families, selected short sequences often significantly outperform their full-length parent sequence for making functional predictions by sequence similarity, suggesting avenues for improved functional classifiers. When combined with structural data, SIMBAL affords the ability to localize and model functional sites.
VL - 11
N1 - http://www.ncbi.nlm.nih.gov/pubmed/20102603?dopt=Abstract
ER -
TY - JOUR
T1 - Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL): adapting the Partial Phylogenetic Profiling algorithm to scan sequences for signatures that predict protein function.
JF - BMC Bioinformatics
Y1 - 2010
A1 - Selengut, Jeremy D
A1 - Rusch, Douglas B
A1 - Haft, Daniel H
KW - algorithms
KW - Amino Acid Sequence
KW - Gene Expression Profiling
KW - Molecular Sequence Data
KW - Phylogeny
KW - Proteins
KW - Sequence Analysis, Protein
KW - Structure-Activity Relationship
AB - BACKGROUND: Comparative genomics methods such as phylogenetic profiling can mine powerful inferences from inherently noisy biological data sets. We introduce Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL), a method that applies the Partial Phylogenetic Profiling (PPP) approach locally within a protein sequence to discover short sequence signatures associated with functional sites. The approach is based on the basic scoring mechanism employed by PPP, namely the use of binomial distribution statistics to optimize sequence similarity cutoffs during searches of partitioned training sets.
RESULTS: Here we illustrate and validate the ability of the SIMBAL method to find functionally relevant short sequence signatures by application to two well-characterized protein families. In the first example, we partitioned a family of ABC permeases using a metabolic background property (urea utilization). Thus, the TRUE set for this family comprised members whose genome of origin encoded a urea utilization system. By moving a sliding window across the sequence of a permease, and searching each subsequence in turn against the full set of partitioned proteins, the method found which local sequence signatures best correlated with the urea utilization trait. Mapping of SIMBAL "hot spots" onto crystal structures of homologous permeases reveals that the significant sites are gating determinants on the cytosolic face rather than, say, docking sites for the substrate-binding protein on the extracellular face. In the second example, we partitioned a protein methyltransferase family using gene proximity as a criterion. In this case, the TRUE set comprised those methyltransferases encoded near the gene for the substrate RF-1. SIMBAL identifies sequence regions that map onto the substrate-binding interface while ignoring regions involved in the methyltransferase reaction mechanism in general. Neither method for training set construction requires any prior experimental characterization.
CONCLUSIONS: SIMBAL shows that, in functionally divergent protein families, selected short sequences often significantly outperform their full-length parent sequence for making functional predictions by sequence similarity, suggesting avenues for improved functional classifiers. When combined with structural data, SIMBAL affords the ability to localize and model functional sites.
VL - 11
M3 - 10.1186/1471-2105-11-52
ER -
TY - JOUR
T1 - Unexpected abundance of coenzyme F(420)-dependent enzymes in Mycobacterium tuberculosis and other actinobacteria
JF - Journal of bacteriology
Y1 - 2010
A1 - Selengut, Jeremy D.
A1 - Haft, Daniel H.
KW - Actinobacteria
KW - Amino Acid Sequence
KW - Binding Sites
KW - Coenzymes
KW - Flavonoids
KW - Gene Expression Profiling
KW - Gene Expression Regulation, Bacterial
KW - Genome, Bacterial
KW - molecular biology
KW - Molecular Sequence Data
KW - Molecular Structure
KW - Mycobacterium tuberculosis
KW - Phylogeny
KW - Protein Conformation
KW - Riboflavin
AB - Regimens targeting Mycobacterium tuberculosis, the causative agent of tuberculosis (TB), require long courses of treatment and a combination of three or more drugs. An increase in drug-resistant strains of M. tuberculosis demonstrates the need for additional TB-specific drugs. A notable feature of M. tuberculosis is coenzyme F(420), which is distributed sporadically and sparsely among prokaryotes. This distribution allows for comparative genomics-based investigations. Phylogenetic profiling (comparison of differential gene content) based on F(420) biosynthesis nominated many actinobacterial proteins as candidate F(420)-dependent enzymes. Three such families dominated the results: the luciferase-like monooxygenase (LLM), pyridoxamine 5'-phosphate oxidase (PPOX), and deazaflavin-dependent nitroreductase (DDN) families. The DDN family was determined to be limited to F(420)-producing species. The LLM and PPOX families were observed in F(420)-producing species as well as species lacking F(420) but were particularly numerous in many actinobacterial species, including M. tuberculosis. Partitioning the LLM and PPOX families based on an organism's ability to make F(420) allowed the application of the SIMBAL (sites inferred by metabolic background assertion labeling) profiling method to identify F(420)-correlated subsequences. These regions were found to correspond to flavonoid cofactor binding sites. Significantly, these results showed that M. tuberculosis carries at least 28 separate F(420)-dependent enzymes, most of unknown function, and a paucity of flavin mononucleotide (FMN)-dependent proteins in these families. While prevalent in mycobacteria, markers of F(420) biosynthesis appeared to be absent from the normal human gut flora. These findings suggest that M. tuberculosis relies heavily on coenzyme F(420) for its redox reactions. This dependence and the cofactor's rarity may make F(420)-related proteins promising drug targets.
VL - 192
N1 - http://www.ncbi.nlm.nih.gov/pubmed/20675471?dopt=Abstract
ER -
TY - JOUR
T1 - Unexpected abundance of coenzyme F(420)-dependent enzymes in Mycobacterium tuberculosis and other actinobacteria.
JF - J Bacteriol
Y1 - 2010
A1 - Selengut, Jeremy D
A1 - Haft, Daniel H
KW - Actinobacteria
KW - Amino Acid Sequence
KW - Binding Sites
KW - Coenzymes
KW - Flavonoids
KW - Gene Expression Profiling
KW - Gene Expression Regulation, Bacterial
KW - Genome, Bacterial
KW - molecular biology
KW - Molecular Sequence Data
KW - Molecular Structure
KW - Mycobacterium tuberculosis
KW - Phylogeny
KW - Protein Conformation
KW - Riboflavin
AB - Regimens targeting Mycobacterium tuberculosis, the causative agent of tuberculosis (TB), require long courses of treatment and a combination of three or more drugs. An increase in drug-resistant strains of M. tuberculosis demonstrates the need for additional TB-specific drugs. A notable feature of M. tuberculosis is coenzyme F(420), which is distributed sporadically and sparsely among prokaryotes. This distribution allows for comparative genomics-based investigations. Phylogenetic profiling (comparison of differential gene content) based on F(420) biosynthesis nominated many actinobacterial proteins as candidate F(420)-dependent enzymes. Three such families dominated the results: the luciferase-like monooxygenase (LLM), pyridoxamine 5'-phosphate oxidase (PPOX), and deazaflavin-dependent nitroreductase (DDN) families. The DDN family was determined to be limited to F(420)-producing species. The LLM and PPOX families were observed in F(420)-producing species as well as species lacking F(420) but were particularly numerous in many actinobacterial species, including M. tuberculosis. Partitioning the LLM and PPOX families based on an organism's ability to make F(420) allowed the application of the SIMBAL (sites inferred by metabolic background assertion labeling) profiling method to identify F(420)-correlated subsequences. These regions were found to correspond to flavonoid cofactor binding sites. Significantly, these results showed that M. tuberculosis carries at least 28 separate F(420)-dependent enzymes, most of unknown function, and a paucity of flavin mononucleotide (FMN)-dependent proteins in these families. While prevalent in mycobacteria, markers of F(420) biosynthesis appeared to be absent from the normal human gut flora. These findings suggest that M. tuberculosis relies heavily on coenzyme F(420) for its redox reactions. This dependence and the cofactor's rarity may make F(420)-related proteins promising drug targets.
VL - 192
CP - 21
M3 - 10.1128/JB.00425-10
ER -
TY - JOUR
T1 - Validating the systematic position of Plationus Segers, Murugan & Dumont, 1993 (Rotifera: Brachionidae) using sequences of the large subunit of the nuclear ribosomal DNA and of cytochrome C oxidase
JF - Hydrobiologia
Y1 - 2010
A1 - Reyna-Fabian, M. E.
A1 - Laclette, J. P.
A1 - Cummings, Michael P.
A1 - García-Varela, M.
KW - Cox1
KW - likelihood
KW - LSU
KW - maximum
KW - Phylogeny
KW - Plationus
AB - Members of the family Brachionidae are free-living organisms that range in size from 170 to 250 microns. They comprise part of the zooplankton in freshwater and marine systems worldwide. Morphologically, members of the family are characterized by a single piece loricated body without furrows, grooves, sulci or dorsal head shields, and a malleate trophi. Differences in these structures have been traditionally used to recognize 217 species that are classified into seven genera. However, the validity of the species, Plationus patulus, P. patulus macracanthus, P. polyacanthus, and P. felicitas have been confused because they were alternatively assigned in Brachionus or Platyias, when considering only morphological and ecological characters. Based on scanning electron microscope (SEM) images of the trophi, these taxa were assigned in a new genus, Plationus. In this study, we examined the systematic position of P. patulus and P. patulus macracanthus using DNA sequences of two genes: the cytochrome oxidase subunit 1 (cox1) and domains D2 and D3 of the large subunit of the nuclear ribosomal RNA (LSU). In addition, the cox1 and LSU sequences representing five genera of Brachionidae (Anuraeopsis, Brachionus, Keratella, Plationus, and Platyias) plus four species of three families from the order Ploima were used as the outgroup. The maximum likelihood (ML) analyses were conducted for each individual gene as well as for the combined (cox1 + LSU) data set. The ML tree from the combined data set yielded the family Brachionidae as a monophyletic group with weak bootstrap support (< 50%). Five main clades in this tree had high (> 85%) bootstrap support. The first clade was composed of three populations of P. patulus + P. patulus macracanthus. The second clade was composed of a single species of Platyias. The third clade was composed of six species of Brachionus. The fourth clade included a single species of the genus Anuraeopsis, and the fifth clade was composed of three species of the genus Keratella. The genetic divergence between Plationus and Platyias ranged from 18.4 to 19.2% for cox1, and from 4.5 to 4.9% for LSU, and between Brachionus and Plationus, it ranged from 16.9 to 23.1% (cox1), and from 7.3 to 9.1% (LSU). Morphological evidence, the amount of genetic divergence, the systematic position of Plationus within the family Brachionidae, and the position of Plationus as a sister group of Brachionus and Platyias support the validity of Plationus patulus and P. patulus macracanthus into the genus Plationus.
VL - 644
ER -
TY - JOUR
T1 - Three genomes from the phylum Acidobacteria provide insight into the lifestyles of these microorganisms in soils
JF - Applied and Environmental Microbiology
Y1 - 2009
A1 - Ward, Naomi L.
A1 - Challacombe, Jean F.
A1 - Janssen, Peter H.
A1 - Henrissat, Bernard
A1 - Coutinho, Pedro M.
A1 - Wu, Martin
A1 - Xie, Gary
A1 - Haft, Daniel H.
A1 - Sait, Michelle
A1 - Badger, Jonathan
A1 - Barabote, Ravi D.
A1 - Bradley, Brent
A1 - Brettin, Thomas S.
A1 - Brinkac, Lauren M.
A1 - Bruce, David
A1 - Creasy, Todd
A1 - Daugherty, Sean C.
A1 - Davidsen, Tanja M.
A1 - DeBoy, Robert T.
A1 - Detter, J. Chris
A1 - Dodson, Robert J.
A1 - Durkin, A. Scott
A1 - Ganapathy, Anuradha
A1 - Gwinn-Giglio, Michelle
A1 - Han, Cliff S.
A1 - Khouri, Hoda
A1 - Kiss, Hajnalka
A1 - Kothari, Sagar P.
A1 - Madupu, Ramana
A1 - Nelson, Karen E.
A1 - Nelson, William C.
A1 - Paulsen, Ian
A1 - Penn, Kevin
A1 - Ren, Qinghu
A1 - Rosovitz, M. J.
A1 - Selengut, Jeremy D.
A1 - Shrivastava, Susmita
A1 - Sullivan, Steven A.
A1 - Tapia, Roxanne
A1 - Thompson, L. Sue
A1 - Watkins, Kisha L.
A1 - Yang, Qi
A1 - Yu, Chunhui
A1 - Zafar, Nikhat
A1 - Zhou, Liwei
A1 - Kuske, Cheryl R.
KW - Anti-Bacterial Agents
KW - bacteria
KW - Biological Transport
KW - Carbohydrate Metabolism
KW - Cyanobacteria
KW - DNA, Bacterial
KW - Fungi
KW - Genome, Bacterial
KW - Macrolides
KW - Molecular Sequence Data
KW - Nitrogen
KW - Phylogeny
KW - Proteobacteria
KW - Sequence Analysis, DNA
KW - Sequence Homology
KW - Soil Microbiology
AB - The complete genomes of three strains from the phylum Acidobacteria were compared. Phylogenetic analysis placed them as a unique phylum. They share genomic traits with members of the Proteobacteria, the Cyanobacteria, and the Fungi. The three strains appear to be versatile heterotrophs. Genomic and culture traits indicate the use of carbon sources that span simple sugars to more complex substrates such as hemicellulose, cellulose, and chitin. The genomes encode low-specificity major facilitator superfamily transporters and high-affinity ABC transporters for sugars, suggesting that they are best suited to low-nutrient conditions. They appear capable of nitrate and nitrite reduction but not N(2) fixation or denitrification. The genomes contained numerous genes that encode siderophore receptors, but no evidence of siderophore production was found, suggesting that they may obtain iron via interaction with other microorganisms. The presence of cellulose synthesis genes and a large class of novel high-molecular-weight excreted proteins suggests potential traits for desiccation resistance, biofilm formation, and/or contribution to soil structure. Polyketide synthase and macrolide glycosylation genes suggest the production of novel antimicrobial compounds. Genes that encode a variety of novel proteins were also identified. The abundance of acidobacteria in soils worldwide and the breadth of potential carbon use by the sequenced strains suggest significant and previously unrecognized contributions to the terrestrial carbon cycle. Combining our genomic evidence with available culture traits, we postulate that cells of these isolates are long-lived, divide slowly, exhibit slow metabolic rates under low-nutrient conditions, and are well equipped to tolerate fluctuations in soil hydration.
VL - 75
N1 - http://www.ncbi.nlm.nih.gov/pubmed/19201974?dopt=Abstract
ER -
TY - JOUR
T1 - Three genomes from the phylum Acidobacteria provide insight into the lifestyles of these microorganisms in soils.
JF - Appl Environ Microbiol
Y1 - 2009
A1 - Ward, Naomi L
A1 - Challacombe, Jean F
A1 - Janssen, Peter H
A1 - Henrissat, Bernard
A1 - Coutinho, Pedro M
A1 - Wu, Martin
A1 - Xie, Gary
A1 - Haft, Daniel H
A1 - Sait, Michelle
A1 - Badger, Jonathan
A1 - Barabote, Ravi D
A1 - Bradley, Brent
A1 - Brettin, Thomas S
A1 - Brinkac, Lauren M
A1 - Bruce, David
A1 - Creasy, Todd
A1 - Daugherty, Sean C
A1 - Davidsen, Tanja M
A1 - DeBoy, Robert T
A1 - Detter, J Chris
A1 - Dodson, Robert J
A1 - Durkin, A Scott
A1 - Ganapathy, Anuradha
A1 - Gwinn-Giglio, Michelle
A1 - Han, Cliff S
A1 - Khouri, Hoda
A1 - Kiss, Hajnalka
A1 - Kothari, Sagar P
A1 - Madupu, Ramana
A1 - Nelson, Karen E
A1 - Nelson, William C
A1 - Paulsen, Ian
A1 - Penn, Kevin
A1 - Ren, Qinghu
A1 - Rosovitz, M J
A1 - Selengut, Jeremy D
A1 - Shrivastava, Susmita
A1 - Sullivan, Steven A
A1 - Tapia, Roxanne
A1 - Thompson, L Sue
A1 - Watkins, Kisha L
A1 - Yang, Qi
A1 - Yu, Chunhui
A1 - Zafar, Nikhat
A1 - Zhou, Liwei
A1 - Kuske, Cheryl R
KW - Anti-Bacterial Agents
KW - bacteria
KW - Biological Transport
KW - Carbohydrate Metabolism
KW - Cyanobacteria
KW - DNA, Bacterial
KW - Fungi
KW - Genome, Bacterial
KW - Macrolides
KW - Molecular Sequence Data
KW - Nitrogen
KW - Phylogeny
KW - Proteobacteria
KW - Sequence Analysis, DNA
KW - Sequence Homology
KW - Soil Microbiology
AB - The complete genomes of three strains from the phylum Acidobacteria were compared. Phylogenetic analysis placed them as a unique phylum. They share genomic traits with members of the Proteobacteria, the Cyanobacteria, and the Fungi. The three strains appear to be versatile heterotrophs. Genomic and culture traits indicate the use of carbon sources that span simple sugars to more complex substrates such as hemicellulose, cellulose, and chitin. The genomes encode low-specificity major facilitator superfamily transporters and high-affinity ABC transporters for sugars, suggesting that they are best suited to low-nutrient conditions. They appear capable of nitrate and nitrite reduction but not N(2) fixation or denitrification. The genomes contained numerous genes that encode siderophore receptors, but no evidence of siderophore production was found, suggesting that they may obtain iron via interaction with other microorganisms. The presence of cellulose synthesis genes and a large class of novel high-molecular-weight excreted proteins suggests potential traits for desiccation resistance, biofilm formation, and/or contribution to soil structure. Polyketide synthase and macrolide glycosylation genes suggest the production of novel antimicrobial compounds. Genes that encode a variety of novel proteins were also identified. The abundance of acidobacteria in soils worldwide and the breadth of potential carbon use by the sequenced strains suggest significant and previously unrecognized contributions to the terrestrial carbon cycle. Combining our genomic evidence with available culture traits, we postulate that cells of these isolates are long-lived, divide slowly, exhibit slow metabolic rates under low-nutrient conditions, and are well equipped to tolerate fluctuations in soil hydration.
VL - 75
CP - 7
M3 - 10.1128/AEM.02294-08
ER -
TY - JOUR
T1 - A genealogical approach to quantifying lineage divergence
JF - Evolution
Y1 - 2008
A1 - Cummings, Michael P.
A1 - Neel, Maile C.
A1 - Shaw, Kerry L.
KW - Ancestral polymorphism
KW - congruence
KW - exclusivity
KW - genealogy
KW - lineage sorting
KW - monophyly
KW - paraphyly
KW - Phylogeny
KW - polyphyly
KW - speciation
KW - species
AB - We introduce a statistic, the genealogical sorting index (gsi), for quantifying the degree of exclusive ancestry of labeled groups on a rooted genealogy and demonstrate its application. The statistic is simple, intuitive, and easily calculated. It has a normalized range to facilitate comparisons among different groups, trees, or studies and it provides information on individual groups rather than a composite measure for all groups. It naturally handles polytomies and accommodates measures of uncertainty in phylogenetic relationships. We use coalescent simulations to explore the behavior of the gsi across a range of divergence times, with the mean value increasing to 1, the maximum value when exclusivity within a group reached monophyly. Simulations also demonstrate that the power to reject the null hypothesis of mixed genealogical ancestry increased markedly as sample size increased, and that the gsi provides a statistically more powerful measure of divergence than FST. Applications to data from published studies demonstrated that the gsi provides a useful way to detect significant exclusivity even when groups are not monophyletic. Although we describe this statistic in the context of divergence, it is more broadly applicable to quantify and assess the significance of clustering of observations in labeled groups on any tree.
VL - 62
SN - 1558-5646
ER -
TY - Generic
T1 - Uncovering Genomic Reassortments among Influenza Strains by Enumerating Maximal Bicliques
T2 - IEEE International Conference on Bioinformatics and Biomedicine, 2008. BIBM '08
Y1 - 2008
A1 - Nagarajan, N.
A1 - Kingsford, Carl
KW - avian hosted influenza genome
KW - Bioinformatics
KW - Capacitive sensors
KW - Delay
KW - diseases
KW - Event detection
KW - general bipartite graphs
KW - genomic reassortments
KW - Genomics
KW - graph theory
KW - high probability inconsistencies
KW - History
KW - human hosted influenza genome
KW - incompatibility graph
KW - Influenza
KW - influenza strain
KW - maximal biclique
KW - maximal biclique enumeration
KW - microorganisms
KW - phylogenetic trees
KW - Phylogeny
KW - Public healthcare
KW - quadratic delay algorithm
KW - reassortment
KW - reassortment event detection
KW - Tree graphs
KW - viral genome evolutionary history
KW - virulence
AB - The evolutionary histories of viral genomes have received significant recent attention due to their importance in understanding virulence and the corresponding ramifications to public health. We present a novel framework to detect reassortment events in influenza based on the comparison of two distributions of phylogenetic trees, rather than a pair of, possibly unreliable, consensus trees. We show how to detect all high-probability inconsistencies between two distributions of trees by enumerating maximal bicliques within a defined incompatibility graph. In the process, we give the first quadratic delay algorithm for enumerating maximal bicliques within general bipartite graphs. We demonstrate the utility of our approach by applying it to several sets of influenza genomes (both human- and avian-hosted) and successfully identify all known reassortment events and a few novel candidate reassortments. In addition, on simulated datasets, our approach correctly finds implanted reassortments and rarely detects reassortments where none were introduced.
JA - IEEE International Conference on Bioinformatics and Biomedicine, 2008. BIBM '08
PB - IEEE
SN - 978-0-7695-3452-7
ER -
TY - JOUR
T1 - Evolution of genes and genomes on the Drosophila phylogeny.
JF - Nature
Y1 - 2007
A1 - Clark, Andrew G
A1 - Eisen, Michael B
A1 - Smith, Douglas R
A1 - Bergman, Casey M
A1 - Oliver, Brian
A1 - Markow, Therese A
A1 - Kaufman, Thomas C
A1 - Kellis, Manolis
A1 - Gelbart, William
A1 - Iyer, Venky N
A1 - Pollard, Daniel A
A1 - Sackton, Timothy B
A1 - Larracuente, Amanda M
A1 - Singh, Nadia D
A1 - Abad, Jose P
A1 - Abt, Dawn N
A1 - Adryan, Boris
A1 - Aguade, Montserrat
A1 - Akashi, Hiroshi
A1 - Anderson, Wyatt W
A1 - Aquadro, Charles F
A1 - Ardell, David H
A1 - Arguello, Roman
A1 - Artieri, Carlo G
A1 - Barbash, Daniel A
A1 - Barker, Daniel
A1 - Barsanti, Paolo
A1 - Batterham, Phil
A1 - Batzoglou, Serafim
A1 - Begun, Dave
A1 - Bhutkar, Arjun
A1 - Blanco, Enrico
A1 - Bosak, Stephanie A
A1 - Bradley, Robert K
A1 - Brand, Adrianne D
A1 - Brent, Michael R
A1 - Brooks, Angela N
A1 - Brown, Randall H
A1 - Butlin, Roger K
A1 - Caggese, Corrado
A1 - Calvi, Brian R
A1 - Bernardo de Carvalho, A
A1 - Caspi, Anat
A1 - Castrezana, Sergio
A1 - Celniker, Susan E
A1 - Chang, Jean L
A1 - Chapple, Charles
A1 - Chatterji, Sourav
A1 - Chinwalla, Asif
A1 - Civetta, Alberto
A1 - Clifton, Sandra W
A1 - Comeron, Josep M
A1 - Costello, James C
A1 - Coyne, Jerry A
A1 - Daub, Jennifer
A1 - David, Robert G
A1 - Delcher, Arthur L
A1 - Delehaunty, Kim
A1 - Do, Chuong B
A1 - Ebling, Heather
A1 - Edwards, Kevin
A1 - Eickbush, Thomas
A1 - Evans, Jay D
A1 - Filipski, Alan
A1 - Findeiss, Sven
A1 - Freyhult, Eva
A1 - Fulton, Lucinda
A1 - Fulton, Robert
A1 - Garcia, Ana C L
A1 - Gardiner, Anastasia
A1 - Garfield, David A
A1 - Garvin, Barry E
A1 - Gibson, Greg
A1 - Gilbert, Don
A1 - Gnerre, Sante
A1 - Godfrey, Jennifer
A1 - Good, Robert
A1 - Gotea, Valer
A1 - Gravely, Brenton
A1 - Greenberg, Anthony J
A1 - Griffiths-Jones, Sam
A1 - Gross, Samuel
A1 - Guigo, Roderic
A1 - Gustafson, Erik A
A1 - Haerty, Wilfried
A1 - Hahn, Matthew W
A1 - Halligan, Daniel L
A1 - Halpern, Aaron L
A1 - Halter, Gillian M
A1 - Han, Mira V
A1 - Heger, Andreas
A1 - Hillier, LaDeana
A1 - Hinrichs, Angie S
A1 - Holmes, Ian
A1 - Hoskins, Roger A
A1 - Hubisz, Melissa J
A1 - Hultmark, Dan
A1 - Huntley, Melanie A
A1 - Jaffe, David B
A1 - Jagadeeshan, Santosh
A1 - Jeck, William R
A1 - Johnson, Justin
A1 - Jones, Corbin D
A1 - Jordan, William C
A1 - Karpen, Gary H
A1 - Kataoka, Eiko
A1 - Keightley, Peter D
A1 - Kheradpour, Pouya
A1 - Kirkness, Ewen F
A1 - Koerich, Leonardo B
A1 - Kristiansen, Karsten
A1 - Kudrna, Dave
A1 - Kulathinal, Rob J
A1 - Kumar, Sudhir
A1 - Kwok, Roberta
A1 - Lander, Eric
A1 - Langley, Charles H
A1 - Lapoint, Richard
A1 - Lazzaro, Brian P
A1 - Lee, So-Jeong
A1 - Levesque, Lisa
A1 - Li, Ruiqiang
A1 - Lin, Chiao-Feng
A1 - Lin, Michael F
A1 - Lindblad-Toh, Kerstin
A1 - Llopart, Ana
A1 - Long, Manyuan
A1 - Low, Lloyd
A1 - Lozovsky, Elena
A1 - Lu, Jian
A1 - Luo, Meizhong
A1 - Machado, Carlos A
A1 - Makalowski, Wojciech
A1 - Marzo, Mar
A1 - Matsuda, Muneo
A1 - Matzkin, Luciano
A1 - McAllister, Bryant
A1 - McBride, Carolyn S
A1 - McKernan, Brendan
A1 - McKernan, Kevin
A1 - Mendez-Lago, Maria
A1 - Minx, Patrick
A1 - Mollenhauer, Michael U
A1 - Montooth, Kristi
A1 - Mount, Stephen M
A1 - Mu, Xu
A1 - Myers, Eugene
A1 - Negre, Barbara
A1 - Newfeld, Stuart
A1 - Nielsen, Rasmus
A1 - Noor, Mohamed A F
A1 - O'Grady, Patrick
A1 - Pachter, Lior
A1 - Papaceit, Montserrat
A1 - Parisi, Matthew J
A1 - Parisi, Michael
A1 - Parts, Leopold
A1 - Pedersen, Jakob S
A1 - Pesole, Graziano
A1 - Phillippy, Adam M
A1 - Ponting, Chris P
A1 - Pop, Mihai
A1 - Porcelli, Damiano
A1 - Powell, Jeffrey R
A1 - Prohaska, Sonja
A1 - Pruitt, Kim
A1 - Puig, Marta
A1 - Quesneville, Hadi
A1 - Ram, Kristipati Ravi
A1 - Rand, David
A1 - Rasmussen, Matthew D
A1 - Reed, Laura K
A1 - Reenan, Robert
A1 - Reily, Amy
A1 - Remington, Karin A
A1 - Rieger, Tania T
A1 - Ritchie, Michael G
A1 - Robin, Charles
A1 - Rogers, Yu-Hui
A1 - Rohde, Claudia
A1 - Rozas, Julio
A1 - Rubenfield, Marc J
A1 - Ruiz, Alfredo
A1 - Russo, Susan
A1 - Salzberg, Steven L
A1 - Sanchez-Gracia, Alejandro
A1 - Saranga, David J
A1 - Sato, Hajime
A1 - Schaeffer, Stephen W
A1 - Schatz, Michael C
A1 - Schlenke, Todd
A1 - Schwartz, Russell
A1 - Segarra, Carmen
A1 - Singh, Rama S
A1 - Sirot, Laura
A1 - Sirota, Marina
A1 - Sisneros, Nicholas B
A1 - Smith, Chris D
A1 - Smith, Temple F
A1 - Spieth, John
A1 - Stage, Deborah E
A1 - Stark, Alexander
A1 - Stephan, Wolfgang
A1 - Strausberg, Robert L
A1 - Strempel, Sebastian
A1 - Sturgill, David
A1 - Sutton, Granger
A1 - Sutton, Granger G
A1 - Tao, Wei
A1 - Teichmann, Sarah
A1 - Tobari, Yoshiko N
A1 - Tomimura, Yoshihiko
A1 - Tsolas, Jason M
A1 - Valente, Vera L S
A1 - Venter, Eli
A1 - Venter, J Craig
A1 - Vicario, Saverio
A1 - Vieira, Filipe G
A1 - Vilella, Albert J
A1 - Villasante, Alfredo
A1 - Walenz, Brian
A1 - Wang, Jun
A1 - Wasserman, Marvin
A1 - Watts, Thomas
A1 - Wilson, Derek
A1 - Wilson, Richard K
A1 - Wing, Rod A
A1 - Wolfner, Mariana F
A1 - Wong, Alex
A1 - Wong, Gane Ka-Shu
A1 - Wu, Chung-I
A1 - Wu, Gabriel
A1 - Yamamoto, Daisuke
A1 - Yang, Hsiao-Pei
A1 - Yang, Shiaw-Pyng
A1 - Yorke, James A
A1 - Yoshida, Kiyohito
A1 - Zdobnov, Evgeny
A1 - Zhang, Peili
A1 - Zhang, Yu
A1 - Zimin, Aleksey V
A1 - Baldwin, Jennifer
A1 - Abdouelleil, Amr
A1 - Abdulkadir, Jamal
A1 - Abebe, Adal
A1 - Abera, Brikti
A1 - Abreu, Justin
A1 - Acer, St Christophe
A1 - Aftuck, Lynne
A1 - Alexander, Allen
A1 - An, Peter
A1 - Anderson, Erica
A1 - Anderson, Scott
A1 - Arachi, Harindra
A1 - Azer, Marc
A1 - Bachantsang, Pasang
A1 - Barry, Andrew
A1 - Bayul, Tashi
A1 - Berlin, Aaron
A1 - Bessette, Daniel
A1 - Bloom, Toby
A1 - Blye, Jason
A1 - Boguslavskiy, Leonid
A1 - Bonnet, Claude
A1 - Boukhgalter, Boris
A1 - Bourzgui, Imane
A1 - Brown, Adam
A1 - Cahill, Patrick
A1 - Channer, Sheridon
A1 - Cheshatsang, Yama
A1 - Chuda, Lisa
A1 - Citroen, Mieke
A1 - Collymore, Alville
A1 - Cooke, Patrick
A1 - Costello, Maura
A1 - D'Aco, Katie
A1 - Daza, Riza
A1 - De Haan, Georgius
A1 - DeGray, Stuart
A1 - DeMaso, Christina
A1 - Dhargay, Norbu
A1 - Dooley, Kimberly
A1 - Dooley, Erin
A1 - Doricent, Missole
A1 - Dorje, Passang
A1 - Dorjee, Kunsang
A1 - Dupes, Alan
A1 - Elong, Richard
A1 - Falk, Jill
A1 - Farina, Abderrahim
A1 - Faro, Susan
A1 - Ferguson, Diallo
A1 - Fisher, Sheila
A1 - Foley, Chelsea D
A1 - Franke, Alicia
A1 - Friedrich, Dennis
A1 - Gadbois, Loryn
A1 - Gearin, Gary
A1 - Gearin, Christina R
A1 - Giannoukos, Georgia
A1 - Goode, Tina
A1 - Graham, Joseph
A1 - Grandbois, Edward
A1 - Grewal, Sharleen
A1 - Gyaltsen, Kunsang
A1 - Hafez, Nabil
A1 - Hagos, Birhane
A1 - Hall, Jennifer
A1 - Henson, Charlotte
A1 - Hollinger, Andrew
A1 - Honan, Tracey
A1 - Huard, Monika D
A1 - Hughes, Leanne
A1 - Hurhula, Brian
A1 - Husby, M Erii
A1 - Kamat, Asha
A1 - Kanga, Ben
A1 - Kashin, Seva
A1 - Khazanovich, Dmitry
A1 - Kisner, Peter
A1 - Lance, Krista
A1 - Lara, Marcia
A1 - Lee, William
A1 - Lennon, Niall
A1 - Letendre, Frances
A1 - LeVine, Rosie
A1 - Lipovsky, Alex
A1 - Liu, Xiaohong
A1 - Liu, Jinlei
A1 - Liu, Shangtao
A1 - Lokyitsang, Tashi
A1 - Lokyitsang, Yeshi
A1 - Lubonja, Rakela
A1 - Lui, Annie
A1 - MacDonald, Pen
A1 - Magnisalis, Vasilia
A1 - Maru, Kebede
A1 - Matthews, Charles
A1 - McCusker, William
A1 - McDonough, Susan
A1 - Mehta, Teena
A1 - Meldrim, James
A1 - Meneus, Louis
A1 - Mihai, Oana
A1 - Mihalev, Atanas
A1 - Mihova, Tanya
A1 - Mittelman, Rachel
A1 - Mlenga, Valentine
A1 - Montmayeur, Anna
A1 - Mulrain, Leonidas
A1 - Navidi, Adam
A1 - Naylor, Jerome
A1 - Negash, Tamrat
A1 - Nguyen, Thu
A1 - Nguyen, Nga
A1 - Nicol, Robert
A1 - Norbu, Choe
A1 - Norbu, Nyima
A1 - Novod, Nathaniel
A1 - O'Neill, Barry
A1 - Osman, Sahal
A1 - Markiewicz, Eva
A1 - Oyono, Otero L
A1 - Patti, Christopher
A1 - Phunkhang, Pema
A1 - Pierre, Fritz
A1 - Priest, Margaret
A1 - Raghuraman, Sujaa
A1 - Rege, Filip
A1 - Reyes, Rebecca
A1 - Rise, Cecil
A1 - Rogov, Peter
A1 - Ross, Keenan
A1 - Ryan, Elizabeth
A1 - Settipalli, Sampath
A1 - Shea, Terry
A1 - Sherpa, Ngawang
A1 - Shi, Lu
A1 - Shih, Diana
A1 - Sparrow, Todd
A1 - Spaulding, Jessica
A1 - Stalker, John
A1 - Stange-Thomann, Nicole
A1 - Stavropoulos, Sharon
A1 - Stone, Catherine
A1 - Strader, Christopher
A1 - Tesfaye, Senait
A1 - Thomson, Talene
A1 - Thoulutsang, Yama
A1 - Thoulutsang, Dawa
A1 - Topham, Kerri
A1 - Topping, Ira
A1 - Tsamla, Tsamla
A1 - Vassiliev, Helen
A1 - Vo, Andy
A1 - Wangchuk, Tsering
A1 - Wangdi, Tsering
A1 - Weiand, Michael
A1 - Wilkinson, Jane
A1 - Wilson, Adam
A1 - Yadav, Shailendra
A1 - Young, Geneva
A1 - Yu, Qing
A1 - Zembek, Lisa
A1 - Zhong, Danni
A1 - Zimmer, Andrew
A1 - Zwirko, Zac
A1 - Jaffe, David B
A1 - Alvarez, Pablo
A1 - Brockman, Will
A1 - Butler, Jonathan
A1 - Chin, CheeWhye
A1 - Gnerre, Sante
A1 - Grabherr, Manfred
A1 - Kleber, Michael
A1 - Mauceli, Evan
A1 - MacCallum, Iain
KW - Animals
KW - Codon
KW - DNA Transposable Elements
KW - Drosophila
KW - Drosophila Proteins
KW - Evolution, Molecular
KW - Gene Order
KW - Genes, Insect
KW - Genome, Insect
KW - Genome, Mitochondrial
KW - Genomics
KW - Immunity
KW - Multigene Family
KW - Phylogeny
KW - Reproduction
KW - RNA, Untranslated
KW - sequence alignment
KW - Sequence Analysis, DNA
KW - Synteny
AB - Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the first time (sechellia, simulans, yakuba, erecta, ananassae, persimilis, willistoni, mojavensis, virilis and grimshawi), illustrate how rates and patterns of sequence divergence across taxa can illuminate evolutionary processes on a genomic scale. These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution. Despite remarkable similarities among these Drosophila species, we identified many putatively non-neutral changes in protein-coding genes, non-coding RNA genes, and cis-regulatory regions. These may prove to underlie differences in the ecology and behaviour of these diverse species.
VL - 450
CP - 7167
M3 - 10.1038/nature06341
ER -
TY - JOUR
T1 - Spliceosomal small nuclear RNA genes in 11 insect genomes.
JF - RNA
Y1 - 2007
A1 - Mount, Stephen M
A1 - Gotea, Valer
A1 - Lin, Chiao-Feng
A1 - Hernandez, Kristina
A1 - Makalowski, Wojciech
KW - Animals
KW - Base Sequence
KW - Bees
KW - Computational Biology
KW - Diptera
KW - Evolution, Molecular
KW - Genes, Insect
KW - Genome, Insect
KW - Molecular Sequence Data
KW - Nucleic Acid Conformation
KW - Phylogeny
KW - Promoter Regions, Genetic
KW - RNA Splicing
KW - RNA, Small Nuclear
KW - Sequence Analysis, RNA
KW - Spliceosomes
AB - The removal of introns from the primary transcripts of protein-coding genes is accomplished by the spliceosome, a large macromolecular complex of which small nuclear RNAs (snRNAs) are crucial components. Following the recent sequencing of the honeybee (Apis mellifera) genome, we used various computational methods, ranging from sequence similarity search to RNA secondary structure prediction, to search for putative snRNA genes (including their promoters) and to examine their pattern of conservation among 11 available insect genomes (A. mellifera, Tribolium castaneum, Bombyx mori, Anopheles gambiae, Aedes aegypti, and six Drosophila species). We identified candidates for all nine spliceosomal snRNA genes in all the analyzed genomes. All the species contain a similar number of snRNA genes, with the exception of A. aegypti, whose genome contains more U1, U2, and U5 genes, and A. mellifera, whose genome contains fewer U2 and U5 genes. We found that snRNA genes are generally more closely related to homologs within the same genus than to those in other genera. Promoter regions for all spliceosomal snRNA genes within each insect species share similar sequence motifs that are likely to correspond to the PSEA (proximal sequence element A), the binding site for snRNA activating protein complex, but these promoter elements vary in sequence among the five insect families surveyed here. In contrast to the other insect species investigated, Dipteran genomes are characterized by a rapid evolution (or loss) of components of the U12 spliceosome and a striking loss of U12-type introns.
VL - 13
CP - 1
M3 - 10.1261/rna.259207
ER -
TY - JOUR
T1 - TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes
JF - Nucleic Acids Research
Y1 - 2007
A1 - Selengut, Jeremy D.
A1 - Haft, Daniel H.
A1 - Davidsen, Tanja
A1 - Ganapathy, Anuradha
A1 - Gwinn-Giglio, Michelle
A1 - Nelson, William C.
A1 - Richter, R. Alexander
A1 - White, Owen
KW - Archaeal Proteins
KW - Bacterial Proteins
KW - Databases, Protein
KW - Genome, Bacterial
KW - Genomics
KW - Internet
KW - Phylogeny
KW - software
KW - User-Computer Interface
AB - TIGRFAMs is a collection of protein family definitions built to aid in high-throughput annotation of specific protein functions. Each family is based on a hidden Markov model (HMM), where both cutoff scores and membership in the seed alignment are chosen so that the HMMs can classify numerous proteins according to their specific molecular functions. Most TIGRFAMs models describe 'equivalog' families, where both orthology and lateral gene transfer may be part of the evolutionary history, but where a single molecular function has been conserved. The Genome Properties system contains a queriable set of metabolic reconstructions, genome metrics and extractions of information from the scientific literature. Its genome-by-genome assertions of whether or not specific structures, pathways or systems are present provide high-level conceptual descriptions of genomic content. These assertions enable comparative genomics, provide a meaningful biological context to aid in manual annotation, support assignments of Gene Ontology (GO) biological process terms and help validate HMM-based predictions of protein function. The Genome Properties system is particularly useful as a generator of phylogenetic profiles, through which new protein family functions may be discovered. The TIGRFAMs and Genome Properties systems can be accessed at http://www.tigr.org/TIGRFAMs and http://www.tigr.org/Genome_Properties.
VL - 35
N1 - http://www.ncbi.nlm.nih.gov/pubmed/17151080?dopt=Abstract
ER -
TY - JOUR
T1 - Comparative genomics of emerging human ehrlichiosis agents
JF - PLoS Genetics
Y1 - 2006
A1 - Dunning Hotopp, Julie C.
A1 - Lin, Mingqun
A1 - Madupu, Ramana
A1 - Crabtree, Jonathan
A1 - Angiuoli, Samuel V.
A1 - Eisen, Jonathan A.
A1 - Seshadri, Rekha
A1 - Ren, Qinghu
A1 - Wu, Martin
A1 - Utterback, Teresa R.
A1 - Smith, Shannon
A1 - Lewis, Matthew
A1 - Khouri, Hoda
A1 - Zhang, Chunbin
A1 - Niu, Hua
A1 - Lin, Quan
A1 - Ohashi, Norio
A1 - Zhi, Ning
A1 - Nelson, William
A1 - Brinkac, Lauren M.
A1 - Dodson, Robert J.
A1 - Rosovitz, M. J.
A1 - Sundaram, Jaideep
A1 - Daugherty, Sean C.
A1 - Davidsen, Tanja
A1 - Durkin, Anthony S.
A1 - Gwinn, Michelle
A1 - Haft, Daniel H.
A1 - Selengut, Jeremy D.
A1 - Sullivan, Steven A.
A1 - Zafar, Nikhat
A1 - Zhou, Liwei
A1 - Benahmed, Faiza
A1 - Forberger, Heather
A1 - Halpin, Rebecca
A1 - Mulligan, Stephanie
A1 - Robinson, Jeffrey
A1 - White, Owen
A1 - Rikihisa, Yasuko
A1 - Tettelin, Hervé
KW - Animals
KW - Biotin
KW - DNA Repair
KW - Ehrlichia
KW - Ehrlichiosis
KW - Genome
KW - Genomics
KW - Humans
KW - Models, Biological
KW - Phylogeny
KW - Rickettsia
KW - Ticks
AB - Anaplasma (formerly Ehrlichia) phagocytophilum, Ehrlichia chaffeensis, and Neorickettsia (formerly Ehrlichia) sennetsu are intracellular vector-borne pathogens that cause human ehrlichiosis, an emerging infectious disease. We present the complete genome sequences of these organisms along with comparisons to other organisms in the Rickettsiales order. Ehrlichia spp. and Anaplasma spp. display a unique large expansion of immunodominant outer membrane proteins facilitating antigenic variation. All Rickettsiales have a diminished ability to synthesize amino acids compared to their closest free-living relatives. Unlike members of the Rickettsiaceae family, these pathogenic Anaplasmataceae are capable of making all major vitamins, cofactors, and nucleotides, which could confer a beneficial role in the invertebrate vector or the vertebrate host. Further analysis identified proteins potentially involved in vacuole confinement of the Anaplasmataceae, a life cycle involving a hematophagous vector, vertebrate pathogenesis, human pathogenesis, and lack of transovarial transmission. These discoveries provide significant insights into the biology of these obligate intracellular pathogens.
VL - 2
N1 - http://www.ncbi.nlm.nih.gov/pubmed/16482227?dopt=Abstract
ER -
TY - JOUR
T1 - Exopolysaccharide-associated protein sorting in environmental organisms: the PEP-CTERM/EpsH system. Application of a novel phylogenetic profiling heuristic
JF - BMC Biology
Y1 - 2006
A1 - Haft, Daniel H.
A1 - Paulsen, Ian T.
A1 - Ward, Naomi
A1 - Selengut, Jeremy D.
KW - Amino Acid Motifs
KW - Amino Acid Sequence
KW - bacteria
KW - Bacterial Proteins
KW - Biofilms
KW - Genome, Bacterial
KW - Markov chains
KW - Molecular Sequence Data
KW - Phylogeny
KW - Polysaccharides, Bacterial
KW - Protein Sorting Signals
KW - Protein Transport
KW - Seawater
KW - sequence alignment
KW - Soil Microbiology
AB - BACKGROUND: Protein translocation to the proper cellular destination may be guided by various classes of sorting signals recognizable in the primary sequence. Detection in some genomes, but not others, may reveal sorting system components by comparison of the phylogenetic profile of the class of sorting signal to that of various protein families. RESULTS: We describe a short C-terminal homology domain, sporadically distributed in bacteria, with several key characteristics of protein sorting signals. The domain includes a near-invariant motif Pro-Glu-Pro (PEP). This possible recognition or processing site is followed by a predicted transmembrane helix and a cluster rich in basic amino acids. We designate this domain PEP-CTERM. It tends to occur multiple times in a genome if it occurs at all, with a median count of eight instances; Verrucomicrobium spinosum has sixty-five. PEP-CTERM-containing proteins generally contain an N-terminal signal peptide and exhibit high diversity and little homology to known proteins. All bacteria with PEP-CTERM have both an outer membrane and exopolysaccharide (EPS) production genes. By a simple heuristic for screening phylogenetic profiles in the absence of pre-formed protein families, we discovered that a homolog of the membrane protein EpsH (exopolysaccharide locus protein H) occurs in a species when PEP-CTERM domains are found. The EpsH family contains invariant residues consistent with a transpeptidase function. Most PEP-CTERM proteins are encoded by single-gene operons preceded by large intergenic regions. In the Proteobacteria, most of these upstream regions share a DNA sequence, a probable cis-regulatory site that contains a sigma-54 binding motif. The phylogenetic profile for this DNA sequence exactly matches that of three proteins: a sigma-54-interacting response regulator (PrsR), a transmembrane histidine kinase (PrsK), and a TPR protein (PrsT). 
CONCLUSION: These findings are consistent with the hypothesis that PEP-CTERM and EpsH form a protein export sorting system, analogous to the LPXTG/sortase system of Gram-positive bacteria, and correlated to EPS expression. It occurs preferentially in bacteria from sediments, soils, and biofilms. The novel method that led to these findings, partial phylogenetic profiling, requires neither global sequence clustering nor arbitrary similarity cutoffs and appears to be a rapid, effective alternative to other profiling methods.
VL - 4
N1 - http://www.ncbi.nlm.nih.gov/pubmed/16930487?dopt=Abstract
ER -
TY - JOUR
T1 - Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial "pan-genome"
JF - Proceedings of the National Academy of Sciences of the United States of America
Y1 - 2005
A1 - Tettelin, Hervé
A1 - Masignani, Vega
A1 - Cieslewicz, Michael J.
A1 - Donati, Claudio
A1 - Medini, Duccio
A1 - Ward, Naomi L.
A1 - Angiuoli, Samuel V.
A1 - Crabtree, Jonathan
A1 - Jones, Amanda L.
A1 - Durkin, A. Scott
A1 - DeBoy, Robert T.
A1 - Davidsen, Tanja M.
A1 - Mora, Marirosa
A1 - Scarselli, Maria
A1 - Margarit y Ros, Immaculada
A1 - Peterson, Jeremy D.
A1 - Hauser, Christopher R.
A1 - Sundaram, Jaideep P.
A1 - Nelson, William C.
A1 - Madupu, Ramana
A1 - Brinkac, Lauren M.
A1 - Dodson, Robert J.
A1 - Rosovitz, Mary J.
A1 - Sullivan, Steven A.
A1 - Daugherty, Sean C.
A1 - Haft, Daniel H.
A1 - Selengut, Jeremy D.
A1 - Gwinn, Michelle L.
A1 - Zhou, Liwei
A1 - Zafar, Nikhat
A1 - Khouri, Hoda
A1 - Radune, Diana
A1 - Dimitrov, George
A1 - Watkins, Kisha
A1 - O'Connor, Kevin J. B.
A1 - Smith, Shannon
A1 - Utterback, Teresa R.
A1 - White, Owen
A1 - Rubens, Craig E.
A1 - Grandi, Guido
A1 - Madoff, Lawrence C.
A1 - Kasper, Dennis L.
A1 - Telford, John L.
A1 - Wessels, Michael R.
A1 - Rappuoli, Rino
A1 - Fraser, Claire M.
KW - Amino Acid Sequence
KW - Bacterial Capsules
KW - Base Sequence
KW - Gene expression
KW - Genes, Bacterial
KW - Genetic Variation
KW - Genome, Bacterial
KW - Molecular Sequence Data
KW - Phylogeny
KW - sequence alignment
KW - Sequence Analysis, DNA
KW - Streptococcus agalactiae
KW - virulence
AB - The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and also limits genome-wide screens for vaccine candidates or for antimicrobial targets. We have generated the genomic sequence of six strains representing the five major disease-causing serotypes of Streptococcus agalactiae, the main cause of neonatal infection in humans. Analysis of these genomes and those available in databases showed that the S. agalactiae species can be described by a pan-genome consisting of a core genome shared by all isolates, accounting for approximately 80% of any single genome, plus a dispensable genome consisting of partially shared and strain-specific genes. Mathematical extrapolation of the data suggests that the gene reservoir available for inclusion in the S. agalactiae pan-genome is vast and that unique genes will continue to be identified even after sequencing hundreds of genomes.
VL - 102
N1 - http://www.ncbi.nlm.nih.gov/pubmed/16172379?dopt=Abstract
ER -
TY - JOUR
T1 - A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes
JF - PLOS Computational Biology
Y1 - 2005
A1 - Haft, Daniel H.
A1 - Selengut, J.
A1 - Mongodin, Emmanuel F.
A1 - Nelson, Karen E.
KW - Genes, Archaeal
KW - Genes, Bacterial
KW - Genes, Fungal
KW - Genome
KW - Genome, Bacterial
KW - Haloarcula marismortui
KW - Markov chains
KW - Multigene Family
KW - Oligonucleotide Array Sequence Analysis
KW - Phylogeny
KW - Prokaryotic Cells
KW - Proteins
KW - Repetitive Sequences, Nucleic Acid
KW - Yersinia pestis
AB - Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. Repeats of 21-37 bp typically show weak dyad symmetry and are separated by regularly sized, nonrepetitive spacer sequences. Four CRISPR-associated (Cas) protein families, designated Cas1 to Cas4, are strictly associated with CRISPR elements and always occur near a repeat cluster. Some spacers originate from mobile genetic elements and are thought to confer "immunity" against the elements that harbor these sequences. In the present study, we have systematically investigated uncharacterized proteins encoded in the vicinity of these CRISPRs and found many additional protein families that are strictly associated with CRISPR loci across multiple prokaryotic species. Multiple sequence alignments and hidden Markov models have been built for 45 Cas protein families. These models identify family members with high sensitivity and selectivity and classify key regulators of development, DevR and DevS, in Myxococcus xanthus as Cas proteins. These identifications show that CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a repeat cluster or filling the region between two repeat clusters. Distinctive subsets of the collection of Cas proteins recur in phylogenetically distant species and correlate with characteristic repeat periodicity. The analyses presented here support initial proposals of mobility of these units, along with the likelihood that loci of different subtypes interact with one another as well as with host cell defensive, replicative, and regulatory systems. It is evident from this analysis that CRISPR/cas loci are larger, more complex, and more heterogeneous than previously appreciated.
VL - 1
N1 - http://www.ncbi.nlm.nih.gov/pubmed/16292354?dopt=Abstract
ER -
TY - JOUR
T1 - Genome sequence of Silicibacter pomeroyi reveals adaptations to the marine environment
JF - Nature
Y1 - 2004
A1 - Moran, Mary Ann
A1 - Buchan, Alison
A1 - González, José M.
A1 - Heidelberg, John F.
A1 - Whitman, William B.
A1 - Kiene, Ronald P.
A1 - Henriksen, James R.
A1 - King, Gary M.
A1 - Belas, Robert
A1 - Fuqua, Clay
A1 - Brinkac, Lauren
A1 - Lewis, Matt
A1 - Johri, Shivani
A1 - Weaver, Bruce
A1 - Pai, Grace
A1 - Eisen, Jonathan A.
A1 - Rahe, Elisha
A1 - Sheldon, Wade M.
A1 - Ye, Wenying
A1 - Miller, Todd R.
A1 - Carlton, Jane
A1 - Rasko, David A.
A1 - Paulsen, Ian T.
A1 - Ren, Qinghu
A1 - Daugherty, Sean C.
A1 - DeBoy, Robert T.
A1 - Dodson, Robert J.
A1 - Durkin, A. Scott
A1 - Madupu, Ramana
A1 - Nelson, William C.
A1 - Sullivan, Steven A.
A1 - Rosovitz, M. J.
A1 - Haft, Daniel H.
A1 - Selengut, J.
A1 - Ward, Naomi
KW - Adaptation, Physiological
KW - Carrier Proteins
KW - Genes, Bacterial
KW - Genome, Bacterial
KW - marine biology
KW - Molecular Sequence Data
KW - Oceans and Seas
KW - Phylogeny
KW - plankton
KW - RNA, Ribosomal, 16S
KW - Roseobacter
KW - Seawater
AB - Since the recognition of prokaryotes as essential components of the oceanic food web, bacterioplankton have been acknowledged as catalysts of most major biogeochemical processes in the sea. Studying heterotrophic bacterioplankton has been challenging, however, as most major clades have never been cultured or have only been grown to low densities in sea water. Here we describe the genome sequence of Silicibacter pomeroyi, a member of the marine Roseobacter clade (Fig. 1), the relatives of which comprise approximately 10-20% of coastal and oceanic mixed-layer bacterioplankton. This first genome sequence from any major heterotrophic clade consists of a chromosome (4,109,442 base pairs) and megaplasmid (491,611 base pairs). Genome analysis indicates that this organism relies upon a lithoheterotrophic strategy that uses inorganic compounds (carbon monoxide and sulphide) to supplement heterotrophy. Silicibacter pomeroyi also has genes advantageous for associations with plankton and suspended particles, including genes for uptake of algal-derived compounds, use of metabolites from reducing microzones, rapid growth and cell-density-dependent regulation. This bacterium has a physiology distinct from that of marine oligotrophs, adding a new strategy to the recognized repertoire for coping with a nutrient-poor ocean.
VL - 432
N1 - http://www.ncbi.nlm.nih.gov/pubmed/15602564?dopt=Abstract
ER -
TY - JOUR
T1 - Genome of Geobacter sulfurreducens: metal reduction in subsurface environments
JF - Science (New York, N.Y.)
Y1 - 2003
A1 - Methé, B. A.
A1 - Nelson, K. E.
A1 - Eisen, J. A.
A1 - Paulsen, I. T.
A1 - Nelson, W.
A1 - Heidelberg, J. F.
A1 - Wu, D.
A1 - Wu, M.
A1 - Ward, N.
A1 - Beanan, M. J.
A1 - Dodson, R. J.
A1 - Madupu, R.
A1 - Brinkac, L. M.
A1 - Daugherty, S. C.
A1 - DeBoy, R. T.
A1 - Durkin, A. S.
A1 - Gwinn, M.
A1 - Kolonay, J. F.
A1 - Sullivan, S. A.
A1 - Haft, D. H.
A1 - Selengut, J.
A1 - Davidsen, T. M.
A1 - Zafar, N.
A1 - White, O.
A1 - Tran, B.
A1 - Romero, C.
A1 - Forberger, H. A.
A1 - Weidman, J.
A1 - Khouri, H.
A1 - Feldblyum, T. V.
A1 - Utterback, T. R.
A1 - Van Aken, S. E.
A1 - Lovley, D. R.
A1 - Fraser, C. M.
KW - Acetates
KW - Acetyl Coenzyme A
KW - Aerobiosis
KW - Anaerobiosis
KW - Bacterial Proteins
KW - Carbon
KW - Chemotaxis
KW - Chromosomes, Bacterial
KW - Cytochromes c
KW - Electron Transport
KW - Energy Metabolism
KW - Genes, Bacterial
KW - Genes, Regulator
KW - Genome, Bacterial
KW - Geobacter
KW - Hydrogen
KW - Metals
KW - Movement
KW - Open Reading Frames
KW - Oxidation-Reduction
KW - Phylogeny
AB - The complete genome sequence of Geobacter sulfurreducens, a delta-proteobacterium, reveals unsuspected capabilities, including evidence of aerobic metabolism, one-carbon and complex carbon metabolism, motility, and chemotactic behavior. These characteristics, coupled with the possession of many two-component sensors and many c-type cytochromes, reveal an ability to create alternative, redundant, electron transport networks and offer insights into the process of metal ion reduction in subsurface environments. As well as playing roles in the global cycling of metals and carbon, this organism clearly has the potential for use in bioremediation of radioactive metals and in the generation of electricity.
VL - 302
N1 - http://www.ncbi.nlm.nih.gov/pubmed/14671304?dopt=Abstract
ER -
TY - JOUR
T1 - The TIGRFAMs database of protein families
JF - Nucleic Acids Research
Y1 - 2003
A1 - Haft, Daniel H.
A1 - Selengut, J.
A1 - White, Owen
KW - Animals
KW - Databases, Protein
KW - Markov chains
KW - Mixed Function Oxygenases
KW - Phylogeny
KW - Proteins
KW - Pyruvate Carboxylase
KW - Sequence Homology, Amino Acid
AB - TIGRFAMs is a collection of manually curated protein families consisting of hidden Markov models (HMMs), multiple sequence alignments, commentary, Gene Ontology (GO) assignments, literature references and pointers to related TIGRFAMs, Pfam and InterPro models. These models are designed to support both automated and manually curated annotation of genomes. TIGRFAMs contains models of full-length proteins and shorter regions at the levels of superfamilies, subfamilies and equivalogs, where equivalogs are sets of homologous proteins conserved with respect to function since their last common ancestor. The scope of each model is set by raising or lowering cutoff scores and choosing members of the seed alignment to group proteins sharing specific function (equivalog) or more general properties. The overall goal is to provide information with maximum utility for the annotation process. TIGRFAMs is thus complementary to Pfam, whose models typically achieve broad coverage across distant homologs but end at the boundaries of conserved structural domains. The database currently contains over 1600 protein families. TIGRFAMs is available for searching or downloading at www.tigr.org/TIGRFAMs.
VL - 31
N1 - http://www.ncbi.nlm.nih.gov/pubmed/12520025?dopt=Abstract
ER -
TY - JOUR
T1 - The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins.
JF - Science
Y1 - 2002
A1 - Dehal, Paramvir
A1 - Satou, Yutaka
A1 - Campbell, Robert K
A1 - Chapman, Jarrod
A1 - Degnan, Bernard
A1 - De Tomaso, Anthony
A1 - Davidson, Brad
A1 - Di Gregorio, Anna
A1 - Gelpke, Maarten
A1 - Goodstein, David M
A1 - Harafuji, Naoe
A1 - Hastings, Kenneth E M
A1 - Ho, Isaac
A1 - Hotta, Kohji
A1 - Huang, Wayne
A1 - Kawashima, Takeshi
A1 - Lemaire, Patrick
A1 - Martinez, Diego
A1 - Meinertzhagen, Ian A
A1 - Necula, Simona
A1 - Nonaka, Masaru
A1 - Putnam, Nik
A1 - Rash, Sam
A1 - Saiga, Hidetoshi
A1 - Satake, Masanobu
A1 - Terry, Astrid
A1 - Yamada, Lixy
A1 - Wang, Hong-Gang
A1 - Awazu, Satoko
A1 - Azumi, Kaoru
A1 - Boore, Jeffrey
A1 - Branno, Margherita
A1 - Chin-Bow, Stephen
A1 - DeSantis, Rosaria
A1 - Doyle, Sharon
A1 - Francino, Pilar
A1 - Keys, David N
A1 - Haga, Shinobu
A1 - Hayashi, Hiroko
A1 - Hino, Kyosuke
A1 - Imai, Kaoru S
A1 - Inaba, Kazuo
A1 - Kano, Shungo
A1 - Kobayashi, Kenji
A1 - Kobayashi, Mari
A1 - Lee, Byung-In
A1 - Makabe, Kazuhiro W
A1 - Manohar, Chitra
A1 - Matassi, Giorgio
A1 - Medina, Monica
A1 - Mochizuki, Yasuaki
A1 - Mount, Steve
A1 - Morishita, Tomomi
A1 - Miura, Sachiko
A1 - Nakayama, Akie
A1 - Nishizaka, Satoko
A1 - Nomoto, Hisayo
A1 - Ohta, Fumiko
A1 - Oishi, Kazuko
A1 - Rigoutsos, Isidore
A1 - Sano, Masako
A1 - Sasaki, Akane
A1 - Sasakura, Yasunori
A1 - Shoguchi, Eiichi
A1 - Shin-i, Tadasu
A1 - Spagnuolo, Antoinetta
A1 - Stainier, Didier
A1 - Suzuki, Miho M
A1 - Tassy, Olivier
A1 - Takatori, Naohito
A1 - Tokuoka, Miki
A1 - Yagi, Kasumi
A1 - Yoshizaki, Fumiko
A1 - Wada, Shuichi
A1 - Zhang, Cindy
A1 - Hyatt, P Douglas
A1 - Larimer, Frank
A1 - Detter, Chris
A1 - Doggett, Norman
A1 - Glavina, Tijana
A1 - Hawkins, Trevor
A1 - Richardson, Paul
A1 - Lucas, Susan
A1 - Kohara, Yuji
A1 - Levine, Michael
A1 - Satoh, Nori
A1 - Rokhsar, Daniel S
KW - Alleles
KW - Animals
KW - Apoptosis
KW - Base Sequence
KW - Cellulose
KW - Central Nervous System
KW - Ciona intestinalis
KW - Computational Biology
KW - Endocrine System
KW - Gene Dosage
KW - Gene Duplication
KW - genes
KW - Genes, Homeobox
KW - Genome
KW - Heart
KW - Immunity
KW - Molecular Sequence Data
KW - Multigene Family
KW - Muscle Proteins
KW - Organizers, Embryonic
KW - Phylogeny
KW - Polymorphism, Genetic
KW - Proteins
KW - Sequence Analysis, DNA
KW - Sequence Homology, Nucleic Acid
KW - Species Specificity
KW - Thyroid Gland
KW - Urochordata
KW - Vertebrates
AB - The first chordates appear in the fossil record at the time of the Cambrian explosion, nearly 550 million years ago. The modern ascidian tadpole represents a plausible approximation to these ancestral chordates. To illuminate the origins of chordates and vertebrates, we generated a draft of the protein-coding portion of the genome of the most studied ascidian, Ciona intestinalis. The Ciona genome contains approximately 16,000 protein-coding genes, similar to the number in other invertebrates, but only half that found in vertebrates. Vertebrate gene families are typically found in simplified form in Ciona, suggesting that ascidians contain the basic ancestral complement of genes involved in cell signaling and development. The ascidian genome has also acquired a number of lineage-specific innovations, including a group of genes engaged in cellulose metabolism that are related to those in bacteria and fungi.
VL - 298
CP - 5601
M3 - 10.1126/science.1080049
ER -
TY - JOUR
T1 - A new, expressed multigene family containing a hot spot for insertion of retroelements is associated with polymorphic subtelomeric regions of Trypanosoma brucei.
JF - Eukaryot Cell
Y1 - 2002
A1 - Bringaud, Frederic
A1 - Biteau, Nicolas
A1 - Melville, Sara E
A1 - Hez, Stéphanie
A1 - El-Sayed, Najib M
A1 - Leech, Vanessa
A1 - Berriman, Matthew
A1 - Hall, Neil
A1 - Donelson, John E
A1 - Baltz, Théo
KW - Amino Acid Sequence
KW - Animals
KW - Base Sequence
KW - Cloning, Molecular
KW - DNA Primers
KW - DNA, Protozoan
KW - Escherichia coli
KW - Genes, Protozoan
KW - Molecular Sequence Data
KW - Multigene Family
KW - Mutagenesis, Insertional
KW - Phylogeny
KW - Polymorphism, Genetic
KW - Protozoan Proteins
KW - Pseudogenes
KW - Retroelements
KW - sequence alignment
KW - Sequence Homology, Amino Acid
KW - Telomere
KW - Trypanosoma brucei brucei
KW - Trypanosoma cruzi
AB - We describe a novel gene family that forms clusters in subtelomeric regions of Trypanosoma brucei chromosomes and partially accounts for the observed clustering of retrotransposons. The ingi and ribosomal inserted mobile element (RIME) non-LTR retrotransposons share 250 bp at both extremities and are the most abundant putatively mobile elements, with about 500 copies per haploid genome. From cDNA clones and subsequently in the T. brucei genomic DNA databases, we identified 52 homologous gene and pseudogene sequences, 16 of which contain a RIME and/or ingi retrotransposon inserted at exactly the same relative position. Here these genes are called the RHS family, for retrotransposon hot spot. Comparison of the protein sequences encoded by RHS genes (21 copies) and pseudogenes (24 copies) revealed a conserved central region containing an ATP/GTP-binding motif and the RIME/ingi insertion site. The RHS proteins share between 13 and 96% identity, and six subfamilies, RHS1 to RHS6, can be defined on the basis of their divergent C-terminal domains. Immunofluorescence and Western blot analyses using RHS subfamily-specific immune sera show that RHS proteins are constitutively expressed and occur mainly in the nucleus. Analysis of Genome Survey Sequence databases indicated that the Trypanosoma brucei diploid genome contains about 280 RHS (pseudo)genes. Among the 52 identified RHS (pseudo)genes, 48 copies are in three RHS clusters located in subtelomeric regions of chromosomes Ia and II and adjacent to the active bloodstream form expression site in T. brucei strain TREU927/4 GUTat10.1. RHS genes comprise the remaining sequence of the size-polymorphic "repetitive region" described for T. brucei chromosome I, and a homologous gene family is present in the Trypanosoma cruzi genome.
VL - 1
CP - 1
ER -