TY - JOUR
T1 - Genome assortment, not serogroup, defines Vibrio cholerae pandemic strains
JF - Nature
Y1 - 2009
A1 - Brettin, Thomas S.
A1 - Bruce, David C.
A1 - Challacombe, Jean F.
A1 - Detter, John C.
A1 - Han, Cliff S.
A1 - Munk, A. C.
A1 - Chertkov, Olga
A1 - Meincke, Linda
A1 - Saunders, Elizabeth
A1 - Choi, Seon Y.
A1 - Haley, Bradd J.
A1 - Taviani, Elisa
A1 - Jeon, Yoon-Seong
A1 - Kim, Dong Wook
A1 - Lee, Jae-Hak
A1 - Walters, Ronald A.
A1 - Huq, Anwar
A1 - Rita R. Colwell
KW - 59
KW - CHOLERA
KW - genes
KW - Genetics
KW - GENOTYPE
KW - ISLANDS
KW - ORIGIN
KW - PHENOTYPE
KW - PUBLIC HEALTH
KW - recombination
KW - STRAINS
KW - Toxins
AB - Vibrio cholerae, the causative agent of cholera, is a bacterium autochthonous to the aquatic environment, and a serious public health threat. V. cholerae serogroup O1 is responsible for the previous two cholera pandemics, in which classical and El Tor biotypes were dominant in the 6th and the current 7th pandemics, respectively. Cholera researchers continually face newly emerging and re-emerging pathogenic clones carrying combinations of new serogroups as well as of phenotypic and genotypic properties. These genotype and phenotype changes have hampered control of the disease. Here we compare the complete genome sequences of 23 strains of V. cholerae isolated from a variety of sources and geographical locations over the past 98 years in an effort to elucidate the evolutionary mechanisms governing genetic diversity and genesis of new pathogenic clones. The genome-based phylogeny revealed 12 distinct V. cholerae phyletic lineages, of which one, designated the V. cholerae core genome (CG), comprises both O1 classical and El Tor biotypes. All 7th pandemic clones share nearly identical gene content, i.e., the same genome backbone. The transition from 6th to 7th pandemic strains is defined here as a 'shift' between pathogenic clones belonging to the same O1 serogroup, but from significantly different phyletic lineages within the CG clade. In contrast, transition among clones during the present 7th pandemic period can be characterized as a 'drift' between clones, differentiated mainly by varying composition of laterally transferred genomic islands, resulting in emergence of variants, exemplified by V. cholerae serogroup O139 and V. cholerae O1 El Tor hybrid clones that produce cholera toxin of classical biotype. Based on the comprehensive comparative genomics presented in this study it is concluded that V. cholerae undergoes extensive genetic recombination via lateral gene transfer, and, therefore, genome assortment, not serogroup, should be used to define pathogenic V. cholerae clones.
ER -
TY - Generic
T1 - Inexact Local Alignment Search over Suffix Arrays
T2 - IEEE International Conference on Bioinformatics and Biomedicine, 2009. BIBM '09
Y1 - 2009
A1 - Ghodsi, M.
A1 - M. Pop
KW - bacteria
KW - Bioinformatics
KW - biology computing
KW - Computational Biology
KW - Costs
KW - DNA
KW - DNA homology searches
KW - DNA sequences
KW - Educational institutions
KW - generalized heuristic
KW - genes
KW - Genetics
KW - genome alignment
KW - Genomics
KW - human
KW - inexact local alignment search
KW - inexact seeds
KW - local alignment
KW - local alignment tools
KW - memory efficient suffix array
KW - microorganisms
KW - molecular biophysics
KW - mouse
KW - Organisms
KW - Sensitivity and Specificity
KW - sequences
KW - suffix array
KW - USA Councils
AB - We describe an algorithm for finding approximate seeds for DNA homology searches. In contrast to previous algorithms that use exact or spaced seeds, our approximate seeds may contain insertions and deletions. We present a generalized heuristic for finding such seeds efficiently and prove that the heuristic does not affect sensitivity. We show how to adapt this algorithm to work over the memory efficient suffix array with provably minimal overhead in running time. We demonstrate the effectiveness of our algorithm on two tasks: whole genome alignment of bacteria and alignment of the DNA sequences of 177 genes that are orthologous in human and mouse. We show our algorithm achieves better sensitivity and uses less memory than other commonly used local alignment tools.
JA - IEEE International Conference on Bioinformatics and Biomedicine, 2009. BIBM '09
PB - IEEE
SN - 978-0-7695-3885-3
ER -
TY - JOUR
T1 - Microbial oceanography in a sea of opportunity
JF - Nature
Y1 - 2009
A1 - Bowler, Chris
A1 - Karl, David M.
A1 - Rita R. Colwell
KW - Astronomy
KW - astrophysics
KW - Biochemistry
KW - Bioinformatics
KW - Biology
KW - biotechnology
KW - cancer
KW - cell cycle
KW - cell signalling
KW - climate change
KW - Computational Biology
KW - development
KW - developmental biology
KW - DNA
KW - drug discovery
KW - earth science
KW - ecology
KW - environmental science
KW - Evolution
KW - evolutionary biology
KW - functional genomics
KW - Genetics
KW - Genomics
KW - geophysics
KW - immunology
KW - interdisciplinary science
KW - life
KW - marine biology
KW - materials science
KW - medical research
KW - medicine
KW - metabolomics
KW - molecular biology
KW - molecular interactions
KW - nanotechnology
KW - Nature
KW - neurobiology
KW - neuroscience
KW - palaeobiology
KW - pharmacology
KW - Physics
KW - proteomics
KW - quantum physics
KW - RNA
KW - Science
KW - science news
KW - science policy
KW - signal transduction
KW - structural biology
KW - systems biology
KW - transcriptomics
AB - Plankton use solar energy to drive the nutrient cycles that make the planet habitable for larger organisms. We can now explore the diversity and functions of plankton using genomics, revealing the gene repertoires associated with survival in the oceans. Such studies will help us to appreciate the sensitivity of ocean systems and of the ocean's response to climate change, improving the predictive power of climate models.
VL - 459
SN - 0028-0836
ER -
TY - JOUR
T1 - A book like its cover
JF - Heredity
Y1 - 2004
A1 - Michael P. Cummings
KW - animal and plant breeding
KW - biometrical and statistical genetics
KW - cytogenetics
KW - ecological
KW - eukaryotes
KW - Genetics
KW - Genomics
KW - human population genetics
KW - population and evolutionary genetics
KW - post-genomics
AB - An official journal of the Genetics Society, Heredity publishes high-quality articles describing original research and theoretical insights in all areas of genetics. Research papers are complemented by News & Commentary articles and reviews, keeping researchers and students abreast of hot topics in the field.
VL - 93
SN - 0018-067X
ER -
TY - Generic
T1 - Dynamic querying for pattern identification in microarray and genomic data
T2 - 2003 International Conference on Multimedia and Expo, 2003. ICME '03. Proceedings
Y1 - 2003
A1 - Hochheiser, H.
A1 - Baehrecke, E. H.
A1 - Stephen M. Mount
A1 - Shneiderman, Ben
KW - Bioinformatics
KW - data sets
KW - Displays
KW - dynamic querying
KW - expression profiles
KW - Frequency
KW - Gene expression
KW - genes
KW - Genetics
KW - genomic data
KW - Genomics
KW - linear ordered sequences
KW - macromolecules
KW - medical signal processing
KW - Mice
KW - Microarray
KW - pattern identification
KW - pattern recognition
KW - pre-mRNA splicing
KW - Query processing
KW - sequences
KW - Signal processing
KW - splicing
KW - TimeSearcher
AB - Data sets involving linear ordered sequences are a recurring theme in bioinformatics. Dynamic query tools that support exploration of these data sets can be useful for identifying patterns of interest. This paper describes the use of one such tool - TimeSearcher - to interactively explore linear sequence data sets taken from two bioinformatics problems. Microarray time course data sets involve expression levels for large numbers of genes over multiple time points. TimeSearcher can be used to interactively search these data sets for genes with expression profiles of interest. The occurrence frequencies of short sequences of DNA in aligned exons can be used to identify sequences that play a role in pre-mRNA splicing. TimeSearcher can be used to search these data sets for candidate splicing signals.
JA - 2003 International Conference on Multimedia and Expo, 2003. ICME '03. Proceedings
PB - IEEE
VL - 3
SN - 0-7803-7965-9
ER -
TY - JOUR
T1 - Genetic consequences of ecological reserve design guidelines: An empirical investigation
JF - Conserv Genet
Y1 - 2003
A1 - Neel, M. C.
A1 - Michael P. Cummings
KW - albens
KW - Astragalus
KW - Bernardino
KW - conservation
KW - design
KW - diversity
KW - Erigeron
KW - Eriogonum
KW - genetic
KW - Genetics
KW - goodmaniana
KW - Mountains
KW - ovalifolium
KW - Oxytheca
KW - parishii
KW - plant
KW - reserve
KW - San
KW - var.
KW - vineum
AB - We assessed the genetic diversity consequences of applying ecological reserve design guidelines to four federally-listed globally-rare plant species. Consequences were measured using two metrics: proportion of all alleles and of common alleles included in reserves. Common alleles were defined as those alleles having a frequency of greater than or equal to 0.05 in at least one population. Four conservation professionals applied ecological reserve guidelines to choose specific populations of each species for inclusion in reserves of size 1 to N - 1, where N is the total number of populations of each species. Information regarding genetic diversity was not used in selecting populations. The resulting reserve designs were compared to random designs, and the agreement among experts was assessed using Kendall's coefficient of concordance. Application of ecological reserve design guidelines proved mostly ineffective in capturing more genetic diversity than is captured by selecting populations randomly. Meeting established targets for genetic diversity, such as one advocated by the Center for Plant Conservation, required larger numbers of populations than are suggested to be sufficient. Relative performance of expert designs differed among species and was dependent on whether the proportion of all alleles or of common alleles was used as a measure of diversity. Furthermore, there was no significant concordance among experts in the order in which populations were incorporated into reserves, as experts differed in the priority they placed on individual guidelines.
VL - 4
ER -
TY - JOUR
T1 - Transforming cabbage into turnip: polynomial algorithm for sorting signed permutations by reversals
JF - J. ACM
Y1 - 1999
A1 - Sridhar Hannenhalli
A1 - Pevzner, Pavel A.
KW - Computational Biology
KW - Genetics
AB - Genomes frequently evolve by reversals $\rho(i,j)$ that transform a gene order $\pi_1 \ldots \pi_i \pi_{i+1} \ldots \pi_{j-1} \pi_j \ldots \pi_n$ into $\pi_1 \ldots \pi_i \pi_{j-1} \ldots \pi_{i+1} \pi_j \ldots \pi_n$. Reversal distance between permutations $\pi$ and $\sigma$ is the minimum number of reversals to transform $\pi$ into $\sigma$. Analysis of genome rearrangements in molecular biology started in the late 1930s, when Dobzhansky and Sturtevant published a milestone paper presenting a rearrangement scenario with 17 inversions between the species of Drosophila. Analysis of genomes evolving by inversions leads to a combinatorial problem of sorting by reversals studied in detail recently. We study sorting of signed permutations by reversals, a problem that adequately models rearrangements in small genomes like chloroplast or mitochondrial DNA. The previously suggested approximation algorithms for sorting signed permutations by reversals compute the reversal distance between permutations with an astonishing accuracy for both simulated and biological data. We prove a duality theorem explaining this intriguing performance and show that there exists a “hidden” parameter that allows one to compute the reversal distance between signed permutations in polynomial time.
VL - 46
SN - 0004-5411
ER -
TY - Generic
T1 - Transforming men into mice (polynomial algorithm for genomic distance problem)
T2 - Foundations of Computer Science, Annual IEEE Symposium on
Y1 - 1995
A1 - Sridhar Hannenhalli
A1 - Pevzner, P. A.
KW - biology computing
KW - combinatorial properties
KW - comparative physical mapping data
KW - computable parameters
KW - duality (mathematics)
KW - duality theorem
KW - evolution (biological)
KW - Genetics
KW - genome rearrangement algorithm
KW - genomic distance problem
KW - genomic rearrangements
KW - human-mouse evolution
KW - mammalian evolution
KW - multi chromosomal genomes
KW - parsimonious rearrangement scenarios
KW - pattern matching
KW - polynomial algorithm
KW - polynomial time algorithm
KW - set theory
KW - sorting
KW - string matching
KW - strings
KW - zoo fish
AB - Many people believe that transformations of humans into mice happen only in fairy tales. However, despite some differences in appearance and habits, men and mice are genetically very similar. In the pioneering paper, J.H. Nadeau and B.A. Taylor (1984) estimated that surprisingly few genomic rearrangements (178 ± 39) happened since the divergence of human and mouse 80 million years ago. However, their analysis is nonconstructive and no rearrangement scenario for human-mouse evolution has been suggested yet. The problem is complicated by the fact that rearrangements in multi chromosomal genomes include inversions, translocations, fusions and fissions of chromosomes, a rather complex set of operations. As a result, at first glance, a polynomial algorithm for the genomic distance problem with all these operations looks almost as improbable as the transformation of a (real) man into a (real) mouse. We prove a duality theorem which expresses the genomic distance in terms of easily computable parameters reflecting different combinatorial properties of sets of strings. This theorem leads to a polynomial time algorithm for computing most parsimonious rearrangement scenarios. Based on this result and the latest comparative physical mapping data we have constructed a scenario of human-mouse evolution with 131 reversals/translocations/fusions/fissions. A combination of the genome rearrangement algorithm with the recently proposed experimental technique called ZOO FISH suggests a new constructive approach to the 100-year-old problem of reconstructing mammalian evolution.
JA - Foundations of Computer Science, Annual IEEE Symposium on
PB - IEEE Computer Society
CY - Los Alamitos, CA, USA
ER -
TY - Generic
T1 - A SIMD solution to the sequence comparison problem on the MGAP
T2 - International Conference on Application Specific Array Processors, 1994. Proceedings
Y1 - 1994
A1 - Borah, M.
A1 - Bajwa, R. S.
A1 - Sridhar Hannenhalli
A1 - Irwin, M. J.
KW - AT-optimal algorithm
KW - Biological information theory
KW - biology computing
KW - biosequence comparison problem
KW - computational complexity
KW - Computer science
KW - Costs
KW - database size
KW - Databases
KW - DNA computing
KW - dynamic programming
KW - dynamic programming algorithms
KW - fine-grained massively parallel processor array
KW - Genetics
KW - Heuristic algorithms
KW - maximally similar sequence
KW - MGAP parallel computer
KW - Micro-Grain Array Processor
KW - Military computing
KW - molecular biology
KW - molecular biophysics
KW - Nearest neighbor searches
KW - nearest-neighbor connections
KW - Parallel algorithms
KW - pipeline processing
KW - pipelined SIMD solution
KW - sequence alignment problem
KW - sequences
AB - Molecular biologists frequently compare an unknown biosequence with a set of other known biosequences to find the sequence which is maximally similar, with the hope that what is true of one sequence, either physically or functionally, could be true of its analogue. Even though efficient dynamic programming algorithms exist for the problem, when the size of the database is large, the time required is quite long, even for moderate length sequences. In this paper, we present an efficient pipelined SIMD solution to the sequence alignment problem on the Micro-Grain Array Processor (MGAP), a fine-grained massively parallel array of processors with nearest-neighbor connections. The algorithm compares K sequences of length O(M) with the actual sequence of length N, in O(M+N+K) time with O(MN) processors, which is AT-optimal. The implementation on the MGAP computes at the rate of about 0.1 million comparisons per second for sequences of length 128.
JA - International Conference on Application Specific Array Processors, 1994. Proceedings
PB - IEEE
SN - 0-8186-6517-3
ER -