Genome/Metagenome Assembly


The analysis of these vast amounts of data is complicated by the fact that reconstructing large genomic segments from metagenomic reads is a formidable computational challenge. Even for single organisms, the assembly of genome sequences from sequencing reads is a complex task, primarily due to ambiguities in the reconstruction that are caused by genomic repeats. In metagenomic data, additional challenges arise from the non-uniform representation of genomes in a sample as well as from the genomic variants between the sequences of closely related organisms.
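The ambiguity caused by repeats can be made concrete with a small sketch. The example below assumes a de Bruijn graph formulation, which is one common way to model assembly and is not necessarily the approach used by the projects listed here. The function names and the toy sequence are hypothetical. A repeated segment collapses into a single path in the graph, and the node where that path ends has more than one possible continuation.

```python
from collections import defaultdict

def de_bruijn_edges(sequences, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some k-mer."""
    graph = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def branching_nodes(graph):
    """Nodes with more than one successor: points where the reconstruction is ambiguous."""
    return sorted(node for node, successors in graph.items() if len(successors) > 1)

# Toy genome (hypothetical): the segment GCTTG occurs twice with different right flanks.
genome = "AACGCTTGACAGCTTGTCC"
graph = de_bruijn_edges([genome], k=4)
print(branching_nodes(graph))   # nodes where the repeat forces a choice
print(sorted(graph["TTG"]))     # the two competing continuations after the repeat
```

In a metagenomic sample the same kind of branching also arises from variants between closely related organisms, so a branch alone does not say whether it reflects a repeat or strain variation.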


metAMOS - assembly and analysis toolkit for metagenomics

metAMOS is an integrated assembly and analysis pipeline for metagenomic data. It is built around the Bambus2 metagenomic scaffolder and includes many current tools for assembly, gene finding, and taxonomic classification. metAMOS is under active development and changes quite frequently.

Obtaining metAMOS

Genomic Variation

Graphs to Diversity: extracting genomic variation from sequence graphs

Sequencing as a tool for uncovering genome variation

Principal Investigators

Metagenomic Assembly

Assembly and Analysis Software for Exploring the Human Microbiome

Gene Finding

Metagenomic assembly

VALET - VALidating mETagenomic assemblies

VALET is a pipeline for performing de novo validation of metagenomic assemblies. VALET checks a number of properties that should hold true for a correct assembly (e.g., mate-pairs align at the expected distance from each other in the assembly, and the depth of coverage is fairly uniform along contigs). Violations of these invariants are reported, allowing one to pinpoint areas that were potentially mis-assembled or to compare the quality of different assemblies.
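The two invariants mentioned above can be sketched in a few lines. This is an illustration of the idea only and not VALET's implementation: the function names, window size, and thresholds are assumptions chosen for the example, and the inputs (per-base depths and mate-pair alignment positions) are presumed to have been computed already.

```python
from statistics import median

def flag_coverage_outliers(depths, window=5, low=0.5, high=2.0):
    """Return (start, end) windows whose mean depth is far from the contig's median depth."""
    typical = median(depths)
    flagged = []
    for start in range(0, len(depths) - window + 1, window):
        mean_depth = sum(depths[start:start + window]) / window
        if typical > 0 and not (low * typical <= mean_depth <= high * typical):
            flagged.append((start, start + window))
    return flagged

def flag_mate_pairs(pairs, expected, tolerance):
    """Return mate pairs whose aligned distance differs from the expected distance by more than tolerance."""
    return [(left, right) for left, right in pairs
            if abs((right - left) - expected) > tolerance]

# Hypothetical contig: uniform depth except for one window with three times the coverage,
# which could indicate a collapsed repeat.
depths = [10] * 10 + [30] * 5 + [10] * 5
print(flag_coverage_outliers(depths))

# Hypothetical mate pairs given as (left position, right position) on the contig.
pairs = [(100, 600), (200, 1900), (300, 790)]
print(flag_mate_pairs(pairs, expected=500, tolerance=50))
```

Regions flagged by more than one check are the most likely candidates for mis-assembly, and the number of flagged regions gives a rough basis for comparing assemblies of the same data.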

Genome assembly validation

Despite continued advances in the development of assembly algorithms, few tools are available that evaluate the correctness of the assemblies generated. With the exception of the few genomes that are manually curated by experts during an expensive process called finishing, most genome data is published as "draft" assemblies whose quality is uncertain. The correctness of the long range connectivity of the assembly is an essential prerequisite for any comparative genomic studies, as mis-assemblies can lead to incorrect conclusions.

Students and Postdoctoral researchers:

Genome Sequence Assembly

Although the assembly of bacterial genomes has become a routine task at major sequencing centers, the assembly problem is far from solved. Many new challenges are uncovered as scientists tackle diverse new organisms. Furthermore, new sequencing technologies will change the assumptions currently made about the characteristics of the data being assembled.

Genome Assembly and Analysis with Optical Restriction Maps

Optical Mapping Data as a Guide for Genome Assembly

Genome assembly -- the task of reconstructing a genome from the small fragments of DNA that can be sequenced by modern technologies -- is a difficult computational problem, in no small part due to the fact that the shotgun sequencing process cannot preserve the long-range structure of the genome being assembled. Optical mapping is a genomic technology, pioneered by David Schwartz, which can map the location of restriction sites along a genomic chromosome. Thus, optical mapping provides a long-range sparse representation
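As a rough illustration of what a restriction map of a sequence looks like, the sketch below computes an in silico map for a contig: the positions of a recognition site and the ordered fragment lengths between them. The function names and the toy contig are hypothetical, and the sketch assumes for simplicity that the cut falls at the start of each site, whereas real enzymes cut at a fixed offset within the site.

```python
def restriction_sites(sequence, site):
    """Return the start positions of every occurrence of the recognition site."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions

def fragment_lengths(sequence, site):
    """Ordered fragment lengths obtained by cutting at the start of each site (a simplification)."""
    cuts = [0] + restriction_sites(sequence, site) + [len(sequence)]
    return [b - a for a, b in zip(cuts, cuts[1:]) if b > a]

# Hypothetical contig containing two GAATTC sites (the EcoRI recognition sequence).
contig = "ACGT" * 3 + "GAATTC" + "TTAGGC" * 2 + "GAATTC" + "CCA"
print(restriction_sites(contig, "GAATTC"))
print(fragment_lengths(contig, "GAATTC"))
```

Comparing such computed fragment lengths against an experimentally measured map is one way the ordering of contigs could be checked, although measured fragment sizes carry error and an exact match should not be expected.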


iMetAMOS is an automated ensemble assembly pipeline; iMetAMOS encapsulates the process of running, validating, and selecting a single assembly from multiple assemblies. iMetAMOS packages several leading open-source tools into a single binary that automates parameter selection and execution of multiple assemblers, scores the resulting assemblies based on multiple validation metrics, and annotates the assemblies for genes and contaminants. iMetAMOS is available as a workflow within the metAMOS package starting with version 1.5.
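The idea of selecting one assembly from several can be sketched as follows. This is not iMetAMOS's scoring scheme: the two metrics (N50 and a count of suspected errors), the rank-sum combination, and all names and numbers are assumptions made for illustration.

```python
def n50(lengths):
    """Largest length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0

def pick_assembly(candidates):
    """Rank candidates per metric, sum the ranks, and return the name with the lowest sum.

    candidates maps a name to (contig lengths, suspected error count).
    """
    names = list(candidates)
    by_n50 = sorted(names, key=lambda n: -n50(candidates[n][0]))   # higher N50 ranks first
    by_errors = sorted(names, key=lambda n: candidates[n][1])      # fewer errors ranks first
    rank_sum = {n: by_n50.index(n) + by_errors.index(n) for n in names}
    return min(names, key=lambda n: rank_sum[n])

# Hypothetical results from three assemblers run on the same reads.
candidates = {
    "assembler_a": ([50, 40, 10], 9),          # most contiguous, most suspected errors
    "assembler_b": ([45, 35, 20], 1),          # slightly less contiguous, few errors
    "assembler_c": ([20, 20, 20, 20, 20], 2),  # fragmented
}
print(pick_assembly(candidates))
```

Combining ranks rather than raw values keeps metrics on different scales from dominating each other, at the cost of ignoring how large the differences between candidates are.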

