TY - Generic
T1 - MetaPhyler: Taxonomic profiling for metagenomic sequences
T2 - 2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Y1 - 2010
A1 - Liu, Bo
A1 - Gibbons, T.
A1 - Ghodsi, M.
A1 - Pop, M.
KW - Bioinformatics
KW - CARMA comparison
KW - Databases
KW - Genomics
KW - Linear regression
KW - marker genes
KW - matching length
KW - Megan comparison
KW - metagenomic sequences
KW - metagenomics
KW - MetaPhyler
KW - microbial diversity
KW - microorganisms
KW - molecular biophysics
KW - molecular configurations
KW - Pattern classification
KW - pattern matching
KW - phylogenetic classification
KW - Phylogeny
KW - PhymmBL comparison
KW - reference gene database
KW - Sensitivity
KW - sequence matching
KW - taxonomic classifier
KW - taxonomic level
KW - taxonomic profiling
KW - whole metagenome sequencing data
AB - A major goal of metagenomics is to characterize the microbial diversity of an environment. The most popular approach relies on 16S rRNA sequencing, however this approach can generate biased estimates due to differences in the copy number of the 16S rRNA gene between even closely related organisms, and due to PCR artifacts. The taxonomic composition can also be determined from whole-metagenome sequencing data by matching individual sequences against a database of reference genes. One major limitation of prior methods used for this purpose is the use of a universal classification threshold for all genes at all taxonomic levels. We propose that better classification results can be obtained by tuning the taxonomic classifier to each matching length, reference gene, and taxonomic level. We present a novel taxonomic profiler MetaPhyler, which uses marker genes as a taxonomic reference. Results on simulated datasets demonstrate that MetaPhyler outperforms other tools commonly used in this context (CARMA, Megan and PhymmBL). We also present interesting results obtained by applying MetaPhyler to a real metagenomic dataset.
JA - 2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
PB - IEEE
SN - 978-1-4244-8306-8
ER -