CBCB Research in Progress Series (RIPS) & CBCB Industry Speaker Series

UPDATE (Jan. 24, 2022): For Spring 2022, CBCB Seminars will be held virtually via Zoom. Unless otherwise noted, Seminar Talks will take place on Thursday afternoons at 2PM. The format for the RIPS talks will be the same as in previous semesters: each week we aim to have two 30-minute talks. If you need a full hour slot, please note this in the second row of the corresponding date on the spreadsheet. The signup sheet for this semester is here. For Zoom meeting details to view RIPS talks, contact Barbara Lewis.

The CBCB RIP series provides an informal forum for computational biologists to keep abreast of colleagues' projects, to help students and postdocs hone their presentation skills, and to get expert feedback on new or ongoing projects. The forum is targeted towards anyone working at the interface of Biology and Analytical sciences. This is a great opportunity for everyone in our CBCB community to come together, and to learn about the research being done by our colleagues in different labs within the center.

Other seminars you may be interested in attending can be found HERE

    CBCB RIPS & Industry Speaker Series Schedule for Spring Semester 2022




    Topic & Abstract

    Time (if other than 2PM)


    Mikhail Kolmogorov

    New Investigator at NCI

    Algorithms for genome and metagenome assembly using long reads

    Abstract: Long-read sequencing technologies have substantially improved our ability to study large and complex genomes. However, de novo assembly of complex genomic and metagenomic datasets remains difficult. In this talk, I will give an algorithmic overview of the genome assembly problem. I will also highlight our Flye assembler, which uses repeat graphs to generate accurate and complete assemblies. Finally, I will present our new metagenomic assembler metaFlye, which addresses important long-read metagenomic assembly challenges, such as uneven bacterial composition and intra-species heterogeneity. Using metaFlye, we were able to recover complete or nearly complete bacterial genomes from complex environmental samples, such as human gut or cow rumen. We also showed that long-read assembly of human microbiomes enables the discovery of full-length biosynthetic gene clusters that encode biomedically important natural products.




    Dr. Laura Dillon

    Invited industry speaker - Parthenon Therapeutics (Alum - El-Sayed Lab)

    Leveraging computational biology expertise in oncology translational medicine at big pharma and small biotech

    Abstract: I’ll share how my career path at NIH, FDA, AstraZeneca, and Parthenon Therapeutics was influenced by my CBBG training and vice versa. Highlighted topics will include machine learning in digital pathology, emerging imaging technologies, and biomarker discovery for oncology biologics. I’ll provide insights into how students can pursue a bioinformatics or bioinformatics-adjacent career in industry and make themselves more attractive to hiring managers.



    Dr. Wai Lim Ku

    NIH (Host: Najib El-Sayed)

    The roles of bivalent histone modifications in mediating cellular plasticity at the single-cell level

    Abstract: Cellular plasticity describes how effectively one cell type converts to another in response to changing conditions. It plays a critical role in several diseases, and its regulators can contribute to the development of novel therapeutic strategies; however, the underlying global mechanism of plasticity is still unknown. A consensus is that bivalent domains, which are chromatin regions where H3K4me3 (a histone modification related to gene activation) and H3K27me3 (a histone modification related to gene repression) coexist, are critical for cellular plasticity. However, their functional role in plasticity is challenged by their universal existence in different cell types, including somatic cells. By applying an innovative single-cell technique, we captured genome-wide profiles of H3K4me3 and H3K27me3 for human white blood cells and observed a positive correlation between the cell-to-cell variations of H3K4me3 and H3K27me3 at bivalent domains. To further interpret this result, we developed a computational model to simulate the histone dynamics at bivalent domains, and discovered that the histone interactions among bivalent domains are one of the key factors contributing to the positive correlation. Furthermore, the histone interaction network of bivalent domains inferred from the single-cell experimental data is cell-type specific, suggesting that cellular plasticity, and the progression of cell development, is regulated by the histone interaction network of bivalent domains. Our work reveals unique insights into the relationship between cellular states and bivalent domains and may offer a basis for future drug development.



    Spring Break - No RIPS



    Seth Commichaux

    Mihai Pop and Hugh Rand

    Assessing the added value of horizontally transferred plasmids for the 2020 Salmonella enterica Newport onion outbreak investigation

    Abstract: The Salmonella enterica Newport red onion outbreak of 2020 was the largest foodborne outbreak of Salmonella in over a decade. The epidemiological investigation implicated two onion farms in California as the likely source of contamination. However, single nucleotide polymorphism analysis of the whole genome sequencing data found that none of the isolates collected from the farm regions were closely related to the clinical isolates—preventing the use of phylogenetics in source identification. Here we explored an alternative method for analyzing the whole genome sequencing data, driven by the hypothesis that clinical and environmental isolates from the same microbiome might be related by horizontally transferred plasmids.


    Dr. Mohamed Gunady

    Invited industry speaker - Illumina

    Mohamed is currently a Bioinformatics Scientist I on the IPA team at Illumina. Prior to his work at Illumina, he worked at The Henry M Jackson Foundation for the Advancement of Military Medicine for one year. Mohamed graduated from UMD after receiving his PhD in Computer Science advised by Dr. Héctor Corrada Bravo. Mohamed attended The University of Alexandria for his bachelor's degree in Computer Engineering, followed by Egypt-Japan University of Science and Technology for his master's degree in computer science.


    Mark Mammel

    Invited industry speaker - FDA Center for Food Safety and Applied Nutrition

    Genotyping Cyclospora in environmental samples by AmpliSeq

    Abstract: Cyclospora cayetanensis is a foodborne eukaryotic parasite causing watery diarrhea in the US and worldwide. Genotyping and clustering of samples would enable determination of a potential link between environmental/produce samples and clinical samples. Targeted sequences are amplified and then sequenced. The sequenced amplicons are matched to a database of known alleles and the presence/absence of each allele is used for "Eukaryotyping" analysis. An ensemble method combines distance matrices calculated by two different methods and uses hierarchical clustering to show the similarity among the samples.
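The ensemble-and-cluster step described in this abstract can be sketched roughly as follows. This is an illustration only, not the speaker's pipeline: the abstract does not name the two distance methods or the linkage rule, so normalized Hamming distance, Jaccard distance, a simple average of the two, and average linkage are all assumptions here, and the sample names and allele profiles are hypothetical.

```python
from itertools import combinations

def hamming(a, b):
    """Fraction of alleles whose presence/absence differs (assumed method 1)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def jaccard(a, b):
    """1 - shared present alleles / alleles present in either (assumed method 2)."""
    both = sum(x and y for x, y in zip(a, b))
    either = sum(x or y for x, y in zip(a, b))
    return 0.0 if either == 0 else 1 - both / either

def ensemble_distances(profiles):
    """Average the two distances for every pair of samples."""
    return {
        frozenset((s, t)): (hamming(profiles[s], profiles[t])
                            + jaccard(profiles[s], profiles[t])) / 2
        for s, t in combinations(profiles, 2)
    }

def average_linkage(profiles):
    """Agglomerative clustering; returns (cluster, cluster, distance) merges in order."""
    dist = ensemble_distances(profiles)
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        def linkage(pair):
            # Mean ensemble distance over all cross-cluster sample pairs.
            c1, c2 = pair
            total = sum(dist[frozenset((s, t))] for s in c1 for t in c2)
            return total / (len(c1) * len(c2))
        c1, c2 = min(combinations(clusters, 2), key=linkage)
        merges.append((c1, c2, linkage((c1, c2))))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical presence/absence profiles over six alleles (1 = allele detected).
profiles = {
    "clinical_A": [1, 1, 0, 1, 0, 0],
    "clinical_B": [1, 1, 0, 1, 0, 1],
    "produce_C":  [0, 0, 1, 0, 1, 1],
}
merges = average_linkage(profiles)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

In this toy input the two clinical samples share most alleles and merge first; the produce sample joins last at a much larger distance.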



    Yunheng Han

    Erin Molloy

    TREE-QMC framework

    Abstract: Increasingly, species trees are estimated from genome-scale data, bringing attention to the fact that different regions of the genome can have different evolutionary histories from each other and from the species tree due to incomplete lineage sorting (ILS). This has led to the development of many new methods, some of which take estimated gene trees as input. In this RIPS talk, I will present TREE-QMC, a new method that leverages the algorithmic framework proposed in weighted Quartet Max Cut (wQMC; Avni et al., 2015). However, unlike wQMC, which requires a set of O(n^4) weighted quartets as input, TREE-QMC operates directly on the gene trees and runs in O(n^3 k) time. We found that the accuracy of this approach decreased with increasing numbers of species, which led us to introduce two novel normalization techniques. These substantially improved accuracy, making TREE-QMC at least as accurate as ASTRAL-III and FASTRAL, two of the leading methods for this problem, both of which are based on exact optimization within a constrained version of the solution space. Like FASTRAL, TREE-QMC was substantially faster than ASTRAL-III and wQMC. Finally, we re-analyze an avian data set of 3,679 ultraconserved elements with TREE-QMC, comparing the resulting tree to those from prior studies.
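To give a sense of why an explicit O(n^4) quartet input becomes a bottleneck, the short sketch below counts the four-taxon subsets for a few species counts. It only illustrates the growth rate mentioned in the abstract and is not part of TREE-QMC or wQMC; the function name and the example values of n are hypothetical.

```python
from math import comb

def quartet_subsets(n_species):
    """Number of distinct four-taxon subsets among n_species taxa: C(n, 4)."""
    return comb(n_species, 4)

# Each subset admits three possible unrooted quartet topologies, so an
# explicit quartet-based input grows on the order of n^4.
counts = {n: quartet_subsets(n) for n in (10, 100, 1000)}
for n, c in counts.items():
    print(f"{n} species -> {c} four-taxon subsets")
```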


    Dr. Daekwon Seo

    Invited industry speaker - VP and Director of the Bioinformatics Dept. at Psomagen

    Bioinformatics in Genomics

    Abstract: Nowadays, next-generation sequencing technology is widely adopted to solve biological problems. In this talk, speaking as a sequencing provider, I will introduce services provided by Psomagen Inc. I will also present an application in liver stem cell research: the first report of successful reprogramming of human hepatocytes to a population of proliferating bipotent cells with regenerative potential.