Emerging single-cell genomics technologies such as single-cell RNA sequencing provide new opportunities to discover previously unknown cell types and to study biological processes such as cancer development. Clustering and visualization using dimensionality reduction techniques such as t-SNE and UMAP are fundamental steps in analyzing the high-dimensional data these technologies produce. However, computational methods have been challenged by the exponential growth of the data, driven by large-scale genomic projects such as the Human Cell Atlas. In this talk, we will introduce Specter, a computational method that utilizes recent algorithmic advances in fast spectral clustering and ensemble learning. Specter achieves a substantial improvement in accuracy over existing methods and identifies rare cell types with high sensitivity. Moreover, it is fast enough to scale to millions of cells in practice. In addition, we will present j-SNE and j-UMAP, generalizations of t-SNE and UMAP to the joint visualization of multimodal omics data, e.g., CITE-seq data, which simultaneously measure gene and protein marker expression. The approach automatically learns the relative importance of each modality in order to obtain a concise representation of the data.
Speaker: Mr. Do Van Hoan, LMU Munich
Time: 15:30, Tuesday, July 13, 2021
Venue: Webinar; Access link: https://bit.ly/3wvKLSL
Mr. Van-Hoan Do received his Bachelor's degree in Mathematics from the Vietnam National University, Hanoi (VNU), Vietnam (2013) and his Master's degree in Mathematics from the Freie Universität Berlin and the Berlin Mathematical School in Germany (2017). He is currently a fourth-year PhD student at the Gene Center, LMU Munich. His work focuses on developing computational methods and user-friendly tools for the analysis of large-scale single-cell RNA-seq and multimodal omics data, including Sphetcher, Specter, and Jvis. These methods have been published in high-impact journals such as Genome Research and Genome Biology. His research interests include machine learning, optimization, and big data.