I joined the Research Institute of Oncology and Hematology (RIOH) after working as a bioinformatician at the Supercomputing Institute of the University of Minnesota and the NGS platform at the University of Manitoba. I have collaborated on a variety of projects ranging from mammalian cancer genomics to plant and microorganism genomes. My expertise lies in applying bioinformatics approaches to biological problems, particularly in unveiling genomic and epigenomic regimes using large high-throughput datasets, including DNA-seq, RNA-seq, ChIP-seq, and Methyl-seq, as well as microarray data. My current research focuses on the following areas:
1. Discovering gene signatures and targets for precision medicine in cancer therapy
Even among patients with the same type of cancer, each patient differs in the factors that initiate the tumor and in the response to drug treatment. With advances in next-generation sequencing (NGS) technology, multiple international consortia have made large genomic and epigenomic datasets for various cancer types publicly accessible. Our lab develops new bioinformatics models and uses these data to discover new gene signatures and gene targets, with the aim of helping tailor therapies to individual patients.
2. Investigating tumorigenesis beyond driver gene mutations
It is generally accepted that driver gene mutations initiate tumorigenesis. However, very few driver mutations are shared among cancers, and even in a single patient with multiple tumors of the same type, there are few or no common drivers. Our lab works with an international group of collaborators on targeted, whole-exome, and whole-genome sequencing data. We investigate SNVs, CNVs, SVs, indels, LOH, and cellular clonal structure in order to uncover new tumorigenesis factors in lung cancer.
3. Developing new algorithms and tools for large sequencing data analysis
Next-generation sequencing (NGS) technologies have produced large amounts of genomic data across a variety of experiments, with a tremendous impact on life science research. RNA-seq uses large numbers of short sequence reads to interrogate the transcriptome, that is, which RNAs are present and at what levels. Accurately quantifying RNA levels without high computational cost has been one of the focuses of NGS bioinformatics research. Our lab is designing an alternative RNA-seq expression estimation method called the Feature Structure (FEST) method. In contrast to current RNA-seq expression estimation methods that do not account for GC bias and RNA degradation, FEST will handle read redundancy, GC bias probability, and degradation rate for each read to estimate the expression level.