Yun (Renee) Zhang, PhD

Assistant Professor

Yun Zhang, PhD, is an assistant professor in the Informatics Department at the J. Craig Venter Institute (JCVI). She received an MMath in Mathematics and Statistics from the University of Oxford, UK, and a PhD in Statistics from the University of Rochester Medical Center. She also has industry and research experience at Novartis Oncology and Mayo Clinic.

Dr. Zhang’s research interests include statistical modeling and methodology development for big data produced by advanced biotechnologies. She is experienced in analyzing time-course microarray data, DNA methylation data, and microRNA sequencing data. She is also a professional developer of R and Bioconductor packages. Her recent focus is on applying statistical approaches to single-cell RNA sequencing (scRNAseq) data.

Research Priorities

Mapping cell populations in scRNAseq data
  • Development of a statistical approach for comparing new experimental data with cell type reference definitions to determine whether the new data represent existing or novel cell types
  • Development of a statistically comparable representation of reference cell types for the Human Cell Atlas
Gene set enrichment analysis (GSEA) pipelines with overlapping genes
  • Established the FUNNEL-GSEA pipeline for time-course gene expression data using functional data analysis techniques
  • Development of a data-driven method to empirically decompose gene membership among multiple overlapping pathways
  • Extension of FUNNEL to data with limited time points, e.g., cross-sectional data and pre-post studies
Investigation of the effect of tissue composition on gene co-expression
  • Investigation of the effect of composite cellular types on the reconstruction of gene co-expression networks
  • Application of deconvolution algorithms to tissue composition problems

Publications

Scientific data. 2023-01-24; 10.1: 50.
Brain Data Standards - A method for building data-driven cell-type ontologies
Tan SZK, Kir H, Aevermann BD, Gillespie T, Harris N, Hawrylycz MJ, Jorstad NL, Lein ES, Matentzoglu N, Miller JA, Mollenkopf TS, Mungall CJ, Ray PL, Sanchez REA, Staats B, Vermillion J, Yadav A, Zhang Y, Scheuermann RH, Osumi-Sutherland D
PMID: 36693887
Bioinformatics (Oxford, England). 2022-10-14; 38.20: 4735-4744.
FastMix: a versatile data integration pipeline for cell type-specific biomarker inference
Zhang Y, Sun H, Mandava A, Aevermann BD, Kollmann TR, Scheuermann RH, Qiu X, Qian Y
PMID: 36018232
PloS one. 2022-09-23; 17.9: e0275070.
Machine learning for cell type classification from single nucleus RNA sequencing data
Le H, Peng B, Uy J, Carrillo D, Zhang Y, Aevermann BD, Scheuermann RH
PMID: 36149937
Scientific reports. 2022-06-15; 12.1: 9996.
Cell type matching in single-cell RNA-sequencing data using FR-Match
Zhang Y, Aevermann B, Gala R, Scheuermann RH
PMID: 35705694
Nature. 2022-04-01; 604.7904: E8.
Author Correction: Comparative cellular analysis of motor cortex in human, marmoset and mouse
Bakken TE, Jorstad NL, Hu Q, Lake BB, Tian W, Kalmbach BE, Crow M, Hodge RD, Krienen FM, Sorensen SA, Eggermont J, Yao Z, Aevermann BD, Aldridge AI, Bartlett A, Bertagnolli D, Casper T, Castanon RG, Crichton K, Daigle TL, Dalley R, Dee N, Dembrow N, Diep D, Ding SL, Dong W, Fang R, Fischer S, Goldman M, Goldy J, Graybuck LT, Herb BR, Hou X, Kancherla J, Kroll M, Lathia K, van Lew B, Li YE, Liu CS, Liu H, Lucero JD, Mahurkar A, McMillen D, Miller JA, Moussa M, Nery JR, Nicovich PR, Niu SY, Orvis J, Osteen JK, Owen S, Palmer CR, Pham T, Plongthongkum N, Poirion O, Reed NM, Rimorin C, Rivkin A, Romanow WJ, Sedeño-Cortés AE, Siletti K, Somasundaram S, Sulc J, Tieu M, Torkelson A, Tung H, Wang X, Xie F, Yanny AM, Zhang R, Ament SA, Behrens MM, Bravo HC, Chun J, Dobin A, Gillis J, Hertzano R, Hof PR, Höllt T, Horwitz GD, Keene CD, Kharchenko PV, Ko AL, Lelieveldt BP, Luo C, Mukamel EA, Pinto-Duarte A, Preissl S, Regev A, Ren B, Scheuermann RH, Smith K, Spain WJ, White OR, Koch C, Hawrylycz M, Tasic B, Macosko EZ, McCarroll SA, Ting JT, Zeng H, Zhang K, Feng G, Ecker JR, Linnarsson S, Lein ES
PMID: 35319013
Journal of leukocyte biology. 2021-12-01; 110.6: 1225-1239.
Corticosteroid treatment in COVID-19 modulates host inflammatory responses and transcriptional signatures of immune dysregulation
Pinski AN, Steffen TL, Zulu MZ, George SL, Dickson A, Tifrea D, Maroney KJ, Tedeschi N, Zhang Y, Scheuermann RH, Pinto AK, Brien JD, Messaoudi I
PMID: 34730254
Frontiers in immunology. 2021-10-29; 12: 690470.
Machine Learning-Based Single Cell and Integrative Analysis Reveals That Baseline mDC Predisposition Correlates With Hepatitis B Vaccine Antibody Response
Aevermann BD, Shannon CP, Novotny M, Ben-Othman R, Cai B, Zhang Y, Ye JC, Kobor MS, Gladish N, Lee AH, Blimkie TM, Hancock RE, Llibre A, Duffy D, Koff WC, Sadarangani M, Tebbutt SJ, Kollmann TR, Scheuermann RH
PMID: 34777332
Nature. 2021-10-06; 598.7879: 111-119.
Comparative cellular analysis of motor cortex in human, marmoset and mouse
Bakken TE, Jorstad NL, Hu Q, Lake BB, Tian W, Kalmbach BE, Crow M, Hodge RD, Krienen FM, Sorensen SA, Eggermont J, Yao Z, Aevermann BD, Aldridge AI, Bartlett A, Bertagnolli D, Casper T, Castanon RG, Crichton K, Daigle TL, Dalley R, Dee N, Dembrow N, Diep D, Ding SL, Dong W, Fang R, Fischer S, Goldman M, Goldy J, Graybuck LT, Herb BR, Hou X, Kancherla J, Kroll M, Lathia K, van Lew B, Li YE, Liu CS, Liu H, Lucero JD, Mahurkar A, McMillen D, Miller JA, Moussa M, Nery JR, Nicovich PR, Niu SY, Orvis J, Osteen JK, Owen S, Palmer CR, Pham T, Plongthongkum N, Poirion O, Reed NM, Rimorin C, Rivkin A, Romanow WJ, Sedeño-Cortés AE, Siletti K, Somasundaram S, Sulc J, Tieu M, Torkelson A, Tung H, Wang X, Xie F, Yanny AM, Zhang R, Ament SA, Behrens MM, Bravo HC, Chun J, Dobin A, Gillis J, Hertzano R, Hof PR, Höllt T, Horwitz GD, Keene CD, Kharchenko PV, Ko AL, Lelieveldt BP, Luo C, Mukamel EA, Pinto-Duarte A, Preissl S, Regev A, Ren B, Scheuermann RH, Smith K, Spain WJ, White OR, Koch C, Hawrylycz M, Tasic B, Macosko EZ, McCarroll SA, Ting JT, Zeng H, Zhang K, Feng G, Ecker JR, Linnarsson S, Lein ES
PMID: 34616062
Genome research. 2021-10-01; 31.10: 1767-1780.
A machine learning method for the discovery of minimum marker gene combinations for cell type identification from single-cell RNA sequencing
Aevermann BD, Zhang Y, Novotny M, Keshk M, Bakken TE, Miller JA, Hodge RD, Lelieveldt B, Lein ES, Scheuermann RH
PMID: 34088715
Briefings in bioinformatics. 2021-07-20; 22.4:
FR-Match: robust matching of cell type clusters from single cell RNA sequencing data using the Friedman-Rafsky non-parametric test
Zhang Y, Aevermann BD, Bakken TE, Miller JA, Hodge RD, Lein ES, Scheuermann RH
PMID: 33249453
Scientific reports. 2020-05-14; 10.1: 7954.
Longitudinal Study of Oral Microbiome Variation in Twins
Freire M, Moustafa A, Harkins DM, Torralba MG, Zhang Y, Leong P, Saffery R, Bockmann M, Kuelbs C, Hughes T, Craig JM, Nelson KE
PMID: 32409670
Briefings in bioinformatics. 2019-12-08;
The effect of tissue composition on gene co-expression
Zhang Y, Cuerdo J, Halushka MK, McCall MN
PMID: 31813949
Frontiers in immunology. 2019-11-12; 10: 2602.
Host-Microbial Interactions in Systemic Lupus Erythematosus and Periodontitis
Pessoa L, Aleti G, Choudhury S, Nguyen D, Yaskell T, Zhang Y, Li W, Nelson KE, Neto LLS, Sant'Ana ACP, Freire M
PMID: 31781106
BMC bioinformatics. 2019-04-15; 20.1: 185.
Highly efficient hypothesis testing methods for regression-type tests with correlated observations and heterogeneous variance structure
Zhang Y, Bandyopadhyay G, Topham DJ, Falsey AR, Qiu X
PMID: 30987598
Briefings in bioinformatics. 2018-05-01; 19.3: 374-386.
Statistical method evaluation for differentially methylated CpGs in base resolution next-generation DNA sequencing data
Zhang Y, Baheti S, Sun Z
PMID: 28040747
Bioinformatics (Oxford, England). 2017-07-01; 33.13: 1944-1952.
FUNNEL-GSEA: FUNctioNal ELastic-net regression in time-course gene set enrichment analysis
Zhang Y, Topham DJ, Thakar J, Qiu X
PMID: 28334094
