Functional Annotation of Regulatory Elements Based on Heterogeneous Network Embedding (Songmao Zhang, Yong Wang)

2022-07-11

   Regulatory elements (REs), such as enhancers and promoters, are regulatory sequences that act within a heterogeneous regulatory network to control gene expression, recruiting transcription regulators and carrying genetic variants in a context-specific way. Annotating REs currently relies on costly, labor-intensive next-generation sequencing and RNA-guided editing experiments across many cellular contexts.
   We propose RE-GOA, a systematic Gene Ontology Annotation method for Regulatory Elements that leverages word embedding techniques from natural language processing. We first assemble a heterogeneous network by integrating context-specific regulations, protein–protein interactions and Gene Ontology (GO) terms. We then perform network embedding and associate regulatory elements with GO terms by assessing their similarity in a low-dimensional vector space. In three applications, we show that RE-GOA outperforms existing methods in annotating transcription factor (TF) binding sites from ChIP-seq data, in functional enrichment analysis of differentially accessible peaks from ATAC-seq data, and in revealing genetic correlation among phenotypes from their GWAS summary statistics.
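   The association step described above, matching regulatory elements to GO terms by similarity in the embedding space, can be illustrated with a minimal sketch. This is not the RE-GOA implementation: the abstract does not name the similarity measure, so cosine similarity is assumed here, and the node names, the 3-dimensional vectors, and the functions `cosine` and `rank_go_terms` are hypothetical, chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_go_terms(re_vector, go_vectors):
    """Return (GO term, similarity) pairs, most similar first."""
    scores = [(term, cosine(re_vector, vec)) for term, vec in go_vectors.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (made up). In practice these would come from
# embedding the heterogeneous network of REs, genes/proteins and GO terms.
re_embeddings = {"enhancer_A": [0.9, 0.1, 0.0]}
go_embeddings = {
    "GO_immune": [0.8, 0.2, 0.1],   # hypothetical GO term node
    "GO_neural": [0.0, 0.9, 0.4],   # hypothetical GO term node
}

ranking = rank_go_terms(re_embeddings["enhancer_A"], go_embeddings)
for term, score in ranking:
    print(f"{term}\t{score:.3f}")
```

   Because REs and GO terms are embedded in the same vector space, a single similarity function can score any RE against any GO term, and the top-ranked terms serve as candidate annotations.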
Bioinformatics, Vol. 38, Iss. 10, 2899-2911. 
  Authors:
  Yurun Lu 
  CEMS, NCMIS, HCMS, MADIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China 
  School of Mathematics, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100049, China     
  Zhanying Feng 
  CEMS, NCMIS, HCMS, MADIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China 
  School of Mathematics, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100049, China    
  Songmao Zhang
  CEMS, NCMIS, HCMS, MADIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China 
  Email: smzhang@math.ac.cn      
  Yong Wang 
  CEMS, NCMIS, HCMS, MADIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China 
  School of Mathematics, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100049, China 
  Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China 
  Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 330106, China 
  Email: ywang@amss.ac.cn
