
Title: Structure and dynamics of core/periphery networks
Speaker: Dr. Peter Csermely, Department of Medical Chemistry, Semmelweis University, Hungary
Time and venue: September 10, 2013, 4:00 PM, Room 1013, Siyuan Building
Abstract: Recent studies uncovered important core/periphery network structures characterizing complex sets of cooperative and competitive interactions between network nodes, be they proteins, cells, species or humans. Better characterization of the structure, dynamics and function of core/periphery networks is a key step in understanding cellular functions, species adaptation, and social and market changes. Here we summarize the current knowledge of the structure and dynamics of "traditional" core/periphery networks, rich-clubs, nested, bow-tie and onion networks. Comparing core/periphery structures with network modules, we give a definition of weak-sense and strong-sense core/periphery networks, and discriminate between global and local cores. The core/periphery network organization lies in the middle of several extreme properties, such as random/condensed structures, clique/star configurations, network symmetry/asymmetry, network assortativity/disassortativity, as well as network hierarchy/anti-hierarchy. These properties of high complexity, together with the large degeneracy of core pathways ensuring cooperation and providing multiple options of network flow re-channelling, greatly contribute to the high robustness of complex systems. Core processes enable a coordinated response to various stimuli, decrease noise, and evolve slowly. The integrative function of network cores is an important step in the development of a large variety of complex organisms and organizations. Despite all these important features and several decades of research interest, studies on core/periphery networks still have a number of unexplored areas. At the end of this review we list several open questions, and encourage our colleagues to further enrich this exciting field.
Title: Statistical Methods for Analyzing High-throughput Genomic Data
Speaker: Jingyi Jessica Li, University of California, Berkeley
Time and venue: April 11, 2013, 10:00 AM, Room 712, Siyuan Building
Abstract: In the burgeoning field of genomics, high-throughput technologies (e.g. microarrays, next-generation sequencing and label-free mass spectrometry) have enabled biologists to perform global analysis on thousands of genes, mRNAs and proteins simultaneously. Extracting useful information from enormous amounts of high-throughput genomic data is an increasingly pressing challenge to statistical and computational science. In this talk, I will present three projects in which statistical and computational methods were used to analyze high-throughput genomic data to address important biological questions. The first part of my talk will demonstrate the power of simple statistical analysis in correcting biases of large-scale protein level estimates and in understanding the relationship between gene transcription and protein levels. The second part will focus on a statistical method called “SLIDE” that employs probabilistic modeling and L1 sparse estimation to answer an important question in genomics: how to identify and quantify mRNA products of gene transcription (i.e., isoforms) from next-generation RNA sequencing data. In the final part, I will introduce an ongoing project where we developed a new statistical measure under a local regression and clustering framework to capture non-functional relationships between a pair of variables. This new measure will have broad potential applications in genomics and other fields.
Title: Alignment-Free Genome and Metagenome Comparison Based on NGS Reads
Speaker: Fengzhu Sun, PhD, Molecular and Computational Biology Program, University of Southern California
Time and venue: March 27, 2013, 10:00 AM, Room 1013, Siyuan Building
Abstract: Next-generation sequencing (NGS) technologies have generated enormous amounts of shotgun read data, and assembly of the reads can be challenging, especially for organisms without template sequences. We develop novel alignment-free and assembly-free statistics for genome and metagenome comparison and study their limit behaviors. The statistics were used to study the evolutionary relationships among 13 tree species whose template sequences are not known. They were also used to classify metagenomes from mammalian species, global ocean survey, and human gut. In both applications, our novel statistics yield clustering that is consistent with biological knowledge. Thus, our statistics provide a powerful alternative approach for genome and metagenome comparison based on NGS short reads.

B Jiang et al. (2012) Comparison of metagenomic samples using sequence signatures. BMC genomics 13 (1), 730
K Song et al. (2012) Alignment-free sequence comparison based on next generation sequencing reads. Research in Computational Molecular Biology, 272-285
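
As a rough illustration of the alignment-free idea in the abstract above, the sketch below computes the classical D2 statistic, i.e. the inner product of the k-mer count vectors of two read sets, directly from unassembled reads. This is an assumption made for illustration only: the abstract does not say which statistics the speaker uses, and the function names `kmer_counts` and `d2_statistic` and the toy reads are hypothetical.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every length-k substring (k-mer) across a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def d2_statistic(reads_a, reads_b, k):
    """Inner product of the two k-mer count vectors (classical D2 statistic).

    No alignment or assembly is needed: only k-mer occurrences are compared.
    """
    counts_a = kmer_counts(reads_a, k)
    counts_b = kmer_counts(reads_b, k)
    # A Counter returns 0 for k-mers it has never seen.
    return sum(n * counts_b[word] for word, n in counts_a.items())

# Toy usage with hypothetical reads.
sample_a = ["ACGTAC", "GTACGT"]
sample_b = ["ACGTTT", "TTACGA"]
print(d2_statistic(sample_a, sample_b, k=3))
```

In practice a raw inner product is sensitive to sequencing depth and background composition, so normalized or centered variants are usually preferred over the plain count product shown here.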

Title: RNA-seq to Capture Transcriptome Landscape: Models, Algorithms and Tools
Speaker: Dongxiao Zhu, PhD, Wayne State University
Time and venue: December 20, 2012, 4:00 PM, Room 110, Morningside Center
Abstract: Human transcriptomes are highly diverse, overlapping, complex, and dynamic. The identification and quantification of the complex transcript structure is a central task in contemporary biomedical research. The advent of next-generation RNA-seq technology provides unprecedented opportunities to attack this important problem while posing new informatics challenges. In this talk, we introduce our informatics algorithms and GUI tools to solve two important problems.
The first problem is ab initio reconstruction of the transcriptome sequence from RNA-seq reads. The latter can be viewed as randomly "sampled" from the former. This reverse engineering problem is complicated by an ultra-high throughput of the reads (hundreds of millions) and a highly non-linear transcriptome structure. We design a novel divide-and-conquer strategy to localize reads to annotated reference genome regions and develop a new algorithm to infer the nonlinear structure within each region. Using simulation studies, we have demonstrated a high accuracy in transcriptome reconstruction.
The second problem is to quantify the transcripts identified in the first problem. Because the transcript sequences overlap, the observed expression signal can be attributed to a number of isoform transcripts. We develop a novel deconvolution algorithm with shrinkage to infer the relative abundance of the isoform transcripts using the base-pair expression signal from RNA-seq experiments. Similarly, we demonstrate high accuracy in transcriptome quantification using simulation and real-world studies. Finally, I briefly introduce innovative algorithms for reconstructing signaling pathway topologies.
Title: Needles in Haystacks: Thinking about fast evolutionary processes by genome analysis of tumors and RNA viruses
Speaker: Prof. Raul Rabadan, Department of Biomedical Informatics, Center for Computational Biology and Bioinformatics, Columbia University College of Physicians and Surgeons, USA
Time and venue: August 16, 2012, 10:00 AM, Room 1013, Siyuan Building
Abstract: Cancer is the result of an accumulation of genomic alterations that synergistically cooperate to cause uncontrolled cell growth. The development of high-throughput technologies makes it possible to map the landscape of genomic alterations and to disentangle the combinatorial nature of these alterations. We will show some examples of these alterations in tumors of Hematopoietic and Lymphoid Tissues, including Hairy Cell Leukemia (HCL), Acute Lymphoblastic Leukemias, Diffuse Large B-cell Lymphoma and Chronic Lymphocytic Leukemia. For instance, we were able to identify a heterozygous mutation in BRAF (V600E) present in 100% of HCLs, presenting a new therapeutic opportunity through BRAF inhibitors. We will also show a novel fusion protein in Glioblastoma Multiforme. The fusion protein localizes to mitotic spindle poles, has constitutive kinase activity, induces mitotic and chromosomal segregation defects, and triggers aneuploidy. Inhibition of FGFR kinase corrects the aneuploidy, and oral administration of an FGFR inhibitor prolongs survival of mice harboring intracranial FGFR3-TACC3-initiated glioma. These examples show how the development of computational tools for genomic data leads to the identification of novel therapeutic targets in cancer.
Title: Understanding the Mechanisms of Aging: Modern Approaches to an Ancient Problem
Speaker: Hao Li, Professor, Dept. of Biochemistry and Biophysics, and California Institute for Quantitative Biosciences (QB3), University of California, San Francisco
Time and venue: August 14, 2012, 3:00 PM, Room 712, Siyuan Building
Abstract: Aging is a universal phenomenon. All species age, including humans. Although aging is inevitable, dreams of immortality started early in human civilization. Historically, many attempts have been made to cure or slow down aging, and they all failed. In the last few decades, application of molecular genetics to the study of aging has led to the surprising discovery that the lifespan of a species is plastic and can be manipulated by simple genetic or dietary changes, raising the hope that we may be able to significantly extend the human lifespan and at the same time prevent or cure aging-related diseases.
In this talk, I will first give an overview of the modern field of molecular genetics of aging. I will then discuss how physics/engineering approaches can be used to define the molecular mechanisms of aging. In particular, I will describe a novel microfluidic system that we have developed to study aging in single yeast cells. This system allows, for the first time, the direct visualization of various cellular and molecular events accompanying aging in single cells throughout their lifespan, leading to new insight into the mechanisms of cellular aging and death.
Title: Evolution and genetic variation of microRNA mediated gene regulation in humans
Speaker: Zhaolei Zhang (张兆雷), Associate Professor, Department of Molecular Genetics, University of Toronto Faculty of Medicine, Toronto, Canada
Time and venue: July 13, 2012, 4:00 PM, Room 1013, Siyuan Building
Abstract: MicroRNA (miRNA) mediated gene regulation is of critical functional importance in animals and is often thought to be largely constrained during evolution. Here we show that a number of miRNA binding sites display a high level of population differentiation in humans and thus are likely targets of local adaptation. In a subset we demonstrate that allelic differences modulate miRNA regulation in mammalian cells, including an interaction between miR-155 and TYRP1, a melanosomal enzyme associated with human pigmentary differences. We identify alternate alleles of TYRP1 that induce or disrupt miR-155 regulation and demonstrate that these alleles are selected with different modes among human populations, to optimize the protein abundance in response to different levels of UV radiation. Our findings illustrate the evolutionary plasticity of the microRNA regulatory network in recent human evolution.
Title: An integrative characterization of recurrent molecular aberrations in glioblastoma
Speaker: Chen-Hsiang Yeang (杨振翔), Institute of Statistical Science, Academia Sinica
Time and venue: July 11, 2012 (Wednesday), 2:00 PM, Room 1013, Siyuan Building
Abstract: Glioblastoma multiforme (GBM) is the most common brain tumor in adults. The Cancer Genome Atlas (TCGA) project has mapped many alterations of DNA sequences, copy numbers, methylation states, and mRNA and microRNA expression in GBM cells. Alterations in DNA may dysregulate gene expression and drive the malignancy of tumors. It is thus important to uncover causal and statistical dependencies between the effector molecular aberrations and target gene expression in GBMs. However, despite the many studies combining copy number variations and gene expression, systematic methods to integrate all types of cancer genomic data are relatively scarce.

We propose an algorithm to build association modules linking effector molecular aberrations and target gene expressions and apply the module-finding algorithm to the integrated TCGA GBM datasets. The inferred association modules are validated by six tests using external information and datasets of central nervous system tumors. Besides well-known GBM molecular aberrations, several modules associated with less well-reported molecular aberrations are also validated. In particular, modules constituting trans-acting effects with chromosome 11 CNVs and cis-acting effects with chromosome 10 CNVs manifest strong negative and positive associations with survival times. Functional and survival analyses indicate that immune/inflammatory responses and epithelial-mesenchymal transitions are among the most important processes of prognosis. Finally, we demonstrate that certain molecular aberrations uniquely recur in GBMs but are rare in non-GBM glioma cells. These results justify the utility of an integrative analysis on cancer genomes and provide testable characterizations of effector aberration events in GBMs.

Title: Discovering biological progression underlying high dimensional data
Speaker: Peng Qiu, Department of Bioinformatics and Computational Biology, Univ of Texas M.D. Anderson Cancer Center
Time and venue: July 11, 2012 (Wednesday), 4:00 PM, Room 1013, Siyuan Building
Abstract: We present a novel computational approach, Sample Progression Discovery (SPD), to discover patterns of biological progression underlying high-dimensional datasets. In contrast to the majority of microarray data analysis methods, which focus on identifying differences between sample groups (e.g. normal vs. cancer, treated vs. control), SPD aims to identify an underlying progression among individual samples, both within and across sample groups. This is essentially a new way of asking questions. The traditional analyses ask the following question: what is the difference between A and B? In this talk, I am going to ask a different question: how did A become B, or how did one biological sample/phenotype go through gradual changes and eventually progress into another phenotype? In cancer studies, this is to ask: how did normal samples go through progressive changes and eventually become cancerous? The SPD method is designed to address this progression question. To demonstrate the utility of SPD, we applied it to gene expression datasets of cell cycle time series, B-cell differentiation, mouse embryonic stem cell differentiation, and prostate cancer. Each of these datasets is associated with a known biological progression. The known progression was hidden from the algorithm and was only used to validate the results. When applied to these datasets, SPD successfully recovered the underlying progression and genes that are associated with the progression. We will also discuss cases where SPD fails. For example, when applied to a dataset without any underlying progression, SPD degenerates into a clustering tool.
Speaker: Grace S. Shieh (謝叔蓉), Institute of Statistical Science, Academia Sinica
Time and venue: July 11, 2012 (Wednesday), 3:00 PM, Room 1013, Siyuan Building
Abstract: Most prokaryotes, which comprise bacteria and archaea, have a single circular chromosome (called a circular genome). Orthologous genes (abbreviated as orthologs) are genes directly evolved from an ancestor gene, and can be traced through different species in evolution. Shared orthologs between bacterial genomes have been used to measure their genome evolution. Here, organization of circular genomes is analyzed via distributions of shared orthologs between genomes. However, these distributions are often asymmetric and bimodal; to date, there is no joint distribution to model such data. This motivated us to develop a family of bivariate distributions with generalized von Mises marginals (BGVM) and the associated statistical inference.

A new measure based on circular grade correlation and the fraction of shared orthologs is proposed for association between circular genomes, and a visualization tool developed to depict genome structure similarity. The proposed procedures are applied to eight pairs of prokaryotes separated from domain down to species, and 13 mycoplasma bacteria that are mammalian pathogens belonging to the same genus. We close with remarks on further applications to many features of genomic organization, e.g. shared transcription factor binding sites, between any pair of circular genomes. Thus, the proposed procedures may be applied to identifying conserved chromosome backbones, among others, for genome construction in synthetic biology.

All code for the BGVM procedures and 1000+ prokaryotic genomes are available at http://www.stat.sinica.edu.tw/~gshieh/bgvm.htm

Title: Two Birds with One Stone: A Tool for Gene Duplication Inferences via Reconciliation and Species Tree Reconstruction
Speaker: Prof. Louxin Zhang, National University of Singapore
Time and venue: June 5, 2012, 4:00 PM, Room 1013, Siyuan Building
Abstract: Millions of genes in modern species belong to only thousands of gene families. A gene family includes instances of the same gene in different species and duplicate genes in the same species. Two genes in different species are orthologs if they diverged when the most recent common ancestor of the species speciated. Orthologs are used to infer signaling pathway evolution and the correspondence between genotype and phenotype; hence ortholog identification is a basic task in comparative genomics. Because of complex gene evolutionary history, however, ortholog identification is extremely difficult. One key method is to use an explicit model of the evolutionary history of the genes under study, called the gene (family) tree. The method compares the gene tree with the evolutionary history of the species in which the genes reside, called the species tree, using a procedure known as tree reconciliation. Tree reconciliation presents challenging problems when, as often happens in practice, species trees are not binary.
Here, non-binary gene and species tree reconciliation is studied in a binary refinement model, which unifies gene duplication inference through tree reconciliation with reconstruction of the species tree from gene trees. The study produces an automatic tool for inferring gene duplication events through tree reconciliation and for reconstructing the species tree from gene trees.
The tool supports quick automated analysis of large data sets.
Title: PROSPERous: an integrative tool to rank and predict protease substrate cleavage sites by multiple scoring function
Speaker: Prof. Jiangning Song (宋江宁), Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, China
Time and venue: June 4, 2012 (Monday), 5:00 PM, Room 1013, Siyuan Building
Abstract: The ability to catalytically cleave protein substrates after synthesis is fundamental for all forms of life. Accordingly, site-specific proteolysis is one of the most important post-translational modifications. The key to understanding the physiological role of a protease is to identify its natural substrate(s). Knowledge of the substrate specificity of a protease can dramatically improve our ability to predict its target protein substrates, but this information must be utilized in an effective manner by in silico approaches in order to efficiently identify protein substrates. To address this problem, we present PROSPERous, which is an integrative tool for in silico prediction of protease substrates and their cleavage sites from amino acid sequences. PROSPERous is primarily based on amino acid weights derived from amino acid occurrences and utilizes a variety of scoring functions to score, predict and rank potential cleavage sites of proteases. For proteases with known amino acid specificity, PROSPERous provides a convenient, pre-prepared tool for use in identifying protein substrates for the enzymes. In comparison with two state-of-the-art prediction tools, PoPS and SitePrediction, it achieves greater accuracy and coverage. It is a powerful tool for substrate identification in protease systems biology that complements the predictions of other current tools.
Title: On the Size of the Minimum Dominating Set and its Relation to Controllability in a Scale-Free Network
Speaker: Prof. Tatsuya Akutsu, Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan
Time and venue: June 4, 2012 (Monday), 4:00 PM, Room 1013, Siyuan Building
Abstract: In this work, we address complex network controllability from the perspective of the minimum dominating set (MDS). Our theoretical calculations, simulations using artificially generated networks, and analyses of real-world networks show that the more heterogeneous a network degree distribution is, the easier it is to control the entire system. We demonstrate that relatively few nodes are needed to control the entire network if the power-law degree exponent is smaller than 2, whereas many nodes are required if it is larger than 2. This is joint work with Jose Nacher of Toho University.
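
To make the notion of a dominating set concrete, here is a minimal Python sketch of the standard greedy heuristic, which repeatedly picks the node that covers the most not-yet-dominated nodes. It is only an illustration under stated assumptions: the abstract does not say how the MDS is computed, the greedy result is a dominating set but not necessarily a minimum one, and the function name and toy graphs are hypothetical.

```python
def greedy_dominating_set(adjacency):
    """Greedy heuristic for a (not necessarily minimum) dominating set.

    `adjacency` maps every node to an iterable of its neighbours.
    A set D is dominating if every node is in D or adjacent to a node in D.
    """
    undominated = set(adjacency)
    chosen = set()
    while undominated:
        # Pick the node whose closed neighbourhood covers the most
        # nodes that are still undominated.
        best = max(
            adjacency,
            key=lambda v: len(({v} | set(adjacency[v])) & undominated),
        )
        chosen.add(best)
        undominated -= {best} | set(adjacency[best])
    return chosen

# Toy usage: in a star graph the hub alone dominates every node, which
# matches the intuition that heterogeneous degrees make control easier.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
print(greedy_dominating_set(star))
```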
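Title: Applications of Bioinformatics in Plant Molecular Biology
Speaker: Wenhui Lin (林文慧), Associate Researcher, Key Laboratory of Plant Molecular Physiology, Institute of Botany, Chinese Academy of Sciences
Time and venue: May 28, 2012 (Monday), 10:00 AM, Room 1013, Siyuan Building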
Title: Data-Driven ODE Model Constructors for Dynamic Network Modeling with Biomedical Applications
Speaker: Prof. Hulin Wu, Dean's Professor, Director, Center for Integrative Bioinformatics and Experimental Mathematics, Department of Biostatistics and Computational Biology, University of Rochester School of Medicine and Dentistry
Time and venue: May 25, 2012, 4:00 PM, Room 712, Siyuan Building
Abstract: Biological systems such as gene regulatory networks and the interactions with gene products are very complex. Identification of the dynamic networks will help us understand the biological process in a systematic way. However, the construction of such a dynamic network is very challenging for a high-dimensional system. We propose to use a set of ordinary differential equations (ODE), coupled with dimension reduction by clustering and mixed-effects modeling techniques, to model the dynamic gene regulatory network (GRN). The ODE models allow us to quantify both positive and negative gene regulations as well as feedback effects of one set of genes in a functional module on the dynamic expression changes of the genes in another functional module, which results in a directed graph network. A six-step procedure, Significance screening, Clustering, Smoothing, regulation Identification, parameter Estimates refining and Function enrichment analysis (SCSIEF), is developed to identify the ODE-based dynamic GRN. In the proposed SCSIEF procedure, a series of cutting-edge statistical methods and techniques are employed. We apply the proposed method to identify the dynamic GRN for yeast cell cycle progression data and T cell activation in vitro experiments. We are able to annotate the identified modules through function enrichment analyses. The proposed procedure is a promising tool for constructing a general dynamic GRN and more complicated dynamic networks.
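Title: Identifying the functional module and depicting the disease progress for gastric cancer by differential PPIs
Speaker: Luonan Chen (陈洛南), Researcher, Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences
Time and venue: October 28, 2011, 3:00 PM, Room 712, Siyuan Building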
Abstract: Gastric cancer (stomach cancer) is a severe disease, which arises from dysregulation of many functionally correlated genes or pathways rather than from mutation of individual genes. Systematic identification of gastric cancer genes can provide insights into the mechanisms underlying this deadly cancer and help to develop new drugs. We present a novel network-based approach to predict gastric cancer related genes. Specifically, by assuming that gastric cancer results mainly from dysfunction of biomolecular networks rather than individual genes in an organism, we identify a set of genes that are potentially related to gastric cancer based on the analysis of differential protein-protein interactions (PPIs) and their dynamical transitions between different cancer stages. Further analysis shows that many of these identified potential genes have been reported to be cancer genes by other independent studies, which demonstrates the predictive power of our method. Moreover, functional analysis of these genes and their interaction partners reveals how the system dynamically undergoes state transition or network rewiring during gastric cancer progression across different stages. In addition, the results on hold-out validation data sets show that the identified gene set related to gastric cancer can be used as an efficient biomarker for detecting or diagnosing gastric cancer accurately, which not only confirms the effectiveness of our method but also provides further evidence for the potential of our predicted cancer related genes.
Title: Protein phosphorylation analysis and prediction
Speaker: Prof. Dong Xu, Department of Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, USA
Time and venue: October 28, 2011, 2:00 PM, Room 712, Siyuan Building
Abstract: High-throughput experimental studies using mass spectrometry have identified thousands of phosphorylation sites. However, the vast majority of phosphorylation sites remain undiscovered, even in well-studied systems. Since experimental approaches for identifying phosphorylation events are costly and time consuming, in silico prediction of phosphorylation sites is an attractive alternative strategy for whole proteome annotation. Due to various limitations, current phosphorylation-site prediction tools were not well designed for comprehensive assessment of proteomes. Here, we present a novel software tool, Musite, specifically designed for large-scale prediction of both general and kinase-specific phosphorylation sites. We collected phosphoproteomics data from multiple organisms and used them to train prediction models by a comprehensive machine learning approach that integrates local sequence similarities to known phosphorylation sites, protein disorder scores, and amino acid frequencies. Application of Musite to several proteomes yielded tens of thousands of phosphorylation-site predictions with high confidence. Cross-validation tests show that Musite significantly outperforms existing tools for predicting general phosphorylation sites and is at least comparable to those for predicting kinase-specific phosphorylation sites. With the graphical user interface, Musite provides a useful bioinformatics tool to biologists for predicting phosphorylation sites en masse and training prediction models from custom phosphorylation data. In addition, with its easily extensible open-source application programming interface (API), Musite is aimed at being an open platform for community-based development of machine-learning based phosphorylation-site prediction applications. Musite is available at http://musite.sourceforge.net/, together with a web server at http://musite.net/.
Title: Dynamic centrality measures and adaptation of networks in crisis
Speaker: Prof. Peter Csermely, Semmelweis University, Budapest, Hungary
Time and venue: September 12, 2011, 9:00 AM, Room 1013, Siyuan Building
Abstract: Determination of centrality has become a key question in the network description of complex systems. Former studies highlighted the importance of local measures (such as degree) discriminating hubs, and of global measures (such as betweenness centrality) often identifying bridges or bottlenecks. We recently developed a modularization method, called ModuLand ([1], www.linkgroup.hu/moduland.php), which detects extensively overlapping network communities, and determines community centrality, i.e. a mesoscopic measure summing up the total influence of all network segments on a given node or link. Community centrality proved to be useful for identifying the cores of modules, i.e. those few nodes which form the center of the module. Analysis of the role of module core nodes proved to be a very good predictor of the function of the entire module in biological systems. Similar influence-like centralities can be derived using our recently developed Turbine algorithm to follow the propagation of perturbations in real-world networks (www.linkgroup.hu/Turbine.php).
Based on our earlier studies on spatial games (showing that memory plus randomness not only promote cooperation, but also make the outcome quite independent of the network structure) [2], we constructed NetworGame (www.linkgroup.hu/NetworGame.php), which is a versatile program package to model any type of two-agent game (with 2 to 5 strategies) in any real-world or model network, using any type of strategy update rule, update dynamics and starting strategies. The NetworGame program allowed the definition of game centrality as the ability of a networked agent (or a link of two agents) with a single initial defective strategy to change an overall initial cooperation to defection (and vice versa: a cooperative strategy of a linked node-pair/triangle changing overall defection to cooperation). Spatial games can also be rationalized in networks of non-conscious agents, such as amino acids or proteins [3]. Our game centrality measures correctly identified the major decision makers of social cooperation in benchmark networks, such as the Zachary karate club network or Michael's strike network, and pinpointed key 'actors' determining the cooperation of biological networks.
As an example of general messages of the dynamic behavior of biological systems, we observed a partial decoupling of yeast protein-protein weighted interaction network modules after stress. This rearrangement is beneficial to the cell, because it allows better focusing on vital functions, thus sparing resources, and localizes damage to only the most sensitive modules. It also reduces the propagation of noise throughout the network, allows the individual modules a larger degree of freedom for exploring different adaptation strategies, and helps reduce inter-modular conflicts during a period of major intra-modular changes. Several key proteins of the cellular stress response served as residual or newly induced overlaps and bridges of the yeast interactome. De-coupling/re-coupling cycles emerged as a general model of adaptation and learning of complex systems [4].
[1] I. A. Kovács, R. Palotai, M. S. Szalay, P. Csermely, Community landscapes: a novel, integrative approach for the determination of overlapping network modules. PLoS ONE 5, e12528 (2010).
[2] S. Wang, M. S. Szalay, C. Zhang, P. Csermely, Learning and innovative elements of strategy update rules expand cooperative network topologies. PLoS ONE 3, e1917 (2008).
[3] P. Csermely, R. Palotai, R. Nussinov, Induced fit, conformational selection and independent dynamic segments: an extended view of binding events. Trends Biochem. Sci. 35, 539-546 (2010).
[4] A. Mihalik, P. Csermely, Heat shock partially dissociates the overlapping modules of the yeast protein-protein interaction network: a systems level model of adaptation. PLoS Comput. Biol. 7, e1002187 (2011).
Title: Sparse Coding: Structured Sparsity, Models and Algorithms
Speaker: Chris H.Q. Ding (丁宏强), University of Texas at Arlington and Anhui University (Thousand Talents Program Distinguished Professor)
Time and venue: July 28, 2011, 10:00 AM, Room 1013, Siyuan Building
Abstract: In sparse coding (compressed sensing), an input signal (an image or data instance) is encoded with a small number of dictionary signals. This leads to an improved representation of the input signal compared with traditional orthogonal basis methods. It was soon realized that "structured sparsity" is more useful in machine learning and pattern recognition. For example, in feature selection, it forces an entire row of the regression coefficients to zero, thus eliminating that feature dimension. In this talk, we will briefly explain sparse coding methods using L1, L2,1, and L0 norm based models, their applications, and solution algorithms. These matrix-based models of pattern recognition demonstrate the power of the matrix approach.
Talk based on "Towards Structural Sparsity: An explicit L2/L0 Approach", D. Luo, C. Ding, H. Huang, best-paper-runner-up in ICDM 2010, and "Efficient and Robust Feature Selection via Joint l2,1-Norm Minimization", NIPS 2010, F. Nie, H. Huang, X. Cai, C. Ding.
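
As a small illustration of the row-wise structured sparsity mentioned in the abstract, the Python sketch below evaluates the L2,1 norm of a coefficient matrix (the sum of the Euclidean norms of its rows) and lists the rows that are entirely zero, i.e. the features a row-sparse solution would discard. It assumes rows correspond to features; the function names and the toy matrix are hypothetical, and the sketch shows only the norm itself, not the optimization algorithms from the cited papers.

```python
import math

def l21_norm(matrix):
    """L2,1 norm: sum over rows of each row's Euclidean (L2) norm."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)

def zero_rows(matrix, tol=1e-12):
    """Indices of rows whose entries are all (near) zero.

    With one row per feature, these are the features eliminated by a
    row-sparse coefficient matrix.
    """
    return [i for i, row in enumerate(matrix)
            if all(abs(x) <= tol for x in row)]

# Toy coefficient matrix: 3 features (rows) by 2 regression targets (columns).
W = [[3.0, 4.0],
     [0.0, 0.0],
     [0.0, 5.0]]
print(l21_norm(W))
print(zero_rows(W))
```

Penalizing this norm charges each row as a group, which is why it tends to zero out whole rows rather than scattered individual entries as an elementwise L1 penalty would.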
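Title: Computational analysis of large-scale sequencing data
Speaker: Prof. Ting Chen, Center for Molecular and Computational Biology, University of Southern California, USA
Time and venue: May 27, 2010, 10:30 AM, Room 1013, Siyuan Building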
Abstract: Sequencing of DNA and cDNA libraries on "next-generation" sequencing (NGS) platforms has become the method of choice for genomic and transcriptional analyses. One obstacle that inhibits wider adoption of NGS techniques is the lack of (1) fast and efficient algorithms and mathematical methods for large-scale data analysis, and (2) comprehensive yet easy-to-use software packages with which to conduct data analysis. To meet this need, we have developed several analytic tools, including PerM (short read alignment), ComB (SNP calling), Clippers (indel/junction detection), and WeaV (de novo assembly), and a software workflow called RseqFlow for the analysis of RNA-seq data.
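Title: Complex Networks and Operations Research
Speaker: Prof. Shi Dinghua (史定华), College of Sciences, Shanghai University
Time and venue: May 26, 2011 (Thursday), 3:00-5:00 PM, Siyuan Building, Room 712
Abstract: This talk introduces the relationship between complex networks and operations research. Using two preliminary results from our group, the theoretical foundation of network degree distributions and the topology that optimizes network synchronizability, we illustrate that complex networks need the methods of operations research and that operations research in turn needs ideas from complex networks.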
Title: Uncover disease genes by maximizing information flow in the phenome-interactome network
Speaker: Associate Prof. Jiang Rui (江瑞), Department of Automation, Tsinghua University
Time and venue: May 10 (Tuesday), 4:00 PM, Siyuan Building, Room 1013
Abstract: Pinpointing genes that underlie human inherited diseases among candidate genes in susceptibility genetic regions is the primary step towards understanding the pathogenesis of diseases. Although several probabilistic models have been proposed to prioritize candidate genes using phenotype similarities and protein-protein interactions, no combinatorial approaches have been proposed in the literature. We propose the first combinatorial approach for prioritizing candidate genes. We first construct a phenome-interactome network by integrating the given phenotype similarity profile, protein-protein interaction network and associations between diseases and genes. Then, we introduce a computational method called MAXIF to maximize the information flow in this network for uncovering genes that underlie diseases. We demonstrate the effectiveness of this method in prioritizing candidate genes through a series of cross-validation experiments, and we show the possibility of using this method to identify diseases with which a query gene may be associated. We demonstrate the competitive performance of our method through a comparison with two existing state-of-the-art methods, and we analyze the robustness of our method with respect to the parameters involved. As an example application, we apply our method to predict driver genes in 50 copy number aberration regions of melanoma. Our method not only identifies several driver genes that have been reported in the literature, but also sheds new light on the modular property and transcriptional regulation scheme of these driver genes.
Title: Template-free detection of macromolecular complexes in cryo electron tomograms
Speaker: Dr. Xu Min (徐旻), Molecular and Computational Biology Center, University of Southern California, USA
Time and venue: May 10 (Tuesday), 3:00 PM, Siyuan Building, Room 1013
Abstract: Cryo electron tomography (cryoET) produces 3D density maps of biological specimens in their near-native state. Applied to small cells, cryoET produces 3D snapshots of the cellular distributions of large complexes. However, retrieving this information is non-trivial due to the low resolution and low signal-to-noise ratio in tomograms. Current pattern recognition methods identify complexes by matching known structures to the cryo electron tomogram. However, so far only a small fraction of all protein complexes have been structurally resolved. It is therefore of great importance to develop template-free methods for the discovery of previously unknown protein complexes in cryo electron tomograms.
Here, we have developed an inference method for the template-free discovery of frequently occurring protein complexes in cryo electron tomograms. We provide a first proof-of-principle of the approach and assess its applicability using realistically simulated tomograms, allowing for the inclusion of noise and distortions due to missing wedge and electron optical factors. Our method is a step towards the template-free discovery of the shapes, abundance and spatial distributions of previously unknown macromolecular complexes in whole cell tomograms.
Title: High-accuracy network analysis in systems biology
Speaker: Prof. Katsuhisa Horimoto, Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), Japan
Time and venue: April 22, 2011 (Friday), 4:00 PM, Siyuan Building, Room 1013
Abstract: The recent development of epoch-making experimental techniques has enabled measurement of almost all characteristics of cellular molecules. In general, biological studies go through a process from the formulation of a hypothesis to its testing. In systems biology, where the cell is treated as a system built up of molecules, the progress of experimental techniques has made it possible to conduct research using actual measured data. Moreover, the emphasis of research objectives has moved from exploring the characteristics of molecules to exploring the associations (networks) among molecules making up a system. However, this trend toward measuring the characteristics of huge numbers of molecules constituting cells requires tremendous expenditures of time and money. It is therefore necessary to seriously consider for what purpose the measurements are being made before starting a study, and to ensure high computational accuracy in analyzing the measurement data to obtain robust results. Here, we introduce two methods that we have developed to realize the latter requirement; namely, high computational accuracy.
The first method is a technique for estimating control networks that are activated under specific conditions [1-3]. First, for networks having a particular structure, the statistical quantity (log-likelihood) is calculated using the values of measured data. Next, by preparing numerous graphs artificially and calculating the log-likelihoods of these graphs, the distribution of log-likelihoods is obtained. Finally, how rarely the log-likelihood of a given network appears in the distribution, i.e., the graph consistency probability, is calculated. Using this technique, the functions possessed by cells in a specific condition can be shown not merely in a list of the names of genes, but in the form of the networks consisting of the genes. For this purpose, first the known networks and the TF (transcription factor)-gene binding data obtained from experiments or databases are prepared. Next, using the data measured in specific conditions, calculations are performed to determine which of the prepared networks have graph consistency probability at a significance level; i.e., which networks' forms rarely fit the data under specific conditions. We have named this procedure "network screening," because it selects the networks that appear only in specific conditions from among a huge number of networks.
The second method is a technique for enhancing the accuracy of parameter estimation in network dynamics [4,5]. The essence of this technique is to obtain, from a system of differential equations representing network dynamics, another but equivalent system of differential equations, using mathematics referred to as differential elimination. The resulting system of differential equations is adopted as new constraints, together with the error function typically used in parameter estimation. For the error function, the difference between the time-course values and estimated values is considered. On the other hand, since the new constraints include differentiated values, the form of the curve provided by the time-course values, such as the intercept and inflection points, can also be taken into account. In fact, this yields a large improvement over the approach using the conventional error function alone. This is a general-purpose technique in various respects; we aim to apply it not only to the dynamics of biological phenomena but also to engineering issues in which parameter estimation is important.
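The graph consistency probability described for the first method above can be pictured as an empirical tail probability over a null distribution of log-likelihoods. The Python sketch below is only an illustration under stated assumptions: the null log-likelihoods are made-up numbers, the tail direction (null values at least as large as the observed one) is an assumption, and the function name empirical_tail_probability is hypothetical rather than the speaker's implementation.

```python
def empirical_tail_probability(observed, null_values):
    """Fraction of null log-likelihoods that are >= the observed one.

    Assumption: 'rare' means few artificial graphs reach the observed value.
    """
    if not null_values:
        raise ValueError("need at least one null value")
    hits = sum(1 for v in null_values if v >= observed)
    return hits / len(null_values)


# Hypothetical log-likelihoods of artificially generated graphs.
null_ll = [-12.0, -11.5, -10.0, -9.0, -8.5]

# Hypothetical log-likelihood of the network being screened.
p = empirical_tail_probability(-9.0, null_ll)
```

A network would then be kept by the screening step when this probability falls below a chosen significance level; that threshold is left to the user here.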
Title: Understanding the Utilization, Function and Evolution of Trace Elements by Computational and Comparative Genomics Approaches
Speaker: Prof. Zhang Yan (张焱), Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences
Time and venue: April 1, 2011 (Friday), 3:00 PM, Siyuan Building, Room 712
Abstract: Biological trace elements are needed in minute quantities for proper growth, development, and physiology of all organisms. These micronutrients provide proteins with unique coordination, catalytic, structural, electron transfer and other properties in a variety of pathways. Utilization of trace elements is generally rather complex, and a growing number of trace element-dependent proteins and trace element utilization pathways highlights the importance of these elements for life. In recent years, dramatic advances in genomics and related studies have provided an opportunity to investigate the occurrence and evolution of numerous biochemical pathways that an organism utilizes, including trace element utilization. Our studies focus on several important trace elements, such as selenium, zinc, iron, copper, nickel, cobalt and molybdenum. A variety of systematic, genome-wide computational and comparative approaches have been used for the analysis of these elements, which provide important information with regard to fundamental issues of their function and the evolutionary dynamics of trace element utilization in biology.
Title: Numerical approach to structure and folding of protein and microRNA
Speaker: Prof. Chin-Kun Hu (胡进锟), Institute of Physics, Academia Sinica, Taiwan
Time and venue: April 1, 2011 (Friday), 2:00 PM, Siyuan Building, Room 712
Abstract: In this talk, I briefly review some recent developments in numerical approaches to the structure and folding of proteins and microRNA. The topics under discussion include: (1) development of algorithms and computer packages for all-atom simulations of proteins [1], (2) development of an algorithm to compute the volume V, surface area A, and cavities of proteins by analytic equations [2], (3) unfolding and refolding of immunoglobulin domain I27 and ubiquitin [3], and (4) TAROKO: a webserver for microRNA 3D structures and folding thermodynamics [4].
  • (1) F. Eisenmenger, U. H.E. Hansmann, S. Hayryan, and C.-K. Hu. Comp. Phys. Commun. 138, 192-212 (2001) and 174, 422 (2006); C.-Y. Lin, C.-K. Hu, and U.H.E. Hansmann, Proteins 52, 436-445 (2003); S. Hayryan, C.-K. Hu, S.-Y. Hu and R.-J. Shang. J. Comp. Chem. 22, 1287-1296 (2001); R. G. Ghulghazaryan, S. Hayryan and C.-K. Hu. J. Comp. Chem. 28, 715 (2007).
  • (2) S. Hayryan, C.-K. Hu, J. Skrivanek, E. Hayryan, I. Pokorny. J. Comp. Chem. 26, 334 (2005); J. Busa, J. Dzurina, E. Hayryan, S. Hayryan, C.-K. Hu, J. Plavka, I. Pokorny, J. Skrivanek, and M.-C. Wu. Comp. Phys. Commun. 165, 59 (2005); J. Busa, S. Hayryan, C.-K. Hu, J. Skrivanek, and M.-C. Wu, J. Comp. Chem. 30, 346-357 (2009) and Comp. Phys. Commun. 181, 2116 (2010).
  • (3) M.-S. Li, C.-K. Hu, D. K. Klimov, and D. Thirumalai, Proc. Natl. Acad. Sci. USA 103, 93 (2006); M.-S. Li, M. Kouza and C.-K. Hu. Biophysical J. 91, 547 (2007); M. Kouza, C.-K. Hu and M. S. Li, J. Chem. Phys. 128, 045103 (2008).
  • (4) S. Hayryan, M.-C. Wu, F. Ding, D. Tsao, N. V. Dokholyan and C.-K. Hu, submitted for publication.
Title: Simple models to uncover key factors for protein aggregation
Speaker: Prof. Chin-Kun Hu (胡进锟), Institute of Physics, Academia Sinica, Taiwan
Time and venue: March 24, 2011 (Thursday), 10:00 AM, Siyuan Building, Room 1013
Abstract: Neurodegenerative diseases include Alzheimer's disease (AD), Huntington's disease (HD), etc. Such diseases are due to progressive loss of structure or function of neurons caused by protein aggregation. For example, AD is considered to be related to aggregation of Aβ40 and Aβ42 (proteins with 40 and 42 amino acids, respectively). In this talk, I briefly review our recent discoveries on key factors for protein aggregation. We have used a lattice model to study the aggregation rates of proteins and found that the probability for a protein sequence to appear in the conformation of the aggregated state can be used to determine the temperature at which proteins can aggregate most easily [1].
We have used molecular dynamics and simple models of polymer chains to study relaxation and aggregation of proteins under various conditions and found that when the bending-angle dependent and torsion-angle dependent interactions are zero or very small, protein chains tend to aggregate at lower temperatures [2]. Such results are useful for understanding the aggregation of Aβ40 and Aβ42. Our results [1,2] form a good basis for further studies on protein aggregation.

[1] M. S. Li, N. T. Co, G. Reddy, C.-K. Hu, J. E. Straub, and D. Thirumalai, Phys. Rev. Lett. 105, 218101 (2010).
[2] W.-J. Ma and C.-K. Hu, J. Phys. Soc. Japan 79, 024005, 024006, 054001, and 104002 (2010).

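Title: Learning Methods in Computational Biology
Speaker: Prof. Guo Maozu (郭茂祖), School of Computer Science, Harbin Institute of Technology
Time and venue: October 30 (Saturday), 10:00 AM, Siyuan Building, Room 1013
Abstract: Algorithms in computational biology mainly involve combinatorial algorithms on strings, trees, and graphs, as well as artificial intelligence methods such as machine learning. This talk briefly introduces machine-learning-related issues, including the partition of positive and negative training sets, feature selection, and learning algorithms, in RNA structure prediction and mining, genome sequence polymorphism (SNP) analysis, and protein-protein interaction (PPI) prediction.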
Speaker: Prof. Kwang-Hyun Cho, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST)
Time and venue: July 27 (Tuesday), 4:00 PM, Siyuan Building, Room 1013
Abstract:

Most biological networks have huge, complex structures, which makes it daunting to make sense of them. A question then arises as to whether, within such a large complex network, there exists an essential core subnetwork that actually realizes most of the key regulatory functions and forms a backbone structure. We have developed an algorithm by which we can identify such a core structure by considering the relationship between network topology and dynamics. Intriguingly, we found that such core structures preserve all the fundamental network dynamics and include most of the biologically important nodes. The proposed concept of a core network can provide us with new insights into the evolutionary design principle of complex biological networks.

Title: Maximum Entropy Principle for Composition Vector Method in Phylogenetics
Speaker: Prof. Raymond H. Chan, Department of Mathematics, The Chinese University of Hong Kong
Time and venue: April 29 (Thursday), 10:30 AM, Siyuan Building, Room 712
Abstract: Molecular phylogenetics is the study of evolutionary relatedness among species through molecular sequencing data. The composition vector (CV) method is an alignment-free method for phylogenetics. Since biological sequences are often obscured by noise and bias, denoising is necessary when using the CV method. By using the maximum entropy principle for denoising and utilizing the special structure of the constraint matrix to simplify the optimization, we derive several new denoising formulas. By comparing with existing formulas on ten different data sets, we found that one of our formulas gives more accurate phylogenetic trees. An example is the tree for the tetrapod data set, where we can correctly group birds and reptiles together, a result that could not previously be obtained by either alignment methods or other denoising formulas.
Title: Improving protein binding sites prediction with consensus approaches
Speaker: Dr. Bingding Huang, Senior Researcher, Bioinformatics Group, Biotec, TU Dresden, Dresden; Systems Biology Division, Zhejiang-California International Nanosystems Institute (ZCNI), Zhejiang University
Time and venue: April 15 (Thursday), 3:00 PM, Siyuan Building, Room 712
Abstract: In recent decades, considerable computational effort has gone into predicting protein binding sites from protein structure, resulting in many algorithms, software packages and web servers. In this talk, I will present two meta-approaches to predict protein-ligand binding sites and protein-protein interaction sites: metaPocket and metaPPI. MetaPocket uses the predicted pocket sites from four methods (LIGSITEcs, PASS, Q-SiteFinder, and SURFNET) to improve the prediction success rate from 70% to 75% at the top 1 prediction. For protein-protein binding site prediction, metaPPI includes PPI-Pred, PPISP, PINUP, Promate and SPPIDER, which predict enzyme-inhibitor interfaces with success rates of 23% to 55% and other interfaces with 10% to 28% on a benchmark dataset of 62 complexes. MetaPPI significantly improves prediction success rates to 70% for enzyme-inhibitor and 44% for other interfaces.
Title: Dynamical Systems Analysis of Prostate Cancer
Speaker: Prof. Kazuyuki Aihara, Institute of Industrial Science, The University of Tokyo
Time and venue: March 29 (Monday), 10:00 AM, Siyuan Building, Room 712
Abstract:

Prostate cancer has recently become a serious social problem. It is the second most common cancer in men. Although the incidence rate of prostate cancer is fortunately not so high in Asian countries such as China and Japan, its rate of increase is the highest among cancers in Japanese men. In this talk, I review our dynamical systems approach to prostate cancer and its therapy based on mathematical modeling.

(1) A.M. Ideta, G. Tanaka, T. Takeuchi, and K. Aihara: J. Nonlinear Science, Vol.18, No.6, pp.593-614 (2008).
(2) G. Tanaka, K. Tsumoto, S. Tsuji, and K. Aihara: Physica D, Vol.237, No.20, pp.2616-2627 (2008).
(3) T. Shimada and K. Aihara: Mathematical Biosciences, Vol.214, No.1/2, pp.134-139 (2008).
(4) Y. Tao, Q. Guo, and K. Aihara, J. Nonlinear Science (in press)

Kazuyuki Aihara received the B.E. degree in electrical engineering in 1977 and the Ph.D. degree in electronic engineering in 1982 from the University of Tokyo, Tokyo, Japan. Currently, he is a Professor in the Institute of Industrial Science, the Graduate School of Information Science and Technology, and the Graduate School of Engineering, the University of Tokyo. His research interests include mathematical modeling of complex systems, parallel distributed processing with chaotic neural networks, and nonlinear time series analysis.

Title: Mathematical modelling and computational analysis of protein folding
Speaker: Prof. Christof Schuette, Freie Universitaet Berlin, Germany
Time and venue: March 19 (Friday), 2:00 PM, Siyuan Building, Room 1013
Abstract: Characterizing the equilibrium ensemble of folding pathways, including their relative probability, is one of the major challenges in protein folding theory today. Although this information is in principle accessible via all-atom molecular dynamics simulations, it is difficult to compute in practice because protein folding is a rare event and the affordable simulation length is typically not sufficient to observe an appreciable number of folding events, unless very simplified protein models are used. Here we present an approach that allows for the reconstruction of the full ensemble of folding pathways from simulations that are much shorter than the folding time. This approach is based on partitioning the state space into small conformational states and constructing a Markov model between them. The talk will present the mathematical theory that allows for the extraction of the full ensemble of transition pathways from the unfolded to the folded configurations, and can likewise be applied to many other complex systems exhibiting metastable effective dynamics. The approach will then be applied to the folding of a small protein, the PinWW domain in explicit solvent, where the folding time is two orders of magnitude larger than the length of individual simulations. The results are in good agreement with kinetic experimental data and give detailed insights into the nature of the folding process, which is shown to be surprisingly complex and parallel. The analysis reveals the existence of misfolded trap states outside the network of efficient folding intermediates that significantly reduce the folding speed.
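The step of constructing a Markov model between conformational states, as described in the abstract above, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the speaker's method: the trajectories are made-up sequences of state indices, the estimator is plain transition counting with row normalization, and the names count_transitions and row_normalize are hypothetical.

```python
def count_transitions(trajs, n_states, lag=1):
    """Count observed transitions i -> j at the given lag over all trajectories."""
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajs:
        for a, b in zip(traj[:-lag], traj[lag:]):
            counts[a][b] += 1
    return counts


def row_normalize(counts):
    """Turn a count matrix into a row-stochastic transition matrix."""
    T = []
    for row in counts:
        total = sum(row)
        # States that were never left get an all-zero row in this sketch.
        T.append([c / total for c in row] if total else [0.0] * len(row))
    return T


# Hypothetical short trajectories over 3 discrete conformational states.
trajs = [[0, 0, 1, 2, 2], [0, 1, 1, 2]]
counts = count_transitions(trajs, n_states=3)
T = row_normalize(counts)
```

Because the counts are pooled over many short trajectories, no single trajectory needs to span a complete folding event, which is the point the abstract makes about simulations much shorter than the folding time.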

Prof. Christof Schuette is a full professor in mathematics at Freie Universitaet Berlin. His speciality is biocomputing. He is one of the Directors of the Berlin Mathematical School - a joint top graduate school in mathematics of the research universities in Berlin - as well as the Vice Director of the Research Center MATHEON - Mathematics for Key Technologies - funded by the German Science Foundation (DFG) as a center for excellence.

Title: Quantitative Simulation for Biomolecular Networks
Speaker: Prof. Luonan Chen, Osaka Sangyo University
Time and venue: July 22 (Wednesday), 2:00 PM, Siyuan Building, Room 1013
Abstract: Explicitly considering all variables and chemical reactions in a cell is unrealistic for a biomolecular network from the viewpoints of modeling, analysis and computation. However, in a cell, many different time scales characterize the gene regulatory processes, which can be exploited to reduce the complexity of the mathematical models. For instance, the transcription and translation processes generally evolve on a time scale that is much slower than that of phosphorylation, dimerization or binding reactions of transcription factors. Moreover, a large class of biological models can be approximated by stochastic hybrid systems in which some state components are discrete and others are continuous. Continuous state components are usually involved in fast reactions with high copy numbers of molecules, whereas discrete state components are in slow processes and have low copy numbers of molecules. In this work, based on the partial Kramers-Moyal expansion with the central limit theorem, we exploit such properties to simplify a complicated molecular network to a hybrid system by giving several models, which can be applied to the quantitative simulation of a large cellular system. We developed a novel stochastic hybrid model for representing the chemical master equation, and provided several computational algorithms to efficiently simulate the stochastic cellular dynamics.
Title: Emerging of Stochastic Dynamical Equalities and Steady State Thermodynamics from Darwinian Dynamics
Speaker: Prof. Ao Ping (敖平), Institute of Systems Biomedicine, Shanghai Jiao Tong University
Time and venue: June 23, 2009 (Tuesday), 10:00, Siyuan Building, Room 712
Abstract:

The evolutionary dynamics first conceived by Darwin and Wallace, referred to as Darwinian dynamics in the present paper, has been found to be universally valid in biology. Statistical mechanics and thermodynamics, while enormously successful in physics, have been in the awkward situation of wanting a consistent dynamical understanding. Here we present from a formal point of view an exploration of the connection between thermodynamics and Darwinian dynamics and a few related topics. We first show that the stochasticity in Darwinian dynamics implies the existence of temperature, hence the canonical distribution of Boltzmann-Gibbs type. In terms of relative entropy, the Second Law of thermodynamics is dynamically demonstrated without the detailed balance condition, and is valid regardless of the size of the system. In particular, the dynamical component responsible for breaking the detailed balance condition does not contribute to the change of the relative entropy. Two types of stochastic dynamical equalities of current interest are explicitly discussed in the present approach: one is based on the Feynman-Kac formula and the other is a generalization of the Einstein relation. Both are directly accessible to experimental tests. Our demonstration indicates that Darwinian dynamics represents logically a simple and straightforward starting point for statistical mechanics and thermodynamics and is complementary to and consistent with the conservative dynamics that dominates the physical sciences. The present exploration suggests the existence of a unified stochastic dynamical framework both near and far from equilibrium.

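Ao Ping (敖平) received his B.S. in physics from Peking University in 1983, his M.S. in physics from the University of Illinois at Urbana-Champaign (UIUC) in 1985, and his Ph.D. in physics from UIUC in 1990, supervised by Nobel laureate Prof. A. J. Leggett. From 1990 to 1994 he was a postdoctoral researcher in the Department of Physics at the University of Washington, working with Prof. D. J. Thouless, a member of the US National Academy of Sciences. From 1994 to 2000 he was an associate professor in the Department of Physics at Umeå University, Sweden. From 2000 to 2003 he was a senior research scientist and visiting professor at the United States Institute for Systems Biology in Seattle, collaborating with Leroy Hood, one of the institute's founders and a member of the US National Academy of Sciences. From 2003 to 2008 he was an associate professor in the Department of Mechanical Engineering at the University of Washington. In 2008 he returned to China as a distinguished professor at the Institute of Systems Biomedicine, Shanghai Jiao Tong University, and chief scientist of a 973 Program project on obesity.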

Title: Integrative disease classification and phenotype prediction based on cross-platform microarray data
Speaker: Dr. Chun-Chi Jim Liu (Molecular and Computational Biology, University of Southern California)
Time and venue: January 12, 2009 (Monday), 15:00, Siyuan Building, Room 712
Abstract:


Title: Boolean Models and Algorithms for Analyzing Genetic Networks and Metabolic Networks
Speaker: Prof. Tatsuya Akutsu (Bioinformatics Center, Institute for Chemical Research, Kyoto University)
Time and venue: January 12, 2009 (Monday), 10:00, Siyuan Building, Room 712
Abstract:


Title: Knowledge-based Approaches for Reconstruction of Biological Networks
Speaker: Prof. Yang Dai (Department of Bioengineering, University of Illinois at Chicago)
Time and venue: July 14, 2008 (Monday), 15:00, Siyuan Building, Room 1013
Abstract:


Title: Mathematical modeling of circadian rhythms
Speaker: Prof. Albert Goldbeter (Université Libre de Bruxelles, Belgium)
Time and venue: May 30, 2007 (Wednesday), 15:30, Siyuan Building, Room 1013
Abstract:

Circadian oscillations occur spontaneously with a period of about 24 h in nearly all living organisms. These oscillations originate from intertwined feedback processes in genetic regulatory networks. Based on experimental observations, mathematical models of increasing complexity have been proposed for the molecular mechanism of circadian rhythms. Deterministic models were first proposed for circadian rhythms in Drosophila. These models account for the occurrence of sustained oscillations of the limit cycle type and for a variety of dynamical properties such as phase shifting or long-term suppression by light pulses and entrainment by light-dark cycles. Stochastic versions of the models are needed to examine how molecular noise affects the emergence and robustness of circadian oscillations. Extending the model to the case of the mammalian circadian clock allows us to address the dynamical bases of physiological disorders of the sleep-wake cycle in humans.

References:

Leloup, J.C. and Goldbeter, A. 2003. Toward a detailed computational model for the mammalian circadian clock. Proc. Natl. Acad. Sci. USA 100, 7051-7056.

Leloup, J.C. and Goldbeter, A. 2004. Modeling the mammalian circadian clock: Sensitivity analysis and multiplicity of oscillatory mechanisms. J. Theor. Biol. 230, 541-562.

Title: A Knowledge-Based, Statistical Informatics Approach for Protein Structure Refinement
Speaker: Prof. Zhijun Wu (Department of Mathematics, Program on Bioinformatics and Computational Biology, Iowa State University, USA)
Time and venue: December 21, 2006 (Thursday), 15:00, Siyuan Building, Room 712
Abstract: The protein structures determined by conventional techniques usually are not as accurate as desired. Further refinement, including human intervention, is always required and sometimes critical. Therefore, the development of an efficient refinement technique is important, and as more and more structures are determined, the need is even more urgent, as the CASP prediction center explained in its call for a structure refinement competition in spring 2006. Here, we describe a computational approach for deriving distance constraints from databases of known protein structures for structure refinement. We calculate the distributions of the distances of various types in known protein structures, and use them to obtain the most probable ranges or the mean-force potentials for the distances. We then impose the constraints on the structures to be refined or include the mean-force potentials in energy minimization so that more plausible structural models may be built. We show that many inter-atomic distances in low-resolution structures deviate significantly from their average distributions in known protein structures, and the structures can be refined when a selected set of distances are constrained to their most probable ranges or optimized with corresponding mean-force potentials. We present the results from refining a set of NMR-determined protein structures by using database-derived distance constraints and mean-force potentials, and show the improvements in the structures in terms of several standard measures. We also discuss our results from participating in the CASPR 2006 structural refinement experiments for comparative model refinement, using energy minimization, database-derived distance constraints, and massively parallel computing.
We describe the development of a database of protein inter-atomic distances that supports computing the distributions of the distances of various types in known protein structures and generating the constraints or potentials for the distances automatically. We discuss the possibilities of extending the system to a broader sense of protein geometry database and using it for structure analysis, classification, as well as refinement.
Title: Evolutionary matching of surface patterns for predicting protein functions and binding specificities
Speaker: Prof. Jie Liang (Department of Bioengineering, University of Illinois at Chicago, USA; Institute of Systems Biomedicine, Shanghai Jiaotong University, China)
Time and venue: October 27, 2006 (Friday), 15:00, Siyuan Building, Room 712
Abstract: Predicting protein functions is a challenging task, as evolutionary relationships reflected by global sequence and structure similarities are often unreliable for function prediction. For proteins binding to similar substrates or ligands and carrying out similar functions, their binding surfaces experience similar physicochemical constraints, and hence the sets of allowed and forbidden residue substitutions are similar. We develop a method for predicting protein functions by incorporating evolutionary information specific to an individual binding region and by rapidly matching local surfaces. Our method is based on the estimation of substitution rates of amino acids. It computes a profile which characterizes protein binding activities that may involve multiple substrates or ligands. We show that our method can be used to predict enzyme functions, to identify potential substrates, and to assess binding specificity. In an objective large-scale test of 100 enzyme families with 2,196 structures, our predictions are sensitive and specific: at the stringent specificity level of 99.98%, we can correctly predict enzyme functions for 80.55% of the proteins. The overall area under the Receiver Operating Characteristic curve measuring the performance of our prediction is 0.955. Our method also works well in predicting the biochemical functions of orphan proteins from structural genomics projects.
Title: Multiple Sequence Alignment Using Partial Order Graphs
Speaker: Dr. Christopher Lee (Chemistry & Biochemistry Department, University of California at Los Angeles, USA)
Time and venue: August 30, 2006 (Wednesday), 15:00, Siyuan Building, Room 712
Abstract:
Title: Inferring Protein Interactions with Correlated Domains by Integrative Databases
Speaker: Prof. Luonan Chen (Osaka Sangyo University, Japan)
Time and venue: August 23, 2006 (Wednesday), 15:00, Siyuan Building, Room 712
Abstract:
Title: A Systems Biology Approach for Studying Gene Function and Pathway through Mining Functional Genomic Data
Speaker: Prof. Dong Xu (Digital Biology Laboratory, Computer Science Department and Life Sciences Center, University of Missouri-Columbia, Columbia, MO, USA)
Time and venue: May 31, 2006 (Wednesday), 15:00, Siyuan Building, Room 712
Abstract: We have developed a number of computational approaches to infer gene function and pathways by utilizing various functional genomic data, including protein-protein interactions, protein complexes, microarray data, and genomic sequences. We quantified the relationship between functional similarity in the Gene Ontology biological process and functional data, and encoded the relationship into a "functional linkage graph", where each node represents one gene and the weight of each edge is characterized by the Bayesian probability of function similarity between the two connected genes. We utilized the graph to predict gene function and signaling pathways in yeast and Arabidopsis. We also analyzed Arabidopsis tiling array data to predict anti-sense gene silencing and validated the prediction using EST data. Some anti-sense predictions were confirmed through RT-PCR.
Title: Mathematical analysis of genetic network: Function, Dynamics and Noise
Speaker: Dr. Sanyi Tang (Warwick University, UK)
Time and venue: December 1, 2005 (Thursday), 15:00, Morningside Center (晨兴), Room 510
Abstract:
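Title: Computational Models and Complex Adaptive Systems
Speaker: Dr. Zhang Jiang (张江), School of Economics and Management, Beijing Jiaotong University
Time and venue: November 1, 2005 (Tuesday), 15:30, Siyuan Building, Room 712
Abstract: Computational models are one of the main tools for studying complex adaptive systems. They not only simulate complex systems and provide an operable experimental platform, but also offer, by way of metaphor, deep insight into complex adaptive systems. This talk mainly introduces two computational models that I developed: Autolife and AEM. Autolife is a digital artificial life system; with this model we can study the evolutionary behavior of individual agents, the adaptive behavior of groups, the relationship between life and environment, and phenomena such as the emergence and evolution of organization, social parasitism, and self-repair. AEM is a simulated economic system extended from the well-known artificial society model Sugarscape. A hierarchical adaptive decision-making modeling technique for agents allows us to explore regularities in the virtual economy such as price fluctuations, the social division of labor, the formation and evolution of market organizations, trading networks, and the formation and evolution of agent flows and commodity flows.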
Title: DNA Screening and Pooling Designs
Speaker: Prof. Ding-Zhu Du (University of Texas at Dallas)
Time and venue: November 1, 2005 (Tuesday), 14:30, Siyuan Building, Room 712
Abstract: A recent important development in biology is the success of the Human Genome Project. As the technology for obtaining sequenced genome data matures, more and more sequenced genome data are available to the scientific research community, so that the study of gene functions has become a popular research direction. The study of gene functions requires obtaining high-quality DNA libraries through a large amount of testing and screening. Pooling design is a mathematical tool to reduce the number of tests for DNA library screening. In this talk, we introduce a new method to construct pooling designs.
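To make the idea of reducing the number of tests concrete, the following Python sketch shows the standard naive decoding rule for a pooling design: any clone that appears in a negative pool is ruled out, and the clones that remain are the candidate positives. It is an illustration only and does not reproduce the construction method of the talk; the 4-pool, 6-clone design matrix and the function names pool_outcomes and decode are hypothetical.

```python
def pool_outcomes(design, positives):
    """A pool tests positive if it contains at least one positive clone."""
    return [any(row[j] for j in positives) for row in design]


def decode(design, outcomes):
    """Rule out every clone that appears in some negative pool."""
    n_clones = len(design[0])
    candidates = set(range(n_clones))
    for row, outcome in zip(design, outcomes):
        if not outcome:
            candidates -= {j for j in range(n_clones) if row[j]}
    return sorted(candidates)


# Hypothetical design: rows are pools, columns are clones.
# Each clone sits in a distinct pair of the 4 pools, so no clone's
# set of pools is contained in another's.
design = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]

outcomes = pool_outcomes(design, positives={3})
found = decode(design, outcomes)
```

In this toy case four pooled tests identify the single positive clone among six, instead of six individual tests; the savings grow with library size when the design is chosen well.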