Current location: Home > Academic Reports > Statistics Research Center Academic Reports

2010
Title: On the Estimation of Integrated Covariance Matrices of High Dimensional Diffusion Processes
Speaker: Prof. Yingying Li (Hong Kong University of Science and Technology)
Time and place: June 13, 2010, 15:00-16:00, Siyuan Building 703
Abstract: We consider the estimation of integrated covariance matrices of high dimensional diffusion processes by using high frequency data. We start by studying the most commonly used estimator, the realized covariance matrix (RCV). We show that in the high dimensional case, when the dimension p and the observation frequency n grow at the same rate, the limiting empirical spectral distribution of RCV depends on the covolatility processes not only through the underlying integrated covariance matrix Sigma, but also through how the covolatility processes vary in time. In particular, for two high dimensional diffusion processes with the same integrated covariance matrix, the empirical spectral distributions of their RCVs can be very different. Hence, in terms of making inference about the spectrum of the integrated covariance matrix, the RCV is in general not a good proxy to rely on in the high dimensional case. We then propose an alternative estimator, the time-variation adjusted realized covariance matrix (TVARCV), for a class of diffusion processes. We show that the limiting empirical spectral distribution of our proposed estimator TVARCV does depend solely on that of Sigma through a Marcenko-Pastur equation, and hence the TVARCV can be used to recover the empirical spectral distribution of Sigma by inverting the Marcenko-Pastur equation, which can then be used in applications such as portfolio allocation and risk management.

This is based on joint work with Xinghua Zheng.
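
For readers unfamiliar with the realized covariance matrix discussed in this abstract, the following is a minimal sketch of the standard definition: the sum, over observation intervals, of the outer products of the increments of the observed process. The function name `realized_covariance` and the toy path are hypothetical and chosen only for illustration; the sketch does not implement the TVARCV estimator proposed in the talk.

```python
def realized_covariance(prices):
    """Return the realized covariance matrix (RCV) of an observed path.

    prices: list of n + 1 observations, each a list of p values
    (for example log-prices of p assets). The RCV is the sum over the
    n intervals of the outer product of the increment with itself.
    """
    p = len(prices[0])
    rcv = [[0.0] * p for _ in range(p)]
    for prev, curr in zip(prices, prices[1:]):
        # Increment of the process over one observation interval.
        dx = [c - q for c, q in zip(curr, prev)]
        for i in range(p):
            for j in range(p):
                rcv[i][j] += dx[i] * dx[j]
    return rcv


# Toy path with p = 2 components and n = 3 increments (made-up numbers).
path = [[0.0, 0.0], [1.0, 2.0], [0.0, 3.0], [2.0, 3.0]]
rcv = realized_covariance(path)
print(rcv)
```

In the high dimensional regime described above, p is comparable to n, so the eigenvalues of this matrix can be far from those of the integrated covariance matrix even though each entry is estimated reasonably well.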

2008
Title: One-step Sparse Estimates in Nonconcave Penalized Likelihood Models
Speaker: Prof. Runze Li (Associate Professor, The Pennsylvania State University)
Time and place: June 27, 2008, 16:00-17:40, Siyuan Building 703
Abstract: Fan and Li (2001) proposed a family of variable selection methods via penalized likelihood using concave penalty functions. The nonconcave penalized likelihood estimators enjoy the oracle properties, but maximizing the penalized likelihood function is computationally challenging, because the objective function is nondifferentiable and nonconcave. In this article we propose a new unified algorithm based on the local linear approximation (LLA) for maximizing the penalized likelihood for a broad class of concave penalty functions. Convergence and other theoretical properties of the LLA algorithm are established. A distinguishing feature of the LLA algorithm is that at each LLA step, the LLA estimator can naturally adopt a sparse representation. Thus we suggest using the one-step LLA estimator from the LLA algorithm as the final estimate. Statistically, we show that if the regularization parameter is appropriately chosen, the one-step LLA estimates enjoy the oracle properties with good initial estimators. Computationally, the one-step LLA estimation methods dramatically reduce the computational cost in maximizing the nonconcave penalized likelihood. We conduct some Monte Carlo simulations to assess the finite sample performance of the one-step sparse estimation methods. The results are very encouraging.
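
The one-step idea in this abstract can be illustrated in a deliberately simplified setting. The sketch below assumes a least-squares problem with an orthonormal design, so that the weighted L1 problem produced by the local linear approximation separates by coordinate and is solved by soft thresholding. The SCAD penalty derivative with a = 3.7 follows Fan and Li (2001); the function names and the numbers are hypothetical, and this is not the general algorithm presented in the talk.

```python
def scad_derivative(t, lam, a=3.7):
    """Derivative of the SCAD penalty at t >= 0."""
    if t <= lam:
        return lam
    return max(a * lam - t, 0.0) / (a - 1.0)


def soft_threshold(z, w):
    """Minimizer over b of 0.5 * (z - b)**2 + w * abs(b)."""
    if z > w:
        return z - w
    if z < -w:
        return z + w
    return 0.0


def one_step_lla(z, lam, a=3.7):
    """One LLA step started from the unpenalized estimates z.

    Assumption: orthonormal design, so each coordinate is a separate
    weighted-lasso problem with weight p'_lam(|z_j|).
    """
    weights = [scad_derivative(abs(zj), lam, a) for zj in z]
    return [soft_threshold(zj, wj) for zj, wj in zip(z, weights)]


z = [5.0, 0.5, -1.5]  # hypothetical initial (least-squares) estimates
beta = one_step_lla(z, lam=1.0)
print(beta)
```

In this toy case the large coefficient receives zero weight and is left unshrunk, the small one is set exactly to zero, and the intermediate one is shrunk partially, which is the sparse behavior the abstract refers to.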
Title: Quotient Correlation: A Sample Based Alternative To Pearson's Correlation
Speaker: Prof. Zhengjun Zhang (Princeton University, USA)
Time and place: June 27, 2008, 10:40-11:40, Siyuan Building 1013
Abstract: The quotient correlation is defined here as an alternative to Pearson's correlation that is more intuitive and flexible in cases where the tail behavior of data is important. It measures nonlinear dependence where the regular correlation coefficient is generally not applicable. One of its most useful features is a test statistic that has high power when testing nonlinear dependence in cases where Fisher's $Z$-transformation test may fail to reach the right conclusion. Unlike most asymptotic test statistics, which are either normal or $\chi^2$, this test statistic has a limiting gamma distribution (henceforth, the gamma test statistic). Beyond the common usages of correlation, the quotient correlation can easily and intuitively be adjusted to values at tails. This adjustment generates two new concepts, the tail quotient correlation and the tail independence test statistics, which are also gamma statistics. Because there is no analogue of the correlation coefficient in extreme value theory, and there does not exist an efficient tail independence test statistic, these two new concepts may open up a new field of study. In addition, an alternative to Spearman's rank correlation, a rank based quotient correlation, is also defined. The advantages of using these new concepts are illustrated with simulated data and real data analysis of internet traffic and asset returns.
Title: Statistical semiparametric detection of significant activation for brain fMRI
Speaker: Prof. Chunming Zhang (Associate Professor, University of Wisconsin)
Time and place: June 27, 2008, 9:30-10:30, Siyuan Building 1013
Abstract: Functional magnetic resonance imaging (fMRI) aims to locate activated regions in human brains when specific tasks are performed. The conventional tool for analyzing fMRI data applies some variant of the linear model, which is restrictive in modeling assumptions. To yield more accurate prediction of the time-course behavior of neuronal responses, the semiparametric inference for the underlying hemodynamic response function is developed to identify significantly activated voxels. Under mild regularity conditions, we demonstrate that a class of the proposed semiparametric test statistics, based on the local linear estimation technique, follow chi-squared distributions under null hypotheses for a number of useful hypotheses. Furthermore, the asymptotic power functions of the constructed tests are derived under the fixed and contiguous alternatives. Simulation evaluations and real fMRI data application suggest that the semiparametric inference procedure provides more efficient detection of activated brain areas than the popular imaging analysis tools AFNI and FSL.
2007
Title: Challenge of Dimensionality in Classifications and Feature Selection
Speaker: Prof. Jianqing Fan (Princeton University, USA)
Time and place: December 27, 2007, 9:30-10:30, Siyuan Building 703
Abstract:
Title: Regression Analysis of Longitudinal Data with Outcome Dependent Observation and Follow-up Times
Speaker: Prof. Jianguo (Tony) Sun (University of Missouri, USA)
Time and place: July 13, 2007, 16:00-17:00, Siyuan Building 703
Abstract: Longitudinal data frequently occur in many studies such as longitudinal follow-up studies. To develop statistical methods and theory for the analysis of them, independent or noninformative observation and censoring times are typically assumed, which naturally leads to inference procedures conditional on observation and censoring times (Diggle et al., 1994; Lin and Ying, 2001). In many situations, however, this may not be true or realistic. That is, longitudinal responses may be correlated with observation times as well as censoring time. This paper considers the analysis of longitudinal data where these correlations may exist, and a joint modeling approach that uses some latent variables to characterize the correlations is proposed. For inference about regression parameters, estimating equation approaches are developed and both large and finite sample properties of the proposed estimators are established. The methodology is applied to a bladder cancer study that motivated this investigation.
Title: Aggregation of Nonparametric Estimators for Volatility Matrix
Speaker: Jianqing Fan, Yingying Fan and Jinchi Lv (Princeton University)
Time and place: June 25, 2007, 16:00-17:30, Siyuan Building 712
Abstract: An aggregated method of nonparametric estimators based on time-domain and state-domain estimators is proposed and studied. To attenuate the curse of dimensionality, we propose a factor modeling strategy. We first investigate the asymptotic behaviors of nonparametric estimators of the volatility matrix in the time domain and in the state domain. The asymptotic normality is separately established for nonparametric estimators in the time domain and state domain. These two estimators are asymptotically independent. Hence, they can be combined, through a dynamic weighting scheme, to improve the efficiency of the estimated volatility matrix. The optimal dynamic weights are derived and it is shown that the aggregated estimator uniformly dominates the volatility matrix estimators using time-domain or state-domain smoothing alone. A simulation study, based on an essentially affine model for the term structure, is conducted and it demonstrates convincingly that the newly proposed procedure outperforms both time- and state-domain estimators. Empirical studies further endorse the advantages of our aggregated method.
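
The reason combining two independent estimators can improve on either one alone is easiest to see in the scalar case. The sketch below uses the textbook inverse-variance weighting of two independent unbiased estimates; it is a simplified analogue, not the dynamic matrix weights derived in the talk, and the function name `aggregate` and the numbers are hypothetical.

```python
def aggregate(est_time, var_time, est_state, var_state):
    """Combine two independent unbiased estimates of the same quantity.

    The weight on est_time is proportional to the inverse of its variance,
    which minimizes the variance of the weighted average.
    Returns (combined estimate, combined variance, weight on est_time).
    """
    w = var_state / (var_time + var_state)
    combined = w * est_time + (1.0 - w) * est_state
    combined_var = var_time * var_state / (var_time + var_state)
    return combined, combined_var, w


# Hypothetical time-domain and state-domain estimates with their variances.
est, var, w = aggregate(est_time=0.04, var_time=1.0, est_state=0.10, var_state=3.0)
print(est, var, w)
```

The combined variance v1 * v2 / (v1 + v2) is never larger than min(v1, v2), which is the scalar counterpart of the dominance property stated in the abstract.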
Title: Analysis of Longitudinal Data with Semiparametric Estimation of Covariance Function
Speaker: Runze Li (Associate Professor, The Pennsylvania State University)
Time and place: May 18, 2007, 15:30-16:30, Morningside Center 605
Abstract: Improving efficiency for regression coefficients and predicting trajectories of individuals are two important aspects in analysis of longitudinal data. Both involve estimation of the covariance function. Yet, challenges arise in estimating the covariance function of longitudinal data collected at irregular time points. A class of semiparametric models for the covariance function is proposed by imposing a parametric correlation structure while allowing a nonparametric variance function. A kernel estimator is developed for the estimation of the nonparametric variance function. Two methods, a quasi-likelihood approach and a minimum generalized variance method, are proposed for estimating parameters in the correlation structure. We introduce a semiparametric varying coefficient partially linear model for longitudinal data and propose an estimation procedure for model coefficients by using a profile weighted least squares approach. Sampling properties of the proposed estimation procedures are studied and asymptotic normality of the resulting estimators is established. Finite sample performance of the proposed procedures is assessed by Monte Carlo simulation studies. The proposed methodology is illustrated by an analysis of a real data example.
2006
Title: Accelerated Life and Degradation Models with Dynamic Environment
Speaker: Prof. Mikhail Nikulin (Statistique Mathématique et ses Applications, Victor Segalen University)
Time and place: December 12, 2006, 16:00-17:00, Siyuan Building 712
Abstract: We consider here statistical models with dynamic environment that describe the dependence of the lifetime distribution on time-dependent explanatory variables. Such models are used in reliability and survival analysis to study the reliability of aging bio-technical systems in dependence on their longevity, fatigue and degradation under different conditions of exploration. Reliability theory gives a general approach for the construction of efficient statistical models in terms of failure rates to study aging and degradation problems in different areas such as industrial engineering and technology, biophysics, biology, demography, radiobiology, genetics, biostatistics, survival analysis, business and finance, etc. We shall discuss the problems of statistical modelling and of choice of design in accelerated life testing to obtain the statistical estimators of the main reliability characteristics of aging systems.
Title: Nonlinear Dependency and Its Application
Speaker: Wei Gang (魏刚) (School of Mathematics and System Sciences, Shandong University)
Time and place: November 17, 2006, 16:00-17:00, Siyuan Building 703
Abstract: The normal distribution and the linear model have been taken as the central part of classical statistical inference in both theory and application. In the last decade, with the demand from the social, medical, and industrial sciences, nonlinear dependency, as characterized by the partial ordering, copula construction, and nonparametric dependency, has shown great potential both in its theoretical challenges and in applied statistics. In this talk, we particularly demonstrate the rich mathematical structures constructed with the aid of copula analysis and some simple but important applications of such non-classical statistical inference techniques.
Title: Tree-Structured Survival Analysis Based on Variance of Survival Time
Speaker: Hua Jin, Ph.D. (School of Mathematical Sciences, South China Normal University)
Time and place: October 19, 2006, 10:30
Abstract:

Tree structured survival analysis (TSSA) is a popular alternative to the Cox proportional hazards regression in medical research on survival data. Several methods for constructing a tree of different survival profiles have been developed, including one based on two-sample log-rank test statistics and martingale-type residuals.

Lu, Jin and Mi used variance of restricted mean lifetimes as an index of degree of separation (DOS) to measure the efficiency in separations of survival profiles by a classification method. They proposed a hypothesis testing procedure for comparison of two classification rules, especially for non-inferiority tests.

Our objective here is to explore the use of DOS in TSSA. We propose an algorithm in a similar fashion to the least squares regression tree for survival analysis. We apply the proposed method to prospective cohort data from the Study of Osteoporotic Fractures that motivated our research and then compare our classification rule to those rules based on the log-rank statistics and martingale residuals.

Title: Bayesian Methods for Inferring Epistasis
Speaker: Prof. Jun Liu (Department of Statistics, Harvard University)
Time and place: July 24, 2006 (Monday), 2:00 PM, Siyuan Building 712
Abstract: I will discuss a Bayesian approach to detecting multi-locus interactions (epistasis) for case-control association studies. Existing methods are either of low power or computationally infeasible when facing a large number of markers. Using MCMC sampling techniques, the method can efficiently detect interactions among thousands of markers. I will also discuss the issue of statistical significance and how to adjust for multiple comparisons in this situation (much of this is conjecture, though).
Title: Embracing Statistical Challenges in the Information Technology Age
Speaker: Prof. Bin Yu (Department of Statistics, University of California, Berkeley)
Time and place: July 20, 2006 (Thursday), 10:00 AM, Siyuan Building 703
Abstract:
Title: Bayesian Hierarchical Modeling for Integrating Low-accuracy and High-accuracy Experiments
Speaker: Prof. Jeff Wu (School of Industrial and Systems Engineering, Georgia Institute of Technology)
Time and place: July 14, 2006 (Friday), 10:00 AM, Siyuan Building 712
Abstract: Standard practice in analyzing data from different types of experiments is to treat data from each type separately. By borrowing strength across multiple sources, an integrated analysis can produce better results. Careful adjustments need to be made to incorporate the systematic differences among various experiments. To this end, some Bayesian hierarchical Gaussian process models (BHGP) are proposed. The heterogeneity among different sources is accounted for by performing flexible location and scale adjustments. The approach tends to produce predictions closer to those from the high-accuracy experiment. The Bayesian computations are aided by the use of Markov chain Monte Carlo and Sample Average Approximation algorithms. The proposed method is illustrated with two examples: one with detailed and approximate finite element simulations for mechanical material design and the other with physical and computer experiments.
 
Title: Fast Functional MRI
Speaker: Prof. Cun-Hui Zhang (Department of Statistics, Rutgers University, USA)
Time and place: June 22, 2006 (Thursday), 4:00-5:00 PM, Siyuan Building 703
Abstract: We develop fast functional MRI methods to improve the time-resolution of the current functional MRI technology by sampling a small fraction of the Fourier transform of the spin density, and using a prolate wave filter to approximately obtain, not the usual susceptibility map, but instead the integral of this quantity over regions of interest in the brain at successive time-points. The aim of this space/time trade-off is to obtain, at high time-resolution, the total activity in the regions that process the specific stimulus/task and, more importantly for studying higher cognition, the sequence of occurrences of these processes. An fMRI experiment will be reviewed and discussed. This is joint work with Gary Glover, Martin Lindquist and Larry Shepp.
Title: Statistical Challenges with High Dimensionality in Feature Selection
Speaker: Prof. Jianqing Fan (Princeton University, USA)
Time and place: June 2, 2006 (Friday), 10:00-11:00 AM, Siyuan Building 712
Abstract: Technological innovations have revolutionized the process of scientific research and knowledge discovery. The availability of massive data and challenges from frontiers of research and development have reshaped statistical thinking, data analysis and theoretical studies. The challenges of high-dimensionality arise in diverse fields of sciences and the humanities, ranging from computational biology and health studies to financial engineering and risk management. In all of these fields, variable selection and feature extraction are crucial for knowledge discovery. We first give a comprehensive overview of statistical challenges with high dimensionality in these diverse disciplines. We then approach the problem of variable selection and feature extraction using a unified framework: penalized likelihood methods. Issues relevant to the choice of penalty functions are addressed. We demonstrate that for a host of statistical problems, as long as the dimensionality is not excessively large, we can estimate the model parameters as well as if the best model were known in advance. The persistence property in risk minimization is also addressed. The applicability of such a theory and method to diverse statistical problems is demonstrated. Other related problems with high-dimensionality are also discussed.
Title: Semi/Non-parametric Dynamic Quantile Regression Models and Their Applications
Speaker: Prof. Zongwu Cai (Department of Mathematics and Statistics & Department of Economics, University of North Carolina, Charlotte, USA)
Time and place: April 8, 2006 (Saturday), 4:00-5:00 PM, Siyuan Building 712
Abstract: In this talk, I will first briefly review some semiparametric and nonparametric regression models for time series data and their applications, such as value-at-risk. In particular, I will focus on a class of smooth coefficient quantile regression time series models based on some applications. We employ a local linear fitting scheme to estimate the smooth coefficients in the quantile framework. The programming involved in the local linear quantile estimation is relatively simple and it can be modified with little effort from the existing programs for the linear quantile model. We derive the local Bahadur representation of the local linear estimator for alpha-mixing time series and establish the asymptotic normality of the resulting estimator. The asymptotic behavior of the estimator at the boundaries is examined. A comparison of the local linear quantile estimator with the local constant estimator is presented. A simulation study is carried out to illustrate the performance of the estimates. An empirical application of the model to the exchange rate time series data and the well-known Boston house price data further demonstrates the potential of the proposed modeling procedures.
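
Quantile regression methods such as those in this abstract are built on the check loss. As a minimal sketch of that building block only (not of the local linear, smooth coefficient estimator discussed in the talk), the code below evaluates the check loss and recovers a sample quantile by minimizing the total loss over the data points. The function names and the data are hypothetical.

```python
def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def sample_quantile_by_check_loss(ys, tau):
    """Return the data point that minimizes the total check loss."""
    return min(ys, key=lambda q: sum(check_loss(y - q, tau) for y in ys))


ys = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6, 5.0]  # made-up observations
median = sample_quantile_by_check_loss(ys, tau=0.5)
print(median)
```

A local linear quantile fit replaces the constant q with a locally weighted linear function of the covariate, while keeping the same loss.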
Title: Additive models for spatial processes
Speaker: Academician Dag Tjostheim (Department of Mathematics, University of Bergen, Norway)
Time and place: April 11, 2006 (Tuesday), 4:00-5:00 PM, Siyuan Building 712
Abstract:
Title: Estimating Marginal Survival Under Dependent Censoring
Speaker: Donglin Zeng (Assistant Professor) (Department of Biostatistics, University of North Carolina (Chapel Hill))
Time and place: April 13, 2006 (Thursday), 2:00-3:00 PM, Siyuan Building 712
Abstract: One goal in survival analysis of right censored data is to estimate the marginal survival function in the presence of dependent censoring. When many auxiliary covariates are sufficient to explain the dependent censoring, estimation based on either a semiparametric or a nonparametric model of the conditional survival function can be problematic due to the high-dimensionality of the auxiliary information. In this paper, we use two working models to condense these high-dimensional covariates for dimension reduction; then an estimate of the marginal survival function can be derived non-parametrically in a low-dimensional space. We show that such an estimator has the following doubly robust property: when either working model is correct, the estimator is consistent and asymptotically Gaussian; when both working models are correct, the asymptotic variance attains the efficiency bound.
 
Title: Maximum Likelihood Estimation in Semiparametric Transformation Models for Counting Processes
Speaker: Donglin Zeng (Assistant Professor) (Department of Biostatistics, University of North Carolina (Chapel Hill))
Time and place: April 18, 2006 (Tuesday), 2:00-3:00 PM, Siyuan Building 712
Abstract: A class of semiparametric transformation models is proposed to characterize the effects of possibly time-varying covariates on the intensity functions of counting processes. The class includes the proportional intensity model and linear transformation models as special cases. Nonparametric maximum likelihood estimators are developed for the regression parameters and cumulative intensity functions of these models based on censored data. The estimators are shown to be consistent and asymptotically normal. The limiting variances for the estimators of the regression parameters achieve the semiparametric efficiency bounds and can be consistently estimated. The limiting variances for the estimators of smooth functionals of the cumulative intensity function can also be consistently estimated. Simulation studies reveal that the proposed inference procedures perform well in practical settings. Illustrations with two medical studies are provided.
Title: Semiparametric Transformation Models for Survival Data with a Cure Fraction
Speaker: Donglin Zeng (Assistant Professor) (Department of Biostatistics, University of North Carolina (Chapel Hill))
Time and place: April 19, 2006 (Wednesday), 2:00-3:00 PM, Siyuan Building 712
Abstract: We propose a class of transformation models for survival data with a cure fraction. The class of transformation models is motivated by biological considerations, and it includes both the proportional hazards and the proportional odds cure models as two special cases. An efficient recursive algorithm is proposed to calculate the maximum likelihood estimators. Furthermore, the maximum likelihood estimators for the regression coefficients are shown to be consistent and asymptotically normal, and their asymptotic variances attain the semiparametric efficiency bound. Simulation studies are conducted to examine the finite sample properties of the proposed estimators. The method is illustrated on data from a clinical trial involving the treatment of melanoma.