
2013

Title: Least squares estimation of threshold models: a practical two-stage procedure
Speaker: Dr. Dong Li, Hong Kong University of Science and Technology, HK
Time and venue: Friday, September 6, 2013, 4:00 PM, Siyuan Building 712
Abstract: Threshold models have attracted much attention and have been widely used in econometrics, economics and finance for modeling nonlinear phenomena. Their success is partially due to their simplicity in terms of both model fitting and model interpretation. A popular approach to fitting a threshold model is the conditional least squares method. However, when modeling data with threshold-type models, the computational costs become substantial. This paper proposes a novel method, a two-stage grid-search procedure, to quickly find the least squares estimate of the threshold parameter in threshold models. Compared with the standard grid-search procedure used in the literature, our new method greatly reduces computational costs, requiring only $O(\sqrt{n})$ least-squares operations. Its validity is also verified theoretically. The finite-sample performance of our procedure is evaluated via Monte Carlo simulation studies.

Title: Clinical Trials for Personalized Medicine: New Designs and Statistical Inference
Speaker: Professor Feifang Hu, Department of Statistics, University of Virginia and School of Statistics, Renmin University of China
Time and venue: Thursday, December 13, 2012, 4:00-5:00 PM, Siyuan Building, Room 709
Abstract: In a short period of time, advances in genetics have allowed scientists to identify genes (biomarkers) that are linked with certain diseases. To translate these great scientific findings into real-world products for those who need them (personalized medicine), clinical trials play an essential role. Personalized medicine is an approach that will allow physicians to tailor a treatment regimen based on an individual patient's characteristics (which could be biomarkers or other covariates).
To develop personalized medicine, we need new designs for clinical trials so that genetic information and other biomarkers can be incorporated to assist in treatment selection. This talk first provides a brief review of design and statistical inference related to personalized medicine. Personalized medicine raises new challenges for the design of clinical trials: (1) more covariates (biomarkers) have to be considered, and (2) particular attention needs to be paid to the interaction between treatment and covariates. We then discuss several new families of designs for personalized medicine. New techniques are introduced to study the theoretical properties of the proposed designs. Advantages of the proposed designs are demonstrated through both theoretical and numerical studies. To deal with the complex data structures that arise in clinical trials of personalized medicine, some further important statistical issues are discussed.

Title: High-Dimensional Sparse Additive Hazards Regression
Speaker: Jinchi Lv, Assistant Professor, Marshall School of Business, University of Southern California
Time and venue: August 8, 2012, 3:15 PM, Siyuan Building 703
Abstract: High-dimensional sparse modeling with censored survival data is of great practical importance, as exemplified by modern applications in high-throughput genomic data analysis and credit risk analysis. In this article, we propose a class of regularization methods for simultaneous variable selection and estimation in the additive hazards model, by combining the nonconcave penalized likelihood approach and the pseudoscore method. In a high-dimensional setting where the dimensionality can grow fast, polynomially or nonpolynomially, with the sample size, we establish the weak oracle property and oracle property under mild, interpretable conditions, thus providing strong performance guarantees for the proposed methodology.
Moreover, we show that the regularity conditions required by the $L_1$ method are substantially relaxed by a certain class of sparsity-inducing concave penalties. As a result, concave penalties such as the smoothly clipped absolute deviation (SCAD), minimax concave penalty (MCP), and smooth integration of counting and absolute deviation (SICA) can significantly improve on the $L_1$ method and yield sparser models with better prediction performance. We present a coordinate descent algorithm for efficient implementation and rigorously investigate its convergence properties. The practical utility and effectiveness of the proposed methods are demonstrated by simulation studies and a real data example. This is joint work with Wei Lin.

Title: Tuning Parameter Selection in High-Dimensional Penalized Likelihood
Speaker: Yingying Fan, Assistant Professor, Marshall School of Business, University of Southern California
Time and venue: August 8, 2012, 2:00 PM, Siyuan Building 703
Abstract: Determining how to appropriately select the tuning parameter is essential in penalized likelihood methods for high-dimensional data analysis. We examine this problem in the setting of penalized likelihood methods for generalized linear models, where the dimensionality of covariates p is allowed to increase exponentially with the sample size n. We propose to select the tuning parameter by optimizing the generalized information criterion (GIC) with an appropriate model complexity penalty. To ensure that we consistently identify the true model, a range for the model complexity penalty is identified in GIC. We find that this model complexity penalty should diverge at the rate of some power of log p depending on the tail probability behavior of the response variables. This reveals that using the AIC or BIC to select the tuning parameter may not be adequate for consistently identifying the true model.
Based on our theoretical study, we propose a uniform choice of the model complexity penalty and show that the proposed approach consistently identifies the true model among candidate models with asymptotic probability one. We justify the performance of the proposed procedure by numerical simulations and a gene expression data analysis. This is joint work with Professor Chengyong Tang.

Title: Non-Concave Penalized Likelihood with NP-Dimensionality
Speaker: Jinchi Lv, Assistant Professor, Marshall School of Business, University of Southern California
Time and venue: August 7, 2012, 3:15 PM, Siyuan Building 703
Abstract: Penalized likelihood methods are fundamental to ultra-high dimensional variable selection. How high a dimensionality such methods can handle remains largely unknown. In this paper, we show that in the context of generalized linear models, such methods possess model selection consistency with oracle properties even for dimensionality of Non-Polynomial (NP) order of sample size, for a class of penalized likelihood approaches using folded-concave penalty functions, which were introduced to ameliorate the bias problems of convex penalty functions. This fills a long-standing gap in the literature, where the dimensionality is allowed to grow only slowly with the sample size. Our results are also applicable to penalized likelihood with the $L_1$-penalty, which is a convex function at the boundary of the class of folded-concave penalty functions under consideration. Coordinate optimization is implemented for finding the solution paths, and its performance is evaluated by a few simulation examples and a real data analysis. This is joint work with Professor Jianqing Fan.
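Note: as a rough illustration of the contrast between the convex $L_1$ penalty and a folded-concave penalty, the sketch below implements the SCAD penalty together with the one-dimensional thresholding rules that a coordinate-wise optimizer would apply. It is a minimal sketch and not the speaker's implementation. The function names are hypothetical, the default a = 3.7 is the conventional choice for SCAD, and the closed forms assume a squared-error loss with a standardized covariate.

```python
def soft_threshold(z, lam):
    """L1 (lasso) rule: minimizer of 0.5*(z - b)**2 + lam*|b| over b."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def scad_penalty(t, lam, a=3.7):
    """SCAD penalty value at t (assumes lam > 0 and a > 2)."""
    t = abs(t)
    if t <= lam:
        return lam * t  # behaves like the L1 penalty near zero
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2  # constant: large effects are not penalized further


def scad_threshold(z, lam, a=3.7):
    """SCAD rule: minimizer of 0.5*(z - b)**2 + scad_penalty(b, lam, a) over b."""
    if abs(z) <= 2 * lam:
        return soft_threshold(z, lam)
    if abs(z) <= a * lam:
        sign = 1.0 if z > 0 else -1.0
        return ((a - 1) * z - sign * a * lam) / (a - 2)
    return float(z)  # large inputs are left unshrunk
```

With lam = 1, soft_threshold(5.0, 1.0) returns 4.0 while scad_threshold(5.0, 1.0) returns 5.0, which shows how the folded-concave rule avoids the constant shrinkage bias of the $L_1$ rule for large coefficients.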
Title: Variable Selection in Linear Mixed Effects Models
Speaker: Yingying Fan, Assistant Professor, Marshall School of Business, University of Southern California
Time and venue: August 7, 2012, 2:00 PM, Siyuan Building 703
Abstract: This paper is concerned with the selection and estimation of fixed and random effects in linear mixed effects models. We propose a class of nonconcave penalized profile likelihood methods for selecting and estimating important fixed effects. To overcome the difficulty of the unknown covariance matrix of random effects, we propose to use a proxy matrix in the penalized profile likelihood. We establish conditions on the choice of the proxy matrix and show that the proposed procedure enjoys model selection consistency where the number of fixed effects is allowed to grow exponentially with the sample size. We further propose a group variable selection strategy to simultaneously select and estimate important random effects, where the unknown covariance matrix of random effects is replaced with a proxy matrix. We prove that, with the proxy matrix appropriately chosen, the proposed procedure can identify all true random effects with asymptotic probability one, where the dimension of the random effects vector is allowed to increase exponentially with the sample size. Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed procedures. We further illustrate the proposed procedures via a real data example. This is joint work with Professor Runze Li.

Title: On the QMLE of a threshold double AR model
Speaker: Shiqing Ling, Professor, Hong Kong University of Science and Technology, China
Time and venue: July 26, 2012, 3:30 PM, Siyuan Building 712
Abstract: This paper proposes a threshold double autoregressive model and studies its quasi-maximum likelihood estimation (QMLE).
It is shown that the estimator is strongly consistent and that the estimated threshold is $n$-consistent and converges weakly to a functional of a two-sided compound Poisson process. The remaining parameters are asymptotically normal. Our results include the asymptotic theory of the estimator for the threshold AR model with ARCH errors and the threshold ARCH model as special cases, each of which is also new in the literature. A resampling method is presented to simulate the limiting distribution of the estimated threshold, which can be applied to construct confidence intervals for the threshold parameter. Two portmanteau-type statistics are also derived for checking the adequacy of the fitted model when either the error is non-normal or the threshold is unknown. Simulation studies are conducted to assess the performance of the QMLE in finite samples. The results are illustrated with an application to the weekly closing prices of the Hang Seng Index.

Title: What are structural equation models?
Speaker: Kenneth A. Bollen, Professor, University of North Carolina at Chapel Hill
Time and venue: July 24, 2012, 3:00 PM, Siyuan Building 712
Abstract: Structural equation models (SEMs) refer to procedures popular in the social and behavioral sciences that are equipped to handle multiple equations with latent and observed variables, multiple measures of concepts, and measurement errors. From one perspective SEMs appear as a general statistical model that includes factor analysis, simultaneous equations, multiple regression, ANOVA, fixed and random effects models, growth curve models, probit regressions, and other models as special cases. But the "structural" in SEMs stands for the causal assumptions that researchers bring to the model, which are not always part of these other statistical models. This presentation provides a brief overview of latent variable SEMs.
I present the equations for SEMs and the major steps in modeling, which include model specification, determining the model-implied moments, establishing identification, estimating parameters, assessing model fit, and respecifying poorly fitting models. I also provide examples of SEMs.

Title: Estimation of Change-points in ARMA-GARCH/IGARCH and General Time Series Models
Speaker: Professor Shiqing Ling, Department of Mathematics, HKUST
Time and venue: July 16, 2012, 3:30 PM, Siyuan Building 703
Abstract: This paper first develops a general theory for estimating change-points in a general class of linear and nonlinear time series models. Based on a general objective function, it is shown that the estimated change-point converges weakly to the location of the maxima of a double-sided random walk and that the other estimated parameters are asymptotically normal. When the magnitude d of the changed parameters is small, it is shown that the limiting distribution can be approximated by the known distribution as in Yao (1987). This provides a channel to connect our results with those in Picard (1985) and Bai, Lumsdaine and Stock (1998), where the magnitude of the changed parameters depends on the sample size n and tends to zero as n approaches infinity. We then focus on the self-weighted QMLE and the local QMLE of structure-change ARMA-GARCH/IGARCH models. The limiting distribution of the estimated change-point and its approximating distribution are obtained. Some simulation results are reported and one real example is given.

Title: Estimating individualized treatment rules using outcome weighted learning
Speaker: Donglin Zeng, Associate Professor, University of North Carolina, Chapel Hill
Time and venue: July 10, 2012, 3:15 PM, Siyuan Building 712
Abstract: There is increasing interest in discovering individualized treatment rules for patients who have heterogeneous responses to treatment.
In particular, one aims to find an optimal individualized treatment rule, which is a deterministic function of patient-specific characteristics maximizing the expected clinical outcome. In this paper, we first show that estimating such an optimal treatment rule is equivalent to a classification problem where each subject is weighted proportionally to his or her clinical outcome. We then propose an outcome weighted learning approach based on the support vector machine framework. We show that the resulting estimator of the treatment rule is consistent. We further obtain a finite sample bound for the difference between the expected outcome using the estimated individualized treatment rule and that of the optimal treatment rule. The performance of the proposed approach is demonstrated via simulation studies and an analysis of chronic depression data.

Title: Estimating treatment effects with treatment switching via semi-competing risks models: an application to a colorectal cancer study
Speaker: Donglin Zeng, Associate Professor, University of North Carolina, Chapel Hill
Time and venue: July 10, 2012, 2:00 PM, Siyuan Building 712
Abstract: Treatment switching is a frequent occurrence in clinical trials, where, during the course of the trial, patients who fail on the control treatment may change to the experimental treatment. Analyzing the data without accounting for switching yields highly biased and inefficient estimates of the treatment effect. In this paper, we propose a class of semiparametric semi-competing risks transition survival models to accommodate treatment switches. Theoretical properties of the proposed model are examined and an efficient expectation-maximization algorithm is derived for obtaining the maximum likelihood estimates. Simulation studies are conducted to demonstrate the superiority of the model compared to the intent-to-treat analysis and other methods proposed in the literature. The proposed method is applied to data from a colorectal cancer clinical trial.
Title: Bayesian empirical likelihood for quantile regression
Speaker: Xuming He, Professor, Department of Statistics, University of Michigan
Time and venue: July 9, 2012, 3:00 PM, Siyuan Building 703
Abstract: Quantile regression is semiparametric in the sense that no parametric likelihood is assumed in the model. A working likelihood can be used, but the resulting posterior may not have any validity for statistical inference. In this talk we will introduce Bayesian empirical likelihood for quantile regression, and show that it leads to asymptotically valid posterior inference. In addition, this approach enables us to make use of commonality across quantiles to improve the efficiency of quantile estimation. We will also introduce a notion of shrinking priors, and demonstrate how this new framework can help explain the efficiency gains of the Bayesian empirical likelihood method over the usual quantile estimates. The talk is based on joint work with Yunwen Yang (Drexel University).

Title: Empirical likelihood inference for the Cox model with time-dependent coefficients
Speaker: Yichuan Zhao, Associate Professor, Department of Mathematics and Statistics, Georgia State University
Time and venue: July 6, 2012, 3:00 PM, Siyuan Building 712
Abstract: The Cox model with time-dependent coefficients has been studied by a number of authors recently. In this talk, we develop empirical likelihood (EL) point-wise confidence regions for the time-dependent regression coefficients via local partial likelihood smoothing. The EL simultaneous confidence bands for a linear combination of the coefficients are also derived based on strong approximation methods. The EL ratio is formulated through the local partial log-likelihood for the regression coefficient functions. Our numerical studies indicate that the EL point-wise/simultaneous confidence regions/bands have satisfactory finite sample performance.
Compared with the confidence regions derived directly from the asymptotic normal distribution of the local constant estimator, the EL confidence regions are overall tighter and can better capture the curvature of the underlying regression coefficient functions. Two data sets, the gastric cancer data and the Mayo Clinic primary biliary cirrhosis data, are analysed using the proposed method. This is based on joint work with Yanqing Sun and Rajeshwari Sundaram.

Title: Large Volatility Matrix Estimation for High-Frequency Financial Data
Speaker: Professor Yazhen Wang, University of Wisconsin-Madison
Time and venue: July 2, 2012, 4:00 PM, Siyuan Building 703
Abstract: Volatilities of asset returns are central to the theory and practice of asset pricing, portfolio allocation, and risk management. In financial economics, there is extensive research on modeling and forecasting volatility up to the daily level based on Black-Scholes, diffusion, GARCH, stochastic volatility models and implied volatilities from option prices. Nowadays, thanks to technological innovations, high-frequency financial data are available for a host of different financial instruments on markets of all locations and at scales like individual bids to buy and sell, and the full distribution of such bids. The availability of high-frequency data has stimulated an upsurge of interest in statistical research on better estimation of volatility. This talk will start with a review of low-frequency financial time series and high-frequency financial data. Then I will introduce the popular realized volatility computed from high-frequency financial data and present my work on large volatility matrix estimation.

Title: Joint Estimation of Multiple Graphical Models
Speaker: Associate Professor, University of Michigan
Time and venue: July 2, 2012, 3:00 PM, Siyuan Building 703
Abstract: Gaussian graphical models explore dependence relationships between random variables, through estimation of the corresponding inverse covariance matrices.
In this paper we develop an estimator for such models appropriate for data from several graphical models that share the same variables and some of the dependence structure. In this setting, estimating a single graphical model would mask the underlying heterogeneity, while estimating separate models for each category does not take advantage of the common structure. We propose a method which jointly estimates the graphical models corresponding to the different categories present in the data, aiming to preserve the common structure, while allowing for differences between the categories. This is achieved through a hierarchical penalty that targets the removal of common zeros in the inverse covariance matrices across categories. We establish the asymptotic consistency and sparsity of the proposed estimator in the high-dimensional case, and illustrate its superior performance on a number of simulated networks. An application to learning semantic connections between terms from webpages collected from computer science departments is also included. This is joint work with Jian Guo, Elizaveta Levina, and George Michailidis.

Title: Personalized Treatment Selection with Biomarkers
Speaker: Tianxi Cai, Professor of Biostatistics, Department of Biostatistics, Harvard School of Public Health, USA
Time and venue: June 7, 2012, 10:00-11:30 AM, Siyuan Building 712
Abstract: Clinical trials that evaluate treatment benefit focus primarily on estimating the average benefit. However, a treatment reported to be effective may not be beneficial to all patients. For example, the benefit of giving chemotherapy prior to hormone therapy with Tamoxifen in the adjuvant treatment of postmenopausal women with lymph node negative breast cancer depends on the ER-status. Due to the toxicity of chemotherapy, it is crucial to identify patients who will and will not benefit from chemotherapy. This gives rise to the need to accurately predict benefit based on important markers.
In this research, we propose a systematic, two-stage estimation procedure for the subject-level treatment differences for future patients' disease management and treatment selection. To construct this procedure, we first utilize a parametric or semi-parametric method to estimate individual-level treatment differences and use these estimates to create an index scoring system for clustering patients. We subsequently estimate the average treatment difference for each cluster of subjects via a nonparametric function estimation method. Furthermore, pointwise and simultaneous interval estimates are constructed to make inferences about such individual-specific treatment differences. The new proposal is illustrated with the data from an AIDS clinical trial and a randomized trial for treating patients with stable coronary heart disease.

Title: Risk Prediction with Biomarkers under Complex Study Designs
Speaker: Tianxi Cai, Professor of Biostatistics, Department of Biostatistics, Harvard School of Public Health, USA
Time and venue: June 6, 2012, 10:00-11:30 AM, Siyuan Building 712
Abstract: To evaluate the clinical utility of new biomarkers for risk prediction, a crucial step is to measure their predictive accuracy with prospective studies. However, it is often infeasible to obtain marker values for all study participants. The nested case-control (NCC) design is a useful cost-effective strategy for such settings. Under the NCC design, markers are only ascertained for cases and a fraction of controls sampled randomly from the risk sets. The outcome-dependent sampling generates a complex data structure and therefore a challenge for analysis. Existing methods for analyzing NCC studies focus primarily on association measures. When there is a single marker of interest, we propose a class of non-parametric estimators for commonly used accuracy measures.
Asymptotic theory for the proposed estimators was derived to account for both the outcome-dependent missingness and the correlation induced by finite population sampling due to the NCC design. When there are multiple markers under investigation, we extended the proposed procedures to derive an optimal composite risk score for prediction. We provided inference procedures for the prediction accuracy of the risk score as well as for making comparisons between two risk scores. The new procedures were illustrated with data from the Nurses' Health Study to evaluate the accuracy of biomarkers and genetic markers for predicting the risk of developing rheumatoid arthritis.

Title: Evaluating Clinical Utility of Biomarkers for Prediction
Speaker: Tianxi Cai, Professor of Biostatistics, Department of Biostatistics, Harvard School of Public Health, USA
Time and venue: June 5, 2012, 10:00-11:30 AM, Siyuan Building 712
Abstract: Novel biomarkers have great potential to dramatically change the decision-making process of modern medicine. Recently there has been increased interest in the use of diagnostic or prognostic markers for accurately diagnosing disease or predicting the risk of future clinical events. In this talk, we introduce various concepts of accuracy measures for quantifying the clinical utility of biomarkers under such settings. When new biomarkers are introduced to improve the diagnostic or prognostic accuracy of existing modalities, it is important to quantify the incremental value of new markers. We will also discuss various procedures for making inference about such incremental values over an entire population and also over various subpopulations. These concepts will be introduced under various clinical settings and illustrated with clinical studies.

Title: Local Polynomial Regression for Symmetric Positive Definite Matrices
Speaker: Prof.
Hongtu Zhu (University of North Carolina at Chapel Hill, USA)
Time and venue: September 21, 2010, 10:00, Siyuan Building 1013
Abstract: Local polynomial regression has received extensive attention for the nonparametric estimation of regression functions when both response and covariate are in Euclidean space. However, little has been done when the response is in a Riemannian manifold. We develop an intrinsic local polynomial regression (ILPR) and its associated ILPR estimate for the analysis of symmetric positive definite (SPD) matrices as responses that lie in a Riemannian manifold with covariate in Euclidean space. The primary motivation for and application of the proposed methodology are in computer vision and medical imaging. We examine two commonly used metrics, the Riemannian metric and the Log-Euclidean metric, on the space of SPD matrices. Under each metric, we develop an associated cross-validation bandwidth selection method, derive the asymptotic bias, variance, and normality of the intrinsic local constant and local linear estimators, and compare their asymptotic mean square errors. Simulation studies are further used to compare the estimators under the two metrics and examine their finite sample performance. We apply our method to detect diagnostic differences by smoothing diffusion tensors along fiber tracts in a study of human immunodeficiency virus.

Title: On the Estimation of Integrated Covariance Matrices of High Dimensional Diffusion Processes
Speaker: Prof. Yingying Li (Hong Kong University of Science and Technology)
Time and venue: June 13, 2010, 15:00-16:00, Siyuan Building 703
Abstract: We consider the estimation of integrated covariance matrices of high dimensional diffusion processes by using high frequency data. We start by studying the most commonly used estimator, the realized covariance matrix (RCV).
We show that in the high dimensional case when the dimension p and the observation frequency n grow at the same rate, the limiting empirical spectral distribution of RCV depends on the covolatility processes not only through the underlying integrated covariance matrix Sigma, but also through how the covolatility processes vary in time. In particular, for two high dimensional diffusion processes with the same integrated covariance matrix, the empirical spectral distributions of their RCVs can be very different. Hence in terms of making inference about the spectrum of the integrated covariance matrix, the RCV is in general not a good proxy to rely on in the high dimensional case. We then propose an alternative estimator, the time-variation adjusted realized covariance matrix (TVARCV), for a class of diffusion processes. We show that the limiting empirical spectral distribution of our proposed estimator TVARCV depends solely on that of Sigma through a Marcenko-Pastur equation, and hence the TVARCV can be used to recover the empirical spectral distribution of Sigma by inverting the Marcenko-Pastur equation, which can then be used in further applications such as portfolio allocation and risk management. This is based on joint work with Xinghua Zheng.

2008

Title: One-step Sparse Estimates in Nonconcave Penalized Likelihood Models
Speaker: Prof. Runze Li (Associate Professor, The Pennsylvania State University)
Time and venue: June 27, 2008, 16:00-17:40, Siyuan Building 703
Abstract: Fan and Li (2001) proposed a family of variable selection methods via penalized likelihood using concave penalty functions. The nonconcave penalized likelihood estimators enjoy the oracle properties, but maximizing the penalized likelihood function is computationally challenging, because the objective function is nondifferentiable and nonconcave.
In this article we propose a new unified algorithm based on the local linear approximation (LLA) for maximizing the penalized likelihood for a broad class of concave penalty functions. Convergence and other theoretical properties of the LLA algorithm are established. A distinguishing feature of the LLA algorithm is that at each LLA step, the LLA estimator can naturally adopt a sparse representation. Thus we suggest using the one-step LLA estimator from the LLA algorithm as the final estimate. Statistically, we show that if the regularization parameter is appropriately chosen, the one-step LLA estimates enjoy the oracle properties with good initial estimators. Computationally, the one-step LLA estimation methods dramatically reduce the computational cost of maximizing the nonconcave penalized likelihood. We conduct Monte Carlo simulations to assess the finite sample performance of the one-step sparse estimation methods. The results are very encouraging.

Title: Quotient Correlation: A Sample Based Alternative To Pearson's Correlation
Speaker: Prof. Zhengjun Zhang (Princeton University, USA)
Time and venue: June 27, 2008, 10:40-11:40, Siyuan Building 1013
Abstract: The quotient correlation is defined here as an alternative to Pearson's correlation that is more intuitive and flexible in cases where the tail behavior of data is important. It measures nonlinear dependence where the regular correlation coefficient is generally not applicable. One of its most useful features is a test statistic that has high power when testing nonlinear dependence in cases where Fisher's $Z$-transformation test may fail to reach the right conclusion. Unlike most asymptotic test statistics, which are either normal or $\chi^2$, this test statistic has a limiting gamma distribution (henceforth, the gamma test statistic). Beyond the common usages of correlation, the quotient correlation can easily and intuitively be adjusted to values at tails.
This adjustment generates two new concepts, the tail quotient correlation and the tail independence test statistic, which are also gamma statistics. Because there is no analogue of the correlation coefficient in extreme value theory, and there does not exist an efficient tail independence test statistic, these two new concepts may open up a new field of study. In addition, an alternative to Spearman's rank correlation, a rank-based quotient correlation, is also defined. The advantages of using these new concepts are illustrated with simulated data and real data analysis of internet traffic and asset returns.

Title: Statistical semiparametric detection of significant activation for brain fMRI
Speaker: Prof. Chunming Zhang (Associate Professor, University of Wisconsin)
Time and venue: June 27, 2008, 9:30-10:30, Siyuan Building 1013
Abstract: Functional magnetic resonance imaging (fMRI) aims to locate activated regions in human brains when specific tasks are performed. The conventional tool for analyzing fMRI data applies some variant of the linear model, which is restrictive in modeling assumptions. To yield more accurate prediction of the time-course behavior of neuronal responses, semiparametric inference for the underlying hemodynamic response function is developed to identify significantly activated voxels. Under mild regularity conditions, we demonstrate that a class of the proposed semiparametric test statistics, based on the local linear estimation technique, follow chi-squared distributions under null hypotheses for a number of useful hypotheses. Furthermore, the asymptotic power functions of the constructed tests are derived under the fixed and contiguous alternatives. Simulation evaluations and a real fMRI data application suggest that the semiparametric inference procedure provides more efficient detection of activated brain areas than the popular imaging analysis tools AFNI and FSL.
2007

Title: Challenge of Dimensionality in Classifications and Feature Selection
Speaker: Prof. Jianqing Fan (Princeton University, USA)
Time and place: December 27, 2007, 9:30-10:30, Siyuan Building 703
Abstract:

Title: Regression Analysis of Longitudinal Data with Outcome Dependent Observation and Follow-up Times
Speaker: Prof. (Tony) Jianguo Sun (University of Missouri, USA)
Time and place: July 13, 2007, 16:00-17:00, Siyuan Building 703
Abstract: Longitudinal data frequently occur in many studies such as longitudinal follow-up studies. To develop statistical methods and theory for their analysis, independent or noninformative observation and censoring times are typically assumed, which naturally leads to inference procedures conditional on observation and censoring times (Diggle et al., 1994; Lin and Ying, 2001). In many situations, however, this may not be true or realistic. That is, longitudinal responses may be correlated with observation times as well as censoring times. This paper considers the analysis of longitudinal data where these correlations may exist, and proposes a joint modeling approach that uses latent variables to characterize the correlations. For inference about regression parameters, estimating equation approaches are developed and both large and finite sample properties of the proposed estimators are established. The methodology is applied to a bladder cancer study that motivated this investigation.

Title: AGGREGATION OF NONPARAMETRIC ESTIMATORS FOR VOLATILITY MATRIX
Speaker: Jianqing Fan, Yingying Fan and Jinchi Lv (Princeton University)
Time and place: June 25, 2007, 16:00-17:30, Siyuan Building 712
Abstract: An aggregated method of nonparametric estimators based on time-domain and state-domain estimators is proposed and studied. To attenuate the curse of dimensionality, we propose a factor modeling strategy. We first investigate the asymptotic behaviors of nonparametric estimators of the volatility matrix in the time domain and in the state domain.
The asymptotic normality is separately established for nonparametric estimators in the time domain and state domain. These two estimators are asymptotically independent. Hence, they can be combined, through a dynamic weighting scheme, to improve the efficiency of the estimated volatility matrix. The optimal dynamic weights are derived and it is shown that the aggregated estimator uniformly dominates the volatility matrix estimators using time-domain or state-domain smoothing alone. A simulation study, based on an essentially affine model for the term structure, is conducted and it demonstrates convincingly that the newly proposed procedure outperforms both time- and state-domain estimators. Empirical studies further endorse the advantages of our aggregated method.

Title: Analysis of Longitudinal Data with Semiparametric Estimation of Covariance Function
Speaker: Runze Li, Associate Professor (The Pennsylvania State University)
Time and place: May 18, 2007, 15:30-16:30, Morningside Center 605
Abstract: Improving efficiency for regression coefficients and predicting trajectories of individuals are two important aspects of the analysis of longitudinal data. Both involve estimation of the covariance function. Yet, challenges arise in estimating the covariance function of longitudinal data collected at irregular time points. A class of semiparametric models for the covariance function is proposed by imposing a parametric correlation structure while allowing a nonparametric variance function. A kernel estimator is developed for the estimation of the nonparametric variance function. Two methods, a quasi-likelihood approach and a minimum generalized variance method, are proposed for estimating parameters in the correlation structure. We introduce a semiparametric varying coefficient partially linear model for longitudinal data and propose an estimation procedure for model coefficients by using a profile weighted least squares approach.
Sampling properties of the proposed estimation procedures are studied and asymptotic normality of the resulting estimators is established. Finite sample performance of the proposed procedures is assessed by Monte Carlo simulation studies. The proposed methodology is illustrated by an analysis of a real data example.

Title: Accelerated Life and Degradation Models with Dynamic Environment
Speaker: Prof. Mikhail Nikulin (Statistique Mathématique et ses Applications, Victor Segalen University)
Time and place: December 12, 2006, 16:00-17:00, Siyuan Building 712
Abstract: We consider statistical models with dynamic environment that describe the dependence of the lifetime distribution on time-dependent explanatory variables. Such models are used in reliability and survival analysis to study the reliability of aging bio-technical systems, depending on their longevity, fatigue and degradation under different conditions of exploration. Reliability theory gives a general approach for constructing efficient statistical models in terms of failure rates to study aging and degradation problems in different areas such as industrial engineering and technology, biophysics, biology, demography, radiobiology, genetics, biostatistics, survival analysis, business and finance, etc. We shall discuss the problems of statistical modelling and of the choice of design in accelerated life testing to obtain statistical estimators of the main reliability characteristics of aging systems.

Title: Nonlinear Dependency and Its Application
Speaker: Wei Gang (魏刚) (School of Mathematics and System Sciences, Shandong University)
Time and place: November 17, 2006, 16:00-17:00, Siyuan Building 703
Abstract: The normal distribution and the linear model have been taken as the central part of classical statistical inference in both theory and application.
In the last decade, driven by demand from the social, medical, and industrial sciences, nonlinear dependency, characterized by partial ordering, copula construction, and nonparametric dependency, has shown great potential in both its theoretical challenges and its statistical applications. In this talk, we particularly demonstrate the rich mathematical structures constructed with the aid of copula analysis and some simple but important applications of such non-classical statistical inference techniques.

Title: Tree-Structured Survival Analysis Based on Variance of Survival Time
Speaker: Hua Jin, Ph.D. (School of Mathematical Sciences, South China Normal University)
Time and place: October 19, 2006, 10:30
Abstract: Tree structured survival analysis (TSSA) is a popular alternative to Cox proportional hazards regression in medical research on survival data. Several methods for constructing a tree of different survival profiles have been developed, including ones based on two-sample log-rank test statistics and martingale-type residuals. Lu, Jin and Mi used the variance of restricted mean lifetimes as an index of degree of separation (DOS) to measure the efficiency with which a classification method separates survival profiles. They proposed a hypothesis testing procedure for comparing two classification rules, especially for non-inferiority tests. Our objective here is to explore the use of DOS in TSSA. We propose an algorithm in a similar fashion to the least squares regression tree for survival analysis. We apply the proposed method to prospective cohort data from the Study of Osteoporotic Fracture that motivated our research and then compare our classification rule to those rules based on the log-rank statistics and martingale residuals.

Title: Bayesian Methods for Inferring Epistasis
Speaker: Prof. Jun Liu (Department of Statistics, Harvard University)
Time and place: July 24, 2006 (Monday), 2:00 PM, Siyuan Building 712
Abstract: I will discuss a Bayesian approach to detecting multi-locus interactions (epistasis) in case-control association studies. Existing methods are either of low power or computationally infeasible when facing a large number of markers. Using MCMC sampling techniques, the method can efficiently detect interactions among thousands of markers. I will also discuss the issue of statistical significance and how to adjust for multiple comparisons in this situation (much of this is conjecture, though).

Title: Embracing Statistical Challenges in the Information Technology Age
Speaker: Prof. Bin Yu (Department of Statistics, University of California, Berkeley)
Time and place: July 20, 2006 (Thursday), 10:00 AM, Siyuan Building 703
Abstract:

Title: Bayesian Hierarchical Modeling for Integrating Low-accuracy and High-accuracy Experiments
Speaker: Prof. Jeff Wu (School of Industrial and Systems Engineering, Georgia Institute of Technology)
Time and place: July 14, 2006 (Friday), 10:00 AM, Siyuan Building 712
Abstract: Standard practice in analyzing data from different types of experiments is to treat data from each type separately. By borrowing strength across multiple sources, an integrated analysis can produce better results. Careful adjustments need to be made to incorporate the systematic differences among various experiments. To this end, some Bayesian hierarchical Gaussian process (BHGP) models are proposed. The heterogeneity among different sources is accounted for by performing flexible location and scale adjustments. The approach tends to produce predictions closer to those from the high-accuracy experiment. The Bayesian computations are aided by the use of Markov chain Monte Carlo and Sample Average Approximation algorithms.
The proposed method is illustrated with two examples: one with detailed and approximate finite element simulations for mechanical material design and the other with physical and computer experiments.

Title: Fast Functional MRI
Speaker: Prof. Cun-Hui Zhang (Department of Statistics, Rutgers University, USA)
Time and place: June 22, 2006 (Thursday), 4:00-5:00 PM, Siyuan Building 703
Abstract: We develop fast functional MRI methods to improve the time-resolution of current functional MRI technology by sampling a small fraction of the Fourier transform of the spin density, and using a prolate wave filter to approximately obtain, not the usual susceptibility map, but instead the integral of this quantity over regions of interest in the brain at successive time-points. The aim of this space/time trade-off is to obtain, at high time-resolution, the total activity in the regions that process the specific stimulus/task and, more importantly for studying higher cognition, the sequence of occurrences of these processes. An fMRI experiment will be reviewed and discussed. This is joint work with Gary Glover, Martin Lindquist and Larry Shepp.

Title: Statistical Challenges with High Dimensionality in Feature Selection
Speaker: Prof. Jianqing Fan (Princeton University, USA)
Time and place: June 2, 2006 (Friday), 10:00-11:00 AM, Siyuan Building 712
Abstract: Technological innovations have revolutionized the process of scientific research and knowledge discovery. The availability of massive data and challenges from frontiers of research and development have reshaped statistical thinking, data analysis and theoretical studies. The challenges of high-dimensionality arise in diverse fields of sciences and the humanities, ranging from computational biology and health studies to financial engineering and risk management. In all of these fields, variable selection and feature extraction are crucial for knowledge discovery.
We first give a comprehensive overview of statistical challenges with high dimensionality in these diverse disciplines. We then approach the problem of variable selection and feature extraction using a unified framework: penalized likelihood methods. Issues relevant to the choice of penalty functions are addressed. We demonstrate that for a host of statistical problems, as long as the dimensionality is not excessively large, we can estimate the model parameters as well as if the best model were known in advance. The persistence property in risk minimization is also addressed. The applicability of such a theory and method to diverse statistical problems is demonstrated. Other related problems with high-dimensionality are also discussed.

Title: Semi/Non-parametric Dynamic Quantile Regression Models and Their Applications
Speaker: Prof. Zongwu Cai (Department of Mathematics and Statistics & Department of Economics, University of North Carolina, Charlotte, USA)
Time and place: April 8, 2006 (Saturday), 4:00-5:00 PM, Siyuan Building 712
Abstract: In this talk, I will first briefly review some semiparametric and nonparametric regression models for time series data and their applications, such as value-at-risk. In particular, I will focus on a class of smooth coefficient quantile regression time series models motivated by some applications. We employ a local linear fitting scheme to estimate the smooth coefficients in the quantile framework. The programming involved in the local linear quantile estimation is relatively simple and can be adapted with little effort from existing programs for the linear quantile model. We derive the local Bahadur representation of the local linear estimator for alpha-mixing time series and establish the asymptotic normality of the resulting estimator. The asymptotic behavior of the estimator at the boundaries is examined. A comparison of the local linear quantile estimator with the local constant estimator is presented.
A simulation study is carried out to illustrate the performance of the estimates. An empirical application of the model to exchange rate time series data and the well-known Boston house price data further demonstrates the potential of the proposed modeling procedures.

Title: Additive models for spatial processes
Speaker: Academician Dag Tjostheim (Department of Mathematics, University of Bergen, Norway)
Time and place: April 11, 2006 (Tuesday), 4:00-5:00 PM, Siyuan Building 712
Abstract:

Title: Estimating Marginal Survival Under Dependent Censoring
Speaker: Donglin Zeng, Assistant Professor (Department of Biostatistics, University of North Carolina (Chapel Hill))
Time and place: April 13, 2006 (Thursday), 2:00-3:00 PM, Siyuan Building 712
Abstract: One goal in survival analysis of right censored data is to estimate the marginal survival function in the presence of dependent censoring. When many auxiliary covariates are sufficient to explain the dependent censoring, estimation based on either a semiparametric or a nonparametric model of the conditional survival function can be problematic due to the high dimensionality of the auxiliary information. In this paper, we use two working models to condense these high-dimensional covariates through dimension reduction; an estimate of the marginal survival function can then be derived nonparametrically in a low-dimensional space. We show that such an estimator has the following double robust property: when either working model is correct, the estimator is consistent and asymptotically Gaussian; when both working models are correct, the asymptotic variance attains the efficiency bound.
Title: Maximum Likelihood Estimation in Semiparametric Transformation Models for Counting Processes
Speaker: Donglin Zeng, Assistant Professor (Department of Biostatistics, University of North Carolina (Chapel Hill))
Time and place: April 18, 2006 (Tuesday), 2:00-3:00 PM, Siyuan Building 712
Abstract: A class of semiparametric transformation models is proposed to characterize the effects of possibly time-varying covariates on the intensity functions of counting processes. The class includes the proportional intensity model and linear transformation models as special cases. Nonparametric maximum likelihood estimators are developed for the regression parameters and cumulative intensity functions of these models based on censored data. The estimators are shown to be consistent and asymptotically normal. The limiting variances for the estimators of the regression parameters achieve the semiparametric efficiency bounds and can be consistently estimated. The limiting variances for the estimators of smooth functionals of the cumulative intensity function can also be consistently estimated. Simulation studies reveal that the proposed inference procedures perform well in practical settings. Two medical studies are provided.

Title: Semiparametric Transformation Models for Survival Data with a Cure Fraction
Speaker: Donglin Zeng, Assistant Professor (Department of Biostatistics, University of North Carolina (Chapel Hill))
Time and place: April 19, 2006 (Wednesday), 2:00-3:00 PM, Siyuan Building 712
Abstract: We propose a class of transformation models for survival data with a cure fraction. The class of transformation models is motivated by biological considerations, and it includes both the proportional hazards and the proportional odds cure models as two special cases. An efficient recursive algorithm is proposed to calculate the maximum likelihood estimators.
Furthermore, the maximum likelihood estimators for the regression coefficients are shown to be consistent and asymptotically normal, and their asymptotic variances attain the semiparametric efficiency bound. Simulation studies are conducted to examine the finite sample properties of the proposed estimators. The method is illustrated on data from a clinical trial involving the treatment of melanoma.