(Jun. 20) Recursive Partitioning: Trees and Adaptive Splines II

Last updated: 2017-06-19

Topic: Recursive Partitioning: Trees and Adaptive Splines II
Speaker: Prof. Heping Zhang
(Yale University)
Time: 9:30-11:30 am, Tuesday, June 20, 2017
Venue: Room 415, New Mathematics Building, Guangzhou South Campus, SYSU

Many scientific problems reduce to modeling the relationship between two sets of variables. Regression methodology is designed to quantify these relationships. Owing to their mathematical simplicity, linear regression for continuous data, logistic regression for binary data, proportional hazards regression for censored survival data, and mixed-effects regression for longitudinal data are among the most commonly used statistical methods. These parametric (or semiparametric) regression methods, however, may not describe the data faithfully when their underlying assumptions are not satisfied. As a remedy, an extensive literature exists on diagnosing parametric and semiparametric regression models, but the practice of model diagnosis is uneven at best. Furthermore, model interpretation can be problematic in the presence of higher-order interactions among potent predictors. Nonparametric regression has evolved to relax or remove these restrictive assumptions, and in many cases recursive partitioning provides a useful alternative to parametric regression methods. In this talk, I will describe nonparametric regression methods built on recursive partitioning. Importantly, recursive partitioning is the statistical technique that forms the basis for two classes of nonparametric regression methods: Classification and Regression Trees (CART) and Multivariate Adaptive Regression Splines (MARS). In the last two decades, many methods have been developed on the basis of, or inspired by, CART and MARS. Although relatively new, these methods have far-reaching applications spanning commerce, the arts, engineering, and science, driven by the increasing complexity of study designs and the massive size of many data sets (large numbers of observations or variables).
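To make the core idea concrete, below is a minimal, hypothetical sketch of CART-style recursive partitioning for regression: at each node, search for the single predictor threshold that most reduces the within-node sum of squared errors, then recurse on the two resulting subsets until nodes fall below a minimum size. This toy illustrates the principle only; it is not the CART algorithm as published (which includes pruning, multiple predictors, and other refinements), and all function names here are illustrative.

```python
def best_split(x, y):
    """Find the threshold on a single predictor x that minimizes the
    total within-node sum of squared errors of y across the two halves.
    Returns (sse, threshold), or None if there are fewer than 2 points."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    best = None
    for k in range(1, len(xs)):
        left, right = ys[:k], ys[k:]
        sse = sum((v - sum(left) / len(left)) ** 2 for v in left) \
            + sum((v - sum(right) / len(right)) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, (xs[k - 1] + xs[k]) / 2)  # midpoint threshold
    return best

def grow_tree(x, y, min_node=5):
    """Recursively partition (x, y) until nodes are smaller than min_node,
    returning a nested dict of splits with leaf means as predictions."""
    split = best_split(x, y) if len(y) >= min_node else None
    if split is None:
        return {"mean": sum(y) / len(y), "n": len(y)}  # leaf node
    _, t = split
    li = [i for i in range(len(x)) if x[i] <= t]
    ri = [i for i in range(len(x)) if x[i] > t]
    if not li or not ri:
        return {"mean": sum(y) / len(y), "n": len(y)}
    return {"threshold": t,
            "left": grow_tree([x[i] for i in li], [y[i] for i in li], min_node),
            "right": grow_tree([x[i] for i in ri], [y[i] for i in ri], min_node)}

# Example: a step function that a single global linear fit would miss,
# but that one recursive split captures exactly.
x = [1, 2, 3, 4, 10, 11, 12, 13]
y = [0, 0, 0, 0, 5, 5, 5, 5]
tree = grow_tree(x, y, min_node=5)  # splits at threshold 7.0
```

The example highlights why trees sidestep the parametric assumptions discussed above: the fitted model is a piecewise constant over data-driven regions, so no global functional form is ever assumed.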