• Classification Methods for Circular-Linear Data Using Periodic Functions

      Chen, Chen Chun (2016-07-08)
      In many fields, such as medicine, agriculture and environmental studies, data are collected over time and can show a pattern that repeats within a certain time period. Data with a linear response, such as blood pressure or solar energy, and a circular predictor are called circular-linear data. Data with repeated measures over time are usually analyzed using longitudinal analysis methods. However, applying classical longitudinal data analysis to circular-linear data is generally inappropriate, since the circular time variable would be treated as a simple continuous variable. Parametric approaches for circular-linear data have been developed using various modeling methods. We propose a Bayesian non-parametric MCMC circular smoothing splines approach, which is not only appropriate for circular-linear data but also adds flexibility for modeling and classification. We first fit the circular-linear data on an estimated circle to elicit the functional pattern from the data, and then classify the patterns. In developing the classification procedure, we use functional data analysis and some widely used dimension reduction and classification methods, such as principal component analysis and support vector machines. We evaluate the performance of the proposed modeling and classification methods through extensive simulation, and demonstrate them using the 2005-2006 NHANES physical activity monitor data on insomnia patients. In the simulation study, the non-parametric Bayesian smoothing splines method coupled with the support vector machine approach yields the best classification performance in terms of concordance rate. Our proposed non-parametric approach performed slightly better than the established parametric methods. In addition, initial data fitting with a periodic regression function, used to reduce the noise in the data, is shown to improve classification performance.
The results of the NHANES data analysis are consistent with simulation
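
As a rough illustration of the periodic regression idea mentioned in the abstract, the sketch below fits a first-order harmonic (one cosine and one sine term) to hourly measurements by treating clock time as an angle. It is not the authors' Bayesian smoothing splines procedure. The function names, the 24-hour period, and the synthetic signal are all assumptions made for illustration, and the closed-form estimates hold only when the time points are equally spaced over whole periods.

```python
import math


def fit_first_harmonic(hours, values, period=24.0):
    """Least-squares fit of y = mean_level + a*cos(theta) + b*sin(theta).

    Assumes `hours` are equally spaced and cover whole periods, so the
    cosine and sine columns are orthogonal and the estimates are closed form.
    """
    n = len(values)
    thetas = [2.0 * math.pi * h / period for h in hours]
    mean_level = sum(values) / n
    a = 2.0 / n * sum(y * math.cos(t) for y, t in zip(values, thetas))
    b = 2.0 / n * sum(y * math.sin(t) for y, t in zip(values, thetas))
    return mean_level, a, b


def predict(hour, coef, period=24.0):
    """Evaluate the fitted periodic curve at a given clock time."""
    mean_level, a, b = coef
    theta = 2.0 * math.pi * hour / period
    return mean_level + a * math.cos(theta) + b * math.sin(theta)


# Hypothetical noise-free signal sampled once per hour over one day.
hours = list(range(24))
signal = [
    10.0
    + 3.0 * math.cos(2.0 * math.pi * h / 24.0)
    + 4.0 * math.sin(2.0 * math.pi * h / 24.0)
    for h in hours
]

coef = fit_first_harmonic(hours, signal)
amplitude = math.hypot(coef[1], coef[2])  # size of the daily swing
```

Because time enters only through sine and cosine, the fitted curve takes the same value at hour 0 and hour 24, which a model treating time as an ordinary continuous variable would not guarantee. Fitted coefficients or fitted curves of this kind could then serve as inputs to a classifier.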