Large-scale data have become abundant across many fields in recent years. Classical statistical methods, such as linear regression and generalized linear models, often perform poorly on heterogeneous data because their key homogeneity assumption fails. In this paper, we present a flexible framework for heterogeneous populations that can be naturally grouped into several ordered subtypes. We propose a local modeling technique that uses ordinal class labels during the training stage. We define a new 'progression score' that captures the progression of the ordinal classes, and we use a truncated Gaussian kernel to construct the weight function in a local regression framework. Given the weights, we then apply sparse shrinkage to the local fit to handle high dimensionality. In this way, our local model performs variable selection separately at each query point. Numerical studies show that the proposed method outperforms several existing ones. We also apply our method to the Alzheimer's Disease Neuroimaging Initiative data to predict longitudinal clinical scores from different modalities of baseline brain imaging features.
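To make the two-step idea above concrete (kernel weights in progression-score space, followed by a sparse weighted fit at the query point), here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the implementation from the paper: the progression scores are taken as given inputs, the truncation rule (zero weight beyond `cutoff` bandwidths), the choice of an L1 (lasso) penalty as the sparse shrinkage, the toy data, and every function and parameter name are hypothetical.

```python
import math


def truncated_gaussian_weights(scores, query_score, bandwidth, cutoff=2.0):
    """Gaussian kernel weights in score space, zero beyond cutoff * bandwidth.

    The truncation rule is an assumption made for this sketch.
    """
    weights = []
    for s in scores:
        u = (s - query_score) / bandwidth
        weights.append(math.exp(-0.5 * u * u) if abs(u) <= cutoff else 0.0)
    return weights


def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, weights, lam, n_iter=200):
    """Coordinate descent for the weighted lasso objective
    0.5 * sum_i w_i * (y_i - b0 - x_i . beta)^2 + lam * sum_j |beta_j|.
    """
    p = len(X[0])
    total_w = sum(weights)
    if total_w <= 0.0:
        raise ValueError("no training point receives positive weight")
    beta = [0.0] * p
    intercept = 0.0
    for _ in range(n_iter):
        # Unpenalized intercept: weighted mean of the current residuals.
        intercept = sum(
            w * (yi - sum(b * x for b, x in zip(beta, xi)))
            for w, xi, yi in zip(weights, X, y)
        ) / total_w
        for j in range(p):
            rho = 0.0   # weighted inner product of feature j and partial residual
            norm = 0.0  # weighted squared norm of feature j
            for w, xi, yi in zip(weights, X, y):
                if w == 0.0:
                    continue  # truncated points do not enter the local fit
                partial = yi - intercept - sum(
                    beta[k] * xi[k] for k in range(p) if k != j
                )
                rho += w * xi[j] * partial
                norm += w * xi[j] * xi[j]
            beta[j] = soft_threshold(rho, lam) / norm if norm > 0.0 else 0.0
    return intercept, beta


# Hypothetical toy data: the response depends on feature 0 at low scores
# and on feature 1 at high scores.
combos = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
X, y, scores = [], [], []
for s in (0.0, 0.5, 1.0, 3.0, 3.5, 4.0):
    for x0, x1 in combos:
        X.append([x0, x1])
        scores.append(s)
        y.append(2.0 * x0 if s < 2.0 else -3.0 * x1)


def local_fit(query_score, bandwidth=0.5, lam=0.5):
    """Fit a sparse local model around one query point."""
    w = truncated_gaussian_weights(scores, query_score, bandwidth)
    return weighted_lasso(X, y, w, lam)


early = local_fit(0.5)  # (intercept, coefficients) near the low-score group
late = local_fit(3.5)   # (intercept, coefficients) near the high-score group
```

On this toy data the two local fits select different features: the fit at the low-score query keeps only feature 0, and the fit at the high-score query keeps only feature 1. This is the sense in which a local sparse model can perform variable selection separately at each query point.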
Keywords
- local models
- ordinal classification
- random forests
ASJC Scopus subject areas
- Radiological and Ultrasound Technology
- Computer Science Applications
- Electrical and Electronic Engineering