Graduation Date

Fall 2025

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Programs

Biostatistics

First Advisor

Lynette M Smith

Second Advisor

Ran Dai

Abstract

Early detection of complex diseases is essential for improving patient outcomes, particularly for conditions that remain asymptomatic until advanced stages. Biomarkers serve as key indicators of disease presence and progression, but selecting an optimal subset remains challenging because modern biological datasets are high-dimensional. Advances in omics technologies have identified numerous candidate biomarkers, yet using them effectively requires robust selection methods that ensure interpretability, cost-effectiveness, and predictive reliability in both cross-sectional and longitudinal settings. To address these challenges, we propose two novel approaches: Stability Selection Ensemble Learning (STABEL) for cross-sectional data and Longitudinal Stability Selection Ensemble Learning (LSTABEL) for longitudinal data. Both methods integrate stability selection with ensemble learning to improve biomarker selection and predictive accuracy while mitigating overfitting. Stability selection strengthens traditional variable selection by producing a stable subset of significant variables, and ensemble learning improves generalization by combining the predictions of multiple models to offset the limitations of any single model. In simulation studies, STABEL and LSTABEL outperform traditional methods both in selecting truly relevant biomarkers and in prediction performance. These methodologies are particularly valuable in applications where early detection is crucial. We illustrate their effectiveness in two case studies: ovarian cancer, where selecting a concise biomarker panel can enhance early diagnosis and treatment strategies, and Alzheimer’s disease, where robust biomarker discovery can improve the prediction of disease progression and cognitive decline. We also developed an R package, stabel, for the STABEL method.
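
For readers unfamiliar with stability selection, the general idea can be sketched as follows. This is only an illustration of the generic technique (refit a selector on many random half-samples and keep the variables chosen in a large fraction of them); it is not the STABEL or LSTABEL algorithm and does not reflect the interface of the stabel R package. The base selector (top-k absolute correlation, standing in for a penalized regression), the function names, the threshold of 0.7, and the simulated data are all assumptions made for this example, and the ensemble-learning step described in the abstract is omitted.

```python
import random
from statistics import mean


def abs_correlation(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no information
    return abs(sxy) / (sxx * syy) ** 0.5


def select_top_k(X, y, k):
    """Hypothetical base selector: the k features most correlated with y."""
    n_features = len(X[0])
    scores = [abs_correlation([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def stability_selection(X, y, k=3, n_subsamples=100, threshold=0.7, seed=0):
    """Return (stable feature indices, per-feature selection frequencies)."""
    rng = random.Random(seed)
    n = len(X)
    counts = [0] * len(X[0])
    for _ in range(n_subsamples):
        # Draw a random half of the observations without replacement.
        idx = rng.sample(range(n), n // 2)
        Xs = [X[i] for i in idx]
        ys = [y[i] for i in idx]
        for j in select_top_k(Xs, ys, k):
            counts[j] += 1
    freqs = [c / n_subsamples for c in counts]
    # Keep only features selected in at least `threshold` of the subsamples.
    stable = [j for j, f in enumerate(freqs) if f >= threshold]
    return stable, freqs


# Simulated data (an assumption): 200 subjects, 10 candidate biomarkers,
# of which only the first two influence the binary outcome.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(200)]
y = [1 if row[0] + row[1] + 0.5 * rng.gauss(0, 1) > 0 else 0 for row in X]

stable, freqs = stability_selection(X, y)
print("selection frequencies:", [round(f, 2) for f in freqs])
print("stable features:", stable)
```

In this sketch, noise variables can enter the top k in individual subsamples, but only variables that are selected consistently across subsamples pass the frequency threshold, which is what makes the retained subset stable.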

Comments

2025 Copyright, the authors

Available for download on Friday, October 22, 2027

Included in

Biostatistics Commons
