Graduation Date

Summer 2023

Document Type


Degree Name

Doctor of Philosophy (PhD)



First Advisor

Ying Zhang, Ph.D.

Second Advisor

Cheng Zheng, Ph.D.

Third Advisor

Lynette Smith, Ph.D.

Fourth Advisor

James Dong, Ph.D.


There is growing interest in modeling multiple event data in biomedical and public health investigations. Competing and semi-competing risks data are two special types of multiple event data. Statistical modeling of semi-competing risks data involves an intermediate event and a terminal event, and emphasizes the effect of the intermediate event on the terminal event. Studying the semi-competing risks model not only enables us to determine whether a disease episode is associated with mortality but also provides tools for predicting death given that the episode occurred at a particular time. Meanwhile, the interpretation of the causal effect of a treatment or exposure in competing risks data is also of great interest. Endogeneity and potential confounding in observational studies prevent investigators from determining the causal significance of the primary exposure of interest. An issue frequently encountered in both competing and semi-competing risks data is incomplete observation of the event types, or event misclassification. Naive analyses that disregard case mis-ascertainment, and complete-case analyses that exclude any subjects whose event status may be misclassified or missing, may generate biased estimates that lead to invalid inferences. Several statistical remedies have been developed in the literature to assist in correctly modeling competing risks data with missing event types. To the best of our knowledge, however, few existing statistical methods can be directly applied to semi-competing risks data with missing event types. In this dissertation, we propose pseudo-likelihood approaches equipped with an EM-like algorithm to study semi-competing risks data with missing event types under the restricted gamma-frailty conditional Markov model. We demonstrate that the resulting pseudo-likelihood estimates are consistent and asymptotically normal, and we use simulation studies to validate these theoretical properties. We also propose a modified two-stage residual inclusion method for studying the causal effect of an exposure in competing risks data with missing event types. Finally, we apply our proposed methods to an ongoing multi-center HIV cohort study from the East Africa Regional Consortium of the International Epidemiology Databases to Evaluate AIDS (EA-IeDEA).


2023 Copyright, the authors

Available for download on Saturday, July 26, 2025

Included in

Biostatistics Commons