Graduation Date

Summer 8-9-2019

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Programs

Biostatistics

First Advisor

Christopher Wichman

Second Advisor

Jane Meza

Third Advisor

Kendra Schmid

Fourth Advisor

Elizabeth Wellsandt

Abstract

Bounded data occur frequently in clinical and research settings and often give rise to uncorrectable skew and heteroscedasticity. For example, in neuropsychology, most neurocognitive tests are bounded, and subjects are measured repeatedly over time, so the statistician must choose a model that accounts for the correlation among the repeated measures. The Beta distribution is a natural choice for modeling bounded data. Currently, generalized linear mixed models (GLMMs) and generalized estimating equations (GEEs) are two methods that can be used to model Beta distributed data with repeated measures. However, both have limitations: GLMMs require numerical integration, and GEEs are not based on a joint likelihood, which makes model selection more ambiguous. Therefore, we present two alternative models (LNMVB and SLMVB) that are based on a joint likelihood and do not require numerical integration to estimate the model parameters. We compare our proposed models to the Beta GLMM and the Beta GEE using simulated data and a real dataset from the National NeuroAIDS Tissue Consortium. Through simulation, we found that the LNMVB and the Beta GEE were the only models that produced unbiased estimates of the location parameter in all scenarios considered. The LNMVB tended to control the Type I error rate better than the Beta GEE, especially for smaller sample sizes (N < 30). The coverage probabilities for both the LNMVB and the Beta GEE tended toward 95% as sample size increased, with the LNMVB generally closer to the desired 95% coverage probability. Lastly, the Beta GEE was the only model that consistently had a mean bias near zero when estimating the correlation parameter. Based on the simulated data, we conclude that the LNMVB is preferred for analyzing small-sample (N < 30), repeatedly measured proportional data. Either the LNMVB or the Beta GEE is sufficient for analyzing large-sample (N > 50), correlated Beta distributed data. Furthermore, if the correlation is the parameter of interest, the Beta GEE is the preferred model.

Included in

Biostatistics Commons
