Graduation Date

Fall 12-18-2020

Document Type


Degree Name

Doctor of Philosophy (PhD)



First Advisor

Hongying Dai

Second Advisor

Jane Meza

Third Advisor

Kendra Schmid

Fourth Advisor

Steven From


Small area estimation (SAE) is widely used to produce estimates for geographic domains such as metropolitan areas, districts, counties, or states. Direct estimation methods provide accurate estimates when the sample size within each area is sufficiently large, but large samples are often unrealistic for small geographic regions. Meanwhile, high-dimensional socio-ecological data exist at the community level, providing an opportunity for model-based estimation that incorporates rich auxiliary information at the individual and area levels. Thus, it is critical to develop advanced statistical models that extract accurate information from such data.
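To make the contrast between direct and model-based estimation concrete, the following is a minimal sketch and not the method developed in this dissertation. It computes a direct estimate (the area's own sample mean) and a generic textbook composite estimate that shrinks the direct estimate toward a model-based value. The data, the synthetic value of 5.0, and the between-area variance of 0.5 are hypothetical numbers chosen only for illustration.

```python
import statistics


def direct_estimate(sample):
    """Direct estimator: the area's own sample mean and its standard error."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    return mean, se


def composite_estimate(direct_mean, direct_se, synthetic_mean, between_area_var):
    """Shrink the direct estimate toward a model-based (synthetic) value.

    The weight gamma = between_area_var / (between_area_var + direct_se**2)
    approaches 1 when the direct estimate is precise and 0 when it is noisy.
    between_area_var is treated as known here, which is an assumption.
    """
    gamma = between_area_var / (between_area_var + direct_se ** 2)
    return gamma * direct_mean + (1.0 - gamma) * synthetic_mean


# Hypothetical data: one area with 5 respondents, one with 100.
small_area = [1.0, 2.0, 3.0, 4.0, 5.0]
large_area = small_area * 20

small_mean, small_se = direct_estimate(small_area)
large_mean, large_se = direct_estimate(large_area)

# Hypothetical synthetic value and between-area variance.
small_composite = composite_estimate(small_mean, small_se, 5.0, 0.5)
large_composite = composite_estimate(large_mean, large_se, 5.0, 0.5)

print(f"small area: direct={small_mean:.3f} (se={small_se:.3f}), composite={small_composite:.3f}")
print(f"large area: direct={large_mean:.3f} (se={large_se:.3f}), composite={large_composite:.3f}")
```

Both areas have the same sample mean, but the small area's standard error is much larger, so its composite estimate is pulled further toward the synthetic value.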

Most existing maximum likelihood estimation methods rely on complex and computationally expensive integral approximations, and some require prior assumptions for the unobserved random effects. In this dissertation, we proposed a Calibrated Hierarchical (CH) likelihood approach, which does not involve such integral approximations. This work covered three aims:

Aim 1. We developed a novel modeling approach for SAE via hierarchical generalized linear models based on the CH likelihood, with improved parameter estimation through bias correction (CHBC). Unified analysis through the h-likelihood provides flexibility in statistical inference for unobserved random variables and leads to a single algorithm, expressed as a set of interlinked and augmented generalized linear models, that can be used to fit a broad class of new models with random effects.

Aim 2. We then extended this methodology to joint modeling of multiple outcome variables through shared random effects and multivariate random effects. The joint modeling approach extends flexibly to multidimensional models with different types of outcomes by accounting for the associations among them.

Aim 3. Extensive simulation studies were carried out to assess the empirical estimation accuracy under varying scenarios. We also used COVID-19 data to study the association between confirmed cases and number of deaths based on the multivariate joint modeling approach. Joint modeling through shared random effects is illustrated using the Youth Risk Behavior Surveillance System (YRBSS) data to assess the effects of tobacco consumption at the county level. The asymptotic properties of the maximum hierarchical likelihood estimators (MHLEs) were studied. Last, we developed an R package for SAE modeling with the CHBC approach. The development version of the R package is available on
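
For context on Aim 1, the standard hierarchical likelihood for a hierarchical generalized linear model can be written as below. This is the usual textbook form, shown only to illustrate why no integration over the random effects is needed; the calibrated (CH) and bias-corrected (CHBC) versions are not specified in this abstract and are not reproduced here. The symbols (response y, random effects v, fixed effects \beta, dispersion parameters \phi and \lambda, design matrices X and Z, link function g) are notation assumed for this sketch.

```latex
\begin{align}
  % Marginal likelihood: the random effects v must be integrated out
  L(\beta, \phi, \lambda; y)
    &= \int f(y \mid v; \beta, \phi)\, f(v; \lambda)\, dv \\
  % h-likelihood: joint log-density of y and v, with no integral
  h(\beta, \phi, \lambda; y, v)
    &= \log f(y \mid v; \beta, \phi) + \log f(v; \lambda) \\
  % Linear predictor of the hierarchical generalized linear model
  g(\mu) &= X\beta + Zv, \qquad \mu = \mathrm{E}(y \mid v)
\end{align}
```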