Graduation Date

Spring 5-7-2022

Document Type


Degree Name

Doctor of Philosophy (PhD)


Biomedical Informatics

First Advisor

John Windle, M.D.

Second Advisor

Stephen Scott, Ph.D.

Third Advisor

Scott Campbell, Ph.D.

Fourth Advisor

Leen-Kiat Soh, Ph.D.

MeSH Headings

Hypertension, Machine Learning, Supervised Machine Learning, Artificial Intelligence


Hypertension is the world's leading risk factor for cardiovascular disease. Forty-seven percent of Americans aged 18 and older, close to one in two, are affected. It is associated with approximately a thousand deaths per day. According to recent statistics from the Centers for Disease Control and Prevention, one in three patients with hypertension does not know they are hypertensive. Seventy-five percent of hypertensive patients have uncontrolled hypertension, meaning that they are not treated to target. While there is extensive literature on hypertension diagnosis and management, there is an apparent gap in recognizing and acknowledging that a given person is hypertensive. Moreover, a patient's blood pressure is not constant and can span all four hypertension stages delineated by the 2017 American College of Cardiology/American Heart Association guideline. Hence, hypertension is a problem list item that serves as an excellent use case for showcasing the curation of the problem list using Artificial Intelligence algorithms.
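For reference, the four categories of the 2017 American College of Cardiology/American Heart Association guideline can be written as a simple threshold rule on a single reading. The sketch below is illustrative only: the function name `bp_category` is hypothetical, and it is not the classification approach developed in this dissertation. The guideline itself bases a diagnosis on averaged readings taken on more than one occasion, which a single-reading rule does not capture.

```python
def bp_category(systolic: int, diastolic: int) -> str:
    """Return the 2017 ACC/AHA category for one reading in mm Hg.

    Illustrative helper (hypothetical name). The higher category wins
    when systolic and diastolic values fall into different categories.
    """
    if systolic >= 140 or diastolic >= 90:
        return "stage 2 hypertension"
    if systolic >= 130 or diastolic >= 80:
        return "stage 1 hypertension"
    if systolic >= 120:
        # Diastolic is already known to be below 80 at this point.
        return "elevated"
    return "normal"


if __name__ == "__main__":
    # Made-up readings for one patient, showing how the category can move.
    for reading in [(118, 76), (124, 78), (132, 78), (150, 70)]:
        print(reading, bp_category(*reading))
```

Because the rule is evaluated per reading, the same patient can move between categories from one visit to the next, which is the fluctuation the paragraph above describes.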

This dissertation presents a framework for data preprocessing and feature engineering to support the development of clinic-oriented Artificial Intelligence models for hypertension diagnosis. We also developed such models, employing dynamic Artificial Intelligence algorithms to account for the fluctuating nature of blood pressure over time. We provide an extensive discussion of the problem list and of how models like those we have developed can act as problem-knowledge couplers to assist with its curation.

The initial sample consisted of 4,956,739 outpatient clinic visits. Data preprocessing yielded 2,505,004 visits with quality data. We employed upstream machine learning models to classify the visits into the blood pressure stages needed by the downstream dynamic models. The stages predicted by the upstream models differed significantly (p = 0.0001) from hypertension status as recorded under the tenth revision of the International Classification of Diseases (ICD-10). Four variants of the final dynamic models, spanning two to five time steps, were successfully trained and tested using two approaches: a deep learning recurrent neural network and a dynamic Bayesian belief network.
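To make the idea of "time steps" concrete, the sketch below shows one common way to turn a patient's chronologically ordered visit stages into fixed-length sequences for a dynamic model. It is a minimal illustration under stated assumptions: the abstract does not describe how sequences were actually assembled, and the function name `visit_windows` and the example stage labels are hypothetical.

```python
from typing import List, Sequence, Tuple


def visit_windows(stages: Sequence[str], n_steps: int) -> List[Tuple[str, ...]]:
    """Return every run of n_steps consecutive visit stages.

    `stages` is assumed to be ordered by visit date for a single patient.
    Patients with fewer than n_steps visits yield no windows.
    """
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    return [
        tuple(stages[i:i + n_steps])
        for i in range(len(stages) - n_steps + 1)
    ]


if __name__ == "__main__":
    # Made-up visit history for one patient (not data from the study).
    history = ["normal", "elevated", "stage 1", "stage 1", "stage 2", "stage 1"]
    for n_steps in range(2, 6):  # two to five time steps
        print(n_steps, len(visit_windows(history, n_steps)))
```

Longer windows give each sequence more history but leave fewer usable sequences per patient, which is one reason to compare several window lengths.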