Degree Name

Master of Public Health (MPH)


Public Health

First Advisor

Ruiyan Luo

Second Advisor

Ike Okosun


INTRODUCTION: Diabetic ketoacidosis (DKA) is a serious, life-threatening complication among pediatric patients with Type 1 diabetes. Even a single episode of DKA can predispose a patient to further episodes, compounding the complications and risks.

AIM: The aim of this study is to use LASSO, a penalized regression method for variable selection, to identify novel risk factors for DKA.

METHODS: The T1D Exchange dataset was used to apply a variable selection technique to DKA among pediatric patients in the United States. With DKA as a binary outcome, LASSO (L1-penalized) regression was fit with the HPGENSELECT procedure to create sparse models for variable selection.
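
To illustrate how an L1 penalty yields a sparse model for a binary outcome, the following is a minimal sketch in plain Python. It is not the study's code: the analysis itself used the HPGENSELECT procedure on T1D Exchange data, whereas this example uses synthetic data, a simple proximal gradient loop, and hypothetical names and settings (`fit_l1_logistic`, `penalty=0.1`, five simulated predictors) chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty: shrinks toward zero and
    # returns exactly 0.0 for small values, which is what makes the model sparse.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(rows, labels, penalty, step=1.0, iterations=300):
    # L1-penalized logistic regression via proximal gradient descent.
    # The intercept is left unpenalized. Illustrative only.
    n, p = len(rows), len(rows[0])
    intercept, weights = 0.0, [0.0] * p
    for _ in range(iterations):
        grad_b, grad_w = 0.0, [0.0] * p
        for x, y in zip(rows, labels):
            linear = intercept + sum(w * v for w, v in zip(weights, x))
            residual = sigmoid(linear) - y
            grad_b += residual
            for j in range(p):
                grad_w[j] += residual * x[j]
        intercept -= step * grad_b / n
        weights = [
            soft_threshold(w - step * g / n, step * penalty)
            for w, g in zip(weights, grad_w)
        ]
    return intercept, weights


# Synthetic data (an assumption for this sketch): only the first two
# predictors affect the simulated binary outcome.
random.seed(0)
true_weights = [2.0, -2.0, 0.0, 0.0, 0.0]
rows, labels = [], []
for _ in range(600):
    x = [random.gauss(0.0, 1.0) for _ in true_weights]
    prob = sigmoid(sum(w * v for w, v in zip(true_weights, x)))
    labels.append(1 if random.random() < prob else 0)
    rows.append(x)

intercept, weights = fit_l1_logistic(rows, labels, penalty=0.1)
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("selected predictor indices:", selected)
```

Predictors whose coefficients are shrunk exactly to zero drop out of the model, and the remaining ones are the "selected" variables. In practice the penalty would be tuned (for example by cross-validation or an information criterion) rather than fixed as it is here.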

RESULTS: The following modifiable variables were selected: number of blood glucose checks per patient per day, BMI, albumin-to-creatinine ratio, systolic and diastolic blood pressure, blood urea nitrogen (BUN) levels, having a hypoglycemic event in the previous three months, and lipid levels (HDL, LDL, total cholesterol, and triglycerides). The following non-modifiable variables were selected: age, diabetes duration in years, height, and months from exam date. The model produced an acceptable area under the ROC curve (AUC) for predictive ability.
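
For readers unfamiliar with the AUC, it can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case. The sketch below computes it directly from that definition. The function name `auc` and the four example scores are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    # AUC as the fraction of (case, non-case) pairs in which the case
    # has the higher score; tied pairs count as half.
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up predicted risks for two non-cases (label 0) and two cases (label 1).
example_scores = [0.1, 0.4, 0.35, 0.8]
example_labels = [0, 0, 1, 1]
print(auc(example_scores, example_labels))  # prints 0.75
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from non-cases.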

DISCUSSION: Finding modifiable risk factors for DKA in pediatric patients remains challenging, though vitally important. The data in this study were collected retrospectively and on a voluntary basis, so predictive models built from them should be interpreted with caution. Machine learning techniques offer the potential to identify novel risk factors for DKA among pediatric patients if electronic health records (EHR) are used and the dataset is large enough.

