
Trinary Predictive Classification of Diabetic Episode Recurrence

by Cathy L Farrell

Institution: Central Connecticut State University
Year: 2016
Keywords: Data mining; Diabetes
Posted: 02/05/2017
Record ID: 2109912
Full text PDF: http://content.library.ccsu.edu/u?/ccsutheses,2296


This thesis explores the value of trinary predictive modeling to the understanding of diabetes-related hospitalizations. According to the U.S. Centers for Disease Control and Prevention, diabetes is the 7th leading cause of death in the United States, killing more than 75,000 people each year. In 2010, there were more than 600,000 hospital inpatient discharges in the U.S. with a primary diagnosis of diabetes. The average stay was 4.6 days. Between 2009 and 2012, diabetes was estimated to affect more than 12% of the U.S. population, including both diagnosed and undiagnosed cases. If effective models were available to predict which patients were likely to be readmitted to the hospital, interventions might be possible to improve outcomes for such patients and to reduce the overall costs of their care. The thesis attempts to address this need with an analysis of the dataset 'Diabetes 130-US hospitals for years 1999-2008 Data Set', which was obtained from UCI’s Machine Learning Repository. Each record in the dataset represents a single hospitalization for a specific anonymized patient. The thesis explores whether an effective trinary classification model can be built to predict the likelihood of patient readmissions. The intended model would classify patient hospitalization records into three categories: those likely to be readmitted within 30 days of discharge, those likely to be readmitted more than 30 days after discharge, and those likely not to be readmitted. The study determined that it is possible to build a model that predicts the three classification categories with lower misclassification costs than the baseline model. A global segmentation model was built using Classification and Regression Modeling, with misclassification costs applied, on eight clusters in the data identified using a 1x9 Kohonen network.
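The comparison of misclassification costs described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The class labels ("<30", ">30", "NO"), the cost values, and the choice of a majority-class baseline are all assumptions made for illustration; the abstract does not state the actual cost matrix or how the baseline model was defined.

```python
from collections import Counter

# Assumed labels for the three readmission categories (illustrative only).
LABELS = ("<30", ">30", "NO")

# Hypothetical cost matrix, indexed as COST[actual][predicted].
# The values are invented for this sketch and are not taken from the thesis.
COST = {
    "<30": {"<30": 0, ">30": 2, "NO": 4},
    ">30": {"<30": 1, ">30": 0, "NO": 2},
    "NO":  {"<30": 1, ">30": 1, "NO": 0},
}


def total_cost(actual, predicted, cost=COST):
    """Sum the misclassification cost over paired actual/predicted labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(cost[a][p] for a, p in zip(actual, predicted))


def majority_baseline(actual):
    """Assumed baseline: predict the most frequent class for every record."""
    majority = Counter(actual).most_common(1)[0][0]
    return [majority] * len(actual)


# Toy records, invented for the example.
actual = ["NO", "NO", "NO", ">30", "<30"]
model_predictions = ["NO", "NO", ">30", ">30", "<30"]

baseline_cost = total_cost(actual, majority_baseline(actual))
model_cost = total_cost(actual, model_predictions)
print(f"baseline cost: {baseline_cost}, model cost: {model_cost}")
```

Under these assumed costs, a model is preferred to the baseline when its total cost is lower, even if it makes some additional low-cost errors in exchange for avoiding high-cost ones.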
Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Data Mining. Thesis advisor: Daniel Larose. M.S., Central Connecticut State University, 2016. Advisors/Committee Members: Larose, Daniel T.