Date of Award

Summer 8-11-2020

Degree Type


Degree Name

Doctor of Philosophy (PhD)


Computer Science

First Advisor

Rajshekhar Sunderraman

Second Advisor

Michael Weeks

Third Advisor

Yanqing Zhang

Fourth Advisor

Yi Jiang


With the rise of very powerful hardware and evolution of deep learning architectures, healthcare data analysis and its applications have been drastically transformed. These transformations mainly aim to aid a healthcare personnel with diagnosis and prognosis of a disease or abnormality at any given point of healthcare routine workflow. For instance, many of the cancer metastases detection depends on pathological tissue procedures and pathologist reviews. The reports of severity classification vary amongst different pathologist, which then leads to different treatment options for a patient. This labor-intensive work can lead to errors or mistreatments resulting in high cost of healthcare. With the help of machine learning and deep learning modules, some of these traditional diagnosis techniques can be improved and aid a doctor in decision making with an unbiased view. Some of such modules can help reduce the cost, shortage of an expertise, and time in identifying the disease.

However, there are many other datapoints that are available with medical images, such as omics data, biomarker calculations, patient demographics and history. All these datapoints can enhance disease classification or prediction of progression with the help of machine learning/deep learning modules. However, it is very difficult to find a comprehensive dataset with all different modalities and features in healthcare setting due to privacy regulations. Hence in this thesis, we explore both medical imaging data with clinical datapoints as well as genomics datasets separately for classification tasks using combinational deep learning architectures. We use deep neural networks with 3D volumetric structural magnetic resonance images of Alzheimer Disease dataset for classification of disease. A separate study is implemented to understand classification based on clinical datapoints achieved by machine learning algorithms. For bioinformatics applications, sequence classification task is a crucial step for many metagenomics applications, however, requires a lot of preprocessing that requires sequence assembly or sequence alignment before making use of raw whole genome sequencing data, hence time consuming especially in bacterial taxonomy classification. There are only a few approaches for sequence classification tasks that mainly involve some convolutions and deep neural network. A novel method is developed using an intrinsic nature of recurrent neural networks for 16s rRNA sequence classification which can be adapted to utilize read sequences directly. For this classification task, the accuracy is improved using optimization techniques with a hybrid neural network.

File Upload Confirmation