Date of Award

12-16-2020

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Yi Pan

Second Advisor

Yanqing Zhang

Third Advisor

Rolando Estrada

Fourth Advisor

Yichuan Zhao

Abstract

Deep learning has achieved strong performance in various areas, such as computer vision, natural language processing, and speech recognition. In this research, we design methods to improve the prediction performance and decrease the training time of deep learning models. In Chapter 2, we first propose an efficient evolutionary algorithm (EA) to automatically tune the hyperparameters of a deep learning model. We use a variable-length genetic algorithm (GA) to systematically and automatically tune the hyperparameters of a Convolutional Neural Network (CNN) to improve its performance. Experimental results show that our algorithm can find good CNN model hyperparameters efficiently. In Chapter 3, we propose a method to intelligently freeze layers during the training process to decrease training time. Our method involves designing a formula to calculate normalized gradient differences for all layers with weights in the model and then using the calculated values to decide how many layers should be frozen. We implemented our method on top of stochastic gradient descent (SGD) and performed experiments on the standard image classification dataset CIFAR-10. Results show that our method can accelerate training on VGG nets, ResNets, and DenseNets while maintaining similar test accuracy. Next, in Chapter 4, we propose to incorporate prior knowledge into the training process to improve classification accuracy. We incorporate class similarity knowledge into CNN models using a graph convolution layer. We evaluate our method on two benchmark image datasets, MNIST and CIFAR-10, and analyze the results across different data and model sizes. Experimental results show that our model can improve classification accuracy, especially when the amount of available data is small. In Chapter 5, we map Electronic Health Records (EHRs) to images and feed them to CNNs for feature relationship learning. The relationships between EHR features are quantitatively measured before mapping to images. We add this relationship-learning part as a boosting module on top of the original machine learning model. Experimental results show that our proposed models outperform the baseline models. In summary, this research proposes various methods to improve the training time and performance of deep learning models.

DOI

https://doi.org/10.57709/20465697
