Author ORCID Identifier

0000-0003-3417-7106

Date of Award

12-12-2022

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Shihao Ji

Abstract

In recent years, deep neural networks (DNNs) have achieved state-of-the-art performance on a wide range of learning tasks. Two fundamental families of models in this setting are discriminative models and generative models. The two are usually trained separately, although prior work has shown that generative training helps classifiers alleviate several notorious issues. Energy-based Models (EBMs), and in particular the Joint Energy-based Model (JEM), need only a single network with shared features for both discriminative and generative tasks. However, EBMs are expensive to train and very unstable. It is therefore crucial to understand the behavior of EBM training and to improve its stability, speed, accuracy, and generative quality together. This dissertation summarizes my research on EBMs for hybrid image discriminative-generative models. We first propose GMMC, which models the joint density p(x, y). As an alternative to the softmax classifier used in JEM, GMMC has a well-formulated latent feature distribution, which fits well with the generative process of image synthesis. We then introduce a variety of new training techniques that improve JEM's accuracy, training stability, and speed, and we name the resulting model JEM++. Building on JEM++, we analyze and improve it from three aspects: 1) the manifold, 2) the data augmentation, and 3) the energy landscape. This leads to Manifold-Aware EBM/JEM and Sharpness-Aware JEM, which further improve speed, generation quality, stability, and classification accuracy significantly. Beyond MCMC-based EBMs, we find that two recently emerged approaches, the Vision Transformer (ViT) and the Denoising Diffusion Probabilistic Model (DDPM), can be combined to learn a simple but powerful model for image classification and generation. This new direction avoids most disadvantages of EBMs, such as expensive MCMC sampling and instability. Finally, we discuss future research topics, including the speed, generation quality, and applications of hybrid models.

