Date of Award
8-7-2024
Degree Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Science
First Advisor
Shihao Ji
Second Advisor
Rajshekhar Sunderraman
Third Advisor
Murray Patterson
Fourth Advisor
Wenzhan Song
Abstract
Deep Neural Networks (DNNs) have achieved remarkable success across a wide range of applications. However, the growing number of parameters in state-of-the-art architectures brings challenges such as overfitting and high computational cost. Moreover, with the rising adoption of large language models (LLMs) and the increasing demand for per-user or per-task model customization, parameter-efficient fine-tuning has become essential. Neural network efficiency has therefore emerged as an active research area that seeks to preserve model performance while minimizing resource usage.
This dissertation explores neural network efficiency along two directions: pruning and parameter-efficient fine-tuning. Three novel pruning algorithms are introduced: L0-ARM, NPN, and Dep-L0. L0-ARM enhances L0-based pruning with the Augment-Reinforce-Merge (ARM) gradient estimator and demonstrates superior performance in sparsifying networks. Building on L0-ARM, the Neural Plasticity Network (NPN) enables both network pruning and expansion within a single framework. To address the inconsistent performance of L0-based methods on large-scale tasks, Dep-L0 introduces dependency-enabled L0 regularization, which models the dependencies among the binary gates rather than treating them as independent.
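To make the gate-based mechanism concrete, the sketch below shows how a layer's output neurons can be masked by stochastic binary gates whose logits are trained with the ARM estimator under an expected L0 penalty. This is a minimal PyTorch illustration under stated assumptions, not the dissertation's code: the names (ARMGatedLinear, arm_grad_phi) and the regularization weight lam are hypothetical.

```python
# Minimal sketch (illustrative, not the dissertation's code) of L0-style pruning
# with stochastic binary gates trained by the ARM gradient estimator.
import torch
import torch.nn as nn

class ARMGatedLinear(nn.Module):
    """Linear layer whose output neurons are masked by gates z ~ Bernoulli(sigmoid(phi))."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.phi = nn.Parameter(torch.zeros(out_features))  # gate logits

    def forward(self, x):
        # Sample binary gates and zero out the corresponding output neurons.
        u = torch.rand_like(self.phi)
        z = (u < torch.sigmoid(self.phi)).float()
        return self.linear(x) * z

def arm_grad_phi(layer, x, y, loss_fn, lam=1e-2):
    """One ARM estimate of d E[loss] / d phi, plus the gradient of the expected
    L0 penalty lam * sum(sigmoid(phi)). ARM uses two antithetic forward passes
    sharing the same uniform noise u:
        grad ~= (loss(z1) - loss(z2)) * (u - 1/2),
        z1 = 1[u > sigmoid(-phi)],  z2 = 1[u < sigmoid(phi)].
    """
    u = torch.rand_like(layer.phi)
    with torch.no_grad():
        z1 = (u > torch.sigmoid(-layer.phi)).float()
        z2 = (u < torch.sigmoid(layer.phi)).float()
        f1 = loss_fn(layer.linear(x) * z1, y)
        f2 = loss_fn(layer.linear(x) * z2, y)
        grad = (f1 - f2) * (u - 0.5)
        # Gradient of the expected L0 term: d sigmoid(phi)/d phi = sigmoid(phi) * sigmoid(-phi).
        grad += lam * torch.sigmoid(layer.phi) * torch.sigmoid(-layer.phi)
    return grad
```

Gates whose logits are driven strongly negative can then be removed, yielding a structurally sparser network.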
In the realm of parameter-efficient fine-tuning (PEFT), this dissertation introduces VB-LoRA, which follows a novel "divide-and-share" paradigm: the limitations of low-rank decomposition across matrix dimensions, modules, and layers are overcome by sharing parameters globally through a vector bank. VB-LoRA composes all of LoRA's low-rank matrices from the shared vector bank using a differentiable top-k admixture module. This design enables VB-LoRA to achieve extreme parameter efficiency while maintaining performance comparable to or better than state-of-the-art PEFT methods.
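The sketch below illustrates the divide-and-share idea under stated assumptions; it is not the released VB-LoRA implementation. A frozen linear layer is augmented with a rank-r update whose factors are assembled, sub-vector by sub-vector, as top-k softmax admixtures over a globally shared vector bank, so only the bank and the selection logits are trainable. The names (VectorBank, VBLoRALinear) and sizes (256 bank vectors of length 64, rank 4, k = 2) are illustrative.

```python
# Minimal sketch (illustrative, not the released VB-LoRA code) of composing
# LoRA factors from a globally shared vector bank via top-k softmax admixture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorBank(nn.Module):
    """A bank of vectors shared by all modules and layers."""
    def __init__(self, num_vectors=256, vector_dim=64):
        super().__init__()
        self.bank = nn.Parameter(torch.randn(num_vectors, vector_dim) * 0.02)

    def admix(self, logits, k=2):
        # Differentiable top-k admixture: softmax over the k largest logits,
        # then a weighted sum of the selected bank vectors.
        top_vals, top_idx = logits.topk(k, dim=-1)           # (..., k)
        weights = F.softmax(top_vals, dim=-1)                 # (..., k)
        selected = self.bank[top_idx]                         # (..., k, vector_dim)
        return (weights.unsqueeze(-1) * selected).sum(dim=-2)

class VBLoRALinear(nn.Module):
    """Frozen linear layer plus a rank-r update whose A/B factors are assembled
    sub-vector by sub-vector from the shared bank; only the bank and the
    per-sub-vector selection logits are trainable."""
    def __init__(self, base: nn.Linear, bank: VectorBank, r=4, k=2):
        super().__init__()
        self.base, self.bank, self.r, self.k = base, bank, r, k
        for p in self.base.parameters():
            p.requires_grad_(False)
        num_vectors, b = bank.bank.shape
        assert base.in_features % b == 0 and base.out_features % b == 0
        self.logits_A = nn.Parameter(torch.zeros(r, base.in_features // b, num_vectors))
        self.logits_B = nn.Parameter(torch.zeros(r, base.out_features // b, num_vectors))

    def forward(self, x):
        A = self.bank.admix(self.logits_A, self.k).reshape(self.r, -1)  # (r, in_features)
        B = self.bank.admix(self.logits_B, self.k).reshape(self.r, -1)  # (r, out_features)
        return self.base(x) + (x @ A.t()) @ B
```

Because the bank is shared globally, the number of stored parameters grows with the bank size and the selection logits rather than with the full set of per-layer low-rank matrices.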
DOI
https://doi.org/10.57709/37370410
Recommended Citation
Li, Yang, "Towards Pruning and Parameter Efficient Fine-tuning of Deep Neural Networks." Dissertation, Georgia State University, 2024.
doi: https://doi.org/10.57709/37370410