Author ORCID Identifier

https://orcid.org/0000-0002-8347-8274

Date of Award

Summer 8-7-2024

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Zhipeng Cai

Second Advisor

Daniel Takabi

Third Advisor

Wei Li

Fourth Advisor

Yi Ding

Abstract

Machine learning (ML) has emerged as a transformative technology with extensive applications in diverse fields such as computer vision, natural language processing, and audio and speech processing. The ML pipeline generally consists of two fundamental phases: training, in which models learn from labeled data, and inference, in which the trained models are deployed to make predictions on unseen data. Neural networks, renowned for their exceptional performance, are extensively adopted in practice and require large volumes of data for effective training. Although major technology companies provide Machine Learning as a Service (MLaaS) to facilitate access to ML capabilities, these services raise substantial privacy concerns, particularly when handling sensitive data such as health records or financial information. Regulations like the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) further underscore the need for privacy protection within MLaaS platforms.

This dissertation addresses these critical privacy concerns by proposing novel frameworks that protect user data throughout machine learning computations, with particular emphasis on the training phase. We explore the integration of advanced cryptographic techniques, specifically functional encryption (FE) and fully homomorphic encryption (FHE), into the ML workflow. One significant contribution of this work is the development of secure activation functions and performance enhancements to FE for privacy-preserving machine learning (PPML), enabling ML algorithms to execute on encrypted data and thereby safeguarding sensitive information. Building on this, we develop a privacy-preserving neural network pipeline for image classification that leverages FE to perform computations on encrypted data, ensuring that the underlying sensitive information remains confidential.

Additionally, we address the limitations of FE-based machine learning by developing FHE-based fine-tuning systems that adapt pre-trained models to new tasks while preserving data privacy. Finally, this work extends privacy-preserving techniques to modern ML architectures, including vision transformer models and large language models (LLMs). By incorporating privacy-preserving mechanisms into these advanced models, we demonstrate the feasibility and effectiveness of the proposed systems in safeguarding user data across a wide range of ML applications. The findings of this dissertation advance the field of privacy-preserving machine learning, providing a foundation for privacy-preserving MLaaS deployments and paving the way for future research in this critical area.

DOI

https://doi.org/10.57709/37395685


Available for download on Saturday, July 26, 2025
