Author ORCID Identifier

0000-0002-6562-9825

Date of Award

8-8-2023

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Wei Li

Second Advisor

Zhipeng Cai

Third Advisor

Yingshu Li

Fourth Advisor

Yanqing Wang

Abstract

Driven by the need to process huge amounts of data, provide high-quality service, and protect user privacy in the Artificial Intelligence of Things (AIoT), Federated Learning (FL) has been adopted as a promising technique to facilitate broad AIoT applications. Although privacy-preserving FL has attracted considerable attention from several directions, existing research still falls short in real applications. In this dissertation, we propose three privacy-related studies that address three realistic weaknesses of federated learning in AIoT scenarios: private data inference, private data generation, and private data deletion, each arising at a different stage of the data life cycle. First, to solve the privacy inference problem of traditional FL, we design a dual differentially private FL mechanism that efficiently preserves privacy for both the server side and the local clients. In particular, the proposed method focuses on FL with non-independent and identically distributed (non-i.i.d.) data and provides a theoretical analysis of both privacy leakage and algorithm convergence. Second, to generate heterogeneous data privately in FL, we design a distributed generative model framework that can learn a powerful generator in hierarchical AIoT systems. Third, we investigate the newly emerged machine unlearning problem, which is to remove a data point and its influence from a trained machine learning model efficiently and effectively. As the first work in the literature on exact federated machine unlearning, we design a quantization-based method that can remove the data to be unlearned from multiple clients with a significantly higher speed-up. All of the proposed methods are evaluated on different datasets, and the results show that our models outperform existing baselines.

DOI

https://doi.org/10.57709/35862659
