Date of Award


Degree Type


Degree Name

Doctor of Philosophy (PhD)


Computer Science

First Advisor

Zhipeng Cai

Second Advisor

Yingshu Li

Third Advisor

Wei Li

Fourth Advisor

Yan Huang


Social media data has become an invaluable source of information for data mining. However, developing a high-utility social media model requires a large amount of training data, which raises serious privacy challenges. The collection and use of social media data can lead to violations of privacy rights and the misuse of personal information, making the trade-off between utility and privacy a complex issue. This dissertation examines that trade-off in social media data mining from several perspectives. First, it explores how to balance the robustness and fidelity of a social media data mining model through the design of the model structure. Specifically, it analyzes the use of a pairwise graph convolutional network structure to enhance the model's resistance to adversarial attacks while maintaining accuracy. Second, it examines the trade-off between the privacy and utility of social media data in the training framework, using a federated learning framework to investigate the impact of centralized versus decentralized training on privacy protection and model performance. Finally, the dissertation turns to graph de-anonymization and presents a neural network-based approach to the problem, exploring ways to improve the efficiency and performance of graph de-anonymization through graph embedding vectors. Extensive experiments validate the feasibility of the proposed framework from both utility and privacy perspectives. The results demonstrate that an appropriate model or framework design can reasonably balance privacy and utility in social media data mining.

