Artificial Intelligence Approaches for Financial Cybercrime Information Analysis
Gao, Chunlan
Abstract
Artificial Intelligence (AI) has become an indispensable tool in combating cybercrime. Using machine learning and deep learning techniques, AI systems can automatically detect fraudulent activity by recognizing patterns, identifying anomalies, and predicting potential security threats. Most existing research on financial fraud focuses on structured datasets such as transaction records and financial statements; this dissertation instead targets a more complex and challenging data source: financial fraud-related content on Telegram. Telegram data is loosely formatted, combining structured patterns (e.g., hashtags, prices) with unstructured text, slang, emojis, and embedded images, which poses unique challenges for automated analysis and classification. This dissertation presents four interrelated studies on AI-driven financial fraud detection. The first introduces AutoCut-2D, a feature selection method that adaptively determines cut-off thresholds across multiple dimensions of feature importance, improving prediction accuracy while significantly reducing computational cost. The second addresses fraud category classification of Telegram messages, comparing diverse embedding and machine learning techniques to improve model performance. The third extends this work to multimodal learning, integrating BERT-based textual embeddings with Swin Transformer-based visual embeddings through attention-based fusion; in identifying fraudulent advertisements, the fused model achieves substantially higher accuracy than either modality alone. The fourth advances toward Knowledge Base Construction (KBC) from Telegram messages, proposing a weakly supervised extraction pipeline that derives structured triples such as (brand, original price, discount price) from loosely formatted content using a combination of rule-based heuristics and machine learning methods. This KBC framework bridges the gap between unstructured communication and structured financial intelligence, providing a scalable and interpretable foundation for automated cybercrime investigation and future knowledge-driven fraud analysis.
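To make the (brand, original price, discount price) triple concrete, the following is a minimal sketch of what a rule-based extraction step might look like. It is not the dissertation's pipeline: the brand lexicon `KNOWN_BRANDS`, the dollar-only price pattern, the function name `extract_offer`, and the "higher price is the original" heuristic are all assumptions made for illustration. In a weakly supervised setting, the output of rules like these would typically serve as noisy labels for a learned model rather than as final results.

```python
import re
from typing import NamedTuple, Optional


class Offer(NamedTuple):
    """A structured triple extracted from one message."""
    brand: str
    original_price: float
    discount_price: float


# Hypothetical brand lexicon (an assumption for this sketch).
KNOWN_BRANDS = ("Rolex", "Gucci", "Nike")

# Simplified price pattern: a dollar sign followed by an amount such as 450 or 99.99.
PRICE_RE = re.compile(r"\$\s?(\d+(?:\.\d{1,2})?)")


def extract_offer(message: str) -> Optional[Offer]:
    """Return a (brand, original price, discount price) triple, or None."""
    # Rule 1: the brand is the first lexicon entry found as a whole word.
    brand = next(
        (b for b in KNOWN_BRANDS
         if re.search(rf"\b{re.escape(b)}\b", message, re.IGNORECASE)),
        None,
    )
    # Rule 2: collect every dollar amount mentioned in the message.
    prices = [float(p) for p in PRICE_RE.findall(message)]
    if brand is None or len(prices) < 2:
        return None
    # Rule 3 (heuristic): the highest amount is the original price,
    # the lowest is the discounted price.
    return Offer(brand, max(prices), min(prices))


example = extract_offer("🔥 #gucci bag was $450 now only $99.99 DM me")
```

Messages that mention no known brand, or fewer than two prices, yield `None`, which leaves them for other rules or for the learned component to handle.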
