Date of Award
Summer 8-8-2024
Degree Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Science
First Advisor
Rafal Angryk
Second Advisor
Dustin Kempton
Abstract
Most classification techniques assume a uniform distribution of training data classes. However, well-balanced data is rare, and infrequent events are often the most valuable. This well-known class imbalance issue severely hinders the performance of supervised algorithms, limiting their ability to accurately predict minority classes and resulting in analyses that lack practical operational value. Generative models in machine learning are designed to learn the underlying patterns or distribution of existing data, enabling them to generate new data that resembles the original dataset. Inspired by the success of Generative Adversarial Networks (GANs) in synthetic image generation, we employed a Conditional GAN (CGAN) to generate synthetic multivariate time series data on a benchmark dataset for solar flare forecasting. Our experiments show that the synthetic data produced is statistically comparable to real data and effective in addressing class imbalance. However, CGANs, which are typically trained on well-processed and balanced datasets like MNIST and CIFAR-10, face challenges when trained on imbalanced datasets. These challenges can negatively impact CGAN performance, reducing the quality of synthetic samples, especially for minority classes. To handle this issue, we propose a Two-stage CGAN framework that enhances the quality and diversity of synthetic samples for minority classes in both image and time series generation tasks. We also introduce FFAD, a novel internal evaluation metric specifically designed to assess the fidelity of synthetic time series data and evaluate generative model performance. Our comprehensive evaluation on solar flare forecasting demonstrates the efficacy of our CGAN-based approach in mitigating class imbalance issues, highlighting its potential to enhance predictive capabilities for rare but critical events across various domains.
DOI
https://doi.org/10.57709/37354731
Recommended Citation
Chen, Yang, "Mitigating Class Imbalance in Time Series Classification via Generative Modeling." Dissertation, Georgia State University, 2024.
doi: https://doi.org/10.57709/37354731