Date of Award

Summer 8-8-2024

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Rafal Angryk

Second Advisor

Dustin Kempton

Abstract

Most classification techniques assume that training data is evenly distributed across classes. However, well-balanced data is rare, and infrequent events are often the most valuable. This well-known class imbalance issue severely hinders the performance of supervised algorithms, limiting their ability to accurately predict minority classes and resulting in analyses that lack practical operational value. Generative models in machine learning are designed to learn the underlying patterns or distribution of existing data, enabling them to generate new data that resembles the original dataset. Inspired by the success of Generative Adversarial Networks (GANs) in synthetic image generation, we employed a Conditional GAN (CGAN) to generate synthetic multivariate time series data on a benchmark dataset for solar flare forecasting. Our experiments show that the synthetic data produced is statistically comparable to real data and effective in addressing class imbalance. However, CGANs are typically trained on well-processed and balanced datasets such as MNIST and CIFAR-10, and they face challenges when trained on imbalanced datasets. These challenges can degrade CGAN performance, reducing the quality of synthetic samples, especially for minority classes. To handle this issue, we propose a Two-stage CGAN framework that enhances the quality and diversity of synthetic samples for minority classes in both image and time series generation tasks. We also introduce FFAD, a novel internal evaluation metric specifically designed to assess the fidelity of synthetic time series data and evaluate generative model performance. Our comprehensive evaluation on solar flare forecasting demonstrates the efficacy of our CGAN-based approach in mitigating class imbalance issues, highlighting its potential to enhance predictive capabilities for rare but critical events across various domains.

DOI

https://doi.org/10.57709/37354731
