Interval-based Multivariate Time Series Analysis with Applications to Space Weather Forecasting
Ji, Anli
Abstract
Time series analysis, a foundational field within data mining, is dedicated to extracting patterns from sequential data gathered over time. While the sequential nature of such data is crucial for event forecasting, the analysis grows more complex for multivariate data because of temporal dependencies and unknown correlations among the individual variables. Identifying meaningful patterns, especially those embedded within shorter subsequences or localized intervals, is particularly challenging.
This dissertation presents a comprehensive study of interval-based multivariate time series analysis. Building upon the Time Series Forest (TSF) model, this research proposes a novel variant, termed SLIding Window Multivariate Time Series Forest (SLIM-TSF). This model integrates global feature extraction by using sliding windows to capture multi-scale temporal dependencies and interaction information that are often missed in standard analyses. Statistical attributes such as mean, standard deviation, and slope summarize the distribution characteristics of each segment (i.e., each interval), enabling detailed monitoring of local variations and more accurate identification of patterns that might not be visible at a global scale, such as sudden changes or specific events.
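The interval summarization described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the window lengths, the step size, and the dictionary layout are assumptions made for illustration, and the forest that would consume these features is omitted. The sketch slides fixed-length windows over each variable of one sample and records the mean, standard deviation, and least-squares slope of every interval.

```python
from statistics import mean, pstdev


def slope(values):
    """Least-squares slope of the values against their index 0..n-1."""
    n = len(values)
    if n < 2:
        return 0.0
    t_mean = (n - 1) / 2
    v_mean = mean(values)
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def interval_features(series, window, step):
    """Summarize each sliding window as (start, end, mean, std, slope)."""
    feats = []
    for start in range(0, len(series) - window + 1, step):
        seg = series[start:start + window]
        feats.append((start, start + window, mean(seg), pstdev(seg), slope(seg)))
    return feats


def multivariate_features(sample, windows, step):
    """Build one flat feature dictionary for a multivariate sample.

    sample  : dict mapping a variable name to its list of observations
    windows : window lengths to use (several lengths give several scales)
    step    : stride between consecutive windows (an assumed parameter)
    """
    out = {}
    for name, series in sample.items():
        for w in windows:
            for start, end, m, s, k in interval_features(series, w, step):
                out[(name, start, end, "mean")] = m
                out[(name, start, end, "std")] = s
                out[(name, start, end, "slope")] = k
    return out


# Toy sample with two hypothetical variables of six observations each.
sample = {
    "var_a": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "var_b": [2.0, 2.0, 2.0, 8.0, 8.0, 8.0],
}
features = multivariate_features(sample, windows=[3, 6], step=3)
```

Each key records the variable, the interval boundaries, and the statistic, so any feature that a downstream classifier finds useful can be traced back to a specific interval of a specific variable.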
The effectiveness of SLIM-TSF is demonstrated through extensive case studies on real-world solar event datasets from the SWAN-SF benchmark, highlighting the model's practical relevance and scientific applicability. Results confirm that the model improves classification capabilities in multivariate feature spaces, outperforming traditional methods. Furthermore, this research enhances interpretability by using Gini-based and permutation-based rankings to pinpoint the most influential features and intervals. The proposed methodologies contribute to the advancement of time series classification, offering a deeper understanding of feature-driven interval analysis in high-dimensional temporal datasets.
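To make the idea of permutation-based ranking concrete, the following is a minimal sketch under stated assumptions. The threshold rule `toy_predict` stands in for a trained forest, the toy data are synthetic, and all names are hypothetical; Gini-based importance, which requires the tree splits themselves, is not shown. The importance of a feature is taken here as the average drop in accuracy when that feature's column is shuffled.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(labels)


def permutation_importance(predict, rows, labels, n_repeats=5, seed=0):
    """Average accuracy drop per feature when its column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by shuffled values.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic data: the label depends only on feature 0; feature 1 is noise.
rows = [[float(i), float(i % 3)] for i in range(20)]
labels = [1 if row[0] >= 10 else 0 for row in rows]


def toy_predict(row):
    """Stand-in for a trained classifier (assumption for illustration)."""
    return 1 if row[0] >= 10 else 0


scores = permutation_importance(toy_predict, rows, labels)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

When the feature columns are interval statistics, a ranking of this kind points to both the variable and the time interval that the classifier relies on most.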
