Multi-view Matrix Factorization for Linear Dynamical System Estimation
Mahdi Karami, Martha White, Dale Schuurmans, Csaba Szepesvari
We consider maximum likelihood estimation of linear dynamical systems with generalized-linear observation models. Maximum likelihood is typically considered to be hard in this setting since latent states and transition parameters must be inferred jointly. Given that expectation-maximization does not scale and is prone to local minima, moment-matching approaches from the subspace identification literature have become standard, despite known statistical efficiency issues. In this paper, we instead reconsider likelihood maximization and develop an optimization-based strategy for recovering the latent states and transition parameters. Key to the approach is a two-view reformulation of maximum likelihood estimation for linear dynamical systems that enables the use of global optimization algorithms for matrix factorization. We show that the proposed estimation strategy outperforms widely used identification algorithms such as subspace identification methods, in terms of both accuracy and runtime.
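For readers unfamiliar with the setting, the kind of data such an estimator works from can be sketched as follows. This is only a simulation of a linear dynamical system with Gaussian observations (one special case of a generalized-linear observation model), not the paper's estimation method; the matrices `A` and `C`, the noise scales, and the helper names are illustrative assumptions.

```python
import random


def matvec(M, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def simulate_lds(A, C, x0, steps, state_noise=0.1, obs_noise=0.1, seed=0):
    """Simulate x_{t+1} = A x_t + w_t and y_t = C x_t + v_t.

    w_t and v_t are independent zero-mean Gaussian noise terms.
    The noise scales are assumed values chosen for illustration.
    """
    rng = random.Random(seed)
    x = list(x0)
    states, observations = [], []
    for _ in range(steps):
        # Observation: linear read-out of the latent state plus noise.
        y = [c + rng.gauss(0.0, obs_noise) for c in matvec(C, x)]
        states.append(x)
        observations.append(y)
        # Transition: linear dynamics plus noise.
        x = [a + rng.gauss(0.0, state_noise) for a in matvec(A, x)]
    return states, observations


# Hypothetical 2-dimensional latent state observed through 3 channels.
A = [[0.9, 0.1], [0.0, 0.8]]
C = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
states, observations = simulate_lds(A, C, x0=[1.0, -1.0], steps=50)
```

An estimator in this setting sees only `observations` and has to recover both `states` and the transition matrix `A`, which is why the two are described as being inferred jointly.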
Metadata Matters for Time Series: Informative Forecasting with Transformers
Dong, Jiaxiang, Wu, Haixu, Wang, Yuxuan, Zhang, Li, Wang, Jianmin, Long, Mingsheng
Time series forecasting is prevalent in extensive real-world applications, such as financial analysis and energy planning. Previous studies primarily focus on time series modality, endeavoring to capture the intricate variations and dependencies inherent in time series. Beyond numerical time series data, we notice that metadata (e.g. Inspired by this observation, we propose a Metadata-informed Time Series Transformer (MetaTST), which incorporates multiple levels of context-specific metadata into Transformer forecasting models to enable informative time series forecasting. To tackle the unstructured nature of metadata, MetaTST formalizes them into natural languages by pre-designed templates and leverages large language models (LLMs) to encode these texts into metadata tokens as a supplement to classic series tokens, resulting in an informative embedding. Further, a Transformer encoder is employed to communicate series and metadata tokens, which can extend series representations by metadata information for more accurate forecasting. This design also allows the model to adaptively learn context-specific patterns across various scenarios, which is particularly effective in handling large-scale, diverse-scenario forecasting tasks. Experimentally, MetaTST achieves state-of-the-art performance compared to advanced time series models and LLM-based methods on widely acknowledged short- and long-term forecasting benchmarks, covering both single-dataset individual and multi-dataset joint training settings. Time series forecasting is in increasing demand in real-world scenarios encompassing diverse domains, including energy, transportation, and meteorology (Weron, 2014; Lv et al., 2014; Wu et al., 2021; Wang et al., 2024b).
Motivated by the substantial practical value, deep time series models have been widely explored and achieved significant advancements, where diverse techniques are developed to capture temporal variations from historical observations for future prediction (Salinas et al., 2020; Nie et al., 2023; Liu et al., 2024a; Dong et al., 2024). Despite the success in uncovering intricate temporal patterns, relying solely on the sequence of observation values can be insufficient to guarantee accurate forecasting. Taking the example of traffic forecasting, two crossroads may exhibit similar patterns in the morning peak but will present disparate future trends due to the closing times of nearby companies.
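The template step mentioned in the MetaTST abstract (turning metadata into natural-language text before it is encoded) might look roughly like the sketch below. The template wording, the field names, and the function `metadata_to_text` are hypothetical; the abstract does not give the actual templates, and the LLM encoding step is not shown.

```python
def metadata_to_text(metadata, template="The {key} of this series is {value}."):
    """Render each metadata field as one sentence and join the sentences.

    The default template is an assumed example, not the one used by MetaTST.
    """
    return " ".join(
        template.format(key=key, value=value) for key, value in metadata.items()
    )


# Hypothetical metadata for a traffic sensor series.
text = metadata_to_text({"domain": "traffic", "sampling frequency": "hourly"})
print(text)
```

The resulting string is what a text encoder would then map to metadata tokens alongside the usual series tokens.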
Autoregressive Moving-average Attention Mechanism for Time Series Forecasting
Lu, Jiecheng, Han, Xu, Sun, Yan, Yang, Shihao
We propose an Autoregressive (AR) Moving-average (MA) attention structure that can adapt to various linear attention mechanisms, enhancing their ability to capture long-range and local temporal patterns in time series. In this paper, we first demonstrate that, for the time series forecasting (TSF) task, the previously overlooked decoder-only autoregressive Transformer model can achieve results comparable to the best baselines when appropriate tokenization and training methods are applied. Moreover, inspired by the ARMA model from statistics and recent advances in linear attention, we introduce the full ARMA structure into existing autoregressive attention mechanisms. By using an indirect MA weight generation method, we incorporate the MA term while maintaining the time complexity and parameter size of the underlying efficient attention models. We further explore how indirect parameter generation can produce implicit MA weights that align with the modeling requirements for local temporal impacts. Experimental results show that incorporating the ARMA structure consistently improves the performance of various AR attentions on TSF tasks, achieving state-of-the-art results.
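As background for the ARMA terminology, the classical ARMA(1,1) recursion from statistics can be written in a few lines. This is the textbook scalar model, not the proposed attention mechanism; the function name `arma11` and the coefficient values are illustrative assumptions.

```python
def arma11(shocks, phi, theta):
    """Compute x_t = phi * x_{t-1} + eps_t + theta * eps_{t-1}.

    `shocks` is the sequence of innovations eps_t; x and eps before the
    first step are taken to be zero.
    """
    x_prev, eps_prev = 0.0, 0.0
    series = []
    for eps in shocks:
        x = phi * x_prev + eps + theta * eps_prev
        series.append(x)
        x_prev, eps_prev = x, eps
    return series


# Response to a single unit shock at t = 0.
response = arma11([1.0, 0.0, 0.0, 0.0], phi=0.5, theta=0.4)
# response is approximately [1.0, 0.9, 0.45, 0.225]
```

In this impulse response the MA coefficient `theta` contributes only at the step right after the shock, while the AR coefficient `phi` produces the geometric decay afterwards, which mirrors the split between local and long-range effects described in the abstract.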
BACKTIME: Backdoor Attacks on Multivariate Time Series Forecasting
Lin, Xiao, Liu, Zhining, Fu, Dongqi, Qiu, Ruizhong, Tong, Hanghang
Multivariate Time Series (MTS) forecasting is a fundamental task with numerous real-world applications, such as transportation, climate, and epidemiology. While a myriad of powerful deep learning models have been developed for this task, few works have explored the robustness of MTS forecasting models to malicious attacks, which is crucial for their trustworthy employment in high-stakes scenarios. To address this gap, we dive deep into backdoor attacks on MTS forecasting models and propose an effective attack method named BackTime. By subtly injecting a few stealthy triggers into the MTS data, BackTime can alter the predictions of the forecasting model according to the attacker's intent. Specifically, BackTime first identifies vulnerable timestamps in the data for poisoning, and then adaptively synthesizes stealthy and effective triggers by solving a bi-level optimization problem with a GNN-based trigger generator. Extensive experiments across multiple datasets and state-of-the-art MTS forecasting models demonstrate the effectiveness, versatility, and stealthiness of BackTime attacks. The code is available at https://github.com/xiaolin-cs/BackTime.
MixLinear: Extreme Low Resource Multivariate Time Series Forecasting with 0.1K Parameters
Ma, Aitian, Luo, Dongsheng, Sha, Mo
Recently, there has been a growing interest in Long-term Time Series Forecasting (LTSF), which involves predicting long-term future values by analyzing a large amount of historical time-series data to identify patterns and trends. There exist significant challenges in LTSF due to its complex temporal dependencies and high computational demands. Although Transformer-based models offer high forecasting accuracy, they are often too compute-intensive to be deployed on devices with hardware constraints. On the other hand, the linear models aim to reduce the computational overhead by employing either decomposition methods in the time domain or compact representations in the frequency domain. In this paper, we propose MixLinear, an ultra-lightweight multivariate time series forecasting model specifically designed for resource-constrained devices. MixLinear effectively captures both temporal and frequency domain features by modeling intra-segment and inter-segment variations in the time domain and extracting frequency variations from a low-dimensional latent space in the frequency domain. By reducing the parameter scale of a downsampled $n$-length input/output one-layer linear model from $O(n^2)$ to $O(n)$, MixLinear achieves efficient computation without sacrificing accuracy. Extensive evaluations with four benchmark datasets show that MixLinear attains forecasting performance comparable to, or surpassing, state-of-the-art models with significantly fewer parameters ($0.1K$), which makes it well-suited for deployment on devices with limited computational capacity.
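The drop from $O(n^2)$ to $O(n)$ parameters can be illustrated with a small count. The sketch below assumes one particular way of getting there: the length-$n$ input is reshaped into $\sqrt{n}$ segments of length $\sqrt{n}$, and two small $\sqrt{n} \times \sqrt{n}$ weight matrices (one intra-segment, one inter-segment) replace the single dense $n \times n$ matrix. The abstract states only the complexity reduction, so this decomposition and the function names are assumptions, and biases are ignored.

```python
import math


def dense_params(n):
    """Weights in a single dense linear map from n inputs to n outputs."""
    return n * n


def segmented_params(n):
    """Weights under the assumed intra-/inter-segment decomposition.

    Requires n to be a perfect square so that the series splits into
    sqrt(n) segments of length sqrt(n).
    """
    s = math.isqrt(n)
    if s * s != n:
        raise ValueError("n must be a perfect square for this sketch")
    intra = s * s  # one shared matrix mixing positions within a segment
    inter = s * s  # one shared matrix mixing across segments
    return intra + inter


for n in (16, 64, 256):
    print(n, dense_params(n), segmented_params(n))
```

Under this assumption the count is $2n$ rather than $n^2$, so for $n = 64$ it is 128 weights instead of 4096.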
MMFNet: Multi-Scale Frequency Masking Neural Network for Multivariate Time Series Forecasting
Ma, Aitian, Luo, Dongsheng, Sha, Mo
Long-term Time Series Forecasting (LTSF) is critical for numerous real-world applications, such as electricity consumption planning, financial forecasting, and disease propagation analysis. LTSF requires capturing long-range dependencies between inputs and outputs, which poses significant challenges due to complex temporal dynamics and high computational demands. While linear models reduce model complexity by employing frequency domain decomposition, current approaches often assume stationarity and filter out high-frequency components that may contain crucial short-term fluctuations. In this paper, we introduce MMFNet, a novel model designed to enhance long-term multivariate forecasting by leveraging a multi-scale masked frequency decomposition approach. Extensive experimentation with benchmark datasets shows that MMFNet not only addresses the limitations of the existing methods but also consistently achieves good performance. Specifically, MMFNet achieves up to a 6.0% reduction in Mean Squared Error (MSE) compared to state-of-the-art models designed for multivariate forecasting tasks. Time series forecasting is pivotal in a wide range of domains, such as environmental monitoring (Bhandari et al., 2017), electrical grid management (Zufferey et al., 2017), financial analysis (Sezer et al., 2020), and healthcare (Zeroual et al., 2020). Accurate long-term forecasting is essential for informed decision-making and strategic planning. Traditional methods, such as autoregressive (AR) models (Nassar et al., 2004), exponential smoothing (Hyndman & Athanasopoulos, 2008), and structural time series models (Harvey, 1989), have provided a robust foundation for time series analysis by leveraging historical data to predict future values. However, real-world systems frequently exhibit complex, non-stationary behavior, with time series characterized by intricate patterns such as trends, fluctuations, and cycles.
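One way to read "multi-scale masked frequency decomposition" is: cut the series into segments at several segment lengths, move each segment to the frequency domain, multiply by a per-frequency mask, and transform back. The sketch below follows that reading with fixed masks and a naive DFT. It is an assumption-laden illustration, not MMFNet itself; the abstract does not say how the masks are obtained, and all function names are hypothetical.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; keeps only the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def masked_segments(series, segment_len, mask):
    """Apply a per-frequency mask to each non-overlapping segment."""
    if len(series) % segment_len != 0 or len(mask) != segment_len:
        raise ValueError("segment length must divide the series and match the mask")
    out = []
    for start in range(0, len(series), segment_len):
        spectrum = dft(series[start:start + segment_len])
        out.extend(idft([m * c for m, c in zip(mask, spectrum)]))
    return out


def multi_scale_masking(series, masks_by_scale):
    """Run the masking once per scale (segment length)."""
    return {
        scale: masked_segments(series, scale, mask)
        for scale, mask in masks_by_scale.items()
    }


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
views = multi_scale_masking(
    series,
    {
        4: [1.0, 0.0, 0.0, 0.0],  # keep only the zero-frequency (mean) term
        8: [1.0] * 8,             # keep every frequency: identity
    },
)
```

With these fixed masks, the scale-4 view replaces each half of the series by its mean, and the scale-8 view reproduces the input up to floating-point error. A model would presumably use masks that retain informative high-frequency components rather than discarding them wholesale.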
TiVaT: Joint-Axis Attention for Time Series Forecasting with Lead-Lag Dynamics
Ha, Junwoo, Kwon, Hyukjae, Kim, Sungsoo, Lee, Kisu, Kim, Ha Young
Multivariate time series (MTS) forecasting plays a crucial role in various real-world applications, yet simultaneously capturing both temporal and inter-variable dependencies remains a challenge. Conventional Channel-Dependent (CD) models handle these dependencies separately, limiting their ability to model complex interactions such as lead-lag dynamics. To address these limitations, we propose TiVaT (Time-Variable Transformer), a novel architecture that integrates temporal and variate dependencies through its Joint-Axis (JA) attention mechanism. TiVaT's ability to capture intricate variate-temporal dependencies, including asynchronous interactions, is further enhanced by the incorporation of Distance-aware Time-Variable (DTV) Sampling, which reduces noise and improves accuracy through a learned 2D map that focuses on key interactions. Notably, it excels in capturing complex patterns within multivariate time series, enabling it to surpass or remain competitive with state-of-the-art methods. This positions TiVaT as a new benchmark in MTS forecasting, particularly in handling datasets characterized by intricate and challenging dependencies. However, handling both temporal and inter-variable dependencies in MTS remains a challenge. MTS models are typically classified as either Channel-Independent (CI) or Channel-Dependent (CD) based on how they handle inter-variable relationships. CI models process variables independently, which makes them resilient to noise and overfitting but neglects crucial inter-variable dependencies required for complex datasets. Recent CD models, such as iTransformer (Liu et al., 2023) and CARD (Wang et al., 2024b), use Transformer architectures to model these dependencies, improving predictive accuracy.
BordIRlines: A Dataset for Evaluating Cross-lingual Retrieval-Augmented Generation
Li, Bryan, Haider, Samar, Luo, Fiona, Agashe, Adwait, Callison-Burch, Chris
Large language models excel at creative generation but continue to struggle with the issues of hallucination and bias. While retrieval-augmented generation (RAG) provides a framework for grounding LLMs' responses in accurate and up-to-date information, it still raises the question of bias: which sources should be selected for inclusion in the context? And how should their importance be weighted? In this paper, we study the challenge of cross-lingual RAG and present a dataset to investigate the robustness of existing systems at answering queries about geopolitical disputes, which exist at the intersection of linguistic, cultural, and political boundaries. Our dataset is sourced from Wikipedia pages containing information relevant to the given queries and we investigate the impact of including additional context, as well as the composition of this context in terms of language and source, on an LLM's response. Our results show that existing RAG systems continue to be challenged by cross-lingual use cases and suffer from a lack of consistency when they are provided with competing information in multiple languages. We present case studies to illustrate these issues and outline steps for future research to address these challenges. We make our dataset and code publicly available at https://github.com/manestay/bordIRlines.
After meeting, Blinken says Beijing's talk of Ukraine peace 'doesn't add up'
U.S. Secretary of State Antony Blinken underscored strong U.S. concerns about China's support for Russia's defense industrial base in talks Friday with Chinese Foreign Minister Wang Yi, saying Beijing's talk of peace in Ukraine "doesn't add up." In a meeting with Wang on the sidelines of the U.N. General Assembly in New York, Blinken said he also raised China's "dangerous and destabilizing actions" in the South China Sea and discussed improving communication between their militaries. Blinken told a news conference he and Wang also discussed ways to disrupt the flow of drugs into the United States, and the risks posed by artificial intelligence.
Automated conjecturing in mathematics with TxGraffiti
TxGraffiti is a data-driven, heuristic-based computer program developed to automate the process of generating conjectures across various mathematical domains. Since its creation in 2017, TxGraffiti has contributed to numerous mathematical publications, particularly in graph theory. In this paper, we present the design and core principles of TxGraffiti, including its roots in the original Graffiti program, which pioneered the automation of mathematical conjecturing. We describe the data collection process, the generation of plausible conjectures, and methods such as the Dalmatian heuristic for filtering out redundant or transitive conjectures. Additionally, we highlight its contributions to the mathematical literature and introduce a new web-based interface that allows users to explore conjectures interactively. While we focus on graph theory, the techniques demonstrated extend to other areas of mathematics.
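The filtering idea behind the Dalmatian heuristic can be sketched on a toy example. In the simplified reading used here, a candidate upper bound is kept only if it holds on every stored object and is strictly tighter than all previously kept bounds on at least one object. This is an illustrative approximation, not TxGraffiti's code; the function `dalmatian_filter`, the data layout, and the candidate list are assumptions.

```python
def dalmatian_filter(objects, target, candidate_bounds):
    """Filter candidate upper bounds of the form target(G) <= bound(G).

    A candidate is accepted if (1) it is true for every object and
    (2) on at least one object it is strictly tighter than every bound
    accepted so far. The result depends on the order of the candidates.
    """
    accepted = []
    for name, bound in candidate_bounds:
        # Truth test: discard bounds violated by any known object.
        if any(obj[target] > bound(obj) for obj in objects):
            continue
        # Significance test: must improve on all accepted bounds somewhere.
        if any(
            all(bound(obj) < other(obj) for _, other in accepted)
            for obj in objects
        ):
            accepted.append((name, bound))
    return [name for name, _ in accepted]


# Four small graphs with their order n, independence number alpha,
# and minimum degree.
graphs = [
    {"name": "K3", "n": 3, "alpha": 1, "min_degree": 2},
    {"name": "P3", "n": 3, "alpha": 2, "min_degree": 1},
    {"name": "C4", "n": 4, "alpha": 2, "min_degree": 2},
    {"name": "K1,3", "n": 4, "alpha": 3, "min_degree": 1},
]

candidates = [
    ("alpha <= n", lambda g: g["n"]),
    ("alpha <= min_degree", lambda g: g["min_degree"]),
    ("alpha <= n - min_degree", lambda g: g["n"] - g["min_degree"]),
    ("alpha <= n - 1", lambda g: g["n"] - 1),
]

kept = dalmatian_filter(graphs, "alpha", candidates)
print(kept)
```

On this data, "alpha <= min_degree" fails the truth test on P3, and "alpha <= n - 1" is dropped as redundant because "alpha <= n - min_degree" is at least as tight on every graph, leaving "alpha <= n" and "alpha <= n - min_degree".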