AITopics | crossformer

Improving day-ahead Solar Irradiance Time Series Forecasting by Leveraging Spatio-Temporal Context

Neural Information Processing Systems | Apr-24-2026, 10:31:13 GMT

The inherent variability of solar irradiance poses a significant challenge for integrating solar power into the electrical grid. While most prior research has relied on purely time-series-based methods for solar forecasting, only a few studies have accounted for factors such as cloud cover or the surrounding physical context. In this paper, we propose a deep learning architecture that harnesses spatio-temporal context from satellite data to produce accurate day-ahead time-series forecasts for any given station, with a particular emphasis on Global Horizontal Irradiance (GHI). We also suggest a methodology to extract a distribution for each time-step prediction, which can serve as a measure of the uncertainty attached to the forecast. For evaluation, we propose a testing scheme that separates particularly difficult examples from easy ones in order to capture model performance in crucial situations, which in this study are days with varying cloud conditions. Furthermore, we present a new multi-modal dataset that gathers satellite imagery over a large zone together with time series of solar irradiance and other related physical variables from multiple geographically diverse solar stations. Our approach exhibits robust performance in solar irradiance forecasting, including zero-shot generalization tests at unobserved solar stations, and holds great promise for promoting the effective integration of solar power into the grid.

forecasting, machine learning, natural language, (17 more...)

Country:

North America > United States (0.46)
North America > Canada > Quebec (0.28)
Africa > Middle East (0.28)
Europe > Spain (0.28)

Genre: Research Report (0.93)

Industry:

Energy > Renewable > Solar (1.00)
Energy > Power Industry (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Addressing Spatial-Temporal Heterogeneity: General Mixed Time Series Analysis via Latent Continuity Recovery and Alignment

Neural Information Processing Systems | Feb-19-2026, 03:55:10 GMT

Empirically, MiTSformer achieves consistent SOTA on five mixed time series analysis tasks, including classification, extrinsic regression, anomaly detection, imputation, and long-term forecasting.

machine learning, mitsformer, natural language, (21 more...)

Country: Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.68)

070a57c5ef1e58cc90201b11d369b3c2-Paper-Conference.pdf

Neural Information Processing Systems | Feb-7-2026, 12:26:19 GMT

crossvivit, forecasting, prediction, (13 more...)

Country:

North America > Canada > Quebec > Montreal (0.04)
Africa > Middle East > Algeria > Tamanrasset Province > Tamanrasset (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(2 more...)

Genre: Research Report (0.93)

Industry:

Energy > Power Industry (1.00)
Energy > Renewable > Solar (0.48)
Government > Regional Government > North America Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
(3 more...)

Stiff Circuit System Modeling via Transformer

Yan, Weiman, Chang, Yi-Chia, Zhao, Wanyu

arXiv.org Artificial Intelligence | Oct-30-2025

Accurate and efficient circuit behavior modeling is a cornerstone of modern electronic design automation. Among different types of circuits, stiff circuits are challenging to model using previous frameworks. In this work, we propose a new approach using Crossformer, which is a current state-of-the-art Transformer model for time-series prediction tasks, combined with Kolmogorov-Arnold Networks (KANs), to model stiff circuit transient behavior. By leveraging the Crossformer's temporal representation capabilities and the enhanced feature extraction of KANs, our method achieves improved fidelity in predicting circuit responses to a wide range of input conditions. Experimental evaluations on datasets generated through SPICE simulations of analog-to-digital converter (ADC) circuits demonstrate the effectiveness of our approach, with significant reductions in training time and error rates.

artificial intelligence, crossformer, machine learning, (19 more...)

2510.24727

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

A Comparative Study of Transformer-Based Models for Multi-Horizon Blood Glucose Prediction

Karagoz, Meryem Altin, Breton, Marc D., Fathi, Anas El

arXiv.org Artificial Intelligence | May-15-2025

Accurate blood glucose prediction can enable novel interventions for type 1 diabetes treatment, including personalized insulin and dietary adjustments. Although recent advances in transformer-based architectures have demonstrated the power of attention mechanisms in complex multivariate time series prediction, their potential for blood glucose (BG) prediction remains underexplored. We present a comparative analysis of transformer models for multi-horizon BG prediction, examining forecasts up to 4 hours and input history up to 1 week. The publicly available DCLP3 dataset (n=112) was split (80%-10%-10%) for training, validation, and testing, and the OhioT1DM dataset (n=12) served as an external test set. We trained networks with point-wise, patch-wise, series-wise, and hybrid embeddings, using CGM, insulin, and meal data. For short-term blood glucose prediction, Crossformer, a patch-wise transformer architecture, achieved the lowest 30-minute prediction RMSE (15.6 mg/dL on OhioT1DM). For longer-term predictions (1h, 2h, and 4h), PatchTST, another patch-wise transformer, prevailed with the lowest RMSE (24.6 mg/dL, 36.1 mg/dL, and 46.5 mg/dL on OhioT1DM). In general, models that used tokenization through patches demonstrated improved accuracy with larger input sizes, with the best results obtained with a one-week history. These findings highlight the promise of transformer-based architectures for BG prediction by capturing and leveraging seasonal patterns in multivariate time-series data to improve accuracy.
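To make the terms in the abstract above concrete, here is a minimal sketch of patch-wise tokenization and of the RMSE metric. It is an illustration only, not the code of the study or of Crossformer or PatchTST: the function names `make_patches` and `rmse`, the patch length, the stride, and the toy readings are all assumptions introduced here.

```python
import numpy as np


def make_patches(series, patch_len, stride):
    """Cut a (time, channels) array into patches.

    Returns an array of shape (channels, n_patches, patch_len).
    Hypothetical helper: real models follow this step with a learned
    projection of each patch into an embedding vector.
    """
    series = np.asarray(series, dtype=float)
    if series.ndim != 2:
        raise ValueError("series must have shape (time, channels)")
    n_steps = series.shape[0]
    if n_steps < patch_len:
        raise ValueError("series is shorter than one patch")
    starts = range(0, n_steps - patch_len + 1, stride)
    # Each slice is (patch_len, channels); transpose so channels come first.
    return np.stack([series[s:s + patch_len].T for s in starts], axis=1)


def rmse(y_true, y_pred):
    """Root mean squared error between two equally shaped arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


if __name__ == "__main__":
    # Toy input (made up): 12 time steps, 2 channels, e.g. CGM and insulin.
    toy = np.arange(24).reshape(12, 2)
    patches = make_patches(toy, patch_len=4, stride=4)
    print(patches.shape)
    print(rmse([100.0, 120.0, 140.0], [110.0, 115.0, 150.0]))
```

With non-overlapping patches (stride equal to patch length), a longer input history adds tokens at a rate of one per patch rather than one per time step, which is one reason patch-based models can afford long lookback windows.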

machine learning, natural language, prediction, (20 more...)

2505.08821

Country: North America > United States > Virginia (0.28)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

CrossFormer: Cross-Segment Semantic Fusion for Document Segmentation

Ni, Tongke, Fan, Yang, Zhou, Junru, Wu, Xiangping, Chen, Qingcai

arXiv.org Artificial Intelligence | Apr-2-2025

Text semantic segmentation involves partitioning a document into multiple paragraphs with continuous semantics based on the subject matter, contextual information, and document structure. Traditional approaches have typically relied on preprocessing documents into segments to address input length constraints, resulting in the loss of critical semantic information across segments. To address this, we present CrossFormer, a transformer-based model featuring a novel cross-segment fusion module that dynamically models latent semantic dependencies across document segments, substantially elevating segmentation accuracy. Additionally, CrossFormer can replace rule-based chunk methods within the Retrieval-Augmented Generation (RAG) system, producing more semantically coherent chunks that enhance its efficacy. Comprehensive evaluations confirm CrossFormer's state-of-the-art performance on public text semantic segmentation datasets, alongside considerable gains on RAG benchmarks.

large language model, machine learning, natural language, (19 more...)

2503.23671

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
South America > Colombia > Bolivar Department > Cartagena (0.04)
North America > United States > New Mexico > Doña Ana County > Las Cruces (0.04)
(12 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)

Channel Dependence, Limited Lookback Windows, and the Simplicity of Datasets: How Biased is Time Series Forecasting?

Abdelmalak, Ibram, Madhusudhanan, Kiran, Choi, Jungmin, Stubbemann, Maximilian, Schmidt-Thieme, Lars

arXiv.org Artificial Intelligence | Feb-13-2025

Time-series forecasting research has converged on a small set of datasets and a standardized collection of evaluation scenarios. Some standardization is needed to keep research comparable. However, the underlying assumption is that the considered setting is representative of the problem as a whole. In this paper, we challenge this assumption and show that the current scenario gives a strongly biased perspective on the state of time-series forecasting research. More specifically, we show that the current evaluation scenario is heavily biased by the simplicity of the current datasets. We furthermore emphasize that, when the lookback window is properly tuned, current models usually do not need any information flow across channels. However, when using more complex benchmark data, the situation changes: here, modeling channel interactions in a sophisticated manner does improve performance. Furthermore, in this complex evaluation scenario, Crossformer, a method regularly neglected as an important baseline, is the SOTA method for time series forecasting. Based on this, we present the Fast Channel-dependent Transformer (FaCT), a simplified version of Crossformer that closes the runtime gap between Crossformer and TimeMixer, leading to an efficient model for complex forecasting datasets.

data mining, machine learning, natural language, (19 more...)

2502.09683

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany > Lower Saxony (0.04)
North America > United States > New York (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

A Unified Hyperparameter Optimization Pipeline for Transformer-Based Time Series Forecasting Models

Xu, Jingjing, Wu, Caesar, Li, Yuan-Fang, Danoy, Grégoire, Bouvry, Pascal

arXiv.org Artificial Intelligence | Jan-2-2025

Transformer-based models for time series forecasting (TSF) have attracted significant attention in recent years due to their effectiveness and versatility. However, these models often require extensive hyperparameter optimization (HPO) to achieve the best possible performance, and a unified pipeline for HPO in transformer-based TSF remains lacking. In this paper, we present one such pipeline and conduct extensive experiments on several state-of-the-art (SOTA) transformer-based TSF models. These experiments are conducted on standard benchmark datasets to evaluate and compare the performance of different models, generating practical insights and examples. Our pipeline is generalizable beyond transformer-based architectures and can be applied to other SOTA models, such as Mamba and TimeMixer, as demonstrated in our experiments. The goal of this work is to provide valuable guidance to both industry practitioners and academic researchers in efficiently identifying optimal hyperparameters suited to their specific domain applications. The code and complete experimental results are available on GitHub.

dataset, hyperparameter, time series forecasting, (12 more...)

2501.01394

Country: Oceania > Australia > Victoria (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Energy > Renewable (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation

Doshi, Ria, Walke, Homer, Mees, Oier, Dasari, Sudeep, Levine, Sergey

arXiv.org Artificial Intelligence | Aug-21-2024

Modern machine learning systems rely on large datasets to attain broad generalization, and this often poses a challenge in robot learning, where each robotic platform and task might have only a small dataset. By training a single policy across many different kinds of robots, a robot learning method can leverage much broader and more diverse datasets, which in turn can lead to better generalization and robustness. However, training a single policy on multi-robot data is challenging because robots can have widely varying sensors, actuators, and control frequencies. We propose CrossFormer, a scalable and flexible transformer-based policy that can consume data from any embodiment. We train CrossFormer on the largest and most diverse dataset to date, 900K trajectories across 20 different robot embodiments. We demonstrate that the same network weights can control vastly different robots, including single and dual arm manipulation systems, wheeled robots, quadcopters, and quadrupeds. Unlike prior work, our model does not require manual alignment of the observation or action spaces. Extensive experiments in the real world show that our method matches the performance of specialist policies tailored for each embodiment, while also significantly outperforming the prior state of the art in cross-embodiment learning.

dataset, embodiment, robot, (14 more...)

2408.11812

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (0.82)

Industry: Transportation > Air (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Take an Irregular Route: Enhance the Decoder of Time-Series Forecasting Transformer

Shen, Li, Wei, Yuning, Wang, Yangzhu, Li, Hongguang

arXiv.org Artificial Intelligence | Dec-10-2023

With the development of Internet of Things (IoT) systems, precise long-term forecasting methods are required for decision makers to evaluate current statuses and formulate future policies. Currently, Transformer and MLP are the two paradigms for deep time-series forecasting, and the former is more prevalent by virtue of its attention mechanism and encoder-decoder architecture. However, data scientists seem more willing to investigate the encoder, leaving the decoder neglected. Some researchers even adopt linear projections in lieu of the decoder to reduce complexity. We argue that both extracting the features of the input sequence and capturing the relations between the input and prediction sequences, which are the respective functions of the encoder and the decoder, are of paramount significance. Motivated by the success of feature pyramid networks (FPN) in computer vision, we propose FPPformer, which uses a bottom-up architecture in the encoder and a top-down architecture in the decoder to build a full and rational hierarchy. In this work, patch-wise attention is further developed by combining it with a revamped element-wise attention, and the form of this combination differs between the encoder and the decoder. Extensive experiments with six state-of-the-art baselines on twelve benchmarks verify the promising performance of FPPformer and the importance of carefully designing the decoder in time-series forecasting Transformers. The source code is released at https://github.com/OrigamiSL/FPPformer.

decoder, fppformer, sequence, (15 more...)

2312.05792

Country:

North America > United States > California (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Smart Houses & Appliances (0.55)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)