crossformer
- North America > Canada > Quebec > Montreal (0.04)
- Africa > Middle East > Algeria > Tamanrasset Province > Tamanrasset (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (2 more...)
- Energy > Power Industry (1.00)
- Energy > Renewable > Solar (0.48)
- Government > Regional Government > North America Government (0.46)
Stiff Circuit System Modeling via Transformer
Yan, Weiman, Chang, Yi-Chia, Zhao, Wanyu
Accurate and efficient circuit behavior modeling is a cornerstone of modern electronic design automation. Among different types of circuits, stiff circuits are challenging to model using previous frameworks. In this work, we propose a new approach using Crossformer, which is a current state-of-the-art Transformer model for time-series prediction tasks, combined with Kolmogorov-Arnold Networks (KANs), to model stiff circuit transient behavior. By leveraging the Crossformer's temporal representation capabilities and the enhanced feature extraction of KANs, our method achieves improved fidelity in predicting circuit responses to a wide range of input conditions. Experimental evaluations on datasets generated through SPICE simulations of analog-to-digital converter (ADC) circuits demonstrate the effectiveness of our approach, with significant reductions in training time and error rates.
- North America > Canada > Quebec > Montreal (0.04)
- Africa > Middle East > Algeria > Tamanrasset Province > Tamanrasset (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (2 more...)
- Energy > Power Industry (1.00)
- Energy > Renewable > Solar (0.48)
- Government > Regional Government > North America Government (0.46)
A Comparative Study of Transformer-Based Models for Multi-Horizon Blood Glucose Prediction
Karagoz, Meryem Altin, Breton, Marc D., Fathi, Anas El
Accurate blood glucose prediction can enable novel interventions for type 1 diabetes treatment, including personalized insulin and dietary adjustments. Although recent advances in transformer-based architectures have demonstrated the power of attention mechanisms in complex multivariate time series prediction, their potential for blood glucose (BG) prediction remains underexplored. We present a comparative analysis of transformer models for multi-horizon BG prediction, examining forecasts up to 4 hours and input history up to 1 week. The publicly available DCLP3 dataset (n=112) was split (80%-10%-10%) for training, validation, and testing, and the OhioT1DM dataset (n=12) served as an external test set. We trained networks with point-wise, patch-wise, series-wise, and hybrid embeddings, using CGM, insulin, and meal data. For short-term blood glucose prediction, Crossformer, a patch-wise transformer architecture, achieved a superior 30-minute prediction of RMSE (15.6 mg / dL on OhioT1DM). For longer-term predictions (1h, 2h, and 4h), PatchTST, another path-wise transformer, prevailed with the lowest RMSE (24.6 mg/dL, 36.1 mg/dL, and 46.5 mg/dL on OhioT1DM). In general, models that used tokenization through patches demonstrated improved accuracy with larger input sizes, with the best results obtained with a one-week history. These findings highlight the promise of transformer-based architectures for BG prediction by capturing and leveraging seasonal patterns in multivariate time-series data to improve accuracy.
- North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
- North America > United States > Ohio (0.04)
- Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
- Research Report > Experimental Study (0.68)
- Research Report > New Finding (0.46)
CrossFormer: Cross-Segment Semantic Fusion for Document Segmentation
Ni, Tongke, Fan, Yang, Zhou, Junru, Wu, Xiangping, Chen, Qingcai
Text semantic segmentation involves partitioning a document into multiple paragraphs with continuous semantics based on the subject matter, contextual information, and document structure. Traditional approaches have typically relied on preprocessing documents into segments to address input length constraints, resulting in the loss of critical semantic information across segments. To address this, we present CrossFormer, a transformer-based model featuring a novel cross-segment fusion module that dynamically models latent semantic dependencies across document segments, substantially elevating segmentation accuracy. Additionally, CrossFormer can replace rule-based chunk methods within the Retrieval-Augmented Generation (RAG) system, producing more semantically coherent chunks that enhance its efficacy. Comprehensive evaluations confirm CrossFormer's state-of-the-art performance on public text semantic segmentation datasets, alongside considerable gains on RAG benchmarks.
- Asia > China > Guangdong Province > Shenzhen (0.04)
- South America > Colombia > Bolivar Department > Cartagena (0.04)
- North America > United States > New Mexico > Doña Ana County > Las Cruces (0.04)
- (12 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Channel Dependence, Limited Lookback Windows, and the Simplicity of Datasets: How Biased is Time Series Forecasting?
Abdelmalak, Ibram, Madhusudhanan, Kiran, Choi, Jungmin, Stubbemann, Maximilian, Schmidt-Thieme, Lars
Time-series forecasting research has converged to a small set of datasets and a standardized collection of evaluation scenarios. Such a standardization is to a specific extent needed for comparable research. However, the underlying assumption is, that the considered setting is a representative for the problem as a whole. In this paper, we challenge this assumption and show that the current scenario gives a strongly biased perspective on the state of time-series forecasting research. To be more detailed, we show that the current evaluation scenario is heavily biased by the simplicity of the current datasets. We furthermore emphasize, that when the lookback-window is properly tuned, current models usually do not need any information flow across channels. However, when using more complex benchmark data, the situation changes: Here, modeling channel-interactions in a sophisticated manner indeed enhances performances. Furthermore, in this complex evaluation scenario, Crossformer, a method regularly neglected as an important baseline, is the SOTA method for time series forecasting. Based on this, we present the Fast Channel-dependent Transformer (FaCT), a simplified version of Crossformer which closes the runtime gap between Crossformer and TimeMixer, leading to an efficient model for complex forecasting datasets.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Germany > Lower Saxony (0.04)
- North America > United States > New York (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
A Unified Hyperparameter Optimization Pipeline for Transformer-Based Time Series Forecasting Models
Xu, Jingjing, Wu, Caesar, Li, Yuan-Fang, Danoy, Grégoire, Bouvry, Pascal
Transformer-based models for time series forecasting (TSF) have attracted significant attention in recent years due to their effectiveness and versatility. However, these models often require extensive hyperparameter optimization (HPO) to achieve the best possible performance, and a unified pipeline for HPO in transformer-based TSF remains lacking. In this paper, we present one such pipeline and conduct extensive experiments on several state-of-the-art (SOTA) transformer-based TSF models. These experiments are conducted on standard benchmark datasets to evaluate and compare the performance of different models, generating practical insights and examples. Our pipeline is generalizable beyond transformer-based architectures and can be applied to other SOTA models, such as Mamba and TimeMixer, as demonstrated in our experiments. The goal of this work is to provide valuable guidance to both industry practitioners and academic researchers in efficiently identifying optimal hyperparameters suited to their specific domain applications. The code and complete experimental results are available on GitHub.
Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation
Doshi, Ria, Walke, Homer, Mees, Oier, Dasari, Sudeep, Levine, Sergey
Modern machine learning systems rely on large datasets to attain broad generalization, and this often poses a challenge in robot learning, where each robotic platform and task might have only a small dataset. By training a single policy across many different kinds of robots, a robot learning method can leverage much broader and more diverse datasets, which in turn can lead to better generalization and robustness. However, training a single policy on multi-robot data is challenging because robots can have widely varying sensors, actuators, and control frequencies. We propose CrossFormer, a scalable and flexible transformer-based policy that can consume data from any embodiment. We train CrossFormer on the largest and most diverse dataset to date, 900K trajectories across 20 different robot embodiments. We demonstrate that the same network weights can control vastly different robots, including single and dual arm manipulation systems, wheeled robots, quadcopters, and quadrupeds. Unlike prior work, our model does not require manual alignment of the observation or action spaces. Extensive experiments in the real world show that our method matches the performance of specialist policies tailored for each embodiment, while also significantly outperforming the prior state of the art in cross-embodiment learning.
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Netherlands > South Holland > Delft (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Take an Irregular Route: Enhance the Decoder of Time-Series Forecasting Transformer
Shen, Li, Wei, Yuning, Wang, Yangzhu, Li, Hongguang
With the development of Internet of Things (IoT) systems, precise long-term forecasting method is requisite for decision makers to evaluate current statuses and formulate future policies. Currently, Transformer and MLP are two paradigms for deep time-series forecasting and the former one is more prevailing in virtue of its exquisite attention mechanism and encoder-decoder architecture. However, data scientists seem to be more willing to dive into the research of encoder, leaving decoder unconcerned. Some researchers even adopt linear projections in lieu of the decoder to reduce the complexity. We argue that both extracting the features of input sequence and seeking the relations of input and prediction sequence, which are respective functions of encoder and decoder, are of paramount significance. Motivated from the success of FPN in CV field, we propose FPPformer to utilize bottom-up and top-down architectures respectively in encoder and decoder to build the full and rational hierarchy. The cutting-edge patch-wise attention is exploited and further developed with the combination, whose format is also different in encoder and decoder, of revamped element-wise attention in this work. Extensive experiments with six state-of-the-art baselines on twelve benchmarks verify the promising performances of FPPformer and the importance of elaborately devising decoder in time-series forecasting Transformer. The source code is released in https://github.com/OrigamiSL/FPPformer.
- North America > United States > California (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- Asia > China > Beijing > Beijing (0.04)
Improving day-ahead Solar Irradiance Time Series Forecasting by Leveraging Spatio-Temporal Context
Boussif, Oussama, Boukachab, Ghait, Assouline, Dan, Massaroli, Stefano, Yuan, Tianle, Benabbou, Loubna, Bengio, Yoshua
Solar power harbors immense potential in mitigating climate change by substantially reducing CO$_{2}$ emissions. Nonetheless, the inherent variability of solar irradiance poses a significant challenge for seamlessly integrating solar power into the electrical grid. While the majority of prior research has centered on employing purely time series-based methodologies for solar forecasting, only a limited number of studies have taken into account factors such as cloud cover or the surrounding physical context. In this paper, we put forth a deep learning architecture designed to harness spatio-temporal context using satellite data, to attain highly accurate \textit{day-ahead} time-series forecasting for any given station, with a particular emphasis on forecasting Global Horizontal Irradiance (GHI). We also suggest a methodology to extract a distribution for each time step prediction, which can serve as a very valuable measure of uncertainty attached to the forecast. When evaluating models, we propose a testing scheme in which we separate particularly difficult examples from easy ones, in order to capture the model performances in crucial situations, which in the case of this study are the days suffering from varying cloudy conditions. Furthermore, we present a new multi-modal dataset gathering satellite imagery over a large zone and time series for solar irradiance and other related physical variables from multiple geographically diverse solar stations. Our approach exhibits robust performance in solar irradiance forecasting, including zero-shot generalization tests at unobserved solar stations, and holds great promise in promoting the effective integration of solar power into the grid.
- North America > Canada > Quebec > Montreal (0.04)
- Africa > Middle East > Algeria > Tamanrasset Province > Tamanrasset (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (2 more...)
- Energy > Renewable > Solar (1.00)
- Energy > Power Industry (1.00)