Goto

Collaborating Authors

 tram


TARSS-Net: Temporal-Aware Radar Semantic Segmentation Network

Neural Information Processing Systems

Radar signal interpretation plays a crucial role in remote detection and ranging. With the gradual display of the advantages of neural network technology in signal processing, learning-based radar signal interpretation is becoming a research hot-spot and made great progress. And since radar semantic segmentation (RSS) can provide more fine-grained target information, it has become a more concerned direction in this field. However, the temporal information, which is an important clue for analyzing radar data, has not been exploited sufficiently in present RSS frameworks. In this work, we propose a novel temporal information learning paradigm, i.e., data-driven temporal information aggregation with learned target-history relations. Following this idea, a flexible learning module, called Temporal Relation-Aware Module (TRAM) is carefully designed. TRAM contains two main blocks: i) an encoder for capturing the target-history temporal relations (TH-TRE) and ii) a learnable temporal relation attentive pooling (TRAP) for aggregating temporal information. Based on TRAM, an end-to-end Temporal-Aware RSS Network (TARSS-Net) is presented, which has outstanding performance on publicly available and our collected real-measured datasets.




TARSS-Net: Temporal-Aware Radar Semantic Segmentation Network

Neural Information Processing Systems

Radar signal interpretation plays a crucial role in remote detection and ranging. With the gradual display of the advantages of neural network technology in signal processing, learning-based radar signal interpretation is becoming a research hot-spot and made great progress. And since radar semantic segmentation (RSS) can provide more fine-grained target information, it has become a more concerned direction in this field. However, the temporal information, which is an important clue for analyzing radar data, has not been exploited sufficiently in present RSS frameworks. In this work, we propose a novel temporal information learning paradigm, i.e., data-driven temporal information aggregation with learned target-history relations. Following this idea, a flexible learning module, called Temporal Relation-Aware Module (TRAM) is carefully designed.


AI Models Still Lag Behind Traditional Numerical Models in Predicting Sudden-Turning Typhoons

Xu, Daosheng, Lu, Zebin, Leung, Jeremy Cheuk-Hin, Zhao, Dingchi, Li, Yi, Shi, Yang, Chen, Bin, Nie, Gaozhen, Wu, Naigeng, Tian, Xiangjun, Yang, Yi, Zhang, Shaoqing, Zhang, Banglin

arXiv.org Artificial Intelligence

Given the interpretability, accuracy, and stability of numerical weather prediction (NWP) models, current operational weather forecasting relies heavily on the NWP approach. In the past two years, the rapid development of Artificial Intelligence (AI) has provided an alternative solution for medium-range (1-10 days) weather forecasting. Bi et al. (2023) (hereafter Bi23) introduced the first AI-based weather prediction (AIWP) model in China, named Pangu-Weather, which offers fast prediction without compromising accuracy. In their work, Bi23 made notable claims regarding its effectiveness in extreme weather predictions. However, this claim lacks persuasiveness because the extreme nature of the two tropical cyclones (TCs) examples presented in Bi23, namely Typhoon Kong-rey and Typhoon Yutu, stems primarily from their intensities rather than their moving paths. Their claim may mislead into another meaning which is that Pangu-Weather works well in predicting unusual typhoon paths, which was not explicitly analyzed. Here, we reassess Pangu-Weather's ability to predict extreme TC trajectories from 2020-2024. Results reveal that while Pangu-Weather overall outperforms NWP models in predicting tropical cyclone (TC) tracks, it falls short in accurately predicting the rarely observed sudden-turning tracks, such as Typhoon Khanun in 2023. We argue that current AIWP models still lag behind traditional NWP models in predicting such rare extreme events in medium-range forecasts.


ArrivalNet: Predicting City-wide Bus/Tram Arrival Time with Two-dimensional Temporal Variation Modeling

Li, Zirui, Wolf, Patrick, Wang, Meng

arXiv.org Artificial Intelligence

Accurate arrival time prediction (ATP) of buses and trams plays a crucial role in public transport operations. Current methods focused on modeling one-dimensional temporal information but overlooked the latent periodic information within time series. Moreover, most studies developed algorithms for ATP based on a single or a few routes of public transport, which reduces the transferability of the prediction models and their applicability in public transport management systems. To this end, this paper proposes \textit{ArrivalNet}, a two-dimensional temporal variation-based multi-step ATP for buses and trams. It decomposes the one-dimensional temporal sequence into intra-periodic and inter-periodic variations, which can be recast into two-dimensional tensors (2D blocks). Each row of a tensor contains the time points within a period, and each column involves the time points at the same intra-periodic index across various periods. The transformed 2D blocks in different frequencies have an image-like feature representation that enables effective learning with computer vision backbones (e.g., convolutional neural network). Drawing on the concept of residual neural network, the 2D block module is designed as a basic module for flexible aggregation. Meanwhile, contextual factors like workdays, peak hours, and intersections, are also utilized in the augmented feature representation to improve the performance of prediction. 125 days of public transport data from Dresden were collected for model training and validation. Experimental results show that the root mean square error, mean absolute error, and mean absolute percentage error of the proposed predictor decrease by at least 6.1\%, 14.7\%, and 34.2\% compared with state-of-the-art baseline methods.


Rethinking Knowledge Transfer in Learning Using Privileged Information

Provodin, Danil, Akker, Bram van den, Katsimerou, Christina, Kaptein, Maurits, Pechenizkiy, Mykola

arXiv.org Machine Learning

In supervised machine learning, privileged information (PI) is information that is unavailable at inference, but is accessible during training time. Research on learning using privileged information (LUPI) aims to transfer the knowledge captured in PI onto a model that can perform inference without PI. It seems that this extra bit of information ought to make the resulting model better. However, finding conclusive theoretical or empirical evidence that supports the ability to transfer knowledge using PI has been challenging. In this paper, we critically examine the assumptions underlying existing theoretical analyses and argue that there is little theoretical justification for when LUPI should work. We analyze LUPI methods and reveal that apparent improvements in empirical risk of existing research may not directly result from PI. Instead, these improvements often stem from dataset anomalies or modifications in model design misguidedly attributed to PI. Our experiments for a wide variety of application domains further demonstrate that state-of-the-art LUPI approaches fail to effectively transfer knowledge from PI. Thus, we advocate for practitioners to exercise caution when working with PI to avoid unintended inductive biases.


TRAMS: Training-free Memory Selection for Long-range Language Modeling

Yu, Haofei, Wang, Cunxiang, Zhang, Yue, Bi, Wei

arXiv.org Artificial Intelligence

The Transformer architecture is crucial for numerous AI models, but it still faces challenges in long-range language modeling. Though several specific transformer architectures have been designed to tackle issues of long-range dependencies, existing methods like Transformer-XL are plagued by a high percentage of ineffective memories. In this study, we present a plug-and-play strategy, known as TRAining-free Memory Selection (TRAMS), that selects tokens participating in attention calculation based on one simple metric. This strategy allows us to keep tokens that are likely to have a high attention score with the current queries and ignore the other ones. We have tested our approach on the word-level benchmark (WikiText-103) and the character-level benchmark (enwik8), and the results indicate an improvement without having additional training or adding additional parameters.


Model-based causal feature selection for general response types

Kook, Lucas, Saengkyongam, Sorawit, Lundborg, Anton Rask, Hothorn, Torsten, Peters, Jonas

arXiv.org Machine Learning

Discovering causal relationships from observational data is a fundamental yet challenging task. Invariant causal prediction (ICP, Peters et al., 2016) is a method for causal feature selection which requires data from heterogeneous settings and exploits that causal models are invariant. ICP has been extended to general additive noise models and to nonparametric settings using conditional independence tests. However, the latter often suffer from low power (or poor type I error control) and additive noise models are not suitable for applications in which the response is not measured on a continuous scale, but reflects categories or counts. Here, we develop transformation-model (TRAM) based ICP, allowing for continuous, categorical, count-type, and uninformatively censored responses (these model classes, generally, do not allow for identifiability when there is no exogenous heterogeneity). As an invariance test, we propose TRAM-GCM based on the expected conditional covariance between environments and score residuals with uniform asymptotic level guarantees. For the special case of linear shift TRAMs, we also consider TRAM-Wald, which tests invariance based on the Wald statistic. We provide an open-source R package 'tramicp' and evaluate our approach on simulated data and in a case study investigating causal features of survival in critically ill patients.


TRAM: Bridging Trust Regions and Sharpness Aware Minimization

Sherborne, Tom, Saphra, Naomi, Dasigi, Pradeep, Peng, Hao

arXiv.org Artificial Intelligence

By reducing the curvature of the loss surface in the parameter space, Sharpness-aware minimization (SAM) yields widespread robustness improvement under domain transfer. Instead of focusing on parameters, however, this work considers the transferability of representations as the optimization target for out-of-domain generalization in a fine-tuning setup. To encourage the retention of transferable representations, we consider trust region-based fine-tuning methods, which exploit task-specific skills without forgetting task-agnostic representations from pre-training. We unify parameter- and representation-space smoothing approaches by using trust region bounds to inform SAM-style regularizers on both of these optimization surfaces. We propose Trust Region Aware Minimization (TRAM), a fine-tuning algorithm that optimizes for flat minima and smooth, informative representations without forgetting pre-trained structure. We find that TRAM outperforms both sharpness-aware and trust region-based optimization methods on cross-domain language modeling and cross-lingual transfer, where robustness to domain transfer and representation generality are critical for success. TRAM establishes a new standard in training generalizable models with minimal additional computation.