choi
- Europe > United Kingdom (0.46)
- North America > United States > California (0.14)
- Oceania > Australia (0.04)
- (9 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.53)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.49)
Generative Modeling through the Semi-dual Formulation of Unbalanced Optimal Transport
Optimal Transport (OT) problem investigates a transport map that bridges two distributions while minimizing a given cost function. In this regard, OT between tractable prior distribution and data has been utilized for generative modeling tasks. However, OT-based methods are susceptible to outliers and face optimization challenges during training. In this paper, we propose a novel generative model based on the semi-dual formulation of Unbalanced Optimal Transport (UOT).
SiT Dataset: Socially Interactive Pedestrian Trajectory Dataset for Social Navigation Robots
To ensure secure and dependable mobility in environments shared by humans and robots, social navigation robots should possess the capability to accurately perceive and predict the trajectories of nearby pedestrians. In this paper, we present a novel dataset of pedestrian trajectories, referred to as Social Interactive Trajectory (SiT) dataset, which can be used to train pedestrian detection, tracking, and trajectory prediction models needed to design social navigation robots. Our dataset includes sequential raw data captured by two 3D LiDARs and five cameras covering a 360-degree view, two inertial measurement unit (IMU) sensors, and real-time kinematic positioning (RTK), as well as annotations including 2D & 3D boxes, object classes, and object IDs. Thus far, various human trajectory datasets have been introduced to support the development of pedestrian motion forecasting models. Our SiT dataset differs from these datasets in the following two respects.
Meta-Learning with Adaptive Hyperparameters
Despite its popularity, several recent works question the effectiveness of MAML when test tasks are different from training tasks, thus suggesting various task-conditioned methodology to improve the initialization. Instead of searching for better task-aware initialization, we focus on a complementary factor in MAML framework, inner-loop optimization (or fast adaptation). Consequently, we propose a new weight update rule that greatly enhances the fast adaptation process. Specifically, we introduce a small meta-network that can adaptively generate per-step hyperparameters: learning rate and weight decay coefficients. The experimental results validate that the Adaptive Learning of hyperparameters for Fast Adaptation (ALFA) is the equally important ingredient that was often neglected in the recent few-shot learning approaches. Surprisingly, fast adaptation from random initialization with ALFA can already outperform MAML.
TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers
Mixup is a commonly adopted data augmentation technique for image classification. Recent advances in mixup methods primarily focus on mixing based on saliency. However, many saliency detectors require intense computation and are especially burdensome for parameter-heavy transformer models. To this end, we propose TokenMixup, an efficient attention-guided token-level data augmentation method that aims to maximize the saliency of a mixed set of tokens. TokenMixup provides 15 faster saliency-aware data augmentation compared to gradient-based methods. Moreover, we introduce a variant of TokenMixup which mixes tokens within a single instance, thereby enabling multi-scale feature augmentation. Experiments show that our methods significantly improve the baseline models' performance on CIFAR and ImageNet-1K, while being more efficient than previous methods. We also reach state-of-the-art performance on CIFAR-100 among from-scratch transformer models.
Soft Inductive Bias Approach via Explicit Reasoning Perspectives in Inappropriate Utterance Detection Using Large Language Models
Kim, Ju-Young, Park, Ji-Hong, Lee, Se-Yeon, Park, Sujin, Kim, Gun-Woo
Recent incidents in certain online games and communities, where anonymity is guaranteed, show that unchecked inappropriate remarks frequently escalate into verbal abuse and even criminal behavior, raising significant social concerns. Consequently, there is a growing need for research on techniques that can detect inappropriate utterances within conversational texts to help build a safer communication environment. Although large-scale language models trained on Korean corpora and chain-of-thought reasoning have recently gained attention, research applying these approaches to inappropriate utterance detection remains limited. In this study, we propose a soft inductive bias approach that explicitly defines reasoning perspectives to guide the inference process, thereby promoting rational decision-making and preventing errors that may arise during reasoning. We fine-tune a Korean large language model using the proposed method and conduct both quantitative performance comparisons and qualitative evaluations across different training strategies. Experimental results show that the Kanana-1.5 model achieves an average accuracy of 87.0046, improving by approximately 3.89 percent over standard supervised learning. These findings indicate that the proposed method goes beyond simple knowledge imitation by large language models and enables more precise and consistent judgments through constrained reasoning perspectives, demonstrating its effectiveness for inappropriate utterance detection.
PMA-Diffusion: A Physics-guided Mask-Aware Diffusion Framework for TSE from Sparse Observations
Liu, Lindong, Jin, Zhixiong, Choi, Seongjin
High-resolution highway traffic state information is essential for Intelligent Transportation Systems, but typical traffic data acquired from loop detectors and probe vehicles are often too sparse and noisy to capture the detailed dynamics of traffic flow. We propose PMA-Diffusion, a physics-guided mask-aware diffusion framework that reconstructs unobserved highway speed fields from sparse, incomplete observations. Our approach trains a diffusion prior directly on sparsely observed speed fields using two mask-aware training strategies: Single-Mask and Double-Mask. At the inference phase, the physics-guided posterior sampler alternates reverse-diffusion updates, observation projection, and physics-guided projection based on adaptive anisotropic smoothing to reconstruct the missing speed fields. The proposed framework is tested on the I-24 MOTION dataset with varying visibility ratios. Even under severe sparsity, with only 5% visibility, PMA-Diffusion outperforms other baselines across three reconstruction error metrics. Furthermore, PMA-diffusion trained with sparse observation nearly matches the performance of the baseline model trained on fully observed speed fields. The results indicate that combining mask-aware diffusion priors with a physics-guided posterior sampler provides a reliable and flexible solution for traffic state estimation under realistic sensing sparsity.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Tennessee > Davidson County > Nashville (0.04)
- North America > United States > Kansas > Cowley County (0.04)
- (2 more...)
Weaver: Kronecker Product Approximations of Spatiotemporal Attention for Traffic Network Forecasting
Cheong, Christopher, Davis, Gary, Choi, Seongjin
Spatiotemporal forecasting on transportation networks is a complex task that requires understanding how traffic nodes interact within a dynamic, evolving system dictated by traffic flow dynamics and social behavioral patterns. The importance of transportation networks and ITS for modern mobility and commerce necessitates forecasting models that are not only accurate but also interpretable, efficient, and robust under structural or temporal perturbations. Recent approaches, particularly Transformer-based architectures, have improved predictive performance but often at the cost of high computational overhead and diminished architectural interpretability. In this work, we introduce Weaver, a novel attention-based model that applies Kronecker product approximations (KPA) to decompose the PN X PN spatiotemporal attention of O(P^2N^2) complexity into local P X P temporal and N X N spatial attention maps. This Kronecker attention map enables our Parallel-Kronecker Matrix-Vector product (P2-KMV) for efficient spatiotemporal message passing with O(P^2N + N^2P) complexity. To capture real-world traffic dynamics, we address the importance of negative edges in modeling traffic behavior by introducing Valence Attention using the continuous Tanimoto coefficient (CTC), which provides properties conducive to precise latent graph generation and training stability. To fully utilize the model's learning capacity, we introduce the Traffic Phase Dictionary for self-conditioning. Evaluations on PEMS-BAY and METR-LA show that Weaver achieves competitive performance across model categories while training more efficiently.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- North America > United States > District of Columbia > Washington (0.14)
- Africa > Senegal > Kolda Region > Kolda (0.04)
- (13 more...)
- Overview (0.92)
- Research Report > New Finding (0.45)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
- Information Technology (0.92)
- Europe > United Kingdom (0.46)
- North America > United States > California (0.14)
- Oceania > Australia (0.04)
- (9 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.53)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.49)
Consistent Kernel Change-Point Detection under m-Dependence for Text Segmentation
Diaz-Rodriguez, Jairo, Jia, Mumin
Kernel change-point detection (KCPD) has become a widely used tool for identifying structural changes in complex data. While existing theory establishes consistency under independence assumptions, real-world sequential data such as text exhibits strong dependencies. We establish new guarantees for KCPD under $m$-dependent data: specifically, we prove consistency in the number of detected change points and weak consistency in their locations under mild additional assumptions. We perform an LLM-based simulation that generates synthetic $m$-dependent text to validate the asymptotics. To complement these results, we present the first comprehensive empirical study of KCPD for text segmentation with modern embeddings. Across diverse text datasets, KCPD with text embeddings outperforms baselines in standard text segmentation metrics. We demonstrate through a case study on Taylor Swift's tweets that KCPD not only provides strong theoretical and simulated reliability but also practical effectiveness for text segmentation tasks.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Mexico > Doña Ana County > Las Cruces (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (13 more...)