AITopics

2508.14008

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.65)

arXiv.org Artificial Intelligence, Aug-20-2025

STPFormer: A State-of-the-Art Pattern-Aware Spatio-Temporal Transformer for Traffic Forecasting

Fang, Jiayu, Shao, Zhiqi, Choy, S T Boris, Gao, Junbin

Spatio-temporal traffic forecasting is challenging due to complex temporal patterns, dynamic spatial structures, and diverse input formats. Although Transformer-based models offer strong global modeling, they often struggle with rigid temporal encoding and weak space-time fusion. We propose STPFormer, a Spatio-Temporal Pattern-Aware Transformer that achieves state-of-the-art performance via unified and interpretable representation learning. It integrates four modules: Temporal Position Aggregator (TPA) for pattern-aware temporal encoding, Spatial Sequence Aggregator (SSA) for sequential spatial learning, Spatial-Temporal Graph Matching (STGM) for cross-domain alignment, and an Attention Mixer for multi-scale fusion. Experiments on five real-world datasets show that STPFormer consistently sets new SOTA results, with ablation and visualizations confirming its effectiveness and generalizability.

data mining, justification, machine learning, (19 more...)

2508.13433

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Transportation > Infrastructure & Services (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Data Science > Data Mining (0.93)

arXiv.org Artificial Intelligence, Aug-20-2025

Spatial-Temporal Transformer with Curriculum Learning for EEG-Based Emotion Recognition

Lin, Xuetao, Peng, Tianhao, Dai, Peihong, Liang, Yu, Wu, Wenjun

EEG-based emotion recognition plays an important role in developing adaptive brain-computer communication systems, yet faces two fundamental challenges in practical implementations: (1) effective integration of non-stationary spatial-temporal neural patterns, (2) robust adaptation to dynamic emotional intensity variations in real-world scenarios. This paper proposes STT-CL, a novel framework integrating spatial-temporal transformers with curriculum learning. Our method introduces two core components: a spatial encoder that models inter-channel relationships and a temporal encoder that captures multi-scale dependencies through windowed attention mechanisms, enabling simultaneous extraction of spatial correlations and temporal dynamics from EEG signals. Complementing this architecture, an intensity-aware curriculum learning strategy progressively guides training from high-intensity to low-intensity emotional states through dynamic sample scheduling based on a dual difficulty assessment. Comprehensive experiments on three benchmark datasets demonstrate state-of-the-art performance across various emotional intensity levels, with ablation studies confirming the necessity of both architectural components and the curriculum learning mechanism. Emotion recognition constitutes a fundamental component of brain-inspired human-computer interaction systems [1].
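The curriculum idea in this abstract (training on high-intensity samples first, then gradually admitting lower-intensity ones) can be illustrated with a minimal sketch. The abstract does not describe the dual difficulty assessment, so a single hypothetical `intensity` score stands in for it here, and the linear pacing rule, `start_fraction`, and the function name are all assumptions rather than the authors' method.

```python
def curriculum_pool(samples, epoch, total_epochs, start_fraction=0.3):
    """Return the samples available for training at a given epoch.

    Samples are ordered from high to low intensity, and the pool grows
    linearly from `start_fraction` of the data to all of it.
    The `intensity` key and the linear schedule are illustrative assumptions.
    """
    if total_epochs < 1:
        raise ValueError("total_epochs must be at least 1")
    if not 0 < start_fraction <= 1:
        raise ValueError("start_fraction must be in (0, 1]")
    # High-intensity samples are treated as the easy end of the curriculum.
    ordered = sorted(samples, key=lambda s: s["intensity"], reverse=True)
    # Progress runs from 0.0 at the first epoch to 1.0 at the last.
    progress = min(1.0, epoch / max(1, total_epochs - 1))
    fraction = start_fraction + (1.0 - start_fraction) * progress
    count = max(1, round(fraction * len(ordered)))
    return ordered[:count]


# Ten toy samples with intensity scores 1..10.
samples = [{"id": i, "intensity": float(i)} for i in range(1, 11)]
first_pool = curriculum_pool(samples, epoch=0, total_epochs=5)
last_pool = curriculum_pool(samples, epoch=4, total_epochs=5)
print([s["id"] for s in first_pool])
print(len(last_pool))
```

A real scheduler would likely recompute difficulty during training; this sketch only shows the ordering and pacing.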

artificial intelligence, emotion recognition, machine learning, (18 more...)

2507.14698

Country: Asia > China (0.15)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)

Imbriani, Manuela, Belmonte, Gina, Massink, Mieke, Tofani, Alessandro, Ciancia, Vincenzo

A Multi-Resolution Benchmark Framework for Spatial Reasoning Assessment in Neural Networks

This paper presents preliminary results in the definition of a comprehensive benchmark framework designed to systematically evaluate spatial reasoning capabilities in neural networks, with a particular focus on morphological properties such as connectivity and distance relationships. The framework is currently being used to study the capabilities of nnU-Net, exploiting the spatial model checker VoxLogicA to generate two distinct categories of synthetic datasets: maze connectivity problems for topological analysis and spatial distance computation tasks for geometric understanding. Each category is evaluated across multiple resolutions to assess scalability and generalization properties. The automated pipeline encompasses a complete machine learning workflow including: synthetic dataset generation, standardized training with cross-validation, inference execution, and comprehensive evaluation using Dice coefficient and IoU (Intersection over Union) metrics. Preliminary experimental results demonstrate significant challenges in neural network spatial reasoning capabilities, revealing systematic failures in basic geometric and topological understanding tasks. The framework provides a reproducible experimental protocol, enabling researchers to identify specific limitations. Such limitations could be addressed through hybrid approaches combining neural networks with symbolic reasoning methods for improved spatial understanding in clinical applications, establishing a foundation for ongoing research into neural network spatial reasoning limitations and potential solutions.
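For readers unfamiliar with the two evaluation metrics named above, the following minimal sketch computes the Dice coefficient and IoU for a pair of binary masks. It uses the standard definitions (Dice = 2|A∩B| / (|A| + |B|), IoU = |A∩B| / |A∪B|); the function name, the flat-list mask format, and the convention for two empty masks are assumptions for illustration, not part of the paper's pipeline.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equally sized binary masks.

    Masks are flat sequences of 0/1 values (an assumed format).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - intersection
    # Convention (an assumption): two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou


pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
dice, iou = dice_and_iou(pred, target)
print(f"Dice={dice:.3f} IoU={iou:.3f}")  # Dice=0.667 IoU=0.500
```

The two scores are monotonically related (Dice = 2·IoU / (1 + IoU)), so they rank predictions identically on a single mask pair but weight errors differently when averaged over a dataset.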

artificial intelligence, machine learning, spatial reasoning, (17 more...)

2508.12741

Country:

North America > United States (0.28)
Europe > Italy > Tuscany (0.14)

Genre: Research Report > New Finding (0.88)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

OmniD: Generalizable Robot Manipulation Policy via Image-Based BEV Representation

Mao, Jilei, Guan, Jiarui, Tang, Yingjuan, Hu, Qirui, Li, Zhihang, Yu, Junjie, Mao, Yongjie, Sun, Yunzhe, Liu, Shuang, Ju, Xiaozhu

Ensuring robust generalization across diverse environments and scenarios remains a central challenge for real-world embodied systems. The generalization challenges primarily manifest in positional variations, background interference, viewpoint shifts, morphological differences, illumination changes, and environmental dynamics [1, 2]. To evaluate a model's generalization capability more clearly, inspired by [3], we formally define in-distribution (ID), out-of-distribution (OOD), and combinatorial-distribution (CD) evaluations for embodied scenarios. Taking object position generalization as an example, as shown in Figure 1: when the spatial distribution of pumpkins in the test data aligns with the training distribution, it constitutes an ID scenario; significantly divergent distributions indicate OOD cases, while intermediate variations correspond to CD with varying discrepancy levels. Building on this formalization, we systematically evaluate the effectiveness of existing methods. Methods such as DP [4] and ACT [5], among others [6], can perform complex manipulation tasks and achieve high ID success rates. However, they are prone to overfitting to the specific ID scenario and fail to generalize to OOD settings. Even minor camera pose perturbations or subtle background variations can lead to significant performance degradation.

artificial intelligence, generalization, spatial reasoning, (16 more...)

2508.11898

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.68)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.41)

EVTP-IVS: Effective Visual Token Pruning For Unifying Instruction Visual Segmentation In Multi-Modal Large Language Models

Zhu, Wenhui, Chen, Xiwen, Wang, Zhipeng, Tang, Shao, Ghosh, Sayan, Dong, Xuanzhao, Koner, Rajat, Wang, Yalin

Instructed Visual Segmentation (IVS) tasks require segmenting objects in images or videos based on natural language instructions. While recent multimodal large language models (MLLMs) have achieved strong performance on IVS, their inference cost remains a major bottleneck, particularly in video. We empirically analyze visual token sampling in MLLMs and observe a strong correlation between subset token coverage and segmentation performance. This motivates our design of a simple and effective token pruning method that selects a compact yet spatially representative subset of tokens to accelerate inference. In this paper, we introduce a novel visual token pruning method for IVS, called EVTP-IV, which builds upon the k-center by integrating spatial information to ensure better coverage. We further provide an information-theoretic analysis to support our design. Experiments on standard IVS benchmarks show that our method achieves up to 5× speed-up on video tasks and 3.5× on image tasks, while maintaining comparable accuracy using only 20% of the tokens. Our method also consistently outperforms state-of-the-art pruning baselines under varying pruning ratios.
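To make the k-center idea concrete, here is a minimal sketch of the classic greedy (farthest-first) k-center heuristic applied to visual tokens. The abstract does not say how spatial information enters the method, so the distance used here (feature distance plus a weighted grid distance), the `spatial_weight` parameter, the function name, and the choice of starting token are all illustrative assumptions and not the EVTP-IV formulation.

```python
import math


def greedy_k_center(features, positions, k, spatial_weight=0.5):
    """Select k token indices with the farthest-first k-center heuristic.

    The combined distance (feature distance + spatial_weight * grid distance)
    is an illustrative assumption, not the paper's formula.
    """
    n = len(features)
    if len(positions) != n:
        raise ValueError("features and positions must have the same length")
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of tokens")

    def dist(i, j):
        feat = math.dist(features[i], features[j])
        spat = math.dist(positions[i], positions[j])
        return feat + spatial_weight * spat

    selected = [0]  # arbitrary starting token (assumption)
    chosen = {0}
    # min_dist[i] is the distance from token i to its nearest selected center.
    min_dist = [dist(i, 0) for i in range(n)]
    while len(selected) < k:
        # Pick the unselected token that is farthest from all current centers.
        nxt = max((i for i in range(n) if i not in chosen),
                  key=lambda i: min_dist[i])
        selected.append(nxt)
        chosen.add(nxt)
        for i in range(n):
            min_dist[i] = min(min_dist[i], dist(i, nxt))
    return selected


# Five tokens in one row with identical features, so only position matters.
features = [[0.0], [0.0], [0.0], [0.0], [0.0]]
positions = [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4)]
kept = greedy_k_center(features, positions, k=3)
print(kept)  # [0, 4, 2]
```

In this toy case the heuristic keeps the two ends of the row and then the middle, which is the coverage behavior the abstract associates with better segmentation performance.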

large language model, natural language, segmentation, (17 more...)

2508.11886

Country: Europe (0.46)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.88)

From Heuristics to Data: Quantifying Site Planning Layout Indicators with Deep Learning and Multi-Modal Data

Cao, Qian, Chen, Jielin, Zhao, Junchao, Stouffs, Rudi

The spatial layout of urban sites shapes land-use efficiency and spatial organization. Traditional site planning often relies on experiential judgment and single-source data, limiting systematic quantification of multifunctional layouts. We propose a Site Planning Layout Indicator (SPLI) system, a data-driven framework integrating empirical knowledge with heterogeneous multi-source data to produce structured urban spatial information. The SPLI supports multimodal spatial data systems for analytics, inference, and retrieval by combining OpenStreetMap (OSM), Points of Interest (POI), building morphology, land use, and satellite imagery. It extends conventional metrics through five dimensions: (1) Hierarchical Building Function Classification, refining empirical systems into clear hierarchies; (2) Spatial Organization, quantifying seven layout patterns (e.g., symmetrical, concentric, axial-oriented); (3) Functional Diversity, transforming qualitative assessments into measurable indicators using Functional Ratio (FR) and Simpson Index (SI); (4) Accessibility to Essential Services, integrating facility distribution and transport networks for comprehensive accessibility metrics; and (5) Land Use Intensity, using Floor Area Ratio (FAR) and Building Coverage Ratio (BCR) to assess utilization efficiency. Data gaps are addressed through deep learning, including Relational Graph Neural Networks (RGNN) and Graph Neural Networks (GNN). Experiments show the SPLI improves functional classification accuracy and provides a standardized basis for automated, data-driven urban spatial analytics.
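Three of the indicators named above have widely used textbook definitions, sketched below. The abstract does not give the paper's exact formulas, so this uses the common Gini-Simpson form of the Simpson Index (1 − Σ pᵢ²), FAR as total floor area divided by site area, and BCR as total building footprint divided by site area. The function names, the dictionary layout, and the approximation of floor area as footprint × floors are assumptions for illustration.

```python
def simpson_diversity(areas_by_function):
    """Gini-Simpson diversity, 1 - sum(p_i^2), over functional area shares.

    The paper may use a different Simpson variant; this form is an assumption.
    """
    total = sum(areas_by_function.values())
    if total <= 0:
        raise ValueError("total area must be positive")
    return 1.0 - sum((a / total) ** 2 for a in areas_by_function.values())


def floor_area_ratio(buildings, site_area):
    """FAR = total floor area / site area.

    Floor area is approximated as footprint * floors (an assumption).
    """
    if site_area <= 0:
        raise ValueError("site_area must be positive")
    return sum(b["footprint"] * b["floors"] for b in buildings) / site_area


def building_coverage_ratio(buildings, site_area):
    """BCR = total building footprint / site area."""
    if site_area <= 0:
        raise ValueError("site_area must be positive")
    return sum(b["footprint"] for b in buildings) / site_area


# Hypothetical site: two buildings on a 2000 m^2 plot.
buildings = [
    {"footprint": 500.0, "floors": 4},
    {"footprint": 300.0, "floors": 2},
]
far = floor_area_ratio(buildings, site_area=2000.0)
bcr = building_coverage_ratio(buildings, site_area=2000.0)
si = simpson_diversity({"residential": 2000.0, "commercial": 600.0})
print(f"FAR={far:.2f} BCR={bcr:.2f} SI={si:.3f}")  # FAR=1.30 BCR=0.40 SI=0.355
```

A Simpson value near 0 indicates a site dominated by one function, while values approaching 1 indicate an even mix of many functions.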

artificial intelligence, classification, machine learning, (16 more...)

2508.11723

Country: Asia > China (0.46)

Genre: Research Report (0.81)

Industry:

Transportation > Infrastructure & Services (1.00)
Health & Medicine (1.00)
Banking & Finance > Real Estate (0.96)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Aug-18-2025, 12:21:35 GMT

Spatial-Temporal Super-Resolution of Satellite Imagery via Conditional Pixel Synthesis

Recent advancements in satellite technology have enabled granular insight into the evolution of human activity on the planet's surface.

artificial intelligence, machine learning, spatial reasoning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas (0.06)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (0.68)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.55)
Banking & Finance > Real Estate (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.82)

Neural Information Processing Systems, Aug-18-2025, 10:33:32 GMT

Dynamic COVID risk assessment accounting for community virus exposure from a spatial-temporal transmission model

We design a weighting scheme to mitigate multiple selection biases inherent in EHRs of COVID patients.

artificial intelligence, infection rate, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
Europe (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.41)

Alemanni, William, Burzacchi, Arianna, Colombi, Davide, Giarratano, Elena

Enhancing Interactive Voting-Based Map Matching: Improving Efficiency and Robustness for Heterogeneous GPS Trajectories

arXiv.org Artificial Intelligence, Aug-18-2025

This paper presents an enhanced version of the Interactive Voting-Based Map Matching algorithm, designed to efficiently process trajectories with varying sampling rates. The main aim is to reconstruct GPS trajectories with high accuracy, independent of input data quality. Building upon the original algorithm, developed exclusively for aligning GPS signals to road networks, we extend its capabilities by integrating trajectory imputation. Our improvements also include the implementation of a distance-bounded interactive voting strategy to reduce computational complexity, as well as modifications to address missing data in the road network. Furthermore, we incorporate a custom-built asset derived from OpenStreetMap, enabling this approach to be smoothly applied in any geographic region covered by OpenStreetMap's road network. These advancements preserve the core strengths of the original algorithm while significantly extending its applicability to diverse real-world scenarios.

artificial intelligence, machine learning, trajectory, (21 more...)

2508.11235

Country:

Europe > Italy (0.28)
North America > United States > Arizona (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Information Management (0.93)
Information Technology > Communications > Mobile (0.69)
(2 more...)