spatial configuration


MOVE: A Simple Motion-Based Data Collection Paradigm for Spatial Generalization in Robotic Manipulation

Wang, Huanqian, Chen, Chi Bene, Yue, Yang, Tao, Danhua, Guo, Tong, Xie, Shaoxuan, Huang, Denghang, Song, Shiji, Yao, Guocai, Huang, Gao

arXiv.org Artificial Intelligence

Imitation learning has shown immense promise for robotic manipulation, yet its practical deployment is fundamentally constrained by data scarcity. Despite prior work on collecting large-scale datasets, a significant gap to robust spatial generalization remains. We identify a key limitation: individual trajectories, regardless of their length, are typically collected from a \emph{single, static spatial configuration} of the environment. This includes fixed object and target positions as well as unchanging camera viewpoints, which significantly restricts the diversity of spatial information available for learning. To address this critical bottleneck in data efficiency, we propose \textbf{MOtion-Based Variability Enhancement} (\emph{MOVE}), a simple yet effective data collection paradigm that enables the acquisition of richer spatial information from dynamic demonstrations. Our core contribution is an augmentation strategy that injects motion into any movable objects within the environment for each demonstration. This process implicitly generates a dense and diverse set of spatial configurations within a single trajectory. We conduct extensive experiments in both simulation and real-world environments to validate our approach. For example, in simulation tasks requiring strong spatial generalization, \emph{MOVE} achieves an average success rate of 39.1\%, a 76.1\% relative improvement over the static data collection paradigm (22.2\%), and yields up to 2--5$\times$ gains in data efficiency on certain tasks. Our code is available at https://github.com/lucywang720/MOVE.
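To make the paradigm concrete, here is a minimal sketch of MOVE-style collection in Python. The `env`, `policy`, and `movable_objects` interfaces are hypothetical stand-ins, not the authors' API; the point is only that every movable object drifts slightly at each step, so a single demonstration sweeps through many spatial configurations.

```python
import numpy as np

def collect_move_demo(env, policy, rng, max_steps=200, drift_scale=0.002):
    """Sketch of MOVE-style collection: inject small random motion into every
    movable object at each step so one trajectory spans many spatial
    configurations. `env` is assumed (hypothetically) to expose
    `movable_objects` with settable 2D positions and a step/reset API."""
    traj = []
    obs = env.reset()
    for _ in range(max_steps):
        # Nudge each movable object by a small random drift (the "motion").
        for obj in env.movable_objects:
            obj.position = obj.position + rng.normal(0.0, drift_scale, size=2)
        action = policy(obs)          # teleoperator or scripted expert
        traj.append((obs, action))
        obs, done = env.step(action)
        if done:
            break
    return traj
```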


Decoding street network morphologies and their correlation to travel mode choice

Riascos-Goyes, Juan Fernando, Lowry, Michael, Guarín-Zapata, Nicolás, Ospina, Juan P.

arXiv.org Artificial Intelligence

Urban morphology has long been recognized as a factor shaping human mobility, yet comparative and formal classifications of urban form across metropolitan areas remain limited. Building on theoretical principles of urban structure and advances in unsupervised learning, we systematically classified the built environment of nine U.S. metropolitan areas using structural indicators such as density, connectivity, and spatial configuration. The resulting morphological types were linked to mobility patterns through descriptive statistics, marginal effects estimation, and post hoc statistical testing. Here we show that distinct urban forms are systematically associated with different mobility behaviors: reticular morphologies are linked to significantly higher public transport use (marginal effect = 0.49) and reduced car dependence (-0.41), while organic forms are associated with increased car usage (0.44) and substantial declines in public transport (-0.47) and active mobility (-0.30). These effects are statistically robust (p < 1e-19), highlighting that the spatial configuration of urban areas plays a fundamental role in shaping transportation choices. Our findings extend previous work by offering a reproducible framework for classifying urban form and demonstrate the added value of morphological analysis in comparative urban research. These results suggest that urban form should be treated as a key variable in mobility planning and provide empirical support for incorporating spatial typologies into sustainable urban policy design.
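As a rough illustration of the classification step, the following sketch clusters synthetic structural indicators with scikit-learn. The indicator names and the choice of k-means with four clusters are assumptions for illustration; the paper's actual features and clustering procedure may differ.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical structural indicators per analysis zone:
# [intersection density, street connectivity, orientation entropy]
X = rng.normal(size=(500, 3))

# Standardize indicators, then assign each zone a morphological type.
Z = StandardScaler().fit_transform(X)
morph_type = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(Z)

# Each zone's type can then be linked to observed mode shares and tested
# for differences across morphologies (e.g., via marginal effects in a
# discrete-choice model, which is beyond this sketch).
print(np.bincount(morph_type))
```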


FailSafe: Reasoning and Recovery from Failures in Vision-Language-Action Models

Lin, Zijun, Duan, Jiafei, Fang, Haoquan, Fox, Dieter, Krishna, Ranjay, Tan, Cheston, Wen, Bihan

arXiv.org Artificial Intelligence

Recent advances in robotic manipulation have integrated low-level robotic control into Vision-Language Models (VLMs), extending them into Vision-Language-Action (VLA) models. Although state-of-the-art VLAs achieve strong performance in downstream robotic applications, supported by large-scale crowd-sourced robot training data, they still inevitably encounter failures during execution. Enabling robots to reason about and recover from unpredictable and abrupt failures remains a critical challenge. Existing robotic manipulation datasets, collected in either simulation or the real world, primarily provide only ground-truth trajectories, leaving robots unable to recover once failures occur. Moreover, the few datasets that address failure detection typically offer only textual explanations, which are difficult to utilize directly in VLA models. To address this gap, we introduce FailSafe, a novel failure generation and recovery system that automatically produces diverse failure cases paired with executable recovery actions. FailSafe can be seamlessly applied to any manipulation task in any simulator, enabling scalable creation of failure action data. To demonstrate its effectiveness, we fine-tune LLaVA-OneVision-7B (LLaVA-OV-7B) to build FailSafe-VLM. Experimental results show that FailSafe-VLM successfully helps robotic arms detect and recover from potential failures, improving the performance of three state-of-the-art VLA models (pi0-FAST, OpenVLA, OpenVLA-OFT) by up to 22.6% on average across several tasks in ManiSkill. Furthermore, FailSafe-VLM generalizes across different spatial configurations, camera viewpoints, objects, and robot embodiments. We plan to release the FailSafe code to the community.
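A minimal sketch of how failure/recovery pairs might be synthesized from a ground-truth trajectory is shown below. The function and its offset-based failure model are hypothetical simplifications for illustration, not the FailSafe system itself.

```python
import numpy as np

def make_failure_pair(gt_traj, rng, fail_t=None, offset_scale=0.05):
    """Sketch of FailSafe-style data generation (hypothetical, simplified):
    perturb a ground-truth end-effector path at some step to simulate a
    failure, and label the motion back to the path as the recovery action."""
    gt_traj = np.asarray(gt_traj)          # (T, 3) end-effector positions
    T = len(gt_traj)
    if fail_t is None:
        fail_t = int(rng.integers(T // 4, 3 * T // 4))
    offset = rng.normal(0.0, offset_scale, size=3)
    failed = gt_traj.copy()
    failed[fail_t:] += offset              # drifted, failing execution
    # Recovery action: move from the failed pose back toward the nominal path.
    recovery = gt_traj[fail_t] - failed[fail_t]
    return failed, fail_t, recovery
```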


Interact-Custom: Customized Human Object Interaction Image Generation

Xu, Zhu, Wang, Zhaowen, Peng, Yuxin, Liu, Yang

arXiv.org Artificial Intelligence

Compositional Customized Image Generation aims to customize multiple target concepts within the generated content, a task that has gained attention for its wide applicability. Existing approaches mainly concentrate on preserving the target entities' appearance while neglecting fine-grained interaction control among them. To equip models with such interaction control, we focus on the human-object interaction scenario and propose the task of Customized Human Object Interaction Image Generation (CHOI), which simultaneously requires identity preservation for the target human and object and control over the interaction semantics between them. Two primary challenges exist for CHOI: (1) simultaneous identity preservation and interaction control require the model to decompose the human and object into self-contained identity features and pose-oriented interaction features, while current HOI image datasets fail to provide ideal samples for such feature-decomposed learning; (2) an inappropriate spatial configuration between human and object may fail to convey the desired interaction semantics. To tackle these challenges, we first process a large-scale dataset in which each sample contains the same human-object pair in different interactive poses. We then design a two-stage model, Interact-Custom, which first explicitly models the spatial configuration by generating a foreground mask depicting the interaction behavior, and then, guided by this mask, generates the target human and object interacting while preserving their identity features. Furthermore, Interact-Custom optionally lets users specify the background image and the union location where the target human and object should appear, offering high content controllability. Extensive experiments on our tailored metrics for the CHOI task demonstrate the effectiveness of our approach.
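Structurally, the two-stage pipeline could be sketched as follows; `stage1` and `stage2` are hypothetical callables standing in for the paper's mask-generation and mask-guided generation models.

```python
def interact_custom(stage1, stage2, human_id, object_id, interaction_text,
                    background=None, union_box=None):
    """Structural sketch of a two-stage CHOI pipeline (hypothetical API).
    stage1: (ids, text, box) -> foreground interaction mask
    stage2: (ids, text, mask, background) -> image"""
    # Stage 1: model the spatial configuration explicitly as a mask that
    # depicts how the human and object interact.
    mask = stage1(human_id, object_id, interaction_text, union_box)
    # Stage 2: generate the interacting pair under mask guidance while
    # preserving both identities; optionally composite over a user background.
    return stage2(human_id, object_id, interaction_text, mask, background)
```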


ReachVox: Clutter-free Reachability Visualization for Robot Motion Planning in Virtual Reality

Hauck, Steffen, Abdlkarim, Diar, Dudley, John, Kristensson, Per Ola, Ofek, Eyal, Grubert, Jens

arXiv.org Artificial Intelligence

[Figure 1: Remote Human-Robot-Collaboration — a remote operator aligns the body of an engine so that a robot arm can access and weld it; a linear arrangement of white points marks the required welding locations. The user controls the position and rotation of the engine, and the concentration of unreachable locations along the task area's right side indicates the need to rotate the engine further toward the robot.]

Human-Robot-Collaboration can enhance workflows by leveraging the mutual strengths of human operators and robots: humans can better understand ad hoc situations and arrange them so that they are easily accessible to the robot. Planning and understanding robot movements remain major challenges in this domain, particularly in dynamic environments that may require constant adaptation of the robot's motion path. Through a user study (n=20), we demonstrate the strength of the ReachVox visualization relative to a point-based reachability check-up.
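As an illustration of the underlying idea (a voxel summary of reachability around a point of interest), here is a small Python sketch. The grid sampling and the toy inverse-kinematics predicate are assumptions for demonstration, not the paper's implementation.

```python
import numpy as np

def reachability_voxels(ik_reachable, center, extent=0.4, n=8):
    """Sketch of a ReachVox-style summary (hypothetical): sample a voxel grid
    around a point of interest and mark each voxel reachable or unreachable
    via an inverse-kinematics query, yielding one compact in-situ cue."""
    axes = [np.linspace(c - extent / 2, c + extent / 2, n) for c in center]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
    reachable = np.array([ik_reachable(p) for p in grid], dtype=bool)
    return grid, reachable  # e.g., render unreachable voxels in red in VR

# Toy IK stand-in: reachable if within 0.9 m of a robot base at the origin.
grid, ok = reachability_voxels(lambda p: np.linalg.norm(p) < 0.9,
                               center=(0.6, 0.0, 0.3))
print(f"{ok.mean():.0%} of sampled voxels reachable")
```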


Real-Time 3D Guidewire Reconstruction from Intraoperative DSA Images for Robot-Assisted Endovascular Interventions

Yao, Tianliang, Li, Bingrui, Lu, Bo, Pei, Zhiqiang, Yuan, Yixuan, Qi, Peng

arXiv.org Artificial Intelligence

Accurate three-dimensional (3D) reconstruction of guidewire shapes is crucial for precise navigation in robot-assisted endovascular interventions. Conventional 2D Digital Subtraction Angiography (DSA) is limited by the absence of depth information, leading to spatial ambiguities that hinder reliable guidewire shape sensing. This paper introduces a novel multimodal framework for real-time 3D guidewire reconstruction, combining preoperative 3D Computed Tomography Angiography (CTA) with intraoperative 2D DSA images. The method utilizes robust feature extraction to address noise and distortion in 2D DSA data, followed by deformable image registration to align the 2D projections with the 3D CTA model. Subsequently, the inverse projection algorithm reconstructs the 3D guidewire shape, providing real-time, accurate spatial information. This framework significantly enhances spatial awareness for robot-assisted endovascular procedures, effectively bridging the gap between preoperative planning and intraoperative execution. The system demonstrates notable improvements in real-time processing speed, reconstruction accuracy, and computational efficiency. The proposed method achieves a projection error of 1.76$\pm$0.08 pixels and a length deviation of 2.93$\pm$0.15\%, with a frame rate of 39.3$\pm$1.5 frames per second (FPS). These advancements have the potential to optimize robotic performance and increase the precision of complex endovascular interventions, ultimately contributing to better clinical outcomes.
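One simplified way to realize the inverse-projection step, assuming the 2D DSA view has already been registered to the CTA model, is to lift each detected 2D guidewire point to the vessel-centerline point whose projection lies closest to it. The sketch below illustrates this under a standard pinhole model; it is an illustrative assumption, not the paper's algorithm.

```python
import numpy as np

def backproject_to_centerline(P, uv, centerline):
    """Simplified inverse projection: resolve the depth ambiguity of a 2D DSA
    guidewire point `uv` using the CTA prior, by picking the vessel-centerline
    point whose projection under camera matrix P (3x4) is closest to `uv`."""
    pts_h = np.hstack([centerline, np.ones((len(centerline), 1))])  # (N, 4)
    proj = (P @ pts_h.T).T                     # (N, 3) homogeneous coordinates
    proj = proj[:, :2] / proj[:, 2:3]          # perspective divide -> pixels
    return centerline[np.argmin(np.linalg.norm(proj - uv, axis=1))]
```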


An Efficient Sign Language Translation Using Spatial Configuration and Motion Dynamics with LLMs

Hwang, Eui Jun, Cho, Sukmin, Lee, Junmyeong, Park, Jong C.

arXiv.org Artificial Intelligence

Gloss-free Sign Language Translation (SLT) converts sign videos directly into spoken language sentences without relying on glosses. Recently, Large Language Models (LLMs) have shown remarkable translation performance in gloss-free methods by harnessing their powerful natural language generation capabilities. However, these methods often rely on domain-specific fine-tuning of visual encoders to achieve optimal results. By contrast, this paper emphasizes the importance of capturing the spatial configurations and motion dynamics inherent in sign language. With this in mind, we introduce Spatial and Motion-based Sign Language Translation (SpaMo), a novel LLM-based SLT framework. The core idea of SpaMo is simple yet effective. We first extract spatial and motion features using off-the-shelf visual encoders and then input these features into an LLM with a language prompt. Additionally, we employ a visual-text alignment process as a warm-up before the SLT supervision. Our experiments demonstrate that SpaMo achieves state-of-the-art performance on two popular datasets, PHOENIX14T and How2Sign.
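A minimal sketch of the interface implied by the abstract (projecting off-the-shelf spatial and motion features into the LLM's embedding space and prepending them to the language prompt) might look as follows; all dimensions and module names are hypothetical.

```python
import torch
import torch.nn as nn

class SpaMoAdapter(nn.Module):
    """Sketch (hypothetical dimensions): map precomputed spatial and motion
    features into the LLM embedding space and prepend them to the prompt."""
    def __init__(self, d_spatial=768, d_motion=512, d_llm=4096):
        super().__init__()
        self.proj_s = nn.Linear(d_spatial, d_llm)
        self.proj_m = nn.Linear(d_motion, d_llm)

    def forward(self, spatial_feats, motion_feats, prompt_embeds):
        # (B, Ts, d_spatial), (B, Tm, d_motion), (B, Tp, d_llm)
        vis = torch.cat([self.proj_s(spatial_feats),
                         self.proj_m(motion_feats)], dim=1)
        # The concatenated sequence is fed to the LLM as input embeddings.
        return torch.cat([vis, prompt_embeds], dim=1)
```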


Open-World Visual Reasoning by a Neuro-Symbolic Program of Zero-Shot Symbols

Burghouts, Gertjan, Hillerström, Fieke, Walraven, Erwin, van Bekkum, Michael, Ruis, Frank, Sijs, Joris, van Mil, Jelle, Dijk, Judith

arXiv.org Artificial Intelligence

We consider the problem of finding spatial configurations of multiple objects in images, e.g., a mobile inspection robot tasked to localize abandoned tools on the floor. We define the spatial configuration of objects by first-order logic in terms of relations and attributes. A neuro-symbolic program matches the logic formulas to probabilistic object proposals for the given image, provided by language-vision models by querying them for the symbols. This work is the first to combine neuro-symbolic programming (reasoning) and language-vision models (learning) to find spatial configurations of objects in images in an open-world setting. We show the effectiveness by finding abandoned tools on floors and leaking pipes. We find that most prediction errors are due to biases in the language-vision model.
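As a toy illustration of matching a first-order formula to probabilistic proposals, consider the query tool(x) ∧ floor(y) ∧ on(x, y). The scoring interface below is hypothetical, and a product t-norm stands in for whatever soft conjunction the actual program uses.

```python
from itertools import permutations

def score_formula(n_proposals, attr, rel):
    """Sketch: ground 'tool(x) AND floor(y) AND on(x, y)' over object
    proposals. attr[name][i] and rel[name][(i, j)] are scores in [0, 1]
    queried from a language-vision model (hypothetical interface)."""
    best, best_pair = 0.0, None
    for x, y in permutations(range(n_proposals), 2):
        # Product t-norm as a soft conjunction of the three literals.
        s = attr["tool"][x] * attr["floor"][y] * rel["on"].get((x, y), 0.0)
        if s > best:
            best, best_pair = s, (x, y)
    return best, best_pair

# Toy run with two proposals: proposal 0 is tool-like, proposal 1 floor-like.
attr = {"tool": [0.9, 0.1], "floor": [0.2, 0.8]}
rel = {"on": {(0, 1): 0.7}}
print(score_formula(2, attr, rel))   # -> (0.504, (0, 1))
```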


Neural Bayes Estimators for Irregular Spatial Data using Graph Neural Networks

Sainsbury-Dale, Matthew, Richards, Jordan, Zammit-Mangion, Andrew, Huser, Raphaël

arXiv.org Machine Learning

Neural Bayes estimators are neural networks that approximate Bayes estimators in a fast and likelihood-free manner. They are appealing to use with spatial models and data, where estimation is often a computational bottleneck. However, neural Bayes estimators in spatial applications have, to date, been restricted to data collected over a regular grid. These estimators are also currently dependent on a prescribed set of spatial locations, which means that the neural network needs to be re-trained for new data sets; this renders them impractical in many applications and impedes their widespread adoption. In this work, we employ graph neural networks to tackle the important problem of parameter estimation from data collected over arbitrary spatial locations. In addition to extending neural Bayes estimation to irregular spatial data, our architecture leads to substantial computational benefits, since the estimator can be used with any arrangement or number of locations and independent replicates, thus amortising the cost of training for a given spatial model. We also facilitate fast uncertainty quantification by training an accompanying neural Bayes estimator that approximates a set of marginal posterior quantiles. We illustrate our methodology on Gaussian and max-stable processes. Finally, we showcase our methodology in a global sea-surface temperature application, where we estimate the parameters of a Gaussian process model in 2,161 regions, each containing thousands of irregularly-spaced data points, in just a few minutes with a single graphics processing unit.
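A bare-bones sketch of a permutation-invariant estimator over a k-nearest-neighbour graph is given below; the architecture and dimensions are illustrative assumptions, not the paper's design. The key property it shares with the paper's approach is that it accepts any number and arrangement of spatial locations.

```python
import torch
import torch.nn as nn

class GNNBayesEstimator(nn.Module):
    """Sketch of a neural Bayes estimator for irregular sites: mean
    aggregation over a k-NN graph, then a global readout to parameters."""
    def __init__(self, d_in=1, d_h=64, d_out=2, rounds=3):
        super().__init__()
        self.enc = nn.Linear(d_in, d_h)
        self.msg = nn.ModuleList(nn.Linear(2 * d_h, d_h)
                                 for _ in range(rounds))
        self.out = nn.Linear(d_h, d_out)

    def forward(self, z, nbrs):
        # z: (N, d_in) data at N sites; nbrs: (N, k) neighbour indices.
        h = torch.relu(self.enc(z))
        for lin in self.msg:
            m = h[nbrs].mean(dim=1)                  # aggregate neighbours
            h = torch.relu(lin(torch.cat([h, m], dim=-1)))
        return self.out(h.mean(dim=0))               # permutation-invariant

# Such an estimator is trained likelihood-free: draw parameters from the
# prior, simulate data from the spatial model, and minimise e.g. the MSE
# between predicted and true parameters (the Bayes risk for squared loss).
```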


Automatic Item Generation of Figural Analogy Problems: A Review and Outlook

Yang, Yuan, Sanyal, Deepayan, Michelson, Joel, Ainooson, James, Kunda, Maithilee

arXiv.org Artificial Intelligence

Figural analogy problems have long been a widely used format in human intelligence tests. Over the past four decades, a growing body of research has investigated automatic item generation for figural analogy problems, i.e., algorithmic approaches for systematically and automatically creating such problems. In cognitive science and psychometrics, this research can deepen our understanding of human analogical ability and the psychometric properties of figural analogies. With the recent development of data-driven AI models for reasoning about figural analogies, the territory of automatic item generation of figural analogies has further expanded. This expansion brings new challenges as well as opportunities, which demand reflection on previous item generation research and planning of future studies. This paper reviews the important works on automatic item generation of figural analogies for both human intelligence tests and data-driven AI models. From an interdisciplinary perspective, the principles and technical details of these works are analyzed and compared, and desiderata for future research are suggested.