AITopics | Spatial Reasoning

Spatial Reasoning

SpatialPIN: Enhancing Spatial Reasoning Capabilities of Vision-Language Models through Prompting and Interacting 3D Priors

Neural Information Processing Systems, May-30-2025, 10:59:11 GMT

Current state-of-the-art spatial reasoning-enhanced VLMs are trained to excel at spatial visual question answering (VQA). However, we believe that higher-level 3D-aware tasks, such as articulating dynamic scene changes and motion planning, require a fundamental and explicit 3D understanding beyond current spatial VQA datasets. In this work, we present SpatialPIN, a framework designed to enhance the spatial reasoning capabilities of VLMs through prompting and interacting with priors from multiple 3D foundation models in a zero-shot, training-free manner. Extensive experiments demonstrate that our spatial reasoning-imbued VLM performs well on various forms of spatial VQA and can extend to help in various downstream robotics tasks such as pick and stack and trajectory planning.

large language model, machine learning, natural language, (17 more...)

Country: Europe (0.14)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Validating Climate Models with Spherical Convolutional Wasserstein Distance

Neural Information Processing Systems, May-29-2025, 21:57:05 GMT

The validation of global climate models is crucial to ensure the accuracy and efficacy of model output. We introduce the spherical convolutional Wasserstein distance to more comprehensively measure differences between climate models and reanalysis data.

artificial intelligence, machine learning, scwd, (20 more...)

Country:

Asia (1.00)
North America > United States > Illinois (0.14)
North America > United States > Texas (0.14)
North America > United States > Colorado (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.67)

Object-Centric Representation Learning with Generative Spatial-Temporal Factorization

Neural Information Processing Systems, May-28-2025, 23:17:28 GMT

Learning object-centric scene representations is essential for attaining structural understanding and abstraction of complex scenes. Yet, because current approaches to unsupervised object-centric representation learning are built upon either a stationary observer assumption or a static scene assumption, they often: i) suffer from single-view spatial ambiguities, or ii) infer object representations from dynamic scenes incorrectly or inaccurately. To address this, we propose Dynamics-aware Multi-Object Network (DyMON), a method that broadens the scope of multi-view object-centric representation learning to dynamic scenes. We train DyMON on multi-view-dynamic-scene data and show that DyMON learns, without supervision, to factorize the entangled effects of observer motions and scene object dynamics from a sequence of observations, and constructs scene object spatial representations suitable for rendering at arbitrary times (querying across time) and from arbitrary viewpoints (querying across space). We also show that the factorized scene representations (w.r.t.

artificial intelligence, machine learning, representation, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Supplementary Material: Learning Representations from Audio-Visual Spatial Alignment

Neural Information Processing Systems, May-28-2025, 22:34:11 GMT

These are transformer networks of base dimension 512 and expansion ratio 4. All models were trained using the Adam optimizer. Pre-training hyper-parameters are summarized in Table 2. For semantic segmentation, we used a lightweight FPN segmentation head. Semantic segmentation predictions are then computed based on the features at all levels. This shows that the use of spatial negatives is complementary to AVC.
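As a rough illustration of what a base dimension of 512 with expansion ratio 4 implies, the sketch below assumes the common convention that the expansion ratio is the feed-forward hidden width divided by the base dimension. The excerpt does not define the term, so this reading and the helper name `feedforward_hidden_dim` are assumptions, not the authors' code.

```python
# Hypothetical helper; the name and the convention are assumptions,
# not taken from the supplementary material.
def feedforward_hidden_dim(base_dim: int, expansion_ratio: int) -> int:
    """Hidden width of a transformer feed-forward block under the
    usual convention hidden = base_dim * expansion_ratio."""
    return base_dim * expansion_ratio


base_dim = 512
hidden_dim = feedforward_hidden_dim(base_dim, expansion_ratio=4)

# Parameters of the two linear layers (weights plus biases):
# base_dim -> hidden_dim, then hidden_dim -> base_dim.
ffn_params = (base_dim * hidden_dim + hidden_dim) + (hidden_dim * base_dim + base_dim)

print(hidden_dim)   # 2048
print(ffn_params)   # 2099712
```

Under this assumption each feed-forward block holds roughly 2.1 million parameters, independent of the attention layers.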

artificial intelligence, machine learning, spatial reasoning, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.31)

FLAIR: a Country-Scale Land Cover Semantic Segmentation Dataset From Multi-Source Optical Imagery

Neural Information Processing Systems, May-28-2025, 21:01:35 GMT

We introduce the French Land cover from Aerospace ImageRy (FLAIR), an extensive dataset from the French National Institute of Geographical and Forest Information (IGN) that provides a unique and rich resource for large-scale geospatial analysis. FLAIR contains high-resolution aerial imagery with a ground sample distance of 20 cm and over 20 billion individually labeled pixels for precise land cover classification.
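To give a sense of scale, the two figures quoted above can be combined into an approximate ground area. This is a back-of-the-envelope sketch that assumes square pixels and no overlap between labeled pixels; the constant names are illustrative, and the result is a lower bound because the abstract says "over" 20 billion.

```python
# Figures quoted in the abstract.
GSD_M = 0.20                      # ground sample distance: 20 cm per pixel side
LABELED_PIXELS = 20_000_000_000   # "over 20 billion" labeled pixels

# Assumption: square, non-overlapping pixels.
pixel_area_m2 = GSD_M ** 2                                     # 0.04 m^2 per pixel
total_area_km2 = LABELED_PIXELS * pixel_area_m2 / 1_000_000    # m^2 -> km^2

print(round(total_area_km2))  # 800
```

So the labeled imagery covers on the order of 800 square kilometres or more under these assumptions.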

artificial intelligence, machine learning, spatial reasoning, (17 more...)

Country: Europe > France (0.46)

Genre: Research Report (0.46)

Industry:

Law (1.00)
Food & Agriculture > Agriculture (1.00)
Information Technology (0.67)
Government > Regional Government > Europe Government (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

1def1713ebf17722cbe300cfc1c88558-Supplemental.pdf

Neural Information Processing Systems, May-28-2025, 15:48:25 GMT

artificial intelligence, machine learning, tensor, (18 more...)

Country: North America > United States (0.29)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.84)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.66)

Road Network Representation Learning with the Third Law of Geography

Yile Chen

Neural Information Processing Systems, May-28-2025, 13:46:32 GMT

Road network representation learning aims to learn compressed and effective vectorized representations for road segments that are applicable to numerous tasks. In this paper, we identify the limitations of existing methods, particularly their overemphasis on the distance effect as outlined in the First Law of Geography. In response, we propose to endow road network representation with the principles of the recent Third Law of Geography. To this end, we propose a novel graph contrastive learning framework that employs geographic configuration-aware graph augmentation and spectral negative sampling, ensuring that road segments with similar geographic configurations yield similar representations, and vice versa, aligning with the principles stated in the Third Law. The framework further fuses the Third Law with the First Law through a dual contrastive learning objective to effectively balance the implications of both laws. We evaluate our framework on two real-world datasets across three downstream tasks. The results show that the integration of the Third Law significantly improves the performance of road segment representations in downstream tasks. Our code is available at https://github.com/Haicang/Garner.

artificial intelligence, machine learning, representation, (18 more...)

Country: North America > United States > California (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.93)

Representation Learning on Spatial Networks

Neural Information Processing Systems, May-28-2025, 09:46:20 GMT

Spatial networks are networks for which the nodes and edges are constrained by geometry and embedded in real space, which has crucial effects on their topological properties. Although tremendous success has been achieved in spatial and network representation separately in recent years, there exists very little work on the representation of spatial networks. Extracting powerful representations from spatial networks requires appropriate tools that capture the coupling of spatial and network information while remaining invariant to node permutation, rotation, and translation. Hence spatial networks cannot be modeled with either spatial or network models alone. To address these challenges, this paper proposes a generic framework for spatial network representation learning. Specifically, a provably information-lossless and rotation-translation invariant representation of spatial information on networks is presented. Then a higher-order spatial network convolution operation that adapts to our proposed representation is introduced.
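The invariance requirement can be illustrated with a much simpler quantity than the representation proposed in the paper. The sketch below checks that pairwise Euclidean distances between node coordinates are unchanged by a planar rotation plus translation. Pairwise distances are used here only as a stand-in: they are not the paper's representation and are not information-lossless (for example, they cannot distinguish a layout from its mirror image). All function names are illustrative.

```python
import math


def pairwise_distances(points):
    """Matrix of Euclidean distances between all pairs of 2D points."""
    return [[math.dist(p, q) for q in points] for p in points]


def rotate_translate(points, angle, shift):
    """Rotate 2D points by `angle` radians about the origin, then shift them."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1]) for x, y in points]


# Toy node coordinates of a small spatial network (made-up values).
nodes = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
moved = rotate_translate(nodes, angle=0.7, shift=(3.0, -1.5))

before = pairwise_distances(nodes)
after = pairwise_distances(moved)

# The coordinates change, but every pairwise distance is preserved.
invariant = all(
    math.isclose(a, b, abs_tol=1e-9)
    for row_a, row_b in zip(before, after)
    for a, b in zip(row_a, row_b)
)
print(invariant)  # True
```

A model that consumes raw coordinates would see `nodes` and `moved` as different inputs, while one that consumes only invariant quantities would not.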

artificial intelligence, machine learning, natural language, (19 more...)

Country: North America > United States (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.93)
Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.90)
Information Technology > Communications > Networks (0.90)

SpelsNet: Surface Primitive Elements Segmentation by B-Rep Graph Structure Supervision

Neural Information Processing Systems, May-28-2025, 06:53:19 GMT

Boundary Representation (B-Rep) is the standard approach for modeling shapes in Computer-Aided Design (CAD).

artificial intelligence, machine learning, natural language, (20 more...)

Country: Europe (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Software (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.46)

Persistence Fisher Kernel: A Riemannian Manifold Kernel for Persistence Diagrams

Tam Le, Makoto Yamada

Neural Information Processing Systems, May-26-2025, 08:28:15 GMT

Algebraic topology methods have recently played an important role in the statistical analysis of data with complicated geometric structure, such as shapes, linked twist maps, and material data. Among them, persistent homology is a well-known tool for extracting robust topological features, which it outputs as persistence diagrams (PDs). However, PDs are point multi-sets that cannot be used directly in machine learning algorithms designed for vector data. To address this, an emerging approach is to use kernel methods, where an appropriate geometry for PDs is an important factor in measuring the similarity of PDs. A popular geometry for PDs is the Wasserstein metric.
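For readers unfamiliar with the Wasserstein metric on persistence diagrams, here is a minimal sketch for very small diagrams. It assumes a Euclidean ground distance and exponent p = 2 (conventions vary, and the L-infinity ground norm is also common), lets unmatched points go to their nearest point on the diagonal, and finds the best matching by brute force over permutations. Practical implementations use an assignment solver instead; the function names are illustrative, and this is not the kernel proposed in the paper.

```python
import itertools
import math


def diagonal_projection(point):
    """Nearest point on the diagonal birth == death (Euclidean sense)."""
    birth, death = point
    mid = (birth + death) / 2.0
    return (mid, mid)


def wasserstein_pd(diagram_a, diagram_b, p=2):
    """p-Wasserstein distance between two tiny persistence diagrams.

    Each diagram is a list of (birth, death) pairs. Brute force: the cost
    grows factorially, so this is only suitable for a handful of points.
    """
    # Pad each side with the diagonal projections of the other side's points,
    # so every point may be matched either to a point or to the diagonal.
    aug_a = [(pt, False) for pt in diagram_a] + [
        (diagonal_projection(pt), True) for pt in diagram_b
    ]
    aug_b = [(pt, False) for pt in diagram_b] + [
        (diagonal_projection(pt), True) for pt in diagram_a
    ]

    def cost(u, v):
        (pu, u_on_diag), (pv, v_on_diag) = u, v
        if u_on_diag and v_on_diag:
            return 0.0  # matching diagonal to diagonal is free
        return math.dist(pu, pv) ** p

    best = min(
        sum(cost(u, aug_b[j]) for u, j in zip(aug_a, perm))
        for perm in itertools.permutations(range(len(aug_b)))
    )
    return best ** (1.0 / p)


# Made-up toy diagrams: one long-lived feature plus one short-lived feature
# in the first diagram, a single long-lived feature in the second.
pd_a = [(0.0, 4.0), (1.0, 1.5)]
pd_b = [(0.0, 3.0)]
distance = wasserstein_pd(pd_a, pd_b)
print(distance)
```

In this toy case the long-lived features are matched to each other and the short-lived feature is matched to the diagonal, which is why short-lived (noisy) features contribute little to the distance.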

artificial intelligence, machine learning, spatial reasoning, (15 more...)

Country: