AITopics | Spatial Reasoning

Spatial Reasoning

Hybrid Spatial Representations for Species Distribution Modeling

arXiv.org Artificial Intelligence, Oct-22-2024

We address an important problem in ecology called Species Distribution Modeling (SDM), whose goal is to predict whether a species exists at a certain position on Earth. In particular, we tackle a challenging version of this task, where we learn from presence-only data in a community-sourced dataset, model a large number of species simultaneously, and do not use any additional environmental information. Previous work has used neural implicit representations to construct models that achieve promising results. However, implicit representations often generate predictions of limited spatial precision. We attribute this limitation to their inherently global formulation and inability to effectively capture local feature variations. This issue is especially pronounced with presence-only data and a large number of species. To address this, we propose a hybrid embedding scheme that combines both implicit and explicit embeddings. Specifically, the explicit embedding is implemented with a multiresolution hashgrid, enabling our models to better capture local information. Experiments demonstrate that our results exceed other works by a large margin on various standard benchmarks, and that the hybrid representation is better than both purely implicit and explicit ones. Qualitative visualizations and comprehensive ablation studies reveal that our hybrid representation successfully addresses the two main challenges. Our code is open-sourced at https://github.com/Shiran-Yuan/HSR-SDM.
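The explicit half of such a hybrid embedding can be illustrated with a small multiresolution hash grid. The following is a minimal sketch under stated assumptions, not the authors' implementation: the class name `HashGrid2D`, all of its parameters, the 2D unit-square coordinates, and the XOR-with-a-large-prime hash are hypothetical choices made for illustration, and the feature tables are randomly initialized here whereas they would be learned in practice.

```python
import random
from typing import List


class HashGrid2D:
    """Toy multiresolution hash grid over the unit square (illustrative only)."""

    def __init__(self, levels: int = 4, base_res: int = 4, growth: float = 2.0,
                 table_size: int = 1024, feat_dim: int = 2, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Grid resolution grows geometrically from coarse to fine levels.
        self.resolutions = [int(base_res * growth ** level) for level in range(levels)]
        self.table_size = table_size
        self.feat_dim = feat_dim
        # One feature table per level; random here, trainable in a real model.
        self.tables = [
            [[rng.uniform(-1e-2, 1e-2) for _ in range(feat_dim)]
             for _ in range(table_size)]
            for _ in range(levels)
        ]

    def _index(self, ix: int, iy: int) -> int:
        # Assumed spatial hash: XOR of one coordinate with the other scaled
        # by a large prime, reduced modulo the table size.
        return (ix ^ (iy * 2654435761)) % self.table_size

    def encode(self, x: float, y: float) -> List[float]:
        """Return concatenated per-level features for a point in [0, 1]^2."""
        out: List[float] = []
        for res, table in zip(self.resolutions, self.tables):
            gx, gy = x * res, y * res
            x0, y0 = int(gx), int(gy)      # lower-left grid vertex
            tx, ty = gx - x0, gy - y0      # position inside the cell
            feat = [0.0] * self.feat_dim
            # Bilinear interpolation over the four surrounding vertices.
            for dx, wx in ((0, 1.0 - tx), (1, tx)):
                for dy, wy in ((0, 1.0 - ty), (1, ty)):
                    corner = table[self._index(x0 + dx, y0 + dy)]
                    for k in range(self.feat_dim):
                        feat[k] += wx * wy * corner[k]
            out.extend(feat)
        return out


grid = HashGrid2D()
embedding = grid.encode(0.3, 0.7)  # 4 levels x 2 features = 8 values
```

Fine levels resolve local variation that a purely global implicit encoding tends to smooth over. A hybrid scheme in the spirit of the abstract would combine this vector with an implicit positional embedding before the prediction network.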

artificial intelligence, machine learning, representation, (20 more...)

arXiv.org Artificial Intelligence

2410.10937

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.64)

SSMT: Few-Shot Traffic Forecasting with Single Source Meta-Transfer

Bhaumik, Kishor Kumar, Kim, Minha, Niloy, Fahim Faisal, Ali, Amin Ahsan, Woo, Simon S.

arXiv.org Artificial Intelligence, Oct-20-2024

Traffic forecasting is vital to Intelligent Transportation Systems (ITS). Yet ITS often relies on data from traffic sensors or vehicle devices, and some cities lack these smart devices or the enabling infrastructure. Recent studies have employed meta-learning to generalize spatial-temporal traffic networks, using data from multiple cities to forecast traffic effectively in data-scarce target cities. However, collecting data from multiple cities can be costly and time-consuming. To tackle this challenge, we introduce Single Source Meta-Transfer Learning (SSMT), which relies on only a single source city for traffic prediction. Our method harnesses this transferred knowledge to enable few-shot traffic forecasting, particularly when the target city possesses limited data. Specifically, we use memory-augmented attention to store the heterogeneous spatial knowledge from the source city and selectively recall it for the data-scarce target city. We extend the idea of sinusoidal positional encoding to establish meta-learning tasks by leveraging diverse temporal traffic patterns from the source city. Moreover, to capture a more generalized representation of the positions, we introduce a meta-positional encoding that learns the optimal representation of the temporal pattern across all the tasks. We experiment on five real-world benchmark datasets to demonstrate that our method outperforms several existing methods in time series traffic prediction.
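For readers unfamiliar with the sinusoidal positional encoding that the abstract builds on, here is a minimal sketch of the standard Transformer-style formulation. It shows only the base encoding, not the paper's meta-positional extension; the function name `sinusoidal_encoding` and the base constant 10000 are assumptions taken from the common formulation rather than from the paper.

```python
import math
from typing import List


def sinusoidal_encoding(position: int, d_model: int) -> List[float]:
    """Standard sinusoidal encoding of one position (d_model must be even)."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    encoding: List[float] = []
    for i in range(d_model // 2):
        # Each sin/cos pair uses a geometrically decreasing frequency.
        angle = position / (10000 ** (2 * i / d_model))
        encoding.append(math.sin(angle))
        encoding.append(math.cos(angle))
    return encoding


# Encode time step 5 of a traffic series into an 8-dimensional vector.
vector = sinusoidal_encoding(5, 8)
```

Because each dimension pair oscillates at a different frequency, nearby time steps get similar vectors while distant ones remain distinguishable.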

artificial intelligence, machine learning, spatial reasoning, (17 more...)

arXiv.org Artificial Intelligence

2410.15589

Country:

Asia > China > Sichuan Province > Chengdu (0.05)
Asia > China > Guangdong Province > Shenzhen (0.05)
North America > United States > California > Riverside County > Riverside (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Information Technology (0.93)
Transportation > Infrastructure & Services (0.48)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.87)

Synergistic Dual Spatial-aware Generation of Image-to-Text and Text-to-Image

Zhao, Yu, Fei, Hao, Li, Xiangtai, Qin, Libo, Ji, Jiayi, Zhu, Hongyuan, Zhang, Meishan, Zhang, Min, Wei, Jianguo

arXiv.org Artificial Intelligence, Oct-20-2024

In the visual spatial understanding (VSU) area, spatial image-to-text (SI2T) and spatial text-to-image (ST2I) are two fundamental tasks that appear in dual form. Existing methods for standalone SI2T or ST2I perform imperfectly in spatial understanding, due to the difficulty of 3D-wise spatial feature modeling. In this work, we consider modeling SI2T and ST2I together under a dual learning framework. Within the dual framework, we propose to represent the 3D spatial scene features with a novel 3D scene graph (3DSG) representation that can be shared by, and is beneficial to, both tasks. Further, inspired by the intuition that the easier 3D$\to$image and 3D$\to$text processes also exist symmetrically in ST2I and SI2T, respectively, we propose the Spatial Dual Discrete Diffusion (SD$^3$) framework, which utilizes the intermediate features of the 3D$\to$X processes to guide the hard X$\to$3D processes, such that the overall ST2I and SI2T will benefit each other. On the visual spatial understanding dataset VSD, our system outperforms the mainstream T2I and I2T methods significantly. Further in-depth analysis reveals how our dual learning strategy advances.

diffusion model, large language model, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2410.15312

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(26 more...)

Genre: Research Report (0.82)

Industry: Transportation > Ground > Rail (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
(2 more...)

The S2 Hierarchical Discrete Global Grid as a Nexus for Data Representation, Integration, and Querying Across Geospatial Knowledge Graphs

Stephen, Shirly, Faulk, Mitchell, Janowicz, Krzysztof, Fisher, Colby, Thelen, Thomas, Zhu, Rui, Hitzler, Pascal, Shimizu, Cogan, Currier, Kitty, Schildhauer, Mark, Rehberger, Dean, Wang, Zhangyu, Christou, Antrea

arXiv.org Artificial Intelligence, Oct-18-2024

Geospatial Knowledge Graphs (GeoKGs) have become integral to the growing field of Geospatial Artificial Intelligence. Initiatives like the U.S. National Science Foundation's Open Knowledge Network program aim to create an ecosystem of nation-scale, cross-disciplinary GeoKGs that provide AI-ready geospatial data aligned with FAIR principles. However, building this infrastructure presents key challenges, including 1) managing large volumes of data, 2) the computational complexity of discovering topological relations via SPARQL, and 3) conflating multi-scale raster and vector data. Discrete Global Grid Systems (DGGS) help tackle these issues by offering efficient data integration and representation strategies. The KnowWhereGraph utilizes Google's S2 Geometry, a DGGS framework, to enable efficient multi-source data processing, qualitative spatial querying, and cross-graph integration. This paper outlines the implementation of S2 within KnowWhereGraph, emphasizing its role in topologically enriching and semantically compressing data. Ultimately, this work demonstrates the potential of DGGS frameworks, particularly S2, for building scalable GeoKGs.

artificial intelligence, geokg, spatial reasoning, (14 more...)

arXiv.org Artificial Intelligence

2410.14808

Country:

North America > United States > California > Santa Barbara County > Santa Barbara (0.28)
Europe > Austria > Vienna (0.14)
North America > United States > Illinois (0.04)
(15 more...)

Genre: Research Report (0.40)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.66)
Transportation > Ground > Road (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)

Context-Enhanced Multi-View Trajectory Representation Learning: Bridging the Gap through Self-Supervised Models

Qian, Tangwen, Li, Junhe, Chen, Yile, Cong, Gao, Sun, Tao, Wang, Fei, Xu, Yongjun

arXiv.org Artificial Intelligence, Oct-18-2024

Modeling trajectory data with general-purpose dense representations has become a prevalent paradigm for various downstream applications, such as trajectory classification, travel time estimation, and similarity computation. However, existing methods typically rely on trajectories from a single spatial view, limiting their ability to capture the rich contextual information that is crucial for gaining deeper insights into movement patterns across different geospatial contexts. To this end, we propose MVTraj, a novel multi-view modeling method for trajectory representation learning. MVTraj integrates diverse contextual knowledge, from GPS to road networks and points of interest, to provide a more comprehensive understanding of trajectory data. To align the learning process across multiple views, we utilize GPS trajectories as a bridge and employ self-supervised pretext tasks to capture and distinguish movement patterns across different spatial views. Following this, we treat trajectories from different views as distinct modalities and apply a hierarchical cross-modal interaction module to fuse the representations, thereby enriching the knowledge derived from multiple sources. Extensive experiments on real-world datasets demonstrate that MVTraj significantly outperforms existing baselines in tasks associated with various spatial views, validating its effectiveness and practical utility in spatio-temporal modeling.

representation, spatial view, trajectory, (11 more...)

arXiv.org Artificial Intelligence

2410.13196

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Sichuan Province > Chengdu (0.05)
Asia > China > Shaanxi Province > Xi'an (0.05)
(4 more...)

Genre: Research Report (1.00)

Industry:

Transportation > Infrastructure & Services (0.40)
Transportation > Ground > Road (0.40)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Hiformer: Hybrid Frequency Feature Enhancement Inverted Transformer for Long-Term Wind Power Prediction

Wan, Chongyang, Lei, Shunbo, Luo, Yuan

arXiv.org Artificial Intelligence, Oct-17-2024

The increasing severity of climate change necessitates an urgent transition to renewable energy sources, making the large-scale adoption of wind energy crucial for mitigating environmental impact. However, the inherent uncertainty of wind power poses challenges for grid stability, underscoring the need for accurate wind energy prediction models to enable effective power system planning and operation. While many existing studies on wind power prediction focus on short-term forecasting, they often overlook the importance of long-term predictions. Long-term wind power forecasting is essential for effective power grid dispatch and market transactions, as it requires careful consideration of weather features such as wind speed and direction, which directly influence power output. Consequently, methods designed for short-term predictions may lead to inaccurate results and high computational costs in long-term settings. To address these limitations, we propose a novel approach called Hybrid Frequency Feature Enhancement Inverted Transformer (Hiformer). Hiformer introduces a unique structure that integrates signal decomposition technology with a weather feature extraction technique to enhance the modeling of correlations between meteorological conditions and wind power generation. Additionally, Hiformer employs an encoder-only architecture, which reduces the computational complexity associated with long-term wind power forecasting. Compared to state-of-the-art methods, Hiformer (i) can improve prediction accuracy by up to 52.5% and (ii) can reduce computational time by up to 68.5%.

data mining, forecasting, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2410.13303

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > Promising Solution (0.68)

Industry: Energy > Renewable > Wind (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science > Data Mining (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

ST-MoE-BERT: A Spatial-Temporal Mixture-of-Experts Framework for Long-Term Cross-City Mobility Prediction

He, Haoyu, Luo, Haozheng, Wang, Qi R.

arXiv.org Artificial Intelligence, Oct-17-2024

Predicting human mobility across multiple cities presents significant challenges due to the complex and diverse spatial-temporal dynamics inherent in different urban environments. In this study, we propose a robust approach to predict human mobility patterns called ST-MoE-BERT. Unlike existing methods, our approach frames the prediction task as a spatial-temporal classification problem. Our methodology integrates the Mixture-of-Experts architecture with a BERT model to capture complex mobility dynamics and perform the downstream human mobility prediction task. Additionally, transfer learning is integrated to address the challenge of data scarcity in cross-city prediction. We demonstrate the effectiveness of the proposed model on GEO-BLEU and DTW, comparing it to several state-of-the-art methods. Notably, ST-MoE-BERT achieves an average improvement of 8.29%.
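DTW, one of the two evaluation metrics named in the abstract, is dynamic time warping. The following is a minimal sketch of the classic dynamic-programming formulation, assuming one-dimensional sequences and absolute difference as the local cost. The function name `dtw_distance` is hypothetical, and the benchmark's own DTW variant for 2D trajectories may differ.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


# A predicted series that lingers one extra step still aligns perfectly.
score = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

Lower values mean the predicted and observed sequences can be aligned more cheaply, so the metric tolerates small timing shifts that a pointwise error would penalize.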

data mining, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.14099

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre:

Research Report > Promising Solution (0.34)
Research Report > New Finding (0.34)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Adaptive Subsampling and Learned Model Improve Spatiotemporal Resolution of Tactile Skin

Slepyan, Ariel, Li, Dian, Aug, Aidan, Sankar, Sriramana, Tran, Trac, Thakor, Nitish

arXiv.org Artificial Intelligence, Oct-17-2024

High-speed tactile arrays are essential for real-time robotic control in unstructured environments, but high pixel counts limit readout rates of most large tactile arrays to below 100 Hz. We introduce ACTS (adaptive compressive tactile subsampling), a method that efficiently samples tactile matrices and reconstructs interactions using sparse recovery and a learned tactile dictionary. Tested on a 1024-pixel sensor array (32x32), ACTS increased frame rates by 18x compared to raster scanning, with minimal error. For the first time in large-area tactile skin, we demonstrate rapid object classification within 20 ms of contact, high-speed projectile detection, ricochet angle estimation, and deformation tracking through enhanced spatiotemporal resolution. Our method can be implemented in firmware, upgrading existing low-cost, flexible, and robust tactile arrays into high-resolution systems for large-area spatiotemporal touch sensing.

machine learning, measurement level, temporal reasoning, (20 more...)

arXiv.org Artificial Intelligence

2410.13847

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Maryland > Baltimore (0.05)
North America > Canada > Quebec > Montreal (0.04)
(17 more...)

Genre: Research Report (0.50)

Industry:

Semiconductors & Electronics (0.93)
Health & Medicine > Therapeutic Area (0.68)
Health & Medicine > Health Care Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

A Prompt-Guided Spatio-Temporal Transformer Model for National-Wide Nuclear Radiation Forecasting

Lyu, Tengfei, Han, Jindong, Liu, Hao

arXiv.org Artificial Intelligence, Oct-15-2024

Nuclear radiation (NR), which refers to the energy emitted from atomic nuclei during decay, poses substantial risks to human health and environmental safety. Accurate forecasting of nuclear radiation levels is crucial for informed decision-making by both individuals and governments. However, this task is challenging due to the imbalanced distribution of monitoring stations over a wide spatial range and the non-stationary radiation variation patterns. In this study, we introduce NRFormer, an innovative framework tailored for nationwide prediction of nuclear radiation variations. By integrating a non-stationary temporal attention module, an imbalance-aware spatial attention module, and a radiation propagation prompting module, NRFormer captures the complex spatio-temporal dynamics of nuclear radiation. Extensive experiments on two real-world datasets demonstrate the superiority of our proposed framework over seven baselines. This research not only enhances the accuracy and reliability of nuclear radiation forecasting but also contributes to advancing emergency response strategies and monitoring systems, thereby safeguarding environmental and public health.

data mining, forecasting, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2410.11924

Country:

Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.05)
Asia > China > Hong Kong (0.04)
Europe > Ukraine > Kyiv Oblast > Chernobyl (0.04)
(4 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Energy > Power Industry > Utilities > Nuclear (1.00)
Health & Medicine (0.86)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Fed-piLot: Optimizing LoRA Assignment for Efficient Federated Foundation Model Fine-Tuning

Zhang, Zikai, Xu, Jiahao, Liu, Ping, Hu, Rui

arXiv.org Artificial Intelligence, Oct-14-2024

Foundation models (FMs) have shown remarkable advancements in enhancing the performance of intelligent applications. To address the need for data privacy in FM fine-tuning, federated learning has emerged as the de facto framework. Specifically, fine-tuning Federated FMs (FedFMs) with low-rank adaptation (LoRA) modules instead of the full model across multiple clients can achieve both parameter efficiency and data privacy. However, recent studies rarely address the challenges posed by clients with heterogeneous resources, particularly in GPU memory capacity. In this paper, we introduce Fed-piLot, an efficient FedFM fine-tuning framework with optimized local LoRA assignments for heterogeneous clients. By accounting for the different memory consumption of training different LoRA layers, as well as the varying contributions of different layers to model performance, we formulate the LoRA assignment as a Knapsack Optimization Problem. We design a Local-Global Information Gain Score (IG-Score) based value function to optimize LoRA assignment under clients' memory constraints. To further mitigate the impact of heterogeneity in model updates, we propose a novel Spatial-Temporal model aggregation (STAgg) rule using the Dynamic Weight Adjustment (DWA) strategy. Experimental results on three datasets under both IID and non-IID conditions demonstrate the effectiveness and efficiency of Fed-piLot. The code will be publicly available.
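The knapsack framing can be made concrete with a textbook 0/1 knapsack solver. The following is a minimal sketch under stated assumptions, not the Fed-piLot algorithm: the function name `assign_lora_layers`, the integer memory costs, and the per-layer values (standing in for an importance score such as the IG-Score) are all hypothetical.

```python
from typing import List, Tuple


def assign_lora_layers(costs: List[int], values: List[float],
                       budget: int) -> Tuple[float, List[int]]:
    """Pick layers maximizing total value with total memory cost <= budget."""
    n = len(costs)
    # best[i][b] = best value using only the first i layers within budget b.
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost, value = costs[i - 1], values[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip layer i-1
            if cost <= b and best[i - 1][b - cost] + value > best[i][b]:
                best[i][b] = best[i - 1][b - cost] + value  # option 2: train it
    # Walk back through the table to recover which layers were selected.
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= costs[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


# Hypothetical client: 4 candidate layers, memory budget of 5 units.
total_value, layers = assign_lora_layers([2, 3, 4, 5], [3.0, 4.0, 5.0, 6.0], 5)
```

With these made-up numbers the solver selects layers 0 and 1 for a total value of 7.0, since no single larger layer that fits the budget scores higher. Each client would solve its own instance with its own memory budget.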

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2410.102

Country:

North America > United States > Nevada > Washoe County > Reno (0.14)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.34)
