AITopics | Spatial Reasoning

Spatial Reasoning

Sphere2Vec: A General-Purpose Location Representation Learning over a Spherical Surface for Large-Scale Geospatial Predictions

Mai, Gengchen, Xuan, Yao, Zuo, Wenyun, He, Yutong, Song, Jiaming, Ermon, Stefano, Janowicz, Krzysztof, Lao, Ni

arXiv.org Artificial Intelligence | Jul-2-2023

Generating learning-friendly representations for points in space is a fundamental and long-standing problem in machine learning (ML). Recently, multi-scale encoding schemes (such as Space2Vec and NeRF) were proposed to directly encode any point in 2D/3D Euclidean space as a high-dimensional vector, and have been successfully applied to various geospatial prediction and generative tasks. However, all current 2D and 3D location encoders are designed to model point distances in Euclidean space. So when applied to large-scale real-world GPS coordinate datasets, which require distance metric learning on the spherical surface, both types of models can fail due to the map projection distortion problem (2D) and the spherical-to-Euclidean distance approximation error (3D). To solve these problems, we propose a multi-scale location encoder called Sphere2Vec, which can preserve spherical distances when encoding point coordinates on a spherical surface. We developed a unified view of distance-preserving encoding on spheres based on the Double Fourier Sphere (DFS). We also provide a theoretical proof that Sphere2Vec preserves the spherical surface distance between any two points, while existing encoding schemes do not. Experiments on 20 synthetic datasets show that Sphere2Vec can outperform all baseline models on all these datasets with up to 30.8% error rate reduction. We then apply Sphere2Vec to three geo-aware image classification tasks: fine-grained species recognition, Flickr image recognition, and remote sensing image classification. Results on 7 real-world datasets show the superiority of Sphere2Vec over multiple location encoders on all three tasks. Further analysis shows that Sphere2Vec outperforms other location encoder models, especially in the polar regions and data-sparse areas, because it inherently preserves spherical surface distance. Code and data are available at https://gengchenmai.github.io/sphere2vec-website/.
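The map projection distortion mentioned in this abstract can be illustrated with a small sketch. It is not the Sphere2Vec encoder; it only compares the great-circle (haversine) distance with a naive planar distance that treats latitude and longitude degrees as flat coordinates. The function names and the mean Earth radius of 6371 km are assumptions made for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption for this sketch


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def naive_planar_km(lat1, lon1, lat2, lon2):
    """Treat degrees as flat coordinates, with no correction for latitude."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360
    return km_per_degree * math.hypot(lat2 - lat1, lon2 - lon1)


if __name__ == "__main__":
    # Two points 10 degrees of longitude apart, at increasing latitude.
    for lat in (0.0, 60.0, 85.0):
        true_d = great_circle_km(lat, 0.0, lat, 10.0)
        flat_d = naive_planar_km(lat, 0.0, lat, 10.0)
        print(f"lat={lat:5.1f}  great-circle={true_d:8.1f} km  planar={flat_d:8.1f} km")
```

The two distances agree at the equator and diverge sharply toward the poles, which is consistent with the abstract's remark that Euclidean encoders struggle most in polar regions.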

artificial intelligence, machine learning, spatial reasoning, (19 more...)

arXiv.org Artificial Intelligence

2306.17624

Country:

North America > United States > California > Santa Barbara County > Santa Barbara (0.14)
North America > United States > Georgia > Clarke County > Athens (0.14)
North America > United States > California > Santa Clara County > Stanford (0.04)
(13 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology (1.00)
Health & Medicine > Epidemiology (0.92)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.67)
(3 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

S-Omninet: Structured Data Enhanced Universal Multimodal Learning Architecture

Xue, Ye, Klabjan, Diego, Utke, Jean

arXiv.org Artificial Intelligence | Jul-1-2023

Multimodal multitask learning has attracted increasing interest in recent years. Single-modal models have been advancing rapidly and have achieved astonishing results on various tasks across multiple domains. Multimodal learning offers opportunities for further improvements by integrating data from multiple modalities. Many methods have been proposed to learn on a specific type of multimodal data, such as vision and language data. A few of them are designed to handle several modalities and tasks at a time. In this work, we extend and improve Omninet, an architecture that is capable of handling multiple modalities and tasks at a time, by introducing cross-cache attention, integrating patch embeddings for vision inputs, and supporting structured data. The proposed Structured-data-enhanced Omninet (S-Omninet) is a universal model that is capable of learning effectively from structured data of various dimensions together with unstructured data through cross-cache attention, which enables interactions among spatial, temporal, and structured features. We also enhance spatial representations in a spatial cache with patch embeddings. We evaluate the proposed model on several multimodal datasets and demonstrate a significant improvement over the baseline, Omninet.

artificial intelligence, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2307.00226

Genre: Research Report (0.50)

Industry: Health & Medicine > Health Care Technology (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Exploring Spatial-Temporal Variations of Public Discourse on Social Media: A Case Study on the First Wave of the Coronavirus Pandemic in Italy

Michael, Anslow, Martina, Galletti

arXiv.org Artificial Intelligence | Jun-28-2023

This paper proposes a methodology for exploring how linguistic behaviour on social media can be used to study societal reactions to important events, such as those that transpired during the SARS-CoV-2 pandemic, particularly where the spatial and temporal aspects of events are important features. Our methodology consists of grounding spatial-temporal categories in tweet usage trends using time-series analysis and clustering. Salient terms in each category were then identified through qualitative comparative analysis based on scaled f-scores aggregated into hand-coded categories. To exemplify this approach, we conducted a case study on the first wave of the coronavirus in Italy. We used our proposed methodology to explore existing psychological observations which claimed that physical distance from events affects what is communicated about them. We confirmed these findings by showing that the epicentre of the disease and peripheral regions correspond to clear time-series clusters and that those living in the epicentre of the SARS-CoV-2 outbreak were more focused on solidarity and policy than those from more peripheral regions. Furthermore, we found that temporal categories corresponded closely to policy changes during the handling of the pandemic.

artificial intelligence, category, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2306.16031

Country:

Europe > Italy > Lombardy > Lodi Province > Codogno (0.06)
Europe > Italy > Basilicata (0.05)
Asia > China > Hubei Province > Wuhan (0.05)
(20 more...)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Enhanced Neural Beamformer with Spatial Information for Target Speech Extraction

Guo, Aoqi, Wu, Junnan, Gao, Peng, Zhu, Wenbo, Guo, Qinwen, Gao, Dazhi, Wang, Yujun

arXiv.org Artificial Intelligence | Jun-28-2023

Recently, deep learning-based beamforming algorithms have shown promising performance in target speech extraction tasks. However, most systems do not fully utilize spatial information. In this paper, we propose a target speech extraction network that utilizes spatial information to enhance the performance of the neural beamformer. To achieve this, we first use the UNet-TCN structure to model input features and improve the estimation accuracy of the speech pre-separation module by avoiding the information loss caused by direct dimensionality reduction in other models. Furthermore, we introduce a multi-head cross-attention mechanism that enhances the neural beamformer's perception of spatial information by making full use of the spatial information received by the array. Experimental results demonstrate that our approach, which incorporates a more reasonable target mask estimation network and a spatial information-based cross-attention mechanism into the neural beamformer, effectively improves speech separation performance.

artificial intelligence, information, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2306.15942

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Survey of Federated Learning Models for Spatial-Temporal Mobility Applications

Belal, Yacine, Mokhtar, Sonia Ben, Haddadi, Hamed, Wang, Jaron, Mashhadi, Afra

arXiv.org Artificial Intelligence | Jun-27-2023

Spatial-temporal mobility data collected by location-based services (LBS) [42] and other means such as Call Data Records (CDR), WiFi hotspots, smart watches, cars, etc. is very useful from a socio-economic perspective, as it is at the heart of many useful applications (e.g., navigation, geo-located search, geo-located games) and it allows answering numerous societal research questions [51]. For example, Call Data Records have been successfully used to provide real-time traffic anomaly as well as event detection [90, 92], and a variety of mobility datasets have been used in shaping policies for urban communities [31] or epidemic management in the public health domain [80, 79]. From an individual-level perspective, users can benefit from personalized recommendations when they are encouraged to share their location data with third parties [22]. While there is no doubt about the usefulness of location-based applications, privacy concerns regarding the collection and sharing of individuals' mobility traces or aggregated flow of movements have prevented the data from being utilized to their full potential [87, 9, 53]. Indeed, various studies have shown that numerous threats arise if location data falls into the hands of inappropriate parties. These threats include re-identification [68], the inference of sensitive information about users [53, 94] (e.g., their home and work locations, religious beliefs, political interests or sexual

artificial intelligence, data mining, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2305.05257

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
(18 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Transportation > Ground > Road (0.67)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
(3 more...)

Imitation with Spatial-Temporal Heatmap: 2nd Place Solution for NuPlan Challenge

Hu, Yihan, Li, Kun, Liang, Pingyuan, Qian, Jingyu, Yang, Zhening, Zhang, Haichao, Shao, Wenxin, Ding, Zhuangzhuang, Xu, Wei, Liu, Qiang

arXiv.org Artificial Intelligence | Jun-26-2023

This paper presents our 2nd place solution for the NuPlan Challenge 2023. Autonomous driving in real-world scenarios is highly complex and uncertain. Achieving safe planning in complex multimodal scenarios is a highly challenging task. Our approach, Imitation with Spatial-Temporal Heatmap, adopts the learning form of behavior cloning, predicts the future multimodal states with a heatmap representation, and uses trajectory refinement techniques to ensure final safety. The experiment shows that our method effectively balances the vehicle's progress and safety, generating safe and comfortable trajectories. In the NuPlan competition, we achieved the second highest overall score while obtaining the best scores in the ego progress and comfort metrics.

artificial intelligence, machine learning, trajectory, (18 more...)

arXiv.org Artificial Intelligence

2306.157

Country: Asia (0.04)

Genre: Research Report (0.50)

Industry: Transportation (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.62)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.49)

Spatio-temporal Diffusion Point Processes

Yuan, Yuan, Ding, Jingtao, Shao, Chenyang, Jin, Depeng, Li, Yong

arXiv.org Artificial Intelligence | Jun-24-2023

A spatio-temporal point process (STPP) is a stochastic collection of events, each accompanied by a time and a location. Due to computational complexity, existing solutions for STPPs compromise by assuming conditional independence between time and space, considering the temporal and spatial distributions separately. The failure to model the joint distribution limits their capacity to characterize the entangled spatio-temporal interactions given past events. In this work, we propose a novel parameterization framework for STPPs, which leverages diffusion models to learn complex spatio-temporal joint distributions. We decompose the learning of the target joint distribution into multiple steps, where each step can be faithfully described by a Gaussian distribution. To enhance the learning of each step, an elaborated spatio-temporal co-attention module is proposed to capture the interdependence between the event time and space adaptively. For the first time, we break the restrictions on spatio-temporal dependencies in existing solutions and enable a flexible and accurate modeling paradigm for STPPs. Extensive experiments from a wide range of fields, such as epidemiology, seismology, crime, and urban mobility, demonstrate that our framework substantially outperforms the state-of-the-art baselines, with an average improvement of over 50%. Further in-depth analyses validate its ability to capture spatio-temporal interactions and to learn adaptively for different scenarios. The datasets and source code are available online: https://github.com/tsinghua-fib-lab/Spatio-temporal-Diffusion-Point-Processes.
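The idea of breaking a distribution into steps that are each Gaussian can be sketched with the forward noising step of a generic DDPM-style diffusion model. This is not the authors' implementation: the linear noise schedule, its parameter values, the function names, and the toy event vector (time gap, x, y) are all assumptions chosen for illustration.

```python
import math
import random


def make_alpha_bars(num_steps=100, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear noise schedule (assumed values)."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_noise(event, alpha_bar_t, rng):
    """Sample x_t from N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)."""
    scale = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [scale * x + sigma * rng.gauss(0.0, 1.0) for x in event]


if __name__ == "__main__":
    rng = random.Random(0)
    event = [0.5, 1.2, -0.3]  # hypothetical (time gap, x, y) of one event
    alpha_bars = make_alpha_bars()
    for t in (0, 49, 99):
        print(t, forward_noise(event, alpha_bars[t], rng))
```

Because time and space are noised and denoised together as one vector, a model trained to reverse these steps can represent their joint distribution rather than two independent ones.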

data mining, machine learning, point process, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3580305.3599511

2305.12403

Country:

Asia > China (0.15)
North America > United States > New York (0.14)
Europe > Italy (0.14)
Africa (0.14)

Genre:

Workflow (0.68)
Research Report (0.50)

Industry:

Health & Medicine > Epidemiology (0.48)
Energy > Oil & Gas (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.93)
(2 more...)

Planning with Spatial-Temporal Abstraction from Point Clouds for Deformable Object Manipulation

Lin, Xingyu, Qi, Carl, Zhang, Yunchu, Huang, Zhiao, Fragkiadaki, Katerina, Li, Yunzhu, Gan, Chuang, Held, David

arXiv.org Artificial Intelligence | Jun-23-2023

Effective planning of long-horizon deformable object manipulation requires suitable abstractions at both the spatial and temporal levels. Previous methods typically either focus on short-horizon tasks or make the strong assumption that full-state information is available. However, full states of deformable objects are often unavailable. In this paper, we propose PlAnning with Spatial and Temporal Abstraction (PASTA), which incorporates both spatial abstraction (reasoning about objects and their relations to each other) and temporal abstraction (reasoning over skills instead of low-level actions). Our framework maps high-dimensional 3D point clouds into a set of latent vectors and plans skill sequences with the latent set representation. Our method can solve challenging, novel sequential deformable object manipulation tasks in the real world, which require combining multiple tool-use skills such as cutting with a knife, pushing with a pusher, and spreading dough with a roller. Additional materials can be found on our project website.

artificial intelligence, machine learning, spatial reasoning, (19 more...)

arXiv.org Artificial Intelligence

2210.15751

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(2 more...)

Learned spatial data partitioning

Hori, Keizo, Sasaki, Yuya, Amagata, Daichi, Murosaki, Yuki, Onizuka, Makoto

arXiv.org Artificial Intelligence | Jun-19-2023

Due to the significant increase in the size of spatial data, it is essential to use distributed parallel processing systems to efficiently analyze spatial data. In this paper, we first study learned spatial data partitioning, which effectively assigns groups of big spatial data to computers based on locations of data by using machine learning techniques. We formalize spatial data partitioning in the context of reinforcement learning and develop a novel deep reinforcement learning algorithm. Our learning algorithm leverages features of spatial data partitioning and prunes ineffective learning processes to find optimal partitions efficiently. Our experimental study, which uses Apache Sedona and real-world spatial data, demonstrates that our method efficiently finds partitions for accelerating distance join queries and reduces the workload run time by up to 59.4%.
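To make concrete what a spatial partitioner decides, here is a minimal sketch of a fixed uniform-grid partitioner together with a simple load-imbalance measure. It is a hand-written baseline, not the learned reinforcement learning method described in the abstract, and it does not use Apache Sedona; the function names and the imbalance metric are assumptions for illustration.

```python
from collections import Counter


def grid_partition(points, bounds, cells_per_side):
    """Assign each (x, y) point to a cell id in a uniform grid over `bounds`."""
    min_x, min_y, max_x, max_y = bounds
    ids = []
    for x, y in points:
        # Clamp so that points on the upper boundary fall into the last cell.
        col = min(int((x - min_x) / (max_x - min_x) * cells_per_side), cells_per_side - 1)
        row = min(int((y - min_y) / (max_y - min_y) * cells_per_side), cells_per_side - 1)
        ids.append(row * cells_per_side + col)
    return ids


def load_imbalance(ids, num_partitions):
    """Ratio of the largest partition size to the mean partition size."""
    counts = Counter(ids)
    mean = len(ids) / num_partitions
    return max(counts.values()) / mean


if __name__ == "__main__":
    points = [(0.1, 0.1), (0.2, 0.2), (0.9, 0.9), (0.6, 0.1)]
    ids = grid_partition(points, (0.0, 0.0, 1.0, 1.0), cells_per_side=2)
    print(ids, load_imbalance(ids, num_partitions=4))
```

A fixed grid ignores how the data is distributed, so skewed data produces uneven partitions; a learned partitioner, as the abstract describes, searches for boundaries that reduce query run time instead.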

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2306.04846

Country:

Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.06)
North America > United States > Washington > King County > Seattle (0.05)
South America (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > New Finding (0.34)
Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Spatial-Temporal Graph Learning with Adversarial Contrastive Adaptation

Zhang, Qianru, Huang, Chao, Xia, Lianghao, Wang, Zheng, Yiu, Siuming, Han, Ruihua

arXiv.org Artificial Intelligence | Jun-18-2023

Spatial-temporal graph learning has emerged as a promising solution for modeling structured spatial-temporal data and learning region representations for various urban sensing tasks such as crime forecasting and traffic flow prediction. However, most existing models are sensitive to the quality of the generated region graph due to an inaccurate graph-structured information aggregation schema. The ubiquitous noise and incompleteness of spatial-temporal data in real-life scenarios pose challenges in generating high-quality region representations. To address this challenge, we propose a new spatial-temporal graph learning model (GraphST) for enabling effective self-supervised learning. Our proposed model is an adversarial contrastive learning paradigm that automates the distillation of crucial multi-view self-supervised information for robust spatial-temporal graph augmentation. We empower GraphST to adaptively identify hard samples for better self-supervision, enhancing the representation discrimination ability and robustness. In addition, we introduce a cross-view contrastive learning paradigm to model the inter-dependencies across view-specific region representations and preserve underlying relation heterogeneity. We demonstrate the superiority of our proposed GraphST method in various spatial-temporal prediction tasks on real-life datasets. We release our model implementation at https://github.com/HKUDS/GraphST.

artificial intelligence, machine learning, spatial reasoning, (16 more...)

arXiv.org Artificial Intelligence

2306.10683

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
Asia > China > Hong Kong (0.04)
North America > United States > New York > New York County > Manhattan (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.48)

Industry:

Consumer Products & Services (0.48)
Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
