Goto

Collaborating Authors

 Spatial Reasoning


On the geometric and Riemannian structure of the spaces of group equivariant non-expansive operators

arXiv.org Artificial Intelligence

Group equivariant non-expansive operators have been recently proposed as basic components in topological data analysis and deep learning. In this paper we study some geometric properties of the spaces of group equivariant operators and show how a space $\mathcal{F}$ of group equivariant non-expansive operators can be endowed with the structure of a Riemannian manifold, so making available the use of gradient descent methods for the minimization of cost functions on $\mathcal{F}$. As an application of this approach, we also describe a procedure to select a finite set of representative group equivariant non-expansive operators in the considered manifold.


Toward Spatial Temporal Consistency of Joint Visual Tactile Perception in VR Applications

arXiv.org Artificial Intelligence

With the development of VR technology, especially the emergence of the metaverse concept, the integration of visual and tactile perception has become an expected experience in human-machine interaction. Therefore, achieving spatial-temporal consistency of visual and tactile information in VR applications has become a necessary factor for realizing this experience. The state-of-the-art vibrotactile datasets generally contain temporal-level vibrotactile information collected by randomly sliding on the surface of an object, along with the corresponding image of the material/texture. However, they lack the position/spatial information that corresponds to the signal acquisition, making it difficult to achieve spatiotemporal alignment of visual-tactile data. Therefore, we develop a new data acquisition system in this paper which can collect visual and vibrotactile signals of different textures/materials with spatial and temporal consistency. In addition, we develop a VR-based application call "V-Touching" by leveraging the dataset generated by the new acquisition system, which can provide pixel-to-taxel joint visual-tactile perception when sliding over the surface of objects in the virtual environment with distinct vibrotactile feedback of different textures/materials.


Spatial-Temporal Interplay in Human Mobility: A Hierarchical Reinforcement Learning Approach with Hypergraph Representation

arXiv.org Artificial Intelligence

In the realm of human mobility, the decision-making process for selecting the next-visit location is intricately influenced by a trade-off between spatial and temporal constraints, which are reflective of individual needs and preferences. This trade-off, however, varies across individuals, making the modeling of these spatial-temporal dynamics a formidable challenge. To address the problem, in this work, we introduce the "Spatial-temporal Induced Hierarchical Reinforcement Learning" (STI-HRL) framework, for capturing the interplay between spatial and temporal factors in human mobility decision-making. Specifically, STI-HRL employs a two-tiered decision-making process: the low-level focuses on disentangling spatial and temporal preferences using dedicated agents, while the high-level integrates these considerations to finalize the decision. To complement the hierarchical decision setting, we construct a hypergraph to organize historical data, encapsulating the multi-aspect semantics of human mobility. We propose a cross-channel hypergraph embedding module to learn the representations as the states to facilitate the decision-making cycle. Our extensive experiments on two real-world datasets validate the superiority of STI-HRL over state-of-the-art methods in predicting users' next visits across various performance metrics.


Superpixel-based and Spatially-regularized Diffusion Learning for Unsupervised Hyperspectral Image Clustering

arXiv.org Artificial Intelligence

Hyperspectral images (HSIs) provide exceptional spatial and spectral resolution of a scene, crucial for various remote sensing applications. However, the high dimensionality, presence of noise and outliers, and the need for precise labels of HSIs present significant challenges to HSIs analysis, motivating the development of performant HSI clustering algorithms. This paper introduces a novel unsupervised HSI clustering algorithm, Superpixel-based and Spatially-regularized Diffusion Learning (S2DL), which addresses these challenges by incorporating rich spatial information encoded in HSIs into diffusion geometry-based clustering. S2DL employs the Entropy Rate Superpixel (ERS) segmentation technique to partition an image into superpixels, then constructs a spatially-regularized diffusion graph using the most representative high-density pixels. This approach reduces computational burden while preserving accuracy. Cluster modes, serving as exemplars for underlying cluster structure, are identified as the highest-density pixels farthest in diffusion distance from other highest-density pixels. These modes guide the labeling of the remaining representative pixels from ERS superpixels. Finally, majority voting is applied to the labels assigned within each superpixel to propagate labels to the rest of the image. This spatial-spectral approach simultaneously simplifies graph construction, reduces computational cost, and improves clustering performance. S2DL's performance is illustrated with extensive experiments on three publicly available, real-world HSIs: Indian Pines, Salinas, and Salinas A. Additionally, we apply S2DL to landscape-scale, unsupervised mangrove species mapping in the Mai Po Nature Reserve, Hong Kong, using a Gaofen-5 HSI. The success of S2DL in these diverse numerical experiments indicates its efficacy on a wide range of important unsupervised remote sensing analysis tasks.


A Method for Auto-Differentiation of the Voronoi Tessellation

arXiv.org Artificial Intelligence

Voronoi tessellation, also known as Voronoi diagram, is an important computational geometry technique that has applications in various scientific disciplines. It involves dividing a given space into regions based on the proximity to a set of points. Autodifferentiation is a powerful tool for solving optimization tasks. Autodifferentiation assumes constructing a computational graph that allows to compute gradients using backpropagation algorithm. However, often the Voronoi tessellation remains the only non-differentiable part of a pipeline, prohibiting end-to-end differentiation. We present the method for autodifferentiation of the 2D Voronoi tessellation. The method allows one to construct the Voronoi tessellation and pass gradients, making the construction end-to-end differentiable. We provide the implementation details and present several important applications. To the best of our knowledge this is the first autodifferentiable realization of the Voronoi tessellation providing full set of Voronoi geometrical parameters in a differentiable way.


GeoAI in Social Science

arXiv.org Artificial Intelligence

GeoAI, or geospatial artificial intelligence, is an exciting new area that leverages artificial intelligence (AI), geospatial big data and massive computing power to solve problems in high automation and intelligence (Li 2020; 2021). The term was first coined at an Association for Computing Machinery (ACM) workshop in 2017 and then quickly picked up by industry giants Microsoft and Esri for providing new ways of analyzing geospatial data in a cloud environment. The rapid advances of GeoAI in both academia and industry are attributed to three factors: (1) the proliferation of geospatial big data has provided abundant information for researchers to study the environment and society; (2) the recent breakthrough in AI and machine learning (especially deep learning) has better positioned AI for complex and realworld problems; and (3) the fast developments in computing technology, such as Graphics Processing Unit computing, have made it possible to run compute-intensive models using big data. GeoAI evolves as AI evolves, but it is not simply an application of AI in geography. Instead, GeoAI is an interdisciplinary field that injects spatial theories and concepts to make AI more powerful and suitable for tackling geospatial problems.


STS-CCL: Spatial-Temporal Synchronous Contextual Contrastive Learning for Urban Traffic Forecasting

arXiv.org Artificial Intelligence

Efficiently capturing the complex spatiotemporal representations from large-scale unlabeled traffic data remains to be a challenging task. In considering of the dilemma, this work employs the advanced contrastive learning and proposes a novel Spatial-Temporal Synchronous Contextual Contrastive Learning (STS-CCL) model. First, we elaborate the basic and strong augmentation methods for spatiotemporal graph data, which not only perturb the data in terms of graph structure and temporal characteristics, but also employ a learning-based dynamic graph view generator for adaptive augmentation. Second, we introduce a Spatial-Temporal Synchronous Contrastive Module (STS-CM) to simultaneously capture the decent spatial-temporal dependencies and realize graph-level contrasting. To further discriminate node individuals in negative filtering, a Semantic Contextual Contrastive method is designed based on semantic features and spatial heterogeneity, achieving node-level contrastive learning along with negative filtering. Finally, we present a hard mutual-view contrastive training scheme and extend the classic contrastive loss to an integrated objective function, yielding better performance. Extensive experiments and evaluations demonstrate that building a predictor upon STS-CCL contrastive learning model gains superior performance than existing traffic forecasting benchmarks. The proposed STS-CCL is highly suitable for large datasets with only a few labeled data and other spatiotemporal tasks with data scarcity issue.


3DAxiesPrompts: Unleashing the 3D Spatial Task Capabilities of GPT-4V

arXiv.org Artificial Intelligence

In this work, we present a new visual prompting method called 3DAxiesPrompts (3DAP) to unleash the capabilities of GPT-4V in performing 3D spatial tasks. Our investigation reveals that while GPT-4V exhibits proficiency in discerning the position and interrelations of 2D entities through current visual prompting techniques, its abilities in handling 3D spatial tasks have yet to be explored. In our approach, we create a 3D coordinate system tailored to 3D imagery, complete with annotated scale information. By presenting images infused with the 3DAP visual prompt as inputs, we empower GPT-4V to ascertain the spatial positioning information of the given 3D target image with a high degree of precision. Through experiments, We identified three tasks that could be stably completed using the 3DAP method, namely, 2D to 3D Point Reconstruction, 2D to 3D point matching, and 3D Object Detection. We perform experiments on our proposed dataset 3DAP-Data, the results from these experiments validate the efficacy of 3DAP-enhanced GPT-4V inputs, marking a significant stride in 3D spatial task execution.


Spatial-Temporal DAG Convolutional Networks for End-to-End Joint Effective Connectivity Learning and Resting-State fMRI Classification

arXiv.org Artificial Intelligence

Building comprehensive brain connectomes has proved of fundamental importance in resting-state fMRI (rs-fMRI) analysis. Based on the foundation of brain network, spatial-temporal-based graph convolutional networks have dramatically improved the performance of deep learning methods in rs-fMRI time series classification. However, existing works either pre-define the brain network as the correlation matrix derived from the raw time series or jointly learn the connectome and model parameters without any topology constraint. These methods could suffer from degraded classification performance caused by the deviation from the intrinsic brain connectivity and lack biological interpretability of demonstrating the causal structure (i.e., effective connectivity) among brain regions. Moreover, most existing methods for effective connectivity learning are unaware of the downstream classification task and cannot sufficiently exploit useful rs-fMRI label information. To address these issues in an end-to-end manner, we model the brain network as a directed acyclic graph (DAG) to discover direct causal connections between brain regions and propose Spatial-Temporal DAG Convolutional Network (ST-DAGCN) to jointly infer effective connectivity and classify rs-fMRI time series by learning brain representations based on nonlinear structural equation model. The optimization problem is formulated into a continuous program and solved with score-based learning method via gradient descent. We evaluate ST-DAGCN on two public rs-fMRI databases. Experiments show that ST-DAGCN outperforms existing models by evident margins in rs-fMRI classification and simultaneously learns meaningful edges of effective connectivity that help understand brain activity patterns and pathological mechanisms in brain disease.


Artificial Intelligence and Human Geography

arXiv.org Artificial Intelligence

This paper examines the recent advances and applications of AI in human geography especially the use of machine (deep) learning, including place representation and modeling, spatial analysis and predictive mapping, and urban planning and design. AI technologies have enabled deeper insights into complex human-environment interactions, contributing to more effective scientific exploration, understanding of social dynamics, and spatial decision-making. Furthermore, human geography offers crucial contributions to AI, particularly in context-aware model development, human-centered design, biases and ethical considerations, and data privacy. The synergy beween AI and human geography is essential for addressing global challenges like disaster resilience, poverty, and equitable resource access. This interdisciplinary collaboration between AI and geography will help advance the development of GeoAI and promise a better and sustainable world for all.