Spatial Reasoning
Heterogenous Ensemble of Models for Molecular Property Prediction
Darabi, Sajad, Fazeli, Shayan, Liu, Jiwei, Milesi, Alexandre, Morkisz, Pawel, Puget, Jean-François, Titericz, Gilberto
The OGB Large-Scale Challenge (LSC) [Hu et al., 2021] is a machine learning (ML) challenge to predict a quantum chemical property, the HOMO-LUMO gap, of small molecules. The ground truth is obtained via a density functional theory (DFT) computation, which is known to be time-consuming and can take several hours even for small molecules. With the rapid advancement of machine learning, it is promising to replace this expensive DFT optimization process with fast, accurate, GPU-accelerated ML models. The PCQM4Mv2 dataset, based on the PubChemQC project [Nakata and Shimazaki, 2017], provides a well-defined ML task: predicting the HOMO-LUMO gap of molecules given their 2D molecular graphs. Each molecule has two natural views. The 2D graph captures the topological structure defined by bonds, and the 3D view provides spatial information that better reflects the geometry and spatial relations of the different bonds in the molecule.
HARDVS: Revisiting Human Activity Recognition with Dynamic Vision Sensors
Wang, Xiao, Wu, Zongzhen, Jiang, Bo, Bao, Zhimin, Zhu, Lin, Li, Guoqi, Wang, Yaowei, Tian, Yonghong
Mainstream human activity recognition (HAR) algorithms are developed for RGB cameras, which suffer from illumination changes, fast motion, privacy concerns, and high energy consumption. Meanwhile, biologically inspired event cameras have attracted great interest due to their unique features, such as high dynamic range, dense temporal but sparse spatial resolution, low latency, and low power. Because the sensor is so new, there is not yet a realistic large-scale dataset for HAR. Considering its great practical value, in this paper we propose a large-scale benchmark dataset to bridge this gap, termed HARDVS, which contains 300 categories and more than 100K event sequences. We evaluate and report the performance of multiple popular HAR algorithms, which provide extensive baselines for future work to compare against. More importantly, we propose a novel spatial-temporal feature learning and fusion framework, termed ESTF, for event-stream-based human activity recognition. It first projects the event streams into spatial and temporal embeddings using StemNet, then encodes and fuses the dual-view representations using Transformer networks. Finally, the dual features are concatenated and fed into a classification head for activity prediction. Extensive experiments on multiple datasets validate the effectiveness of our model. Both the dataset and source code will be released at https://github.com/Event-AHU/HARDVS.
Air Pollution Hotspot Detection and Source Feature Analysis using Cross-domain Urban Data
Zhang, Yawen, Hannigan, Michael, Lv, Qin
Air pollution is a major global environmental health threat, in particular for people who live or work near pollution sources. Areas adjacent to pollution sources often have high ambient pollution concentrations, and those areas are commonly referred to as air pollution hotspots. Detecting and characterizing pollution hotspots are of great importance for air quality management, but are challenging due to the high spatial and temporal variability of air pollutants. In this work, we explore the use of mobile sensing data (i.e., air quality sensors installed on vehicles) to detect pollution hotspots. One major challenge with mobile sensing data is uneven sampling, i.e., data collection can vary by both space and time. To address this challenge, we propose a two-step approach to detect hotspots from mobile sensing data, which includes local spike detection and sample-weighted clustering. Essentially, this approach tackles the uneven sampling issue by weighting samples based on their spatial frequency and temporal hit rate, so as to identify robust and persistent hotspots. To contextualize the hotspots and discover potential pollution source characteristics, we explore a variety of cross-domain urban data and extract features from them. As a soft-validation of the extracted features, we build hotspot inference models for cities with and without mobile sensing data. Evaluation results using real-world mobile sensing air quality data as well as cross-domain urban data demonstrate the effectiveness of our approach in detecting and inferring pollution hotspots. Furthermore, the empirical analysis of hotspots and source features yields useful insights regarding neighborhood pollution sources.
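The temporal hit rate mentioned in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the spike rule (a reading above `spike_ratio` times its local baseline), the `min_days` cutoff, the tuple layout, and all names are assumptions made for illustration, and the clustering step is left out entirely.

```python
from collections import defaultdict


def hotspot_scores(readings, spike_ratio=1.5, min_days=3):
    """Score grid cells by temporal hit rate.

    readings: iterable of (cell_id, day, value, local_baseline) tuples
              (hypothetical layout, not taken from the paper).
    Returns {cell_id: fraction of visited days that showed a spike},
    restricted to cells visited on at least `min_days` distinct days.
    """
    visited = defaultdict(set)  # cell -> days with at least one sample
    spiked = defaultdict(set)   # cell -> days with at least one spike
    for cell, day, value, baseline in readings:
        visited[cell].add(day)
        # Assumed spike rule: reading well above its local baseline.
        if value > spike_ratio * baseline:
            spiked[cell].add(day)

    scores = {}
    for cell, days in visited.items():
        # Rarely visited cells are dropped so one lucky pass cannot
        # create a "persistent" hotspot.
        if len(days) >= min_days:
            scores[cell] = len(spiked[cell]) / len(days)
    return scores


# Made-up readings: cell A spikes on 2 of 3 days, B is visited once,
# C is visited on 3 days and never spikes.
readings = [
    ("A", 1, 30.0, 10.0), ("A", 2, 28.0, 10.0), ("A", 3, 12.0, 10.0),
    ("B", 1, 40.0, 10.0),
    ("C", 1, 11.0, 10.0), ("C", 2, 12.0, 10.0), ("C", 3, 9.0, 10.0),
]
scores = hotspot_scores(readings)
```

Normalising by the number of visited days rather than the number of samples is one simple way to keep heavily driven roads from dominating the result.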
The Association Between SOC and Land Prices Considering Spatial Heterogeneity Based on Finite Mixture Modeling
Kang, Woo Seok, Kim, Eunchan, Heo, Wookjae
An understanding of how Social Overhead Capital (SOC) is associated with the land value of the local community is important for effective urban planning. However, even within a district, there are multiple sections used for different purposes; this is referred to as spatial heterogeneity. Spatial heterogeneity has to be considered when attempting to understand land prices. If there is spatial heterogeneity within a district, land prices can be managed by adopting a spatial clustering method. In this study, spatial attributes including SOC, socio-demographic features, and spatial information in a specific district are analyzed with Finite Mixture Modeling (FMM) in order to find (a) the optimal number of clusters and (b) the association among SOCs, socio-demographic features, and land prices. FMM is a tool used to find clusters and the attributes' coefficients simultaneously. Using the FMM method, the results show that four clusters exist in one district and that the four clusters have different associations among SOCs, demographic features, and land prices. Policymakers and administrators need such information to make policy about land prices. The current study finds closeness to SOC to be a significant factor in land prices and suggests a potential policy direction related to SOC.
Spatial Analysis of Physical Reservoir Computers
Love, Jake, Mulkers, Jeroen, Msiska, Robin, Bourianoff, George, Leliaert, Jonathan, Everschor-Sitte, Karin
Physical reservoir computing is a computational framework that implements spatiotemporal information processing directly within physical systems. By exciting nonlinear dynamical systems and creating linear models from their state, we can create highly energy-efficient devices capable of solving machine learning tasks without building a modular system consisting of millions of neurons interconnected by synapses. To act as an effective reservoir, the chosen dynamical system must have two desirable properties: nonlinearity and memory. We present task-agnostic spatial measures to locally measure both of these properties and exemplify them for a specific physical reservoir based upon magnetic skyrmion textures. In contrast to typical reservoir computing metrics, these metrics can be resolved spatially and in parallel from a single input signal, allowing for efficient parameter search to design efficient and high-performance reservoirs. Additionally, we show the natural trade-off between memory capacity and nonlinearity in our reservoir's behaviour, both locally and globally. Finally, by balancing the memory and nonlinearity in a reservoir, we can improve its performance for specific tasks.
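To make the notion of memory in a driven nonlinear system concrete, the following toy sketch drives a single leaky `tanh` node with a random signal and measures how strongly its state still correlates with past inputs. Everything here is an assumption for illustration: the node model, the squared-correlation measure, and the names are not the paper's skyrmion reservoir or its metrics.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def run_node(inputs, leak=0.5):
    """Toy reservoir node: x(t) = (1 - leak) * x(t-1) + leak * tanh(u(t))."""
    state, states = 0.0, []
    for u in inputs:
        state = (1.0 - leak) * state + leak * math.tanh(u)
        states.append(state)
    return states


def memory_profile(inputs, states, max_delay):
    """Squared correlation between x(t) and u(t - k) for k = 0..max_delay."""
    profile = []
    for k in range(max_delay + 1):
        delayed = inputs[: len(inputs) - k]  # u(t - k)
        current = states[k:]                 # x(t), aligned with the above
        profile.append(pearson(current, delayed) ** 2)
    return profile


rng = random.Random(0)  # fixed seed so the run is repeatable
inputs = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
states = run_node(inputs)
profile = memory_profile(inputs, states, max_delay=5)
```

In this toy, a smaller `leak` keeps more of the previous state and so stretches the profile over longer delays, while a larger `leak` makes the node follow the current input more closely.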
Spatial Temporal Graph Convolution with Graph Structure Self-learning for Early MCI Detection
Zhao, Yunpeng, Zhou, Fugen, Guo, Bin, Liu, Bo
Graph neural networks (GNNs) have been successfully applied to early mild cognitive impairment (EMCI) detection, using elaborately designed features constructed from blood oxygen level-dependent (BOLD) time series. However, few works have explored the feasibility of using BOLD signals directly as features. Meanwhile, existing GNN-based methods primarily rely on a hand-crafted explicit brain topology as the adjacency matrix, which is not optimal and ignores the implicit topological organization of the brain. In this paper, we propose a spatial temporal graph convolutional network with a novel graph structure self-learning mechanism for EMCI detection. The proposed spatial temporal graph convolution block directly exploits BOLD time series as input features, which provides an interesting view for rsfMRI-based preclinical AD diagnosis. Moreover, our model can adaptively learn the optimal topological structure and refine edge weights with the graph structure self-learning mechanism. Results on the Alzheimer's Disease Neuroimaging Initiative (ADNI) database show that our method outperforms state-of-the-art approaches. Biomarkers consistent with previous studies can be extracted from the model, supporting the interpretability of our method.
So2Sat POP -- A Curated Benchmark Data Set for Population Estimation from Space on a Continental Scale
Doda, Sugandha, Wang, Yuanyuan, Kahl, Matthias, Hoffmann, Eike Jens, Ouan, Kim, Taubenböck, Hannes, Zhu, Xiao Xiang
Obtaining a dynamic population distribution is key to many decision-making processes such as urban planning, disaster management and, most importantly, helping the government to better allocate socio-technical supply. To achieve these objectives, good population data is essential. The traditional method of collecting population data through the census is expensive and tedious. In recent years, statistical and machine learning methods have been developed to estimate population distribution. Most of these methods use data sets that are either developed on a small scale or not yet publicly available. Thus, the development and evaluation of new methods become challenging. We fill this gap by providing a comprehensive data set for population estimation in 98 European cities. The data set comprises a digital elevation model, local climate zone, land use proportions, nighttime lights in combination with multi-spectral Sentinel-2 imagery, and data from the OpenStreetMap initiative. We anticipate that it will be a valuable addition to the research community for the development of sophisticated approaches in the field of population estimation.
The Revisiting Problem in Simultaneous Localization and Mapping: A Survey on Visual Loop Closure Detection
Tsintotas, Konstantinos A., Bampis, Loukas, Gasteratos, Antonios
Where am I? This is one of the most critical questions that any intelligent system should answer to decide whether it navigates to a previously visited area. This problem has long been acknowledged for its challenging nature in simultaneous localization and mapping (SLAM), wherein the robot needs to correctly associate the incoming sensory data to the database allowing consistent map generation. The significant advances in computer vision achieved over the last 20 years, the increased computational power, and the growing demand for long-term exploration contributed to efficiently performing such a complex task with inexpensive perception sensors. In this article, visual loop closure detection, which formulates a solution based solely on appearance input data, is surveyed. We start by briefly introducing place recognition and SLAM concepts in robotics. Then, we describe a loop closure detection system's structure, covering an extensive collection of topics, including the feature extraction, the environment representation, the decision-making step, and the evaluation process. We conclude by discussing open and new research challenges, particularly concerning the robustness in dynamic environments, the computational complexity, and scalability in long-term operations. The article aims to serve as a tutorial and a position paper for newcomers to visual loop closure detection.
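The decision-making step named in the survey outline above can be sketched as a nearest-neighbour search over stored place descriptors. This is a generic illustration, not a method taken from the survey: the cosine-similarity score, the `threshold` value, the `skip_recent` window, and all function names are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def detect_loop(query, database, threshold=0.9, skip_recent=2):
    """Return the index of the best-matching earlier place, or None.

    database: descriptors of previously visited places, oldest first.
    skip_recent: the newest entries are ignored, since consecutive frames
                 look alike without being a revisit (assumed heuristic).
    """
    candidates = database[: max(0, len(database) - skip_recent)]
    best_index, best_score = None, threshold
    for index, descriptor in enumerate(candidates):
        score = cosine_similarity(query, descriptor)
        if score >= best_score:
            best_index, best_score = index, score
    return best_index


# Made-up 3-D descriptors; real systems use far larger vectors.
database = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.7, 0.7, 0.0],
    [0.9, 0.1, 0.0],
]
revisit = detect_loop([0.95, 0.05, 0.0], database)  # close to place 0
no_match = detect_loop([0.5, 0.5, 0.7], database)   # unlike any old place
```

A single threshold trades false loop closures against missed ones, so practical systems usually add a verification stage after this kind of candidate selection.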
Spatial-Temporal Synchronous Graph Transformer network (STSGT) for COVID-19 forecasting
Banerjee, Soumyanil, Dong, Ming, Shi, Weisong
COVID-19 has become a matter of serious concern over the last few years. It has adversely affected numerous people around the globe and has led to the loss of billions of dollars of business capital. In this paper, we propose a novel Spatial-Temporal Synchronous Graph Transformer network (STSGT) to capture the complex spatial and temporal dependency of the COVID-19 time series data and forecast the future status of an evolving pandemic. The layers of STSGT combine the graph convolution network (GCN) with the self-attention mechanism of transformers on a synchronous spatial-temporal graph to capture the dynamically changing pattern of the COVID time series. The spatial-temporal synchronous graph simultaneously captures the spatial and temporal dependencies between the vertices of the graph at a given and subsequent time-steps, which helps capture the heterogeneity in the time series and improve the forecasting accuracy. Our extensive experiments on two publicly available real-world COVID-19 time series datasets demonstrate that STSGT significantly outperforms state-of-the-art algorithms that were designed for spatial-temporal forecasting tasks. Specifically, on average over a 12-day horizon, we observe a potential improvement of 12.19% and 3.42% in Mean Absolute Error (MAE) over the next best algorithm when forecasting the daily infected and death cases, respectively, for the 50 US states and Washington, D.C. Additionally, STSGT also outperformed the other algorithms when forecasting the daily infected cases at the state level, e.g., for all the counties in the State of Michigan. The code and models are publicly available at https://github.com/soumbane/STSGT.
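For readers unfamiliar with how an MAE improvement figure like those above is typically computed, here is a minimal sketch. It assumes the improvement is the relative reduction in MAE against a baseline, which the abstract does not spell out. The numbers are made up and are not results from the paper.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between forecasts and observations."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_improvement(baseline_mae, model_mae):
    """Percentage reduction in MAE relative to the baseline (assumed definition)."""
    return 100.0 * (baseline_mae - model_mae) / baseline_mae


# Made-up daily case counts and forecasts, for illustration only.
actual = [100.0, 120.0, 130.0, 150.0]
baseline_forecast = [110.0, 110.0, 140.0, 140.0]
model_forecast = [105.0, 115.0, 135.0, 145.0]

baseline_mae = mean_absolute_error(actual, baseline_forecast)
model_mae = mean_absolute_error(actual, model_forecast)
improvement = relative_improvement(baseline_mae, model_mae)
```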
GEO-BLEU: Similarity Measure for Geospatial Sequences
Shimizu, Toru, Tsubouchi, Kota, Yabe, Takahiro
In recent geospatial research, the importance of modeling large-scale human mobility data and predicting trajectories is rising, in parallel with progress in text generation using large-scale corpora in natural language processing. Whereas there are already plenty of feasible approaches applicable to geospatial sequence modeling itself, there seems to be room to improve with regard to evaluation, specifically about measuring the similarity between generated and reference trajectories. In this work, we propose a novel similarity measure, GEO-BLEU, which can be especially useful in the context of geospatial sequence modeling and generation. As the name suggests, this work is based on BLEU, one of the most popular measures used in machine translation research, while introducing spatial proximity to the idea of n-gram. We compare this measure with an established baseline, dynamic time warping, applying it to actual generated geospatial sequences. Using crowdsourced annotated data on the similarity between geospatial sequences collected from over 12,000 cases, we quantitatively and qualitatively show the proposed method's superiority.
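The idea of bringing spatial proximity into n-gram matching can be sketched as follows. This is a simplified illustration rather than the GEO-BLEU formula: the exponential distance kernel, the `beta` parameter, the best-match averaging, and the function names are all assumptions. It also omits parts of BLEU such as the brevity penalty and the combination over several n-gram lengths.

```python
import math


def ngrams(sequence, n):
    """All contiguous length-n windows of a sequence of (x, y) points."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


def ngram_similarity(a, b, beta):
    """Soft match in (0, 1]: 1.0 for identical n-grams, decaying with distance."""
    return math.prod(math.exp(-beta * math.dist(p, q)) for p, q in zip(a, b))


def proximity_precision(generated, reference, n=2, beta=0.5):
    """Average, over generated n-grams, of the best soft match in the reference."""
    gen, ref = ngrams(generated, n), ngrams(reference, n)
    if not gen or not ref:
        return 0.0
    best = [max(ngram_similarity(g, r, beta) for r in ref) for g in gen]
    return sum(best) / len(best)


# Made-up trajectories on a unit grid.
reference = [(0, 0), (1, 0), (2, 0), (3, 0)]
shifted = [(0, 1), (1, 1), (2, 1), (3, 1)]       # one cell off everywhere
distant = [(50, 50), (51, 50), (52, 50), (53, 50)]

same_score = proximity_precision(reference, reference)
near_score = proximity_precision(shifted, reference)
far_score = proximity_precision(distant, reference)
```

With exact n-gram matching, the shifted trajectory would score zero. The soft match gives it partial credit that shrinks as `beta` grows.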