spatial distance
End-to-end Autonomous Vehicle Following System using Monocular Fisheye Camera
Zhang, Jiale, Qian, Yeqiang, Qin, Tong, Jiang, Mingyang, Chen, Siyuan, Yang, Ming
The increase in vehicle ownership has led to increased traffic congestion, more accidents, and higher carbon emissions. Vehicle platooning is a promising solution to address these issues by improving road capacity and reducing fuel consumption. However, existing platooning systems face challenges such as reliance on lane markings and expensive high-precision sensors, which limits their general applicability. To address these issues, we propose a vehicle following framework that expands its capability from restricted scenarios to general scenario applications using only a camera. This is achieved through our newly proposed end-to-end method, which improves overall driving performance. The method incorporates a semantic mask to address causal confusion in multi-frame data fusion. Additionally, we introduce a dynamic sampling mechanism to precisely track the trajectories of preceding vehicles. Extensive closed-loop validation in real-world vehicle experiments demonstrates the system's ability to follow vehicles in various scenarios, outperforming traditional multi-stage algorithms. This makes it a promising solution for cost-effective autonomous vehicle platooning. A complete real-world vehicle experiment is available at https://youtu.be/zL1bcVb9kqQ.
- Asia > China > Shanghai > Shanghai (0.06)
- North America > United States > California (0.04)
- Asia > China > Shaanxi Province > Xi'an (0.04)
- Transportation > Ground > Road (1.00)
- Energy (1.00)
- Transportation > Passenger (0.93)
Enhancing Contrastive Learning for Geolocalization by Discovering Hard Negatives on Semivariograms
Chen, Boyi, Wang, Zhangyu, Deuser, Fabian, Zollner, Johann Maximilian, Werner, Martin
Accurate and robust image-based geo-localization at a global scale is challenging due to diverse environments, visually ambiguous scenes, and the lack of distinctive landmarks in many regions. While contrastive learning methods show promising performance by aligning features between street-view images and corresponding locations, they neglect the underlying spatial dependency in the geographic space. As a result, they fail to address the issue of false negatives (image pairs that are both visually and geographically similar but labeled as negatives), and they struggle to effectively distinguish hard negatives, which are visually similar but geographically distant. To address this issue, we propose a novel spatially regularized contrastive learning strategy that integrates a semivariogram, which is a geostatistical tool for modeling how spatial correlation changes with distance. We fit the semivariogram by relating the distance of images in feature space to their geographical distance, capturing the expected visual content in a spatial correlation. With the fitted semivariogram, we define the expected visual dissimilarity at a given spatial distance as a reference to identify hard negatives and false negatives. We integrate this strategy into GeoCLIP and evaluate it on the OSV5M dataset, demonstrating that explicitly modeling spatial priors improves image-based geo-localization performance, particularly at finer granularity.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.16)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- North America > United States > Maine (0.05)
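The semivariogram mentioned in the abstract above can be illustrated with the classical empirical estimator, which averages half the squared difference between observations over all pairs whose separation falls in a distance bin. The sketch below is not the paper's implementation. It assumes planar coordinates with Euclidean distance (real geo-coordinates would need a great-circle distance) and treats image embeddings as the observed values; the function name `empirical_semivariogram` and its parameters are hypothetical.

```python
import numpy as np

def empirical_semivariogram(locations, features, bin_edges):
    """Binned empirical semivariogram of feature vectors over spatial distance.

    Assumption: `locations` are planar coordinates and distance is Euclidean.
    Returns (gamma, counts): gamma[b] is half the mean squared feature
    difference over pairs whose spatial distance lies in bin b (NaN if empty).
    """
    locations = np.asarray(locations, dtype=float)
    features = np.asarray(features, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)

    # All unordered pairs (i < j).
    i, j = np.triu_indices(len(locations), k=1)
    geo_dist = np.linalg.norm(locations[i] - locations[j], axis=1)
    sq_feat_diff = ((features[i] - features[j]) ** 2).sum(axis=1)

    # Bin index per pair; pairs outside the edges get -1 or n_bins and are ignored.
    which = np.digitize(geo_dist, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        mask = which == b
        counts[b] = mask.sum()
        if counts[b] > 0:
            gamma[b] = 0.5 * sq_feat_diff[mask].mean()
    return gamma, counts
```

Under these assumptions, the binned values give an expected feature dissimilarity at each spatial distance; a pair whose observed dissimilarity is far below the value for its distance bin would be a candidate hard negative in the sense the abstract describes.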
Explaining Vision GNNs: A Semantic and Visual Analysis of Graph-based Image Classification
Chaidos, Nikolaos, Dimitriou, Angeliki, Spanos, Nikolaos, Voulodimos, Athanasios, Stamou, Giorgos
Graph Neural Networks (GNNs) have emerged as an efficient alternative to convolutional approaches for vision tasks such as image classification, leveraging patch-based representations instead of raw pixels. These methods construct graphs where image patches serve as nodes, and edges are established based on patch similarity or classification relevance. Despite their efficiency, the explainability of GNN-based vision models remains underexplored, even though graphs are naturally interpretable. In this work, we analyze the semantic consistency of the graphs formed at different layers of GNN-based image classifiers, focusing on how well they preserve object structures and meaningful relationships. A comprehensive analysis is presented by quantifying the extent to which inter-layer graph connections reflect semantic similarity and spatial coherence. Explanations from standard and adversarial settings are also compared to assess whether they reflect the classifiers' robustness. Additionally, we visualize the flow of information across layers through heatmap-based visualization techniques, thereby highlighting the models' explainability. Our findings demonstrate that the decision-making processes of these models can be effectively explained, while also revealing that their reasoning does not necessarily align with human perception, especially in deeper layers.
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)
Spatial distance dependent Chinese restaurant processes for image segmentation
The distance dependent Chinese restaurant process (ddCRP) was recently introduced to accommodate random partitions of non-exchangeable data [1]. The ddCRP clusters data in a biased way: each data point is more likely to be clustered with other data that are near it in an external sense. This paper examines the ddCRP in a spatial setting with the goal of natural image segmentation. We explore the biases of the spatial ddCRP model and propose a novel hierarchical extension better suited for producing "human-like" segmentations. We then study the sensitivity of the models to various distance and appearance hyperparameters, and provide the first rigorous comparison of nonparametric Bayesian models in the image segmentation domain. On unsupervised image segmentation, we demonstrate that similar performance to existing nonparametric Bayesian models is possible with substantially simpler models and algorithms.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
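The clustering bias described in the abstract above can be made concrete with a draw from the ddCRP prior: each point links to another point with probability proportional to a decay function of their distance, or to itself with probability proportional to a concentration parameter, and clusters are the connected components of the resulting links. The sketch below is a minimal illustration of that prior only, not the paper's segmentation model or its hierarchical extension. The exponential decay, the parameter names `alpha` and `scale`, and the function name `sample_ddcrp_partition` are assumptions made for the example.

```python
import numpy as np

def sample_ddcrp_partition(coords, alpha=1.0, scale=1.0, rng=None):
    """Draw one partition from a ddCRP prior over points with spatial coordinates.

    Assumptions: Euclidean distance and an exponential decay f(d) = exp(-d / scale).
    Returns (links, labels): links[i] is the point that i links to, and labels[i]
    is the cluster index of i (connected components of the link graph).
    """
    rng = np.random.default_rng() if rng is None else rng
    coords = np.asarray(coords, dtype=float)
    n = len(coords)

    # Pairwise distances and link weights; the diagonal holds the self-link weight.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    weights = np.exp(-dist / scale)
    np.fill_diagonal(weights, alpha)
    probs = weights / weights.sum(axis=1, keepdims=True)

    # Each point independently chooses one link.
    links = np.array([rng.choice(n, p=probs[i]) for i in range(n)])

    # Clusters are connected components of the links (union-find).
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in enumerate(links):
        root_i, root_j = find(i), find(int(j))
        if root_i != root_j:
            parent[root_i] = root_j

    roots = [find(i) for i in range(n)]
    _, labels = np.unique(roots, return_inverse=True)
    return links, labels
```

With a small `scale`, distant points almost never link, so the sampled clusters tend to be spatially compact; a larger `alpha` produces more self-links and therefore more clusters.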
LTS-NET: End-to-end Unsupervised Learning of Long-Term 3D Stable objects
Hroob, Ibrahim, Molina, Sergi, Polvara, Riccardo, Cielniak, Grzegorz, Hanheide, Marc
In this research, we present an end-to-end data-driven pipeline for determining the long-term stability status of objects within a given environment, specifically distinguishing between static and dynamic objects. Understanding object stability is key for mobile robots since long-term stable objects can be exploited as landmarks for long-term localisation. Our pipeline includes a labelling method that utilizes historical data from the environment to generate training data for a neural network. Rather than utilizing discrete labels, we propose the use of point-wise continuous label values, indicating the spatio-temporal stability of individual points, to train a point cloud regression network named LTS-NET. Our approach is evaluated on point cloud data from two parking lots in the NCLT dataset, and the results show that our proposed solution outperforms direct training of a classification model for static vs dynamic object classification.
- North America > United States > Michigan (0.04)
- Europe > United Kingdom > England > Lincolnshire > Lincoln (0.04)
Generative Models and Learning Algorithms for Core-Periphery Structured Graphs
Gurugubelli, Sravanthi, Chepuri, Sundeep Prabhakar
We consider core-periphery structured graphs, which are graphs that contain a group of densely connected nodes and a group of sparsely connected nodes, referred to as core and periphery nodes, respectively. The so-called core score of a node is related to the likelihood of it being a core node. In this paper, we focus on learning the core scores of a graph from its node attributes and connectivity structure. To this end, we propose two classes of probabilistic graphical models: affine and nonlinear. First, we describe affine generative models to model the dependence of node attributes on their core scores, which determine the graph structure. Next, we discuss nonlinear generative models in which the partial correlations of node attributes influence the graph structure through latent core scores. We develop algorithms for inferring the model parameters and core scores of a graph when both the graph structure and node attributes are available. When only the node attributes of graphs are available, we jointly learn a core-periphery structured graph and its core scores. We provide results from numerical experiments on several synthetic and real-world datasets to demonstrate the efficacy of the developed models and algorithms.
- Europe > United Kingdom > England > Greater London > London (0.05)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > Texas > Tarrant County > Fort Worth (0.04)
STAN: Spatio-Temporal Attention Network for Next Location Recommendation
Luo, Yingtao, Liu, Qiang, Liu, Zhaocheng
Next location recommendation is at the core of various location-based applications. Current state-of-the-art models have attempted to solve spatial sparsity with hierarchical gridding and to model temporal relations with explicit time intervals, while some vital questions remain unsolved. Non-adjacent locations and non-consecutive visits provide non-trivial correlations for understanding a user's behavior but were rarely considered. To aggregate all relevant visits from a user trajectory and recall the most plausible candidates from weighted representations, here we propose a Spatio-Temporal Attention Network (STAN) for location recommendation. STAN explicitly exploits relative spatiotemporal information of all the check-ins with self-attention layers along the trajectory. This improvement allows a point-to-point interaction between non-adjacent locations and non-consecutive check-ins with explicit spatiotemporal effect. STAN uses a bi-layer attention architecture that first aggregates spatiotemporal correlation within a user trajectory and then recalls the target with consideration of personalized item frequency (PIF). By visualization, we show that STAN is in line with the above intuition. Experimental results unequivocally show that our model outperforms the existing state-of-the-art methods by 9-17%.
- Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > China > Beijing > Beijing (0.04)
Spatial distance dependent Chinese restaurant processes for image segmentation
Ghosh, Soumya, Ungureanu, Andrei B., Sudderth, Erik B., Blei, David M.
The distance dependent Chinese restaurant process (ddCRP) was recently introduced to accommodate random partitions of non-exchangeable data. The ddCRP clusters data in a biased way: each data point is more likely to be clustered with other data that are near it in an external sense. This paper examines the ddCRP in a spatial setting with the goal of natural image segmentation. We explore the biases of the spatial ddCRP model and propose a novel hierarchical extension better suited for producing "human-like" segmentations. We then study the sensitivity of the models to various distance and appearance hyperparameters, and provide the first rigorous comparison of nonparametric Bayesian models in the image segmentation domain.
The Length of Bridge Ties: Structural and Geographic Properties of Online Social Interactions
Volkovich, Yana (Barcelona Media Foundation) | Scellato, Salvatore (University of Cambridge) | Laniado, David (Barcelona Media Foundation) | Mascolo, Cecilia (University of Cambridge) | Kaltenbrunner, Andreas (Barcelona Media Foundation)
The popularity of the Web has allowed individuals to communicate and interact with each other on a global scale: people connect both to close friends and acquaintances, creating ties that can bridge otherwise separated groups of people. Recent evidence suggests that spatial distance is still affecting social links established on online platforms, with online ties preferentially connecting closer people. In this work we study the relationships between interaction strength, spatial distance and structural position of ties between members of a large-scale online social networking platform, Tuenti. We discover that ties in highly connected social groups tend to span shorter distances than connections bridging together otherwise separated portions of the network. We also find that such bridging connections have lower social interaction levels than ties within the inner core of the network and ties connecting to its periphery. Our results suggest that spatial constraints on online social networks are intimately connected to structural network properties, with important consequences for information diffusion.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)