Goto

Collaborating Authors

 adjacent node



Learning Superpoint Graph Cut for 3D Instance Segmentation

Neural Information Processing Systems

In this paper, we propose a learning-based superpoint graph cut method that explicitly learns the local geometric structures of the point cloud for 3D instance segmentation. Specifically, we first oversegment the raw point clouds into superpoints and construct the superpoint graph. Then, we propose an edge score prediction network to predict the edge scores of the superpoint graph, where the similarity vectors of two adjacent nodes learned through cross-graph attention in the coordinate and feature spaces are used for regressing edge scores. By forcing two adjacent nodes of the same instance to be close to the instance center in the coordinate and feature spaces, we formulate a geometry-aware edge loss to train the edge score prediction network. Finally, we develop a superpoint graph cut network that employs the learned edge scores and the predicted semantic classes of nodes to generate instances, where bilateral graph attention is proposed to extract discriminative features on both the coordinate and feature spaces for predicting semantic labels and scores of instances. Extensive experiments on two challenging datasets, ScanNet v2 and S3DIS, show that our method achieves new state-of-the-art performance on 3D instance segmentation.


Design for One, Deploy for Many: Navigating Tree Mazes with Multiple Agents

arXiv.org Artificial Intelligence

Maze-like environments, such as cave and pipe networks, pose unique challenges for multiple robots to coordinate, including communication constraints and congestion. To address these challenges, we propose a distributed multi-agent maze traversal algorithm for environments that can be represented by acyclic graphs. It uses a leader-switching mechanism where one agent, assuming a head role, employs any single-agent maze solver while the other agents each choose an agent to follow. The head role gets transferred to neighboring agents where necessary, ensuring it follows the same path as a single agent would. The multi-agent maze traversal algorithm is evaluated in simulations with groups of up to 300 agents, various maze sizes, and multiple single-agent maze solvers. It is compared against strategies that are naïve, or assume either global communication or full knowledge of the environment. The algorithm outperforms the naïve strategy in terms of makespan and sum-of-fuel. It is superior to the global-communication strategy in terms of makespan but is inferior to it in terms of sum-of-fuel. The findings suggest it is asymptotically equivalent to the full-knowledge strategy with respect to either metric. Moreover, real-world experiments with up to 20 Pi-puck robots confirm the feasibility of the approach.


Learning Superpoint Graph Cut for 3D Instance Segmentation

Neural Information Processing Systems

In this paper, we propose a learning-based superpoint graph cut method that explicitly learns the local geometric structures of the point cloud for 3D instance segmentation. Specifically, we first oversegment the raw point clouds into superpoints and construct the superpoint graph. Then, we propose an edge score prediction network to predict the edge scores of the superpoint graph, where the similarity vectors of two adjacent nodes learned through cross-graph attention in the coordinate and feature spaces are used for regressing edge scores. By forcing two adjacent nodes of the same instance to be close to the instance center in the coordinate and feature spaces, we formulate a geometry-aware edge loss to train the edge score prediction network. Finally, we develop a superpoint graph cut network that employs the learned edge scores and the predicted semantic classes of nodes to generate instances, where bilateral graph attention is proposed to extract discriminative features on both the coordinate and feature spaces for predicting semantic labels and scores of instances. Extensive experiments on two challenging datasets, ScanNet v2 and S3DIS, show that our method achieves new state-of-the-art performance on 3D instance segmentation.


GRID: Protecting Training Graph from Link Stealing Attacks on GNN Models

arXiv.org Artificial Intelligence

Graph neural networks (GNNs) have exhibited superior performance in various classification tasks on graph-structured data. However, they encounter the potential vulnerability from the link stealing attacks, which can infer the presence of a link between two nodes via measuring the similarity of its incident nodes' prediction vectors produced by a GNN model. Such attacks pose severe security and privacy threats to the training graph used in GNN models. In this work, we propose a novel solution, called Graph Link Disguise (GRID), to defend against link stealing attacks with the formal guarantee of GNN model utility for retaining prediction accuracy. The key idea of GRID is to add carefully crafted noises to the nodes' prediction vectors for disguising adjacent nodes as n-hop indirect neighboring nodes. We take into account the graph topology and select only a subset of nodes (called core nodes) covering all links for adding noises, which can avert the noises offset and have the further advantages of reducing both the distortion loss and the computation cost. Our crafted noises can ensure 1) the noisy prediction vectors of any two adjacent nodes have their similarity level like that of two non-adjacent nodes and 2) the model prediction is unchanged to ensure zero utility loss. Extensive experiments on five datasets are conducted to show the effectiveness of our proposed GRID solution against different representative link-stealing attacks under transductive settings and inductive settings respectively, as well as two influence-based attacks. Meanwhile, it achieves a much better privacy-utility trade-off than existing methods when extended to GNNs.


Tighnari: Multi-modal Plant Species Prediction Based on Hierarchical Cross-Attention Using Graph-Based and Vision Backbone-Extracted Features

arXiv.org Artificial Intelligence

Predicting plant species composition in specific spatiotemporal contexts plays an important role in biodiversity management and conservation, as well as in improving species identification tools. Our work utilizes 88,987 plant survey records conducted in specific spatiotemporal contexts across Europe. We also use the corresponding satellite images, time series data, climate time series, and other rasterized environmental data such as land cover, human footprint, bioclimatic, and soil variables as training data to train the model to predict the outcomes of 4,716 plant surveys. We propose a feature construction and result correction method based on the graph structure. Through comparative experiments, we select the best-performing backbone networks for feature extraction in both temporal and image modalities. In this process, we built a backbone network based on the Swin-Transformer Block for extracting temporal Cubes features. We then design a hierarchical cross-attention mechanism capable of robustly fusing features from multiple modalities. During training, we adopt a 10-fold cross-fusion method based on fine-tuning and use a Threshold Top-K method for post-processing. Ablation experiments demonstrate the improvements in model performance brought by our proposed solution pipeline.


Automatic parking planning control method based on improved A* algorithm

arXiv.org Artificial Intelligence

As the trend of moving away from high-precision maps gradually emerges in the autonomous driving industry,traditional planning algorithms are gradually exposing some problems. To address the high real-time, high precision, and high trajectory quality requirements posed by the automatic parking task under real-time perceived local maps,this paper proposes an improved automatic parking planning algorithm based on the A* algorithm, and uses Model Predictive Control (MPC) as the control module for automatic parking.The algorithm enhances the planning real-time performance by optimizing heuristic functions, binary heap optimization, and bidirectional search; it calculates the passability of narrow areas by dynamically loading obstacles and introduces the vehicle's own volume during planning; it improves trajectory quality by using neighborhood expansion and Bezier curve optimization methods to meet the high trajectory quality requirements of the parking task. After obtaining the output results of the planning algorithm, a loss function is designed according to the characteristics of the automatic parking task under local maps, and the MPC algorithm is used to output control commands to drive the car along the planned trajectory. This paper uses the perception results of real driving environments converted into maps as planning inputs to conduct simulation tests and ablation experiments on the algorithm. Experimental results show that the improved algorithm proposed in this paper can effectively meet the special requirements of automatic parking under local maps and complete the automatic parking planning and control tasks.


Modeling Clutter Perception using Parametric Proto-object Partitioning

Neural Information Processing Systems

Visual clutter, the perception of an image as being crowded and disordered, affects aspects of our lives ranging from object detection to aesthetics, yet relatively little effort has been made to model this important and ubiquitous percept. Our approach models clutter as the number of proto-objects segmented from an image, with proto-objects defined as groupings of superpixels that are similar in intensity, color, and gradient orientation features. We introduce a novel parametric method of clustering superpixels by modeling mixture of Weibulls on Earth Mover's Distance statistics, then taking the normalized number of proto-objects following partitioning as our estimate of clutter perception. We validated this model using a new 90-image dataset of real world scenes rank ordered by human raters for clutter, and showed that our method not only predicted clutter extremely well (Spearman's ρ = 0.8038, p < 0.001), but also outperformed all existing clutter perception models and even a behavioral object segmentation ground truth. We conclude that the number of proto-objects in an image affects clutter perception more than the number of objects or features.


Leveraging heterogeneous spillover effects in maximizing contextual bandit rewards

arXiv.org Artificial Intelligence

Recommender systems relying on contextual multi-armed bandits continuously improve relevant item recommendations by taking into account the contextual information. The objective of these bandit algorithms is to learn the best arm (i.e., best item to recommend) for each user and thus maximize the cumulative rewards from user engagement with the recommendations. However, current approaches ignore potential spillover between interacting users, where the action of one user can impact the actions and rewards of other users. Moreover, spillover may vary for different people based on their preferences and the closeness of ties to other users. This leads to heterogeneity in the spillover effects, i.e., the extent to which the action of one user can impact the action of another. Here, we propose a framework that allows contextual multi-armed bandits to account for such heterogeneous spillovers when choosing the best arm for each user. By experimenting on several real-world datasets using prominent linear and non-linear contextual bandit algorithms, we observe that our proposed method leads to significantly higher rewards than existing solutions that ignore spillover.


Random Majority Opinion Diffusion: Stabilization Time, Absorbing States, and Influential Nodes

arXiv.org Artificial Intelligence

Consider a graph G with n nodes and m edges, which represents a social network, and assume that initially each node is blue or white. In each round, all nodes simultaneously update their color to the most frequent color in their neighborhood. This is called the Majority Model (MM) if a node keeps its color in case of a tie and the Random Majority Model (RMM) if it chooses blue with probability 1/2 and white otherwise. We prove that there are graphs for which RMM needs exponentially many rounds to reach a stable configuration in expectation, and such a configuration can have exponentially many states (i.e., colorings). This is in contrast to MM, which is known to always reach a stable configuration with one or two states in $O(m)$ rounds. For the special case of a cycle graph C_n, we prove the stronger and tight bounds of $\lceil n/2\rceil-1$ and $O(n^2)$ in MM and RMM, respectively. Furthermore, we show that the number of stable colorings in MM on C_n is equal to $\Theta(\Phi^n)$, where $\Phi = (1+\sqrt{5})/2$ is the golden ratio, while it is equal to 2 for RMM. We also study the minimum size of a winning set, which is a set of nodes whose agreement on a color in the initial coloring enforces the process to end in a coloring where all nodes share that color. We present tight bounds on the minimum size of a winning set for both MM and RMM. Furthermore, we analyze our models for a random initial coloring, where each node is colored blue independently with some probability $p$ and white otherwise. Using some martingale analysis and counting arguments, we prove that the expected final number of blue nodes is respectively equal to $(2p^2-p^3)n/(1-p+p^2)$ and pn in MM and RMM on a cycle graph C_n. Finally, we conduct some experiments which complement our theoretical findings and also lead to the proposal of some intriguing open problems and conjectures to be tackled in future work.