Collaborating Authors

 Liu, Xinran


Accelerated Quasi-Static FEM for Real-Time Modeling of Continuum Robots with Multiple Contacts and Large Deformation

arXiv.org Artificial Intelligence

Continuum robots offer high flexibility and multiple degrees of freedom, making them ideal for navigating narrow lumens. However, accurately modeling their behavior under large deformations and frequent environmental contacts remains challenging. Current methods for computing the deformation of these robots, such as Model Order Reduction and Gauss-Seidel (GS) methods, suffer from significant drawbacks: their computational speed degrades as the number of contact points increases, and they struggle to balance speed with model accuracy. To overcome these limitations, we introduce a novel finite element method (FEM) named Acc-FEM. Acc-FEM employs a large-deformation quasi-static finite element model and integrates an accelerated solver scheme to handle multi-contact simulations efficiently. Additionally, it exploits parallel computing on graphics processing units (GPUs) for real-time updates of the finite element models and for collision detection. Extensive numerical experiments demonstrate that Acc-FEM significantly improves computational efficiency in modeling continuum robots with multiple contacts while achieving satisfactory accuracy, addressing the deficiencies of existing methods.
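
For intuition about the quasi-static-with-contact setup, the sketch below solves a toy equilibrium problem (a 1D chain of linear springs pressed against a rigid wall) with Newton iterations and a penalty contact force. It is only a minimal illustration under these toy assumptions; Acc-FEM itself uses a large-deformation 3D finite element model, an accelerated multi-contact solver, and GPU-parallel collision detection, none of which are reproduced here.

```python
# Toy quasi-static equilibrium with penalty contact (not Acc-FEM itself).
import numpy as np

n, k_spring, k_contact, wall = 20, 100.0, 1e4, 1.0
rest = np.linspace(0.0, 0.9, n)              # rest positions of the nodes
f_ext = np.zeros(n); f_ext[-1] = 50.0        # push the tip toward the wall

q = rest.copy()
for it in range(50):
    # internal elastic forces and stiffness of the spring chain
    f_int = np.zeros(n); K = np.zeros((n, n))
    for i in range(n - 1):
        d = (q[i + 1] - q[i]) - (rest[i + 1] - rest[i])
        f_int[i] += k_spring * d; f_int[i + 1] -= k_spring * d
        K[i, i] += k_spring; K[i + 1, i + 1] += k_spring
        K[i, i + 1] -= k_spring; K[i + 1, i] -= k_spring
    # penalty contact force where nodes penetrate the wall
    pen = np.maximum(q - wall, 0.0)
    f_contact = -k_contact * pen
    K += np.diag(k_contact * (pen > 0))
    residual = f_ext + f_int + f_contact
    residual[0] = 0.0; K[0, :] = 0.0; K[:, 0] = 0.0; K[0, 0] = 1.0  # clamp node 0
    if np.linalg.norm(residual) < 1e-8:
        break
    q += np.linalg.solve(K, residual)         # Newton step toward equilibrium
print("tip position:", q[-1])
```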


Fused Partial Gromov-Wasserstein for Structured Objects

arXiv.org Artificial Intelligence

Structured data, such as graphs, are vital in machine learning due to their capacity to capture complex relationships and interactions. In recent years, the Fused Gromov-Wasserstein (FGW) distance has attracted growing interest because it enables the comparison of structured data by jointly accounting for feature similarity and geometric structure. However, as a variant of optimal transport (OT), classical FGW assumes an equal mass constraint on the compared data. In this work, we relax this mass constraint and propose the Fused Partial Gromov-Wasserstein (FPGW) framework, which extends FGW to accommodate unbalanced data. Theoretically, we establish the relationship between FPGW and FGW and prove the metric properties of FPGW. Numerically, we introduce Frank-Wolfe solvers for the proposed FPGW framework and provide a convergence analysis. Finally, we evaluate the FPGW distance through graph classification and clustering experiments, demonstrating its robust performance, especially when data is corrupted by outlier noise.
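
As a rough illustration of the Frank-Wolfe approach mentioned above, the sketch below runs a plain Frank-Wolfe loop on the balanced FGW objective (square loss) for two graphs with uniform weights and equal numbers of nodes, using a linear assignment as the linear-minimization oracle and the classic 2/(k+2) step size. The FPGW solvers in the paper differ in that the oracle becomes a partial-OT problem and the mass constraint is relaxed; those details, and the toy graphs used here, are only illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def fgw_frank_wolfe(M, C1, C2, alpha=0.5, iters=100):
    """M: feature cost matrix (n x n); C1, C2: symmetric intra-graph distance matrices."""
    n = M.shape[0]
    T = np.full((n, n), 1.0 / n**2)                   # product coupling as a start
    for k in range(iters):
        r, c = T.sum(axis=1), T.sum(axis=0)
        grad = (1 - alpha) * M + 2 * alpha * (
            (C1**2 @ r)[:, None] + (C2**2 @ c)[None, :] - 2 * C1 @ T @ C2.T
        )
        rows, cols = linear_sum_assignment(grad)       # LMO over the coupling polytope
        S = np.zeros_like(T); S[rows, cols] = 1.0 / n
        gamma = 2.0 / (k + 2.0)                        # classic Frank-Wolfe step size
        T = (1 - gamma) * T + gamma * S
    return T

# toy usage: two random graphs with random node features
rng = np.random.default_rng(0)
X, Y = rng.normal(size=(10, 3)), rng.normal(size=(10, 3))
C1 = np.linalg.norm(X[:, None] - X[None], axis=-1)
C2 = np.linalg.norm(Y[:, None] - Y[None], axis=-1)
M = np.linalg.norm(X[:, None] - Y[None], axis=-1) ** 2
T = fgw_frank_wolfe(M, C1, C2)
```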


ESPFormer: Doubly-Stochastic Attention with Expected Sliced Transport Plans

arXiv.org Artificial Intelligence

While self-attention has been instrumental in the success of Transformers, it can lead to over-concentration on a few tokens during training, resulting in suboptimal information flow. Enforcing doubly-stochastic constraints in attention matrices has been shown to improve structure and balance in attention distributions. However, existing methods rely on iterative Sinkhorn normalization, which is computationally costly. In this paper, we introduce a novel, fully parallelizable doubly-stochastic attention mechanism based on sliced optimal transport, leveraging Expected Sliced Transport Plans (ESP). Unlike prior approaches, our method enforces double stochasticity without iterative Sinkhorn normalization, significantly enhancing efficiency. To ensure differentiability, we incorporate a temperature-based soft sorting technique, enabling seamless integration into deep learning models. Experiments across multiple benchmark datasets, including image classification, point cloud classification, sentiment analysis, and neural machine translation, demonstrate that our enhanced attention regularization consistently improves performance across diverse applications.
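
For context, the snippet below shows the iterative Sinkhorn normalization that this paper sets out to avoid: raw attention logits are alternately row- and column-normalized until the matrix is approximately doubly stochastic. It is a generic numpy illustration of that baseline, not ESPFormer's ESP-based mechanism or its temperature-based soft-sorting component.

```python
import numpy as np

def sinkhorn_attention(scores, n_iters=20, eps=1.0):
    """scores: (n, n) raw attention logits; returns an (approximately) doubly-stochastic matrix."""
    K = np.exp(scores / eps)                 # positive kernel from the logits
    for _ in range(n_iters):
        K /= K.sum(axis=1, keepdims=True)    # normalize rows
        K /= K.sum(axis=0, keepdims=True)    # normalize columns
    return K

rng = np.random.default_rng(0)
A = sinkhorn_attention(rng.normal(size=(5, 5)))
print(A.sum(axis=0), A.sum(axis=1))          # both close to all-ones
```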


HMCGeo: IP Region Prediction Based on Hierarchical Multi-label Classification

arXiv.org Artificial Intelligence

Fine-grained IP geolocation plays a critical role in applications such as location-based services and cybersecurity. Most existing fine-grained IP geolocation methods are regression-based; however, due to noise in the input data, these methods typically incur kilometer-level prediction errors and provide incorrect region information for users. To address this issue, this paper proposes a novel hierarchical multi-label classification framework for IP region prediction, named HMCGeo. This framework treats IP geolocation as a hierarchical multi-label classification problem and employs residual connection-based feature extraction and attention prediction units to predict the target host region across multiple geographical granularities. Furthermore, we introduce a probabilistic classification loss during training, combining it with hierarchical cross-entropy loss to form a composite loss function. IP region prediction experiments on the New York, Los Angeles, and Shanghai datasets demonstrate that HMCGeo achieves superior performance across all geographical granularities, significantly outperforming existing IP geolocation methods. IP geolocation is a technique used to predict the geographical location of a host based on its IP address [1], playing a crucial role in location-based services, network topology optimization, and cybersecurity [2], [3], [4], [5], [6], [7], [8]. Using IP geolocation technology, online services and applications infer the geographical location of users to deliver localized weather updates, news, and event notifications [3]. Internet service providers (ISPs) estimate the approximate location of target hosts to optimize traffic transmission paths, reduce network latency, and improve transmission efficiency [4]. Network analysts examine the geographical origins of incoming traffic to assess security threats from suspicious addresses. Based on the accuracy of prediction results, IP geolocation is categorized into coarse-grained and fine-grained geolocation. Coarse-grained IP geolocation predicts the location of a target host by utilizing allocation information such as Autonomous System Numbers (ASN), ISP, and BGP data, or by analyzing the relationship between latency and distance. These methods construct geolocation databases that provide location information at the country or city level. Building on this foundation, fine-grained IP geolocation reduces prediction errors to a few kilometers in certain regions by leveraging richer landmarks or employing more effective prediction methods.
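
The composite objective can be pictured as cross-entropy terms accumulated over the geographical hierarchy plus an extra probabilistic term, as in the hedged numpy sketch below. The level structure, the weight `lam`, and the specific form of the probabilistic-classification term are placeholders for illustration, not the formulation given in the paper.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def hierarchical_composite_loss(logits_per_level, labels_per_level, lam=0.5):
    """logits_per_level: list of (batch, n_classes_at_level) arrays; labels: list of int arrays."""
    total = 0.0
    for logits, labels in zip(logits_per_level, labels_per_level):
        p = softmax(logits)
        ce = -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()
        # placeholder "probabilistic" term: penalize low confidence on the true class
        prob_term = (1.0 - p[np.arange(len(labels)), labels]).mean()
        total += ce + lam * prob_term
    return total

# toy usage with two hypothetical granularity levels (e.g. city and district)
rng = np.random.default_rng(0)
logits = [rng.normal(size=(8, 5)), rng.normal(size=(8, 30))]
labels = [rng.integers(0, 5, size=8), rng.integers(0, 30, size=8)]
print(hierarchical_composite_loss(logits, labels))
```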


Driving by the Rules: A Benchmark for Integrating Traffic Sign Regulations into Vectorized HD Map

arXiv.org Artificial Intelligence

Ensuring adherence to traffic sign regulations is essential for both human and autonomous vehicle navigation. Current online mapping solutions, however, often prioritize the construction of the geometric and connectivity layers of HD maps while overlooking the traffic regulation layer. Addressing this gap, we introduce MapDR, a novel dataset designed for the extraction of driving rules from traffic signs and their association with vectorized, locally perceived HD maps. MapDR features over $10,000$ annotated video clips that capture the intricate correlation between traffic sign regulations and lanes. Built upon this benchmark and the newly defined task of integrating traffic regulations into online HD maps, we provide modular and end-to-end solutions, VLE-MEE and RuleVLM, which offer a strong baseline for advancing autonomous driving technology. This work fills a critical gap in the integration of traffic sign rules, contributing to the development of reliable autonomous driving systems.


Linear Spherical Sliced Optimal Transport: A Fast Metric for Comparing Spherical Data

arXiv.org Artificial Intelligence

Efficient comparison of spherical probability distributions is becoming increasingly important in fields such as computer vision, geosciences, and medicine. Sliced optimal transport distances, such as spherical and stereographic spherical sliced Wasserstein distances, have recently been developed to address this need. These methods reduce the computational burden of optimal transport by slicing hyperspheres into one-dimensional projections, i.e., lines or circles. Concurrently, linear optimal transport has been proposed to embed distributions into \( L^2 \) spaces, where the \( L^2 \) distance approximates the optimal transport distance, thereby simplifying comparisons across multiple distributions. In this work, we introduce the Linear Spherical Sliced Optimal Transport (LSSOT) framework, which utilizes slicing to embed spherical distributions into \( L^2 \) spaces while preserving their intrinsic geometry, offering a computationally efficient metric for spherical probability measures. We establish the metricity of LSSOT and demonstrate its superior computational efficiency in applications such as cortical surface registration, 3D point cloud interpolation via gradient flow, and shape embedding. Our results demonstrate the significant computational benefits and high accuracy of LSSOT in these applications.
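
The "slice, then embed linearly" pattern behind LSSOT can be illustrated in the Euclidean case: represent each equal-weight sample set by its sorted projections along a shared bank of random directions, and compare these fixed-size embeddings with an L2 distance, which reproduces a Monte Carlo estimate of the sliced 2-Wasserstein distance. LSSOT performs the analogous construction intrinsically on the sphere with respect to a reference measure; the sketch below only shows the Euclidean analogue, not the spherical construction.

```python
import numpy as np

def sliced_embedding(X, thetas):
    """X: (n, d) equal-weight samples; thetas: (L, d) unit directions shared across measures."""
    proj = X @ thetas.T                      # (n, L) projections onto each slice
    return np.sort(proj, axis=0)             # per-slice empirical quantile functions

rng = np.random.default_rng(0)
thetas = rng.normal(size=(64, 3))
thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)

X, Y = rng.normal(size=(200, 3)), rng.normal(size=(200, 3)) + 1.0
eX, eY = sliced_embedding(X, thetas), sliced_embedding(Y, thetas)
sw2 = np.sqrt(np.mean((eX - eY) ** 2))       # L2 between embeddings = sliced W2 estimate
print(sw2)
```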


Expected Sliced Transport Plans

arXiv.org Artificial Intelligence

The optimal transport (OT) problem has gained significant traction in modern machine learning for its ability to: (1) provide versatile metrics, such as Wasserstein distances and their variants, and (2) determine optimal couplings between probability measures. To reduce the computational complexity of OT solvers, methods like entropic regularization and sliced optimal transport have been proposed. The sliced OT framework improves efficiency by comparing one-dimensional projections (slices) of high-dimensional distributions. However, despite their computational efficiency, sliced-Wasserstein approaches lack a transportation plan between the input measures, limiting their use in scenarios requiring explicit coupling. In this paper, we address two key questions: Can a transportation plan be constructed between two probability measures using the sliced transport framework? If so, can this plan be used to define a metric between the measures? We propose a "lifting" operation to extend one-dimensional optimal transport plans back to the original space of the measures. By computing the expectation of these lifted plans, we derive a new transportation plan, termed expected sliced transport (EST) plans. We prove that using the EST plan to weight the sum of the individual Euclidean costs for moving from one point to another results in a valid metric between the input discrete probability measures. We demonstrate the connection between our approach and the recently proposed min-SWGG, along with illustrative numerical examples that support our theoretical findings.
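
For two equal-size, equal-weight point sets, the construction described above reduces to a few lines: on each slice the 1D optimal plan is the sorting permutation, lifting it back pairs the original points in that order, and averaging over slices gives the EST plan, whose entries then weight the Euclidean costs. The sketch below assumes this simplest setting; the general weighted, unequal-size case treated in the paper is not reproduced.

```python
import numpy as np

def est_plan(X, Y, n_slices=128, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    T = np.zeros((n, n))
    for _ in range(n_slices):
        theta = rng.normal(size=d); theta /= np.linalg.norm(theta)
        ix, iy = np.argsort(X @ theta), np.argsort(Y @ theta)
        T[ix, iy] += 1.0 / (n * n_slices)    # lift the 1D sorting plan, then average
    return T

rng = np.random.default_rng(1)
X, Y = rng.normal(size=(50, 2)), rng.normal(size=(50, 2)) + 2.0
T = est_plan(X, Y)
cost = np.sum(T * np.linalg.norm(X[:, None] - Y[None], axis=-1))  # EST-weighted Euclidean cost
print(T.sum(), cost)                          # total mass 1, EST "distance"
```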


Stereographic Spherical Sliced Wasserstein Distances

arXiv.org Artificial Intelligence

Applications involving distributions defined on a hypersphere are remarkably diverse, highlighting the importance of spherical geometries across various disciplines. These applications include: 1) mapping the distribution of geographic or geological features on celestial bodies, such as stars and planets [39, 8, 60], 2) magnetoencephalography (MEG) imaging [75] in medical domains, 3) spherical image representations and 360° images [13, 38], such as omnidirectional images in computer vision [40], 4) texture mapping in computer graphics [24, 21], and more recently, 5) deep representation learning, where the latent representation is often mapped to a bounded space, commonly a sphere, on which cosine similarity is utilized for effective representation learning [11, 76]. The analysis of distributions on hyperspheres is traditionally approached through directional statistics, also referred to as circular/spherical statistics [37, 52, 50, 61]. This specialized field is dedicated to the statistical analysis of directions, orientations, and rotations. More recently, with the growing application of optimal transport theory [74, 62] in machine learning, due in part to its favorable statistical, geometrical, and topological properties, there has been an increasing interest in using optimal transport to compare spherical probability measures [14, 32]. One of the main bottlenecks in optimal transport theory is its high computational cost, generally of cubic complexity.


LCOT: Linear circular optimal transport

arXiv.org Artificial Intelligence

The optimal transport problem for measures supported on non-Euclidean spaces has recently gained ample interest in diverse applications involving representation learning. In this paper, we focus on circular probability measures, i.e., probability measures supported on the unit circle, and introduce a new computationally efficient metric for these measures, denoted as Linear Circular Optimal Transport (LCOT). The proposed metric comes with an explicit linear embedding that allows one to apply Machine Learning (ML) algorithms to the embedded measures and seamlessly modify the underlying metric for the ML algorithm to LCOT. We show that the proposed metric is rooted in the Circular Optimal Transport (COT) and can be considered the linearization of the COT metric with respect to a fixed reference measure. We provide a theoretical analysis of the proposed metric and derive the computational complexities for pairwise comparison of circular probability measures. Lastly, through a set of numerical experiments, we demonstrate the benefits of LCOT in learning representations of circular measures.
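
As a point of comparison for what LCOT linearizes away, the brute-force baseline below computes a circular OT distance between two sets of n equal-weight points on the unit circle by scanning the n cyclic shifts of their sorted angles (a valid reduction when the cost is a convex function of arc length, so an optimal matching can be taken cyclic-order preserving). LCOT instead embeds each measure once, relative to a fixed reference, so that pairwise comparisons become plain L2 distances between embeddings; that embedding is not reproduced here, and this sketch is only a baseline under the stated assumptions.

```python
import numpy as np

def circular_dist(a, b):
    d = np.abs(a - b) % (2 * np.pi)
    return np.minimum(d, 2 * np.pi - d)        # geodesic distance on the circle

def circular_ot(theta_x, theta_y, p=2):
    x, y = np.sort(theta_x % (2 * np.pi)), np.sort(theta_y % (2 * np.pi))
    costs = [np.mean(circular_dist(x, np.roll(y, k)) ** p) for k in range(len(x))]
    return min(costs) ** (1.0 / p)             # best cyclic-order-preserving matching

rng = np.random.default_rng(0)
tx = 0.1 * rng.normal(size=200)                # samples concentrated near angle 0
ty = 1.0 + 0.1 * rng.normal(size=200)          # samples concentrated near angle 1
print(circular_ot(tx, ty))                     # roughly 1.0, the angular separation
```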


PT$\mathrm{L}^{p}$: Partial Transport $\mathrm{L}^{p}$ Distances

arXiv.org Artificial Intelligence

Optimal transport and its related problems, including optimal partial transport, have proven to be valuable tools in machine learning for computing meaningful distances between probability or positive measures. This success has led to a growing interest in defining transport-based distances that allow for comparing signed measures and, more generally, multi-channeled signals. Transport $\mathrm{L}^{p}$ distances are notable extensions of the optimal transport framework to signed and possibly multi-channeled signals. In this paper, we introduce partial transport $\mathrm{L}^{p}$ distances as a new family of metrics for comparing generic signals, benefiting from the robustness of partial transport distances. We provide theoretical background such as the existence of optimal plans and the behavior of the distance in various limits. Furthermore, we introduce the sliced variation of these distances, which allows for rapid comparison of generic signals. Finally, we demonstrate the application of the proposed distances in signal class separability and nearest neighbor classification.
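
A hedged sketch of these ideas for two 1D signals: each signal is lifted to a point cloud on its graph (x, signal(x)), and a partial OT problem between the two clouds is solved via the standard dummy-point reduction, so unmatched mass is paid for at a fixed rate `lam` instead of being forcibly transported. The code assumes the POT package (`pip install pot`); the lifting, the parameter names, and the penalty form are illustrative, and the exact PTL^p construction and its sliced variant from the paper are not reproduced.

```python
import numpy as np
import ot  # Python Optimal Transport (POT)

def partial_tlp(x, f, g, p=2, lam=1.0):
    """x: shared 1D grid; f, g: signal values on the grid; lam: cost per unit of unmatched mass."""
    Pf = np.column_stack([x, f])               # graph of f as a point cloud
    Pg = np.column_stack([x, g])               # graph of g as a point cloud
    C = np.sum(np.abs(Pf[:, None] - Pg[None]) ** p, axis=-1)
    n, m = C.shape
    a = np.full(n, 1.0 / n); b = np.full(m, 1.0 / m)
    # dummy row/column absorb unmatched mass at cost lam per unit
    C_aug = np.full((n + 1, m + 1), lam); C_aug[:n, :m] = C; C_aug[n, m] = 0.0
    a_aug = np.append(a, b.sum()); b_aug = np.append(b, a.sum())
    plan = ot.emd(a_aug, b_aug, C_aug)
    return np.sum(plan * C_aug)                # transport cost plus unmatched-mass penalty

x = np.linspace(0, 1, 100)
f = np.sin(2 * np.pi * x)
g = np.sin(2 * np.pi * (x - 0.1)) + (x > 0.8) * 2.0   # shifted copy plus an outlier bump
print(partial_tlp(x, f, g, lam=0.5))
```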