Goto

Collaborating Authors

 Clustering


Multi-Robot Task Allocation for Homogeneous Tasks with Collision Avoidance via Spatial Clustering

arXiv.org Artificial Intelligence

In this paper, a novel framework is presented that achieves a combined solution based on Multi-Robot Task Allocation (MRTA) and collision avoidance with respect to homogeneous measurement tasks taking place in industrial environments. The spatial clustering we propose offers to simultaneously solve the task allocation problem and deal with collision risks by cutting the workspace into distinguishable operational zones for each robot. To divide task sites and to schedule robot routes within corresponding clusters, we use K-means clustering and the 2-Opt algorithm. The presented framework shows satisfactory performance, where up to 93\% time reduction (1.24s against 17.62s) with a solution quality improvement of up to 7\% compared to the best performing method is demonstrated. Our method also completely eliminates collision points that persist in comparative methods in a most significant sense. Theoretical analysis agrees with the claim that spatial partitioning unifies the apparently disjoint tasks allocation and collision avoidance problems under conditions of many identical tasks to be distributed over sparse geographical areas. Ultimately, the findings in this work are of substantial importance for real world applications where both computational efficiency and operation free from collisions is of paramount importance.


Energy-Efficient Federated Learning for AIoT using Clustering Methods

arXiv.org Artificial Intelligence

While substantial research has been devoted to optimizing model performance, convergence rates, and communication efficiency, the energy implications of federated learning (FL) within Artificial Intelligence of Things (AIoT) scenarios are often overlooked in the existing literature. This study examines the energy consumed during the FL process, focusing on three main energy-intensive processes: pre-processing, communication, and local learning, all contributing to the overall energy footprint. We rely on the observation that device/client selection is crucial for speeding up the convergence of model training in a distributed AIoT setting and propose two clustering-informed methods. These clustering solutions are designed to group AIoT devices with similar label distributions, resulting in clusters composed of nearly heterogeneous devices. Hence, our methods alleviate the heterogeneity often encountered in real-world distributed learning applications. Throughout extensive numerical experimentation, we demonstrate that our clustering strategies typically achieve high convergence rates while maintaining low energy consumption when compared to other recent approaches available in the literature.


Rhomboid Tiling for Geometric Graph Deep Learning

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) have proven effective for learning from graph-structured data through their neighborhood-based message passing framework. Many hierarchical graph clustering pooling methods modify this framework by introducing clustering-based strategies, enabling the construction of more expressive and powerful models. However, all of these message passing framework heavily rely on the connectivity structure of graphs, limiting their ability to capture the rich geometric features inherent in geometric graphs. To address this, we propose Rhomboid Tiling (RT) clustering, a novel clustering method based on the rhomboid tiling structure, which performs clustering by leveraging the complex geometric information of the data and effectively extracts its higher-order geometric structures. Moreover, we design RTPool, a hierarchical graph clustering pooling model based on RT clustering for graph classification tasks. The proposed model demonstrates superior performance, outperforming 21 state-of-the-art competitors on all the 7 benchmark datasets.


Enhancing Trust Management System for Connected Autonomous Vehicles Using Machine Learning Methods: A Survey

arXiv.org Artificial Intelligence

Connected Autonomous Vehicles (CAVs) operate in dynamic, open, and multi-domain networks, rendering them vulnerable to various threats. Trust Management Systems (TMS) systematically organize essential steps in the trust mechanism, identifying malicious nodes against internal threats and external threats, as well as ensuring reliable decision-making for more cooperative tasks. Recent advances in machine learning (ML) offer significant potential to enhance TMS, especially for the strict requirements of CAVs, such as CAV nodes moving at varying speeds, and opportunistic and intermittent network behavior. Those features distinguish ML-based TMS from social networks, static IoT, and Social IoT. This survey proposes a novel three-layer ML-based TMS framework for CAVs in the vehicle-road-cloud integration system, i.e., trust data layer, trust calculation layer and trust incentive layer. A six-dimensional taxonomy of objectives is proposed. Furthermore, the principles of ML methods for each module in each layer are analyzed. Then, recent studies are categorized based on traffic scenarios that are against the proposed objectives. Finally, future directions are suggested, addressing the open issues and meeting the research trend. We maintain an active repository that contains up-to-date literature and open-source projects at https://github.com/octoberzzzzz/ML-based-TMS-CAV-Survey.


An Identifiable Cost-Aware Causal Decision-Making Framework Using Counterfactual Reasoning

arXiv.org Artificial Intelligence

Decision making under abnormal conditions is a critical process that involves evaluating the current state and determining the optimal action to restore the system to a normal state at an acceptable cost. However, in such scenarios, existing decision-making frameworks highly rely on reinforcement learning or root cause analysis, resulting in them frequently neglecting the cost of the actions or failing to incorporate causal mechanisms adequately. By relaxing the existing causal decision framework to solve the necessary cause, we propose a minimum-cost causal decision (MiCCD) framework via counterfactual reasoning to address the above challenges. Emphasis is placed on making counterfactual reasoning processes identifiable in the presence of a large amount of mixed anomaly data, as well as finding the optimal intervention state in a continuous decision space. Specifically, it formulates a surrogate model based on causal graphs, using abnormal pattern clustering labels as supervisory signals. This enables the approximation of the structural causal model among the variables and lays a foundation for identifiable counterfactual reasoning. With the causal structure approximated, we then established an optimization model based on counterfactual estimation. The Sequential Least Squares Programming (SLSQP) algorithm is further employed to optimize intervention strategies while taking costs into account. Experimental evaluations on both synthetic and real-world datasets reveal that MiCCD outperforms conventional methods across multiple metrics, including F1-score, cost efficiency, and ranking quality(nDCG@k values), thus validating its efficacy and broad applicability.


Unsupervised Urban Land Use Mapping with Street View Contrastive Clustering and a Geographical Prior

arXiv.org Artificial Intelligence

Urban land use classification and mapping are critical for urban planning, resource management, and environmental monitoring. Existing remote sensing techniques often lack precision in complex urban environments due to the absence of ground-level details. Unlike aerial perspectives, street view images provide a ground-level view that captures more human and social activities relevant to land use in complex urban scenes. Existing street view-based methods primarily rely on supervised classification, which is challenged by the scarcity of high-quality labeled data and the difficulty of generalizing across diverse urban landscapes. This study introduces an unsupervised contrastive clustering model for street view images with a built-in geographical prior, to enhance clustering performance. When combined with a simple visual assignment of the clusters, our approach offers a flexible and customizable solution to land use mapping, tailored to the specific needs of urban planners. We experimentally show that our method can generate land use maps from geotagged street view image datasets of two cities. As our methodology relies on the universal spatial coherence of geospatial data ("Tobler's law"), it can be adapted to various settings where street view images are available, to enable scalable, unsupervised land use mapping and updating. The code will be available at https://github.com/lin102/CCGP.


Deep Representation Learning for Unsupervised Clustering of Myocardial Fiber Trajectories in Cardiac Diffusion Tensor Imaging

arXiv.org Artificial Intelligence

Understanding the complex myocardial architecture is critical for diagnosing and treating heart disease. However, existing methods often struggle to accurately capture this intricate structure from Diffusion Tensor Imaging (DTI) data, particularly due to the lack of ground truth labels and the ambiguous, intertwined nature of fiber trajectories. We present a novel deep learning framework for unsupervised clustering of myocardial fibers, providing a data-driven approach to identifying distinct fiber bundles. We uniquely combine a Bidirectional Long Short-Term Memory network to capture local sequential information along fibers, with a Transformer autoencoder to learn global shape features, with pointwise incorporation of essential anatomical context. Clustering these representations using a density-based algorithm identifies 33 to 62 robust clusters, successfully capturing the subtle distinctions in fiber trajectories with varying levels of granularity. Our framework offers a new, flexible, and quantitative way to analyze myocardial structure, achieving a level of delineation that, to our knowledge, has not been previously achieved, with potential applications in improving surgical planning, characterizing disease-related remodeling, and ultimately, advancing personalized cardiac care.


Clustering of Incomplete Data via a Bipartite Graph Structure

arXiv.org Artificial Intelligence

There are various approaches to graph learning for data clustering, incorporating different spectral and structural constraints through diverse graph structures. Some methods rely on bipartite graph models, where nodes are divided into two classes: centers and members. These models typically require access to data for the center nodes in addition to observations from the member nodes. However, such additional data may not always be available in many practical scenarios. Moreover, popular Gaussian models for graph learning have demonstrated limited effectiveness in modeling data with heavy-tailed distributions, which are common in financial markets. In this paper, we propose a clustering method based on a bipartite graph model that addresses these challenges. First, it can infer clusters from incomplete data without requiring information about the center nodes. Second, it is designed to effectively handle heavy-tailed data. Numerical experiments using real financial data validate the efficiency of the proposed method for data clustering.


Fair Clustering with Clusterlets

arXiv.org Artificial Intelligence

Given their widespread usage in the real world, the fairness of clustering methods has become of major interest. Theoretical results on fair clustering show that fairness enjoys transitivity: given a set of small and fair clusters, a trivial centroid-based clustering algorithm yields a fair clustering. Unfortunately, discovering a suitable starting clustering can be computationally expensive, rather complex or arbitrary. In this paper, we propose a set of simple \emph{clusterlet}-based fuzzy clustering algorithms that match single-class clusters, optimizing fair clustering. Matching leverages clusterlet distance, optimizing for classic clustering objectives, while also regularizing for fairness. Empirical results show that simple matching strategies are able to achieve high fairness, and that appropriate parameter tuning allows to achieve high cohesion and low overlap.


FCPCA: Fuzzy clustering of high-dimensional time series based on common principal component analysis

arXiv.org Machine Learning

Clustering multivariate time series data is a crucial task in many domains, as it enables the identification of meaningful patterns and groups in time-evolving data. Traditional approaches, such as crisp clustering, rely on the assumption that clusters are sufficiently separated with little overlap. However, real-world data often defy this assumption, exhibiting overlapping distributions or overlapping clouds of points and blurred boundaries between clusters. Fuzzy clustering offers a compelling alternative by allowing partial membership in multiple clusters, making it well-suited for these ambiguous scenarios. Despite its advantages, current fuzzy clustering methods primarily focus on univariate time series, and for multivariate cases, even datasets of moderate dimensionality become computationally prohibitive. This challenge is further exacerbated when dealing with time series of varying lengths, leaving a clear gap in addressing the complexities of modern datasets. This work introduces a novel fuzzy clustering approach based on common principal component analysis to address the aforementioned shortcomings. Our method has the advantage of efficiently handling high-dimensional multivariate time series by reducing dimensionality while preserving critical temporal features. Extensive numerical results show that our proposed clustering method outperforms several existing approaches in the literature. An interesting application involving brain signals from different drivers recorded from a simulated driving experiment illustrates the potential of the approach.