mmfl
Multimodal Fused Learning for Solving the Generalized Traveling Salesman Problem in Robotic Task Planning
Chen, Jiaqi, Fan, Mingfeng, Zhang, Xuefeng, Liang, Jingsong, Cao, Yuhong, Wu, Guohua, Sartoretti, Guillaume Adrien
Effective and efficient task planning is essential for mobile robots, especially in applications like warehouse retrieval and environmental monitoring. These tasks often involve selecting one location from each of several target clusters, forming a Generalized Traveling Salesman Problem (GTSP) that remains challenging to solve both accurately and efficiently. To address this, we propose a Multimodal Fused Learning (MMFL) framework that leverages both graph and image-based representations to capture complementary aspects of the problem, and learns a policy capable of generating high-quality task planning schemes in real time. Specifically, we first introduce a coordinate-based image builder that transforms GTSP instances into spatially informative representations. We then design an adaptive resolution scaling strategy to enhance adaptability across different problem scales, and develop a multimodal fusion module with dedicated bottlenecks that enables effective integration of geometric and spatial features. Extensive experiments show that our MMFL approach significantly outperforms state-of-the-art methods across various GTSP instances while maintaining the computational efficiency required for real-time robotic applications. Physical robot tests further validate its practical effectiveness in real-world scenarios.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.88)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.85)
- Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.82)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Towards Optimal Heterogeneous Client Sampling in Multi-Model Federated Learning
Zhang, Haoran, Gong, Zejun, Li, Zekai, Siew, Marie, Joe-Wong, Carlee, El-Azouzi, Rachid
Federated learning (FL) allows edge devices to collaboratively train models without sharing local data. As FL gains popularity, clients may need to train multiple unrelated FL models, but communication constraints limit their ability to train all models simultaneously. While clients could train FL models sequentially, opportunistically having FL clients concurrently train different models -- termed multi-model federated learning (MMFL) -- can reduce the overall training time. Prior work uses simple client-to-model assignments that do not optimize the contribution of each client to each model over the course of its training. Prior work on single-model FL shows that intelligent client selection can greatly accelerate convergence, but naïve extensions to MMFL can violate heterogeneous resource constraints at both the server and the clients. In this work, we develop a novel convergence analysis of MMFL with arbitrary client sampling methods, theoretically demonstrating the strengths and limitations of previous well-established gradient-based methods. Motivated by this analysis, we propose MMFL-LVR, a loss-based sampling method that minimizes training variance while explicitly respecting communication limits at the server and reducing computational costs at the clients. We extend this to MMFL-StaleVR, which incorporates stale updates for improved efficiency and stability, and MMFL-StaleVRE, a lightweight variant suitable for low-overhead deployment. Experiments show our methods improve average accuracy by up to 19.1% over random sampling, with only a 5.4% gap from the theoretical optimum (full client participation).
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- Asia > Singapore (0.04)
- Europe > France (0.04)
Supervised Multi-Modal Fission Learning
Mao, Lingchao, wang, Qi, Su, Yi, Lure, Fleming, Li, Jing
Learning from multimodal datasets can leverage complementary information and improve performance in prediction tasks. A commonly used strategy to account for feature correlations in high-dimensional datasets is the latent variable approach. Several latent variable methods have been proposed for multimodal datasets. However, these methods either focus on extracting the shared component across all modalities or on extracting both a shared component and individual components specific to each modality. To address this gap, we propose a Multi-Modal Fission Learning (MMFL) model that simultaneously identifies globally joint, partially joint, and individual components underlying the features of multimodal datasets. Unlike existing latent variable methods, MMFL uses supervision from the response variable to identify predictive latent components and has a natural extension for incorporating incomplete multimodal data. Through simulation studies, we demonstrate that MMFL outperforms various existing multimodal algorithms in both complete and incomplete modality settings. We applied MMFL to a real-world case study for early prediction of Alzheimers Disease using multimodal neuroimaging and genomics data from the Alzheimers Disease Neuroimaging Initiative (ADNI) dataset. MMFL provided more accurate predictions and better insights into within- and across-modality correlations compared to existing methods.
- North America > United States > Maryland > Montgomery County > Rockville (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- North America > United States > Arizona > Maricopa County > Tempe (0.04)
- (3 more...)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.87)