AITopics | Wang, Xiaolu

Collaborating Authors

Wang, Xiaolu

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Exponential Topology-enabled Scalable Communication in Multi-agent Reinforcement Learning

Li, Xinran, Wang, Xiaolu, Bai, Chenjia, Zhang, Jun

arXiv.org Artificial IntelligenceFeb-26-2025

In cooperative multi-agent reinforcement learning (MARL), well-designed communication protocols can effectively facilitate consensus among agents, thereby enhancing task performance. Moreover, in large-scale multi-agent systems commonly found in real-world applications, effective communication plays an even more critical role due to the escalated challenge of partial observability compared to smaller-scale setups. In this work, we endeavor to develop a scalable communication protocol for MARL. Unlike previous methods that focus on selecting optimal pairwise communication links-a task that becomes increasingly complex as the number of agents grows-we adopt a global perspective on communication topology design. Specifically, we propose utilizing the exponential topology to enable rapid information dissemination among agents by leveraging its small-diameter and small-size properties. This approach leads to a scalable communication protocol, named ExpoComm. To fully unlock the potential of exponential graphs as communication topologies, we employ memory-based message processors and auxiliary tasks to ground messages, ensuring that they reflect global information and benefit decision-making. Extensive experiments on large-scale cooperative benchmarks, including MAgent and Infrastructure Management Planning, demonstrate the superior performance and robust zero-shot transferability of ExpoComm compared to existing communication strategies. The code is publicly available at https://github.com/LXXXXR/ExpoComm.

agent, artificial intelligence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2502.19717

Country: Asia > China (0.28)

Genre: Research Report (0.82)

Industry: Telecommunications (0.48)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

SAFL: Structure-Aware Personalized Federated Learning via Client-Specific Clustering and SCSI-Guided Model Pruning

Li, Nan, Wang, Xiaolu, Du, Xiao, Cai, Puyu, Wang, Ting

arXiv.org Artificial IntelligenceJan-30-2025

Federated Learning (FL) enables clients to collaboratively train machine learning models without sharing local data, preserving privacy in diverse environments. While traditional FL approaches preserve privacy, they often struggle with high computational and communication overhead. To address these issues, model pruning is introduced as a strategy to streamline computations. However, existing pruning methods, when applied solely based on local data, often produce sub-models that inadequately reflect clients' specific tasks due to data insufficiency. To overcome these challenges, this paper introduces SAFL (Structure-Aware Federated Learning), a novel framework that enhances personalized federated learning through client-specific clustering and Similar Client Structure Information (SCSI)-guided model pruning. SAFL employs a two-stage process: initially, it groups clients based on data similarities and uses aggregated pruning criteria to guide the pruning process, facilitating the identification of optimal sub-models. Subsequently, clients train these pruned models and engage in server-based aggregation, ensuring tailored and efficient models for each client. This method significantly reduces computational overhead while improving inference accuracy. Extensive experiments demonstrate that SAFL markedly diminishes model size and improves performance, making it highly effective in federated environments characterized by heterogeneous data.

artificial intelligence, machine learning, pruning, (15 more...)

arXiv.org Artificial Intelligence

2501.18659

Country: North America > United States > New York (0.14)

Genre:

Research Report > Promising Solution (0.46)
Research Report > New Finding (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Dual-Delayed Asynchronous SGD for Arbitrarily Heterogeneous Data

Wang, Xiaolu, Sun, Yuchang, Wai, Hoi-To, Zhang, Jun

arXiv.org Artificial IntelligenceMay-27-2024

We consider the distributed learning problem with data dispersed across multiple workers under the orchestration of a central server. Asynchronous Stochastic Gradient Descent (SGD) has been widely explored in such a setting to reduce the synchronization overhead associated with parallelization. However, the performance of asynchronous SGD algorithms often depends on a bounded dissimilarity condition among the workers' local data, a condition that can drastically affect their efficiency when the workers' data are highly heterogeneous. To overcome this limitation, we introduce the \textit{dual-delayed asynchronous SGD (DuDe-ASGD)} algorithm designed to neutralize the adverse effects of data heterogeneity. DuDe-ASGD makes full use of stale stochastic gradients from all workers during asynchronous training, leading to two distinct time lags in the model parameters and data samples utilized in the server's iterations. Furthermore, by adopting an incremental aggregation strategy, DuDe-ASGD maintains a per-iteration computational cost that is on par with traditional asynchronous SGD algorithms. Our analysis demonstrates that DuDe-ASGD achieves a near-minimax-optimal convergence rate for smooth nonconvex problems, even when the data across workers are extremely heterogeneous. Numerical experiments indicate that DuDe-ASGD compares favorably with existing asynchronous and synchronous SGD-based algorithms.

artificial intelligence, dude-asgd, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2405.16966

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.64)

Industry:

Information Technology (0.46)
Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

Linear Speedup of Incremental Aggregated Gradient Methods on Streaming Data

Wang, Xiaolu, Jin, Cheng, Wai, Hoi-To, Gu, Yuantao

arXiv.org Artificial IntelligenceSep-10-2023

This paper considers a type of incremental aggregated gradient (IAG) method for large-scale distributed optimization. The IAG method is well suited for the parameter server architecture as the latter can easily aggregate potentially staled gradients contributed by workers. Although the convergence of IAG in the case of deterministic gradient is well known, there are only a few results for the case of its stochastic variant based on streaming data. Considering strongly convex optimization, this paper shows that the streaming IAG method achieves linear speedup when the workers are updating frequently enough, even if the data sample distribution across workers are heterogeneous. We show that the expected squared distance to optimal solution decays at O((1+T)/(nt)), where $n$ is the number of workers, t is the iteration number, and T/n is the update frequency of workers. Our analysis involves careful treatments of the conditional expectations with staled gradients and a recursive system with both delayed and noise terms, which are new to the analysis of IAG-type algorithms. Numerical results are presented to verify our findings.

artificial intelligence, machine learning, optimization problem, (14 more...)

arXiv.org Artificial Intelligence

2309.0498

Country: Asia > China (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Networks (0.85)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)

Add feedback

Distributionally Robust Graph Learning from Smooth Signals under Moment Uncertainty

Wang, Xiaolu, Pun, Yuen-Man, So, Anthony Man-Cho

arXiv.org Artificial IntelligenceNov-24-2021

We consider the problem of learning a graph from a finite set of noisy graph signal observations, the goal of which is to find a smooth representation of the graph signal. Such a problem is motivated by the desire to infer relational structure in large datasets and has been extensively studied in recent years. Most existing approaches focus on learning a graph on which the observed signals are smooth. However, the learned graph is prone to overfitting, as it does not take the unobserved signals into account. To address this issue, we propose a novel graph learning model based on the distributionally robust optimization methodology, which aims to identify a graph that not only provides a smooth representation of but is also robust against uncertainties in the observed signals. On the statistics side, we establish out-of-sample performance guarantees for our proposed model. On the optimization side, we show that under a mild assumption on the graph signal distribution, our proposed model admits a smooth non-convex optimization formulation. We then develop a projected gradient method to tackle this formulation and establish its convergence guarantees. Our formulation provides a new perspective on regularization in the graph learning setting. Moreover, extensive numerical experiments on both synthetic and real-world data show that our model has comparable yet more robust performance across different populations of observed signals than existing non-robust models according to various metrics.

artificial intelligence, graph, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TSP.2022.3229950

2105.05458

Country: Asia (0.28)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback