AITopics | Xu, Hongli

Collaborating Authors

Xu, Hongli

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Top-$n\sigma$: Not All Logits Are You Need

Tang, Chenxia, Liu, Jianchun, Xu, Hongli, Huang, Liusheng

arXiv.org Artificial IntelligenceNov-12-2024

Large language models (LLMs) typically employ greedy decoding or low-temperature sampling for reasoning tasks, reflecting a perceived trade-off between diversity and accuracy. We challenge this convention by introducing top-$n\sigma$, a novel sampling method that operates directly on pre-softmax logits by leveraging a statistical threshold. Our key insight is that logits naturally separate into a Gaussian-distributed noisy region and a distinct informative region, enabling efficient token filtering without complex probability manipulations. Unlike existing methods (e.g., top-$p$, min-$p$) that inadvertently include more noise tokens at higher temperatures, top-$n\sigma$ maintains a stable sampling space regardless of temperature scaling. We also provide a theoretical analysis of top-$n\sigma$ to better understand its behavior. The extensive experimental results across four reasoning-focused datasets demonstrate that our method not only outperforms existing sampling approaches but also surpasses greedy decoding, while maintaining consistent performance even at high temperatures.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2411.07641

Country:

Asia (0.14)
Oceania > Australia (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)

Add feedback

MergeSFL: Split Federated Learning with Feature Merging and Batch Size Regulation

Liao, Yunming, Xu, Yang, Xu, Hongli, Wang, Lun, Yao, Zhiwei, Qiao, Chunming

arXiv.org Artificial IntelligenceNov-22-2023

Recently, federated learning (FL) has emerged as a popular technique for edge AI to mine valuable knowledge in edge computing (EC) systems. To mitigate the computing/communication burden on resource-constrained workers and protect model privacy, split federated learning (SFL) has been released by integrating both data and model parallelism. Despite resource limitations, SFL still faces two other critical challenges in EC, i.e., statistical heterogeneity and system heterogeneity. To address these challenges, we propose a novel SFL framework, termed MergeSFL, by incorporating feature merging and batch size regulation in SFL. Concretely, feature merging aims to merge the features from workers into a mixed feature sequence, which is approximately equivalent to the features derived from IID data and is employed to promote model accuracy. While batch size regulation aims to assign diverse and suitable batch sizes for heterogeneous workers to improve training efficiency. Moreover, MergeSFL explores to jointly optimize these two strategies upon their coupled relationship to better enhance the performance of SFL. Extensive experiments are conducted on a physical platform with 80 NVIDIA Jetson edge devices, and the experimental results show that MergeSFL can improve the final model accuracy by 5.82% to 26.22%, with a speedup by about 1.74x to 4.14x, compared to the baselines.

artificial intelligence, batch size, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2311.13348

Genre: Research Report > New Finding (0.88)

Industry: Information Technology (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Efficient Semi-Supervised Federated Learning for Heterogeneous Participants

Sun, Zhipeng, Xu, Yang, Xu, Hongli, Wang, Zhiyuan, Liao, Yunming

arXiv.org Artificial IntelligenceNov-10-2023

Federated Learning (FL) has emerged to allow multiple clients to collaboratively train machine learning models on their private data. However, training and deploying large-scale models on resource-constrained clients is challenging. Fortunately, Split Federated Learning (SFL) offers a feasible solution by alleviating the computation and/or communication burden on clients. However, existing SFL works often assume sufficient labeled data on clients, which is usually impractical. Besides, data non-IIDness across clients poses another challenge to ensure efficient model training. To our best knowledge, the above two issues have not been simultaneously addressed in SFL. Herein, we propose a novel Semi-SFL system, which incorporates clustering regularization to perform SFL under the more practical scenario with unlabeled and non-IID client data. Moreover, our theoretical and experimental investigations into model convergence reveal that the inconsistent training processes on labeled and unlabeled data have an influence on the effectiveness of clustering regularization. To this end, we develop a control algorithm for dynamically adjusting the global updating frequency, so as to mitigate the training inconsistency and improve training performance. Extensive experiments on benchmark models and datasets show that our system provides a 3.0x speed-up in training time and reduces the communication cost by about 70.3% while reaching the target accuracy, and achieves up to 5.1% improvement in accuracy under non-IID scenarios compared to the state-of-the-art baselines.

accuracy, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2307.1587

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

Adaptive Control of Client Selection and Gradient Compression for Efficient Federated Learning

Jiang, Zhida, Xu, Yang, Xu, Hongli, Wang, Zhiyuan, Qian, Chen

arXiv.org Artificial IntelligenceDec-19-2022

Federated learning (FL) allows multiple clients cooperatively train models without disclosing local data. However, the existing works fail to address all these practical concerns in FL: limited communication resources, dynamic network conditions and heterogeneous client properties, which slow down the convergence of FL. To tackle the above challenges, we propose a heterogeneity-aware FL framework, called FedCG, with adaptive client selection and gradient compression. Specifically, the parameter server (PS) selects a representative client subset considering statistical heterogeneity and sends the global model to them. After local training, these selected clients upload compressed model updates matching their capabilities to the PS for aggregation, which significantly alleviates the communication load and mitigates the straggler effect. We theoretically analyze the impact of both client selection and gradient compression on convergence performance. Guided by the derived convergence rate, we develop an iteration-based algorithm to jointly optimize client selection and compression ratio decision using submodular maximization and linear programming. Extensive experiments on both real-world prototypes and simulations show that FedCG can provide up to 5.3$\times$ speedup compared to other methods.

artificial intelligence, client selection, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2212.09483

Genre: Research Report (1.00)

Industry:

Education (0.66)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

Add feedback