
Collaborating Authors

 Wu, Shuyuan


Network EM Algorithm for Gaussian Mixture Model in Decentralized Federated Learning

arXiv.org Machine Learning

We systematically study various network Expectation-Maximization (EM) algorithms for the Gaussian mixture model within the framework of decentralized federated learning. Our theoretical investigation reveals that directly extending the classical decentralized supervised learning method to the EM algorithm yields poor estimation accuracy when data are heterogeneous across clients and struggles to converge numerically when the Gaussian components are poorly separated. To address these issues, we propose two novel solutions. First, to handle heterogeneous data, we introduce a momentum network EM (MNEM) algorithm, which uses a momentum parameter to combine information from both the current and historical estimators. Second, to tackle the challenge of poorly separated Gaussian components, we develop a semi-supervised MNEM (semi-MNEM) algorithm, which leverages partially labeled data. Rigorous theoretical analysis demonstrates that MNEM can achieve statistical efficiency comparable to that of the whole-sample estimator when the mixture components satisfy certain separation conditions, even in heterogeneous scenarios. Moreover, the semi-MNEM estimator enhances the convergence speed of the MNEM algorithm, effectively addressing the numerical convergence challenges in poorly separated scenarios. Extensive simulation and real data analyses are conducted to justify our theoretical findings.
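
The sketch below illustrates one way the momentum update described above could look for a two-component 1-D Gaussian mixture: each client runs a local EM step and then combines the result with a network-weighted average of its neighbors' previous estimates through a momentum parameter. The function names, the weight matrix W, and the exact way the momentum parameter enters are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of a momentum-style network EM round for a 1-D Gaussian
# mixture. Assumption: each client performs one local EM step, then mixes the
# new local estimate with a network-weighted average of neighbors' previous
# estimates via a momentum parameter `alpha`. Names are illustrative.
import numpy as np

def local_em_step(x, pi, mu, sigma2):
    """One EM iteration for a K-component 1-D Gaussian mixture on local data x."""
    K = len(pi)
    # E-step: responsibilities (N x K)
    dens = np.stack([
        pi[k] * np.exp(-(x - mu[k]) ** 2 / (2 * sigma2[k])) / np.sqrt(2 * np.pi * sigma2[k])
        for k in range(K)
    ], axis=1)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: update mixture weights, means, and variances
    nk = resp.sum(axis=0)
    pi_new = nk / len(x)
    mu_new = (resp * x[:, None]).sum(axis=0) / nk
    sigma2_new = (resp * (x[:, None] - mu_new) ** 2).sum(axis=0) / nk
    return pi_new, mu_new, sigma2_new

def mnem_round(data, theta, W, alpha):
    """One communication round: local EM step + momentum-weighted network averaging.
    data[m]  : local sample held by client m
    theta[m] : (pi, mu, sigma2) currently held by client m
    W        : doubly stochastic weight matrix of the communication network
    alpha    : momentum parameter combining current and historical estimators
    """
    M = len(data)
    local = [local_em_step(data[m], *theta[m]) for m in range(M)]
    new_theta = []
    for m in range(M):
        mixed = []
        for j in range(3):  # pi, mu, sigma2
            neighbor_avg = sum(W[m, l] * theta[l][j] for l in range(M))
            mixed.append(alpha * local[m][j] + (1 - alpha) * neighbor_avg)
        new_theta.append(tuple(mixed))
    return new_theta
```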


A Selective Review on Statistical Methods for Massive Data Computation: Distributed Computing, Subsampling, and Minibatch Techniques

arXiv.org Artificial Intelligence

This paper presents a selective review of statistical computation methods for massive data analysis. A large number of statistical methods for massive data computation have been developed rapidly over the past decades. In this work, we focus on three categories of statistical computation methods: (1) distributed computing, (2) subsampling methods, and (3) minibatch gradient techniques. The first class of literature concerns distributed computing and focuses on the situation where the dataset is too large to be comfortably handled by a single computer. In this case, a distributed computing system with multiple computers has to be utilized. The second class of literature concerns subsampling methods and addresses the situation where the dataset is small enough to be stored on a single computer but too large to be easily processed in its memory as a whole. The last class of literature studies optimization techniques related to minibatch gradients, which have been extensively used for optimizing various deep learning models.
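
As a concrete illustration of the third category, a textbook-style minibatch stochastic gradient descent routine for least-squares regression is sketched below; it is a generic example of the technique, not code from any of the reviewed works.

```python
# A generic minibatch stochastic gradient descent sketch for least-squares
# regression, illustrating the minibatch gradient category of methods.
import numpy as np

def minibatch_sgd(X, y, lr=0.01, batch_size=64, epochs=10, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(epochs):
        idx = rng.permutation(n)                       # shuffle once per epoch
        for start in range(0, n, batch_size):
            b = idx[start:start + batch_size]
            # gradient of the squared loss on the current minibatch
            grad = X[b].T @ (X[b] @ beta - y[b]) / len(b)
            beta -= lr * grad
    return beta
```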


Quasi-Newton Updating for Large-Scale Distributed Learning

arXiv.org Artificial Intelligence

Modern statistical analysis often involves massive datasets (Gopal and Yang, 2013). In many cases, such datasets are too large to be handled efficiently by a single computer. Instead, they have to be divided and then processed on a distributed computer system, which consists of a large number of computers (Zhang et al., 2012). Among these computers, one often serves as the central computer, while the rest serve as worker computers. In this scenario, the central computer is connected to all worker computers to form a distributed computing system. Approaches for efficient statistical learning on such distributed computing systems have therefore received considerable interest from the research community (Mcdonald et al., 2009; Jordan et al., 2019; Tang et al., 2020; Hector and Song, 2020, 2021). Here, we consider a standard statistical learning problem with a total of N observations, where N is assumed to be very large.
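
To make the central/worker setup concrete, the following is a minimal sketch of one possible quasi-Newton scheme on such a system: workers return local gradients of a squared loss, and the central computer aggregates them and maintains a BFGS-style inverse-Hessian approximation. This is a generic construction under the architecture described above, not the specific algorithm proposed in the paper.

```python
# A minimal sketch of central/worker gradient aggregation with a BFGS-style
# inverse-Hessian update for least squares. The data split, loss, and update
# rule here are illustrative assumptions, not the authors' algorithm.
import numpy as np

def local_gradient(Xm, ym, beta):
    """Gradient of the squared loss on one worker's data block."""
    return Xm.T @ (Xm @ beta - ym) / len(ym)

def distributed_quasi_newton(data, p, n_iter=20, lr=1.0):
    """data: list of (X_m, y_m) blocks, one per worker; p: parameter dimension."""
    beta = np.zeros(p)
    H = np.eye(p)                      # inverse-Hessian approximation kept by the central computer
    grad = np.mean([local_gradient(X, y, beta) for X, y in data], axis=0)
    for _ in range(n_iter):
        step = -lr * H @ grad          # central computer forms the search direction
        beta_new = beta + step
        grad_new = np.mean([local_gradient(X, y, beta_new) for X, y in data], axis=0)
        s, g = beta_new - beta, grad_new - grad
        if s @ g > 1e-10:              # standard BFGS curvature condition
            rho = 1.0 / (s @ g)
            V = np.eye(p) - rho * np.outer(s, g)
            H = V @ H @ V.T + rho * np.outer(s, s)
        beta, grad = beta_new, grad_new
    return beta
```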


Network Gradient Descent Algorithm for Decentralized Federated Learning

arXiv.org Machine Learning

We study a fully decentralized federated learning algorithm, which is a novel gradient descent algorithm executed on a communication-based network. For convenience, we refer to it as the network gradient descent (NGD) method. In the NGD method, only statistics (e.g., parameter estimates) need to be communicated, minimizing the risk of privacy leakage. Meanwhile, different clients communicate with each other directly according to a carefully designed network structure, without a central master. This greatly enhances the reliability of the entire algorithm. These attractive properties inspire us to study the NGD method carefully, both theoretically and numerically. Theoretically, we start with a classical linear regression model. We find that both the learning rate and the network structure play significant roles in determining the NGD estimator's statistical efficiency. The resulting NGD estimator can be statistically as efficient as the global estimator if the learning rate is sufficiently small and the network structure is well balanced, even if the data are distributed heterogeneously. These findings are then extended to general models and loss functions. Extensive numerical studies are presented to corroborate our theoretical findings. Classical deep learning models are also presented for illustration purposes.
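
The following sketch shows one plausible form of an NGD communication round for linear regression, assuming the description above: each client mixes its neighbors' current estimates through a network weight matrix and then takes a gradient step on its own data, with no central master. The function and variable names are illustrative, not the authors' code.

```python
# A minimal sketch of one network gradient descent (NGD) round for linear
# regression. Assumption: each client averages neighbors' estimates via a
# doubly stochastic weight matrix W, then applies a local gradient step.
import numpy as np

def ngd_round(data, betas, W, lr):
    """One communication round of network gradient descent.
    data  : list of (X_m, y_m) held by each client
    betas : (M, p) array; row m is client m's current estimate
    W     : (M, M) weight matrix encoding the communication network
    lr    : learning rate shared by all clients
    """
    M, p = betas.shape
    new_betas = np.empty_like(betas)
    for m in range(M):
        Xm, ym = data[m]
        grad = Xm.T @ (Xm @ betas[m] - ym) / len(ym)   # local gradient only
        neighbor_avg = W[m] @ betas                    # mix neighbors' current estimates
        new_betas[m] = neighbor_avg - lr * grad
    return new_betas
```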