AITopics | cr 0

Collaborating Authors

cr 0

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

abd1c782880cc59759f4112fda0b8f98-Supplemental.pdf

Neural Information Processing SystemsFeb-9-2026, 19:01:22 GMT

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Europe > Russia (0.14)
Asia > Russia (0.14)
North America > Canada > Ontario > Toronto (0.04)
(3 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Widening the Network Mitigates the Impact of Data Heterogeneity on FedAvg

Jian, Like, Liu, Dong

arXiv.org Artificial IntelligenceAug-19-2025

Federated learning (FL) enables decentralized clients to train a model collaboratively without sharing local data. A key distinction between FL and centralized learning is that clients' data are non-independent and identically distributed, which poses significant challenges in training a global model that generalizes well across heterogeneous local data distributions. In this paper, we analyze the convergence of overparameterized FedAvg with gradient descent (GD). We prove that the impact of data heterogeneity diminishes as the width of neural networks increases, ultimately vanishing when the width approaches infinity. In the infinite-width regime, we further prove that both the global and local models in FedAvg behave as linear models, and that FedAvg achieves the same generalization performance as centralized learning with the same number of GD iterations. Extensive experiments validate our theoretical findings across various network architectures, loss functions, and optimization methods.

artificial intelligence, data heterogeneity, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2508.12576

Country: Asia (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback

Analyzing Political Bias in LLMs via Target-Oriented Sentiment Classification

Elbouanani, Akram, Dufraisse, Evan, Popescu, Adrian

arXiv.org Artificial IntelligenceMay-27-2025

Political biases encoded by LLMs might have detrimental effects on downstream applications. Existing bias analysis methods rely on small-size intermediate tasks (questionnaire answering or political content generation) and rely on the LLMs themselves for analysis, thus propagating bias. We propose a new approach leveraging the observation that LLM sentiment predictions vary with the target entity in the same sentence. We define an entropy-based inconsistency metric to encode this prediction variability. We insert 1319 demographically and politically diverse politician names in 450 political sentences and predict target-oriented sentiment using seven models in six widely spoken languages. We observe inconsistencies in all tested combinations and aggregate them in a statistically robust analysis at different granularity levels. We observe positive and negative bias toward left and far-right politicians and positive correlations between politicians with similar alignment. Bias intensity is higher for Western languages than for others. Larger models exhibit stronger and more consistent biases and reduce discrepancies between similar languages. We partially mitigate LLM unreliability in target-oriented sentiment classification (TSC) by replacing politician names with fictional but plausible counterparts.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.19776

Country:

North America > United States (1.00)
Europe > France (1.00)
Asia (0.93)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.69)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Regional Government > Europe Government > France Government (0.67)
Media > News (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Ancestral Inference and Learning for Branching Processes in Random Environments

Jiang, Xiaoran, Vidyashankar, Anand N.

arXiv.org Machine LearningJan-27-2025

Ancestral inference for branching processes in random environments involves determining the ancestor distribution parameters using the population sizes of descendant generations. In this paper, we introduce a new methodology for ancestral inference utilizing the generalized method of moments. We demonstrate that the estimator's behavior is critically influenced by the coefficient of variation of the environment sequence. Furthermore, despite the process's evolution being heavily dependent on the offspring means of various generations, we show that the joint limiting distribution of the ancestor and offspring estimators of the mean, under appropriate centering and scaling, decouple and converge to independent Gaussian random variables when the ratio of the number of generations to the logarithm of the number of replicates converges to zero. Additionally, we provide estimators for the limiting variance and illustrate our findings through numerical experiments and data from Polymerase Chain Reaction experiments and COVID-19 data.

artificial intelligence, machine learning, variance, (18 more...)

arXiv.org Machine Learning

2501.16526

Country:

North America > United States > Maryland (0.04)
North America > United States > Virginia > Fairfax County > Fairfax (0.04)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.87)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.88)
Health & Medicine > Therapeutic Area > Immunology (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.45)

Add feedback

High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise

Gorbunov, Eduard, Danilova, Marina, Shibaev, Innokentiy, Dvurechensky, Pavel, Gasnikov, Alexander

arXiv.org Artificial IntelligenceAug-30-2024

Stochastic first-order methods are standard for training large-scale machine learning models. Random behavior may cause a particular run of an algorithm to result in a highly suboptimal objective value, whereas theoretical guarantees are usually proved for the expectation of the objective value. Thus, it is essential to theoretically guarantee that algorithms provide small objective residual with high probability. Existing methods for non-smooth stochastic convex optimization have complexity bounds with the dependence on the confidence level that is either negative-power or logarithmic but under an additional assumption of sub-Gaussian (light-tailed) noise distribution that may not hold in practice. In our paper, we resolve this issue and derive the first high-probability convergence results with logarithmic dependence on the confidence level for non-smooth convex stochastic optimization problems with non-sub-Gaussian (heavy-tailed) noise. To derive our results, we propose novel stepsize rules for two stochastic methods with gradient clipping. Moreover, our analysis works for generalized smooth objectives with H\"older-continuous gradients, and for both methods, we provide an extension for strongly convex problems. Finally, our results imply that the first (accelerated) method we consider also has optimal iteration and oracle complexity in all the regimes, and the second one is optimal in the non-smooth setting.

batch size, gradient, probability, (16 more...)

arXiv.org Artificial Intelligence

2106.05958

Country:

Asia > Russia (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Beyond Pairwise: Provably Fast Algorithms for Approximate k-Way Similarity Search

Neural Information Processing SystemsMar-13-2024, 23:18:39 GMT

We go beyond the notion of pairwise similarity and look into search problems with k-way similarity functions.

log 1, resemblance, similarity, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
North America > United States > Texas > Dallas County > Dallas (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
(9 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.88)

Add feedback

Flexible Communication for Optimal Distributed Learning over Unpredictable Networks

Tyagi, Sahil, Swany, Martin

arXiv.org Artificial IntelligenceJan-29-2024

Gradient compression alleviates expensive communication in distributed deep learning by sending fewer values and its corresponding indices, typically via Allgather (AG). Training with high compression ratio (CR) achieves high accuracy like DenseSGD, but has lower parallel scaling due to high communication cost (i.e., parallel efficiency). Using lower CRs improves parallel efficiency by lowering synchronization cost, but degrades model accuracy as well (statistical efficiency). Further, speedup attained with different models and CRs also varies with network latency, effective bandwidth and collective op used for aggregation. In many cases, collectives like Allreduce (AR) have lower cost than AG to exchange the same amount of data. In this paper, we propose an AR-compatible Topk compressor that is bandwidth-optimal and thus performs better than AG in certain network configurations. We develop a flexible communication strategy that switches between AG and AR based on which collective is optimal in the current settings, and model the pareto-relationship between parallel and statistical efficiency as a multi-objective optimization (MOO) problem to dynamically adjust CR and accelerate training while still converging to high accuracy.

bandwidth, compression, gradient, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/BigData59044.2023.10386724

2312.02493

Country: North America > United States > Indiana (0.04)

Genre: Research Report (0.40)

Industry:

Telecommunications > Networks (0.49)
Information Technology > Networks (0.35)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Online Resource Allocation: Bandits feedback and Advice on Time-varying Demands

Lyu, Lixing, Cheung, Wang Chi

arXiv.org Artificial IntelligenceJun-12-2023

We consider a general online resource allocation model with bandit feedback and time-varying demands. While online resource allocation has been well studied in the literature, most existing works make the strong assumption that the demand arrival process is stationary. In practical applications, such as online advertisement and revenue management, however, this process may be exogenous and non-stationary, like the constantly changing internet traffic. Motivated by the recent Online Algorithms with Advice framework [Mitazenmacher and Vassilvitskii, \emph{Commun. ACM} 2022], we explore how online advice can inform policy design. We establish an impossibility result that any algorithm perform poorly in terms of regret without any advice in our setting. In contrast, we design an robust online algorithm that leverages the online predictions on the total demand volumes. Empowered with online advice, our proposed algorithm is shown to have both theoretical performance and promising numerical results compared with other algorithms in literature. We also provide two explicit examples for the time-varying demand scenarios and derive corresponding theoretical performance guarantees. Finally, we adapt our model to a network revenue management problem, and numerically demonstrate that our algorithm can still performs competitively compared to existing baselines.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2302.04182

Country:

Asia > Singapore (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Research Report (0.63)

Industry: Marketing (0.66)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.45)

Add feedback

Efficient Private SCO for Heavy-Tailed Data via Clipping

Jin, Chenhan, Zhou, Kaiwen, Han, Bo, Yang, Ming-Chang, Cheng, James

arXiv.org Artificial IntelligenceJul-15-2022

We consider stochastic convex optimization for heavy-tailed data with the guarantee of being differentially private (DP). Prior work on this problem is restricted to the gradient descent (GD) method, which is inefficient for large-scale problems. In this paper, we resolve this issue and derive the first high-probability bounds for the private stochastic method with clipping. For general convex problems, we derive excess population risks $\Tilde{O}\left(\frac{d^{1/7}\sqrt{\ln\frac{(n \epsilon)^2}{\beta d}}}{(n\epsilon)^{2/7}}\right)$ and $\Tilde{O}\left(\frac{d^{1/7}\ln\frac{(n\epsilon)^2}{\beta d}}{(n\epsilon)^{2/7}}\right)$ under bounded or unbounded domain assumption, respectively (here $n$ is the sample size, $d$ is the dimension of the data, $\beta$ is the confidence level and $\epsilon$ is the private level). Then, we extend our analysis to the strongly convex case and non-smooth case (which works for generalized smooth objectives with H$\ddot{\text{o}}$lder-continuous gradients). We establish new excess risk bounds without bounded domain assumption. The results above achieve lower excess risks and gradient complexities than existing methods in their corresponding cases. Numerical experiments are conducted to justify the theoretical improvement.

artificial intelligence, machine learning, probability, (18 more...)

arXiv.org Artificial Intelligence

2206.13011

Country:

Asia > China > Hong Kong > Sha Tin (0.04)
Asia > China > Hong Kong > Kowloon (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback

Performance Analysis and Improvement of Parallel Differential Evolution

Zibin, Pan

arXiv.org Artificial IntelligenceJan-17-2021

Differential evolution (DE) is an effective global evolutionary optimization algorithm using to solve global optimization problems mainly in a continuous domain. In this field, researchers pay more attention to improving the capability of DE to find better global solutions, however, the computational performance of DE is also a very interesting aspect especially when the problem scale is quite large. Firstly, this paper analyzes the design of parallel computation of DE which can easily be executed in Math Kernel Library (MKL) and Compute Unified Device Architecture (CUDA). Then the essence of the exponential crossover operator is described and we point out that it cannot be used for better parallel computation. Later, we propose a new exponential crossover operator (NEC) that can be executed parallelly with MKL/CUDA. Next, the extended experiments show that the new crossover operator can speed up DE greatly. In the end, we test the new parallel DE structure, illustrating that the former is much faster.

cr 0, crossover, exponential crossover, (13 more...)

arXiv.org Artificial Intelligence

2101.06599

Country: Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.90)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Add feedback