AITopics | Kuang, Xu

Collaborating Authors

Kuang, Xu


Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling

Zhu, Zheqing, Liu, Yueyang, Kuang, Xu, Van Roy, Benjamin

arXiv.org Artificial Intelligence, Oct-14-2023

Real-world applications of contextual bandits often exhibit non-stationarity due to seasonality, serendipity, and evolving social trends. While a number of non-stationary contextual bandit learning algorithms have been proposed in the literature, they explore excessively because they do not prioritize information of enduring value, or are designed in ways that do not scale to modern applications with high-dimensional user-specific features and large action sets, or both. In this paper, we introduce a novel non-stationary contextual bandit algorithm that addresses these concerns. It combines a scalable, deep-neural-network-based architecture with a carefully designed exploration mechanism that strategically prioritizes collecting information with the most lasting value in a non-stationary environment. Through empirical evaluations on two real-world recommendation datasets, which exhibit pronounced non-stationarity, we demonstrate that our approach significantly outperforms the state-of-the-art baselines.

bandit, data mining, machine learning, (20 more...)

2310.07786

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (0.50)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

A Definition of Non-Stationary Bandits

Liu, Yueyang, Kuang, Xu, Van Roy, Benjamin

arXiv.org Artificial Intelligence, Jul-28-2023

Although non-stationary bandit learning has attracted much recent attention, we have yet to identify a formal definition of non-stationarity that can consistently distinguish non-stationary bandits from stationary ones. Prior work has characterized non-stationary bandits as bandits for which the reward distribution changes over time. We demonstrate that this definition can ambiguously classify the same bandit as both stationary and non-stationary; this ambiguity arises from the existing definition's dependence on the latent sequence of reward distributions. Moreover, the definition has given rise to two widely used notions of regret: the dynamic regret and the weak regret. These notions are not indicative of qualitative agent performance in some bandits. Additionally, this definition of non-stationary bandits has led to the design of agents that explore excessively. We introduce a formal definition of non-stationary bandits that resolves these issues. Our new definition provides a unified approach, applicable seamlessly to both Bayesian and frequentist formulations of bandits. Furthermore, our definition ensures consistent classification of two bandits offering agents indistinguishable experiences, categorizing them as either both stationary or both non-stationary. This advancement provides a more robust framework for non-stationary bandit learning.
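The ambiguity described in this abstract can be illustrated with a small simulation. The sketch below is an illustrative construction, not taken from the paper: the function names, the single-arm setup, and the choice of a latent mean drawn uniformly from {0, 1} are all assumptions. One bandit has a fixed Bernoulli(0.5) reward distribution; the other redraws a latent mean at every step, so its reward distribution "changes over time" when conditioned on the latent sequence. Marginally, both produce i.i.d. Bernoulli(0.5) rewards, so an agent cannot tell them apart.

```python
import random


def fixed_mean_bandit(n_steps, rng):
    """Rewards are Bernoulli(0.5) at every step; the distribution never changes."""
    return [1 if rng.random() < 0.5 else 0 for _ in range(n_steps)]


def changing_mean_bandit(n_steps, rng):
    """A latent mean p_t is drawn independently from {0, 1} at each step,
    then the reward is Bernoulli(p_t). Conditioned on the latent sequence,
    the reward distribution changes at every step."""
    rewards = []
    for _ in range(n_steps):
        p_t = rng.choice([0.0, 1.0])  # hypothetical latent reward mean
        rewards.append(1 if rng.random() < p_t else 0)
    return rewards


def mean(xs):
    return sum(xs) / len(xs)


def repeat_rate(xs):
    """Fraction of consecutive reward pairs that are equal (0.5 for i.i.d. fair coins)."""
    return mean([1 if a == b else 0 for a, b in zip(xs, xs[1:])])


n_steps = 20_000
fixed = fixed_mean_bandit(n_steps, random.Random(0))
changing = changing_mean_bandit(n_steps, random.Random(1))

print("fixed:    mean", mean(fixed), "repeat rate", repeat_rate(fixed))
print("changing: mean", mean(changing), "repeat rate", repeat_rate(changing))
```

Both sequences should show a mean and a repeat rate close to 0.5, which is the sense in which the two bandits offer indistinguishable experiences even though only one of them has a changing latent reward distribution.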

bandit, data mining, machine learning, (19 more...)

2302.12202

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Non-Stationary Bandit Learning via Predictive Sampling

Liu, Yueyang, Kuang, Xu, Van Roy, Benjamin

arXiv.org Artificial Intelligence, Jul-17-2023

Thompson sampling has proven effective across a wide range of stationary bandit environments. However, as we demonstrate in this paper, it can perform poorly when applied to non-stationary environments. We attribute such failures to the fact that, when exploring, the algorithm does not differentiate actions based on how quickly the information acquired loses its usefulness due to non-stationarity. Building upon this insight, we propose predictive sampling, an algorithm that deprioritizes acquiring information that quickly loses usefulness. A theoretical guarantee on the performance of predictive sampling is established through a Bayesian regret bound. We provide versions of predictive sampling for which computations tractably scale to complex bandit environments of practical interest. Through numerical simulations, we demonstrate that predictive sampling outperforms Thompson sampling in all non-stationary environments examined.
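For reference, the baseline named in this abstract can be sketched in a few lines. The code below is the textbook Beta-Bernoulli form of Thompson sampling on a stationary bandit; it is not the paper's predictive sampling algorithm, and the arm means, horizon, and function name are illustrative assumptions.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Beta-Bernoulli Thompson sampling with a uniform Beta(1, 1) prior on each arm.

    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(horizon):
        # Sample a plausible mean for each arm from its current posterior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled means.
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


# Hypothetical stationary environment with three arms.
pulls = thompson_sampling([0.2, 0.5, 0.8], horizon=2000)
print(pulls)
```

Note that the posterior update treats every past observation as equally informative about the present, which is reasonable only when the arm means do not change. That is the stationary assumption the abstract identifies as the source of poor performance in non-stationary environments.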

bandit, data mining, machine learning, (19 more...)

2205.01970

Country:

North America > United States > Arizona (0.14)
Europe > United Kingdom > Scotland (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.67)
Information Technology (0.46)
Marketing (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
Information Technology > Data Science > Data Mining > Big Data (0.46)

Weak Signal Asymptotics for Sequentially Randomized Experiments

Kuang, Xu, Wager, Stefan

arXiv.org Artificial Intelligence, Jun-22-2023

We use the lens of weak signal asymptotics to study a class of sequentially randomized experiments, including those that arise in solving multi-armed bandit problems. In an experiment with $n$ time steps, we let the mean reward gaps between actions scale to the order $1/\sqrt{n}$ so as to preserve the difficulty of the learning task as $n$ grows. In this regime, we show that the sample paths of a class of sequentially randomized experiments -- adapted to this scaling regime and with arm selection probabilities that vary continuously with state -- converge weakly to a diffusion limit, given as the solution to a stochastic differential equation. The diffusion limit enables us to derive refined, instance-specific characterizations of the stochastic dynamics, and to obtain several insights on the regret and belief evolution of a number of sequential experiments including Thompson sampling (but not UCB, which does not satisfy our continuity assumption). We show that all sequential experiments whose randomization probabilities have a Lipschitz-continuous dependence on the observed data suffer from sub-optimal regret performance when the reward gaps are relatively large. Conversely, we find that a version of Thompson sampling with an asymptotically uninformative prior variance achieves near-optimal instance-specific regret scaling, including with large reward gaps, but these good regret properties come at the cost of highly unstable posterior beliefs.
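A back-of-the-envelope calculation suggests why a gap of order $1/\sqrt{n}$ keeps the learning task equally hard as $n$ grows. The sketch below is an interpretation rather than the paper's formal argument: it assumes an arm observed on the order of $n$ times with reward noise of standard deviation `sigma`, so the standard error of its sample mean is `sigma / sqrt(n)`. The names `delta`, `sigma`, and both functions are hypothetical.

```python
import math


def weak_signal_gap(delta, n):
    """Mean reward gap under weak-signal scaling: delta / sqrt(n)."""
    return delta / math.sqrt(n)


def signal_to_noise(delta, sigma, n):
    """Ratio of the scaled gap to the standard error of a sample mean of n rewards."""
    standard_error = sigma / math.sqrt(n)  # assumes n i.i.d. observations
    return weak_signal_gap(delta, n) / standard_error


horizons = (100, 10_000, 1_000_000)
ratios = [signal_to_noise(delta=2.0, sigma=1.0, n=n) for n in horizons]
for n, ratio in zip(horizons, ratios):
    print(n, weak_signal_gap(2.0, n), ratio)
```

The gap shrinks as the horizon grows, but the ratio of gap to standard error stays at `delta / sigma`, so the arms never become trivially distinguishable or hopelessly indistinguishable in the limit.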

data mining, experiment, machine learning, (18 more...)

2101.09855

Country: North America > United States > California (0.14)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.48)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
