
Collaborating Authors: Zuo, Shiliang


Greedy Algorithm for Structured Bandits: A Sharp Characterization of Asymptotic Success / Failure

arXiv.org Artificial Intelligence

We study the greedy (exploitation-only) algorithm in bandit problems with a known reward structure. We allow arbitrary finite reward structures, whereas prior work focused on a few specific ones. We fully characterize when the greedy algorithm asymptotically succeeds or fails, in the sense of sublinear vs. linear regret as a function of time. Our characterization identifies a partial identifiability property of the problem instance as the necessary and sufficient condition for asymptotic success. Notably, once this property holds, the problem becomes easy: any algorithm will succeed (in the same sense as above), provided it satisfies a mild non-degeneracy condition. We further extend our characterization to contextual bandits and interactive decision-making with arbitrary feedback, and demonstrate its broad applicability across various examples. Keywords: Multi-armed bandits, contextual bandits, structured bandits, greedy algorithm, regret.
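To make the exploitation-only rule concrete, here is a minimal sketch of a plug-in greedy learner for a known finite structure: the learner fits the candidate mean-reward vector most consistent with its empirical means and then pulls that candidate's best arm. The structure, noise level, and fitting rule below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def greedy_structured_bandit(structure, true_means, T, rng=None):
    """Exploitation-only (greedy) play against a known finite reward structure.

    structure : (m, K) array, each row a candidate mean-reward vector.
    true_means: (K,) array, the actual (unknown to the learner) means.
    """
    rng = np.random.default_rng(rng)
    K = len(true_means)
    counts = np.zeros(K)
    sums = np.zeros(K)
    regret = 0.0
    for t in range(T):
        # Plug-in fit: pick the candidate closest to the empirical means
        # observed so far (only on arms pulled at least once).
        pulled = counts > 0
        if pulled.any():
            emp = np.where(pulled, sums / np.maximum(counts, 1), 0.0)
            dists = ((structure[:, pulled] - emp[pulled]) ** 2).sum(axis=1)
            hypothesis = structure[np.argmin(dists)]
        else:
            hypothesis = structure[rng.integers(len(structure))]
        arm = int(np.argmax(hypothesis))          # greedy: exploit only
        reward = true_means[arm] + rng.normal(scale=0.1)
        counts[arm] += 1
        sums[arm] += reward
        regret += true_means.max() - true_means[arm]
    return regret

# Toy structure: two candidate worlds that disagree on which arm is best.
structure = np.array([[0.9, 0.5, 0.1],
                      [0.1, 0.5, 0.9]])
print(greedy_structured_bandit(structure, structure[0], T=2000, rng=0))
```

In this toy instance, pulling any single arm already distinguishes the two candidate worlds, so the greedy rule locks onto the correct best arm; when no such separation holds, exploitation alone can fail.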


Principal-Agent Multitasking: the Uniformity of Optimal Contracts and its Efficient Learning via Instrumental Regression

arXiv.org Machine Learning

This work studies the multitasking principal-agent problem. I first show a ``uniformity'' result: when the tasks are perfect substitutes and the agent's cost function is homogeneous of a certain degree, the optimal contract depends only on the marginal utility of each task and the degree of homogeneity. I then study a setting where the marginal utility of each task is unknown, so that the optimal contract must be learned or estimated from observational data. I identify this as a regression problem with measurement error and observe that it can be cast as an instrumental regression problem. The current work observes that both the contract and the repeated observations (when available) can act as valid instrumental variables, and proposes using the generalized method of moments (GMM) estimator to compute an approximately optimal contract from offline data. I also study an online setting and show how the optimal contract can be efficiently learned in an online fashion using the two estimators. Here the principal faces an exploration-exploitation tradeoff: she must experiment with new contracts and observe their outcomes, while ensuring her experimentation does not deviate too much from the optimal contract. This work shows that when repeated observations are available and agents are sufficiently ``diverse'', the principal can achieve a very low $\widetilde{O}(d)$ cumulative utility loss, even with a ``pure exploitation'' algorithm.
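A minimal numerical sketch of the instrumental-variable idea, under stylized assumptions added here for illustration (a single task, a linear outcome model, and the offered piece rate acting as the instrument for an error-ridden proxy of the agent's action); the paper's GMM estimator and contract model are more general.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Stylized offline data (all names here are illustrative assumptions):
#   z : the contract (piece rate) offered to each agent -- the instrument
#   a : the agent's true action/effort, which responds to the contract
#   x : the observed, error-ridden proxy for the action
#   y : the principal's realized utility, linear in the true action
beta = 1.5                                   # unknown marginal utility of the task
z = rng.uniform(0.2, 1.0, n)
a = 2.0 * z + rng.normal(scale=0.3, size=n)
x = a + rng.normal(scale=0.5, size=n)        # measurement error
y = beta * a + rng.normal(scale=0.3, size=n)

# Naive OLS on the noisy proxy is attenuated (biased toward zero).
beta_ols = np.cov(x, y)[0, 1] / np.var(x)

# IV / simplest GMM moment with the contract as instrument: Cov(z, y) / Cov(z, x).
beta_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

print(f"OLS estimate: {beta_ols:.3f}  (biased)")
print(f"IV  estimate: {beta_iv:.3f}  (close to beta = {beta})")
```

The point of the sketch is that ordinary least squares on the noisy proxy is attenuated, while instrumenting with the contract recovers the marginal utility, which is the quantity the optimal contract depends on.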


Contextual Bandits with Online Neural Regression

arXiv.org Machine Learning

Recent works have shown a reduction from contextual bandits to online regression under a realizability assumption [Foster and Rakhlin, 2020, Foster and Krishnamurthy, 2021]. In this work, we investigate the use of neural networks for such online regression and the associated Neural Contextual Bandits (NeuCBs). Using existing results for wide networks, one can readily show an $\mathcal{O}(\sqrt{T})$ regret for online regression with square loss, which via the reduction implies an $\mathcal{O}(\sqrt{K} T^{3/4})$ regret for NeuCBs. Departing from this standard approach, we first show an $\mathcal{O}(\log T)$ regret for online regression with almost convex losses that satisfy the QG (Quadratic Growth) condition, a generalization of the PL (Polyak-Łojasiewicz) condition, and that have a unique minimum. Although this is not directly applicable to wide networks, since they do not have a unique minimum, we show that adding a suitable small random perturbation to the network predictions surprisingly makes the loss satisfy QG with a unique minimum. Based on such perturbed predictions, we show an $\mathcal{O}(\log T)$ regret for online regression with both squared loss and KL loss, and subsequently convert these to $\tilde{\mathcal{O}}(\sqrt{KT})$ and $\tilde{\mathcal{O}}(\sqrt{KL^*} + K)$ regret, respectively, for NeuCBs, where $L^*$ is the loss of the best policy. Separately, we also show that existing regret bounds for NeuCBs are $\Omega(T)$ or assume i.i.d. contexts, unlike this work. Finally, our experimental results on various datasets demonstrate that our algorithms, especially the one based on KL loss, consistently outperform existing algorithms.
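Below is a minimal sketch of the reduction in action: an inverse-gap-weighted (SquareCB-style) action distribution built from an online regression oracle whose predictions receive a small random perturbation. The linear per-arm oracle, the learning rate, the exploration parameter gamma, and the perturbation scale sigma are all illustrative stand-ins for the paper's neural predictors and tuned constants.

```python
import numpy as np

rng = np.random.default_rng(1)
d, K, T, gamma, lr, sigma = 5, 4, 5000, 100.0, 0.05, 1e-3

# Linear stand-in for the (wide) network: one weight vector per arm.
W = np.zeros((K, d))
theta = rng.normal(size=(K, d)) / np.sqrt(d)    # unknown ground-truth means

def igw(yhat, gamma):
    """Inverse-gap weighting: the SquareCB-style action distribution."""
    best = int(np.argmax(yhat))
    p = 1.0 / (K + gamma * (yhat[best] - yhat))
    p[best] = 0.0
    p[best] = 1.0 - p.sum()                     # remaining mass on the greedy arm
    return p

regret = 0.0
for t in range(T):
    x = rng.normal(size=d) / np.sqrt(d)
    means = theta @ x
    yhat = W @ x + sigma * rng.normal(size=K)   # small perturbation of predictions
    p = igw(yhat, gamma)
    a = rng.choice(K, p=p)
    r = means[a] + 0.1 * rng.normal()
    # Online regression update on the squared loss for the chosen arm only.
    W[a] += lr * (r - W[a] @ x) * x
    regret += means.max() - means[a]

print(f"average regret over {T} rounds: {regret / T:.4f}")
```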


Corruption-Robust Lipschitz Contextual Search

arXiv.org Machine Learning

I study the problem of learning a Lipschitz function from corrupted binary signals. The learner tries to learn an $L$-Lipschitz function $f: [0,1]^d \rightarrow [0, L]$ chosen by the adversary. There are $T$ rounds in total. In each round $t$, the adversary selects a context vector $x_t$ in the input space, the learner guesses the true function value $f(x_t)$, and the learner receives a binary signal indicating whether the guess is high or low. The signal may be corrupted in a total of $C$ rounds, though the value of $C$ is \emph{unknown} to the learner. The learner's goal is to incur a small cumulative loss. This work introduces the new algorithmic technique of \emph{agnostic checking} as well as new analysis techniques. I design algorithms with the following guarantees: for the symmetric loss, the learner achieves regret $L\cdot O(C\log T)$ when $d = 1$ and $L\cdot O_d(C\log T + T^{(d-1)/d})$ when $d > 1$; for the pricing loss, the learner achieves regret $L\cdot \widetilde{O} (T^{d/(d+1)} + C\cdot T^{1/(d+1)})$.
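As a point of reference for the protocol (and not the paper's corruption-robust agnostic-checking algorithm), here is a bisection-style baseline for $d = 1$ with clean signals: the context space is discretized into bins, each bin keeps an interval of plausible function values, and the guess is the interval's midpoint, with $L/n_{\mathrm{bins}}$ slack to account for within-bin variation. The target function, bin count, and slack rule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
L, T, n_bins = 1.0, 2000, 50

# Unknown target: 0.4-Lipschitz, values in [0, 1].
f = lambda x: 0.5 + 0.2 * np.sin(2 * np.pi * x) / np.pi

# One interval of plausible values per bin of the discretized context space.
lo = np.zeros(n_bins)
hi = np.full(n_bins, L)

loss = 0.0
for t in range(T):
    x = rng.uniform()
    b = min(int(x * n_bins), n_bins - 1)
    guess = 0.5 * (lo[b] + hi[b])          # bisect the current interval
    signal_high = guess >= f(x)            # binary feedback (uncorrupted here)
    slack = L / n_bins                     # within-bin variation of an L-Lipschitz f
    if signal_high:
        hi[b] = min(hi[b], guess + slack)
    else:
        lo[b] = max(lo[b], guess - slack)
    loss += abs(guess - f(x))              # symmetric loss

print(f"cumulative symmetric loss: {loss:.2f} over {T} rounds")
```

Even $C$ corrupted signals can permanently poison such intervals, which is the difficulty the agnostic-checking technique is designed to handle.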


Near Optimal Adversarial Attack on UCB Bandits

arXiv.org Machine Learning

We consider a stochastic multi-armed bandit problem where rewards are subject to adversarial corruption. We propose a novel attack strategy that manipulates the UCB algorithm into pulling some non-optimal target arm $T - o(T)$ times with a cumulative cost that scales as $\sqrt{\log T}$, where $T$ is the number of rounds. We also prove the first lower bound on the cumulative attack cost. Our lower bound matches our upper bound up to $\log \log T$ factors, showing our attack to be near optimal.
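A toy simulation of the standard reward-poisoning idea behind such attacks (a simplified sketch, not the paper's exact strategy or cost accounting): whenever UCB pulls a non-target arm, the attacker lowers the observed reward just enough to push that arm's empirical mean below the target arm's, steering UCB toward the target. The margin, noise level, and mean vector are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
K, T, sigma, target = 3, 20_000, 0.1, 2
means = np.array([0.9, 0.7, 0.3])           # target arm 2 is the worst arm

counts = np.zeros(K)
sums = np.zeros(K)                           # post-attack reward sums seen by UCB
attack_cost = 0.0

for t in range(1, T + 1):
    if t <= K:                               # pull each arm once to initialize
        arm = t - 1
    else:
        ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
        arm = int(np.argmax(ucb))
    reward = means[arm] + sigma * rng.normal()
    if arm != target and counts[target] > 0:
        # Push the pulled arm's post-update empirical mean just below the target's.
        margin = 2 * sigma
        desired_mean = sums[target] / counts[target] - margin
        corrupted = desired_mean * (counts[arm] + 1) - sums[arm]
        corrupted = min(corrupted, reward)   # only ever push rewards down
        attack_cost += reward - corrupted
        reward = corrupted
    counts[arm] += 1
    sums[arm] += reward

print(f"target-arm pulls: {int(counts[target])} / {T},  attack cost: {attack_cost:.1f}")
```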


A Realistic Example in 2 Dimension that Gradient Descent Takes Exponential Time to Escape Saddle Points

arXiv.org Machine Learning

Gradient descent is a popular algorithm in optimization, and its performance in convex settings is mostly well understood. In non-convex settings, it has been shown that gradient descent escapes saddle points asymptotically and converges to local minimizers [Lee et al. 2016]. Recent studies also show that a perturbed version of gradient descent suffices to escape saddle points efficiently [Jin et al. 2015, Ge et al. 2017]. In this paper we show a negative result: gradient descent may take exponential time to escape saddle points, even on non-pathological two-dimensional functions. While our focus is theoretical, we also conduct experiments verifying our theoretical result. Through our analysis we demonstrate that stochasticity is essential for escaping saddle points efficiently.
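The following toy comparison illustrates the phenomenon the abstract points to, though on a much simpler function than the paper's exponential-time construction: plain gradient descent initialized on a strict saddle's stable manifold never leaves it, while a tiny random perturbation lets the iterate reach a minimizer. The function, step size, and noise scale are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative 2-D function with a strict saddle at the origin:
#   f(x, y) = x**2 / 2 + y**4 / 4 - y**2 / 2,   minima at (0, +1) and (0, -1).
f = lambda p: p[0] ** 2 / 2 + p[1] ** 4 / 4 - p[1] ** 2 / 2
grad = lambda p: np.array([p[0], p[1] ** 3 - p[1]])

def descend(p, eta=0.1, steps=10_000, noise=0.0):
    """Gradient descent, optionally with isotropic Gaussian gradient noise."""
    for _ in range(steps):
        p = p - eta * (grad(p) + noise * rng.normal(size=2))
    return p

start = np.array([1.0, 0.0])      # on the saddle's stable manifold (y = 0)
p_plain = descend(start)
p_pert = descend(start, noise=1e-3)
print("plain GD    :", p_plain, " f =", f(p_plain))   # stalls at the saddle, f ~ 0
print("perturbed GD:", p_pert, " f =", f(p_pert))     # reaches a minimum, f ~ -0.25
```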