
 Lihong Li



Adversarial Attacks on Stochastic Bandits

Neural Information Processing Systems

We study adversarial attacks that manipulate the reward signals to control the actions chosen by a stochastic multi-armed bandit algorithm. We propose the first attack against two popular bandit algorithms: ε-greedy and UCB, without knowledge of the mean rewards. The attacker is able to spend only logarithmic effort, multiplied by a problem-specific parameter that becomes smaller as the bandit problem gets easier to attack. The result means the attacker can easily hijack the behavior of the bandit algorithm to promote or obstruct certain actions, say, a particular medical treatment. As bandits are seeing increasingly wide use in practice, our study exposes a significant security threat.
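
To make the threat concrete, here is a minimal sketch of a reward-poisoning attack against an ε-greedy learner: whenever a non-target arm is pulled, the attacker subtracts just enough reward to keep that arm's corrupted empirical mean a margin below the target arm's. The arm means, noise level, and margin below are illustrative assumptions, and the sketch omits the confidence-bound corrections used in the paper's analysis.

```python
import numpy as np

# Hypothetical sketch of a reward-poisoning attack on an epsilon-greedy bandit.
# The attacker wants the learner to prefer a "target" arm; whenever a non-target
# arm is pulled, it subtracts just enough reward so that arm's corrupted
# empirical mean stays a margin below the target arm's empirical mean.

rng = np.random.default_rng(0)
K, T, eps, margin = 5, 5000, 0.1, 0.1
true_means = np.array([0.9, 0.8, 0.7, 0.6, 0.2])   # illustrative means
target = K - 1                                      # attacker promotes the worst arm

counts = np.zeros(K)
corrupted_sums = np.zeros(K)                        # reward history the learner sees
attack_cost = 0.0

for t in range(T):
    # epsilon-greedy learner, acting on the corrupted history
    if t < K:
        arm = t
    elif rng.random() < eps:
        arm = int(rng.integers(K))
    else:
        arm = int(np.argmax(corrupted_sums / np.maximum(counts, 1)))

    reward = true_means[arm] + 0.1 * rng.standard_normal()

    if arm != target and counts[target] > 0:
        target_mean = corrupted_sums[target] / counts[target]
        # push this arm's corrupted mean down to target_mean - margin
        desired_sum = (counts[arm] + 1) * (target_mean - margin)
        attack = max(0.0, corrupted_sums[arm] + reward - desired_sum)
        reward -= attack
        attack_cost += attack

    counts[arm] += 1
    corrupted_sums[arm] += reward

print("target arm pull fraction:", counts[target] / T)
print("total attack cost:", attack_cost)
```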


DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections

Neural Information Processing Systems

In many real-world reinforcement learning applications, access to the environment is limited to a fixed dataset, instead of direct (online) interaction with the environment. When using this data for either evaluation or training of a new policy, accurate estimates of discounted stationary distribution ratios -- correction terms which quantify the likelihood that the new policy will experience a certain state-action pair normalized by the probability with which the state-action pair appears in the dataset -- can improve accuracy and performance. In this work, we propose an algorithm, DualDICE, for estimating these quantities. In contrast to previous approaches, our algorithm is agnostic to knowledge of the behavior policy (or policies) used to generate the dataset.
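
As a rough illustration of the behavior-agnostic idea, the sketch below sets up a DualDICE-style objective with the quadratic convex function f(x) = x²/2: minimize a squared Bellman-like residual of a learned function ν over dataset transitions, minus a term on initial states; at the optimum the residual itself recovers the correction ratio. The network shape, feature dimensions, and synthetic batch are placeholder assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
obs_dim, act_dim, gamma = 4, 2, 0.99

# nu is a scalar function of (state, action) features
nu_net = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))

def nu(s, a):
    return nu_net(torch.cat([s, a], dim=-1)).squeeze(-1)

def dualdice_loss(s, a, s_next, a_next_pi, s0, a0_pi):
    # 0.5 * E_D[(nu(s,a) - gamma * nu(s',a'))^2] - (1 - gamma) * E_init[nu(s0, a0)]
    residual = nu(s, a) - gamma * nu(s_next, a_next_pi)
    return 0.5 * (residual ** 2).mean() - (1 - gamma) * nu(s0, a0_pi).mean()

# synthetic batch standing in for off-policy transitions and initial states
B = 32
s, s_next, s0 = (torch.randn(B, obs_dim) for _ in range(3))
a, a_next_pi, a0_pi = (torch.randn(B, act_dim) for _ in range(3))

opt = torch.optim.Adam(nu_net.parameters(), lr=1e-3)
loss = dualdice_loss(s, a, s_next, a_next_pi, s0, a0_pi)
opt.zero_grad()
loss.backward()
opt.step()

# At the optimum, the correction d^pi(s,a)/d^D(s,a) is recovered as
# w(s,a) = nu*(s,a) - gamma * E_{a'~pi}[nu*(s', a')], i.e. the residual above.
print(float(loss))
```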


A Kernel Loss for Solving the Bellman Equation

Neural Information Processing Systems

Value function learning plays a central role in many state-of-the-art reinforcement learning algorithms. Many popular algorithms like Q-learning do not optimize any objective function, but are fixed-point iterations of variants of the Bellman operator that are not necessarily contractions. As a result, they may easily lose convergence guarantees, as can be observed in practice. In this paper, we propose a novel loss function, which can be optimized using standard gradient-based methods with guaranteed convergence. The key advantage is that its gradient can be easily approximated using sampled transitions, avoiding the need for double samples required by prior algorithms like residual gradient. Our approach may be combined with general function classes such as neural networks, using either on- or off-policy data, and is shown to work reliably and effectively in several benchmarks, including classic problems where standard algorithms are known to diverge.
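
A minimal sketch of a kernel-style Bellman loss on a batch of transitions, assuming a linear value function and an RBF kernel over states; averaging residual products over pairs of distinct transitions is what avoids the double-sample issue of naive squared-TD objectives. The kernel choice, features, and synthetic data are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def rbf_kernel(S1, S2, bandwidth=1.0):
    # K[i, j] = exp(-||s_i - s_j||^2 / (2 * bandwidth^2))
    d2 = ((S1[:, None, :] - S2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def kernel_bellman_loss(theta, phi_s, phi_next, rewards, states, gamma=0.99):
    # TD residuals delta_i = r_i + gamma * V(s'_i) - V(s_i) with V(s) = phi(s) @ theta
    delta = rewards + gamma * phi_next @ theta - phi_s @ theta
    K = rbf_kernel(states, states)
    n = len(delta)
    # average delta_i * K(s_i, s_j) * delta_j over pairs i != j (a U-statistic)
    off_diag = delta @ K @ delta - np.sum(np.diag(K) * delta ** 2)
    return off_diag / (n * (n - 1))

# toy usage with random features standing in for real transitions
rng = np.random.default_rng(0)
n, d = 64, 5
states, next_states = rng.normal(size=(n, d)), rng.normal(size=(n, d))
phi_s, phi_next = states, next_states        # identity features for simplicity
rewards = rng.normal(size=n)
theta = np.zeros(d)
print(kernel_bellman_loss(theta, phi_s, phi_next, rewards, states))
```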


Active Learning with Oracle Epiphany

Neural Information Processing Systems

We present a theoretical analysis of active learning with more realistic interactions with human oracles. Previous empirical studies have shown that oracles may abstain on difficult queries until they have accumulated enough information to make label decisions. We formalize this phenomenon with an "oracle epiphany model" and analyze active learning query complexity under such oracles for both the realizable and the agnostic cases. Our analysis shows that active learning is possible with oracle epiphany, but incurs an additional cost depending on when the epiphany happens. Our results suggest new, principled active learning approaches with realistic oracles.
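
A toy sketch of an "epiphany oracle" interaction, purely to illustrate the abstention behavior described above: queries from a hard region are met with abstention until a per-query epiphany event fires, after which that region is labeled normally. The region structure, epiphany probability, and label function here are hypothetical.

```python
import random

class EpiphanyOracle:
    def __init__(self, label_fn, hard_regions, beta=0.05, seed=0):
        self.label_fn = label_fn            # ground-truth labeling rule
        self.hard_regions = set(hard_regions)
        self.beta = beta                    # per-query epiphany probability
        self.awake = set()                  # regions the oracle has "figured out"
        self.rng = random.Random(seed)

    def query(self, x, region):
        if region in self.hard_regions and region not in self.awake:
            if self.rng.random() < self.beta:
                self.awake.add(region)      # epiphany: answer from now on
            else:
                return None                 # abstain
        return self.label_fn(x)

oracle = EpiphanyOracle(label_fn=lambda x: int(x >= 0.5), hard_regions={1}, beta=0.1)
answers = [oracle.query(x=0.7, region=1) for _ in range(30)]
print("abstentions before first answer:",
      answers.index(1) if 1 in answers else len(answers))
```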


Q-LDA: Uncovering Latent Patterns in Text-based Sequential Decision Processes

Neural Information Processing Systems

In sequential decision making, it is often important and useful for end users to understand the underlying patterns or causes that lead to the corresponding decisions. However, typical deep reinforcement learning algorithms seldom provide such information due to their black-box nature. In this paper, we present a probabilistic model, Q-LDA, to uncover latent patterns in text-based sequential decision processes. The model can be understood as a variant of latent topic models that are tailored to maximize total rewards; we further draw an interesting connection between an approximate maximum-likelihood estimation of Q-LDA and the celebrated Q-learning algorithm. We demonstrate in the text-game domain that our proposed method not only provides a viable mechanism to uncover latent patterns in decision processes, but also obtains state-of-the-art rewards in these games.
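
Only to fix notation for the connection drawn above, the snippet below shows the standard tabular Q-learning update that Q-LDA's approximate maximum-likelihood estimation is related to; it illustrates the Q-learning side only, not the Q-LDA model itself, and the tiny state/action sizes are illustrative.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

Q = np.zeros((3, 2))                      # 3 states, 2 actions
Q = q_learning_update(Q, s=0, a=1, r=1.0, s_next=2)
print(Q)
```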