Collaborating Authors

 Tan, Vincent


Provable Benefits of Multi-task RL under Non-Markovian Decision Making Processes

arXiv.org Machine Learning

In multi-task reinforcement learning (RL) under Markov decision processes (MDPs), the presence of shared latent structures among multiple MDPs has been shown to yield significant gains in sample efficiency over single-task RL. In this paper, we investigate whether such a benefit extends to more general sequential decision making problems, such as partially observable MDPs (POMDPs) and the more general predictive state representations (PSRs). The main challenge is that the large and complex model space makes it hard to identify which types of common latent structure across multi-task PSRs can reduce model complexity and improve sample efficiency. To this end, we posit a joint model class for the tasks and use the notion of the $\eta$-bracketing number to quantify its complexity; this number also serves as a general metric of task similarity and thus determines the benefit of multi-task over single-task RL. We first study upstream multi-task learning over PSRs, in which all tasks share the same observation and action spaces. We propose a provably efficient algorithm, UMT-PSR, for finding near-optimal policies for all PSRs, and demonstrate that the advantage of multi-task learning manifests if the joint model class of PSRs has a smaller $\eta$-bracketing number than that of individual single-task learning. We also provide several example multi-task PSRs with small $\eta$-bracketing numbers, which reap the benefits of multi-task learning. We further investigate downstream learning, in which the agent must learn a new target task that shares some commonalities with the upstream tasks via a similarity constraint. By exploiting the PSRs learned upstream, we develop a sample-efficient algorithm that provably finds a near-optimal policy.
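For reference, a minimal sketch of the standard bracketing-number definition underlying this complexity measure (the precise norm and the joint-model-class variant used in the paper may differ) is

$$\mathcal{N}_{[\,]}(\eta, \mathcal{F}, \|\cdot\|) \;=\; \min\Bigl\{\, N : \exists\, \{[l_i, u_i]\}_{i=1}^{N} \text{ with } \|u_i - l_i\| \le \eta \text{ and } \mathcal{F} \subseteq \textstyle\bigcup_{i=1}^{N} [l_i, u_i] \,\Bigr\},$$

where a bracket $[l, u]$ denotes the set of functions $f$ satisfying $l \le f \le u$ pointwise. Intuitively, when tasks share latent structure, the joint class of all task models can be covered with far fewer brackets than the product of individual single-task covers, which is the source of the multi-task gain.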


High-Dimensional Graphical Model Selection: Tractable Graph Families and Necessary Conditions

Neural Information Processing Systems

We consider the problem of Ising and Gaussian graphical model selection given $n$ i.i.d. samples from the model. We propose an efficient threshold-based algorithm for structure estimation known as the conditional mutual information test. This simple local algorithm requires only low-order statistics of the data and decides whether two nodes are neighbors in the unknown graph. Under some transparent assumptions, we establish that the proposed algorithm is structurally consistent (or sparsistent) when the number of samples scales as $n = \Omega(J_{\min}^{-4} \log p)$, where $p$ is the number of nodes and $J_{\min}$ is the minimum edge potential. We also prove novel non-asymptotic necessary conditions for graphical model selection.
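As a rough illustration of the conditional-mutual-information thresholding idea on binary (Ising) data, here is a minimal Python sketch; the function names, the threshold xi, and the conditioning-set bound max_cond are illustrative assumptions and do not reproduce the paper's exact test or its guarantees.

```python
# Sketch of a conditional-mutual-information threshold test for binary data.
# Assumptions: variables take values in {0, 1}; `samples` is an (n, p) array;
# `xi` and `max_cond` are hypothetical illustrative parameters.
import numpy as np
from itertools import combinations

def empirical_cmi(samples, i, j, cond):
    """Plug-in estimate of I(X_i; X_j | X_S) from binary samples."""
    n = samples.shape[0]
    cond = list(cond)
    cmi = 0.0
    # Enumerate every assignment to the (small) conditioning set S.
    for s_val in np.ndindex(*([2] * len(cond))):
        mask = np.all(samples[:, cond] == np.array(s_val), axis=1) if cond else np.ones(n, dtype=bool)
        n_s = int(mask.sum())
        if n_s == 0:
            continue
        p_s = n_s / n
        sub = samples[mask]
        for a in (0, 1):
            for b in (0, 1):
                p_ab = np.mean((sub[:, i] == a) & (sub[:, j] == b))
                p_a = np.mean(sub[:, i] == a)
                p_b = np.mean(sub[:, j] == b)
                if p_ab > 0:  # p_ab > 0 implies p_a, p_b > 0
                    cmi += p_s * p_ab * np.log(p_ab / (p_a * p_b))
    return cmi

def cmi_threshold_graph(samples, xi=0.05, max_cond=1):
    """Declare (i, j) an edge if I(X_i; X_j | X_S) stays above xi for every small set S."""
    p = samples.shape[1]
    edges = set()
    for i, j in combinations(range(p), 2):
        others = [k for k in range(p) if k not in (i, j)]
        cond_sets = [()]
        for r in range(1, max_cond + 1):
            cond_sets.extend(combinations(others, r))
        # Local test: true neighbors keep non-negligible dependence under all small conditionings.
        if min(empirical_cmi(samples, i, j, s) for s in cond_sets) > xi:
            edges.add((i, j))
    return edges
```

Because only pairwise and low-order marginals enter the statistic, the test indeed uses only low-order statistics of the data, as the abstract states; roughly speaking, weaker edges yield smaller conditional mutual informations that are harder to separate from zero, which is why the minimum edge potential $J_{\min}$ governs the sample complexity.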