AITopics | psdp

psdp

Efficient Model-Free Exploration in Low-Rank MDPs

Neural Information Processing Systems, Feb-17-2026, 06:54:43 GMT

What are the right computational primitives for exploration?

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > North Carolina > Wake County > Raleigh (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Efficient Model-Free Exploration in Low-Rank MDPs

Mhammedi, Zakaria, Block, Adam, Foster, Dylan J., Rakhlin, Alexander

arXiv.org Artificial Intelligence, Jul-8-2023

A major challenge in reinforcement learning is to develop practical, sample-efficient algorithms for exploration in high-dimensional domains where generalization and function approximation are required. Low-Rank Markov Decision Processes -- where transition probabilities admit a low-rank factorization based on an unknown feature embedding -- offer a simple, yet expressive framework for RL with function approximation, but existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions such as latent variable structure, access to model-based function approximation, or reachability. In this work, we propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs that is both computationally efficient and model-free, allowing for general function approximation and requiring no additional structural assumptions. Our algorithm, VoX, uses the notion of a generalized optimal design for the feature embedding as an efficiently computable basis for exploration, performing efficient optimal design computation by interleaving representation learning and policy optimization. Our analysis -- which is appealingly simple and modular -- carefully combines several techniques, including a new reduction from optimal design computation to policy optimization based on the Frank-Wolfe method, and an improved analysis of a certain minimax representation learning objective found in prior work.
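
The abstract refers to optimal design computation via the Frank-Wolfe method. As a rough illustration only, the sketch below runs the classical Frank-Wolfe (Fedorov-Wynn) iteration for a D-optimal design over a small, fixed set of known 2-D feature vectors. This is not VoX: in the paper the features are unknown and the linear maximization step is carried out by policy optimization. The names `design_matrix`, `leverage`, and `frank_wolfe_design`, the tolerance, and the example features are all assumptions made for this sketch.

```python
from typing import List, Sequence, Tuple

Vec2 = Tuple[float, float]
Sym2 = Tuple[float, float, float]  # (a, b, c) encodes the matrix [[a, b], [b, c]]


def design_matrix(features: Sequence[Vec2], weights: Sequence[float]) -> Sym2:
    """Weighted second-moment matrix M(w) = sum_i w_i * phi_i phi_i^T."""
    a = b = c = 0.0
    for (x, y), w in zip(features, weights):
        a += w * x * x
        b += w * x * y
        c += w * y * y
    return a, b, c


def leverage(phi: Vec2, m: Sym2) -> float:
    """phi^T M^{-1} phi, using the closed-form inverse of a 2x2 matrix."""
    a, b, c = m
    x, y = phi
    det = a * c - b * b
    return (c * x * x - 2.0 * b * x * y + a * y * y) / det


def frank_wolfe_design(
    features: Sequence[Vec2], tol: float = 0.02, max_iters: int = 10_000
) -> List[float]:
    """Approximately maximize log det M(w) over the probability simplex."""
    dim = 2
    n = len(features)
    weights = [1.0 / n] * n  # uniform start; assumes the features span R^2
    for _ in range(max_iters):
        m = design_matrix(features, weights)
        # The gradient of log det M(w) with respect to w_i is the leverage of phi_i.
        scores = [leverage(phi, m) for phi in features]
        best = max(range(n), key=scores.__getitem__)
        g = scores[best]
        # Kiefer-Wolfowitz: a design is optimal exactly when max leverage == dim.
        if g <= dim + tol:
            break
        # Exact line search toward the vertex with the largest leverage.
        step = (g / dim - 1.0) / (g - 1.0)
        weights = [(1.0 - step) * w for w in weights]
        weights[best] += step
    return weights


# Hypothetical feature set: the last vector is dominated by (1, 1).
example_features = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
example_weights = frank_wolfe_design(example_features)
```

In this toy instance the weight on the dominated vector `(0.5, 0.5)` only shrinks, because it never attains the largest leverage score. The remaining mass spreads over the other three vectors.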

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2307.03997

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > North Carolina > Wake County > Raleigh (0.04)
North America > United States > Florida > Orange County > Orlando (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Artificial Intelligence projects bag Rs 723m in PSDP - ThePenPK

#artificialintelligence, Jun-11-2022, 06:35:05 GMT

ISLAMABAD: The federal government has allocated Rs 723 million for the promotion of Artificial Intelligence (AI) in the country to cope with the challenges of technological advancement. The Defence Division will spend Rs 300 million on the development of an Information and Communication Technology (ICT) and AI-based precision agriculture system. Named Green AI, the system will utilize dual-use aerospace technologies. For the establishment of the Sino-Pak Centre for Artificial Intelligence under the Information Technology and Telecom Division, an estimated Rs 243 million has been allocated. The National Centre of Artificial Intelligence, Islamabad, will work under the Higher Education Commission, for which Rs 170 million has been allocated.

artificial intelligence, establishment, thepenpk, (1 more...)

#artificialintelligence

Country: Asia > Pakistan > Islamabad Capital Territory > Islamabad (0.58)

Industry:

Food & Agriculture > Agriculture (0.67)
Education > Educational Setting > Higher Education (0.31)

Technology: Information Technology > Artificial Intelligence (1.00)

Approximate Policy Iteration Schemes: A Comparison

Scherrer, Bruno

arXiv.org Machine Learning, May-12-2014

We consider the infinite-horizon discounted optimal control problem formalized by Markov Decision Processes. We focus on several approximate variations of the Policy Iteration algorithm: Approximate Policy Iteration (API), Conservative Policy Iteration (CPI), a natural adaptation of the Policy Search by Dynamic Programming algorithm to the infinite-horizon case (PSDP$_\infty$), and the recently proposed Non-Stationary Policy Iteration (NSPI(m)). For all algorithms, we describe performance bounds and compare them, paying particular attention to the concentrability constants involved, the number of iterations, and the memory required. Our analysis highlights the following points: 1) The performance guarantee of CPI can be arbitrarily better than that of API/API($\alpha$), but this comes at the cost of a relative (exponential in $\frac{1}{\epsilon}$) increase in the number of iterations. 2) PSDP$_\infty$ enjoys the best of both worlds: its performance guarantee is similar to that of CPI, but within a number of iterations similar to that of API. 3) Unlike API, which requires constant memory, the memory needed by CPI and PSDP$_\infty$ is proportional to their number of iterations, which may be problematic when the discount factor $\gamma$ is close to 1 or the approximation error $\epsilon$ is close to $0$; we show that the NSPI(m) algorithm allows an overall trade-off between memory and performance. Simulations with these schemes confirm our analysis.
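
To make the API versus CPI contrast concrete, here is a minimal sketch on a hypothetical two-state, two-action MDP with deterministic transitions. It uses exact policy evaluation, so it does not reproduce the approximation errors the paper analyzes, and the fixed mixing rate `alpha` is an assumption rather than the paper's step-size rule. Setting `alpha=1.0` gives a plain greedy policy-iteration update; a small `alpha` gives a CPI-style conservative mixture. All names (`NEXT`, `REWARD`, `evaluate`, `mixture_step`) are illustrative.

```python
from typing import List

GAMMA = 0.9
NEXT = [[0, 1], [0, 1]]              # NEXT[s][a]: deterministic next state
REWARD = [[0.0, 0.0], [0.0, 1.0]]    # REWARD[s][a]: immediate reward

Policy = List[List[float]]  # policy[s][a] = probability of action a in state s


def q_values(value: List[float]) -> List[List[float]]:
    """One-step lookahead Q(s, a) = r(s, a) + gamma * V(next(s, a))."""
    return [[REWARD[s][a] + GAMMA * value[NEXT[s][a]] for a in range(2)]
            for s in range(2)]


def evaluate(policy: Policy, sweeps: int = 2000) -> List[float]:
    """Policy evaluation by repeated Bellman backups (a gamma-contraction)."""
    value = [0.0, 0.0]
    for _ in range(sweeps):
        q = q_values(value)
        value = [sum(policy[s][a] * q[s][a] for a in range(2)) for s in range(2)]
    return value


def mixture_step(policy: Policy, alpha: float) -> Policy:
    """Return (1 - alpha) * policy + alpha * greedy(policy)."""
    q = q_values(evaluate(policy))
    new_policy = []
    for s in range(2):
        greedy = max(range(2), key=q[s].__getitem__)
        row = [(1.0 - alpha) * p for p in policy[s]]
        row[greedy] += alpha
        new_policy.append(row)
    return new_policy


uniform: Policy = [[0.5, 0.5], [0.5, 0.5]]

# API-style update: jump straight to the greedy policy.
api_policy = mixture_step(uniform, alpha=1.0)

# CPI-style updates: five small steps toward the greedy policy.
cpi_policy = uniform
for _ in range(5):
    cpi_policy = mixture_step(cpi_policy, alpha=0.1)
```

On this toy problem a single greedy step already reaches the optimal policy, while the conservative mixture improves more slowly. With exact evaluation there is nothing for the conservative step to protect against, so the sketch shows only the iteration-count side of the trade-off described in the abstract.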

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Machine Learning

1405.2878

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Policy Search by Dynamic Programming

Bagnell, J. A., Kakade, Sham M., Schneider, Jeff G., Ng, Andrew Y.

Neural Information Processing Systems, Dec-31-2004

We consider the policy search approach to reinforcement learning. We show that if a "baseline distribution" is given (indicating roughly how often we expect a good policy to visit each state), then we can derive a policy search algorithm that terminates in a finite number of steps, and for which we can provide nontrivial performance guarantees. We also demonstrate this algorithm on several grid-world POMDPs, a planar biped walking robot, and a double-pole balancing problem.
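
The backward-in-time structure behind this approach can be sketched as follows, assuming a small tabular, finite-horizon, undiscounted MDP with deterministic transitions and a finite policy class of deterministic state-to-action maps. At each time step, from last to first, the sketch picks the policy that maximizes expected value-to-go under that step's baseline state distribution, holding the later policies fixed. The function names, the toy MDP, and the uniform baseline are assumptions for illustration; the paper's setting is more general.

```python
from itertools import product
from typing import List, Sequence, Tuple

DetPolicy = Tuple[int, ...]  # policy[s] = action taken in state s


def one_step_values(
    policy: DetPolicy,
    next_state: Sequence[Sequence[int]],
    reward: Sequence[Sequence[float]],
    value_to_go: Sequence[float],
) -> List[float]:
    """Value of acting with `policy` now and following the later policies after."""
    return [
        reward[s][policy[s]] + value_to_go[next_state[s][policy[s]]]
        for s in range(len(policy))
    ]


def psdp(
    next_state: Sequence[Sequence[int]],
    reward: Sequence[Sequence[float]],
    policy_class: Sequence[DetPolicy],
    baseline: Sequence[Sequence[float]],  # baseline[t][s]: state distribution at time t
    horizon: int,
) -> Tuple[List[DetPolicy], List[float]]:
    value_to_go = [0.0] * len(next_state)  # nothing is earned after the horizon
    policies: List[DetPolicy] = []
    for t in reversed(range(horizon)):
        def score(pi: DetPolicy) -> float:
            values = one_step_values(pi, next_state, reward, value_to_go)
            return sum(mu * v for mu, v in zip(baseline[t], values))

        best = max(policy_class, key=score)
        policies.append(best)
        value_to_go = one_step_values(best, next_state, reward, value_to_go)
    policies.reverse()  # policies[t] is the policy used at time t
    return policies, value_to_go


# Hypothetical toy problem: action 1 moves to state 1, which pays reward 1 for staying.
toy_next = [[0, 1], [0, 1]]
toy_reward = [[0.0, 0.0], [0.0, 1.0]]
all_policies = list(product([0, 1], repeat=2))
uniform_baseline = [[0.5, 0.5]] * 3
toy_policies, toy_values = psdp(toy_next, toy_reward, all_policies, uniform_baseline, 3)
```

The loop runs exactly `horizon` times, which matches the abstract's point that the algorithm terminates in a finite number of steps. The baseline distribution only matters when the policy class is restricted, for example to the two constant policies `(0, 0)` and `(1, 1)`, since it then decides which states' values dominate the choice.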

algorithm, non-stationary policy, psdp, (14 more...)

Neural Information Processing Systems

Country: