AITopics | Bajgar, Ondrej

Collaborating Authors

Bajgar, Ondrej

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Toward Information Theoretic Active Inverse Reinforcement Learning

Bajgar, Ondrej, Gould, Sid William, Mitta, Rohan Narayan Langford, Liu, Jonathon, Newcombe, Oliver, Golden, Jack

arXiv.org Machine LearningDec-31-2024

As AI systems become increasingly autonomous, aligning their decision-making to human preferences is essential. In domains like autonomous driving or robotics, it is impossible to write down the reward function representing these preferences by hand. Inverse reinforcement learning (IRL) offers a promising approach to infer the unknown reward from demonstrations. However, obtaining human demonstrations can be costly. Active IRL addresses this challenge by strategically selecting the most informative scenarios for human demonstration, reducing the amount of required human effort. Where most prior work allowed querying the human for an action at one state at a time, we motivate and analyse scenarios where we collect longer trajectories. We provide an information-theoretic acquisition function, propose an efficient approximation scheme, and illustrate its performance through a set of gridworld experiments as groundwork for future work expanding to more general settings.

machine learning, reinforcement learning, trajectory, (17 more...)

arXiv.org Machine Learning

2501.00381

Country:

Europe > United Kingdom > England (0.14)
North America > United States > Wisconsin (0.14)

Genre: Research Report (1.00)

Industry: Information Technology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Walking the Values in Bayesian Inverse Reinforcement Learning

Bajgar, Ondrej, Abate, Alessandro, Gatsis, Konstantinos, Osborne, Michael A.

arXiv.org Artificial IntelligenceJul-15-2024

The goal of Bayesian inverse reinforcement learning (IRL) is recovering a posterior distribution over reward functions using a set of demonstrations from an expert optimizing for a reward unknown to the learner. The resulting posterior over rewards can then be used to synthesize an apprentice policy that performs well on the same or a similar task. A key challenge in Bayesian IRL is bridging the computational gap between the hypothesis space of possible rewards and the likelihood, often defined in terms of Q values: vanilla Bayesian IRL needs to solve the costly forward planning problem - going from rewards to the Q values - at every step of the algorithm, which may need to be done thousands of times. We propose to solve this by a simple change: instead of focusing on primarily sampling in the space of rewards, we can focus on primarily working in the space of Q-values, since the computation required to go from Q-values to reward is radically cheaper. Furthermore, this reversion of the computation makes it easy to compute the gradient allowing efficient sampling using Hamiltonian Monte Carlo. We propose ValueWalk - a new Markov chain Monte Carlo method based on this insight - and illustrate its advantages on several tasks.

machine learning, posterior, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2407.10971

Country:

Europe > United Kingdom > England (0.14)
North America > United States (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Add feedback

Negative Human Rights as a Basis for Long-term AI Safety and Regulation

Bajgar, Ondrej, Horenovsky, Jan

arXiv.org Artificial IntelligenceApr-20-2023

If autonomous AI systems are to be reliably safe in novel situations, they will need to incorporate general principles guiding them to recognize and avoid harmful behaviours. Such principles may need to be supported by a binding system of regulation, which would need the underlying principles to be widely accepted. They should also be specific enough for technical implementation. Drawing inspiration from law, this article explains how negative human rights could fulfil the role of such principles and serve as a foundation both for an international regulatory system and for building technical safety constraints for future AI systems.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.1.14020

2208.14788

Country:

North America > United States (1.00)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)

Genre: Research Report (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Regional Government > Europe Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(3 more...)

Add feedback

Planning for Goal-Oriented Dialogue Systems

Muise, Christian, Chakraborti, Tathagata, Agarwal, Shubham, Bajgar, Ondrej, Chaudhary, Arunima, Lastras-Montano, Luis A., Ondrej, Josef, Vodolan, Miroslav, Wiecha, Charlie

arXiv.org Artificial IntelligenceOct-17-2019

Generating complex multi-turn goal-oriented dialogue agents is a difficult problem that has seen a considerable focus from many leaders in the tech industry, including IBM, Google, Amazon, and Microsoft. This is in large part due to the rapidly growing market demand for dialogue agents capable of goal-oriented behaviour. Due to the business process nature of these conversations, end-to-end machine learning systems are generally not a viable option, as the generated dialogue agents must be deployable and verifiable on behalf of the businesses authoring them. In this work, we propose a paradigm shift in the creation of goal-oriented complex dialogue systems that dramatically eliminates the need for a designer to manually specify a dialogue tree, which nearly all current systems have to resort to when the interaction pattern falls outside standard patterns such as slot filling. We propose a declarative representation of the dialogue agent to be processed by state-of-the-art planning technology. Our proposed approach covers all aspects of the process; from model solicitation to the execution of the generated plans/dialogue agents. Along the way, we introduce novel planning encodings for declarative dialogue synthesis, a variety of interfaces for working with the specification as a dialogue architect, and a robust executor for generalized contingent plans. We have created prototype implementations of all components, and in this paper, we further demonstrate the resulting system empirically.

planning & scheduling, specification, survey article, (20 more...)

arXiv.org Artificial Intelligence

1910.08137

Country:

Europe (0.46)
Asia > Middle East > Israel (0.14)
North America > United States > California (0.14)

Genre:

Research Report (0.63)
Overview (0.45)

Industry:

Information Technology (0.69)
Consumer Products & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(2 more...)

Add feedback

Generating Dialogue Agents via Automated Planning

Botea, Adi, Muise, Christian, Agarwal, Shubham, Alkan, Oznur, Bajgar, Ondrej, Daly, Elizabeth, Kishimoto, Akihiro, Lastras, Luis, Marinescu, Radu, Ondrej, Josef, Pedemonte, Pablo, Vodolan, Miroslav

arXiv.org Artificial IntelligenceFeb-2-2019

Dialogue systems have many applications such as customer support or question answering. Typically they have been limited to shallow single turn interactions. However more advanced applications such as career coaching or planning a trip require a much more complex multi-turn dialogue. Current limitations of conversational systems have made it difficult to support applications that require personalization, customization and context dependent interactions. We tackle this challenging problem by using domain-independent AI planning to automatically create dialogue plans, customized to guide a dialogue towards achieving a given goal. The input includes a library of atomic dialogue actions, an initial state of the dialogue, and a goal. Dialogue plans are plugged into a dialogue system capable to orchestrate their execution. Use cases demonstrate the viability of the approach. Our work on dialogue planning has been integrated into a product, and it is in the process of being deployed into another.

artificial intelligence, dialogue, planning & scheduling, (20 more...)

arXiv.org Artificial Intelligence

1902.00771

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Industry: Consumer Products & Services > Travel (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.93)

Add feedback

A Boo(n) for Evaluating Architecture Performance

Bajgar, Ondrej, Kadlec, Rudolf, Kleindienst, Jan

arXiv.org Machine LearningJul-5-2018

We point out important problems with the common practice of using the best single model performance for comparing deep learning architectures, and we propose a method that corrects these flaws. Each time a model is trained, one gets a different result due to random factors in the training process, which include random parameter initialization and random data shuffling. Reporting the best single model performance does not appropriately address this stochasticity. We propose a normalized expected best-out-of-$n$ performance ($\text{Boo}_n$) as a way to correct these problems.

deep learning, neural network, performance distribution, (19 more...)

arXiv.org Machine Learning

1807.01961

Country: Europe (0.28)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback