
Collaborating Authors

 Rahmani, Kia


Dynamic Model Predictive Shielding for Provably Safe Reinforcement Learning

arXiv.org Artificial Intelligence

Among approaches for provably safe reinforcement learning, Model Predictive Shielding (MPS) has proven effective at complex tasks in continuous, high-dimensional state spaces by leveraging a backup policy to ensure safety when the learned policy attempts to take risky actions. However, while MPS can ensure safety both during and after training, it often hinders task progress due to the conservative and task-oblivious nature of backup policies. This paper introduces Dynamic Model Predictive Shielding (DMPS), which optimizes reinforcement learning objectives while maintaining provable safety. DMPS employs a local planner to dynamically select safe recovery actions that maximize both short-term progress and long-term rewards. Crucially, the planner and the neural policy play a synergistic role in DMPS. When planning recovery actions for ensuring safety, the planner utilizes the neural policy to estimate long-term rewards, allowing it to observe beyond its short-term planning horizon. Conversely, the neural policy under training learns from the recovery plans proposed by the planner, converging to policies that are both high-performing and safe in practice. This approach guarantees safety during and after training, with bounded recovery regret that decreases exponentially with planning horizon depth. Experimental results demonstrate that DMPS converges to policies that rarely require shield interventions after training and achieve higher rewards than several state-of-the-art baselines.
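
To make the shield's control flow concrete, the following is a minimal Python sketch of a DMPS-style loop on a toy one-dimensional point mass. Every name and number in it (PointMass, plan_recovery, value_estimate, the recoverability horizon) is a hypothetical stand-in chosen for illustration, not the paper's implementation: the learned policy and critic are replaced by simple functions, and the planner is a brute-force search over short action sequences.

# Minimal DMPS-style shielding loop on a toy 1-D point mass (illustration only).
import itertools
import random

DT, GOAL = 0.1, 0.9        # integration step and a hypothetical task goal


class PointMass:
    """Toy dynamics: position x, velocity v; the state is unsafe once |x| > 1."""

    def __init__(self, x=0.0, v=0.0):
        self.x, self.v = x, v

    def step(self, a):
        return PointMass(self.x + self.v * DT, self.v + a * DT)

    def is_safe(self):
        return abs(self.x) <= 1.0


def learned_action(_state):
    # Stand-in for the neural policy under training: always pushes toward
    # the goal, and therefore toward the unsafe boundary.
    return random.uniform(0.0, 1.0)


def value_estimate(state):
    # Stand-in for the critic's long-term value estimate.
    return -10.0 * abs(state.x - GOAL)


def backup_action(state):
    # Conservative, task-oblivious backup policy: brake toward zero velocity.
    return -2.0 * state.v


def recoverable(state, steps=50):
    """A state is recoverable if the backup policy keeps it safe from there."""
    s = state
    for _ in range(steps):
        if not s.is_safe():
            return False
        s = s.step(backup_action(s))
    return s.is_safe()


def plan_recovery(state, horizon=3, candidates=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Search short action sequences whose states all stay recoverable and
    rank them by short-term progress plus the critic's value at the horizon."""
    best_seq, best_score = None, float("-inf")
    for seq in itertools.product(candidates, repeat=horizon):
        s, progress, ok = state, 0.0, True
        for a in seq:
            s = s.step(a)
            if not recoverable(s):
                ok = False
                break
            progress += -abs(s.x - GOAL)
        if ok and progress + value_estimate(s) > best_score:
            best_seq, best_score = seq, progress + value_estimate(s)
    return best_seq


def shielded_action(state):
    """Trust the learned action if its successor stays recoverable; otherwise
    ask the planner for a recovery action; fall back to the backup if needed."""
    a = learned_action(state)
    if recoverable(state.step(a)):
        return a
    plan = plan_recovery(state)
    return plan[0] if plan else backup_action(state)


if __name__ == "__main__":
    s = PointMass()
    for _ in range(100):
        s = s.step(shielded_action(s))
        assert s.is_safe()
    print("episode stayed safe; final position:", round(s.x, 3))

The structure mirrors the abstract: the shield first checks whether the learned action keeps the system recoverable, and only when it does not is the planner invoked, with candidate recovery sequences scored by short-term progress plus the critic's value at the end of the planning horizon.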


Programming-by-Demonstration for Long-Horizon Robot Tasks

arXiv.org Artificial Intelligence

The goal of programmatic Learning from Demonstration (LfD) is to learn a policy in a programming language that can be used to control a robot's behavior from a set of user demonstrations. This paper presents a new programmatic LfD algorithm that targets long-horizon robot tasks which require synthesizing programs with complex control flow structures, including nested loops with multiple conditionals. Our proposed method first learns a program sketch that captures the target program's control flow and then completes this sketch using an LLM-guided search procedure that incorporates a novel technique for proving unrealizability of programming-by-demonstration problems. We have implemented our approach in a new tool called PROLEX and present the results of a comprehensive experimental evaluation on 120 benchmarks involving complex tasks and environments. We show that, given a 120-second time limit, PROLEX can find a program consistent with the demonstrations in 80% of the cases. Furthermore, for 81% of the tasks for which a solution is returned, PROLEX is able to find the ground truth program with just one demonstration. In comparison, CVC5, a syntax-guided synthesis tool, is only able to solve 25% of the cases even when given the ground truth program sketch, and an LLM-based approach, GPT-Synth, is unable to solve any of the tasks due to the environment complexity.
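
The following toy Python sketch illustrates two of the ingredients highlighted above, sketch completion and unrealizability pruning, on a made-up domain. The DSL, component library, demonstrations, and the specific pruning rule (two demonstrations demanding different actions from the same hole) are all hypothetical simplifications; the LLM guidance and the real PROLEX DSL are omitted.

# Toy sketch-completion search with unrealizability pruning (illustration only).
from itertools import product

# Hypothetical component library: predicates over an observation dict and
# primitive actions; a real system would draw these from the robot's API.
PREDICATES = {
    "at_goal":     lambda o: o["dist_to_goal"] == 0,
    "obstacle":    lambda o: o["obstacle_ahead"],
    "low_battery": lambda o: o["battery"] < 20,
}
ACTIONS = ["move_forward", "turn_left", "stop", "dock"]

# Made-up demonstrations: (observation, action) pairs from user traces.
DEMOS = [
    ({"dist_to_goal": 5, "obstacle_ahead": False, "battery": 80}, "move_forward"),
    ({"dist_to_goal": 3, "obstacle_ahead": True,  "battery": 80}, "turn_left"),
    ({"dist_to_goal": 0, "obstacle_ahead": False, "battery": 50}, "stop"),
    ({"dist_to_goal": 4, "obstacle_ahead": False, "battery": 10}, "dock"),
]

# A sketch fixes the control flow (here, a chain of conditionals) and leaves
# guard holes (?g*) and action holes (?a*) to be filled by the search.
SKETCH = ("if", "?g1", "?a1", ("if", "?g2", "?a2", ("if", "?g3", "?a3", "?a4")))


def run(program, obs):
    """Interpret a nested-if program with all guards filled; returns either a
    concrete action or the name of the action hole the observation reaches."""
    if isinstance(program, str):
        return program
    _, guard, then_p, else_p = program
    return run(then_p if PREDICATES[guard](obs) else else_p, obs)


def fill(sketch, binding):
    """Substitute hole names with the chosen components."""
    if isinstance(sketch, str):
        return binding.get(sketch, sketch)
    return tuple(fill(part, binding) for part in sketch)


def realizable(partial):
    """Unrealizability check: if two demos that demand different actions reach
    the same action hole, no completion of this control flow can succeed.
    Returns the action each hole must take, or None if provably unrealizable."""
    required = {}
    for obs, action in DEMOS:
        leaf = run(partial, obs)
        if leaf.startswith("?"):
            if required.setdefault(leaf, action) != action:
                return None
        elif leaf != action:
            return None
    return required


def search():
    guard_holes = ["?g1", "?g2", "?g3"]
    for guards in product(PREDICATES, repeat=len(guard_holes)):
        partial = fill(SKETCH, dict(zip(guard_holes, guards)))
        required = realizable(partial)
        if required is None:
            continue                       # prune every completion of this guard choice
        program = fill(partial, required)  # complete the action holes straight from demos
        if all(run(program, o) == a for o, a in DEMOS):
            return program
    return None


if __name__ == "__main__":
    print(search())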


Programmatic Imitation Learning from Unlabeled and Noisy Demonstrations

arXiv.org Artificial Intelligence

Imitation Learning (IL) is a promising paradigm for teaching robots to perform novel tasks using demonstrations. Most existing approaches for IL utilize neural networks (NNs); however, these methods suffer from several well-known limitations: they 1) require large amounts of training data, 2) are hard to interpret, and 3) are hard to repair and adapt. There is an emerging interest in programmatic imitation learning (PIL), which offers significant promise in addressing the above limitations. In PIL, the learned policy is represented in a programming language, making it amenable to interpretation and repair. However, state-of-the-art PIL algorithms assume access to action labels and struggle to learn from noisy real-world demonstrations. In this paper, we propose PLUNDER, a novel PIL algorithm that integrates a probabilistic program synthesizer in an iterative Expectation-Maximization (EM) framework to address these shortcomings. Unlike existing PIL approaches, PLUNDER synthesizes probabilistic programmatic policies that are particularly well-suited for modeling the uncertainties inherent in real-world demonstrations. Our approach leverages an EM loop to simultaneously infer the missing action labels and the most likely probabilistic policy. We benchmark PLUNDER against several established IL techniques and demonstrate its superiority across five challenging imitation learning tasks under noise. PLUNDER policies achieve 95% accuracy in matching the given demonstrations, outperforming the next best baseline by 19%. Additionally, policies generated by PLUNDER successfully complete the tasks 17% more frequently than the nearest baseline.
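
As a rough illustration of the EM structure described above, the toy Python sketch below alternates between inferring latent action labels from noisy, unlabeled observations (E-step) and selecting the most likely policy from a small candidate family (M-step). The braking scenario, noise model, and threshold-based policy template are invented for illustration, and a grid search over one parameter stands in for a real probabilistic program synthesizer.

# Toy EM loop for learning a probabilistic policy from unlabeled, noisy demos.
import math
import random

random.seed(0)

# Hypothetical demonstration data: no action labels, only noisy observations.
ACCEL = {"ACC": 1.0, "BRAKE": -1.0}      # nominal acceleration per latent action
NOISE = 0.5                              # std-dev of the observation noise
TRUE_THRESHOLD = 30.0                    # the demonstrator brakes below this distance

demos = []
for _ in range(300):
    dist = random.uniform(0.0, 60.0)
    act = "BRAKE" if dist < TRUE_THRESHOLD else "ACC"
    demos.append((dist, ACCEL[act] + random.gauss(0.0, NOISE)))


def p_brake(dist, theta, tau=2.0):
    """Candidate probabilistic policy: brake probability rises as the
    distance drops below the threshold theta (logistic noise model)."""
    return 1.0 / (1.0 + math.exp((dist - theta) / tau))


def gauss_lik(x, mean, sigma=NOISE):
    # Unnormalized Gaussian likelihood; the constant cancels in the posterior.
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def e_step(theta):
    """Posterior probability that each unlabeled step was a BRAKE."""
    weights = []
    for dist, accel in demos:
        b = gauss_lik(accel, ACCEL["BRAKE"]) * p_brake(dist, theta)
        a = gauss_lik(accel, ACCEL["ACC"]) * (1.0 - p_brake(dist, theta))
        weights.append(b / (a + b))
    return weights


def m_step(weights, candidates=range(5, 60, 5)):
    """'Synthesize' the most likely policy: pick the threshold maximizing the
    expected complete-data log-likelihood under the inferred labels."""
    def score(theta):
        total = 0.0
        for (dist, _), w in zip(demos, weights):
            pb = min(max(p_brake(dist, theta), 1e-9), 1 - 1e-9)
            total += w * math.log(pb) + (1 - w) * math.log(1 - pb)
        return total
    return max(candidates, key=score)


theta = 50.0                             # deliberately poor initial policy
for _ in range(10):
    theta = m_step(e_step(theta))
print("recovered braking threshold:", theta)   # expect something near 30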


Multi-modal Program Inference: a Marriage of Pre-trained Language Models and Component-based Synthesis

arXiv.org Artificial Intelligence

Multi-modal program synthesis refers to the task of synthesizing programs (code) from their specification given in different forms, such as a combination of natural language and examples. Examples provide a precise but incomplete specification, and natural language provides an ambiguous but more "complete" task description. Machine-learned pre-trained models (PTMs) are adept at handling ambiguous natural language, but struggle with generating syntactically and semantically precise code. Program synthesis techniques can generate correct code, often even from incomplete but precise specifications, such as examples, but they are unable to work with the ambiguity of natural language. We present an approach that combines PTMs with component-based synthesis (CBS): PTMs are used to generate candidate programs from the natural language description of the task, which are then used to guide the CBS procedure to find the program that matches the precise example-based specification. We use our combination approach to instantiate multi-modal synthesis systems for two programming domains: the domain of regular expressions and the domain of CSS selectors. Our evaluation demonstrates the effectiveness of our domain-agnostic approach in comparison to a state-of-the-art specialized system, and the generality of our approach in providing multi-modal program synthesis from natural language and examples in different programming domains.
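
The toy Python sketch below illustrates the division of labor described above for the regex domain: hard-coded strings stand in for pre-trained-model candidates, components are harvested from those (incorrect) candidates, and a small component-based enumeration searches for a program consistent with the examples. The decomposition heuristic, examples, and candidate guesses are all made up; this is not the paper's system.

# Toy PTM-guided component-based synthesis in the regex domain (illustration only).
import itertools
import re

# Precise but incomplete spec: examples the target regex must accept/reject.
POSITIVE = ["415-1234", "650-9999"]
NEGATIVE = ["4151234", "41-1234", "415-123", "415-12345"]

# Ambiguous but rich spec: "match a 3-digit prefix, a dash, then a 4-digit number".
# Hard-coded stand-ins for the pre-trained model's guesses (none is correct).
PTM_CANDIDATES = [r"\d{3}-\d{4,5}", r"[0-9]+-[0-9]+", r"\d\d\d-\d\d\d"]


def satisfies(regex):
    """Check a candidate program against the example-based specification."""
    try:
        pat = re.compile(regex)
    except re.error:
        return False
    return (all(pat.fullmatch(s) for s in POSITIVE)
            and not any(pat.fullmatch(s) for s in NEGATIVE))


def components_from(candidates):
    """Harvest reusable components from the PTM's (possibly wrong) guesses:
    escapes with optional quantifiers, character classes, and literal chars."""
    tokens = set()
    for cand in candidates:
        tokens.update(re.findall(r"\\d(?:\{\d+(?:,\d+)?\})?|\[[^\]]+\]\+?|[^\\\[\]]", cand))
    return sorted(tokens)


def cbs(components, max_parts=4):
    """Component-based synthesis: enumerate concatenations of components,
    smallest first, and return the first one consistent with the examples."""
    for n in range(1, max_parts + 1):
        for parts in itertools.product(components, repeat=n):
            regex = "".join(parts)
            if satisfies(regex):
                return regex
    return None


if __name__ == "__main__":
    # None of the raw PTM guesses satisfies the examples...
    print([c for c in PTM_CANDIDATES if satisfies(c)])      # -> []
    # ...but the components they contribute guide CBS to a consistent program.
    print(cbs(components_from(PTM_CANDIDATES)))             # e.g. a regex matching ddd-dddd

The design point the sketch tries to capture is that the model's guesses need not be correct to be useful: their pieces shrink the component space enough for the enumerative search to find a program that exactly satisfies the examples.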