
Collaborating Authors

 Rezaei-Shoshtari, Sahand


Fairness in Reinforcement Learning with Bisimulation Metrics

arXiv.org Artificial Intelligence

Ensuring long-term fairness is crucial when developing automated decision making systems, especially in dynamic and sequential environments. By maximizing their reward without consideration of fairness, AI agents can introduce disparities in their treatment of groups or individuals. In this paper, we establish the connection between bisimulation metrics and group fairness in reinforcement learning. We propose a novel approach that leverages bisimulation metrics to learn reward functions and observation dynamics, ensuring that learners treat groups fairly while still reflecting the original problem. We demonstrate the effectiveness of our method in addressing disparities in sequential decision making problems through empirical evaluation on a standard fairness benchmark consisting of lending and college admission scenarios.

As machine learning continues to shape decision making systems, understanding and addressing its potential risks and biases becomes increasingly imperative. This concern is especially pronounced in sequential decision making, where neglecting algorithmic fairness can create a self-reinforcing cycle that amplifies existing disparities (Jabbari et al., 2017; D'Amour et al., 2020). In response, there is growing recognition of the importance of leveraging reinforcement learning (RL) to tackle decision making problems that have traditionally been approached through supervised learning paradigms, in order to achieve long-term fairness (Nashed et al., 2023). Yin et al. (2023) define long-term fairness in RL as the optimization of the cumulative reward subject to a constraint on the cumulative utility, reflecting fairness over a time horizon. Recent efforts to achieve fairness in RL have primarily relied on metrics adopted from supervised learning, such as demographic parity (Dwork et al., 2012) or equality of opportunity (Hardt et al., 2016b). These metrics are typically integrated into a constrained Markov decision process (MDP) framework to learn a policy that adheres to the criterion (Wen et al., 2021; Yin et al., 2023; Satija et al., 2023; Hu & Zhang, 2022). However, this approach requires complex constrained optimization, which introduces additional complexity and hyperparameters into the underlying RL algorithm. Moreover, these methods implicitly assume that stakeholders incorporate the fairness constraints into their decision making process, which may not hold in practice due to various external and uncontrollable factors (Kusner & Loftus, 2020).
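At the heart of this approach is a behavioural notion of similarity: two states are close under a bisimulation metric if they yield similar immediate rewards and similar transition behaviour. The sketch below illustrates one way such a metric can be turned into a group-fairness penalty on learned reward and dynamics models; it assumes a DBC-style approximation in which the Wasserstein term is computed in closed form between diagonal-Gaussian next-state predictions, and the module and function names (RewardModel, DynamicsModel, group_fairness_penalty) are illustrative, not the authors' implementation.

```python
# A minimal sketch, assuming learned reward and dynamics models and a
# closed-form 2-Wasserstein distance between diagonal-Gaussian next-state
# predictions; not the paper's code.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    def __init__(self, state_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))
    def forward(self, s):
        return self.net(s)

class DynamicsModel(nn.Module):
    """Predicts a diagonal Gaussian over the next latent state."""
    def __init__(self, state_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * state_dim))
    def forward(self, s):
        mu, log_std = self.net(s).chunk(2, dim=-1)
        return mu, log_std.exp()

def bisimulation_distance(reward_model, dynamics_model, s_a, s_b, c=0.99):
    """Approximate bisimulation distance between two batches of states."""
    r_gap = (reward_model(s_a) - reward_model(s_b)).abs().squeeze(-1)
    mu_a, std_a = dynamics_model(s_a)
    mu_b, std_b = dynamics_model(s_b)
    # 2-Wasserstein distance between diagonal Gaussians.
    w2 = ((mu_a - mu_b).pow(2).sum(-1) + (std_a - std_b).pow(2).sum(-1)).sqrt()
    return r_gap + c * w2

def group_fairness_penalty(reward_model, dynamics_model, states_group_a, states_group_b):
    """Penalty term: paired individuals from two demographic groups whose
    non-sensitive features match should be nearly bisimilar under the
    learned reward and dynamics; added to the model-learning loss."""
    return bisimulation_distance(reward_model, dynamics_model,
                                 states_group_a, states_group_b).mean()
```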


Policy Gradient Methods in the Presence of Symmetries and State Abstractions

arXiv.org Artificial Intelligence

Reinforcement learning on high-dimensional and complex problems relies on abstraction for improved efficiency and generalization. In this paper, we study abstraction in the continuous-control setting and extend the definition of MDP homomorphisms to continuous state and action spaces. We derive a policy gradient theorem on the abstract MDP for both stochastic and deterministic policies. These policy gradient results allow approximate symmetries of the environment to be leveraged for policy optimization. Based on these theorems, we propose a family of actor-critic algorithms that learn the policy and the MDP homomorphism map simultaneously, using the lax bisimulation metric. Finally, we introduce a series of environments with continuous symmetries to further demonstrate our algorithm's ability to perform action abstraction in the presence of such symmetries. We demonstrate the effectiveness of our method on these environments, as well as on challenging visual control tasks from the DeepMind Control Suite. Our method's ability to utilize MDP homomorphisms for representation learning leads to improved performance, and visualizations of the latent space clearly show the structure of the learned abstraction.
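To make this construction concrete, the following sketch (an assumption about the general shape of such an algorithm, not the authors' released code) shows how a learned homomorphism map, consisting of a state encoder f and a state-conditioned action encoder g, can be combined with a critic defined on the abstract MDP to form a DDPG-style deterministic policy gradient; all class and function names are illustrative.

```python
# A minimal sketch of an actor-critic update taken through an abstract MDP;
# the homomorphism map and critic would be trained jointly in practice.
import torch
import torch.nn as nn

def mlp(inp, out, hidden=128):
    return nn.Sequential(nn.Linear(inp, hidden), nn.ReLU(), nn.Linear(hidden, out))

class Homomorphism(nn.Module):
    """Maps actual states and actions to abstract (latent) states and actions."""
    def __init__(self, state_dim, action_dim, latent_state_dim, latent_action_dim):
        super().__init__()
        self.f = mlp(state_dim, latent_state_dim)                 # state map f(s)
        self.g = mlp(state_dim + action_dim, latent_action_dim)   # action map g_s(a)

    def forward(self, s, a):
        return self.f(s), self.g(torch.cat([s, a], dim=-1))

class AbstractCritic(nn.Module):
    """Q-function defined on abstract state-action pairs."""
    def __init__(self, latent_state_dim, latent_action_dim):
        super().__init__()
        self.q = mlp(latent_state_dim + latent_action_dim, 1)

    def forward(self, z, u):
        return self.q(torch.cat([z, u], dim=-1))

class Actor(nn.Module):
    """Deterministic policy acting in the actual (non-abstract) action space."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.pi = mlp(state_dim, action_dim)

    def forward(self, s):
        return torch.tanh(self.pi(s))

def homomorphic_policy_gradient_loss(actor, hom, critic, states):
    """Policy gradient taken through the abstract critic: gradients flow from
    Q(f(s), g(s, a)) back into the actual-space policy."""
    a = actor(states)
    z, u = hom(states, a)
    return -critic(z, u).mean()
```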


Hypernetworks for Zero-shot Transfer in Reinforcement Learning

arXiv.org Artificial Intelligence

In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objective and data from a set of near-optimal RL solutions for the training tasks. This work relates to meta RL, contextual RL, and transfer learning, with a particular focus on zero-shot performance at test time, enabled by knowledge of the task parameters (also known as the context). Our technical approach views each RL algorithm as a mapping from the MDP specification to the near-optimal value function and policy, and seeks to approximate this mapping with a hypernetwork that can generate near-optimal value functions and policies given the parameters of the MDP. We show that, under certain conditions, this mapping can be cast as a supervised learning problem. We empirically evaluate the effectiveness of our method for zero-shot transfer to new reward and transition dynamics on a series of continuous control tasks from the DeepMind Control Suite. Our method demonstrates significant improvements over baselines from multitask and meta RL approaches.
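A minimal sketch of the core idea, under the assumption of a small fully connected policy: a hypernetwork takes the task parameters (context) and emits the weights of a policy network, so a behaviour for an unseen task is produced in a single forward pass. The class name and layer sizes are illustrative, and the TD-based training objective described above is omitted.

```python
# Illustrative hypernetwork sketch: context -> policy weights -> action.
import torch
import torch.nn as nn

class PolicyHypernetwork(nn.Module):
    def __init__(self, context_dim, state_dim, action_dim, hidden=64):
        super().__init__()
        self.state_dim, self.hidden, self.action_dim = state_dim, hidden, action_dim
        n_params = (state_dim * hidden + hidden) + (hidden * action_dim + action_dim)
        self.generator = nn.Sequential(
            nn.Linear(context_dim, 256), nn.ReLU(),
            nn.Linear(256, n_params),
        )

    def forward(self, context, state):
        # Generate the target policy's weights from the task context.
        p = self.generator(context)
        i = 0
        w1 = p[i:i + self.state_dim * self.hidden].view(self.hidden, self.state_dim)
        i += self.state_dim * self.hidden
        b1 = p[i:i + self.hidden]; i += self.hidden
        w2 = p[i:i + self.hidden * self.action_dim].view(self.action_dim, self.hidden)
        i += self.hidden * self.action_dim
        b2 = p[i:i + self.action_dim]
        # Run the generated policy on the current state.
        h = torch.relu(state @ w1.t() + b1)
        return torch.tanh(h @ w2.t() + b2)

# Zero-shot use: given the parameters of a new MDP, produce an action directly.
hyper = PolicyHypernetwork(context_dim=3, state_dim=8, action_dim=2)
action = hyper(torch.randn(3), torch.randn(8))
```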


Continuous MDP Homomorphisms and Homomorphic Policy Gradient

arXiv.org Artificial Intelligence

Abstraction has been widely studied as a way to improve the efficiency and generalization of reinforcement learning algorithms. In this paper, we study abstraction in the continuous-control setting. We extend the definition of MDP homomorphisms to encompass continuous actions in continuous state spaces. We derive a policy gradient theorem on the abstract MDP, which allows us to leverage approximate symmetries of the environment for policy optimization. Based on this theorem, we propose an actor-critic algorithm that is able to learn the policy and the MDP homomorphism map simultaneously, using the lax bisimulation metric. We demonstrate the effectiveness of our method on benchmark tasks in the DeepMind Control Suite. Our method's ability to utilize MDP homomorphisms for representation learning leads to improved performance when learning from pixel observations.
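As a rough illustration of what learning the homomorphism map can look like in practice, the sketch below substitutes simple reward- and transition-equivariance losses for the lax bisimulation objective used in the paper; the encoders, abstract models, and all names are assumptions for exposition only.

```python
# Illustrative losses encouraging the abstract MDP to preserve one-step
# behaviour (rewards and transitions) of the actual MDP; not the paper's
# exact training objective.
import torch
import torch.nn as nn

def mlp(inp, out, hidden=128):
    return nn.Sequential(nn.Linear(inp, hidden), nn.ReLU(), nn.Linear(hidden, out))

class AbstractModel(nn.Module):
    """Reward and transition models defined on the abstract (latent) MDP."""
    def __init__(self, latent_state_dim, latent_action_dim):
        super().__init__()
        self.reward = mlp(latent_state_dim + latent_action_dim, 1)
        self.transition = mlp(latent_state_dim + latent_action_dim, latent_state_dim)

def homomorphism_loss(f, g, model, s, a, r, s_next):
    """f: state encoder, g: state-conditioned action encoder, model: AbstractModel."""
    z = f(s)
    u = g(torch.cat([s, a], dim=-1))
    zu = torch.cat([z, u], dim=-1)
    # The abstract reward should match the actual reward ...
    reward_loss = (model.reward(zu).squeeze(-1) - r).pow(2).mean()
    # ... and the abstract transition should land on the encoding of the next state.
    transition_loss = (model.transition(zu) - f(s_next).detach()).pow(2).mean()
    return reward_loss + transition_loss
```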


Learning Intuitive Physics with Multimodal Generative Models

arXiv.org Artificial Intelligence

Predicting the future interaction of objects when they come into contact with their environment is key for autonomous agents to take intelligent and anticipatory actions. This paper presents a perception framework that fuses visual and tactile feedback to make predictions about the expected motion of objects in dynamic scenes. Visual information captures object properties such as 3D shape and location, while tactile information provides critical cues about interaction forces and the resulting object motion when the object makes contact with the environment. Utilizing a novel See-Through-your-Skin (STS) sensor that provides high-resolution multimodal sensing of contact surfaces, our system captures both the visual appearance and the tactile properties of objects. We interpret the dual-stream signals from the sensor using a Multimodal Variational Autoencoder (MVAE), allowing us to capture both modalities of contacting objects and to develop a mapping from visual to tactile interaction and vice versa. Additionally, the perceptual system can be used to infer the outcome of future physical interactions, which we validate through simulated and real-world experiments in which the resting state of an object is predicted from given initial conditions.
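The following sketch shows one common way an MVAE can fuse two modalities, via a product-of-experts posterior over a shared latent from which either modality can be reconstructed; this is an illustrative assumption about the architecture (dimensions, names, and the fusion rule included), not the paper's model.

```python
# Illustrative product-of-experts multimodal VAE fusing visual and tactile inputs.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, inp_dim, latent_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(inp_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * latent_dim))
    def forward(self, x):
        mu, log_var = self.net(x).chunk(2, dim=-1)
        return mu, log_var

def product_of_experts(mus, log_vars):
    """Fuse Gaussian experts (plus a standard-normal prior expert)."""
    precisions = [torch.exp(-lv) for lv in log_vars] + [torch.ones_like(log_vars[0])]
    mus = mus + [torch.zeros_like(mus[0])]
    precision = sum(precisions)
    mu = sum(m * p for m, p in zip(mus, precisions)) / precision
    return mu, -torch.log(precision)

class MultimodalVAE(nn.Module):
    def __init__(self, visual_dim, tactile_dim, latent_dim=32):
        super().__init__()
        self.enc_visual = Encoder(visual_dim, latent_dim)
        self.enc_tactile = Encoder(tactile_dim, latent_dim)
        self.dec_visual = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                        nn.Linear(256, visual_dim))
        self.dec_tactile = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                         nn.Linear(256, tactile_dim))

    def forward(self, visual=None, tactile=None):
        # Either modality may be missing; the shared latent is inferred from
        # whichever experts are available.
        mus, log_vars = [], []
        if visual is not None:
            mu, lv = self.enc_visual(visual); mus.append(mu); log_vars.append(lv)
        if tactile is not None:
            mu, lv = self.enc_tactile(tactile); mus.append(mu); log_vars.append(lv)
        mu, log_var = product_of_experts(mus, log_vars)
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)  # reparameterization
        return self.dec_visual(z), self.dec_tactile(z), mu, log_var
```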