Ishfaq, Haque
Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning
Ishfaq, Haque, Wang, Guangyuan, Islam, Sami Nur, Precup, Doina
Existing actor-critic algorithms, which are popular for continuous control reinforcement learning (RL) tasks, suffer from poor sample efficiency due to the lack of a principled exploration mechanism. Motivated by the success of Thompson sampling for efficient exploration in RL, we propose a novel model-free RL algorithm, Langevin Soft Actor-Critic (LSAC), which prioritizes enhancing critic learning through uncertainty estimation over policy optimization. LSAC employs three key innovations: approximate Thompson sampling through distributional Langevin Monte Carlo (LMC) based $Q$ updates, parallel tempering for exploring multiple modes of the posterior of the $Q$ function, and diffusion-synthesized state-action samples regularized with $Q$ action gradients. Our extensive experiments demonstrate that LSAC outperforms or matches the performance of mainstream model-free RL algorithms for continuous control tasks. Notably, LSAC marks the first successful application of LMC-based Thompson sampling in continuous control tasks with continuous action spaces.
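The core critic update in LSAC can be illustrated with a single Langevin Monte Carlo step: instead of a plain gradient step on the critic loss, Gaussian noise scaled by the step size and an inverse temperature is injected, so repeated updates approximately sample critic parameters from their posterior and acting greedily against the sampled critic implements approximate Thompson sampling. The sketch below is a minimal illustration of such a step for one critic chain, not the authors' implementation; the names `critic`, `critic_loss`, and `inverse_temp` are ours, and the distributional critic, parallel tempering across chains, and diffusion-synthesized samples are omitted.

```python
# Minimal sketch (not the paper's code): one Langevin Monte Carlo (LMC)
# update of a critic network, i.e. gradient descent on the critic loss
# plus injected Gaussian noise for approximate posterior sampling.
import math
import torch

def lmc_critic_step(critic, critic_loss, step_size=1e-3, inverse_temp=1e4):
    """theta <- theta - eta * grad(loss) + sqrt(2 * eta / beta) * N(0, I)."""
    critic.zero_grad()
    critic_loss.backward()
    noise_scale = math.sqrt(2.0 * step_size / inverse_temp)
    with torch.no_grad():
        for param in critic.parameters():
            if param.grad is None:
                continue
            param -= step_size * param.grad                 # descent term
            param += noise_scale * torch.randn_like(param)  # Langevin noise
```

Running several such chains at different temperatures and occasionally swapping their states is one way to realize the parallel tempering component mentioned above.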
Offline Multitask Representation Learning for Reinforcement Learning
Ishfaq, Haque, Nguyen-Tang, Thanh, Feng, Songtao, Arora, Raman, Wang, Mengdi, Yin, Ming, Precup, Doina
We study offline multitask representation learning in reinforcement learning (RL), where a learner is provided with an offline dataset from different tasks that share a common representation and is asked to learn the shared representation. We theoretically investigate offline multitask low-rank RL, and propose a new algorithm called MORL for offline multitask representation learning. Furthermore, we examine downstream RL in reward-free, offline and online scenarios, where a new task is introduced to the agent that shares the same representation as the upstream offline tasks. Our theoretical results demonstrate the benefits of using the learned representation from the upstream offline task instead of directly learning the representation of the low-rank model.
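For context, the low-rank model referenced above factorizes the transition dynamics through a feature map that is shared across the upstream tasks; in standard notation (ours, not necessarily the paper's), task $t$ at step $h$ satisfies
\[
  P^{(t)}_h(s' \mid s, a) \;=\; \big\langle \phi_h(s, a),\, \mu^{(t)}_h(s') \big\rangle,
\]
where the representation $\phi_h : \mathcal{S} \times \mathcal{A} \to \mathbb{R}^d$ is common to all tasks while the factors $\mu^{(t)}_h$ are task specific. The learner's goal is to recover an approximation of $\phi$ from the offline datasets and reuse it in the downstream task.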
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo
Ishfaq, Haque, Lan, Qingfeng, Xu, Pan, Mahmood, A. Rupam, Precup, Doina, Anandkumar, Anima, Azizzadenesheli, Kamyar
We present a scalable and effective exploration strategy based on Thompson sampling for reinforcement learning (RL). One of the key shortcomings of existing Thompson sampling algorithms is the need to perform a Gaussian approximation of the posterior distribution, which is not a good surrogate in most practical settings. We instead directly sample the $Q$ function from its posterior distribution by using Langevin Monte Carlo, an efficient type of Markov chain Monte Carlo (MCMC) method. Our method only needs to perform noisy gradient descent updates to learn the exact posterior distribution of the $Q$ function, which makes our approach easy to deploy in deep RL. We provide a rigorous theoretical analysis for the proposed method and demonstrate that, in the linear Markov decision process (linear MDP) setting, it has a regret bound of $\tilde{O}(d^{3/2}H^{5/2}\sqrt{T})$, where $d$ is the dimension of the feature mapping, $H$ is the planning horizon, and $T$ is the total number of steps. We apply this approach to deep RL by using the Adam optimizer to perform gradient updates. Our approach achieves better or similar results compared with state-of-the-art deep RL algorithms on several challenging exploration tasks from the Atari57 suite.
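Written out, the noisy gradient descent update mentioned above is the standard Langevin Monte Carlo iteration (generic notation, which may differ from the paper's):
\[
  w_{k+1} \;=\; w_k \;-\; \eta_k \nabla L(w_k) \;+\; \sqrt{2 \eta_k \beta^{-1}}\, \epsilon_k,
  \qquad \epsilon_k \sim \mathcal{N}(0, I),
\]
where $L$ is the (regularized) least-squares loss of the $Q$ estimate, $\eta_k$ is the step size, and $\beta$ is an inverse temperature. As $k$ grows, the iterates are approximately distributed according to the Gibbs posterior $\propto \exp(-\beta L(w))$, so acting greedily with respect to the sampled $Q$ function yields approximate Thompson sampling without an explicit Gaussian approximation.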
Randomized Exploration for Reinforcement Learning with General Value Function Approximation
Ishfaq, Haque, Cui, Qiwen, Nguyen, Viet, Ayoub, Alex, Yang, Zhuoran, Wang, Zhaoran, Precup, Doina, Yang, Lin F.
We propose a model-free reinforcement learning algorithm inspired by the popular randomized least squares value iteration (RLSVI) algorithm (Osband et al., 2016b; Russo, 2019; Zanette et al., 2020a) as well as the optimism principle (Brafman & Tennenholtz, 2001; Jaksch et al., 2010; Jin et al., 2018; 2020; Wang et al., 2020). Unlike existing upper-confidence-bound (UCB) based approaches, which are often computationally intractable, our algorithm drives exploration by simply perturbing the training data with judiciously chosen i.i.d. noise. The resulting exploration strategy is efficient in both a statistical and a computational sense, and can be easily plugged into common RL algorithms, including UCB-VI.
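The perturbation idea can be sketched in a few lines: i.i.d. Gaussian noise is added to the value-regression targets before each fit, so every fitted $Q$ is a random draw of a plausible value function, and drawing several such fits and acting optimistically over them injects the optimism mentioned above. The snippet below is an illustrative sketch of this mechanism for the linear case, not the paper's algorithm; the function names, the ridge regression form, and the aggregation rule are our own choices.

```python
# Illustrative sketch: RLSVI-style exploration via perturbed regression targets.
import numpy as np

def perturbed_q_fit(features, rewards, next_values, gamma=0.99, sigma=1.0, reg=1.0):
    """Ridge regression of Q on Bellman targets corrupted with i.i.d. Gaussian noise."""
    targets = rewards + gamma * next_values
    noisy_targets = targets + sigma * np.random.randn(len(targets))  # i.i.d. perturbation
    d = features.shape[1]
    w = np.linalg.solve(features.T @ features + reg * np.eye(d),
                        features.T @ noisy_targets)
    return w  # Q(s, a) ~= phi(s, a) @ w

def optimistic_q(features, rewards, next_values, num_samples=10, **kwargs):
    """Draw several perturbed fits and act greedily w.r.t. their maximum."""
    ws = [perturbed_q_fit(features, rewards, next_values, **kwargs)
          for _ in range(num_samples)]
    return lambda phi: max(float(phi @ w) for w in ws)
```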
TVAE: Triplet-Based Variational Autoencoder using Metric Learning
Ishfaq, Haque, Hoogi, Assaf, Rubin, Daniel
Deep metric learning has been shown to be highly effective at learning semantic representations and encoding information that can be used to measure data similarity, by relying on the learned embedding. At the same time, the variational autoencoder (VAE) has been widely used for approximate inference and has shown good performance for directed probabilistic models. However, a traditional VAE does not exploit data label or feature information. Similarly, traditional representation learning approaches fail to represent many salient aspects of the data. In this project, we propose a novel integrated framework to learn the latent embedding of a VAE by incorporating deep metric learning. The features are learned by optimizing a triplet loss on the mean vectors of the VAE in conjunction with the standard evidence lower bound (ELBO) of the VAE. This approach, which we call the Triplet-Based Variational Autoencoder (TVAE), allows us to capture more fine-grained information in the latent embedding. Our model is tested on the MNIST dataset and achieves a triplet accuracy of 95.60%, while the traditional VAE (Kingma & Welling, 2013) achieves a triplet accuracy of 75.08%.
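The combined objective behind TVAE can be made concrete: the usual VAE terms (reconstruction loss plus KL divergence) are summed with a triplet margin loss computed on the encoder mean vectors of anchor, positive, and negative examples. The code below is a minimal sketch under these assumptions; the encoder/decoder interfaces and the `triplet_weight` coefficient are hypothetical, not taken from the paper.

```python
# Minimal sketch of a TVAE-style objective: standard ELBO terms plus a
# triplet margin loss on the encoder means (hypothetical interface).
import torch
import torch.nn.functional as F

def tvae_loss(encoder, decoder, anchor, positive, negative,
              margin=1.0, triplet_weight=1.0):
    mu_a, logvar_a = encoder(anchor)
    mu_p, _ = encoder(positive)
    mu_n, _ = encoder(negative)

    # Reparameterized sample and reconstruction for the anchor (ELBO part).
    z = mu_a + torch.exp(0.5 * logvar_a) * torch.randn_like(mu_a)
    recon = decoder(z)
    recon_loss = F.binary_cross_entropy(recon, anchor, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar_a - mu_a.pow(2) - logvar_a.exp())

    # Triplet margin loss on the mean vectors pulls same-class codes together.
    triplet = F.triplet_margin_loss(mu_a, mu_p, mu_n, margin=margin)

    return recon_loss + kl + triplet_weight * triplet
```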