Nath, Somjit
Behaviour Discovery and Attribution for Explainable Reinforcement Learning
Rishav, Rishav, Nath, Somjit, Michalski, Vincent, Kahou, Samira Ebrahimi
Explaining the decisions made by reinforcement learning (RL) agents is critical for building trust and ensuring reliability in real-world applications. Traditional approaches to explainability often rely on saliency analysis, which can be limited in providing actionable insights. Recently, there has been growing interest in attributing RL decisions to specific trajectories within a dataset. However, these methods often generalize explanations to long trajectories, potentially involving multiple distinct behaviors; providing multiple, more fine-grained explanations would often improve clarity. In this work, we propose a framework for behavior discovery and for attributing actions to those behaviors in offline RL trajectories. Our method identifies meaningful behavioral segments, enabling more precise and granular explanations associated with high-level agent behaviors. This approach is adaptable across diverse environments with minimal modifications, offering a scalable and versatile solution for behavior discovery and attribution in explainable RL.
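A hypothetical sketch of the two stages described above: cluster embedded state-action pairs from offline trajectories into behaviour segments, then attribute a single decision to its nearest behaviour. The embedding function `embed`, the use of k-means, and the number of behaviours are assumptions for illustration, not the paper's actual method.

```python
# Hypothetical sketch: behaviour discovery via clustering, then attribution of a
# decision to its nearest behaviour cluster. `embed`, k-means, and the cluster
# count are illustrative assumptions, not the paper's method.
import numpy as np
from sklearn.cluster import KMeans

def discover_behaviours(trajectories, embed, n_behaviours=8):
    """trajectories: list of sequences of (state, action) pairs;
    embed: maps a sequence of pairs to an (n, d) feature array."""
    feats = np.concatenate([embed(traj) for traj in trajectories], axis=0)
    return KMeans(n_clusters=n_behaviours, n_init=10).fit(feats)

def attribute_decision(kmeans, embed, state_action):
    """Return the index of the behaviour cluster this decision is attributed to."""
    return int(kmeans.predict(embed([state_action]))[0])
```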
Unsupervised Event Outlier Detection in Continuous Time
Nath, Somjit, Lui, Yik Chau, Liu, Siqi
Event sequence data record the occurrences of events in continuous time. Event sequence forecasting based on temporal point processes (TPPs) has been extensively studied, but outlier or anomaly detection, especially without any supervision from humans, is still underexplored. In this work, we develop, to the best of our knowledge, the first unsupervised outlier detection approach for detecting abnormal events. Our novel unsupervised outlier detection framework is based on ideas from generative adversarial networks (GANs) and reinforcement learning (RL). We train a 'generator' that corrects outliers in the data together with a 'discriminator' that learns to distinguish the corrected data from the real data, which may contain outliers. A key insight is that if the generator makes a mistake in the correction, it generates anomalies that differ from the anomalies in the real data, so its output serves as data augmentation for training the discriminator. Unlike typical GAN-based outlier detection approaches, our method employs the generator to detect outliers in an online manner. The experimental results show that our method can detect event outliers more accurately than state-of-the-art approaches.
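The sketch below illustrates the generator/discriminator interplay described above with placeholder networks: the generator proposes 'corrected' event sequences and the discriminator learns to tell them apart from the (possibly outlier-contaminated) real data. The (time, mark) encoding, the network sizes, and the plain differentiable GAN update in place of the paper's RL-based generator and online detection are simplifications of this sketch.

```python
# Schematic GAN-style loop for event-sequence correction; shapes and the
# differentiable generator update are simplifying assumptions.
import torch
import torch.nn as nn

gen = nn.GRU(input_size=2, hidden_size=32, batch_first=True)        # proposes corrections
head = nn.Linear(32, 2)                                             # corrected (time, mark)
disc = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1)) # real vs corrected

opt_g = torch.optim.Adam(list(gen.parameters()) + list(head.parameters()), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def train_step(real_events):                  # real_events: (B, T, 2), may contain outliers
    h, _ = gen(real_events)
    corrected = head(h)                       # generator's "corrected" sequence

    # Discriminator: tell real (possibly contaminated) events from corrected ones.
    d_real, d_fake = disc(real_events), disc(corrected.detach())
    d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: corrections should look like clean, real data.
    d_fake = disc(corrected)
    g_loss = bce(d_fake, torch.ones_like(d_fake))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```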
Spectral Temporal Contrastive Learning
Morin, Sacha, Nath, Somjit, Kahou, Samira Ebrahimi, Wolf, Guy
Learning useful data representations without requiring labels is a cornerstone of modern deep learning. Self-supervised learning methods, particularly contrastive learning (CL), have proven successful by leveraging data augmentations to define positive pairs. This success has prompted a number of theoretical studies to better understand CL and investigate theoretical bounds for downstream linear probing tasks. This work is concerned with the temporal contrastive learning (TCL) setting, common in RL and robotics contexts, where the sequential structure of the data is used instead to define positive pairs. In this paper, we adapt recent work on Spectral CL to formulate Spectral Temporal Contrastive Learning (STCL). We discuss a population loss based on a state graph derived from a time-homogeneous reversible Markov chain with uniform stationary distribution. The STCL loss allows us to connect linear probing performance to the spectral properties of the graph, and it can be estimated by treating previously observed data sequences as an ensemble of MCMC chains.
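As a concrete illustration, a mini-batch estimator of a spectral contrastive loss with temporal positive pairs (embeddings f_t and f_tp1 of consecutive observations within a sequence) could look as follows; the specific batch estimator of the repulsion term is an assumption of this sketch.

```python
# Sketch of a spectral contrastive loss over temporal positive pairs
# (consecutive observations); the mini-batch estimator of the repulsion term
# is an illustrative assumption.
import torch

def spectral_temporal_loss(f_t, f_tp1):
    """f_t, f_tp1: (B, d) embeddings of x_t and x_{t+1} from observed sequences."""
    # Attraction: embeddings of temporally adjacent states should align.
    pos = -2.0 * (f_t * f_tp1).sum(dim=1).mean()
    # Repulsion: penalise squared inner products across the batch.
    neg = (f_t @ f_tp1.T).pow(2).mean()
    return pos + neg
```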
Prioritizing Samples in Reinforcement Learning with Reducible Loss
Sujit, Shivakanth, Nath, Somjit, Braga, Pedro H. M., Kahou, Samira Ebrahimi
Most reinforcement learning algorithms take advantage of an experience replay buffer to repeatedly train on samples the agent has observed in the past. Not all samples carry the same significance, and simply assigning equal importance to each of them is a naive strategy. In this paper, we propose a method to prioritize samples based on how much we can learn from them. We define the learn-ability of a sample as the steady decrease of the training loss associated with this sample over time. We develop an algorithm that prioritizes samples with high learn-ability, while assigning lower priority to those that are hard to learn, typically because of noise or stochasticity. We empirically show that across multiple domains our method is more robust than random sampling and also better than prioritizing with respect to the training loss alone, i.e. the temporal difference loss, as is done in prioritized experience replay. The code to reproduce our experiments can be found here.
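A rough sketch of how such a priority could be computed for a DQN-style agent is given below: the priority is the current TD loss minus an estimate of the loss that cannot be reduced, so samples whose loss stays high regardless (noise, stochasticity) receive low priority. Approximating the irreducible loss with the target network's loss, the clamping, and the epsilon offset are assumptions of this sketch, not the paper's exact formula.

```python
# Rough sketch of reducible-loss prioritisation for a DQN-style agent; the
# irreducible-loss proxy, clamping, and epsilon offset are illustrative assumptions.
import torch
import torch.nn.functional as F

def reducible_loss_priority(online_q, target_q, batch, gamma=0.99, eps=1e-6):
    s, a, r, s_next, done = batch              # tensors sampled from the replay buffer
    with torch.no_grad():
        td_target = r + gamma * (1.0 - done) * target_q(s_next).max(dim=1).values
        online_loss = F.mse_loss(online_q(s).gather(1, a.unsqueeze(1)).squeeze(1),
                                 td_target, reduction='none')
        proxy_loss = F.mse_loss(target_q(s).gather(1, a.unsqueeze(1)).squeeze(1),
                                td_target, reduction='none')
    # High priority when the current loss exceeds the irreducible-loss proxy;
    # samples that stay high-loss under both networks look noisy and get ~eps.
    return (online_loss - proxy_loss).clamp(min=0.0) + eps
```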
Discovering Object-Centric Generalized Value Functions From Pixels
Nath, Somjit, Subbaraj, Gopeshh Raaj, Khetarpal, Khimya, Kahou, Samira Ebrahimi
Deep Reinforcement Learning has shown significant progress in extracting useful representations from high-dimensional inputs, albeit using hand-crafted auxiliary tasks and pseudo-rewards. Automatically learning such representations in an object-centric manner, geared towards control and fast adaptation, remains an open research problem. In this paper, we introduce a method that discovers meaningful features from objects, translating them into temporally coherent "question" functions and leveraging the subsequently learned general value functions for control. We compare our approach with state-of-the-art techniques alongside other ablations and show competitive performance in both stationary and non-stationary settings. Finally, we also investigate the discovered general value functions and, through qualitative analysis, show that the learned representations are not only interpretable but also centered around objects that are invariant to changes across tasks, facilitating fast adaptation.
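For illustration only, a general value function ("answer") paired with a cumulant-producing "question" network can be trained with a simple TD update as sketched below; the linear networks and the fixed question network are placeholder assumptions, not the paper's discovery mechanism.

```python
# Illustrative GVF TD update against cumulants from a placeholder "question" network.
import torch
import torch.nn as nn

obs_dim, n_cumulants = 64, 4
question_net = nn.Linear(obs_dim, n_cumulants)   # emits object-centric cumulants (pseudo-rewards)
answer_net = nn.Linear(obs_dim, n_cumulants)     # predicts the corresponding GVFs
opt = torch.optim.Adam(answer_net.parameters(), lr=1e-3)

def gvf_td_update(obs, next_obs, gamma=0.9):
    """obs, next_obs: (B, obs_dim) feature tensors from consecutive time steps."""
    with torch.no_grad():
        cumulants = question_net(next_obs)                    # the "questions"
        target = cumulants + gamma * answer_net(next_obs)     # one-step TD target
    loss = ((answer_net(obs) - target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```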
Follow your Nose: Using General Value Functions for Directed Exploration in Reinforcement Learning
Kalwar, Durgesh, Shelke, Omkar, Nath, Somjit, Meisheri, Hardik, Khadilkar, Harshad
Improving sample efficiency is a key challenge in reinforcement learning, especially in environments with large state spaces and sparse rewards. In the literature, this is resolved either through the use of auxiliary tasks (subgoals) or through clever exploration strategies. Exploration methods have been used to sample better trajectories in large environments, while auxiliary tasks have been incorporated where the reward is sparse. However, few studies have attempted to tackle large state spaces and reward sparsity at the same time. This paper explores the idea of combining exploration with auxiliary task learning using General Value Functions (GVFs) and a directed exploration strategy. We present a way to learn value functions that can be used to sample actions and provide directed exploration. Experiments on navigation tasks with varying grid sizes demonstrate the performance advantages over several competitive baselines.
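One plausible reading of GVF-directed exploration, sketched below, is to replace uniform-random exploratory actions with actions that are greedy with respect to a learned GVF; this action-selection rule is an illustrative assumption rather than the paper's exact strategy.

```python
# Illustrative assumption: exploratory actions follow a learned GVF instead of
# being sampled uniformly at random.
import numpy as np

def select_action(q_values, gvf_values, epsilon=0.1, rng=np.random):
    """q_values, gvf_values: arrays of shape (n_actions,)."""
    if rng.random() < epsilon:
        return int(np.argmax(gvf_values))    # explore in the direction the GVF suggests
    return int(np.argmax(q_values))          # otherwise act greedily on the task values
```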
Revisiting State Augmentation methods for Reinforcement Learning with Stochastic Delays
Nath, Somjit, Baranwal, Mayank, Khadilkar, Harshad
Several real-world scenarios, such as remote control and sensing, involve action and observation delays. The presence of delays degrades the performance of reinforcement learning (RL) algorithms, often to such an extent that algorithms fail to learn anything substantial. This paper formally describes the notion of Markov Decision Processes (MDPs) with stochastic delays and shows that delayed MDPs can be transformed into equivalent standard MDPs (without delays) with a significantly simplified cost structure. We employ this equivalence to derive a model-free Delay-Resolved RL framework and show that even a simple RL algorithm built upon this framework achieves near-optimal rewards in environments with stochastic delays in actions and observations. The delay-resolved deep Q-network (DRDQN) algorithm is benchmarked on a variety of environments comprising multi-step and stochastic delays, and it outperforms currently established algorithms both in achieving near-optimal rewards and in minimizing the associated computational overhead.
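The core state-augmentation construction can be illustrated as follows: the agent's effective state is the last observed state concatenated with the queue of actions whose effects are still pending, which restores the Markov property. The hypothetical wrapper interface, old-style Gym step signature, and fixed-length one-hot padding for stochastic delays are assumptions of this sketch.

```python
# Minimal sketch of state augmentation with the pending-action queue
# (hypothetical wrapper; padding and API details are illustrative).
import numpy as np
from collections import deque

class DelayResolvedWrapper:
    def __init__(self, env, max_delay, n_actions):
        self.env, self.max_delay, self.n_actions = env, max_delay, n_actions
        self.pending = deque(maxlen=max_delay)    # actions whose effects have not been observed yet

    def augment(self, obs):
        # Augmented state = last observed state + one-hot encoding of pending actions.
        pad = np.zeros((self.max_delay, self.n_actions))
        for i, a in enumerate(self.pending):
            pad[i, a] = 1.0
        return np.concatenate([np.asarray(obs, dtype=float).ravel(), pad.ravel()])

    def step(self, action):
        self.pending.append(action)               # oldest pending action drops out once applied
        obs, reward, done, info = self.env.step(action)
        return self.augment(obs), reward, done, info
```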
SIBRE: Self Improvement Based REwards for Adaptive Feedback in Reinforcement Learning
Nath, Somjit, Verma, Richa, Ray, Abhik, Khadilkar, Harshad
We propose a generic reward shaping approach for improving the rate of convergence in reinforcement learning (RL), called Self Improvement Based REwards, or SIBRE. The approach is designed for use in conjunction with any existing RL algorithm, and consists of rewarding improvement over the agent's own past performance. We prove that SIBRE converges in expectation under the same conditions as the original RL algorithm. The reshaped rewards help discriminate between policies when the original rewards are weakly discriminated or sparse. Experiments on several well-known benchmark environments with different RL algorithms show that SIBRE improves the rate of convergence of the underlying algorithms.
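A small sketch of the reshaping idea stated above: the agent is rewarded for improvement over a baseline of its own past performance. The running-average baseline and the episode-level reshaping are illustrative choices, not necessarily the paper's exact formulation.

```python
# SIBRE-style reshaping sketch: reward = improvement over the agent's own past
# performance; the running-average baseline is an illustrative assumption.
class SIBREReward:
    def __init__(self, beta=0.1):
        self.baseline = 0.0        # estimate of the agent's recent performance
        self.beta = beta           # step size for updating the baseline

    def reshape(self, episode_return):
        shaped = episode_return - self.baseline          # reward improvement over past self
        self.baseline += self.beta * (episode_return - self.baseline)
        return shaped
```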
Reinforcement Learning for Multi-Product Multi-Node Inventory Management in Supply Chains
Sultana, Nazneen N, Meisheri, Hardik, Baniwal, Vinita, Nath, Somjit, Ravindran, Balaraman, Khadilkar, Harshad
This paper describes the application of reinforcement learning (RL) to multi-product inventory management in supply chains. The problem description and solution are both adapted from a real-world business use case. The novelty of this problem with respect to the supply chain literature is that (i) we consider concurrent inventory management of a large number (50 to 1000) of products with shared capacity, (ii) we consider a multi-node supply chain consisting of a warehouse which supplies three stores, (iii) the warehouse, stores, and transportation from warehouse to stores have finite capacities, (iv) warehouse and store replenishment happen at different time scales and with realistic time lags, and (v) demand for products at the stores is stochastic. We describe a novel formulation in a multi-agent (hierarchical) reinforcement learning framework that can be used for parallelised decision-making, and we use the advantage actor-critic (A2C) algorithm with quantised action spaces to solve the problem. Experiments show that the proposed approach is able to handle a multi-objective reward comprising maximising product sales and minimising wastage of perishable products.
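As a small illustration of a quantised action space for replenishment, each product's discrete action index can be mapped to a fraction of its capacity to order; the number of quantisation levels and the per-product mapping below are assumptions for illustration, not the exact formulation used in the paper.

```python
# Illustrative quantised replenishment actions; levels and decomposition are assumptions.
import numpy as np

def quantised_replenishment(action_indices, capacities, n_levels=10):
    """action_indices: (n_products,) ints in [0, n_levels); capacities: (n_products,)."""
    fractions = np.asarray(action_indices, dtype=float) / (n_levels - 1)   # 0, 1/9, ..., 1
    return fractions * np.asarray(capacities, dtype=float)                 # order quantities per product
```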