DG16M: A Large-Scale Dataset for Dual-Arm Grasping with Force-Optimized Grasps
Karim, Md Faizal, Hashmi, Mohammed Saad, Bollimuntha, Shreya, Tapeti, Mahesh Reddy, Singh, Gaurav, Govindan, Nagamanikandan, Krishna, K Madhava
Dual-arm robotic grasping is crucial for handling large objects that require stable and coordinated manipulation. While single-arm grasping has been extensively studied, datasets tailored for dual-arm settings remain scarce. We introduce a large-scale dataset of 16 million dual-arm grasps, evaluated under improved force-closure constraints. Additionally, we develop a benchmark dataset containing 300 objects with approximately 30,000 grasps, evaluated in a physics simulation environment, providing a better grasp quality assessment for dual-arm grasp synthesis methods. Finally, we demonstrate the effectiveness of our dataset by training a Dual-Arm Grasp Classifier network that outperforms the state-of-the-art methods by 15%, achieving higher grasp success rates and improved generalization across objects.
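As an illustration of the kind of model such a dataset can supervise, here is a minimal sketch of a dual-arm grasp-pair classifier: a PointNet-style encoder summarizes the object point cloud, and a joint head scores a pair of end-effector poses. The architecture, layer sizes, and pose encoding below are assumptions for illustration, not the network from the paper.

```python
# Hypothetical dual-arm grasp-pair classifier; grasps are encoded as
# flattened 4x4 end-effector poses, the object as a point cloud.
import torch
import torch.nn as nn

class DualArmGraspClassifier(nn.Module):
    def __init__(self, pc_feat_dim=256, pose_dim=16):
        super().__init__()
        # Shared per-point MLP followed by max-pooling (PointNet-style).
        self.point_encoder = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, pc_feat_dim), nn.ReLU(),
        )
        # Joint head over the object feature and both grasp poses.
        self.head = nn.Sequential(
            nn.Linear(pc_feat_dim + 2 * pose_dim, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 1),  # logit for grasp-pair success
        )

    def forward(self, points, grasp_left, grasp_right):
        # points: (B, N, 3); grasp_*: (B, 4, 4) end-effector poses.
        feat = self.point_encoder(points).max(dim=1).values
        poses = torch.cat([grasp_left.flatten(1), grasp_right.flatten(1)], dim=1)
        return self.head(torch.cat([feat, poses], dim=1)).squeeze(-1)

model = DualArmGraspClassifier()
logits = model(torch.randn(8, 1024, 3),
               torch.eye(4).repeat(8, 1, 1), torch.eye(4).repeat(8, 1, 1))
loss = nn.functional.binary_cross_entropy_with_logits(
    logits, torch.randint(0, 2, (8,)).float())  # labels from the dataset
```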
MetricGold: Leveraging Text-To-Image Latent Diffusion Models for Metric Depth Estimation
Shah, Ansh, Krishna, K Madhava
Recovering metric depth from a single image remains a fundamental challenge in computer vision, requiring both scene understanding and accurate scaling. While deep learning has advanced monocular depth estimation, current models often struggle with unfamiliar scenes and layouts, particularly in zero-shot scenarios and when predicting metric depth at absolute scale. We present MetricGold, a novel approach that harnesses the rich priors of generative diffusion models to improve metric depth estimation. Building upon recent advances from Marigold, DDVM, and Depth Anything V2, our method combines latent diffusion, a log-scaled metric depth representation, and training on synthetic data. MetricGold trains efficiently in two days on a single RTX 3090, using photo-realistic synthetic data from Hypersim, Virtual KITTI, and TartanAir. Our experiments demonstrate robust generalization across diverse datasets, producing sharper and higher-quality metric depth estimates than existing approaches.
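A log-scaled metric depth representation of the kind the abstract describes can be sketched as follows: depth in metres is log-mapped and normalized to [-1, 1] so it can serve as a diffusion target alongside image latents, then inverted at inference. The depth bounds below are assumptions for illustration, not values from the paper.

```python
import numpy as np

D_MIN, D_MAX = 0.1, 80.0  # assumed valid metric depth range in metres

def encode_depth(depth_m):
    # Log-map, then normalize to [-1, 1] like the image latents.
    d = np.clip(depth_m, D_MIN, D_MAX)
    t = (np.log(d) - np.log(D_MIN)) / (np.log(D_MAX) - np.log(D_MIN))
    return 2.0 * t - 1.0

def decode_depth(x):
    # Invert the normalization and the log-map to recover metres.
    t = (np.clip(x, -1.0, 1.0) + 1.0) / 2.0
    return np.exp(t * (np.log(D_MAX) - np.log(D_MIN)) + np.log(D_MIN))

depth = np.array([0.5, 2.0, 10.0, 50.0])
assert np.allclose(decode_depth(encode_depth(depth)), depth)
```

The log-map allocates more of the normalized range to nearby depths, where small metric errors matter most, which is a common motivation for this representation.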
Imagine-2-Drive: High-Fidelity World Modeling in CARLA for Autonomous Vehicles
Garg, Anant, Krishna, K Madhava
In autonomous driving with image-based state spaces, accurately predicting future events and modeling diverse behavioral modes are essential for safety and effective decision-making. World model-based Reinforcement Learning (WMRL) approaches offer a promising solution by simulating future states from the current state and actions. However, the utility of world models is often limited because typical RL policies are restricted to deterministic actions or single Gaussian distributions; failing to capture the full spectrum of possible actions reduces their adaptability in complex, dynamic environments. In this work, we introduce Imagine-2-Drive, a framework with two components: VISTAPlan, a high-fidelity world model for accurate future prediction, and a Diffusion Policy Actor (DPA), a diffusion-based policy that models multi-modal behaviors for trajectory prediction. We use VISTAPlan to simulate and evaluate trajectories from DPA, and we train DPA with Denoising Diffusion Policy Optimization (DDPO) to maximize the cumulative reward over those trajectories. We analyze the benefits of each component, and of the framework as a whole, in CARLA using standard driving metrics. Owing to our two contributions, VISTAPlan and DPA, we significantly outperform state-of-the-art (SOTA) world models, improving Route Completion by 15% and Success Rate by 20%.
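A DDPO-style update can be sketched as treating each denoising step of the policy as an action and weighting its log-probability by the normalized trajectory return, REINFORCE-style. The tiny policy and the stand-in reward below are placeholders, not the authors' DPA or VISTAPlan; only the shape of the update rule is the point.

```python
import torch
import torch.nn as nn

class TinyDiffusionPolicy(nn.Module):
    def __init__(self, obs_dim=16, act_dim=4, steps=5):
        super().__init__()
        self.steps = steps
        self.net = nn.Linear(obs_dim + act_dim, act_dim)

    def sample_with_logprobs(self, obs):
        # Run a short denoising chain, recording per-step log-probs.
        x = torch.randn(obs.shape[0], 4)
        logps = []
        for _ in range(self.steps):
            mean = self.net(torch.cat([obs, x], dim=1))
            dist = torch.distributions.Normal(mean, 0.1)
            x = dist.sample()
            logps.append(dist.log_prob(x).sum(dim=1))
        return x, torch.stack(logps, dim=1)  # (B, act_dim), (B, T)

policy = TinyDiffusionPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
obs = torch.randn(32, 16)
actions, logprobs = policy.sample_with_logprobs(obs)
with torch.no_grad():                       # stand-in for world-model returns
    returns = -actions.pow(2).sum(dim=1)
adv = (returns - returns.mean()) / (returns.std() + 1e-8)
# REINFORCE over the denoising chain: every step shares the trajectory return.
loss = -(adv.unsqueeze(1) * logprobs).sum(dim=1).mean()
opt.zero_grad(); loss.backward(); opt.step()
```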
DA-VIL: Adaptive Dual-Arm Manipulation with Reinforcement Learning and Variable Impedance Control
Karim, Md Faizal, Bollimuntha, Shreya, Hashmi, Mohammed Saad, Das, Autrio, Singh, Gaurav, Sridhar, Srinath, Singh, Arun Kumar, Govindan, Nagamanikandan, Krishna, K Madhava
Dual-arm manipulation is an area of growing interest in the robotics community. Enabling robots to perform tasks that require the coordinated use of two arms is essential for complex manipulation such as handling large objects, assembling components, and performing human-like interactions. However, achieving effective dual-arm manipulation is challenging due to the need for precise coordination, dynamic adaptability, and the ability to manage interaction forces between the arms and the objects being manipulated. We propose a novel pipeline that combines the advantages of policy learning from environment feedback with gradient-based optimization to learn the controller gains required for the control outputs. This allows the robotic system to dynamically modulate its impedance in response to task demands, ensuring stability and dexterity in dual-arm operations. We evaluate our pipeline on a trajectory-tracking task involving a variety of large, complex objects with different masses and geometries, and compare its performance to three other established methods for controlling dual-arm robots, demonstrating superior results.
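The variable-impedance idea can be illustrated with a minimal task-space control law in which a policy modulates the stiffness gains and the damping follows from a critical-damping heuristic. The gain range and the dummy "policy output" are assumptions for illustration, not the paper's controller.

```python
import numpy as np

def impedance_wrench(x, x_dot, x_des, xd_des, kp):
    # Per-axis stiffness kp; damping chosen for critical damping of a
    # unit-mass system, a common default when only stiffness is learned.
    kd = 2.0 * np.sqrt(kp)
    return kp * (x_des - x) + kd * (xd_des - x_dot)

# An RL policy would output new gains at each control step; here we just
# clip a dummy policy output into an assumed stiffness range.
policy_output = np.array([300.0, 900.0, 120.0])
kp = np.clip(policy_output, 50.0, 800.0)
f = impedance_wrench(x=np.zeros(3), x_dot=np.zeros(3),
                     x_des=np.array([0.1, 0.0, 0.2]),
                     xd_des=np.zeros(3), kp=kp)
```

Low stiffness yields compliant behavior when the arms press against a shared object; high stiffness gives accurate tracking in free space, which is why modulating the gains online matters for dual-arm tasks.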
Constrained 6-DoF Grasp Generation on Complex Shapes for Improved Dual-Arm Manipulation
Singh, Gaurav, Kalwar, Sanket, Karim, Md Faizal, Sen, Bipasha, Govindan, Nagamanikandan, Sridhar, Srinath, Krishna, K Madhava
Efficiently generating grasp poses tailored to specific regions of an object is vital for various robotic manipulation tasks, especially in a dual-arm setup. This scenario is particularly challenging because of the complex geometries involved: generating grasps on the specified constrained regions requires a deep understanding of the local geometry. Existing methods explore only settings involving table-top/small objects and require augmented datasets to train, limiting their performance on complex objects. We propose CGDF: Constrained Grasp Diffusion Fields, a diffusion-based grasp generative model that generalizes to objects with arbitrary geometries and generates dense grasps on the target regions. CGDF uses a part-guided diffusion approach that achieves high sample efficiency in constrained grasping without explicitly training on massive constraint-augmented datasets. We provide qualitative and quantitative comparisons, using analytical metrics and simulation, in both unconstrained and constrained settings, showing that our method generalizes to generate stable grasps on complex objects, which is especially useful for dual-arm manipulation, while existing methods struggle to do so.
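One way to picture part-guided diffusion is a reverse-diffusion loop in which each grasp sample is denoised once against features of the full object and once against features of the target part, with the two predictions blended. The toy denoiser, the 7-D pose parameterization (translation + quaternion), and the blend weight w are all illustrative assumptions, not CGDF's actual SE(3) formulation.

```python
import torch

def toy_denoiser(x, t, cond):
    # Stand-in for a learned noise model conditioned on geometry features.
    return x - cond  # pulls samples toward the conditioning vector

@torch.no_grad()
def sample_constrained_grasps(denoiser, obj_feat, part_feat,
                              n=32, steps=100, w=0.7):
    x = torch.randn(n, 7)  # grasp pose: translation + quaternion
    for t in reversed(range(steps)):
        eps_obj = denoiser(x, t, cond=obj_feat)    # stability on full shape
        eps_part = denoiser(x, t, cond=part_feat)  # focus on target region
        eps = (1.0 - w) * eps_obj + w * eps_part   # part-guided blend
        x = x - eps / steps                        # crude Euler update
        if t > 0:
            x = x + 0.01 * torch.randn_like(x)     # keep stochasticity
    return x

grasps = sample_constrained_grasps(toy_denoiser, torch.zeros(7), torch.ones(7))
```

The appeal of guiding at sampling time is exactly what the abstract claims: the constraint is imposed without retraining on constraint-augmented data.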
DiffPrompter: Differentiable Implicit Visual Prompts for Semantic-Segmentation in Adverse Conditions
Kalwar, Sanket, Ungarala, Mihir, Jain, Shruti, Monis, Aaron, Konda, Krishna Reddy, Garg, Sourav, Krishna, K Madhava
Semantic segmentation in adverse weather scenarios is a critical task for autonomous driving systems. While foundation models have shown promise, the need for specialized adaptors becomes evident when handling more challenging scenarios. We introduce DiffPrompter, a novel differentiable visual and latent prompting mechanism aimed at expanding the learning capabilities of existing adaptors in foundation models. Our proposed $\nabla$HFC image processing block excels particularly in adverse weather conditions, where conventional methods often fall short. Furthermore, we investigate the advantages of jointly training visual and latent prompts, demonstrating that this combined approach significantly enhances performance in out-of-distribution scenarios. Our differentiable visual prompts leverage parallel and series architectures to generate prompts, effectively improving object segmentation tasks in adverse conditions. Through a comprehensive series of experiments and evaluations, we provide empirical evidence to support the efficacy of our approach. Project page at https://diffprompter.github.io.
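The abstract does not spell out the $\nabla$HFC block's internals, but a differentiable high-frequency component extractor can be sketched as the image minus a learnable Gaussian blur, so fog- and rain-degraded low frequencies are suppressed while edges survive as the prompt. The kernel size and initialization below are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HFCBlock(nn.Module):
    """Differentiable high-pass: image minus a learnable Gaussian blur."""
    def __init__(self, ksize=11):
        super().__init__()
        self.ksize = ksize
        self.log_sigma = nn.Parameter(torch.tensor(0.5))  # learnable width

    def forward(self, img):  # img: (B, C, H, W)
        k, c = self.ksize, img.shape[1]
        ax = torch.arange(k, device=img.device, dtype=img.dtype) - k // 2
        g = torch.exp(-(ax ** 2) / (2 * self.log_sigma.exp() ** 2))
        g = (g / g.sum()).view(1, 1, 1, k)
        w_h = g.expand(c, 1, 1, k).contiguous()     # horizontal pass
        w_v = w_h.transpose(2, 3).contiguous()      # vertical pass
        blur = F.conv2d(img, w_h, padding=(0, k // 2), groups=c)
        blur = F.conv2d(blur, w_v, padding=(k // 2, 0), groups=c)
        return img - blur  # high-frequency residual used as a prompt

hfc = HFCBlock()
out = hfc(torch.randn(2, 3, 64, 64))  # same shape, edges emphasized
```

Because the blur width is a parameter, the band that gets suppressed can be trained end-to-end with the rest of the adaptor, which is what "differentiable" buys over a fixed filter.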
DiGrad: Multi-Task Reinforcement Learning with Shared Actions
Dewangan, Parijat, Phaniteja, S, Krishna, K Madhava, Sarkar, Abhishek, Ravindran, Balaraman
Most reinforcement learning algorithms are inefficient at learning multiple tasks in complex robotic systems where different tasks share a set of actions. In such environments, a compound policy with shared neural-network parameters may be learnt to perform multiple tasks concurrently. However, such a compound policy may become biased towards one task, or the gradients from different tasks may negate each other, making learning unstable and sometimes less data-efficient. In this paper, we propose a new approach for the simultaneous training of multiple tasks sharing a set of common actions in continuous action spaces, which we call DiGrad (Differential Policy Gradient). The proposed framework is based on differential policy gradients and can accommodate multi-task learning in a single actor-critic network. We also propose a simple heuristic in the differential policy gradient update to further improve learning. The proposed architecture was tested on an 8-link planar manipulator and a 27 degrees-of-freedom (DoF) humanoid, learning multi-goal reachability tasks for 3 and 2 end effectors respectively. We show that our approach supports efficient multi-task learning in complex robotic systems, outperforming related methods in continuous action spaces.
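The core idea, a single shared actor updated with the accumulated per-task policy gradients rather than one policy per task, can be sketched as below. The toy two-task setup, network sizes, and per-task critics are stand-ins for illustration, not the paper's manipulator experiments or its exact update heuristic.

```python
import torch
import torch.nn as nn

actor = nn.Linear(10, 4)                             # shared policy, 4 actions
critics = [nn.Linear(10 + 4, 1) for _ in range(2)]   # one critic per task
opt = torch.optim.Adam(actor.parameters(), lr=1e-3)

state = torch.randn(32, 10)
action = actor(state)
# Differential update: each task's deterministic policy gradient flows
# through its own critic and accumulates on the shared actor parameters.
loss = -sum(q(torch.cat([state, action], dim=1)).mean() for q in critics)
opt.zero_grad()
loss.backward()
opt.step()
```

Summing the task gradients on the shared parameters is what allows a single actor-critic network to serve all tasks; the paper's heuristic additionally shapes this update to keep one task from dominating.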