


LatentGAN Autoencoder: Learning Disentangled Latent Distribution

arXiv.org Artificial Intelligence

Generative models like GANs (Goodfellow et al. 2014) and VAEs (Kingma and Welling 2014) have shown remarkable progress in recent years. Generative adversarial networks have shown state-of-the-art performance in a variety of tasks such as image-to-image translation (Isola et al. 2018), video prediction (Liang et al. 2017), text-to-image translation (Zhang et al. 2017), drug discovery (Hong et al. 2019), and privacy preservation (Shi et al. 2018). VAEs have shown state-of-the-art performance in tasks such as image generation (Gregor et al. 2015) and semi-supervised learning (Maaløe et al. 2016). In this work, we present a new way to learn control over an autoencoder's latent distribution with the help of an AAE (Makhzani et al. 2016), which approximates the posterior of the autoencoder's latent distribution using any arbitrary prior distribution, together with the objective of (Chen et al. 2016) for learning disentangled representations. The previous work of (Wang, Peng, and Ko 2019) used a similar method of learning the latent prior with an AAE, along with a perceptual loss and an information-maximization regularizer, to train the decoder with the help of an extra discriminator.
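The core mechanism the abstract describes is the AAE-style adversarial match between the encoder's latent codes and an arbitrary prior. Below is a minimal PyTorch sketch of that half only; the module sizes, the N(0, I) prior, and the toy batch are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Minimal AAE-style sketch (illustrative sizes, N(0, I) prior, toy data;
# not the paper's actual architecture).
LATENT_DIM = 32

encoder = nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, LATENT_DIM))
decoder = nn.Sequential(nn.Linear(LATENT_DIM, 512), nn.ReLU(), nn.Linear(512, 784))
latent_disc = nn.Sequential(nn.Linear(LATENT_DIM, 256), nn.ReLU(), nn.Linear(256, 1))

bce = nn.BCEWithLogitsLoss()
mse = nn.MSELoss()

def aae_losses(x):
    """Reconstruction loss plus the adversarial game on latent codes:
    the discriminator separates prior samples from encodings, and the
    encoder tries to fool it, pushing q(z) toward the chosen prior p(z)."""
    z = encoder(x)
    recon = mse(decoder(z), x)

    z_prior = torch.randn_like(z)        # arbitrary prior; N(0, I) here
    d_real = latent_disc(z_prior)
    d_fake = latent_disc(z.detach())     # detach: don't update encoder here
    d_loss = bce(d_real, torch.ones_like(d_real)) + \
             bce(d_fake, torch.zeros_like(d_fake))

    g_out = latent_disc(z)               # encoder-as-generator loss
    g_loss = bce(g_out, torch.ones_like(g_out))
    return recon, d_loss, g_loss

x = torch.rand(16, 784)                  # toy batch of flattened images
recon, d_loss, g_loss = aae_losses(x)
print(recon.item(), d_loss.item(), g_loss.item())
```

On top of this, a disentanglement term in the spirit of Chen et al. (2016) would maximize mutual information between selected latent codes and the decoder output, typically via an auxiliary head that recovers those codes.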


Constrained 6-DoF Grasp Generation on Complex Shapes for Improved Dual-Arm Manipulation

arXiv.org Artificial Intelligence

Efficiently generating grasp poses tailored to specific regions of an object is vital for various robotic manipulation tasks, especially in a dual-arm setup. This scenario presents a significant challenge due to the complex geometries involved, requiring a deep understanding of the local geometry to generate grasps efficiently on the specified constrained regions. Existing methods explore only table-top settings with small objects and require augmented datasets to train, limiting their performance on complex objects. We propose CGDF: Constrained Grasp Diffusion Fields, a diffusion-based grasp generative model that generalizes to objects with arbitrary geometries and generates dense grasps on the target regions. CGDF uses a part-guided diffusion approach that achieves high sample efficiency in constrained grasping without explicitly training on massive constraint-augmented datasets. We provide qualitative and quantitative comparisons, using analytical metrics and simulation, in both unconstrained and constrained settings, showing that our method can generalize to generate stable grasps on complex objects, which is especially useful for dual-arm manipulation, while existing methods struggle to do so.
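As a rough illustration of what "part-guided" generation can mean, the sketch below refines candidate grasp poses with Langevin-style updates on a mixture of an object-level and a part-level energy. The `energy` function, the 6-D pose vector, and the mixing weight `w` are hypothetical placeholders; CGDF's actual diffusion model, learned features, and SE(3) handling are defined in the paper.

```python
import torch

# Hypothetical sketch of part-guided sampling for grasp poses. `energy` is a
# toy stand-in for a learned grasp energy (lower is better), not CGDF's model.

def energy(pose, points):
    """Toy energy: distance of the grasp position to the point-cloud centroid."""
    return ((pose[..., :3] - points.mean(dim=0)) ** 2).sum(-1)

def part_guided_refine(pose, obj_points, part_points, steps=100, lr=1e-2, w=0.5):
    """Langevin-style refinement on a mix of object- and part-level energies,
    so samples stay plausible globally while concentrating on the target part."""
    pose = pose.clone().requires_grad_(True)
    for t in range(steps):
        e = (1 - w) * energy(pose, obj_points) + w * energy(pose, part_points)
        (grad,) = torch.autograd.grad(e.sum(), pose)
        with torch.no_grad():
            pose -= lr * grad                                       # descend the energy
            pose += 0.1 * (1 - t / steps) * torch.randn_like(pose)  # annealed noise
    return pose.detach()

obj_points = torch.rand(2048, 3)   # full object point cloud
part_points = obj_points[:256]     # points on the constrained target region
init_poses = torch.randn(8, 6)     # 8 candidates: translation + rotation vector
grasps = part_guided_refine(init_poses, obj_points, part_points)
print(grasps.shape)                # torch.Size([8, 6])
```

The design intuition is that the part-level term steers samples toward the constrained region while the object-level term keeps them consistent with the overall geometry, avoiding the need for constraint-augmented training data.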


DiffPrompter: Differentiable Implicit Visual Prompts for Semantic-Segmentation in Adverse Conditions

arXiv.org Artificial Intelligence

Semantic segmentation in adverse weather scenarios is a critical task for autonomous driving systems. While foundation models have shown promise, the need for specialized adaptors becomes evident for handling more challenging scenarios. We introduce DiffPrompter, a novel differentiable visual and latent prompting mechanism aimed at expanding the learning capabilities of existing adaptors in foundation models. Our proposed $\nabla$HFC image processing block excels particularly in adverse weather conditions, where conventional methods often fall short. Furthermore, we investigate the advantages of jointly training visual and latent prompts, demonstrating that this combined approach significantly enhances performance in out-of-distribution scenarios. Our differentiable visual prompts leverage parallel and series architectures to generate prompts, effectively improving object segmentation tasks in adverse conditions. Through a comprehensive series of experiments and evaluations, we provide empirical evidence to support the efficacy of our approach. Project page at https://diffprompter.github.io.
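For intuition, a differentiable high-frequency extraction can be as simple as subtracting a Gaussian-blurred copy of the image, and the sketch below shows that common pattern. It is only a stand-in: the paper's $\nabla$HFC block is its own design, and the kernel size and sigma here are arbitrary choices.

```python
import torch
import torch.nn.functional as F

# Illustrative stand-in for a differentiable high-frequency extractor: the
# high-frequency component is the image minus a Gaussian-blurred (low-pass)
# copy. Not the paper's actual $\nabla$HFC block.

def gaussian_kernel(size=5, sigma=1.5):
    coords = torch.arange(size).float() - size // 2
    g = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
    k = torch.outer(g, g)
    return (k / k.sum()).view(1, 1, size, size)

def hfc(image, size=5, sigma=1.5):
    """image: (B, C, H, W) -> high-frequency residual of the same shape."""
    c = image.shape[1]
    k = gaussian_kernel(size, sigma).to(image).repeat(c, 1, 1, 1)
    low = F.conv2d(image, k, padding=size // 2, groups=c)  # depthwise low-pass
    return image - low                                     # edges and texture

x = torch.rand(2, 3, 64, 64, requires_grad=True)
prompt = hfc(x)            # could serve as a visual prompt for an adaptor
prompt.mean().backward()   # gradients flow end to end
print(x.grad.shape)
```

Because every operation is differentiable, such a block can sit inside the prompt-generation path and be trained jointly with latent prompts, which is the joint training the abstract highlights.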