
Collaborating Authors

 Pandey, Gaurav


Learning Autonomy: Off-Road Navigation Enhanced by Human Input

arXiv.org Artificial Intelligence

Successfully navigating off-road environments requires leveraging both visual and geometric features effectively. Modeling tire-terrain interactions and vehicle dynamics across diverse off-road conditions is a complex task. Even with accurate models, tuning the planning algorithm to navigate safely across different terrains demands extensive time and expertise. In our research, we introduce a demonstration-based local planning algorithm that bypasses the need for directly modeling these intricate dynamic interactions. Instead, it learns navigation preferences from human driving data, demonstrating the ability to adapt these learned behaviors from simulation to real vehicles with minimal manual adjustment. Our approach uses utility functions to extract key features directly from segmented images and learns human driving behavior from demonstration data. This diverges from traditional methods, which typically require either extensive labeled data for end-to-end learning or the precise sensor calibration and global mapping of classical robotics approaches.
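
As a rough illustration of the utility-function idea, the sketch below scores candidate paths by summing per-class terrain costs over a segmented image. The class labels, cost weights, and candidate paths are hypothetical stand-ins; in the actual method the preferences are learned from human demonstrations rather than hand-set.

    # Hypothetical sketch of a utility-based local planner over a segmented image.
    # Class IDs and traversability weights are illustrative assumptions; the paper
    # learns such preferences from human driving data instead of fixing them.
    import numpy as np

    TRAVERSABILITY = {0: 0.1, 1: 0.5, 2: 1.0, 3: 5.0}  # e.g. trail, grass, rocks, obstacle

    def path_utility(seg_image, path_pixels, weights=TRAVERSABILITY):
        """Sum per-pixel terrain costs along a candidate path (lower is better)."""
        return sum(weights[seg_image[r, c]] for r, c in path_pixels)

    def select_path(seg_image, candidate_paths):
        """Choose the candidate whose terrain cost is minimal."""
        return min(candidate_paths, key=lambda p: path_utility(seg_image, p))

    # Toy usage: a 100x100 segmentation with two straight candidate paths.
    seg = np.zeros((100, 100), dtype=int)
    seg[:, 60:] = 3                                   # right half is "obstacle"
    left  = [(r, 20) for r in range(100)]
    right = [(r, 80) for r in range(100)]
    best = select_path(seg, [left, right])
    assert best is left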


Systematic Knowledge Injection into Large Language Models via Diverse Augmentation for Domain-Specific RAG

arXiv.org Artificial Intelligence

Retrieval-Augmented Generation (RAG) has emerged as a prominent method for incorporating domain knowledge into Large Language Models (LLMs). While RAG enhances response relevance by incorporating retrieved domain knowledge in the context, retrieval errors can still lead to hallucinations and incorrect answers. To recover from retriever failures, domain knowledge is injected by fine-tuning the model to generate the correct response even in the case of retrieval errors. However, we observe that without systematic knowledge augmentation, fine-tuned LLMs may memorize new information but still fail to extract the relevant domain knowledge, leading to poor performance. In this work, we present a novel framework that significantly enhances the fine-tuning process by augmenting the training data in two ways: context augmentation and knowledge paraphrasing. In context augmentation, we create multiple training samples for a given QA pair by varying the relevance of the retrieved information, teaching the model when to ignore and when to rely on retrieved content. In knowledge paraphrasing, we fine-tune with multiple answers to the same question, enabling LLMs to better internalize specialized knowledge. To mitigate catastrophic forgetting due to fine-tuning, we add a domain-specific identifier to each question and also utilize a replay buffer containing general QA pairs. Experimental results demonstrate the efficacy of our method over existing techniques, achieving up to 10% relative gain in token-level recall while preserving the LLM's generalization capabilities.
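
A minimal sketch of the two augmentations, assuming illustrative field names, a hypothetical domain tag, and an external source of answer paraphrases:

    # Sketch of the data augmentations described above; field names, the domain
    # tag, and the paraphrase source are illustrative assumptions.
    import random

    def context_augment(question, answer, gold_ctx, distractor_ctxs, domain="ACME-DOCS"):
        """One QA pair -> several samples with varying retrieval quality, so the
        model learns when to rely on and when to ignore the retrieved context."""
        samples = [{"q": f"[{domain}] {question}", "ctx": gold_ctx, "a": answer}]
        for d in distractor_ctxs:                       # retrieval-failure cases
            samples.append({"q": f"[{domain}] {question}", "ctx": d, "a": answer})
        samples.append({"q": f"[{domain}] {question}", "ctx": "", "a": answer})
        return samples

    def paraphrase_augment(samples, paraphrases):
        """Pair each sample with every answer paraphrase (knowledge paraphrasing)."""
        return [dict(s, a=p) for s in samples for p in paraphrases]

    def mix_with_replay(domain_samples, general_qa, replay_ratio=0.2):
        """Blend in general QA pairs to mitigate catastrophic forgetting."""
        k = int(len(domain_samples) * replay_ratio)
        return domain_samples + random.sample(general_qa, min(k, len(general_qa)))

    # Toy usage: 3 context variants x 2 paraphrases = 6 fine-tuning samples.
    samples = context_augment("What is the X1 torque spec?", "35 Nm",
                              "gold passage", ["unrelated passage"])
    print(len(paraphrase_augment(samples, ["35 Nm", "Torque to 35 newton-metres"])))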


Longitudinal Ensemble Integration for sequential classification with multimodal data

arXiv.org Artificial Intelligence

Effectively modeling multimodal longitudinal data is a pressing need in various application areas, especially biomedicine. Despite this, few approaches exist in the literature for this problem, with most not adequately taking into account the multimodality of the data. In this study, we developed multiple configurations of a novel multimodal and longitudinal learning framework, Longitudinal Ensemble Integration (LEI), for sequential classification. We evaluated LEI's performance, and compared it against existing approaches, for the early detection of dementia, which is among the most studied multimodal sequential classification tasks. LEI outperformed these approaches due to its use of intermediate base predictions arising from the individual data modalities, which enabled their better integration over time. LEI's design also enabled the identification of features that were consistently important across time for the effective prediction of dementia-related diagnoses. Overall, our work demonstrates the potential of LEI for sequential classification from longitudinal multimodal data.

Data that are both longitudinal/temporal and multimodal are increasingly being used in combination with machine learning for forecasting, especially in medical diagnosis (Brand et al., 2019; Zhang & Shen, 2012; Feis et al., 2019; Li et al., 2023). Recently, a number of promising approaches for sequential classification from such data have been introduced (Eslami et al., 2023; Zhang et al., 2011; Wang et al., 2016; Zhang et al., 2024). For instance, some approaches have used recurrent neural network (RNN)-based models applied to data sequences where the modalities at each time point have been concatenated into a long feature vector, sometimes referred to as early fusion (Nguyen et al., 2020; Olaimat et al., 2023; Maheux et al., 2023).
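
The following sketch captures the two-stage structure described above, with out-of-fold base predictions per modality and time point feeding a final integrator. Logistic regression stands in for LEI's sequential model, and the data are synthetic, purely to keep the example self-contained.

    # Schematic two-stage pipeline in the spirit of LEI: base learners produce
    # per-modality predictions at each time point, and a final model integrates
    # them across time. The sequential model LEI actually uses is replaced here
    # with logistic regression for brevity.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(0)
    n, T = 200, 3                                  # subjects, time points
    X = {("imaging", t): rng.normal(size=(n, 10)) for t in range(T)}
    X.update({("clinical", t): rng.normal(size=(n, 5)) for t in range(T)})
    y = rng.integers(0, 2, size=n)                 # e.g. dementia diagnosis label

    # Stage 1: out-of-fold base predictions per modality and time point.
    base_preds = [
        cross_val_predict(RandomForestClassifier(n_estimators=50, random_state=0),
                          X[key], y, cv=5, method="predict_proba")[:, 1]
        for key in sorted(X)
    ]
    Z = np.column_stack(base_preds)                # n x (modalities * T) meta-features

    # Stage 2: integrate the intermediate predictions over time.
    final = LogisticRegression().fit(Z, y)
    print("training accuracy:", final.score(Z, y))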


Toward Physics-Aware Deep Learning Architectures for LiDAR Intensity Simulation

arXiv.org Artificial Intelligence

Autonomous vehicles (AVs) rely heavily on LiDAR perception for environment understanding and navigation. LiDAR intensity provides valuable information about the reflected laser signals and plays a crucial role in enhancing the perception capabilities of AVs. However, accurately simulating LiDAR intensity remains a challenge due to the unavailability of material properties of the objects in the environment and the complex interactions between the laser beam and the environment. The proposed method aims to improve the accuracy of intensity simulation by incorporating physics-based modalities within the deep learning framework. One of the key quantities that captures the interaction between the laser beam and an object is the angle of incidence. In this work, we demonstrate that adding the LiDAR incidence angle as a separate input to deep neural networks significantly enhances the results. We present a comparative study between two prominent deep learning architectures: U-Net, a Convolutional Neural Network (CNN), and Pix2Pix, a Generative Adversarial Network (GAN). We implemented these two architectures for the intensity prediction task and used the SemanticKITTI and VoxelScape datasets for experiments. The comparative analysis reveals that both architectures benefit from the incidence angle as an additional input. Moreover, the Pix2Pix architecture outperforms U-Net, especially when the incidence angle is incorporated.
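
A small sketch of the incidence-angle input described above, assuming per-pixel surface normals are available from an upstream estimation step; the range-image layout is an illustrative choice.

    # Physics-aware input sketch: the incidence angle between each LiDAR ray
    # and the local surface normal, appended as an extra network input channel.
    import numpy as np

    def incidence_angles(ray_dirs, normals):
        """Angle (radians) between unit ray directions and unit surface normals."""
        cos_theta = np.abs(np.sum(ray_dirs * normals, axis=-1))
        return np.arccos(np.clip(cos_theta, 0.0, 1.0))

    def add_angle_channel(range_image, ray_dirs, normals):
        """Stack range and incidence angle into an (H, W, 2) network input."""
        theta = incidence_angles(ray_dirs, normals)
        return np.stack([range_image, theta], axis=-1)

    H, W = 64, 1024                                   # typical range-image size
    rays = np.zeros((H, W, 3)); rays[..., 2] = 1.0    # toy: all rays along +z
    norms = np.zeros((H, W, 3)); norms[..., 2] = 1.0  # toy: flat surface facing rays
    x = add_angle_channel(np.ones((H, W)), rays, norms)
    assert x.shape == (H, W, 2) and np.allclose(x[..., 1], 0.0)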


3DGS-ReLoc: 3D Gaussian Splatting for Map Representation and Visual ReLocalization

arXiv.org Artificial Intelligence

This paper presents a novel system designed for 3D mapping and visual relocalization using 3D Gaussian Splatting. Our proposed method uses LiDAR and camera data to create accurate and visually plausible representations of the environment. By leveraging LiDAR data to initiate the training of the 3D Gaussian Splatting map, our system constructs maps that are both detailed and geometrically accurate. To mitigate excessive GPU memory usage and facilitate rapid spatial queries, we employ a combination of a 2D voxel map and a KD-tree. This preparation makes our method well-suited for visual localization tasks, enabling efficient identification of correspondences between the query image and the rendered image from the Gaussian Splatting map via normalized cross-correlation (NCC). Additionally, we refine the camera pose of the query image using feature-based matching and the Perspective-n-Point (PnP) technique. The effectiveness, adaptability, and precision of our system are demonstrated through extensive evaluation on the KITTI360 dataset.
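
The sketch below illustrates the two matching steps, NCC-based candidate selection and PnP pose refinement via OpenCV. The Gaussian Splatting renderer and the feature matcher that produces 2D-3D correspondences are assumed to exist upstream.

    # NCC scoring of rendered candidates, then pose refinement with OpenCV's
    # standard PnP solver; inputs are assumed to come from the splatting map.
    import cv2
    import numpy as np

    def ncc(a, b):
        """Normalized cross-correlation between two equally sized grayscale images."""
        a = (a - a.mean()) / (a.std() + 1e-8)
        b = (b - b.mean()) / (b.std() + 1e-8)
        return float((a * b).mean())

    def best_render(query, rendered_candidates):
        """Pick the rendered view most similar to the query image."""
        return max(range(len(rendered_candidates)),
                   key=lambda i: ncc(query, rendered_candidates[i]))

    def refine_pose(points_3d, points_2d, K):
        """Refine camera pose from 2D-3D matches (needs >= 4 correspondences)."""
        ok, rvec, tvec = cv2.solvePnP(points_3d.astype(np.float64),
                                      points_2d.astype(np.float64), K, None)
        if not ok:
            return None, None
        return rvec, tvec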


BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback

arXiv.org Artificial Intelligence

Following the success of Proximal Policy Optimization (PPO) for Reinforcement Learning from Human Feedback (RLHF), new techniques such as Sequence Likelihood Calibration (SLiC) and Direct Preference Optimization (DPO) have been proposed that are offline in nature and use rewards in an indirect manner. These techniques, in particular DPO, have recently become the tools of choice for LLM alignment due to their scalability and performance. However, they leave behind important features of the PPO approach. Methods such as SLiC or RRHF use the Reward Model (RM) only for ranking/preference, losing fine-grained information and ignoring the parametric form of the RM (e.g., Bradley-Terry, Plackett-Luce), while methods such as DPO do not even use a separate reward model. In this work, we propose a novel approach, named BRAIn, that re-introduces the RM as part of a distribution-matching approach. BRAIn considers the LLM distribution conditioned on the assumption of output goodness and applies Bayes' theorem to derive an intractable posterior distribution in which the RM is explicitly represented. BRAIn then distills this posterior into an amortized inference network through self-normalized importance sampling, leading to a scalable offline algorithm that significantly outperforms prior art on summarization and Anthropic-HH tasks. BRAIn also has interesting connections to PPO and DPO for specific RM choices.
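
As a rough sketch of the distillation step, the snippet below computes self-normalized importance weights for responses drawn from a proposal distribution, assuming a sigmoid link between reward and the probability of "goodness" (one possible RM choice, not necessarily the paper's).

    # Self-normalized importance sampling (SNIS) weights for a reward-conditioned
    # posterior p(y | x, good) proportional to p_ref(y | x) * sigmoid(r(y)); the
    # sigmoid link is an assumption made for this sketch.
    import numpy as np

    def snis_weights(logp_ref, logp_proposal, rewards):
        """Normalized weights w_i proportional to p_ref(y_i) * sigmoid(r_i) / q(y_i)."""
        log_goodness = -np.logaddexp(0.0, -np.asarray(rewards))   # log sigmoid(r)
        log_w = np.asarray(logp_ref) + log_goodness - np.asarray(logp_proposal)
        log_w -= log_w.max()                                      # numerical stability
        w = np.exp(log_w)
        return w / w.sum()

    # Toy usage: 4 sampled responses; higher reward -> larger weight, all else equal.
    w = snis_weights(logp_ref=[-5.0] * 4, logp_proposal=[-5.0] * 4,
                     rewards=[2.0, 0.0, -1.0, 3.0])
    print(w, w.sum())   # weights sum to 1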


eipy: An Open-Source Python Package for Multi-modal Data Integration using Heterogeneous Ensembles

arXiv.org Artificial Intelligence

In this paper, we introduce eipy, an open-source Python package for developing effective, multi-modal heterogeneous ensembles for classification. eipy provides a rigorous yet user-friendly framework for comparing and selecting the best-performing multi-modal data integration and predictive modeling methods by systematically evaluating their performance using nested cross-validation. The package is designed to leverage scikit-learn-like estimators as components to build multi-modal predictive models. An up-to-date user guide for eipy, including an API reference and tutorials, is maintained at https://eipy.readthedocs.io. The main repository for this project can be found on GitHub at https://github.com/GauravPandeyLab/eipy.
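
For a flavor of the underlying pattern (not eipy's actual API; see the linked documentation for that), the sketch below builds a heterogeneous stacked ensemble from scikit-learn estimators and evaluates it with nested cross-validation on synthetic data.

    # Concept sketch only: heterogeneous base estimators are stacked, an inner
    # CV loop tunes, and an outer loop gives an unbiased performance estimate.
    # eipy extends this pattern to multiple modality-specific data matrices.
    import numpy as np
    from sklearn.ensemble import StackingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, cross_val_score
    from sklearn.svm import SVC

    rng = np.random.default_rng(1)
    X = rng.normal(size=(150, 12))            # toy stand-in for integrated modalities
    y = rng.integers(0, 2, size=150)

    stack = StackingClassifier(
        estimators=[("lr", LogisticRegression()), ("svc", SVC(probability=True))],
        final_estimator=LogisticRegression(), cv=5)

    inner = GridSearchCV(stack, {"final_estimator__C": [0.1, 1.0, 10.0]}, cv=3)
    print("nested CV accuracy:", cross_val_score(inner, X, y, cv=5).mean())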


Sharable Clothoid-based Continuous Motion Planning for Connected Automated Vehicles

arXiv.org Artificial Intelligence

A continuous motion planning method for connected automated vehicles is presented that generates feasible trajectories in real time using three consecutive clothoids. The proposed method reduces path planning to a small set of nonlinear algebraic equations, so that the generated path can be efficiently checked for feasibility and collisions. After path planning, velocity planning is executed with a similarly simple structure. Key strengths of this framework include its interpretability, shareability, and ability to specify boundary conditions. Its interpretability and shareability stem from the succinct representation of the resulting local motion plan by a handful of physically meaningful parameters. Vehicles may share these parameters via V2X communication so that recipients can precisely reconstruct the sender's planned trajectory and respond accordingly. The proposed local planner guarantees the satisfaction of boundary conditions, ensuring seamless integration with a wide array of higher-level global motion planners. The tunable nature of the method enables tailoring local plans to specific maneuvers such as turns at intersections, lane changes, and U-turns.
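
To make the shareability concrete: a clothoid's curvature varies linearly with arc length, so a segment is fully determined by a few parameters. The sketch below reconstructs points along a single clothoid from a sharpness and a length via Fresnel integrals, assuming zero initial curvature and heading for simplicity (the method itself chains three such segments).

    # Reconstructing a clothoid from two shareable parameters (sharpness c,
    # length L): curvature k(s) = c * s, heading theta(s) = c * s^2 / 2, and the
    # position follows in closed form from the Fresnel integrals.
    import numpy as np
    from scipy.special import fresnel

    def clothoid_points(c, L, n=100):
        """Sample (x, y) along a unit-speed clothoid with curvature k(s) = c * s."""
        s = np.linspace(0.0, L, n)
        a = np.sqrt(np.pi / abs(c))            # Fresnel scaling factor
        S, C = fresnel(s / a)                  # scipy returns sine, cosine integrals
        return a * C, np.sign(c) * a * S       # x(s), y(s)

    # A receiving vehicle can rebuild the sender's path from just (c, L):
    x, y = clothoid_points(c=0.05, L=20.0)
    heading_end = 0.05 * 20.0**2 / 2
    print(x[-1], y[-1], heading_end)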


RGB-X Object Detection via Scene-Specific Fusion Modules

arXiv.org Artificial Intelligence

Multimodal deep sensor fusion has the potential to enable autonomous vehicles to visually understand their surrounding environments in all weather conditions. However, existing deep sensor fusion methods usually employ convoluted architectures with intermingled multimodal features, requiring large coregistered multimodal datasets for training. In this work, we present an efficient and modular RGB-X fusion network that can leverage and fuse pretrained single-modal models via scene-specific fusion modules, enabling joint input-adaptive network architectures to be created from small coregistered multimodal datasets. Our experiments demonstrate the superiority of our method over existing works on RGB-thermal and RGB-gated datasets, performing fusion with only a small number of additional parameters. Our code is available at https://github.com/dsriaditya999/RGBXFusion.
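
A toy version of the modular idea, where frozen single-modal backbones would supply feature maps and only a small fusion module is trained; the gating design here is an illustrative guess, not the paper's exact module.

    # Lightweight gated fusion of two pretrained modalities' feature maps:
    # a 1x1 conv predicts a per-pixel gate that blends RGB and X features.
    import torch
    import torch.nn as nn

    class GatedFusion(nn.Module):
        """Predict a per-pixel gate from both modalities and blend their features."""
        def __init__(self, channels):
            super().__init__()
            self.gate = nn.Sequential(
                nn.Conv2d(2 * channels, channels, kernel_size=1),
                nn.Sigmoid())

        def forward(self, feat_rgb, feat_x):
            g = self.gate(torch.cat([feat_rgb, feat_x], dim=1))
            return g * feat_rgb + (1 - g) * feat_x

    rgb = torch.randn(1, 64, 32, 32)      # e.g. from a frozen RGB backbone
    thermal = torch.randn(1, 64, 32, 32)  # e.g. from a frozen thermal backbone
    fused = GatedFusion(64)(rgb, thermal)
    assert fused.shape == rgb.shape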


Stereo Visual Odometry with Deep Learning-Based Point and Line Feature Matching using an Attention Graph Neural Network

arXiv.org Artificial Intelligence

Robust feature matching forms the backbone of most Visual Simultaneous Localization and Mapping (vSLAM), visual odometry, 3D reconstruction, and Structure from Motion (SfM) algorithms. However, recovering feature matches from texture-poor scenes remains a major challenge and an open area of research. In this paper, we present a Stereo Visual Odometry (StereoVO) technique based on point and line features which uses a novel feature-matching mechanism based on an Attention Graph Neural Network, designed to perform well even under adverse weather conditions such as fog, haze, rain, and snow, and dynamic lighting conditions such as nighttime illumination and glare. We perform experiments on multiple real and synthetic datasets to validate our method's ability to perform StereoVO under low-visibility weather and lighting conditions through robust point and line matches. The results demonstrate that our method achieves more line feature matches than state-of-the-art line-matching algorithms, which, when complemented with point feature matches, perform consistently well in adverse weather and dynamic lighting conditions.
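
The sketch below shows the attention-scored, mutual-nearest-neighbor matching idea in simplified form; the full method's graph message passing over point and line features is omitted, and the descriptors here are random placeholders.

    # Simplified cross-attention matching between two descriptor sets (points or
    # line segments): softmax attention in both directions, keep mutual best pairs.
    import torch

    def match_descriptors(desc0, desc1, temperature=0.1):
        """Return index pairs that are mutual best matches under attention scores."""
        scores = desc0 @ desc1.T / temperature          # (N0, N1) similarity logits
        p01 = scores.softmax(dim=1)                     # image0 -> image1 attention
        p10 = scores.softmax(dim=0)                     # image1 -> image0 attention
        best01 = p01.argmax(dim=1)
        best10 = p10.argmax(dim=0)
        keep = best10[best01] == torch.arange(len(desc0))
        return torch.stack([torch.arange(len(desc0))[keep], best01[keep]], dim=1)

    d0 = torch.nn.functional.normalize(torch.randn(100, 128), dim=1)
    d1 = torch.nn.functional.normalize(torch.randn(120, 128), dim=1)
    pairs = match_descriptors(d0, d1)
    print(pairs.shape)    # (num_mutual_matches, 2)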