Collaborating Authors

 Kumar, Ashutosh


From Fog to Failure: How Dehazing Can Harm Clear Image Object Detection

arXiv.org Artificial Intelligence

This study explores the challenges of integrating human visual cue-based dehazing into object detection, given the selective nature of human perception. While human vision adapts dynamically to environmental conditions, computational dehazing does not always enhance detection uniformly. We propose a multi-stage framework where a lightweight detector identifies regions of interest (RoIs), which are then enhanced via spatial attention-based dehazing before final detection by a heavier model. We analyze this phenomenon, investigate possible causes, and offer insights for designing hybrid pipelines that balance enhancement and detection. Our findings highlight the need for selective preprocessing and challenge assumptions about universal benefits from cascading transformations. Low-visibility conditions, such as rain, snow, fog, smoke, and haze, pose significant challenges for deep learning applications in autonomous vehicles, security and surveillance, maritime navigation, and agricultural robotics. Under these conditions, object detection models struggle due to reduced contrast and obscured features, leading to performance degradation. This study proposes a deep learning framework inspired by human visual perception to enhance object recognition in adverse visibility scenarios, particularly in foggy environments. A key motivation for this work comes from the impact of poor visibility on airport operations, where disruptions in taxiing and docking cause delays and increase reliance on ground support.
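The multi-stage flow described in this abstract (lightweight detector, RoI enhancement, heavier detector) can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions: the names `cascade`, `enhance_rois`, and `contrast_stretch` are hypothetical, the detectors are toy stand-ins, and a simple contrast stretch replaces the spatial attention-based dehazing used in the paper.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]        # grayscale image with values in [0, 1]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def contrast_stretch(patch: Image) -> Image:
    """Stand-in for dehazing (assumption): rescale a patch to the full [0, 1] range."""
    lo = min(min(row) for row in patch)
    hi = max(max(row) for row in patch)
    if hi == lo:  # flat patch, nothing to stretch
        return [row[:] for row in patch]
    return [[(v - lo) / (hi - lo) for v in row] for row in patch]


def enhance_rois(image: Image, rois: List[Box],
                 enhance: Callable[[Image], Image]) -> Image:
    """Apply `enhance` only inside the given regions of interest; the rest is untouched."""
    out = [row[:] for row in image]
    for top, left, bottom, right in rois:
        patch = enhance([row[left:right] for row in image[top:bottom]])
        for dy, patch_row in enumerate(patch):
            out[top + dy][left:right] = patch_row
    return out


def cascade(image: Image,
            light_detector: Callable[[Image], List[Box]],
            enhance: Callable[[Image], Image],
            heavy_detector: Callable[[Image], list]) -> list:
    """Stage 1: propose RoIs. Stage 2: enhance them. Stage 3: run the final detector."""
    rois = light_detector(image)
    enhanced = enhance_rois(image, rois, enhance)
    return heavy_detector(enhanced)


# Toy usage: a low-contrast 4x4 image with a slightly brighter 2x2 object.
hazy = [[0.4, 0.4, 0.4, 0.4],
        [0.4, 0.6, 0.6, 0.4],
        [0.4, 0.6, 0.6, 0.4],
        [0.4, 0.4, 0.4, 0.4]]
toy_light = lambda img: [(0, 0, 3, 3)]  # hypothetical fixed RoI proposal
toy_heavy = lambda img: [(y, x) for y, row in enumerate(img)
                         for x, v in enumerate(row) if v > 0.9]
detections = cascade(hazy, toy_light, contrast_stretch, toy_heavy)
```

Because enhancement is restricted to the proposed regions, pixels outside them pass through unchanged, which is one way to read the abstract's call for selective preprocessing.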


IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding

arXiv.org Artificial Intelligence

Spoken by more than 1.5 billion people in the Indian subcontinent, Indic languages present unique challenges and opportunities for natural language processing (NLP) research due to their rich cultural heritage, linguistic diversity, and complex structures. IndicMMLU-Pro is a comprehensive benchmark designed to evaluate Large Language Models (LLMs) across Indic languages, building upon the MMLU Pro (Massive Multitask Language Understanding) framework. Covering major languages such as Hindi, Bengali, Gujarati, Marathi, Kannada, Punjabi, Tamil, Telugu, and Urdu, our benchmark addresses the unique challenges and opportunities presented by the linguistic diversity of the Indian subcontinent. This benchmark encompasses a wide range of tasks in language comprehension, reasoning, and generation, meticulously crafted to capture the intricacies of Indian languages. IndicMMLU-Pro provides a standardized evaluation framework to push the research boundaries in Indic language AI, facilitating the development of more accurate, efficient, and culturally sensitive models. This paper outlines the benchmark's design principles, task taxonomy, and data collection methodology, and presents baseline results from state-of-the-art multilingual models.


MoonMetaSync: Lunar Image Registration Analysis

arXiv.org Artificial Intelligence

This paper compares scale-invariant (SIFT) and scale-variant (ORB) feature detection methods, alongside our novel feature detector, IntFeat, specifically applied to lunar imagery. We evaluate these methods using low (128x128) and high-resolution (1024x1024) lunar image patches, providing insights into their performance across scales in challenging extraterrestrial environments. IntFeat combines high-level features from SIFT and low-level features from ORB into a single vector space for robust lunar image registration. We introduce SyncVision, a Python package that compares lunar images using various registration methods, including SIFT, ORB, and IntFeat. Our analysis includes upscaling low-resolution lunar images using bi-linear and bi-cubic interpolation, offering a unique perspective on registration effectiveness across scales and feature detectors in lunar landscapes. This research contributes to computer vision and planetary science by comparing feature detection methods for lunar imagery and introducing a versatile tool for lunar image registration and evaluation, with implications for multi-resolution image analysis in space exploration applications.
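As a rough illustration of the bilinear upscaling step mentioned in this abstract, the sketch below interpolates a small grayscale patch in pure Python. It is not the SyncVision implementation: the function name `upscale_bilinear` is hypothetical, and the corner-aligned sampling convention is an assumption that differs from the default of some image libraries.

```python
from typing import List

Image = List[List[float]]  # grayscale patch as a list of rows


def upscale_bilinear(img: Image, factor: int) -> Image:
    """Upscale by an integer factor with bilinear interpolation (corner-aligned)."""
    h, w = len(img), len(img[0])
    out_h, out_w = h * factor, w * factor
    out: Image = []
    for y in range(out_h):
        # Map the output row back to a fractional source row.
        sy = y * (h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(out_w):
            # Map the output column back to a fractional source column.
            sx = x * (w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            # Blend horizontally on the two neighbouring rows, then vertically.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


patch = [[0.0, 10.0],
         [20.0, 30.0]]
upscaled = upscale_bilinear(patch, 2)  # 2x2 patch becomes 4x4
```

With corner alignment, the four corner pixels of the output reproduce the four corner pixels of the input, and every interior value is a weighted average of its nearest source neighbours. Bi-cubic interpolation follows the same mapping but blends a 4x4 neighbourhood instead of 2x2.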


A Methodology-Oriented Study of Catastrophic Forgetting in Incremental Deep Neural Networks

arXiv.org Artificial Intelligence

Humans and many animal species are able to gather, transfer, process, fine-tune, and generate information throughout their lifetimes. This ability to learn across the lifespan is referred to as continuous learning, and it relies on neurocognitive mechanisms. Consequently, real-world computational systems built on incrementally learning autonomous agents also need such a continuous learning mechanism, one that provides information retrieval and long-term memory consolidation. However, a main challenge in artificial intelligence is the incremental learning of an autonomous agent when it is confronted with new data. In such scenarios, the main concern is catastrophic forgetting (CF): while learning sequentially, a neural network underfits the old data when it is confronted with new data. Numerous studies have been proposed to tackle the CF problem; however, it is very difficult to compare their performance because their evaluation mechanisms differ. Here we focus on comparing all algorithms that share a similar type of evaluation mechanism. We compare three types of incremental learning methods: (1) exemplar-based methods, (2) memory-based methods, and (3) network-based methods. This survey presents a methodology-oriented study of catastrophic forgetting in incremental deep neural networks. Furthermore, it gives a mathematical overview of impactful methods that can help researchers deal with CF.


The Ethics of Interaction: Mitigating Security Threats in LLMs

arXiv.org Artificial Intelligence

This paper comprehensively explores the ethical challenges arising from security threats to Large Language Models (LLMs). These intricate digital repositories are increasingly integrated into our daily lives, making them prime targets for attacks that can compromise their training data and the confidentiality of their data sources. The paper delves into the nuanced ethical repercussions of such security threats on society and individual privacy. We scrutinize five major threats--prompt injection, jailbreaking, Personally Identifiable Information (PII) exposure, sexually explicit content, and hate-based content--going beyond mere identification to assess their critical ethical consequences and the urgency they create for robust defensive strategies. The escalating reliance on LLMs underscores the crucial need for ensuring these systems operate within the bounds of ethical norms, particularly as their misuse can lead to significant societal and individual harm. We propose conceptualizing and developing an evaluative tool tailored for LLMs, which would serve a dual purpose: guiding developers and designers in preemptive fortification of backend systems and scrutinizing the ethical dimensions of LLM chatbot responses during the testing phase. By comparing LLM responses with those expected from humans in a moral context, we aim to discern the degree to which AI behaviors align with the ethical values held by a broader society. Ultimately, this paper not only underscores the ethical troubles presented by LLMs; it also highlights a path toward cultivating trust in these systems.


NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

arXiv.org Artificial Intelligence

Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data splits according to specific features). We describe the framework and an initial set of 117 transformations and 23 filters for a variety of natural language tasks. We demonstrate the efficacy of NL-Augmenter by using several of its transformations to analyze the robustness of popular natural language models. The infrastructure, datacards and robustness analysis results are available publicly on the NL-Augmenter repository (https://github.com/GEM-benchmark/NL-Augmenter).
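To make the distinction between a transformation (a modification to the data) and a filter (a data split according to a feature) concrete, here is a minimal standalone sketch. It does not use the NL-Augmenter API; the functions `swap_adjacent_chars` and `min_length_filter` are hypothetical examples written only for illustration.

```python
import random
from typing import Callable, List


def swap_adjacent_chars(text: str, prob: float = 0.1, seed: int = 0) -> str:
    """Transformation: swap adjacent letters with probability `prob` (typo-like noise)."""
    rng = random.Random(seed)  # fixed seed keeps the perturbation reproducible
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < prob:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return "".join(chars)


def min_length_filter(min_tokens: int) -> Callable[[str], bool]:
    """Filter: select sentences with at least `min_tokens` whitespace-separated tokens."""
    def keep(text: str) -> bool:
        return len(text.split()) >= min_tokens
    return keep


sentences: List[str] = [
    "Fog reduces contrast.",
    "The detector missed the aircraft on the taxiway.",
]
keep = min_length_filter(4)
subset = [s for s in sentences if keep(s)]                      # filter step
perturbed = [swap_adjacent_chars(s, prob=0.3) for s in subset]  # transformation step
```

A robustness check would then compare a model's predictions on `subset` against its predictions on `perturbed`; a large gap suggests sensitivity to this kind of noise.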