
 Kumar, Rohit


UnityAI-Guard: Pioneering Toxicity Detection Across Low-Resource Indian Languages

arXiv.org Artificial Intelligence

This work introduces UnityAI-Guard, a framework for binary toxicity classification targeting low-resource Indian languages. While existing systems predominantly cater to high-resource languages, UnityAI-Guard addresses this critical gap by developing state-of-the-art models for identifying toxic content across diverse Brahmic/Indic scripts. Our approach achieves an impressive average F1-score of 84.23% across seven languages, leveraging a dataset of 888k training instances and 35k manually verified test instances. By advancing multilingual content moderation for linguistically diverse regions, UnityAI-Guard also provides public API access to foster broader adoption and application.
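The paper's own models are served through a public API; as a rough illustration of the underlying binary toxicity task, the sketch below scores a sentence with a generic multilingual encoder. The checkpoint (xlm-roberta-base) and the example sentence are assumptions, not the UnityAI-Guard models.

```python
# Minimal sketch of binary toxicity classification with a multilingual encoder.
# Model choice and the example text are placeholders; UnityAI-Guard's trained
# checkpoints are accessed through its public API rather than reproduced here.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "xlm-roberta-base"  # assumption: any multilingual encoder works as a base
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

texts = ["यह एक उदाहरण वाक्य है"]  # placeholder Hindi sentence
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**batch).logits       # shape: (batch, 2)
probs = torch.softmax(logits, dim=-1)    # [P(non-toxic), P(toxic)] after fine-tuning
print(probs)
```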


Ensemble based approach to quantifying uncertainty of LLM based classifications

arXiv.org Artificial Intelligence

The output of a Large Language Model (LLM) is a function of the model's internal parameters and of the input provided in the context window. The hypothesis presented here is that, under a greedy sampling strategy, the variance in the LLM's output is a function of the conceptual certainty embedded in the model's parametric knowledge as well as of the lexical variance in the input. Fine-tuning the model reduces the sensitivity of the model output to lexical variations of the input. This is then applied to a classification problem, and a probabilistic method is proposed for estimating the certainties of the predicted classes.
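As a minimal sketch of the probabilistic estimate, the snippet below classifies several lexical variants of the same input with a deterministic (greedy) classifier and reads class certainties off the empirical label frequencies. The `classify` callable, the variants, and the stand-in classifier are placeholders rather than the paper's actual prompting setup.

```python
# Sketch: perturb the prompt lexically, classify each variant with greedy decoding,
# and estimate class certainties from the empirical label frequencies.
from collections import Counter
from typing import Callable, Dict, List

def ensemble_certainty(variants: List[str],
                       classify: Callable[[str], str]) -> Dict[str, float]:
    """Empirical class probabilities over lexically varied inputs."""
    labels = [classify(v) for v in variants]   # one greedy label per variant
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}

# Usage with a stand-in classifier (replace with an LLM call at temperature 0):
variants = [
    "Is this review positive? 'Great battery life.'",
    "Does this review express a positive opinion? 'Great battery life.'",
    "Label the sentiment of: 'Great battery life.'",
]
fake_llm = lambda prompt: "positive"            # placeholder for the LLM call
print(ensemble_certainty(variants, fake_llm))   # {'positive': 1.0}
```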


A two-stage transliteration approach to improve performance of a multilingual ASR

arXiv.org Artificial Intelligence

End-to-end Automatic Speech Recognition (ASR) systems are rapidly becoming the state of the art compared with other modeling methods. Several techniques have been introduced to improve their ability to handle multiple languages. However, because different languages use different writing scripts, acoustically similar units do not always decode to an appropriate grapheme in the target language. This restricts the scalability and adaptability of the model when dealing with multiple languages in code-mixing scenarios. This paper presents an approach to build a language-agnostic end-to-end model trained on a grapheme set obtained by projecting the multilingual grapheme data onto the script of a more generic target language. This approach spares the acoustic model from being retrained to span a larger grapheme space and can easily be extended to multiple languages. A two-stage transliteration process realizes this approach and is shown to minimize speech-class confusion. We performed experiments with an end-to-end multilingual speech recognition system for two Indic languages, namely Nepali and Telugu. The original grapheme space of these languages is projected onto the Devanagari script. We achieved relative reductions of 20% in the Word Error Rate (WER) and 24% in the Character Error Rate (CER) in the transliterated space over other language-dependent modeling methods.
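A drastically simplified illustration of the grapheme-projection idea, assuming the ISCII-aligned layout of Indic Unicode blocks: shifting Telugu codepoints into the Devanagari block maps most consonants and vowel signs onto their Devanagari counterparts. The paper's actual pipeline is a two-stage transliteration, so treat this single codepoint shift as an approximation only.

```python
# Sketch: project Telugu graphemes onto the Devanagari script by exploiting the
# parallel Unicode layout of the two blocks (Telugu U+0C00-0C7F, Devanagari U+0900-097F).
TELUGU_START, TELUGU_END = 0x0C00, 0x0C7F
OFFSET = 0x0C00 - 0x0900   # distance between the two Unicode blocks

def telugu_to_devanagari(text: str) -> str:
    out = []
    for ch in text:
        cp = ord(ch)
        if TELUGU_START <= cp <= TELUGU_END:
            out.append(chr(cp - OFFSET))   # map into the Devanagari block
        else:
            out.append(ch)                 # leave punctuation/Latin untouched
    return "".join(out)

print(telugu_to_devanagari("తెలుగు"))      # approximate Devanagari rendering
```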


Sitting, Standing and Walking Control of the Series-Parallel Hybrid Recupera-Reha Exoskeleton

arXiv.org Artificial Intelligence

This paper presents advancements in the functionalities of the Recupera-Reha lower extremity exoskeleton robot. The exoskeleton features a series-parallel hybrid design characterized by multiple kinematic loops resulting in 148 degrees of freedom in its spanning tree and 102 independent loop closure constraints, which poses significant challenges for modeling and control. To address these challenges, we applied an optimal control approach to generate feasible trajectories such as sitting, standing, and static walking, and tested these trajectories on the exoskeleton robot. Our method efficiently solves the optimal control problem using a serial abstraction of the model to generate trajectories. It then utilizes the full series-parallel hybrid model, which takes all the kinematic loop constraints into account to generate the final actuator commands. The experimental results demonstrate the effectiveness of our approach in generating the desired motions for the exoskeleton.
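As a toy sketch of the first stage only (trajectory generation on a simplified serial abstraction), the snippet below solves a small discretized optimal control problem for a single knee-like joint with scipy. The cost, boundary conditions, and time grid are illustrative assumptions; the full series-parallel hybrid model with its loop-closure constraints, which produces the final actuator commands, is not represented here.

```python
# Toy trajectory optimization for one joint of a sit-to-stand motion:
# minimize an acceleration-based effort proxy subject to boundary conditions.
import numpy as np
from scipy.optimize import minimize

N, dt = 50, 0.02                 # 50 knots over a 1 s motion (assumption)
q0, qT = 1.5, 0.0                # seated and standing knee angles in rad (assumption)

def cost(q):
    qdd = np.diff(q, n=2) / dt**2          # finite-difference accelerations
    return np.sum(qdd**2) * dt             # smoothness / effort proxy

cons = [
    {"type": "eq", "fun": lambda q: q[0] - q0},     # initial posture
    {"type": "eq", "fun": lambda q: q[-1] - qT},    # final posture
    {"type": "eq", "fun": lambda q: q[1] - q[0]},   # zero initial velocity
    {"type": "eq", "fun": lambda q: q[-1] - q[-2]}, # zero final velocity
]
res = minimize(cost, np.linspace(q0, qT, N), constraints=cons)
trajectory = res.x               # feasible joint trajectory to hand to the full model
```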


Evaluating the Robustness of Off-Road Autonomous Driving Segmentation against Adversarial Attacks: A Dataset-Centric Analysis

arXiv.org Artificial Intelligence

This study investigates the vulnerability of semantic segmentation models to adversarial input perturbations in the domain of off-road autonomous driving. Despite good performance under generic conditions, state-of-the-art classifiers are often susceptible to (even) small perturbations, ultimately resulting in inaccurate predictions made with high confidence. Prior research has focused on making models more robust by modifying the architecture and training with noisy input images, but has not explored the influence of the dataset on adversarial attacks. Our study aims to address this gap by examining the impact of non-robust features in off-road datasets and comparing the effects of adversarial attacks on different segmentation network architectures. To enable this, a robustified dataset consisting of only robust features is created, and the networks are trained on it. We present both qualitative and quantitative analyses of our findings, which have important implications for improving the robustness of machine learning models in off-road autonomous driving applications. Additionally, this work contributes to the safe navigation of the autonomous robot Unimog U5023 in rough, unstructured off-road environments by evaluating the robustness of the segmentation outputs. The code is publicly available at https://github.com/rohtkumar/adversarial_attacks_on_segmentation
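A minimal sketch of the kind of input perturbation studied here, using a one-step FGSM attack on a segmentation network in PyTorch. The model, target labels, and epsilon are placeholders, and the dataset robustification step is not shown.

```python
# One-step FGSM perturbation for probing a segmentation model's sensitivity.
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, target, epsilon=0.01):
    """FGSM on a (1, C, H, W) image with per-pixel class targets of shape (1, H, W)."""
    image = image.clone().detach().requires_grad_(True)
    logits = model(image)                       # assumed output: (1, num_classes, H, W)
    loss = F.cross_entropy(logits, target)      # pixel-wise segmentation loss
    loss.backward()
    adv = image + epsilon * image.grad.sign()   # step along the sign of the gradient
    return adv.clamp(0.0, 1.0).detach()         # keep pixels in a valid range
```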


Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition

arXiv.org Artificial Intelligence

We introduce a new cross-modal fusion technique designed for generative error correction in automatic speech recognition (ASR). Our methodology leverages both acoustic information and external linguistic representations to generate accurate speech transcription contexts. This marks a step towards a fresh paradigm in generative error correction within the realm of n-best hypotheses. Unlike existing ranking-based rescoring methods, our approach adeptly uses distinct initialization techniques and parameter-efficient algorithms to boost the ASR performance derived from pre-trained speech and text models. Through evaluation across diverse ASR datasets, we assess the stability and reproducibility of our fusion technique, demonstrating a 37.66% relative improvement in word error rate (WERR) over the n-best hypotheses. To encourage future research, we have made our code and pre-trained models open source at https://github.com/Srijith-rkr/Whispering-LLaMA.
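A text-only sketch of generative error correction over an n-best list: the hypotheses are folded into a prompt and a causal language model generates a corrected transcript. The checkpoint name and example hypotheses are assumptions; Whispering LLaMA additionally fuses acoustic representations through parameter-efficient adapters, which this simplified prompt-based version omits.

```python
# Sketch: prompt a causal LM with the n-best ASR hypotheses and decode a correction.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Llama-2-7b-hf"    # assumption: any causal LM checkpoint
tok = AutoTokenizer.from_pretrained(MODEL)
lm = AutoModelForCausalLM.from_pretrained(MODEL)

nbest = [
    "the whether today is fine",
    "the weather to day is fine",
    "the weather today is fine",
]
prompt = ("Candidate transcriptions:\n"
          + "\n".join(f"- {h}" for h in nbest)
          + "\nCorrected transcription:")
ids = tok(prompt, return_tensors="pt").input_ids
out = lm.generate(ids, max_new_tokens=20, do_sample=False)   # greedy decoding
print(tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True))
```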


Speech enhancement with frequency domain auto-regressive modeling

arXiv.org Artificial Intelligence

Speech applications in far-field real-world settings often deal with signals that are corrupted by reverberation. The task of dereverberation constitutes an important step to improve the audible quality and to reduce the error rates in applications like automatic speech recognition (ASR). We propose a unified framework of speech dereverberation for improving the speech quality and the ASR performance using the envelope-carrier decomposition provided by an autoregressive (AR) model. The AR model is applied in the frequency domain of the sub-band speech signals to separate the envelope and carrier parts. A novel neural architecture based on a dual-path long short-term memory (DPLSTM) model is proposed, which jointly enhances the sub-band envelope and carrier components. The dereverberated envelope-carrier signals are modulated and the sub-band signals are synthesized to reconstruct the audio signal. The DPLSTM model for dereverberation of envelope and carrier components also allows the joint learning of the network weights for the downstream ASR task. In ASR tasks on the REVERB challenge dataset as well as on the VOiCES dataset, we illustrate that the joint learning of the speech dereverberation network and the E2E ASR model yields significant performance improvements over the baseline ASR system trained on log-mel spectrograms, as well as over other dereverberation benchmarks (average relative improvements of 10-24% over the baseline system). Subjective listening tests further highlight the improved quality of the reconstructed audio.
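A rough sketch of the envelope-carrier decomposition for a single sub-band via frequency-domain linear prediction: an AR model is fitted to the DCT of the sub-band signal, its spectrum describes the temporal envelope, and the carrier is the residual after dividing the envelope out. The AR order and the stand-in signal are assumptions, and the DPLSTM enhancement itself is not reproduced.

```python
# Frequency-domain AR (FDLP-style) envelope estimate for one sub-band signal.
import numpy as np
from scipy.fftpack import dct
from scipy.linalg import solve_toeplitz

def fdlp_envelope(subband, order=40):
    x = dct(subband, norm="ortho")                        # frequency-domain samples
    r = np.correlate(x, x, mode="full")[len(x) - 1:]      # autocorrelation (lags >= 0)
    a = solve_toeplitz((r[:order], r[:order]), r[1:order + 1])  # Yule-Walker solve
    a = np.concatenate(([1.0], -a))                       # prediction polynomial A(z)
    spectrum = np.fft.rfft(a, n=2 * len(subband))
    return 1.0 / (np.abs(spectrum[:len(subband)]) ** 2 + 1e-8)  # temporal envelope

sig = np.random.randn(16000)          # stand-in for a band-pass filtered speech segment
env = fdlp_envelope(sig)
carrier = sig / np.sqrt(env + 1e-8)   # residual fine structure (carrier)
```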


Multilingual Tourist Assistance using ChatGPT: Comparing Capabilities in Hindi, Telugu, and Kannada

arXiv.org Artificial Intelligence

This research investigates the effectiveness of ChatGPT, an AI language model by OpenAI, in translating English into Hindi, Telugu, and Kannada, aimed at assisting tourists in India's linguistically diverse environment. To measure the translation quality, a test set of 50 questions from diverse fields such as general knowledge, food, and travel was used. These were assessed by five volunteers for accuracy and fluency, and the scores were subsequently converted into a BLEU score. The BLEU score evaluates the closeness of a machine-generated translation to a human translation, with a higher score indicating better translation quality. The Hindi translations outperformed the others, showcasing superior accuracy and fluency, whereas the Telugu translations lagged behind. Human evaluators rated both the accuracy and fluency of the translations, offering a comprehensive perspective on the language model's performance.
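For reference, the BLEU scoring step can be sketched as below with sacrebleu; the hypothesis and reference sentences are placeholders, and the paper's conversion from volunteer ratings to a BLEU score is not reproduced.

```python
# Sketch: corpus-level BLEU between machine translations and human references.
import sacrebleu

hypotheses = ["यह एक उदाहरण अनुवाद है"]       # model output (placeholder)
references = [["यह एक उदाहरण अनुवाद है"]]     # one reference stream, aligned with hypotheses
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")             # higher = closer to the human reference
```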


Binary Choice with Asymmetric Loss in a Data-Rich Environment: Theory and an Application to Racial Justice

arXiv.org Machine Learning

The importance of asymmetries in prediction problems arising in economics has been recognized for a long time. In this paper, we focus on binary choice problems in a data-rich environment with general loss functions. In contrast to asymmetric regression problems, binary choice with general loss functions and high-dimensional datasets is challenging and not well understood. Econometricians have studied binary choice problems for a long time, but the literature does not offer computationally attractive solutions in data-rich environments. In contrast, the machine learning literature has many computationally attractive algorithms that form the basis for many of the automated procedures implemented in practice, but it is focused on symmetric loss functions that are independent of individual characteristics. One of the main contributions of our paper is to show that theoretically valid predictions of binary outcomes with arbitrary loss functions can be achieved via a very simple reweighting of logistic regression, or of other state-of-the-art machine learning techniques such as boosting or (deep) neural networks. We apply our analysis to racial justice in pretrial detention.
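A minimal sketch of the reweighting idea under an assumed 5:1 cost of false negatives to false positives: each observation is weighted by its misclassification cost and a standard logistic regression is fitted with those weights. The synthetic data and cost ratio are illustrative assumptions, not the paper's theoretical construction or empirical application.

```python
# Cost-sensitive reweighting of logistic regression for asymmetric binary loss.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))                   # individual characteristics (synthetic)
y = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)

cost_fn, cost_fp = 5.0, 1.0                       # assumed asymmetric misclassification costs
weights = np.where(y == 1, cost_fn, cost_fp)      # weight each observation by its error cost

clf = LogisticRegression().fit(X, y, sample_weight=weights)
pred = clf.predict(X)                             # predictions now reflect the asymmetric loss
```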