Kumar, Rahul
ORI: O Routing Intelligence
Shadid, Ahmad, Kumar, Rahul, Mayank, Mohit
A single large language model (LLM) often falls short when faced with the ever-growing range of tasks it may be asked to perform. We address this challenge by proposing ORI (O Routing Intelligence), a dynamic framework that leverages a set of LLMs. By intelligently routing each incoming query to the most suitable model, ORI improves task-specific accuracy while maintaining efficiency. Comprehensive evaluations across diverse benchmarks demonstrate consistent accuracy gains with controlled computational overhead: ORI outperforms the strongest individual models by up to 2.7 points on MMLU and 1.8 points on MuSR, and matches the top performance on both ARC and BBH. These results underscore the benefits of a multi-model strategy and show how ORI's adaptive architecture handles diverse tasks more effectively, offering a scalable, high-performance solution for multi-model LLM systems.
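The abstract does not detail ORI's internal routing mechanism, so the following is only a minimal sketch of the general idea: a lightweight classifier assigns each query a category, and a per-model suitability table picks the model that answers it. All model names, scores, and the keyword heuristic here are hypothetical.

```python
from typing import Callable, Dict

# Hypothetical per-model suitability scores by coarse query type; a real
# router would learn these from benchmark performance or embeddings.
SUITABILITY = {
    "reasoning-model": {"math": 0.9, "general": 0.4},
    "general-model":   {"math": 0.3, "general": 0.8},
}

def classify(query: str) -> str:
    # Placeholder classifier: a crude keyword heuristic.
    keywords = ("solve", "prove", "sum", "integral")
    return "math" if any(k in query.lower() for k in keywords) else "general"

def route(query: str, models: Dict[str, Callable[[str], str]]) -> str:
    # Send the query to the model with the highest suitability score.
    kind = classify(query)
    best = max(models, key=lambda name: SUITABILITY[name][kind])
    return models[best](query)

if __name__ == "__main__":
    models = {
        "reasoning-model": lambda q: f"[reasoning-model] {q}",
        "general-model":   lambda q: f"[general-model] {q}",
    }
    print(route("Solve for x: 2x + 3 = 7", models))  # -> reasoning-model
```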
Preset-Voice Matching for Privacy Regulated Speech-to-Speech Translation Systems
Platnick, Daniel, Abdelnour, Bishoy, Earl, Eamon, Kumar, Rahul, Rezaei, Zahra, Tsangaris, Thomas, Lagum, Faraj
In recent years, demand for speech-to-speech translation (S2ST) systems in industry settings has increased. Although successfully commercialized, cloning-based S2ST systems expose their distributors to liability when misused by individuals and can infringe on personality rights when exploited by media organizations. This work proposes a regulated S2ST framework called Preset-Voice Matching (PVM). PVM removes cross-lingual voice cloning from S2ST by first matching the input voice to a similar, previously consenting speaker voice in the target language. Because the input speaker is never cloned, PVM systems comply with regulations and reduce the risk of misuse. Our results demonstrate that PVM can significantly improve S2ST run-time in multi-speaker settings as well as the naturalness of the synthesized speech. To our knowledge, PVM is the first explicitly regulated S2ST framework to leverage similarity-matched preset voices for dynamic S2ST tasks.
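The matching step can be pictured as a nearest-neighbor search over speaker embeddings. The sketch below assumes cosine similarity over fixed-size embeddings, a common choice for speaker similarity; the paper's actual similarity measure and embedding model are not specified here, and all names are illustrative.

```python
from typing import Dict
import numpy as np

def match_preset_voice(input_emb: np.ndarray,
                       presets: Dict[str, np.ndarray]) -> str:
    """Return the consenting preset speaker whose embedding is most
    similar (by cosine) to the input speaker's embedding."""
    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(presets, key=lambda name: cosine(input_emb, presets[name]))

# Toy usage: random vectors stand in for real speaker embeddings.
rng = np.random.default_rng(0)
presets = {f"consenting_speaker_{i}": rng.normal(size=192) for i in range(3)}
print(match_preset_voice(rng.normal(size=192), presets))
```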
Pretraining Data and Tokenizer for Indic LLM
Kumar, Rahul, Kakde, Shubham, Rajput, Divyansh, Ibrahim, Daud, Nahata, Rishabh, Sowjanya, Pidathala, Kumar, Deepak
We present a novel approach to data preparation for developing a multilingual Indic large language model. Our meticulous data acquisition spans open-source and proprietary sources, including Common Crawl, Indic books, news articles, and Wikipedia, ensuring a diverse and rich linguistic representation. For each Indic language, we design a custom preprocessing pipeline that effectively eliminates redundant and low-quality text. We also deduplicate the Common Crawl data, addressing the redundancy present in 70% of the crawled web pages. This study focuses on curating high-quality data and optimizing tokenization of our multilingual dataset for Indic large language models with 3B and 7B parameters, engineered for superior performance in Indic languages. We introduce a novel multilingual tokenizer training strategy and demonstrate that our custom-trained Indic tokenizer outperforms the state-of-the-art OpenAI Tiktoken tokenizer, achieving a superior token-to-word ratio for Indic languages.
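A token-to-word ratio (sometimes called tokenizer fertility) can be computed as tokens produced per whitespace-delimited word, where a lower value indicates a more efficient tokenizer for a language. The helper below is a generic sketch of that metric, not the paper's exact evaluation code; the character-level tokenizer in the example is only a stand-in.

```python
from typing import Callable, Iterable, List

def token_to_word_ratio(texts: Iterable[str],
                        tokenize: Callable[[str], List[str]]) -> float:
    """Average number of tokens per whitespace-delimited word.
    Lower is better: the tokenizer splits words into fewer pieces."""
    n_tokens = n_words = 0
    for text in texts:
        n_tokens += len(tokenize(text))
        n_words += len(text.split())
    return n_tokens / max(n_words, 1)

# Example: a character-level "tokenizer" gives a deliberately poor ratio.
print(token_to_word_ratio(["नमस्ते दुनिया"], tokenize=list))
```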
BookSQL: A Large Scale Text-to-SQL Dataset for Accounting Domain
Kumar, Rahul, Dibbu, Amar Raja, Harsola, Shrutendra, Subrahmaniam, Vignesh, Modi, Ashutosh
Several large-scale datasets (e.g., WikiSQL, Spider) for developing natural language interfaces to databases have recently been proposed. These datasets cover a wide breadth of domains but fall short in some essential domains, such as finance and accounting. Given that accounting databases are used worldwide, particularly by non-technical people, there is an imminent need for models that can help extract information from accounting databases via natural language queries. In this resource paper, we aim to fill this gap by proposing a new large-scale Text-to-SQL dataset for the accounting and financial domain: BookSQL. The dataset consists of 100k natural language query-SQL pairs and accounting databases comprising 1 million records. We experiment with and analyze existing state-of-the-art models (including GPT-4) for the Text-to-SQL task on BookSQL. We find significant performance gaps, pointing towards the need for more focused models in this domain.
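For readers unfamiliar with the task, a Text-to-SQL pair maps a natural-language question to an executable query over the database schema. The record below is a hypothetical, BookSQL-flavored illustration; the table and column names are invented, not taken from the released dataset.

```python
import sqlite3

# Hypothetical accounting schema and question-SQL pair (illustrative only).
example = {
    "question": "What was the total amount invoiced to Acme Corp in March 2022?",
    "sql": (
        "SELECT SUM(amount) FROM transactions "
        "WHERE customer = 'Acme Corp' "
        "AND transaction_type = 'invoice' "
        "AND strftime('%Y-%m', txn_date) = '2022-03';"
    ),
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions "
             "(customer TEXT, transaction_type TEXT, txn_date TEXT, amount REAL)")
conn.execute("INSERT INTO transactions VALUES "
             "('Acme Corp', 'invoice', '2022-03-15', 1250.0)")
print(conn.execute(example["sql"]).fetchone())  # (1250.0,)
```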
meSch: Multi-Agent Energy-Aware Scheduling for Task Persistence
Naveed, Kaleb Ben, Dang, An, Kumar, Rahul, Panagou, Dimitra
This paper develops a scheduling protocol for a team of autonomous robots engaged in long-term persistent tasks. The proposed framework, called meSch, accounts for the robots' limited battery capacity and the presence of a single charging station, and makes the following contributions. First, it guarantees exclusive use of the charging station by one robot at a time; the approach is online, applies to general nonlinear robot models, does not require robots to be deployed at different times, and handles robots with different discharge rates. Second, we consider the scenario in which the charging station is mobile and its position is subject to uncertainty, and we ensure that the robots can still rendezvous with it. Finally, we evaluate the efficacy of meSch in simulation and in experimental case studies.
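As rough intuition for the scheduling problem, consider a greedy rule that grants the single charging station to whichever robot would deplete first. The sketch below is only that intuition in code, not meSch itself, which additionally provides formal exclusivity guarantees, supports nonlinear dynamics, and handles a mobile station with positional uncertainty.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Robot:
    name: str
    battery: float         # remaining charge (e.g., Wh)
    discharge_rate: float  # consumption while on task (e.g., Wh/min)

def next_to_charge(robots: List[Robot]) -> Robot:
    """Greedy stand-in: pick the robot with the least time-to-empty."""
    return min(robots, key=lambda r: r.battery / r.discharge_rate)

fleet = [Robot("r1", 40.0, 1.0),   # 40 min to empty
         Robot("r2", 30.0, 1.5),   # 20 min to empty
         Robot("r3", 50.0, 0.8)]   # 62.5 min to empty
print(next_to_charge(fleet).name)  # -> r2
```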
Generative AI-Based Text Generation Methods Using Pre-Trained GPT-2 Model
Pandey, Rohit, Waghela, Hetvi, Rakshit, Sneha, Rangari, Aparna, Singh, Anjali, Kumar, Rahul, Ghosal, Ratnadeep, Sen, Jaydip
A text generation model is a machine learning model that uses neural networks, especially the transformer architecture, to generate contextually relevant text based on linguistic patterns learned from extensive corpora. These models are trained on large amounts of textual data so that they can capture the complexities of a language, such as its grammar, vocabulary, phrasing, and style. Text generation models can increase human productivity in existing business processes: they already automate content creation across industries, generating reports, summaries, and emails, among other documents. They also enable a greater level of personalization in communications between businesses and their customers.
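One common way to run a pre-trained GPT-2 for generation is through the Hugging Face Transformers library, as sketched below; the prompt and sampling settings are arbitrary, and the paper's exact generation configuration may differ.

```python
# Requires: pip install transformers torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("The quarterly report shows", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,                       # sample instead of greedy decoding
    top_p=0.95,                           # nucleus sampling
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,  # silence the padding warning
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```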
Much Easier Said Than Done: Falsifying the Causal Relevance of Linear Decoding Methods
Hayne, Lucas, Suresh, Abhijit, Jain, Hunar, Kumar, Rahul, Carter, R. McKell
Linear classifier probes are frequently used to better understand how neural networks function. Researchers have approached the problem of determining unit importance in neural networks by probing their learned internal representations. Linear classifier probes identify highly selective units as the most important for network function. Whether a network actually relies on highly selective units can be tested by removing them from the network via ablation. Surprisingly, ablating highly selective units produces only small performance deficits, and even then only in some cases. Despite the absence of ablation effects for selective neurons, linear decoding methods remain effective for interpreting network function, which leaves the source of their effectiveness a mystery. To falsify the exclusive role of selectivity in network function and resolve this contradiction, we systematically ablate groups of units in subregions of activation space. We find a weak relationship between the neurons identified by probes and those identified by ablation. More specifically, we find that an interaction between a unit's selectivity and its average activity better predicts ablation performance deficits for groups of units in AlexNet, VGG16, MobileNetV2, and ResNet101. Linear decoders are likely somewhat effective because they overlap with the units that are causally important for network function. Interpretability methods could therefore be improved by focusing on causally important units.
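The probe-then-ablate comparison can be illustrated on synthetic data: fit a linear probe on unit activations, rank units by probe weight, zero them out, and measure the performance drop. The toy below uses probe accuracy as the performance measure, whereas the paper ablates units inside real networks (AlexNet, VGG16, etc.) and re-evaluates the task; all data here are synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Synthetic "activations": 500 inputs x 64 units; only the first
# 4 units actually carry the class signal.
acts = rng.normal(size=(500, 64))
labels = (acts[:, :4].sum(axis=1) > 0).astype(int)

# Linear probe: weight magnitudes rank units by apparent importance.
probe = LogisticRegression(max_iter=1000).fit(acts, labels)
top_units = np.argsort(np.abs(probe.coef_[0]))[-4:]

# Ablation: zero the top-ranked units and measure the accuracy drop.
ablated = acts.copy()
ablated[:, top_units] = 0.0
print("full accuracy:   ", probe.score(acts, labels))
print("ablated accuracy:", probe.score(ablated, labels))
```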
Many Hands Make Light Work: Using Essay Traits to Automatically Score Essays
Kumar, Rahul, Mathias, Sandeep, Saha, Sriparna, Bhattacharyya, Pushpak
Most research on automatic essay grading (AEG) is geared towards scoring an essay holistically, though some work has also addressed scoring individual essay traits. In this paper, we describe a way to score essays holistically using a multi-task learning (MTL) approach, in which scoring the essay holistically is the primary task and scoring the essay traits is the auxiliary task. We compare our results with a single-task learning (STL) approach, using both LSTMs and BiLSTMs, and compare our auxiliary-task results with trait-scoring results from other AEG systems. To find out which traits work best for different types of essays, we conduct ablation tests for each essay trait. We also report the runtime and number of training parameters for each system. We find that the MTL-based BiLSTM system gives the best results for scoring the essay holistically, while also performing well on scoring the essay traits.
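A minimal PyTorch sketch of the multi-task setup: a shared BiLSTM encoder feeding a primary head for the holistic score and an auxiliary head for trait scores. The layer sizes, vocabulary, and number of traits are illustrative, not the paper's actual hyperparameters; in training, the two losses would typically be combined as a weighted sum.

```python
import torch
import torch.nn as nn

class MTLEssayScorer(nn.Module):
    """Shared BiLSTM encoder with a holistic-score head (primary task)
    and a trait-score head (auxiliary task)."""
    def __init__(self, vocab_size=10000, emb_dim=100, hidden=128, n_traits=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden,
                               batch_first=True, bidirectional=True)
        self.holistic_head = nn.Linear(2 * hidden, 1)      # primary
        self.trait_head = nn.Linear(2 * hidden, n_traits)  # auxiliary

    def forward(self, token_ids):
        emb = self.embed(token_ids)
        _, (h, _) = self.encoder(emb)
        feat = torch.cat([h[0], h[1]], dim=-1)  # final fwd + bwd states
        return self.holistic_head(feat).squeeze(-1), self.trait_head(feat)

model = MTLEssayScorer()
tokens = torch.randint(0, 10000, (2, 50))  # batch of 2 essays, 50 tokens
holistic, traits = model(tokens)
print(holistic.shape, traits.shape)  # torch.Size([2]) torch.Size([2, 5])
```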