Kumar, Anuj
CRAG -- Comprehensive RAG Benchmark
Yang, Xiao, Sun, Kai, Xin, Hao, Sun, Yushi, Bhalla, Nikita, Chen, Xiangsen, Choudhary, Sajal, Gui, Rongze Daniel, Jiang, Ziran Will, Jiang, Ziyu, Kong, Lingkun, Moran, Brian, Wang, Jiaqi, Xu, Yifan Ethan, Yan, An, Yang, Chenyu, Yuan, Eting, Zha, Hanwen, Tang, Nan, Chen, Lei, Scheffer, Nicolas, Liu, Yue, Shah, Nirav, Wanga, Rakesh, Kumar, Anuj, Yih, Wen-tau, Dong, Xin Luna
Retrieval-Augmented Generation (RAG) has recently emerged as a promising solution to alleviate Large Language Models' (LLMs') lack of knowledge. Existing RAG datasets, however, do not adequately represent the diverse and dynamic nature of real-world Question Answering (QA) tasks. To bridge this gap, we introduce the Comprehensive RAG Benchmark (CRAG), a factual question answering benchmark of 4,409 question-answer pairs and mock APIs to simulate web and Knowledge Graph (KG) search. CRAG is designed to encapsulate a diverse array of questions across five domains and eight question categories, reflecting varied entity popularity from popular to long-tail, and temporal dynamism ranging from years to seconds. Our evaluation on this benchmark highlights the gap to fully trustworthy QA. Whereas most advanced LLMs achieve <=34% accuracy on CRAG, adding RAG in a straightforward manner improves accuracy to only 44%. State-of-the-art industry RAG solutions answer only 63% of questions without any hallucination. CRAG also reveals much lower accuracy in answering questions about facts with higher dynamism, lower popularity, or higher complexity, suggesting future research directions. The CRAG benchmark has laid the groundwork for a KDD Cup 2024 challenge, attracting thousands of participants and submissions within the first 50 days of the competition. We commit to maintaining CRAG to serve research communities in advancing RAG solutions and general QA solutions.
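
The abstract contrasts plain accuracy with answering "without any hallucination". A minimal sketch of that distinction, with illustrative names and an exact-match check standing in for whatever grading CRAG actually uses, is shown below; it is not the CRAG evaluation harness.

```python
# Minimal sketch (not the CRAG harness) of a three-way scoring scheme that separates
# correct answers, abstentions ("I don't know"), and hallucinated answers.
# All names and the exact-match check are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Example:
    question: str
    ground_truth: str
    prediction: str

def score(example: Example) -> int:
    """Return +1 for a correct answer, 0 for an abstention, -1 for a hallucination."""
    pred = example.prediction.strip().lower()
    if pred in {"i don't know", "i dont know", "unsure"}:
        return 0        # missing: the model declined to answer
    if pred == example.ground_truth.strip().lower():
        return 1        # correct (exact match used only for illustration)
    return -1           # wrong answer counts as a hallucination

examples = [
    Example("Who wrote Hamlet?", "William Shakespeare", "William Shakespeare"),
    Example("Capital of Australia?", "Canberra", "Sydney"),
    Example("CEO of Acme in 2031?", "unknown", "I don't know"),
]
print(sum(score(e) for e in examples) / len(examples))  # average over the set
```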
Lumos: Empowering Multimodal LLMs with Scene Text Recognition
Shenoy, Ashish, Lu, Yichao, Jayakumar, Srihari, Chatterjee, Debojeet, Moslehpour, Mohsen, Chuang, Pierce, Harpale, Abhay, Bhardwaj, Vikas, Xu, Di, Zhao, Shicong, Zhao, Longfang, Ramchandani, Ankit, Dong, Xin Luna, Kumar, Anuj
We introduce Lumos, the first end-to-end multimodal question-answering system with text understanding capabilities. At the core of Lumos is a Scene Text Recognition (STR) component that extracts text from first-person point-of-view images, the output of which is used to augment the input to a Multimodal Large Language Model (MM-LLM). While building Lumos, we encountered numerous challenges related to STR quality, overall latency, and model inference. In this paper, we delve into those challenges and discuss the system architecture, design choices, and modeling techniques employed to overcome these obstacles. We also provide a comprehensive evaluation for each component, showcasing high quality and efficiency.
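
A minimal sketch of the pipeline described above: scene text is extracted first and prepended to the question before the multimodal model is queried. The helper functions (`run_str`, `query_mm_llm`) are placeholders I am assuming for illustration, not Lumos APIs.

```python
# Sketch of an STR-augmented multimodal QA flow; the two helpers are stubs.

from typing import List

def run_str(image_path: str) -> List[str]:
    # Placeholder: a real system would run text detection + recognition on the image.
    return ["OPEN 9AM-5PM", "CAFE LUNA"]

def query_mm_llm(prompt: str, image_path: str) -> str:
    # Placeholder for the multimodal LLM call.
    return f"(model answer conditioned on image {image_path} and prompt)"

def answer(question: str, image_path: str) -> str:
    scene_text = run_str(image_path)
    # Augment the model input with the recognized text, as described in the abstract.
    prompt = f"Scene text: {' | '.join(scene_text)}\nQuestion: {question}"
    return query_mm_llm(prompt, image_path)

print(answer("When does this cafe open?", "photo.jpg"))
```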
A Posteriori Evaluation of a Physics-Constrained Neural Ordinary Differential Equations Approach Coupled with CFD Solver for Modeling Stiff Chemical Kinetics
Kumar, Tadbhagya, Kumar, Anuj, Pal, Pinaki
The high computational cost associated with solving for detailed chemistry poses a significant challenge for predictive computational fluid dynamics (CFD) simulations of turbulent reacting flows. These models often require solving a system of coupled stiff ordinary differential equations (ODEs). While deep learning techniques have been explored to develop faster surrogate models, they often fail to integrate reliably with CFD solvers. This instability arises because deep learning methods optimize for training error without ensuring compatibility with ODE solvers, leading to accumulation of errors over time. Recently, NeuralODE-based techniques have offered a promising solution by effectively modeling chemical kinetics. In this study, we extend the NeuralODE framework for stiff chemical kinetics by incorporating mass conservation constraints directly into the loss function during training. This ensures that the total mass and the elemental mass are conserved, a critical requirement for reliable downstream integration with CFD solvers. Proof-of-concept studies are performed with the physics-constrained NeuralODE (PC-NODE) approach for homogeneous autoignition of a hydrogen-air mixture over a range of composition and thermodynamic conditions. Our results demonstrate that this enhancement not only improves physical consistency with respect to the mass conservation criteria but also ensures better robustness. Lastly, a posteriori studies are performed wherein the trained PC-NODE model is coupled with a 3D CFD solver for computing the chemical source terms. PC-NODE is shown to be more accurate than the purely data-driven NeuralODE approach. Moreover, PC-NODE exhibits robustness and generalizability to unseen initial conditions from within (interpolative capability) as well as outside (extrapolative capability) the training regime.
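
A minimal sketch of the kind of physics-constrained loss the abstract describes: a data-fit term plus penalties that keep predicted mass fractions summing to one and elemental composition conserved. The shapes, penalty weights, and element matrix below are illustrative assumptions, not the paper's formulation.

```python
# Sketch of a mass-conservation-constrained training loss for a chemistry surrogate.

import numpy as np

def pc_loss(y_pred, y_true, elem_matrix, w_mass=1.0, w_elem=1.0):
    """y_pred, y_true: (batch, n_species) mass fractions; elem_matrix: (n_elements, n_species)."""
    data_term = np.mean((y_pred - y_true) ** 2)
    # Total-mass constraint: species mass fractions should sum to 1.
    mass_term = np.mean((y_pred.sum(axis=1) - 1.0) ** 2)
    # Elemental-mass constraint: elemental composition should match the target state.
    elem_pred = y_pred @ elem_matrix.T
    elem_true = y_true @ elem_matrix.T
    elem_term = np.mean((elem_pred - elem_true) ** 2)
    return data_term + w_mass * mass_term + w_elem * elem_term

rng = np.random.default_rng(0)
y_true = rng.dirichlet(np.ones(9), size=4)                 # 9 species, rows sum to 1
y_pred = y_true + 0.01 * rng.standard_normal(y_true.shape)
elem_matrix = rng.random((3, 9))                           # toy element-weight matrix
print(pc_loss(y_pred, y_true, elem_matrix))
```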
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model
Moon, Seungwhan, Madotto, Andrea, Lin, Zhaojiang, Nagarajan, Tushar, Smith, Matt, Jain, Shashank, Yeh, Chun-Fu, Murugesan, Prakash, Heidari, Peyman, Liu, Yue, Srinet, Kavya, Damavandi, Babak, Kumar, Anuj
We present Any-Modality Augmented Language Model (AnyMAL), a unified model that reasons over diverse input modality signals (i.e., text, image, video, audio, and IMU motion sensor data) and generates textual responses. AnyMAL inherits the powerful text-based reasoning abilities of state-of-the-art LLMs, including LLaMA-2 (70B), and converts modality-specific signals to the joint textual space through a pre-trained aligner module. To further strengthen the multimodal LLM's capabilities, we fine-tune the model with a multimodal instruction set manually collected to cover diverse topics and tasks beyond simple QAs. We conduct comprehensive empirical analysis comprising both human and automatic evaluations, and demonstrate state-of-the-art performance on various multimodal tasks.
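
A minimal sketch of the aligner idea mentioned above: a learned projection maps a frozen modality encoder's output into the language model's token-embedding space so it can be concatenated with text embeddings. The dimensions and the single linear projection are assumptions made for illustration.

```python
# Sketch of projecting modality features into an LLM embedding space (toy sizes).

import numpy as np

rng = np.random.default_rng(0)
d_modality, d_llm, n_tokens = 256, 512, 8

image_features = rng.standard_normal((1, d_modality))                  # frozen encoder output
W_align = rng.standard_normal((d_modality, n_tokens * d_llm)) * 0.01   # trainable aligner weights

# Project into n_tokens "soft prompt" vectors living in the LLM embedding space.
modality_tokens = (image_features @ W_align).reshape(1, n_tokens, d_llm)

text_embeddings = rng.standard_normal((1, 8, d_llm))                   # embedded text prompt
llm_input = np.concatenate([modality_tokens, text_embeddings], axis=1)
print(llm_input.shape)  # (1, 16, 512): modality tokens followed by text tokens
```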
A Framework for Combustion Chemistry Acceleration with DeepONets
Kumar, Anuj, Echekki, Tarek
A combustion chemistry acceleration scheme is developed based on deep operator networks (DeepONets). The scheme identifies combustion reaction dynamics through a modified DeepONet architecture such that the solutions of thermochemical scalars are projected to new solutions in small and flexible time increments. The approach is designed to implement chemistry acceleration efficiently, without the need for computationally expensive integration of stiff chemistry. An additional framework for latent-space dynamics identification with the modified DeepONet is also proposed, which enhances computational efficiency and widens the applicability of the scheme. The scheme is demonstrated on chemical kinetics ranging from simple hydrogen oxidation to the more complex high- and low-temperature oxidation of n-dodecane. The proposed framework accurately learns the chemical kinetics and efficiently reproduces species and temperature temporal profiles for each application. In addition, a very large speed-up and strong extrapolation capability are observed with the proposed scheme.
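
A minimal sketch of the general DeepONet structure referenced above: a branch network encodes the current thermochemical state, a trunk network encodes the time increment, and their inner product yields the advanced state. The layer widths and the tiny random MLP are assumptions for illustration only, not the paper's modified architecture.

```python
# Toy DeepONet-style forward pass: branch(state) . trunk(dt) -> state after dt.

import numpy as np

rng = np.random.default_rng(0)

def mlp(x, widths):
    # Randomly initialized MLP, tanh on hidden layers (illustration only, no training).
    for i, w in enumerate(widths):
        W = rng.standard_normal((x.shape[-1], w)) * 0.1
        x = np.tanh(x @ W) if i < len(widths) - 1 else x @ W
    return x

n_scalars, p = 10, 64                        # thermochemical scalars, latent width
state = rng.standard_normal((1, n_scalars))  # current temperature + species vector
dt = np.array([[1.0e-6]])                    # flexible time increment (trunk input)

branch = mlp(state, [128, 128, p * n_scalars]).reshape(1, n_scalars, p)
trunk = mlp(dt, [128, 128, p])               # trunk features for this time increment

next_state = np.einsum("bsp,bp->bs", branch, trunk)  # advanced scalars after dt
print(next_state.shape)  # (1, 10)
```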
Extended Feature Space-Based Automatic Melanoma Detection System
Kumar, Shakti, Kumar, Anuj
Melanoma is the deadliest form of skin cancer, arising from the uncontrolled growth of melanocytes, and its incidence has grown rapidly over the last few decades. In recent years, the detection of melanoma using image processing techniques has become a dominant research field. An Automatic Melanoma Detection System (AMDS) detects melanoma from images of the affected skin area using image processing techniques. A single lesion image is a source of multiple features, so it is crucial to select the appropriate features from the lesion image to increase the accuracy of the AMDS. Not all extracted features are important for melanoma detection; some are complex and require more computation, which impacts the classification accuracy of the AMDS. The feature extraction phase of an AMDS exhibits the most variability, so it is important to study the behaviour of the AMDS using individual and extended feature extraction approaches. A novel algorithm, ExtFvAMDS, is proposed for the calculation of the extended feature vector space. Among the six models in the comparative study, the HSV feature vector space with an Ensemble Bagged Tree classifier on the Med-Node dataset provided 99% AUC, 95.30% accuracy, 94.23% sensitivity, and 96.96% specificity.
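
A minimal sketch of the kind of pipeline the abstract points to: per-lesion colour statistics computed in HSV space feeding a bagging ensemble of decision trees. The feature definition (channel means and standard deviations) and the synthetic data are my assumptions; this is not the ExtFvAMDS algorithm.

```python
# HSV colour features + bagged decision trees on synthetic stand-in data.

import numpy as np
from matplotlib.colors import rgb_to_hsv
from sklearn.ensemble import BaggingClassifier

rng = np.random.default_rng(0)

def hsv_features(rgb_image):
    """Mean and std of the H, S, V channels as a 6-dimensional feature vector."""
    hsv = rgb_to_hsv(rgb_image)   # rgb_image in [0, 1], shape (H, W, 3)
    return np.concatenate([hsv.mean(axis=(0, 1)), hsv.std(axis=(0, 1))])

# Synthetic stand-ins for lesion images and labels (0 = benign, 1 = melanoma).
images = rng.random((60, 32, 32, 3))
labels = rng.integers(0, 2, size=60)

X = np.stack([hsv_features(img) for img in images])
clf = BaggingClassifier(n_estimators=50, random_state=0).fit(X, labels)  # bagged trees
print(clf.predict(X[:5]))
```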
El Volumen Louder Por Favor: Code-switching in Task-oriented Semantic Parsing
Einolghozati, Arash, Arora, Abhinav, Lecanda, Lorena Sainz-Maza, Kumar, Anuj, Gupta, Sonal
Being able to parse code-switched (CS) utterances, such as Spanish+English or Hindi+English, is essential to democratize task-oriented semantic parsing systems for certain locales. In this work, we focus on Spanglish (Spanish+English) and release a dataset, CSTOP, containing 5800 CS utterances alongside their semantic parses. We examine the CS generalizability of various cross-lingual (XL) models and exhibit the advantage of pre-trained XL language models when data for only one language is present. As such, we focus on improving the pre-trained models for the case when only an English corpus, alongside either zero or a few CS training instances, is available. We propose two data augmentation methods for the zero-shot and few-shot settings: fine-tuning using translate-and-align, and augmentation using a generation model followed by match-and-filter. Combining the few-shot setting with the above improvements decreases the initial 30-point accuracy gap between the zero-shot and full-data settings by two-thirds.
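
A minimal sketch of the two augmentation ideas named above, with placeholder helpers standing in for the translation and generation models: produce code-switched variants of an English utterance, then keep only candidates that still match the original parse's slots. Every function here is an illustrative stand-in, not the paper's implementation.

```python
# Toy translate-and-align plus generate/match-and-filter augmentation flow.

from typing import List

def translate_spans(utterance: str, spans: List[str]) -> str:
    # Placeholder: translate selected spans to Spanish and splice them back in,
    # keeping slot annotations aligned to the original parse.
    replacements = {"the volume": "el volumen", "please": "por favor"}
    for span in spans:
        utterance = utterance.replace(span, replacements.get(span, span))
    return utterance

def generate_candidates(utterance: str) -> List[str]:
    # Placeholder for a generation model producing code-switched paraphrases.
    return [translate_spans(utterance, ["the volume"]),
            translate_spans(utterance, ["please"])]

def match_and_filter(candidates: List[str], required_spans: List[str]) -> List[str]:
    # Keep only candidates that still contain every required span verbatim.
    return [c for c in candidates if all(span in c for span in required_spans)]

utterance = "turn up the volume please"
augmented = match_and_filter(generate_candidates(utterance), ["volume", "por favor"])
print(augmented)
```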
Federated User Representation Learning
Bui, Duc, Malik, Kshitiz, Goetz, Jack, Liu, Honglei, Moon, Seungwhan, Kumar, Anuj, Shin, Kang G.
Collaborative personalization, such as through learned user representations (embeddings), can improve the prediction accuracy of neural-network-based models significantly. We propose Federated User Representation Learning (FURL), a simple, scalable, privacy-preserving and resource-efficient way to utilize existing neural personalization techniques in the Federated Learning (FL) setting. FURL divides model parameters into federated and private parameters. Private parameters, such as private user embeddings, are trained locally, but unlike federated parameters, they are not transferred to or averaged on the server. We show theoretically that this parameter split does not affect training for most model personalization approaches. Storing user embeddings locally not only preserves user privacy, but also improves memory locality of personalization compared to on-server training. We evaluate FURL on two datasets, demonstrating a significant improvement in model quality with 8% and 51% performance increases, and approximately the same level of performance as centralized training with only 0% and 4% reductions. Furthermore, we show that user embeddings learned in FL and the centralized setting have a very similar structure, indicating that FURL can learn collaboratively through the shared parameters while preserving user privacy.
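
A minimal sketch of the federated/private parameter split described above: each client holds a private user embedding that never leaves the device, while only the shared parameters are uploaded and averaged by the server. Names, shapes, and the toy local update are illustrative assumptions.

```python
# Toy round of FL with a federated/private parameter partition.

import numpy as np

rng = np.random.default_rng(0)
n_clients, d_shared, d_embed = 4, 16, 8

clients = [{
    "federated": rng.standard_normal(d_shared),  # uploaded and averaged
    "private": rng.standard_normal(d_embed),     # user embedding, never leaves the device
} for _ in range(n_clients)]

def local_update(params):
    # Placeholder for local SGD on both parameter groups.
    return {k: v - 0.01 * rng.standard_normal(v.shape) for k, v in params.items()}

clients = [local_update(c) for c in clients]

# Server step: average only the federated parameters across clients.
global_shared = np.mean([c["federated"] for c in clients], axis=0)
for c in clients:
    c["federated"] = global_shared.copy()        # broadcast; private parameters untouched
print(global_shared[:4])
```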
Active Federated Learning
Goetz, Jack, Malik, Kshitiz, Bui, Duc, Moon, Seungwhan, Liu, Honglei, Kumar, Anuj
Federated Learning allows population-level models to be trained without centralizing client data by transmitting the global model to clients, calculating gradients locally, and then averaging the gradients. Downloading models and uploading gradients uses the client's bandwidth, so minimizing these transmission costs is important. The data on each client is highly variable, so the benefit of training on different clients may differ dramatically. To exploit this, we propose Active Federated Learning, where in each round clients are selected not uniformly at random, but with a probability conditioned on the current model and the data on the client, to maximize efficiency. We propose a cheap, simple and intuitive sampling scheme which reduces the number of required training iterations by 20-70% while maintaining the same model accuracy, and which mimics well known resampling techniques under certain conditions.
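
A minimal sketch of the non-uniform client selection idea described above, using each client's current local loss as a toy proxy for its expected value. The exact valuation and sampling scheme from the paper are not reproduced here; this is only an illustration of selecting clients with probability increasing in such a value.

```python
# Value-weighted client sampling for one federated round (illustrative only).

import numpy as np

rng = np.random.default_rng(0)
local_losses = rng.gamma(shape=2.0, scale=1.0, size=100)  # one proxy value per client

def sample_clients(values, k=10, temperature=1.0):
    """Sample k distinct clients with probability increasing in their value."""
    logits = values / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return np.random.default_rng(1).choice(len(values), size=k, replace=False, p=probs)

selected = sample_clients(local_losses)
print(sorted(selected.tolist()))
```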
Explore-Exploit: A Framework for Interactive and Online Learning
Liu, Honglei, Kumar, Anuj, Yang, Wenhai, Dumoulin, Benoit
Interactive user interfaces need to continuously evolve based on the interactions that a user has (or does not have) with the system. This may require constant exploration of the various options the system can offer the user, and obtaining signals of user preference for those options. However, such exploration, especially when the set of available options itself changes frequently, can lead to sub-optimal user experiences. We present Explore-Exploit: a framework designed to collect and utilize user feedback in an interactive and online setting while minimizing regressions in end-user experience. This framework provides a suite of online learning operators for various tasks such as personalization ranking, candidate selection, and active learning. We demonstrate how to integrate this framework with run-time services to leverage online and interactive machine learning out-of-the-box. We also present results demonstrating the efficiencies that can be achieved using the Explore-Exploit framework.
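
A minimal sketch of one online-learning operator of the kind described above: an epsilon-greedy candidate selector that mostly exploits the best-known option while continuing to explore alternatives. This is a generic bandit illustration, not the framework's API.

```python
# Epsilon-greedy candidate selection with online reward updates.

import random

class EpsilonGreedySelector:
    def __init__(self, candidates, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = {c: 0 for c in candidates}
        self.rewards = {c: 0.0 for c in candidates}

    def select(self):
        if random.random() < self.epsilon:
            return random.choice(list(self.counts))  # explore a random candidate
        # Exploit: pick the candidate with the highest average observed reward.
        return max(self.counts, key=lambda c: self.rewards[c] / max(self.counts[c], 1))

    def update(self, candidate, reward):
        self.counts[candidate] += 1
        self.rewards[candidate] += reward

random.seed(0)
selector = EpsilonGreedySelector(["layout_a", "layout_b", "layout_c"])
for _ in range(100):
    choice = selector.select()
    selector.update(choice, reward=1.0 if choice == "layout_b" else 0.2)
print(max(selector.counts, key=selector.counts.get))  # most-served candidate
```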