Gupta, Vivek
Inverse design of potential metastructures inspired from Indian medieval architectural elements
Bhattacharya, Bishakh, Gupta, Tanuj, Sharma, Arun Kumar, Dwivedi, Ankur, Gupta, Vivek, Sahana, Subhadeep, Pathak, Suryansh, Awasthi, Ashish
In this study, we immerse ourselves in the intricate world of patterns, examining the structural details of Indian medieval architecture to discover motifs with strong application potential from a mechanical metastructure perspective. The motifs that particularly captivated us are derived from the tomb of I'timad-ud-Daula, situated in the city of Agra, close to the Taj Mahal. In an exploratory study, we designed nine interlaced metastructures inspired by the tomb's motifs. We fabricated the metastructures using additive manufacturing and studied their vibration characteristics experimentally and numerically. We also investigated bandgap modulation with metallic inserts in honeycomb interlaced metastructures. The comprehensive study of these metastructure panels reveals their high performance in controlling elastic wave propagation and generating suitable frequency bandgaps, giving them potential applications as waveguides for noise and vibration control. Finally, we developed a novel AI-based model, trained on numerical datasets, for the inverse design of metastructures with a desired bandgap.
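The inverse-design step lends itself to a small illustration. The sketch below is a minimal toy, assuming made-up design variables (strut thickness, cell size, insert fraction) and a synthetic forward model in place of the paper's FEM-generated dataset; it shows only the general pattern of fitting a regressor from bandgap edges back to design parameters and then querying it with a desired bandgap.

```python
# Minimal sketch of AI-based inverse design: a regressor maps a desired
# frequency bandgap to candidate geometric parameters. The parameter names
# (strut thickness, cell size, insert fraction) are illustrative, not the
# paper's actual design variables.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Stand-in for the numerical (FEM) dataset: random designs and a toy
# forward model mapping design parameters -> bandgap edges [f_lo, f_hi].
designs = rng.uniform([0.5, 5.0, 0.0], [2.0, 20.0, 0.5], size=(2000, 3))
f_lo = 100 * designs[:, 0] / designs[:, 1] * (1 + designs[:, 2])
bandgaps = np.stack([f_lo, f_lo * (1.2 + designs[:, 2])], axis=1)

# Inverse model: bandgap -> design parameters.
inverse_model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                             random_state=0).fit(bandgaps, designs)

target_gap = np.array([[12.0, 18.0]])  # desired bandgap edges (arbitrary units)
print("candidate design:", inverse_model.predict(target_gap))
```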
Unraveling the Truth: Do LLMs really Understand Charts? A Deep Dive into Consistency and Robustness
Mukhopadhyay, Srija, Qidwai, Adnan, Garimella, Aparna, Ramu, Pritika, Gupta, Vivek, Roth, Dan
Chart question answering (CQA) is a crucial area of Visual Language Understanding. However, the robustness and consistency of current Visual Language Models (VLMs) in this field remain under-explored. This paper evaluates state-of-the-art VLMs on comprehensive datasets, developed specifically for this study, encompassing diverse question categories and chart formats. We investigate two key aspects: 1) the models' ability to handle varying levels of chart and question complexity, and 2) their robustness across different visual representations of the same underlying data. Our analysis reveals significant performance variations based on question and chart types, highlighting both strengths and weaknesses of current models. Additionally, we identify areas for improvement and propose future research directions to build more robust and reliable CQA systems. This study sheds light on the limitations of current models and paves the way for future advancements in the field.
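To make the robustness protocol concrete, a minimal sketch follows: the same underlying data rendered as several chart types, the same question posed for each rendering, and consistency scored as the share of renderings that yield the modal answer. The `query_vlm` function is a placeholder for a real model call, with canned answers here purely for illustration.

```python
# Sketch of the consistency check across visual renderings of the same data.
from collections import Counter

def query_vlm(chart_image: str, question: str) -> str:
    # Placeholder for an actual VLM API call; canned answers for illustration.
    canned = {"bar.png": "B", "line.png": "B", "pie.png": "C"}
    return canned[chart_image]

def consistency(chart_variants, question):
    """Fraction of renderings that yield the modal answer, plus that answer."""
    answers = [query_vlm(img, question) for img in chart_variants]
    answer, count = Counter(answers).most_common(1)[0]
    return count / len(answers), answer

score, answer = consistency(["bar.png", "line.png", "pie.png"],
                            "Which category has the highest value?")
print(f"consistency={score:.2f}, modal answer={answer}")
```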
NTSEBENCH: Cognitive Reasoning Benchmark for Vision Language Models
Pandya, Pranshu, Talwarr, Agney S, Gupta, Vatsal, Kataria, Tushar, Gupta, Vivek, Roth, Dan
Cognitive textual and visual reasoning tasks, such as puzzles, series, and analogies, demand the ability to quickly reason, decipher, and evaluate patterns both textually and spatially. While LLMs and VLMs, through extensive training on large amounts of human-curated data, have attained a high level of pseudo-human intelligence on some common-sense reasoning tasks, they still struggle with more complex reasoning tasks that require cognitive understanding. In this work, we introduce a new dataset, NTSEBench, designed to evaluate the cognitive multi-modal reasoning and problem-solving skills of large models. The dataset comprises 2,728 multiple-choice questions, with a total of 4,642 images, across 26 categories sampled from the NTSE examination conducted nationwide in India; it features both visual and textual general-aptitude questions that do not rely on rote learning. We establish baselines on the dataset using state-of-the-art LLMs and VLMs. To facilitate comparison between open-source and proprietary models, we propose four distinct modeling strategies to handle the different modalities (text and images) in the dataset instances.
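As a rough illustration of how mixed-modality instances might be handled, the sketch below contrasts two plausible strategies: feeding interleaved text and images directly to a VLM, versus captioning the images first so a text-only LLM can answer. All function names are hypothetical stand-ins, not the paper's four actual strategies.

```python
# Two illustrative modality-handling strategies for text+image questions.
def call_vlm(segments):
    # Placeholder: a real VLM would consume interleaved text and images.
    return "VLM answer (stub)"

def call_llm(prompt: str):
    # Placeholder: a real text-only LLM call.
    return "LLM answer (stub)"

def caption(image_path: str) -> str:
    # Placeholder: a real captioning model would describe the image.
    return f"[caption of {image_path}]"

def answer_direct(question_parts):
    return call_vlm(question_parts)                 # strategy (a): direct VLM

def answer_via_captions(question_parts):
    text = " ".join(caption(p) if p.endswith(".png") else p
                    for p in question_parts)        # strategy (b): caption first
    return call_llm(text)

parts = ["Which figure completes the series?", "series.png", "options.png"]
print(answer_direct(parts), "|", answer_via_captions(parts))
```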
H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables
Abhyankar, Nikhil, Gupta, Vivek, Roth, Dan, Reddy, Chandan K.
Tabular reasoning involves interpreting unstructured queries against structured tables, requiring a synthesis of textual understanding and symbolic reasoning. Existing methods rely on one approach or the other and are constrained by its limitations: textual reasoning excels at semantic interpretation but falls short in mathematical reasoning, where symbolic reasoning (SQL logic) excels. In this paper, we introduce a novel algorithm, H-STAR, comprising table extraction and adaptive reasoning and integrating both symbolic and semantic (text-based) approaches. To enhance evidence extraction, H-STAR employs a multi-view approach incorporating step-by-step row and column retrieval. It also adapts its reasoning strategy to the question type, using symbolic reasoning for quantitative and logical tasks and semantic reasoning for direct-lookup and complex lexical queries. Our extensive experiments demonstrate that H-STAR significantly outperforms state-of-the-art methods across three tabular question-answering (QA) and fact-verification datasets, underscoring its effectiveness and efficiency.
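The routing idea admits a compact illustration. The following minimal sketch, with a keyword heuristic and hand-written SQL standing in for the paper's LLM-driven components, shows the hybrid pattern: quantitative questions go to symbolic (SQL) execution, direct lookups to text-based reasoning.

```python
# Minimal sketch of the hybrid routing behind H-STAR-style systems.
import sqlite3

QUANT_CUES = ("how many", "average", "sum", "total", "difference", "most")

def answer(question: str, conn: sqlite3.Connection) -> str:
    if any(cue in question.lower() for cue in QUANT_CUES):
        # Symbolic route: in the real system an LLM writes this SQL.
        return str(conn.execute(
            "SELECT COUNT(*) FROM medals WHERE gold > 0").fetchone()[0])
    # Semantic route: in the real system an LLM reads the retrieved rows.
    rows = conn.execute("SELECT country, gold FROM medals").fetchall()
    return f"lookup over {len(rows)} retrieved rows"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE medals (country TEXT, gold INTEGER)")
conn.executemany("INSERT INTO medals VALUES (?, ?)",
                 [("Norway", 16), ("Germany", 12), ("Canada", 4)])
print(answer("How many countries won at least one gold medal?", conn))
```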
Enhancing Question Answering on Charts Through Effective Pre-training Tasks
Gupta, Ashim, Gupta, Vivek, Zhang, Shuo, He, Yujie, Zhang, Ning, Shah, Shalin
To completely understand a document, textual information alone is not enough; understanding visual cues, such as layouts and charts, is also required. While the current state-of-the-art approaches to document understanding (both OCR-based and OCR-free) work well, a thorough analysis of their capabilities and limitations has not yet been performed. In this work, we therefore address the limitations of current VisualQA models when applied to charts and plots. To investigate the shortcomings of state-of-the-art models, we conduct a comprehensive behavioral analysis, using ChartQA as a case study. Our findings indicate that existing models particularly underperform on questions about a chart's structural and visual context, as well as on numerical information. To address these issues, we propose three simple pre-training tasks that strengthen the existing model's structural-visual knowledge as well as its understanding of numerical questions. We evaluate our pre-trained model (called MatCha-v2) on three chart datasets, covering both extractive and abstractive questions, and observe an average improvement of 1.7% over the baseline model.
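One way such pre-training supervision could be generated cheaply is sketched below: converting a chart's underlying data table into numerical-comparison question-answer pairs. The task wording is illustrative and does not reproduce the paper's three tasks.

```python
# Sketch of synthesizing numerical pre-training pairs from chart data.
def numeric_comparison_pairs(table):
    """table: list of (label, value) pairs extracted from a chart."""
    pairs = []
    for (la, va) in table:
        for (lb, vb) in table:
            if la != lb:
                ans = la if va > vb else lb
                pairs.append((f"Which is larger, {la} or {lb}?", ans))
    return pairs

chart_data = [("2019", 4.2), ("2020", 3.1), ("2021", 5.6)]
for q, a in numeric_comparison_pairs(chart_data)[:3]:
    print(q, "->", a)
```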
Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering
Srivastava, Pragya, Malik, Manuj, Gupta, Vivek, Ganu, Tanuja, Roth, Dan
Large Language Models (LLMs) excel in natural language understanding, but their capability for complex mathematical reasoning over an amalgamation of structured tables and unstructured text remains uncertain. This study explores LLMs' mathematical reasoning on four financial tabular question-answering datasets: TATQA, FinQA, ConvFinQA, and Multihiertt. Through extensive experiments with various models and prompting techniques, we assess how LLMs adapt to complex tables and mathematical tasks. We focus on sensitivity to table complexity and on performance variation as the number of arithmetic reasoning steps increases. The results provide insights into LLMs' capabilities and limitations in handling complex mathematical scenarios over semi-structured tables. Finally, we introduce a novel prompting technique tailored to semi-structured documents that matches or outperforms other baselines while providing a nuanced understanding of LLMs' abilities on such tasks.
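A hedged sketch of what a table-aware prompt might look like is given below: the table is serialized row by row and the model is instructed to write out each arithmetic step. The template is illustrative only; it is not the paper's novel prompting technique.

```python
# Illustrative table-serializing prompt for financial QA.
def build_prompt(table_rows, question):
    serialized = "\n".join(" | ".join(str(c) for c in row) for row in table_rows)
    return (
        "Table:\n" + serialized + "\n\n"
        f"Question: {question}\n"
        "Answer step by step, writing each arithmetic operation explicitly "
        "before giving the final number."
    )

rows = [("Year", "Revenue", "Cost"), (2022, 120.5, 80.0), (2023, 150.0, 95.5)]
print(build_prompt(rows, "What is the change in profit from 2022 to 2023?"))
```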
Multi-Set Inoculation: Assessing Model Robustness Across Multiple Challenge Sets
Gupta, Vatsal, Pandya, Pranshu, Kataria, Tushar, Gupta, Vivek, Roth, Dan
Language models, given their black-box nature, often exhibit sensitivity to input perturbations, leading to trust issues due to hallucinations. To bolster trust, it is essential to understand these models' failure modes and devise strategies to enhance their performance. In this study, we propose a framework to study the effect of input perturbations on language models of different scales, from pre-trained models to large language models (LLMs). We use fine-tuning to train models that are robust to perturbations, and we investigate whether exposure to one perturbation improves or degrades a model's performance on other perturbations. To address multi-perturbation robustness, we propose three distinct training strategies. We also extend the framework to LLMs via chain-of-thought (CoT) prompting with exemplars. We instantiate our framework on the Tabular-NLI task and show that the proposed strategies make the model robust to different perturbations without losing accuracy on the given dataset.
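One of the multi-perturbation strategies could plausibly take the form sketched below: interleaving clean and perturbed examples within each fine-tuning batch so the model sees every perturbation set. The data and sampling ratio are stand-ins, not the paper's exact recipe.

```python
# Sketch of mixed-batch fine-tuning over several perturbation sets.
import random

def mixed_batches(clean, perturbation_sets, batch_size=8, perturbed_share=0.5):
    """Yield batches mixing clean data with samples from every perturbation set."""
    pool = [ex for s in perturbation_sets for ex in s]
    while True:
        k = int(batch_size * perturbed_share)
        yield random.sample(clean, batch_size - k) + random.sample(pool, k)

clean = [f"clean-{i}" for i in range(100)]
perturbed = [[f"char-swap-{i}" for i in range(50)],
             [f"negation-{i}" for i in range(50)]]
print(next(mixed_batches(clean, perturbed)))
```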
MANER: Multi-Agent Neural Rearrangement Planning of Objects in Cluttered Environments
Gupta, Vivek, Dhir, Praphpreet, Dani, Jeegn, Qureshi, Ahmed H.
Object rearrangement is a fundamental problem in robotics with various practical applications ranging from managing warehouses to cleaning and organizing home kitchens. While existing research has primarily focused on single-agent solutions, real-world scenarios often require multiple robots to work together on rearrangement tasks. This paper proposes a comprehensive learning-based framework for multi-agent object rearrangement planning, addressing the challenges of task sequencing and path planning in complex environments. The proposed method iteratively selects objects, determines their relocation regions, and pairs them with available robots, subject to kinematic-feasibility and task-reachability constraints, to achieve the target arrangement. Our experiments on a diverse range of simulated and real-world environments demonstrate the effectiveness and robustness of the proposed framework. Furthermore, results indicate improved performance in terms of traversal time and success rate compared to baseline approaches.
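The iterative select-place-assign loop can be caricatured in a few lines. In the toy sketch below, simple Euclidean reach checks stand in for the learned policies and kinematic-feasibility tests of the actual framework.

```python
# Toy select-place-assign loop for multi-robot rearrangement.
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def plan(objects, targets, robots, reach=5.0):
    steps = []
    for name, pos in objects.items():
        goal = targets[name]
        # "Feasible" here means both pick and place lie within a robot's reach.
        feasible = [r for r, rp in robots.items()
                    if dist(rp, pos) <= reach and dist(rp, goal) <= reach]
        if feasible:
            robot = min(feasible, key=lambda r: dist(robots[r], pos))
            steps.append((robot, name, pos, goal))
    return steps

objects = {"cup": (1.0, 1.0), "plate": (4.0, 0.5)}
targets = {"cup": (2.0, 3.0), "plate": (5.0, 2.0)}
robots = {"r1": (0.0, 0.0), "r2": (5.0, 0.0)}
print(plan(objects, targets, robots))
```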
Exploring the Numerical Reasoning Capabilities of Language Models: A Comprehensive Analysis on Tabular Data
Akhtar, Mubashara, Shankarampeta, Abhilash, Gupta, Vivek, Patil, Arpit, Cocarascu, Oana, Simperl, Elena
Numbers are crucial for various real-world domains such as finance, economics, and science. Thus, understanding and reasoning with numbers are essential skills for language models to solve different tasks. While various numerical benchmarks have been introduced in recent years, they are mostly limited to specific numerical aspects. In this paper, we propose a hierarchical taxonomy for numerical reasoning skills with more than ten reasoning types across four levels: representation, number sense, manipulation, and complex reasoning. We conduct a comprehensive evaluation of state-of-the-art models to identify the reasoning challenges specific to each. We then develop a diverse set of numerical probes using a semi-automated approach. We focus on the tabular Natural Language Inference (TNLI) task as a case study and measure models' performance shifts. Our results show that no model consistently excels across all numerical reasoning types. Among the probed models, FlanT5 (few-/zero-shot) and GPT-3.5 (few-shot) demonstrate strong overall numerical reasoning compared to other models. Label-flipping probes indicate that models often exploit dataset artifacts to predict the correct labels.
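A label-flipping probe is straightforward to illustrate: perturbing a single number in a hypothesis should invert its truth value, so a model that keeps its prediction is likely exploiting surface artifacts. The example below is a minimal sketch, not the paper's semi-automated generation pipeline.

```python
# Sketch of a label-flipping numerical probe.
import re

def flip_number(hypothesis: str, delta: int = 1) -> str:
    """Increment the first number in the hypothesis to invert its truth value."""
    return re.sub(r"\d+", lambda m: str(int(m.group()) + delta), hypothesis, count=1)

premise = "The company reported revenue of 120 million in 2021."
hypothesis = "Revenue reached 120 million in 2021."   # gold label: entailed
flipped = flip_number(hypothesis)                     # gold label: contradicted
print(flipped)
```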
InfoSync: Information Synchronization across Multilingual Semi-structured Tables
Khincha, Siddharth, Jain, Chelsi, Gupta, Vivek, Kataria, Tushar, Zhang, Shuo
Information synchronization of semi-structured data across languages is challenging; for instance, Wikipedia tables in one language should be kept synchronized with their counterparts in other languages. To address this problem, we introduce a new dataset, InfoSync, and a two-step method for tabular synchronization. InfoSync contains 100K entity-centric tables (Wikipedia Infoboxes) across 14 languages, of which a subset (3.5K pairs) is manually annotated. The proposed method includes 1) information alignment, to map rows across multilingual tables, and 2) information update, to fill in missing or outdated information in the aligned tables. When evaluated on InfoSync, information alignment achieves an F1 score of 87.91 (en <-> non-en). To evaluate information update, we perform human-assisted Wikipedia edits on Infoboxes for 603 table pairs. Our approach obtains an acceptance rate of 77.28% on Wikipedia, demonstrating the effectiveness of the proposed method.
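The two-step method can be sketched at toy scale: align rows across tables by key similarity, then copy over values missing from the target. Here `difflib` stands in for the cross-lingual alignment model, and the infobox keys are illustrative.

```python
# Toy align-then-update loop for infobox synchronization.
import difflib

def align(src: dict, tgt: dict, cutoff=0.6):
    """Map source keys to their closest target keys (translation model in practice)."""
    mapping = {}
    for key in src:
        match = difflib.get_close_matches(key, tgt.keys(), n=1, cutoff=cutoff)
        if match:
            mapping[key] = match[0]
    return mapping

def update(src: dict, tgt: dict):
    aligned = align(src, tgt)
    for src_key, val in src.items():
        if src_key not in aligned:      # row missing in the target table
            tgt[src_key] = val
    return tgt

en = {"born": "1879-03-14", "occupation": "physicist", "spouse": "Mileva Maric"}
fr = {"born": "1879-03-14", "occupations": "physicien"}
print(update(en, fr))
```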