AITopics | Gupta, Shubham

A Transfer Framework for Enhancing Temporal Graph Learning in Data-Scarce Settings

Agarwal, Sidharth, Dubey, Tanishq, Gupta, Shubham, Bedathur, Srikanta

arXiv.org Artificial Intelligence, Mar-11-2025

Dynamic interactions between entities are prevalent in domains like social platforms, financial systems, healthcare, and e-commerce. These interactions can be effectively represented as time-evolving graphs, where predicting future connections is a key task in applications such as recommendation systems. Temporal Graph Neural Networks (TGNNs) have achieved strong results for such predictive tasks but typically require extensive training data, which is often limited in real-world scenarios. One approach to mitigating data scarcity is leveraging pre-trained models from related datasets. However, direct knowledge transfer between TGNNs is challenging due to their reliance on node-specific memory structures, making them inherently difficult to adapt across datasets. To address this, we introduce a novel transfer approach that disentangles node representations from their associated features through a structured bipartite encoding mechanism. This decoupling enables more effective transfer of memory components and other learned inductive patterns from one dataset to another. Empirical evaluations on real-world benchmarks demonstrate that our method significantly enhances TGNN performance in low-data regimes, outperforming non-transfer baselines by up to 56% and surpassing existing transfer strategies by 36%.

artificial intelligence, graph, machine learning, (19 more...)

2503.00852

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.82)

Industry:

Information Technology (0.66)
Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Granite Vision: a lightweight, open-source multimodal model for enterprise Intelligence

Granite Vision Team, Karlinsky, Leonid, Arbelle, Assaf, Daniels, Abraham, Nassar, Ahmed, Alfassi, Amit, Wu, Bo, Schwartz, Eli, Joshi, Dhiraj, Kondic, Jovana, Shabtay, Nimrod, Li, Pengyuan, Herzig, Roei, Abedin, Shafiq, Perek, Shaked, Harary, Sivan, Barzelay, Udi, Goldfarb, Adi Raz, Oliva, Aude, Wieles, Ben, Bhattacharjee, Bishwaranjan, Huang, Brandon, Auer, Christoph, Gutfreund, Dan, Beymer, David, Wood, David, Kuehne, Hilde, Hansen, Jacob, Shtok, Joseph, Wong, Ken, Bathen, Luis Angel, Mishra, Mayank, Lysak, Maksym, Dolfi, Michele, Yurochkin, Mikhail, Livathinos, Nikolaos, Harel, Nimrod, Azulai, Ophir, Naparstek, Oshri, de Lima, Rafael Teixeira, Panda, Rameswar, Doveh, Sivan, Gupta, Shubham, Das, Subhro, Zawad, Syed, Kim, Yusik, He, Zexue, Brooks, Alexander, Goodhart, Gabe, Govindjee, Anita, Leist, Derek, Ibrahim, Ibrahim, Soffer, Aya, Cox, David, Soule, Kate, Lastras, Luis, Desai, Nirmit, Ofek-koifman, Shila, Raghavan, Sriram, Syeda-Mahmood, Tanveer, Staar, Peter, Drory, Tal, Feris, Rogerio

arXiv.org Artificial Intelligence, Feb-14-2025

Ensuring the safety of generative MLLMs is crucial to prevent harm, build trust, address ethical concerns, and enable their responsible deployment in real-world applications. Our results demonstrate that Granite Vision performs almost on par with baselines on the VLM-as-a-Judge task, despite being the lightest MLLM in the comparison pool. Notably, the addition of Safety Vectors to Granite Vision leads to a significant improvement in safety classification performance. We acknowledge that further work is needed to improve high-level reasoning and to correct occasional incorrect outputs, so that the model becomes more reliable in sensitive tasks that require nuanced classification. To address these issues, we will incorporate more reasoning-focused and structure-related data into the training process in the future. In addition, we showed in this paper that finding safety vectors (SVs) in Granite Vision's attention heads led to significant improvements when safety tasks were reformulated as classification problems. SVs currently rely on few-shot samples, which are informative but may capture only a limited range of the safety issues that can be encountered. To further improve the model's ability to identify and address safety concerns, we plan to investigate scaling up SVs using more training data in future research.

large language model, machine learning, natural language, (22 more...)

2502.09927

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.68)

Industry:

Education (1.00)
Banking & Finance > Trading (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

ReTreever: Tree-based Coarse-to-Fine Representations for Retrieval

Gupta, Shubham, Li, Zichao, Chen, Tianyi, Subakan, Cem, Reddy, Siva, Taslakian, Perouz, Zantedeschi, Valentina

arXiv.org Artificial Intelligence, Feb-11-2025

Document retrieval is a core component of question-answering systems, as it enables conditioning answer generation on new and large-scale corpora. While effective, the standard practice of encoding documents into high-dimensional embeddings for similarity search entails large memory and compute footprints, and also makes it hard to inspect the inner workings of the system. In this paper, we propose a tree-based method for organizing and representing reference documents at various granular levels, which offers the flexibility to balance cost and utility, and eases the inspection of the corpus content and retrieval operations. Our method, called ReTreever, jointly learns a routing function per internal node of a binary tree such that query and reference documents are assigned to similar tree branches, hence directly optimizing for retrieval performance. Our evaluations show that ReTreever generally preserves full representation accuracy. Its hierarchical structure further provides strong coarse representations and enhances transparency by indirectly learning meaningful semantic groupings. Among hierarchical retrieval methods, ReTreever achieves the best retrieval accuracy at the lowest latency, proving that this family of techniques can be viable in practical applications.

machine learning, ndcg, question answering, (17 more...)

2502.07971

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Industry:

Media (0.45)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.88)
(2 more...)
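
The routing idea in the ReTreever abstract above can be illustrated with a minimal sketch. It assumes a complete binary tree stored in heap order and a linear routing function with a sigmoid at each internal node; the function name `level_distributions`, the `weights` layout, and the choice of a linear router are hypothetical and are not taken from the paper. The distribution over nodes at a shallow level serves as a coarse representation, and deeper levels give finer ones.

```python
import math


def sigmoid(x):
    """Logistic function mapping a routing score to a left-branch probability."""
    return 1.0 / (1.0 + math.exp(-x))


def level_distributions(embedding, weights, depth):
    """Return one probability distribution per tree level, from coarse to fine.

    Assumption: weights[i] is a hypothetical linear routing vector for
    internal node i, with nodes in heap order (root 0, children 2i+1, 2i+2).
    A tree of the given depth needs 2**depth - 1 routing vectors.
    """
    levels = []
    current = [1.0]  # all probability mass starts at the root
    first = 0        # heap index of the first node on the current level
    for _ in range(depth):
        nxt = []
        for offset, mass in enumerate(current):
            node = first + offset
            score = sum(w * x for w, x in zip(weights[node], embedding))
            p_left = sigmoid(score)
            # Split this node's mass between its left and right children.
            nxt.extend([mass * p_left, mass * (1.0 - p_left)])
        levels.append(nxt)
        first = 2 * first + 1
        current = nxt
    return levels
```

Under these assumptions, a query and a document could be compared by a similarity between their distributions at a chosen level, trading accuracy against cost by picking a shallower or deeper level.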

LAST SToP For Modeling Asynchronous Time Series

Gupta, Shubham, Durand, Thibaut, Taylor, Graham, Białokozowicz, Lilian W.

arXiv.org Artificial Intelligence, Feb-3-2025

We present a novel prompt design for Large Language Models (LLMs) tailored to Asynchronous Time Series. Unlike regular time series, which assume values at evenly spaced time points, asynchronous time series consist of timestamped events occurring at irregular intervals, each described in natural language. Our approach effectively utilizes the rich natural language of event descriptions, allowing LLMs to benefit from their broad world knowledge for reasoning across different domains and tasks. This allows us to extend the scope of asynchronous time series analysis beyond forecasting to include tasks like anomaly detection and data imputation. We further introduce Stochastic Soft Prompting, a novel prompt-tuning mechanism that significantly improves model performance, outperforming existing fine-tuning methods such as QLoRA. Through extensive experiments on real-world datasets, we demonstrate that our approach achieves state-of-the-art performance across different tasks and datasets.

large language model, machine learning, natural language, (18 more...)

2502.01922

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.68)

Industry:

Leisure & Entertainment > Sports (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
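
To make the notion of an asynchronous time series in the abstract above concrete, the following sketch serializes timestamped natural-language events into a text prompt. The layout (a task header followed by one `gap,description` line per event) and the function name `build_prompt` are illustrative assumptions, not the prompt format used in the paper.

```python
def build_prompt(events, task="forecast the next event"):
    """Serialize (timestamp_seconds, description) pairs into prompt text.

    Assumption: events are already sorted by time. Each event is written as
    the gap since the previous event plus its description, so irregular
    spacing is visible to the model as plain text.
    """
    lines = []
    previous = None
    for timestamp, description in events:
        gap = 0.0 if previous is None else timestamp - previous
        lines.append(f"{gap:.1f},{description}")
        previous = timestamp
    header = (
        f"Task: {task}. "
        "Each line is <seconds since previous event>,<event description>."
    )
    return "\n".join([header, *lines])
```

Changing only the `task` string is one way the same event encoding could be reused for forecasting, anomaly detection, or imputation, which is the flexibility the abstract points to.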

Docling: An Efficient Open-Source Toolkit for AI-driven Document Conversion

Livathinos, Nikolaos, Auer, Christoph, Lysak, Maksym, Nassar, Ahmed, Dolfi, Michele, Vagenas, Panos, Ramis, Cesar Berrospi, Omenetti, Matteo, Dinkla, Kasper, Kim, Yusik, Gupta, Shubham, de Lima, Rafael Teixeira, Weber, Valery, Morin, Lucas, Meijer, Ingmar, Kuropiatnyk, Viktor, Staar, Peter W. J.

arXiv.org Artificial Intelligence, Jan-27-2025

We introduce Docling, an easy-to-use, self-contained, MIT-licensed, open-source toolkit for document conversion that can parse several popular document formats into a unified, richly structured representation. It is powered by state-of-the-art specialized AI models for layout analysis (DocLayNet) and table structure recognition (TableFormer), and runs efficiently on commodity hardware within a small resource budget. Docling is released as a Python package and can be used as a Python API or as a CLI tool. Docling's modular architecture and efficient document representation make it easy to implement extensions, new features, models, and customizations. Docling has already been integrated into other popular open-source frameworks (e.g., LangChain, LlamaIndex, spaCy), making it a natural fit for the processing of documents and the development of high-end applications. The open-source community has fully engaged in using, promoting, and developing for Docling, which gathered 10k stars on GitHub in less than a month and was reported as the No. 1 trending repository on GitHub worldwide in November 2024.

artificial intelligence, machine learning, natural language, (18 more...)

2501.17887

Country:

Europe > Switzerland (0.14)
North America > United States (0.14)

Genre: Research Report (0.40)

Industry: Information Technology (0.69)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Samudra: An AI Global Ocean Emulator for Climate

Dheeshjith, Surya, Subel, Adam, Adcroft, Alistair, Busecke, Julius, Fernandez-Granda, Carlos, Gupta, Shubham, Zanna, Laure

arXiv.org Artificial Intelligence, Dec-19-2024

AI emulators for forecasting have emerged as powerful tools that can outperform conventional numerical predictions. The next frontier is to build emulators for long climate simulations with skill across a range of spatiotemporal scales, a particularly important goal for the ocean. Our work builds a skillful global emulator of the ocean component of a state-of-the-art climate model. We emulate key ocean variables (sea surface height, horizontal velocities, temperature, and salinity) across their full depth. We use a modified ConvNeXt UNet architecture trained on multiple depth levels of ocean data. We show that the ocean emulator, Samudra, which exhibits no drift relative to the truth, can reproduce the depth structure of ocean variables and their interannual variability. Samudra is stable for centuries and 150 times faster than the original ocean model. Samudra struggles to simultaneously capture the correct magnitude of the forcing trends and remain stable, so further work is required.

artificial intelligence, emulator, machine learning, (17 more...)

2412.03795

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Know Your RAG: Dataset Taxonomy and Generation Strategies for Evaluating RAG Systems

de Lima, Rafael Teixeira, Gupta, Shubham, Berrospi, Cesar, Mishra, Lokesh, Dolfi, Michele, Staar, Peter, Vagenas, Panagiotis

arXiv.org Artificial Intelligence, Nov-29-2024

Retrieval Augmented Generation (RAG) systems are a widespread application of Large Language Models (LLMs) in the industry. While many tools exist empowering developers to build their own systems, measuring their performance locally, with datasets reflective of the system's use cases, is a technological challenge. Solutions to this problem range from non-specific and cheap (most public datasets) to specific and costly (generating data from local documents). In this paper, we show that using public question and answer (Q&A) datasets to assess retrieval performance can lead to non-optimal systems design, and that common tools for RAG dataset generation can lead to unbalanced data. We propose solutions to these issues based on the characterization of RAG datasets through labels and through label-targeted data generation. Finally, we show that fine-tuned small LLMs can efficiently generate Q&A datasets. We believe that these observations are invaluable to the know-your-data step of RAG systems development.

large language model, machine learning, natural language, (21 more...)

2411.19710

Country: Europe (0.46)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Dynamic HumTrans: Humming Transcription Using CNNs and Dynamic Programming

Gupta, Shubham, Gomez-Sarmiento, Isaac Neri, Mezdari, Faez Amjed, Ravanelli, Mirco, Subakan, Cem

arXiv.org Artificial Intelligence, Oct-7-2024

We propose a novel approach for humming transcription that combines a CNN-based architecture with a dynamic programming-based post-processing algorithm, utilizing the recently introduced HumTrans dataset. We identify and address inherent problems with the offset and onset ground truth provided by the dataset, offering heuristics to improve these annotations, resulting in a dataset with precise annotations that will aid future research. Additionally, we compare the transcription accuracy of our method against several others, demonstrating state-of-the-art (SOTA) results. All our code and the corrected dataset are available at https://github.com/shubham-gupta-30/humming_transcription

artificial intelligence, machine learning, transcription, (17 more...)

2410.05455

Genre: Research Report (0.70)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Statements: Universal Information Extraction from Tables with Large Language Models for ESG KPIs

Mishra, Lokesh, Dhibi, Sohayl, Kim, Yusik, Ramis, Cesar Berrospi, Gupta, Shubham, Dolfi, Michele, Staar, Peter

arXiv.org Artificial Intelligence, Jun-27-2024

Environment, Social, and Governance (ESG) KPIs assess an organization's performance on issues such as climate change, greenhouse gas emissions, water consumption, waste management, human rights, diversity, and policies. ESG reports convey this valuable quantitative information through tables. Unfortunately, extracting this information is difficult due to high variability in both table structure and content. We propose Statements, a novel domain-agnostic data structure for extracting quantitative facts and related information. We propose translating tables to statements as a new supervised deep-learning universal information extraction task. We introduce SemTabNet, a dataset of over 100K annotated tables. Investigating a family of T5-based Statement Extraction Models, our best model generates statements which are 82% similar to the ground truth (compared to a baseline of 21%). We demonstrate the advantages of statements by applying our model to over 2700 tables from ESG reports. The homogeneous nature of statements permits exploratory data analysis on expansive information found in large collections of ESG reports.

large language model, machine learning, node, (18 more...)

2406.19102

Country:

Europe > Switzerland (0.14)
Europe > Ireland (0.14)
Asia > Middle East (0.14)
Africa > Ethiopia (0.14)

Genre: Research Report (0.81)

Industry:

Water & Waste Management > Solid Waste Management (0.67)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Robust Training of Temporal GNNs using Nearest Neighbours based Hard Negatives

Gupta, Shubham, Bedathur, Srikanta

arXiv.org Artificial Intelligence, Feb-14-2024

Temporal graph neural networks (TGNNs) have exhibited state-of-the-art performance in future-link prediction tasks. These TGNNs are trained with an unsupervised loss based on uniform random sampling of negatives. During training, in the context of a positive example, the loss is computed over uninformative negatives, which introduces redundancy and leads to sub-optimal performance. In this paper, we propose a modified unsupervised learning scheme for TGNNs that replaces uniform negative sampling with importance-based negative sampling. We theoretically motivate and define the dynamically computed distribution used to sample negative examples. Finally, using empirical evaluations over three real-world datasets, we show that TGNNs trained with a loss based on the proposed negative sampling provide consistently superior performance.

data mining, machine learning, node, (21 more...)

doi: 10.1145/3632410.3632464

2402.09239

Country:

Europe (0.68)
North America > United States > New York (0.15)
Asia > India > NCT (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
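
The contrast between uniform and importance-based negative sampling described in the abstract above can be sketched as follows. The abstract does not give the sampling distribution, so this example substitutes a softmax over dot-product similarity to the source node's embedding as a stand-in; the function names and the `temperature` parameter are hypothetical. Candidates closer to the source node are sampled more often, which yields harder negatives than uniform sampling.

```python
import math
import random


def negative_sampling_weights(source_vec, candidate_vecs, temperature=1.0):
    """Softmax over dot-product similarity to the source embedding.

    Assumption: this distribution is a stand-in for illustration; it is not
    the distribution defined in the paper.
    """
    scores = [
        sum(a * b for a, b in zip(source_vec, vec)) / temperature
        for vec in candidate_vecs
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_negatives(source_vec, candidates, k, temperature=1.0, rng=random):
    """Draw k negative node ids, favouring candidates near the source node.

    candidates maps node id -> embedding; the caller is expected to exclude
    the true positive destination before calling.
    """
    ids = list(candidates)
    weights = negative_sampling_weights(
        source_vec, [candidates[i] for i in ids], temperature
    )
    return rng.choices(ids, weights=weights, k=k)
```

A very large `temperature` flattens the weights toward the uniform baseline, so the same function covers both regimes in this sketch.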
