
Collaborating Authors

 Jain, Sarthak


Mitigating Bad Ground Truth in Supervised Machine Learning based Crop Classification: A Multi-Level Framework with Sentinel-2 Images

arXiv.org Artificial Intelligence

In agricultural management, precise Ground Truth (GT) data is crucial for accurate Machine Learning (ML) based crop classification. Yet, issues like crop mislabeling and incorrect land identification are common. We propose a multi-level GT cleaning framework that utilizes multi-temporal Sentinel-2 data to address these issues. Specifically, the framework generates embeddings for farmlands, clusters similar crop profiles, and identifies outliers that indicate GT errors. We validated clusters with False Colour Composite (FCC) checks and used distance-based metrics to scale and automate this verification process. The importance of cleaning the GT data became apparent when models were trained on the clean and unclean data: for instance, a Random Forest model trained on the clean GT data achieved an F1 score up to 70 absolute percentage points higher. This approach advances crop classification methodologies, with potential applications in loan underwriting and agricultural decision-making.
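Below is a minimal sketch of the clustering-and-outlier idea described in the abstract, using synthetic temporal profiles in place of actual Sentinel-2 embeddings; the two-cluster setup, the variable names, and the 2-sigma distance threshold are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy stand-in for per-parcel temporal profiles (e.g., an NDVI time series over
# 12 Sentinel-2 acquisition dates per farmland parcel).
n_parcels, n_dates = 200, 12
profiles = np.vstack([
    np.sin(np.linspace(0, np.pi, n_dates)) + 0.05 * rng.normal(size=(100, n_dates)),  # "wheat"-like
    np.cos(np.linspace(0, np.pi, n_dates)) + 0.05 * rng.normal(size=(100, n_dates)),  # "maize"-like
])
labels = np.array(["wheat"] * 100 + ["maize"] * 100)
labels[:5] = "maize"  # simulate mislabelled ground truth

# Cluster similar crop profiles; flag parcels whose GT label disagrees with the
# cluster majority or which sit unusually far from their cluster centre.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(profiles)
cluster_of = kmeans.labels_
dist_to_centre = np.linalg.norm(profiles - kmeans.cluster_centers_[cluster_of], axis=1)

majority = {c: max(set(labels[cluster_of == c]), key=list(labels[cluster_of == c]).count)
            for c in range(2)}
disagrees = np.array([labels[i] != majority[cluster_of[i]] for i in range(n_parcels)])
outlier = dist_to_centre > dist_to_centre.mean() + 2 * dist_to_centre.std()
suspect = np.where(disagrees | outlier)[0]
print("Parcels flagged for FCC / manual verification:", suspect)
```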


Self-supervised Analogical Learning using Language Models

arXiv.org Artificial Intelligence

Large language models have been shown to suffer from reasoning inconsistency issues. That is, they fail more in situations unfamiliar to the training data, even though exact or very similar reasoning paths exist in more common cases that they can successfully solve. Such observations motivate us to propose methods that encourage models to understand the high-level and abstract reasoning processes during training instead of only the final answer. This way, models can transfer the exact solution to similar cases, regardless of their relevance to the pre-training data distribution. In this work, we propose SAL, a self-supervised analogical learning framework. SAL mimics the human analogy process and trains models to explicitly transfer high-quality symbolic solutions from cases that they know how to solve to other rare cases in which they tend to fail more. We show that the resulting models after SAL learning outperform base language models on a wide range of reasoning benchmarks, such as StrategyQA, GSM8K, and HotpotQA, by 2% to 20%. At the same time, we show that our model is more generalizable and controllable through analytical studies.
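A very rough sketch of the analogy-transfer idea (hypothetical helper functions and toy cases, not the SAL training code): symbolic solutions that the model verifiably gets right on familiar cases are abstracted and reused as supervision targets for analogous rare cases.

```python
# Hypothetical helpers: `generate` stands in for any language-model call and
# `is_correct` executes a candidate symbolic solution on a case with a known answer.
def generate(prompt: str) -> str:
    """Placeholder for a language-model call; returns a candidate symbolic solution."""
    return "def solve(a, b): return a * b"

def is_correct(solution_code: str, example) -> bool:
    scope = {}
    exec(solution_code, scope)                      # trusted toy code only
    return scope["solve"](*example["inputs"]) == example["answer"]

familiar = {"question": "What is 6 times 7?", "inputs": (6, 7), "answer": 42}
rare = {"question": "A crate holds 13 rows of 17 apples. How many apples?", "inputs": (13, 17)}

training_pairs = []
candidate = generate(f"Write a Python function solving: {familiar['question']}")
if is_correct(candidate, familiar):
    # Transfer the verified abstract solution to the analogous rare case.
    training_pairs.append((rare["question"], candidate))
print(training_pairs)
```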


ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models

arXiv.org Artificial Intelligence

Recent developments in Large Multimodal Models (LMMs) have broadened their capabilities to include video understanding. Specifically, Text-to-Video (T2V) models have made significant progress in quality, comprehension, and duration, excelling at creating videos from simple textual prompts. Yet, they still frequently produce hallucinated content that clearly signals the video is AI-generated. We introduce ViBe: a large-scale Text-to-Video Benchmark of hallucinated videos from T2V models. We identify five major types of hallucination: Vanishing Subject, Numeric Variability, Temporal Dysmorphia, Omission Error, and Physical Incongruity. Using 10 open-source T2V models, we developed the first large-scale dataset of hallucinated videos, comprising 3,782 videos annotated by humans into these five categories. ViBe offers a unique resource for evaluating the reliability of T2V models and provides a foundation for improving hallucination detection and mitigation in video generation. We establish classification as a baseline task and evaluate various ensemble classifier configurations, with the TimeSFormer + CNN combination yielding the best performance (0.345 accuracy and 0.342 F1 score). This benchmark aims to drive the development of robust T2V models that produce videos more accurately aligned with input prompts.
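As an illustration of the kind of baseline classification described above (not the paper's TimeSFormer + CNN pipeline), the following sketch fuses two pre-extracted feature sets by concatenation and reports accuracy and macro-F1 over the five hallucination categories; the features and labels here are random stand-ins.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

CLASSES = ["Vanishing Subject", "Numeric Variability", "Temporal Dysmorphia",
           "Omission Error", "Physical Incongruity"]

rng = np.random.default_rng(0)
n = 500
feats_video = rng.normal(size=(n, 768))    # stand-in for a video-transformer embedding
feats_frame = rng.normal(size=(n, 512))    # stand-in for a frame-level CNN embedding
y = rng.integers(0, len(CLASSES), size=n)  # random stand-in labels

X = np.hstack([feats_video, feats_frame])  # late fusion by concatenation
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = clf.predict(X_te)
print("accuracy:", accuracy_score(y_te, pred), "macro-F1:", f1_score(y_te, pred, average="macro"))
```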


Are Music Foundation Models Better at Singing Voice Deepfake Detection? Far-Better Fuse them with Speech Foundation Models

arXiv.org Artificial Intelligence

In this study, for the first time, we extensively investigate whether music foundation models (MFMs) or speech foundation models (SFMs) work better for singing voice deepfake detection (SVDD), which has recently attracted attention in the research community. To this end, we perform a comprehensive comparative study of state-of-the-art (SOTA) MFMs (MERT variants and music2vec) and SFMs (pre-trained for general speech representation learning as well as speaker recognition). We show that speaker recognition SFM representations perform the best amongst all the foundation models (FMs), and this performance can be attributed to their higher efficacy in capturing characteristics such as pitch, tone, and intensity present in singing voices. We also explore the fusion of FMs to exploit their complementary behavior for improved SVDD, and we propose FIONA, a novel framework for the same. With FIONA, through the synchronization of x-vector (speaker recognition SFM) and MERT-v1-330M (MFM) representations, we report the best performance with the lowest Equal Error Rate (EER) of 13.74%, beating all the individual FMs as well as baseline FM fusions and achieving SOTA results.
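A hedged sketch of concatenation-based foundation-model fusion for SVDD and of computing the Equal Error Rate; the embeddings below are random stand-ins for x-vector and MERT features, and this is not the released FIONA code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
n = 1000
xvec = rng.normal(size=(n, 512))        # stand-in for speaker-recognition SFM features
mert = rng.normal(size=(n, 768))        # stand-in for MFM (music) features
y = rng.integers(0, 2, size=n)          # 1 = deepfake singing voice, 0 = bona fide

X = np.hstack([xvec, mert])             # simple concatenation fusion
clf = LogisticRegression(max_iter=1000).fit(X[:800], y[:800])
scores = clf.predict_proba(X[800:])[:, 1]

fpr, tpr, _ = roc_curve(y[800:], scores)
fnr = 1 - tpr
eer = fpr[np.nanargmin(np.abs(fnr - fpr))]   # EER: operating point where FPR == FNR
print(f"EER: {eer:.3f}")
```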


From Instructions to Constraints: Language Model Alignment with Automatic Constraint Verification

arXiv.org Artificial Intelligence

User alignment is crucial for adapting general-purpose language models (LMs) to downstream tasks, but human annotations are often not available for all types of instructions, especially those with customized constraints. We observe that user instructions typically contain constraints. While assessing response quality in terms of the whole instruction is often costly, efficiently evaluating the satisfaction rate of constraints is feasible. We investigate common constraints in NLP tasks, categorize them into three classes based on the types of their arguments, and propose a unified framework, ACT (Aligning to ConsTraints), to automatically produce supervision signals for user alignment with constraints. Specifically, ACT uses constraint verifiers, which are typically easy to implement in practice, to compute the constraint satisfaction rate (CSR) of each response. It samples multiple responses for each prompt and collects preference labels based on their CSR automatically. Subsequently, ACT adapts the LM to the target task through a ranking-based learning process. Experiments on fine-grained entity typing, abstractive summarization, and temporal question answering show that ACT is able to enhance LMs' capability to adhere to different classes of constraints, thereby improving task performance. Further experiments show that the constraint-following capabilities are transferable.
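The constraint-verification idea lends itself to a short sketch: a cheap verifier computes the constraint satisfaction rate (CSR) of each sampled response, and responses are paired into preference labels for ranking-based fine-tuning. The constraints and sampled responses below are toy examples, not ACT's actual verifiers.

```python
from itertools import combinations

def csr(response: str, constraints) -> float:
    """Fraction of constraints satisfied; each constraint is a cheap boolean check."""
    return sum(c(response) for c in constraints) / len(constraints)

# Toy constraints for a summarization prompt: a length limit and a required keyword.
constraints = [
    lambda r: len(r.split()) <= 20,
    lambda r: "climate" in r.lower(),
]

samples = [
    "Climate report warns of rising sea levels this decade.",
    "The document discusses many topics in considerable and unnecessary detail, "
    "going on and on without mentioning the key subject at all whatsoever.",
]

scored = [(r, csr(r, constraints)) for r in samples]
# Preference pairs (preferred, rejected) for a ranking-based learning process.
pairs = [(a, b) if sa > sb else (b, a)
         for (a, sa), (b, sb) in combinations(scored, 2) if sa != sb]
print(pairs)
```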


Game-theoretic Counterfactual Explanation for Graph Neural Networks

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) have been a powerful tool for node classification tasks in complex networks. However, their decision-making processes remain a black box to users, making it challenging to understand the reasoning behind their predictions. Counterfactual explanations (CFE) have shown promise in enhancing the interpretability of machine learning models. Prior approaches to computing CFEs for GNNs are often learning-based and require additional training. In this paper, we propose a semivalue-based, non-learning approach to generate CFEs for node classification tasks, eliminating the need for any additional training. Our results reveal that computing Banzhaf values requires lower sample complexity for identifying counterfactual explanations than other popular methods such as computing Shapley values. Our empirical evidence indicates that computing Banzhaf values can achieve up to a fourfold speedup compared to Shapley values. We also design a thresholding method for computing Banzhaf values and show theoretical and empirical results on its robustness in noisy environments, making it superior to Shapley values. Furthermore, the thresholded Banzhaf values are shown to enhance efficiency without compromising the quality (i.e., fidelity) of the explanations on three popular graph datasets.
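For reference, the Banzhaf value of a player $i$ is its expected marginal contribution $\phi_i = \frac{1}{2^{|N|-1}} \sum_{S \subseteq N \setminus \{i\}} [v(S \cup \{i\}) - v(S)]$, which can be estimated by sampling coalitions that include each other player with probability 1/2. The sketch below uses a toy value function standing in for a GNN's prediction change; it is not the paper's implementation.

```python
import random

def banzhaf_estimate(players, value_fn, n_samples=2000, seed=0):
    """Monte Carlo Banzhaf values: average marginal contribution over random coalitions."""
    rng = random.Random(seed)
    est = {p: 0.0 for p in players}
    for p in players:
        for _ in range(n_samples):
            coalition = {q for q in players if q != p and rng.random() < 0.5}
            est[p] += value_fn(coalition | {p}) - value_fn(coalition)
        est[p] /= n_samples
    return est

# Toy value function standing in for the change in a GNN's prediction when a set of
# neighbouring nodes/edges is kept; node 0 matters most, node 3 not at all.
weights = {0: 0.6, 1: 0.3, 2: 0.1, 3: 0.0}
value = lambda S: min(1.0, sum(weights[q] for q in S))
print(banzhaf_estimate(list(weights), value))
```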


A deep learning pipeline for cross-sectional and longitudinal multiview data integration

arXiv.org Machine Learning

Biomedical research now commonly integrates diverse data types or views from the same individuals to better understand the pathobiology of complex diseases, but the challenge lies in meaningfully integrating these diverse views. Existing methods often require the same type of data from all views (cross-sectional data only or longitudinal data only) or do not consider any class outcome in the integration method, which limits their applicability. To overcome these limitations, we have developed a pipeline that harnesses the power of statistical and deep learning methods to integrate cross-sectional and longitudinal data from multiple sources. Additionally, it identifies key variables contributing to the association between views and the separation among classes, providing deeper biological insights. This pipeline includes variable selection/ranking using linear and nonlinear methods, feature extraction using functional principal component analysis and Euler characteristics, and joint integration and classification using dense feed-forward networks and recurrent neural networks. We applied this pipeline to cross-sectional and longitudinal multi-omics data (metagenomics, transcriptomics, and metabolomics) from an inflammatory bowel disease (IBD) study and identified microbial pathways, metabolites, and genes that discriminate by IBD status, providing information on the etiology of IBD. We conducted simulations to compare the two feature extraction methods. The proposed pipeline is available from the following GitHub repository: https://github.com/lasandrall/DeepIDA-GRU
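One of the feature-extraction ideas mentioned above, Euler characteristics, can be illustrated with a short sketch: for a one-dimensional longitudinal trajectory, the Euler characteristic of a superlevel set is simply the number of contiguous runs above the threshold, so sweeping thresholds yields a fixed-length feature vector. This is a minimal illustration, not the DeepIDA-GRU code.

```python
import numpy as np

def euler_characteristic_curve(signal, thresholds):
    curve = []
    for t in thresholds:
        above = signal >= t
        # Count runs of consecutive True values (connected components of the superlevel set).
        runs = np.sum(above[1:] & ~above[:-1]) + int(above[0])
        curve.append(int(runs))
    return np.array(curve)

time = np.linspace(0, 4 * np.pi, 200)
signal = np.sin(time) + 0.1 * np.random.default_rng(0).normal(size=time.size)
thresholds = np.linspace(-1.2, 1.2, 25)
ecc = euler_characteristic_curve(signal, thresholds)   # fixed-length vector usable as features
print(ecc)
```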


How Many and Which Training Points Would Need to be Removed to Flip this Prediction?

arXiv.org Artificial Intelligence

We consider the problem of identifying a minimal subset of training data $\mathcal{S}_t$ such that if the instances comprising $\mathcal{S}_t$ had been removed prior to training, the categorization of a given test point $x_t$ would have been different. Identifying such a set may be of interest for a few reasons. First, the cardinality of $\mathcal{S}_t$ provides a measure of robustness (if $|\mathcal{S}_t|$ is small for $x_t$, we might be less confident in the corresponding prediction), which we show is correlated with but complementary to predicted probabilities. Second, interrogation of $\mathcal{S}_t$ may provide a novel mechanism for contesting a particular model prediction: If one can make the case that the points in $\mathcal{S}_t$ are wrongly labeled or irrelevant, this may argue for overturning the associated prediction. Identifying $\mathcal{S}_t$ via brute-force is intractable. We propose comparatively fast approximation methods to find $\mathcal{S}_t$ based on influence functions, and find that -- for simple convex text classification models -- these approaches can often successfully identify relatively small sets of training examples which, if removed, would flip the prediction.
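A hedged sketch of the influence-function idea for a simple convex model: approximate each training example's effect on the test point's decision value via $x_t^\top H^{-1} \nabla_w \ell(z_i)$, remove the examples that most push toward a flip, and retrain to verify. The synthetic data, the greedy loop, and ignoring the intercept are simplifying assumptions, not the authors' exact estimator.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
x_t = X[0]                                  # treat the first instance as the test point
X_tr, y_tr = X[1:], y[1:]

C = 1.0
clf = LogisticRegression(C=C, max_iter=1000).fit(X_tr, y_tr)
w, b = clf.coef_.ravel(), clf.intercept_[0]
pred = clf.predict(x_t.reshape(1, -1))[0]

# Per-example gradients and Hessian of the L2-regularised logistic loss w.r.t. w
# (the intercept is ignored here for simplicity).
p = 1.0 / (1.0 + np.exp(-(X_tr @ w + b)))
grads = (p - y_tr)[:, None] * X_tr
H = (X_tr * (p * (1 - p))[:, None]).T @ X_tr + np.eye(X_tr.shape[1]) / C
H_inv = np.linalg.inv(H)

# Approximate change in the test decision value if example i were removed; flip it
# into a "flip-helpfulness" score when the current prediction is the positive class.
delta = grads @ H_inv @ x_t
helpfulness = -delta if pred == 1 else delta

order = np.argsort(-helpfulness)            # most flip-helpful examples first
for k in range(1, 60):
    keep = np.ones(len(X_tr), dtype=bool)
    keep[order[:k]] = False
    refit = LogisticRegression(C=C, max_iter=1000).fit(X_tr[keep], y_tr[keep])
    if refit.predict(x_t.reshape(1, -1))[0] != pred:
        print(f"Prediction flipped after removing {k} training points")
        break
else:
    print("No flip found within 60 removals")
```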


Does BERT Pretrained on Clinical Notes Reveal Sensitive Data?

arXiv.org Artificial Intelligence

Large Transformers pretrained over clinical notes from Electronic Health Records (EHR) have afforded substantial gains in performance on predictive clinical tasks. The cost of training such models (and the necessity of data access to do so) coupled with their utility motivates parameter sharing, i.e., the release of pretrained models such as ClinicalBERT. While most efforts have used deidentified EHR, many researchers have access to large sets of sensitive, non-deidentified EHR with which they might train a BERT model (or similar). Would it be safe to release the weights of such a model if they did? In this work, we design a battery of approaches intended to recover Personal Health Information (PHI) from a trained BERT. Specifically, we attempt to recover patient names and conditions with which they are associated. We find that simple probing methods are not able to meaningfully extract sensitive information from BERT trained over the MIMIC-III corpus of EHR. However, more sophisticated "attacks" may succeed in doing so. To facilitate such research, we make our experimental setup and baseline probing models available at https://github.com/elehman16/exposing_patient_data_release
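A minimal fill-in-the-blank probe in the spirit of the simple probing methods mentioned above, using a generic public masked LM rather than a model trained on non-deidentified notes (which would be the actual object of study):

```python
from transformers import pipeline

# Any masked LM works for the mechanics of the probe; swapping in a model pretrained
# on sensitive notes is what would make the leakage question meaningful.
fill = pipeline("fill-mask", model="bert-base-uncased")

probe = "Mr. [MASK] was admitted with acute pneumonia and started on antibiotics."
for candidate in fill(probe, top_k=5):
    # Inspect whether any specific real name is ranked suspiciously high.
    print(f"{candidate['token_str']:>12s}  {candidate['score']:.4f}")
```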


Learning to Identify Patients at Risk of Uncontrolled Hypertension Using Electronic Health Records Data

arXiv.org Machine Learning

Hypertension is a major risk factor for stroke, cardiovascular disease, and end-stage renal disease, and its prevalence is expected to rise dramatically. Effective hypertension management is thus critical. A particular priority is decreasing the incidence of uncontrolled hypertension. Early identification of patients at risk for uncontrolled hypertension would allow targeted use of personalized, proactive treatments. We develop machine learning models (logistic regression and recurrent neural networks) to stratify patients with respect to the risk of exhibiting uncontrolled hypertension within the coming three-month period. We trained and tested models using EHR data from 14,407 and 3,009 patients, respectively. The best model achieved an AUROC of 0.719, outperforming the simple, competitive baseline of predicting based on the last BP measure alone (0.634). Perhaps surprisingly, recurrent neural networks did not outperform a simple logistic regression for this task, suggesting that linear models should be included as strong baselines for predictive tasks using EHR data.
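An illustrative comparison with synthetic data (not the study's EHR features) between a last-blood-pressure-only baseline and a logistic regression that also uses a few extra covariates, scored by AUROC as in the abstract:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
last_sbp = rng.normal(140, 20, n)                 # most recent systolic BP
age = rng.normal(60, 12, n)
med_adherence = rng.uniform(0, 1, n)
logit = 0.04 * (last_sbp - 140) + 0.02 * (age - 60) - 1.5 * med_adherence
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))     # uncontrolled HTN within 3 months (synthetic)

X = np.column_stack([last_sbp, age, med_adherence])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

baseline_auc = roc_auc_score(y_te, X_te[:, 0])    # rank patients by last BP alone
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
model_auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"last-BP baseline AUROC: {baseline_auc:.3f}  logistic regression AUROC: {model_auc:.3f}")
```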