Patel, Yash
Three Things to Know about Deep Metric Learning
Patel, Yash, Tolias, Giorgos, Matas, Jiri
This paper addresses supervised deep metric learning for open-set image retrieval, focusing on three key aspects: the loss function, mixup regularization, and model initialization. In deep metric learning, optimizing the retrieval evaluation metric, recall@k, via gradient descent is desirable but challenging due to its non-differentiable nature. To overcome this, we propose a differentiable surrogate loss that is computed on large batches, nearly equivalent to the entire training set. This computationally intensive process is made feasible through an implementation that bypasses GPU memory limitations. Additionally, we introduce an efficient mixup regularization technique that operates on pairwise scalar similarities, effectively increasing the batch size even further. The training process is further enhanced by initializing the vision encoder with foundation models pre-trained on large-scale datasets. Through a systematic study of these components, we demonstrate that their synergy enables large models to nearly solve popular benchmarks.
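To make the core idea concrete, below is a minimal sketch of a differentiable recall@k surrogate: the hard rank of each gallery item is relaxed with sigmoids over pairwise similarity differences, and recall@k becomes a soft "at least one positive in the top k" per query. The temperatures `tau_rank` and `tau_topk` and all other details are illustrative assumptions, not the paper's exact relaxation, and the O(B^3) tensor here is precisely the memory cost a chunked large-batch implementation would avoid.

```python
import torch
import torch.nn.functional as F

def recall_at_k_surrogate(emb, labels, k=4, tau_rank=0.01, tau_topk=1.0):
    """Smooth recall@k surrogate (illustrative sketch, not the paper's exact loss)."""
    z = F.normalize(emb, dim=1)
    sim = z @ z.T                                        # (B, B) cosine similarities
    B = sim.size(0)
    eye = torch.eye(B, dtype=torch.bool, device=sim.device)
    sim = sim.masked_fill(eye, -1e4)                     # exclude self-matches
    # Smooth rank of gallery item j for query i:
    #   rank_ij ~ 1 + sum_l sigmoid((sim_il - sim_ij) / tau_rank)
    diff = sim.unsqueeze(1) - sim.unsqueeze(2)           # (B, B, B); O(B^3) memory
    rank = 1.0 + torch.sigmoid(diff / tau_rank).sum(dim=2)
    # Smooth indicator that item j falls within the top k.
    in_topk = torch.sigmoid((k + 0.5 - rank) / tau_topk)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    # recall@k per query: at least one positive in the top k (soft via max).
    per_query = (in_topk * pos).amax(dim=1)
    return 1.0 - per_query.mean()                        # loss to minimize

emb = torch.randn(32, 128, requires_grad=True)
labels = torch.randint(0, 8, (32,))
loss = recall_at_k_surrogate(emb, labels)
loss.backward()
print(loss.item())
```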
Conformalized Late Fusion Multi-View Learning
Rivera, Eduardo Ochoa, Patel, Yash, Tewari, Ambuj
Uncertainty quantification for multi-view learning is motivated by the increasing use of multi-view data in scientific problems. A common variant of multi-view learning is late fusion: train separate predictors on individual views and combine them after single-view predictions are available. Existing methods for uncertainty quantification for late fusion often rely on undesirable distributional assumptions for validity. Conformal prediction is one approach that avoids such distributional assumptions. However, naively applying conformal prediction to late-stage fusion pipelines often produces overly conservative and uninformative prediction regions, limiting its downstream utility. We propose a novel methodology, Multi-View Conformal Prediction (MVCP), where conformal prediction is instead performed separately on the single-view predictors and only fused subsequently. Our framework extends the standard scalar formulation of a score function to a multivariate score that produces more efficient downstream prediction regions in both classification and regression settings. We then demonstrate that such improvements can be realized in methods built atop conformalized regressors, specifically in robust predict-then-optimize pipelines.
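The following toy sketch illustrates the conformalize-then-fuse idea in one simplified form: each view gets its own nonconformity score, the per-view scores are combined into a single multivariate (here, max) score, and the calibrated region is the intersection of per-view intervals. The data-generating process and the max-score combination are illustrative assumptions; MVCP's actual multivariate score construction is more refined.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.1

# Synthetic two-view regression data (hypothetical setup for illustration).
n = 3000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = x1 + 0.5 * x2 + rng.normal(scale=0.3, size=n)
train, cal, test = np.split(rng.permutation(n), [1500, 2250])

# Late fusion: one simple predictor per view.
w1 = np.polyfit(x1[train], y[train], 1)
w2 = np.polyfit(x2[train], y[train], 1)
f1 = lambda i: np.polyval(w1, x1[i])
f2 = lambda i: np.polyval(w2, x2[i])

# Multivariate-score idea (simplified): calibrate the max of per-view
# residuals, so the region is the intersection of per-view intervals.
scores = np.maximum(np.abs(y[cal] - f1(cal)), np.abs(y[cal] - f2(cal)))
m = scores.size
qhat = np.quantile(scores, np.ceil((m + 1) * (1 - alpha)) / m)

lo = np.maximum(f1(test), f2(test)) - qhat
hi = np.minimum(f1(test), f2(test)) + qhat
width = np.clip(hi - lo, 0, None)
print(f"coverage: {np.mean((y[test] >= lo) & (y[test] <= hi)):.3f}")
print(f"mean width: {width.mean():.3f}")
```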
Amortized Variational Inference with Coverage Guarantees
Patel, Yash, McNamara, Declan, Loper, Jackson, Regier, Jeffrey, Tewari, Ambuj
Amortized variational inference produces a posterior approximation that can be rapidly computed given any new observation. Unfortunately, there are few guarantees about the quality of these approximate posteriors. We propose Conformalized Amortized Neural Variational Inference (CANVI), a procedure that is scalable, easily implemented, and provides guaranteed marginal coverage. Given a collection of candidate amortized posterior approximators, CANVI constructs conformalized predictors based on each candidate, compares the predictors using a metric known as predictive efficiency, and returns the most efficient predictor. CANVI ensures that the resulting predictor constructs regions that contain the truth with a user-specified level of probability. CANVI is agnostic to design decisions in formulating the candidate approximators and only requires access to samples from the forward model, permitting its use in likelihood-free settings. We prove lower bounds on the predictive efficiency of the regions produced by CANVI and explore how the quality of a posterior approximation relates to the predictive efficiency of prediction regions based on that approximation. Finally, we demonstrate the accurate calibration and high predictive efficiency of CANVI on a suite of simulation-based inference benchmark tasks and an important scientific task: analyzing galaxy emission spectra.
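A minimal numerical sketch of the selection step follows, under strong simplifying assumptions: each candidate approximator is reduced to a posterior-mean function, the conformal score is an absolute residual, and predictive efficiency is proxied by region width. CANVI's actual score functions, approximator families, and efficiency analysis are richer than this toy.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 0.1

# Likelihood-free access to the forward model:
# theta ~ N(0, 1), x | theta ~ N(theta, 1).
def simulate(n):
    theta = rng.normal(size=n)
    return theta, theta + rng.normal(size=n)

# Hypothetical candidate amortized approximators, each summarized here by a
# posterior-mean function of x (the true posterior mean is 0.5 * x).
candidates = {
    "good": lambda x: 0.5 * x,
    "biased": lambda x: 0.2 * x,
    "prior-only": lambda x: 0.0 * x,
}

theta_cal, x_cal = simulate(5000)
best = None
for name, mu in candidates.items():
    scores = np.abs(theta_cal - mu(x_cal))            # conformal score per pair
    m = scores.size
    qhat = np.quantile(scores, np.ceil((m + 1) * (1 - alpha)) / m)
    width = 2 * qhat                                  # predictive-efficiency proxy
    print(f"{name}: region width {width:.2f}")
    if best is None or width < best[1]:
        best = (name, width, mu, qhat)

# The selected predictor retains marginal coverage on fresh simulations.
name, width, mu, qhat = best
theta_test, x_test = simulate(5000)
cov = np.mean(np.abs(theta_test - mu(x_test)) <= qhat)
print(f"selected {name}: coverage {cov:.3f}")
```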
Conformal Contextual Robust Optimization
Patel, Yash, Rayan, Sahana, Tewari, Ambuj
Predict-then-optimize or contextual robust optimization problems are of long-standing interest in safety-critical settings where decision-making happens under uncertainty (Sun, Liu, and Li, 2023; Elmachtoub and Grigas, 2022; Elmachtoub, Liang, and McNellis, 2020; Peršak and Anjos, 2023). In traditional robust optimization, solutions are made robust to the distributions anticipated to be present upon deployment (Ben-Tal, El Ghaoui, and Nemirovski, 2009; Beyer and Sendhoff, 2007). Since such decisions are sensitive to proper model specification, recent efforts have sought to supplant this with data-driven uncertainty regions (Cheramin et al., 2021; Bertsimas, Gupta, and Kallus, 2018; Shang and You, 2019; Johnstone and Cox, 2021). Model misspecification is even more pronounced in contextual robust optimization, spurring efforts to define similar data-driven uncertainty regions (Ohmori, 2021; Chenreddy, Bandi, and Delage, 2022; Sun, Liu, and Li, 2023). Such methods, however, focus on box- and ellipsoid-based uncertainty regions, both of which are necessarily convex and often overly conservative, resulting in suboptimal decision-making. Conformal prediction provides a principled framework for producing distribution-free prediction regions with marginal frequentist coverage guarantees (Angelopoulos and Bates, 2021; Shafer and Vovk, 2008).
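As a concrete illustration of the predict-then-optimize pipeline with conformal uncertainty, the sketch below builds a split-conformal interval for a context-dependent demand and then picks the order quantity minimizing worst-case newsvendor cost over that interval. The demand model and cost coefficients are hypothetical, the known regression function stands in for a fitted predictor, and a one-dimensional interval is the simplest possible uncertainty region, far simpler than the regions the paper targets.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha = 0.1

# Context-dependent demand: d | x ~ N(2x + 5, 1)  (hypothetical setup).
n = 3000
x = rng.uniform(0, 5, size=n)
d = 2 * x + 5 + rng.normal(size=n)
cal = slice(0, 2000)

# Point predictor for demand (here, the known regression function).
f = lambda x: 2 * x + 5

# Split-conformal interval for demand at miscoverage level alpha.
scores = np.abs(d[cal] - f(x[cal]))
m = scores.size
qhat = np.quantile(scores, np.ceil((m + 1) * (1 - alpha)) / m)

def robust_order(x0, c_over=1.0, c_under=3.0):
    """Order quantity minimizing worst-case newsvendor cost over the
    conformal demand interval [f(x0) - qhat, f(x0) + qhat]."""
    lo, hi = f(x0) - qhat, f(x0) + qhat
    grid = np.linspace(lo, hi, 201)
    worst = lambda a: max(c_over * (a - lo), c_under * (hi - a))
    return min(grid, key=worst)

print(f"interval half-width: {qhat:.2f}")
print(f"robust order at x=3: {robust_order(3.0):.2f}")
```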
Integrated Image and Location Analysis for Wound Classification: A Deep Learning Approach
Patel, Yash, Shah, Tirth, Dhar, Mrinal Kanti, Zhang, Taiyu, Niezgoda, Jeffrey, Gopalakrishnan, Sandeep, Yu, Zeyun
The global burden of acute and chronic wounds presents a compelling case for enhancing wound classification methods, a vital step in diagnosing and determining optimal treatments. Recognizing this need, we introduce an innovative multi-modal network based on a deep convolutional neural network for categorizing wounds into four categories: diabetic, pressure, surgical, and venous ulcers. Our multi-modal network uses wound images and their corresponding body locations for more precise classification. A unique aspect of our methodology is incorporating a body map system that facilitates accurate wound location tagging, improving upon traditional wound image classification techniques. A distinctive feature of our approach is the integration of models such as VGG16, ResNet152, and EfficientNet within a novel architecture. This architecture includes elements like spatial and channel-wise Squeeze-and-Excitation modules, Axial Attention, and an Adaptive Gated Multi-Layer Perceptron, providing a robust foundation for classification. Our multi-modal network was trained and evaluated on two distinct datasets comprising relevant images and corresponding location information. Notably, our proposed network outperformed traditional methods, reaching an accuracy range of 74.79% to 100% for Region of Interest (ROI) without location classifications, 73.98% to 100% for ROI with location classifications, and 78.10% to 100% for whole image classifications. This marks a significant enhancement over previously reported performance metrics in the literature. Our results indicate the potential of our multi-modal network as an effective decision-support tool for wound image classification, paving the way for its application in various clinical contexts.
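For orientation, here is a minimal sketch of the image-plus-location fusion pattern in PyTorch. The backbone choice (ResNet-18 as a lightweight stand-in for VGG16/ResNet152/EfficientNet), the embedding size, and the single gated-fusion layer are all illustrative assumptions; the paper's architecture additionally uses Squeeze-and-Excitation modules, Axial Attention, and an Adaptive Gated MLP.

```python
import torch
import torch.nn as nn
from torchvision import models

class WoundClassifier(nn.Module):
    """Minimal image + body-location fusion sketch (not the paper's exact model)."""
    def __init__(self, n_locations=20, n_classes=4, d=256):
        super().__init__()
        backbone = models.resnet18(weights=None)       # stand-in backbone
        backbone.fc = nn.Identity()
        self.image_encoder = backbone                  # -> 512-d image feature
        self.loc_embed = nn.Embedding(n_locations, d)  # body-map location id -> d
        self.img_proj = nn.Linear(512, d)
        self.gate = nn.Sequential(nn.Linear(2 * d, d), nn.Sigmoid())
        self.head = nn.Linear(d, n_classes)

    def forward(self, image, location_id):
        img = self.img_proj(self.image_encoder(image))
        loc = self.loc_embed(location_id)
        g = self.gate(torch.cat([img, loc], dim=1))    # learned per-feature gate
        return self.head(g * img + (1 - g) * loc)      # gated fusion of modalities

model = WoundClassifier()
logits = model(torch.randn(2, 3, 224, 224), torch.tensor([3, 7]))
print(logits.shape)  # torch.Size([2, 4])
```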
Diffusion Models for Probabilistic Deconvolution of Galaxy Images
Xue, Zhiwei, Li, Yuhang, Patel, Yash, Regier, Jeffrey
Telescopes capture images with a particular point spread function (PSF). Inferring what an image would have looked like with a much sharper PSF, a problem known as PSF deconvolution, is ill-posed because PSF convolution is not an invertible transformation. Deep generative models are appealing for PSF deconvolution because they can infer a posterior distribution over candidate images that, if convolved with the PSF, could have generated the observation. However, classical deep generative models such as VAEs and GANs often provide inadequate sample diversity. As an alternative, we propose a classifier-free conditional diffusion model for PSF deconvolution of galaxy images. We demonstrate that this diffusion model captures a greater diversity of possible deconvolutions compared to a conditional VAE.
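A generic sketch of classifier-free guidance conditioned on the observation is shown below: the model is queried with and without the conditioning input, and the two noise predictions are extrapolated with a guidance weight. The tiny MLP denoiser, flattened toy inputs, and guidance weight are placeholders (a real model would be a conditional U-Net over images), and the paper's exact sampler may differ.

```python
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Stand-in noise predictor; a real model would be a conditional U-Net."""
    def __init__(self, d=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * d + 1, 128), nn.ReLU(), nn.Linear(128, d))

    def forward(self, x_t, t, cond):
        if cond is None:                       # condition dropout at train time
            cond = torch.zeros_like(x_t)       # enables unconditional predictions
        t_feat = t.expand(x_t.size(0), 1)
        return self.net(torch.cat([x_t, cond, t_feat], dim=1))

def cfg_eps(model, x_t, t, y_obs, w=2.0):
    # Classifier-free guidance: extrapolate from the unconditional toward
    # the conditional noise prediction with guidance weight w.
    return (1 + w) * model(x_t, t, y_obs) - w * model(x_t, t, None)

model = TinyDenoiser()
x_t = torch.randn(4, 64)                      # noisy latent galaxy image (toy)
y = torch.randn(4, 64)                        # PSF-convolved observation (toy)
eps = cfg_eps(model, x_t, torch.tensor([[0.5]]), y)
print(eps.shape)  # torch.Size([4, 64])
```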
DocILE Benchmark for Document Information Localization and Extraction
Šimsa, Štěpán, Šulc, Milan, Uřičář, Michal, Patel, Yash, Hamdi, Ahmed, Kocián, Matěj, Skalický, Matyáš, Matas, Jiří, Doucet, Antoine, Coustaty, Mickaël, Karatzas, Dimosthenis
This paper introduces the DocILE benchmark with the largest dataset of business documents for the tasks of Key Information Localization and Extraction and Line Item Recognition. It contains 6.7k annotated business documents, 100k synthetically generated documents, and nearly 1M unlabeled documents for unsupervised pre-training. The dataset has been built with knowledge of domain- and task-specific aspects, resulting in the following key features: (i) annotations in 55 classes, which surpasses the granularity of previously published key information extraction datasets by a large margin; (ii) Line Item Recognition represents a highly practical information extraction task, where key information has to be assigned to items in a table; (iii) documents come from numerous layouts and the test set includes zero- and few-shot cases as well as layouts commonly seen in the training set. The benchmark comes with several baselines, including RoBERTa, LayoutLMv3, and a DETR-based Table Transformer, applied to both tasks of the DocILE benchmark, with results shared in this paper, offering a quick starting point for future work. The dataset, baselines, and supplementary material are available at https://github.com/rossumai/docile. Keywords: Document AI, Information Extraction, Line Item Recognition, Business Documents, Intelligent Document Processing
DocILE 2023 Teaser: Document Information Localization and Extraction
Šimsa, Štěpán, Šulc, Milan, Skalický, Matyáš, Patel, Yash, Hamdi, Ahmed
The lack of data for information extraction (IE) from semi-structured business documents is a real problem for the IE community. Publications relying on large-scale datasets use only proprietary, unpublished data due to the sensitive nature of such documents. Publicly available datasets are mostly small and domain-specific. The absence of a large-scale public dataset or benchmark hinders the reproducibility and cross-evaluation of published methods. The DocILE 2023 competition, hosted as a lab at the CLEF 2023 conference and as an ICDAR 2023 competition, will run the first major benchmark for the tasks of Key Information Localization and Extraction (KILE) and Line Item Recognition (LIR) from business documents. With thousands of annotated real documents from open sources, a hundred thousand synthetically generated documents, and nearly a million unlabeled documents, the DocILE lab comes with the largest publicly available dataset for KILE and LIR. We look forward to contributions from the Computer Vision, Natural Language Processing, Information Retrieval, and other communities. The data, baselines, code, and up-to-date information about the lab and competition are available at https://docile.rossum.ai/.