AITopics | Tickoo, Omesh

Tickoo, Omesh

Uncertainty Quantification in Continual Open-World Learning

Rios, Amanda S., Ndiour, Ibrahima J., Datta, Parual, Sydir, Jaroslaw, Tickoo, Omesh, Ahuja, Nilesh

arXiv.org Artificial Intelligence, Dec-20-2024

AI deployed in the real world should be capable of autonomously adapting to novelties encountered after deployment. Yet, in the field of continual learning, the reliance on novelty and labeling oracles is commonplace albeit unrealistic. This paper addresses a challenging and under-explored problem: a deployed AI agent that continuously encounters unlabeled data, which may include both unseen samples of known classes and samples from novel (unknown) classes, and must adapt to it continuously. To tackle this challenge, we propose COUQ ("Continual Open-world Uncertainty Quantification"), an iterative uncertainty estimation algorithm tailored for learning in generalized continual open-world multi-class settings. We rigorously apply and evaluate COUQ on key sub-tasks in the Continual Open-World: continual novelty detection, uncertainty guided active learning, and uncertainty guided pseudo-labeling for semi-supervised CL. We demonstrate the effectiveness of our method across multiple datasets, ablations, and backbones, with performance superior to the state of the art.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2412.16409

Country: North America (0.46)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

CUAL: Continual Uncertainty-aware Active Learner

Rios, Amanda, Ndiour, Ibrahima, Datta, Parual, Sydir, Jerry, Tickoo, Omesh, Ahuja, Nilesh

arXiv.org Artificial Intelligence, Dec-12-2024

AI deployed in many real-world use cases should be capable of adapting to novelties encountered after deployment. Here, we consider a challenging, under-explored and realistic continual adaptation problem: a deployed AI agent is continuously provided with unlabeled data that may contain not only unseen samples of known classes but also samples from novel (unknown) classes. In such a challenging setting, it has only a tiny labeling budget to query the most informative samples to help it continuously learn. We present a comprehensive solution to this complex problem with our model "CUAL" (Continual Uncertainty-aware Active Learner). CUAL leverages an uncertainty estimation algorithm to prioritize active labeling of ambiguous (uncertain) predicted novel class samples while also simultaneously pseudo-labeling the most certain predictions of each class. Evaluations across multiple datasets, ablations, settings and backbones (e.g. ViT foundation model) demonstrate our method's effectiveness. We will release our code upon acceptance.
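The query-versus-pseudo-label split described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the tuple format, the `novel_labels` set, and the fixed `certainty_threshold` are all assumptions made for illustration, and the uncertainty values are taken as given rather than computed.

```python
def split_by_uncertainty(samples, novel_labels, budget, certainty_threshold):
    """Split unlabeled samples into oracle queries and pseudo-labels.

    samples: list of (sample_id, predicted_label, uncertainty) tuples,
             with uncertainty in [0, 1] (hypothetical format).
    """
    # Candidates for querying: samples predicted to belong to a novel class.
    novel = [s for s in samples if s[1] in novel_labels]
    # Spend the small labeling budget on the most uncertain novel predictions.
    novel.sort(key=lambda s: s[2], reverse=True)
    to_query = [s[0] for s in novel[:budget]]
    queried = set(to_query)
    # Pseudo-label the remaining samples whose prediction is confident enough.
    pseudo_labels = {
        s[0]: s[1]
        for s in samples
        if s[0] not in queried and s[2] <= certainty_threshold
    }
    return to_query, pseudo_labels


# Toy data (made up): ids, predicted labels, and uncertainty scores.
samples = [
    ("a", "cat", 0.05),
    ("b", "novel_1", 0.90),
    ("c", "novel_1", 0.10),
    ("d", "novel_2", 0.60),
    ("e", "dog", 0.50),
]
to_query, pseudo_labels = split_by_uncertainty(
    samples, novel_labels={"novel_1", "novel_2"}, budget=1, certainty_threshold=0.2
)
```

In this toy run the single query goes to the most ambiguous novel prediction, while confident predictions of both known and novel classes receive pseudo-labels. Samples that are neither queried nor confident are left unlabeled.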

artificial intelligence, learning, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2412.09701

Country: North America (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

CONCLAD: COntinuous Novel CLAss Detector

Rios, Amanda, Ndiour, Ibrahima, Datta, Parual, Tickoo, Omesh, Ahuja, Nilesh

arXiv.org Artificial Intelligence, Dec-12-2024

In the field of continual learning, relying on so-called oracles for novelty detection is commonplace albeit unrealistic. This paper introduces CONCLAD ("COntinuous Novel CLAss Detector"), a comprehensive solution to the under-explored problem of continual novel class detection in post-deployment data. At each new task, our approach employs an iterative uncertainty estimation algorithm to differentiate between samples of known and novel classes, and to further discriminate between the different novel classes themselves. Samples predicted with high confidence to be from a novel class are automatically pseudo-labeled and used to update our model. Simultaneously, a tiny supervision budget is used to iteratively query ambiguous novel class predictions, which are also used during the update. Evaluation across multiple datasets, ablations and experimental settings demonstrates our method's effectiveness at separating novel and old class samples continuously. We will release our code upon acceptance.

artificial intelligence, data mining, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2412.10473

Country: North America (0.46)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Enhancing Trust in Large Language Models with Uncertainty-Aware Fine-Tuning

Krishnan, Ranganath, Khanna, Piyush, Tickoo, Omesh

arXiv.org Artificial Intelligence, Dec-3-2024

Large language models (LLMs) have revolutionized the field of natural language processing with their impressive reasoning and question-answering capabilities. However, these models are sometimes prone to generating credible-sounding but incorrect information, a phenomenon known as LLM hallucinations. Reliable uncertainty estimation in LLMs is essential for fostering trust in their generated responses and serves as a critical tool for the detection and prevention of erroneous or hallucinated outputs. To achieve reliable and well-calibrated uncertainty quantification in open-ended and free-form natural language generation, we propose an uncertainty-aware fine-tuning approach for LLMs. This approach enhances the model's ability to provide reliable uncertainty estimates without compromising accuracy, thereby guiding it to produce more trustworthy responses. We introduce a novel uncertainty-aware causal language modeling loss function, grounded in the principles of decision theory. Through rigorous evaluation on multiple free-form question-answering datasets and models, we demonstrate that our uncertainty-aware fine-tuning approach yields better calibrated uncertainty estimates in natural language generation tasks than fine-tuning with the standard causal language modeling loss. Furthermore, the experimental results show that the proposed method significantly improves the model's ability to detect hallucinations and identify out-of-domain prompts.

Large Language Models (LLMs) have shown remarkable success in various natural language processing tasks (Touvron et al., 2023; Gemma et al., 2024; Achiam et al., 2023) and are becoming ubiquitous in a variety of domains for their decision-making and reasoning abilities (Eigner & Händler, 2024). However, their real-world deployment, particularly in high-stakes and safety-critical applications, is hindered by challenges such as hallucinations and out-of-domain prompts, which can lead to the generation of erroneous or nonsensical outputs. Hallucinations, often described as plausible-sounding but incorrect or unfaithful model generations (Ji et al., 2023), present a crucial challenge in developing trustworthy systems, especially in critical domains such as medicine (Ahmad et al., 2023) and law (Magesh et al., 2024). The ability to recognize out-of-domain prompts and to acknowledge the limits of a model's knowledge base paves the way for building safe AI systems (Amodei et al., 2016). Uncertainty quantification (UQ) in LLMs plays a pivotal role in understanding what the model knows and does not know, which is an active area of research for free-form natural language generation (NLG) (Kadavath et al., 2022; Kuhn et al., 2023; Lin et al., 2024).

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2412.02904

Country: Europe (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Reliable Multimodal Trajectory Prediction via Error Aligned Uncertainty Optimization

Kose, Neslihan, Krishnan, Ranganath, Dhamasia, Akash, Tickoo, Omesh, Paulitsch, Michael

arXiv.org Artificial Intelligence, Dec-9-2022

Reliable uncertainty quantification in deep neural networks is crucial in safety-critical applications such as automated driving for trustworthy and informed decision-making. Assessing the quality of uncertainty estimates is challenging as ground truth for uncertainty estimates is not available. Ideally, in a well-calibrated model, uncertainty estimates should perfectly correlate with model error. We propose a novel error aligned uncertainty optimization method and introduce a trainable loss function to guide the models to yield good quality uncertainty estimates aligning with the model error. Our approach targets continuous structured prediction and regression tasks, and is evaluated on multiple datasets including a large-scale vehicle motion prediction task involving real-world distributional shifts. We demonstrate that our method improves average displacement error by 1.69% and 4.69%, and the uncertainty correlation with model error by 17.22% and 19.13%, as quantified by the Pearson correlation coefficient, on two state-of-the-art baselines.
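The evaluation criterion mentioned here, the Pearson correlation between uncertainty estimates and model error, can be sketched in a few lines. This shows only the metric, not the paper's trainable loss. The function name and the toy numbers are illustrative assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up per-sample prediction errors and predicted uncertainties.
errors = [0.2, 0.5, 1.0, 2.0]
uncertainties = [0.1, 0.3, 0.6, 1.1]
alignment = pearson(uncertainties, errors)
```

A value of `alignment` near 1 means the uncertainty estimates rise and fall with the error, which is the behavior the abstract describes as ideal for a well-calibrated model.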

artificial intelligence, machine learning, prediction, (19 more...)

arXiv.org Artificial Intelligence

2212.04812

Genre: Research Report > New Finding (0.68)

Industry:

Automobiles & Trucks (0.48)
Transportation > Ground > Road (0.34)
Information Technology > Robotics & Automation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

FRE: A Fast Method For Anomaly Detection And Segmentation

Ndiour, Ibrahima, Ahuja, Nilesh, Genc, Utku, Tickoo, Omesh

arXiv.org Artificial Intelligence, Nov-22-2022

This paper presents a fast and principled approach for solving the visual anomaly detection and segmentation problem. In this setup, we have access to only anomaly-free training data and want to detect and identify anomalies of an arbitrary nature on test data. We propose the application of linear statistical dimensionality reduction techniques on the intermediate features produced by a pretrained DNN on the training data, in order to capture the low-dimensional subspace truly spanned by said features. We show that the feature reconstruction error (FRE), which is the ℓ2-norm of the difference between the original feature in the high-dimensional space and the pre-image of its low-dimensional reduced embedding, is extremely effective for anomaly detection. Further, using the same feature reconstruction error concept on intermediate convolutional layers, we derive FRE maps that provide pixel-level spatial localization of the anomalies in the image (i.e. segmentation). Experiments using standard anomaly detection datasets and DNN architectures demonstrate that our method matches or exceeds best-in-class quality performance, but at a fraction of the computational and memory cost required by the state of the art. It can be trained and run very efficiently, even on a traditional CPU.
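The feature reconstruction error can be illustrated with a toy sketch. It assumes a one-dimensional PCA subspace found by power iteration and uses hand-made 2-D "features" in place of real DNN activations. The function names are hypothetical, and a practical version would keep several principal components and use a linear algebra library.

```python
import math


def fit_subspace(features, iters=100):
    """Fit the mean and top principal direction (PCA with k=1)."""
    n, d = len(features), len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(d)]
    centered = [[f[j] - mean[j] for j in range(d)] for f in features]
    cov = [
        [sum(c[i] * c[j] for c in centered) / n for j in range(d)]
        for i in range(d)
    ]
    # Power iteration; assumes the start vector is not orthogonal
    # to the dominant eigenvector of the covariance matrix.
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def fre(feature, mean, direction):
    """L2 norm between a feature and the pre-image of its 1-D embedding."""
    centered = [x - m for x, m in zip(feature, mean)]
    coeff = sum(c * v for c, v in zip(centered, direction))  # reduced embedding
    recon = [coeff * v for v in direction]                   # pre-image
    return math.sqrt(sum((c - r) ** 2 for c, r in zip(centered, recon)))


# Anomaly-free toy features lying on the line y = 2x.
train = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
mean, direction = fit_subspace(train)
normal_score = fre([1.0, 2.0], mean, direction)
anomaly_score = fre([2.0, 9.0], mean, direction)
```

A point on the training subspace reconstructs almost perfectly, so `normal_score` is close to zero, while the off-subspace point gets a clearly larger `anomaly_score`. Thresholding this score is one simple way to flag anomalies.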

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2211.1265

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Improving Robustness and Efficiency in Active Learning with Contrastive Loss

Krishnan, Ranganath, Ahuja, Nilesh, Sinha, Alok, Subedar, Mahesh, Tickoo, Omesh, Iyer, Ravi

arXiv.org Artificial Intelligence, Sep-13-2021

This paper introduces supervised contrastive active learning (SCAL) by leveraging the contrastive loss for active learning in a supervised setting. We propose efficient query strategies in active learning to select unbiased and informative data samples of diverse feature representations. We demonstrate that our proposed method reduces sampling bias and achieves state-of-the-art accuracy and model calibration in an active learning setup, with the query computation 11x faster than CoreSet and 26x faster than Bayesian active learning by disagreement. Our method yields well-calibrated models even with imbalanced datasets. We also evaluate robustness to dataset shift and out-of-distribution data in an active learning setup and demonstrate that our proposed SCAL method outperforms high-performing compute-intensive methods by a bigger margin (average 8.9% higher AUROC for out-of-distribution detection and average 7.2% lower ECE under dataset shift).
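The calibration metric cited in this abstract, expected calibration error (ECE), can be computed as in the sketch below. It uses the common equal-width binning formulation. The bin count, the function name, and the toy inputs are assumptions, and the paper's exact evaluation protocol may differ.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins over [0, 1]."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Made-up predictions: 95% confident on all four, but only three are correct.
ece = expected_calibration_error([0.95, 0.95, 0.95, 0.95], [True, True, True, False])
```

Lower values indicate that predicted confidence tracks observed accuracy more closely.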

active learning, educational method, mentoring method, (25 more...)

arXiv.org Artificial Intelligence

2109.06873

Country: Asia > India (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Probabilistic Modeling of Deep Features for Out-of-Distribution and Adversarial Detection

Ahuja, Nilesh A., Ndiour, Ibrahima, Kalyanpur, Trushant, Tickoo, Omesh

arXiv.org Machine Learning, Sep-25-2019

We present a principled approach for detecting out-of-distribution (OOD) and adversarial samples in deep neural networks. Our approach consists in modeling the outputs of the various layers (deep features) with parametric probability distributions once training is completed. At inference, the likelihoods of the deep features with respect to the previously learnt distributions are calculated and used to derive uncertainty estimates that can discriminate in-distribution samples from OOD samples. We explore the use of two classes of multivariate distributions for modeling the deep features, Gaussian and Gaussian mixture, and study the trade-off between accuracy and computational complexity. We demonstrate benefits of our approach on image features by detecting OOD images and adversarially-generated images, using popular DNN architectures on MNIST and CIFAR10 datasets. We show that more precise modeling of the feature distributions results in significantly improved detection of OOD and adversarial samples, by up to 12 percentage points in AUPR and AUROC metrics. We further show that our approach remains extremely effective when applied to video data and associated spatio-temporal features by detecting adversarial samples on activity classification tasks using the UCF101 dataset and the C3D network. To our knowledge, our methodology is the first one reported for reliably detecting white-box adversarial framing, a state-of-the-art adversarial attack for video classifiers.
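The likelihood-based scoring idea can be sketched as follows. This is a simplification, not the authors' method: it fits one diagonal-covariance Gaussian per class to toy 2-D "features", whereas the abstract refers to multivariate Gaussians and Gaussian mixtures. All names and data are illustrative assumptions.

```python
import math


def fit_gaussians(features_by_class, eps=1e-6):
    """Fit a diagonal Gaussian (mean, variance per dimension) for each class."""
    params = {}
    for label, feats in features_by_class.items():
        n, d = len(feats), len(feats[0])
        mean = [sum(f[j] for f in feats) / n for j in range(d)]
        # eps keeps the variance strictly positive.
        var = [sum((f[j] - mean[j]) ** 2 for f in feats) / n + eps for j in range(d)]
        params[label] = (mean, var)
    return params


def log_likelihood(x, mean, var):
    """Log density of x under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def ood_score(x, params):
    """Negative of the best class log-likelihood; higher suggests OOD."""
    return -max(log_likelihood(x, m, v) for m, v in params.values())


# Toy in-distribution features for two classes (made up).
train = {
    "a": [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]],
    "b": [[10.0, 10.0], [11.0, 11.0], [10.0, 11.0], [11.0, 10.0]],
}
params = fit_gaussians(train)
in_dist = ood_score([0.5, 0.5], params)
far_away = ood_score([5.0, 5.0], params)
```

The point between the two class clusters has low likelihood under both Gaussians, so `far_away` is much larger than `in_dist`. A threshold on this score would separate the two in this toy case.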

covariance, deep learning, neural network, (20 more...)

arXiv.org Machine Learning

1909.11786

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

MOPED: Efficient priors for scalable variational inference in Bayesian deep neural networks

Krishnan, Ranganath, Subedar, Mahesh, Tickoo, Omesh

arXiv.org Machine Learning, Jun-12-2019

Variational inference for Bayesian deep neural networks (DNNs) requires specifying priors and approximate posterior distributions for neural network weights. Specifying meaningful weight priors is a challenging problem, particularly for scaling variational inference to deeper architectures involving high dimensional weight space. We propose the Bayesian MOdel Priors Extracted from Deterministic DNN (MOPED) method for stochastic variational inference to choose meaningful prior distributions over weight space using deterministic weights derived from pretrained DNNs of equivalent architecture. We evaluate the proposed approach on multiple datasets and real-world application domains with a range of model architectures of varying complexity to demonstrate that MOPED enables scalable variational inference for Bayesian DNNs. The proposed method achieves faster training convergence and provides reliable uncertainty quantification, without compromising on the accuracy provided by the deterministic DNNs. We also propose hybrid architectures for Bayesian DNNs, where deterministic and variational layers are combined to balance computational complexity during the prediction phase while providing the benefits of Bayesian inference. We will release the source code for this work.
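The idea of deriving priors from deterministic weights can be sketched as below. Only the use of pretrained weights to set the prior comes from the abstract. The unit prior standard deviation, the `delta` perturbation factor, the softplus parameterization of the posterior standard deviation, and all names are assumptions made for illustration.

```python
import math


def softplus(x):
    """Map an unconstrained value to a positive standard deviation."""
    return math.log1p(math.exp(x))


def inverse_softplus(y):
    """Inverse of softplus for y > 0."""
    return math.log(math.expm1(y))


def init_from_pretrained(weights, delta=0.1, min_std=1e-6):
    """Build per-weight Gaussian prior and initial posterior parameters."""
    params = []
    for w in weights:
        # Assumed: initial posterior std is a small fraction of |w|.
        std = max(delta * abs(w), min_std)
        params.append({
            "prior_mean": w,       # prior centered on the deterministic weight
            "prior_std": 1.0,      # assumed unit prior standard deviation
            "post_mean": w,        # posterior initialized at the same value
            "post_rho": inverse_softplus(std),
        })
    return params


# Made-up "pretrained" weights.
pretrained = [0.5, -2.0, 0.0]
variational_params = init_from_pretrained(pretrained)
```

Starting the variational posterior near a pretrained solution, instead of at a random point, is one plausible reading of why such an initialization could speed up convergence.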

deep learning, neural network, variational inference, (20 more...)

arXiv.org Machine Learning

1906.05323

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry:

Transportation > Passenger (0.88)
Transportation > Ground > Road (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Real-time Approximate Bayesian Computation for Scene Understanding

Felip, Javier, Ahuja, Nilesh, Gómez-Gutiérrez, David, Tickoo, Omesh, Mansinghka, Vikash

arXiv.org Machine Learning, May-22-2019

Consider scene understanding problems such as predicting where a person is probably reaching, or inferring the pose of 3D objects from depth images, or inferring the probable street crossings of pedestrians at a busy intersection. This paper shows how to solve these problems using Approximate Bayesian Computation. The underlying generative models are built from realistic simulation software, wrapped in a Bayesian error model for the gap between simulation outputs and real data. The simulators are drawn from off-the-shelf computer graphics, video game, and traffic simulation code. The paper introduces two techniques for speeding up inference that can be used separately or in combination. The first is to train neural surrogates of the simulators, using a simple form of domain randomization to make the surrogates more robust to the gap between the simulation and reality. The second is to adaptively discretize the latent variables using a Tree-pyramid approach adapted from computer graphics. This paper also shows performance and accuracy measurements on real-world problems, establishing that it is feasible to solve these problems in real-time.
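Approximate Bayesian Computation in its simplest rejection form can be sketched as follows. This is plain rejection ABC, not the paper's accelerated method with neural surrogates or tree-pyramid discretization. The toy simulator, uniform prior, tolerance `epsilon`, and all names are hypothetical stand-ins.

```python
import random


def simulator(theta, rng):
    """Toy stand-in for a real simulator: a linear map plus Gaussian noise."""
    return 2.0 * theta + rng.gauss(0.0, 0.1)


def abc_rejection(observed, n_draws, epsilon, rng):
    """Keep prior draws whose simulated output lands near the observation."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, 10.0)  # draw a latent value from the prior
        if abs(simulator(theta, rng) - observed) < epsilon:
            accepted.append(theta)
    return accepted


rng = random.Random(0)  # fixed seed so the run is repeatable
posterior_samples = abc_rejection(observed=6.0, n_draws=5000, epsilon=0.5, rng=rng)
posterior_mean = sum(posterior_samples) / len(posterior_samples)
```

The accepted draws approximate the posterior over the latent variable without ever evaluating a likelihood. Because each draw requires a simulator call, plain rejection is slow with expensive simulators, which is the cost the abstract's two speed-up techniques address.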

bayesian inference, health & medicine, trajectory, (19 more...)

arXiv.org Machine Learning

1905.13307

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
