Collaborating Authors

 Singh, Prashant


PARIC: Probabilistic Attention Regularization for Language Guided Image Classification from Pre-trained Vision Language Models

arXiv.org Artificial Intelligence

Developing robust image classification models that generalize effectively to unseen or out-of-distribution data remains a challenging problem in computer vision. This issue largely arises from biases and limited diversity in training datasets Torralba and Efros [2011]. Standard models trained on such data often prioritize irrelevant background or contextual cues over the discriminative visual features that define each class Ribeiro et al. [2016]. Consequently, these models struggle to generalize to unfamiliar or atypical examples, undermining their reliability and practical utility in real-world applications. Learning robust joint representations for vision and language is an important challenge in modern deep learning research. The goal is to construct a function $f(V, L)$ that aligns visual data $V$ and linguistic data $L$ into a unified representation capturing shared semantics while preserving modality-specific details. Mathematically, this can be expressed as $f: V \times L \to Z$, where $Z$ denotes the joint latent space encoding these semantics; the primary challenge is to construct $f$ such that it is both expressive and generalizable across diverse input types.


Machine learning driven search of hydrogen storage materials

arXiv.org Artificial Intelligence

The transition to a low-carbon economy demands efficient and sustainable energy-storage solutions, with hydrogen emerging as a promising clean-energy carrier and with metal hydrides recognized for their hydrogen-storage capacity. Here, we leverage machine learning (ML) to predict hydrogen-to-metal (H/M) ratios and solution energy by incorporating thermodynamic parameters and local lattice distortion (LLD) as key features. Our best-performing ML model provides improvements to H/M ratios and solution energies over a broad class of ternary alloys (easily extendable to multi-principal-element alloys), such as Ti-Nb-X (X = Mo, Cr, Hf, Ta, V, Zr) and Co-Ni-X (X = Al, Mg, V). Ti-Nb-Mo alloys reveal compositional effects in H-storage behavior: in particular, Ti, Nb, and V enhance H-storage capacity, while Mo reduces H/M and hydrogen weight percent by 40-50%. We attribute this to slow hydrogen kinetics in molybdenum-rich alloys, which is validated by our pressure-composition isotherm (PCT) experiments on pure Ti and Ti5Mo95 alloys. Density functional theory (DFT) and molecular simulations also confirm that Ti and Nb promote H diffusion, whereas Mo hinders it, highlighting the interplay between electronic structure, lattice distortions, and hydrogen uptake. Notably, our Gradient Boosting Regression model identifies LLD as a critical factor in H/M predictions. To aid material selection, we present two periodic tables illustrating elemental effects on (a) H2 wt% and (b) solution energy, derived from ML, and provide a reference for identifying alloying elements that enhance hydrogen solubility and storage.


$\texttt{InfoHier}$: Hierarchical Information Extraction via Encoding and Embedding

arXiv.org Artificial Intelligence

Analyzing large-scale datasets, especially those involving complex and high-dimensional data like images, is particularly challenging. While self-supervised learning (SSL) has proven effective for learning representations from unlabeled data, it typically focuses on flat, non-hierarchical structures, missing the multi-level relationships present in many real-world datasets. Hierarchical clustering (HC) can uncover these relationships by organizing data into a tree-like structure, but it often relies on rigid similarity metrics that struggle to capture the complexity of diverse data types. To address these limitations, we envision InfoHier, a framework that combines SSL with HC to jointly learn robust latent representations and hierarchical structures. This approach leverages SSL to provide adaptive representations, enhancing HC's ability to capture complex patterns. Simultaneously, it integrates HC loss to refine SSL training, resulting in representations that are more attuned to the underlying information hierarchy. InfoHier has the potential to improve the expressiveness and performance of both clustering and representation learning, offering significant benefits for data analysis, management, and information retrieval.


Variational Autoencoders for Efficient Simulation-Based Inference

arXiv.org Artificial Intelligence

We present a generative modeling approach based on the variational inference framework for likelihood-free simulation-based inference. The method leverages latent variables within variational autoencoders to efficiently estimate complex posterior distributions arising from stochastic simulations. We explore two variations of this approach distinguished by their treatment of the prior distribution. The first model adapts the prior based on observed data using a multivariate prior network, enhancing generalization across various posterior queries. In contrast, the second model utilizes a standard Gaussian prior, offering simplicity while still effectively capturing complex posterior distributions. We demonstrate the efficacy of these models on well-established benchmark problems, achieving results comparable to flow-based approaches while maintaining computational efficiency and scalability.


EnterpriseEM: Fine-tuned Embeddings for Enterprise Semantic Search

arXiv.org Artificial Intelligence

In the context of enterprises accumulating proprietary unstructured data, AI-driven information retrieval solutions have emerged as vital tools for extracting relevant answers to employee queries. Traditional methods for developing such solutions often involve choosing between Retrieval Augmented Generation (RAG) or fine-tuned Large Language Models (LLMs). However, fine-tuned LLMs, comprising only generative models, lack a guarantee of factual accuracy, while RAG, comprising an embedding model and a generative model, assures factual precision (Lewis et al., 2020 [1]). Despite their superior performance in general, RAG-based solutions often rely on pre-trained models, potentially leading to suboptimal alignment with enterprise-specific data. Addressing this challenge entails exploring two potential avenues. Firstly, recent studies such as RAFT (Zhang et al., 2024 [2]) explore the integration of fine-tuned generative models within a RAG pipeline to enhance accuracy, albeit requiring substantial domain-specific data to fine-tune the generative models. Alternatively, leveraging domain-specific embedding models within a RAG pipeline to enhance accuracy remains an underexplored area. Earlier efforts, such as BioBERT (Lee et al., 2019 [3]), SciBERT (Beltagy et al., 2019 [4]), and LEGAL-BERT (Chalkidis et al., 2020 [5]), have effectively demonstrated the efficacy of domain-specific embeddings in information retrieval tasks. These endeavors primarily investigated two methodologies: (a) extending the pre-training of BERT and (b) pre-training BERT from scratch, both employing domain-specific corpora. Despite yielding commendable results, these methodologies necessitated substantial domain-specific corpora, with figures as staggering as 21.3B words for BioBERT, 3.17B tokens for SciBERT, and 11.5GB of text data for LEGAL-BERT, thereby posing significant challenges, particularly in low-resource domains like enterprises.


Efficient Resource Scheduling for Distributed Infrastructures Using Negotiation Capabilities

arXiv.org Artificial Intelligence

In the past few decades, the rapid development of information and internet technologies has spawned massive amounts of data and information. The information explosion drives many enterprises and individuals to rent cloud computing infrastructure and move their applications to the cloud. However, the agreements reached between cloud computing providers and clients are often not efficient. Many factors affect the efficiency, such as the idleness of the providers' cloud computing infrastructure and the additional cost to the clients. One possible solution is to introduce a comprehensive bargaining game (a type of negotiation) and schedule resources according to the negotiation results. We propose an agent-based auto-negotiation system for resource scheduling based on fuzzy logic. The proposed method can complete a one-to-one auto-negotiation process and generate optimal offers for the provider and client. We compare the impact of different membership functions, fuzzy rule sets, and negotiation scenario cases on the offers to optimize the system. We conclude that the proposed method utilizes resources more efficiently and is interpretable, highly flexible, and customizable. We successfully train machine learning models to replace the fuzzy negotiation system to improve processing speed. The article also highlights possible future improvements to the proposed system and machine learning models. All code and data are available in the open-source repository.


Transfer learning-assisted inverse modeling in nanophotonics based on mixture density networks

arXiv.org Artificial Intelligence

The simulation of nanophotonic structures relies on electromagnetic solvers, which play a crucial role in understanding their behavior. However, these solvers often come with a significant computational cost, making their application in design tasks, such as optimization, impractical. To address this challenge, machine learning techniques have been explored for accurate and efficient modeling and design of photonic devices. Deep neural networks, in particular, have gained considerable attention in this field. They can be used to create both forward and inverse models. An inverse modeling approach avoids the need for coupling a forward model with an optimizer and directly predicts the optimal design parameter values. In this paper, we propose an inverse modeling method for nanophotonic structures, based on a mixture density network model enhanced by transfer learning. Mixture density networks can predict multiple possible solutions at a time, including their respective importance, as Gaussian distributions. However, multiple challenges exist for mixture density network models. An important challenge is that an upper bound on the number of possible simultaneous solutions needs to be specified in advance. Another challenge is that the model parameters must be jointly optimized, which can be computationally expensive. Moreover, optimizing all parameters simultaneously can be numerically unstable and can lead to degenerate predictions. The proposed approach overcomes these limitations using transfer learning-based techniques, while preserving high accuracy in predicting design solutions given an optical response as input. A dimensionality reduction step is also explored. Numerical results validate the proposed method.
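As a rough illustration of how a mixture density network output can be read as several candidate solutions with their importance, the sketch below evaluates a one-dimensional Gaussian mixture from raw network outputs. It is not the paper's model: the function names (`mixture_density`, `mdn_nll`, `ranked_solutions`) and the parameterization (logits for the weights, log standard deviations for the widths) are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into mixture weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_pdf(x, mean, sigma):
    """Density of a univariate Gaussian at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_density(x, logits, means, log_sigmas):
    """Density of the Gaussian mixture described by the (hypothetical) network outputs."""
    weights = softmax(logits)
    sigmas = [math.exp(s) for s in log_sigmas]
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas))


def mdn_nll(x, logits, means, log_sigmas):
    """Negative log-likelihood of a target x, the usual training loss for such a model."""
    return -math.log(mixture_density(x, logits, means, log_sigmas))


def ranked_solutions(logits, means, log_sigmas):
    """Candidate solutions as (weight, mean, sigma), most important first."""
    weights = softmax(logits)
    sigmas = [math.exp(s) for s in log_sigmas]
    return sorted(zip(weights, means, sigmas), reverse=True)


if __name__ == "__main__":
    # Two candidate designs; the second one has the larger mixture weight.
    for weight, mean, sigma in ranked_solutions([0.0, 1.0], [-1.0, 1.0], [0.0, 0.0]):
        print(f"weight={weight:.3f} mean={mean:+.1f} sigma={sigma:.1f}")
```

The number of entries in `logits` plays the role of the upper bound on simultaneous solutions mentioned above: it is fixed before training, and unused components can only be suppressed through small weights.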


Adaptive Parameter-Free Robust Learning using Latent Bernoulli Variables

arXiv.org Machine Learning

We present an efficient parameter-free approach for statistical learning from corrupted training sets. We identify corrupted and non-corrupted samples using latent Bernoulli variables, and therefore formulate the robust learning problem as maximization of the likelihood where latent variables are marginalized out. The resulting optimization problem is solved via variational inference using an efficient Expectation-Maximization based method. The proposed approach improves over the state-of-the-art by automatically inferring the corruption level and identifying outliers, while adding minimal computational overhead. We demonstrate our robust learning method on a wide variety of machine learning tasks, including online learning and deep learning, where it exhibits the ability to adapt to different levels of noise and attain high prediction accuracy.
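The idea of marking each sample as clean or corrupted with a latent Bernoulli variable can be sketched on a toy problem: estimating a mean from data that contains an outlier. The code below is a simplified textbook-style EM loop, not the authors' algorithm. The function name `robust_mean_em`, the Gaussian model for clean samples, and the uniform density for corrupted samples are all assumptions made for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def robust_mean_em(xs, iterations=50, min_std=1e-3):
    """EM for a clean/corrupted mixture with one latent Bernoulli variable per sample.

    Assumptions (not from the paper): clean samples are Gaussian, corrupted
    samples are uniform over the observed range.
    """
    n = len(xs)
    span = (max(xs) - min(xs)) or 1.0
    outlier_density = 1.0 / span
    mean = sorted(xs)[n // 2]  # start from the median
    std = max(math.sqrt(sum((x - mean) ** 2 for x in xs) / n), min_std)
    clean_prob = 0.5           # prior probability that a sample is clean
    resp = [0.5] * n
    for _ in range(iterations):
        # E-step: posterior probability that each sample is clean.
        resp = []
        for x in xs:
            clean = clean_prob * gaussian_pdf(x, mean, std)
            corrupt = (1.0 - clean_prob) * outlier_density
            resp.append(clean / (clean + corrupt))
        # M-step: re-estimate parameters with samples weighted by resp.
        total = sum(resp)
        mean = sum(r * x for r, x in zip(resp, xs)) / total
        var = sum(r * (x - mean) ** 2 for r, x in zip(resp, xs)) / total
        std = max(math.sqrt(var), min_std)
        # The corruption level is inferred, not supplied by the user.
        clean_prob = min(max(total / n, 1e-6), 1.0 - 1e-6)
    return mean, std, clean_prob, resp


if __name__ == "__main__":
    data = [0.9, 1.0, 1.1, 1.05, 0.95, 1.02, 0.98, 50.0]
    mean, std, clean_prob, resp = robust_mean_em(data)
    print(f"mean={mean:.3f} std={std:.3f} corruption={1.0 - clean_prob:.3f}")
```

In this sketch the estimated corruption level `1 - clean_prob` falls out of the M-step, which mirrors the abstract's point that the noise level does not need to be given in advance.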


Bayesian polynomial neural networks and polynomial neural ordinary differential equations

arXiv.org Artificial Intelligence

Symbolic regression with polynomial neural networks and polynomial neural ordinary differential equations (ODEs) are two recent and powerful approaches for equation recovery of many science and engineering problems. However, these methods provide point estimates for the model parameters and are currently unable to accommodate noisy data. We address this challenge by developing and validating the following Bayesian inference methods: the Laplace approximation, Markov Chain Monte Carlo (MCMC) sampling methods, and variational inference. We have found the Laplace approximation to be the best method for this class of problems. Our work can be easily extended to the broader class of symbolic neural networks to which the polynomial neural network belongs.
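To make the Laplace approximation concrete, here is a minimal sketch for a single-coefficient polynomial model $y = a x^2$ with Gaussian noise and a Gaussian prior on $a$. It is illustrative only and is not the paper's implementation: the function names, the data, and the values of `noise_std` and `prior_std` are assumptions. The approximation places a Gaussian at the posterior mode, with variance equal to the inverse of the second derivative of the negative log posterior at that mode.

```python
import math


def neg_log_posterior(a, xs, ys, noise_std, prior_std):
    """Negative log posterior (up to a constant) for y = a * x^2 with Gaussian noise and prior."""
    data_term = sum((y - a * x ** 2) ** 2 for x, y in zip(xs, ys)) / (2.0 * noise_std ** 2)
    prior_term = a ** 2 / (2.0 * prior_std ** 2)
    return data_term + prior_term


def laplace_approximation(objective, start, step=1e-3, iterations=20):
    """Locate the mode with Newton steps, then return (mode, posterior std).

    Derivatives are taken by central finite differences; this assumes the
    objective is convex near the mode so the second derivative is positive.
    """
    a = start
    for _ in range(iterations):
        grad = (objective(a + step) - objective(a - step)) / (2.0 * step)
        hess = (objective(a + step) - 2.0 * objective(a) + objective(a - step)) / step ** 2
        a -= grad / hess
    hess = (objective(a + step) - 2.0 * objective(a) + objective(a - step)) / step ** 2
    return a, 1.0 / math.sqrt(hess)


if __name__ == "__main__":
    xs, ys = [1.0, 2.0, 3.0], [2.1, 7.9, 18.2]  # hypothetical noisy samples of y = 2 x^2

    def objective(a):
        return neg_log_posterior(a, xs, ys, noise_std=0.5, prior_std=10.0)

    mode, std = laplace_approximation(objective, start=0.0)
    print(f"a = {mode:.4f} +/- {std:.4f}")
```

For this linear-in-parameters model the posterior is exactly Gaussian, so the Laplace result coincides with the closed-form posterior; for nonlinear models it is only a local approximation around the mode.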


Dronevision: An Experimental 3D Testbed for Flying Light Specks

arXiv.org Artificial Intelligence

Today's robotic laboratories for drones are housed in a large room. At times, they are the size of a warehouse. These spaces are typically equipped with permanent devices to localize the drones, e.g., Vicon infrared cameras. Significant time is invested to fine-tune the localization apparatus to compute and control the position of the drones. One may use these laboratories to develop a 3D multimedia system with miniature-sized drones configured with light sources. As an alternative, this brave new idea paper envisions shrinking these room-sized laboratories to the size of a cube or cuboid that sits on a desk and costs less than 10K dollars. The resulting Dronevision (DV) will be the size of a 1990s television. In addition to light sources, its Flying Light Specks (FLSs) will be network-enabled drones with storage and processing capability to implement decentralized algorithms. The DV will include a localization technique to expedite development of 3D displays. It will act as a haptic interface for a user to interact with and manipulate the 3D virtual illuminations. It will empower an experimenter to design, implement, test, debug, and maintain software and hardware that realize novel algorithms in the comfort of their office without having to reserve a laboratory. In addition to enhancing productivity, it will improve the safety of the experimenter by minimizing the likelihood of accidents. This paper introduces the concept of a DV, the research agenda one may pursue using this device, and our plans to realize one.