Goto

Collaborating Authors

 Karande, Shirish


I Know Therefore I Score: Label-Free Crafting of Scoring Functions using Constraints Based on Domain Expertise

arXiv.org Artificial Intelligence

Several real-life applications require crafting concise, quantitative scoring functions (also called rating systems) from measured observations. For example, an effectiveness score needs to be created for advertising campaigns using a number of engagement metrics. Experts often need to create such scoring functions in the absence of labelled data, where the scores need to reflect business insights and rules as understood by the domain experts. Without a way to capture these inputs systematically, this becomes a time-consuming process involving trial and error. In this paper, we introduce a label-free practical approach to learn a scoring function from multi-dimensional numerical data. The approach incorporates insights and business rules from domain experts in the form of easily observable and specifiable constraints, which are used as weak supervision by a machine learning model. We convert such constraints into loss functions that are optimized simultaneously while learning the scoring function. We examine the efficacy of the approach using a synthetic dataset as well as four real-life datasets, and also compare how it performs vis-a-vis supervised learning models.


MRI2Speech: Speech Synthesis from Articulatory Movements Recorded by Real-time MRI

arXiv.org Artificial Intelligence

Previous real-time MRI (rtMRI)-based speech synthesis models depend heavily on noisy ground-truth speech. Applying loss directly over ground truth mel-spectrograms entangles speech content with MRI noise, resulting in poor intelligibility. We introduce a novel approach that adapts the multi-modal self-supervised AV-HuBERT model for text prediction from rtMRI and incorporates a new flow-based duration predictor for speaker-specific alignment. The predicted text and durations are then used by a speech decoder to synthesize aligned speech in any novel voice. We conduct thorough experiments on two datasets and demonstrate our method's generalization ability to unseen speakers. We assess our framework's performance by masking parts of the rtMRI video to evaluate the impact of different articulators on text prediction. Our method achieves a $15.18\%$ Word Error Rate (WER) on the USC-TIMIT MRI corpus, marking a huge improvement over the current state-of-the-art. Speech samples are available at \url{https://mri2speech.github.io/MRI2Speech/}


Advancing NAM-to-Speech Conversion with Novel Methods and the MultiNAM Dataset

arXiv.org Artificial Intelligence

Current Non-Audible Murmur (NAM)-to-speech techniques rely on voice cloning to simulate ground-truth speech from paired whispers. However, the simulated speech often lacks intelligibility and fails to generalize well across different speakers. To address this issue, we focus on learning phoneme-level alignments from paired whispers and text and employ a Text-to-Speech (TTS) system to simulate the ground-truth. To reduce dependence on whispers, we learn phoneme alignments directly from NAMs, though the quality is constrained by the available training data. To further mitigate reliance on NAM/whisper data for ground-truth simulation, we propose incorporating the lip modality to infer speech and introduce a novel diffusion-based method that leverages recent advancements in lip-to-speech technology. Additionally, we release the MultiNAM dataset with over $7.96$ hours of paired NAM, whisper, video, and text data from two speakers and benchmark all methods on this dataset. Speech samples and the dataset are available at \url{https://diff-nam.github.io/DiffNAM/}


Towards Mitigating Perceived Unfairness in Contracts from a Non-Legal Stakeholder's Perspective

arXiv.org Artificial Intelligence

Commercial contracts are known to be a valuable source for deriving project-specific requirements. However, contract negotiations mainly occur among the legal counsel of the parties involved. The participation of non-legal stakeholders, including requirement analysts, engineers, and solution architects, whose primary responsibility lies in ensuring the seamless implementation of contractual terms, is often indirect and inadequate. Consequently, a significant number of sentences in contractual clauses, though legally accurate, can appear unfair from an implementation perspective to non-legal stakeholders. This perception poses a problem since requirements indicated in the clauses are obligatory and can involve punitive measures and penalties if not implemented as committed in the contract. Therefore, the identification of potentially unfair clauses in contracts becomes crucial. In this work, we conduct an empirical study to analyze the perspectives of different stakeholders regarding contractual fairness. We then investigate the ability of Pre-trained Language Models (PLMs) to identify unfairness in contractual sentences by comparing chain of thought prompting and semi-supervised fine-tuning approaches. Using BERT-based fine-tuning, we achieved an accuracy of 84% on a dataset consisting of proprietary contracts. It outperformed chain of thought prompting using Vicuna-13B by a margin of 9%.


Can Physics Informed Neural Operators Self Improve?

arXiv.org Artificial Intelligence

Self-training techniques have shown remarkable value across many deep learning models and tasks. However, such techniques remain largely unexplored when considered in the context of learning fast solvers for systems of partial differential equations (Eg: Neural Operators). In this work, we explore the use of self-training for Fourier Neural Operators (FNO). Neural Operators emerged as a data driven technique, however, data from experiments or traditional solvers is not always readily available. Physics Informed Neural Operators (PINO) overcome this constraint by utilizing a physics loss for the training, however the accuracy of PINO trained without data does not match the performance obtained by training with data. In this work we show that self-training can be used to close this gap in performance. We examine canonical examples, namely the 1D-Burgers and 2D-Darcy PDEs, to showcase the efficacy of self-training. Specifically, FNOs, when trained exclusively with physics loss through self-training, approach 1.07x for Burgers and 1.02x for Darcy, compared to FNOs trained with both data and physics loss. Furthermore, we discover that pseudo-labels can be used for self-training without necessarily training to convergence in each iteration. A consequence of this is that we are able to discover self-training schedules that improve upon the baseline performance of PINO in terms of accuracy as well as time.


HyperLoRA for PDEs

arXiv.org Artificial Intelligence

Physics-informed neural networks (PINNs) have been widely used to develop neural surrogates for solutions of Partial Differential Equations. A drawback of PINNs is that they have to be retrained with every change in initial-boundary conditions and PDE coefficients. The Hypernetwork, a model-based meta learning technique, takes in a parameterized task embedding as input and predicts the weights of PINN as output. Predicting weights of a neural network however, is a high-dimensional regression problem, and hypernetworks perform sub-optimally while predicting parameters for large base networks. To circumvent this issue, we use a low ranked adaptation (LoRA) formulation to decompose every layer of the base network into low-ranked tensors and use hypernetworks to predict the low-ranked tensors. Despite the reduced dimensionality of the resulting weight-regression problem, LoRA-based Hypernetworks violate the underlying physics of the given task. We demonstrate that the generalization capabilities of LoRA-based hypernetworks drastically improve when trained with an additional physics-informed loss component (HyperPINN) to satisfy the governing differential equations. We observe that LoRA-based HyperPINN training allows us to learn fast solutions for parameterized PDEs like Burger's equation and Navier Stokes: Kovasznay flow, while having an 8x reduction in prediction parameters on average without compromising on accuracy when compared to all other baselines.


How important are specialized transforms in Neural Operators?

arXiv.org Artificial Intelligence

Simulating physical systems using Partial Differential Equations (PDEs) has become an indispensible part of modern industrial process optimization. Traditionally, numerical solvers have been used to solve the associated PDEs, however recently Transform-based Neural Operators such as the Fourier Neural Operator and Wavelet Neural Operator have received a lot of attention for their potential to provide fast solutions for systems of PDEs. In this work, we investigate the importance of the transform layers to the reported success of transform based neural operators. In particular, we record the cost in terms of performance, if all the transform layers are replaced by learnable linear layers. Surprisingly, we observe that linear layers suffice to provide performance comparable to the best-known transform-based layers and seem to do so with a compute time advantage as well. We believe that this observation can have significant implications for future work on Neural Operators, and might point to other sources of efficiencies for these architectures.


Synthesis of Mathematical programs from Natural Language Specifications

arXiv.org Artificial Intelligence

Several decision problems that are encountered in various business domains can be modeled as mathematical programs, i.e. optimization problems. The process of conducting such modeling often requires the involvement of experts trained in operations research and advanced algorithms. Surprisingly, despite the significant advances in the methods for program and code synthesis, AutoML, learning to optimize etc., there has been little or no attention paid to automating the task of synthesizing mathematical programs. We imagine a scenario where the specifications for modeling, i.e. the objective and constraints are expressed in an unstructured form in natural language (NL) and the mathematical program has to be synthesized from such an NL specification. In this work we evaluate the efficacy of employing CodeT5 with data augmentation and post-processing of beams. We utilize GPT-3 with back translation for generation of synthetic examples. Further we apply rules of linear programming to score beams and correct beams based on common error patterns. We observe that with these enhancements CodeT5 base gives an execution accuracy of 0.73 which is significantly better than zero-shot execution accuracy of 0.41 by ChatGPT and 0.36 by Codex.


DeepEpiSolver: Unravelling Inverse problems in Covid, HIV, Ebola and Disease Transmission

arXiv.org Artificial Intelligence

The spread of many infectious diseases is modeled using variants of the SIR compartmental model, which is a coupled differential equation. The coefficients of the SIR model determine the spread trajectories of disease, on whose basis proactive measures can be taken. Hence, the coefficient estimates must be both fast and accurate. Shaier et al. in the paper "Disease Informed Neural Networks" used Physics Informed Neural Networks (PINNs) to estimate the parameters of the SIR model. There are two drawbacks to this approach. First, the training time for PINNs is high, with certain diseases taking close to 90 hrs to train. Second, PINNs don't generalize for a new SIDR trajectory, and learning its corresponding SIR parameters requires retraining the PINN from scratch. In this work, we aim to eliminate both of these drawbacks. We generate a dataset between the parameters of ODE and the spread trajectories by solving the forward problem for a large distribution of parameters using the LSODA algorithm. We then use a neural network to learn the mapping between spread trajectories and coefficients of SIDR in an offline manner. This allows us to learn the parameters of a new spread trajectory without having to retrain, enabling generalization at test time. We observe a speed-up of 3-4 orders of magnitude with accuracy comparable to that of PINNs for 11 highly infectious diseases. Further finetuning of neural network inferred ODE coefficients using PINN further leads to 2-3 orders improvement of estimated coefficients.


Symbolic Regression for PDEs using Pruned Differentiable Programs

arXiv.org Artificial Intelligence

Physics-informed Neural Networks (PINNs) have been widely used to obtain accurate neural surrogates for a system of Partial Differential Equations (PDE). One of the major limitations of PINNs is that the neural solutions are challenging to interpret, and are often treated as black-box solvers. While Symbolic Regression (SR) has been studied extensively, very few works exist which generate analytical expressions to directly perform SR for a system of PDEs. In this work, we introduce an end-to-end framework for obtaining mathematical expressions for solutions of PDEs. We use a trained PINN to generate a dataset, upon which we perform SR. We use a Differentiable Program Architecture (DPA) defined using context-free grammar to describe the space of symbolic expressions. We improve the interpretability by pruning the DPA in a depth-first manner using the magnitude of weights as our heuristic. On average, we observe a 95.3% reduction in parameters of DPA while maintaining accuracy at par with PINNs. Furthermore, on an average, pruning improves the accuracy of DPA by 7.81% . We demonstrate our framework outperforms the existing state-of-the-art SR solvers on systems of complex PDEs like Navier-Stokes: Kovasznay flow and Taylor-Green Vortex flow. Furthermore, we produce analytical expressions for a complex industrial use-case of an Air-Preheater, without suffering from performance loss viz-a-viz PINNs.