Goto

Collaborating Authors

 prediction rate


Private Rate-Constrained Optimization with Applications to Fair Learning

arXiv.org Machine Learning

Many problems in trustworthy ML can be formulated as minimization of the model error under constraints on the prediction rates of the model for suitably-chosen marginals, including most group fairness constraints (demographic parity, equality of odds, etc.). In this work, we study such constrained minimization problems under differential privacy (DP). Standard DP optimization techniques like DP-SGD rely on the loss function's decomposability into per-sample contributions. However, rate constraints introduce inter-sample dependencies, violating the decomposability requirement. To address this, we develop RaCO-DP, a DP variant of the Stochastic Gradient Descent-Ascent (SGDA) algorithm which solves the Lagrangian formulation of rate constraint problems. We demonstrate that the additional privacy cost of incorporating these constraints reduces to privately estimating a histogram over the mini-batch at each optimization step. We prove the convergence of our algorithm through a novel analysis of SGDA that leverages the linear structure of the dual parameter. Finally, empirical results on learning under group fairness constraints demonstrate that our method Pareto-dominates existing private learning approaches in fairness-utility trade-offs.


Document Type Classification using File Names

arXiv.org Artificial Intelligence

Rapid document classification is critical in several time-sensitive applications like digital forensics and large-scale media classification. Traditional approaches that rely on heavy-duty deep learning models fall short due to high inference times over vast input datasets and computational resources associated with analyzing whole documents. In this paper, we present a method using lightweight supervised learning models, combined with a TF-IDF feature extraction-based tokenization method, to accurately and efficiently classify documents based solely on file names that substantially reduces inference time. This approach can distinguish ambiguous file names from the indicative file names through confidence scores and through using a negative class representing ambiguous file names. Our results indicate that file name classifiers can process more than 80% of the in-scope data with 96.7% accuracy when tested on a dataset with a large portion of out-of-scope data with respect to the training dataset while being 442.43x faster than more complex models such as DiT. Our method offers a crucial solution for efficiently processing vast datasets in critical scenarios, enabling fast, more reliable document classification.


Quantifying the value of positive transfer: An experimental case study

arXiv.org Artificial Intelligence

In traditional approaches to structural health monitoring, challenges often arise associated with the availability of labelled data. Population-based structural health monitoring seeks to overcomes these challenges by leveraging data/information from similar structures via technologies such as transfer learning. The current paper demonstrate a methodology for quantifying the value of information transfer in the context of operation and maintenance decision-making. This demonstration, based on a population of laboratory-scale aircraft models, highlights the steps required to evaluate the expected value of information transfer including similarity assessment and prediction of transfer efficacy. Once evaluated for a given population, the value of information transfer can be used to optimise transfer-learning strategies for newly-acquired target domains.


Contrastive Unlearning: A Contrastive Approach to Machine Unlearning

arXiv.org Artificial Intelligence

Machine unlearning aims to eliminate the influence of a subset of training samples (i.e., unlearning samples) from a trained model. Effectively and efficiently removing the unlearning samples without negatively impacting the overall model performance is still challenging. In this paper, we propose a contrastive unlearning framework, leveraging the concept of representation learning for more effective unlearning. It removes the influence of unlearning samples by contrasting their embeddings against the remaining samples so that they are pushed away from their original classes and pulled toward other classes. By directly optimizing the representation space, it effectively removes the influence of unlearning samples while maintaining the representations learned from the remaining samples. Experiments on a variety of datasets and models on both class unlearning and sample unlearning showed that contrastive unlearning achieves the best unlearning effects and efficiency with the lowest performance loss compared with the state-of-the-art algorithms.


SAF: Smart Aggregation Framework for Revealing Atoms Importance Rank and Improving Prediction Rates in Drug Discovery

arXiv.org Artificial Intelligence

Machine learning, and representation learning in particular, has the potential to facilitate drug discovery by screening a large chemical space in silico. A successful approach for representing molecules is to treat them as a graph and utilize graph neural networks. One of the key limitations of such methods is the necessity to represent compounds with different numbers of atoms, which requires aggregating the atom's information. Common aggregation operators, such as averaging, result in loss of information at the atom level. In this work, we propose a novel aggregating approach where each atom is weighted non-linearly using the Boltzmann distribution with a hyperparameter analogous to temperature. We show that using this weighted aggregation improves the ability of the gold standard message-passing neural network to predict antibiotic activity. Moreover, by changing the temperature hyperparameter, our approach can reveal the atoms that are important for activity prediction in a smooth and consistent way, thus providing a novel, regulated attention mechanism for graph neural networks. We further validate our method by showing that it recapitulates the functional group in beta-Lactam antibiotics. The ability of our approach to rank the atoms' importance for a desired function can be used within any graph neural network to provide interpretability of the results and predictions at the node level.


Should You Mask 15% in Masked Language Modeling?

arXiv.org Artificial Intelligence

Masked language models (MLMs) conventionally mask 15% of tokens due to the belief that more masking would leave insufficient context to learn good representations; this masking rate has been widely used, regardless of model sizes or masking strategies. In this work, we revisit this important choice of MLM pre-training. We first establish that 15% is not universally optimal, and larger models should adopt a higher masking rate. Specifically, we find that masking 40% outperforms 15% for BERT-large size models on GLUE and SQuAD. Interestingly, an extremely high masking rate of 80% can still preserve 95% fine-tuning performance and most of the accuracy in linguistic probing, challenging the conventional wisdom about the role of the masking rate. We then examine the interplay between masking rates and masking strategies and find that uniform masking requires a higher masking rate compared to sophisticated masking strategies such as span or PMI masking. Finally, we argue that increasing the masking rate has two distinct effects: it leads to more corruption, which makes the prediction task more difficult; it also enables more predictions, which benefits optimization. Using this framework, we revisit BERT's 80-10-10 corruption strategy. Together, our results contribute to a better understanding of MLM pre-training.


Quantum Advantage in Variational Bayes Inference

arXiv.org Artificial Intelligence

Variational Bayes (VB) inference algorithm is used widely to estimate both the parameters and the unobserved hidden variables in generative statistical models. The algorithm -- inspired by variational methods used in computational physics -- is iterative and can get easily stuck in local minima, even when classical techniques, such as deterministic annealing (DA), are used. We study a variational Bayes (VB) inference algorithm based on a non-traditional quantum annealing approach -- referred to as quantum annealing variational Bayes (QAVB) inference -- and show that there is indeed a quantum advantage to QAVB over its classical counterparts. In particular, we show that such better performance is rooted in key concepts from quantum mechanics: (i) the ground state of the Hamiltonian of a quantum system -- defined from the given variational Bayes (VB) problem -- corresponds to an optimal solution for the minimization problem of the variational free energy at very low temperatures; (ii) such a ground state can be achieved by a technique paralleling the quantum annealing process; and (iii) starting from this ground state, the optimal solution to the VB problem can be achieved by increasing the heat bath temperature to unity, and thereby avoiding local minima introduced by spontaneous symmetry breaking observed in classical physics based VB algorithms. We also show that the update equations of QAVB can be potentially implemented using $\lceil \log K \rceil$ qubits and $\mathcal{O} (K)$ operations per step. Thus, QAVB can match the time complexity of existing VB algorithms, while delivering higher performance.


Intel OpenVino Boost Semantic Segmentation prediction

#artificialintelligence

A booster to predict ML model on production line. I'm here to share my experience in Deep Learning Computer Vision model deployment. As we all know that deep learning models are bulky and they take lazy time to load the model and infer incoming data. I will be just walking through the steps I followed. I trained UNet(with different backbone)and DeepLabV3 segmentation models on Open Images Dataset v6 Extension (I will explain thoroughly about downloading dataset and converting masks to trainable masks in other blog).


Evolution

#artificialintelligence

One thing people cannot dodge in the history of humankind from the Stone Age to the digital age is evolution. When we speak about the age of digitalization, Alan Turing and John McCarthy are the pivotal people whose contributions towards Computer Science and Artificial Intelligence in the 1950s were remarkable. Most of the Artificial Intelligence concepts were already proposed during the 1950s. But Artificial Intelligence has become the talk of the town only in recent times. It is because of the Storage.


An interpretable semi-supervised classifier using two different strategies for amended self-labeling

arXiv.org Machine Learning

In the context of some machine learning applications, obtaining data instances is a relatively easy process but labeling them could become quite expensive or tedious. Such scenarios lead to datasets with few labeled instances and a larger number of unlabeled ones. Semi-supervised classification techniques combine labeled and unlabeled data during the learning phase in order to increase classifier's generalization capability. Regrettably, most successful semi-supervised classifiers do not allow explaining their outcome, thus behaving like black boxes. However, there is an increasing number of problem domains in which experts demand a clear understanding of the decision process. In this paper, we report on an extended experimental study presenting an interpretable self-labeling grey-box classifier that uses a black box to estimate the missing class labels and a white box to make the final predictions. Two different approaches for amending the self-labeling process are explored: a first one based on the confidence of the black box and the latter one based on measures from Rough Set Theory. The results of the extended experimental study support the interpretability by means of transparency and simplicity of our classifier, while attaining superior prediction rates when compared with state-of-the-art self-labeling classifiers reported in the literature.