 Inductive Learning




Pitfalls of Epistemic Uncertainty Quantification through Loss Minimisation

Neural Information Processing Systems

Uncertainty quantification has received increasing attention in machine learning in the recent past. In particular, a distinction between aleatoric and epistemic uncertainty has been found useful in this regard. The latter refers to the learner's (lack of) knowledge and appears to be especially difficult to measure and quantify. In this paper, we analyse a recent proposal based on the idea of a second-order learner, which yields predictions in the form of distributions over probability distributions. While standard (first-order) learners can be trained to predict accurate probabilities, namely by minimising suitable loss functions on sample data, we show that loss minimisation does not work for second-order predictors: The loss functions proposed for inducing such predictors do not incentivise the learner to represent its epistemic uncertainty in a faithful way.
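As a rough illustration of what a second-order prediction looks like, the sketch below represents a distribution over probability distributions as a Dirichlet. The Dirichlet choice, the function names, and the example parameters are assumptions made here for illustration; the abstract does not say which second-order family the analysed proposal uses. Two Dirichlets can share the same mean (the same first-order prediction) while differing in spread, and that spread is what a second-order learner would be expected to use to express epistemic uncertainty.

```python
import random


def dirichlet_mean(alpha):
    """Mean of Dirichlet(alpha): the implied first-order class probabilities."""
    total = sum(alpha)
    return [a / total for a in alpha]


def dirichlet_variance(alpha):
    """Per-class variance of Dirichlet(alpha): m_k * (1 - m_k) / (alpha_0 + 1)."""
    total = sum(alpha)
    return [(a / total) * (1 - a / total) / (total + 1) for a in alpha]


def sample_dirichlet(alpha, rng):
    """Draw one first-order distribution by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    norm = sum(draws)
    return [d / norm for d in draws]


if __name__ == "__main__":
    # Hypothetical second-order predictions for a binary problem.
    concentrated = [50.0, 50.0]  # same mean, little spread
    diffuse = [1.0, 1.0]         # same mean, large spread

    rng = random.Random(0)
    for name, alpha in [("concentrated", concentrated), ("diffuse", diffuse)]:
        print(name, dirichlet_mean(alpha), dirichlet_variance(alpha))
        print("  sample:", sample_dirichlet(alpha, rng))
```

Both predictions have the mean (0.5, 0.5), so a loss that depends only on the first-order mean cannot tell them apart.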


A ImageNet Texture

Neural Information Processing Systems

See Figures 7 and 8 for examples from the ImageNet-Texture dataset and their counterparts in the original ImageNet dataset. Shape is often less well-defined in these classes, for example in window screen and rapeseed.

B.1 Comparison of two ways to apply α in NCE loss

Since the denominator normalizes the 3 kinds of pairs equally, we only pay attention to the numerator. Because of the exponential tail, it applies an exponentially larger weight to the harder negatives. Our patch-based augmentation is also closely related to some of the self-supervised learning methods that solve jigsaw puzzles as the pretext task. All of our models are trained on 4 GTX 1080 Ti GPUs.


Instance-Dependent Partial Label Learning

Neural Information Processing Systems

Most existing partial label learning (PLL) approaches assume that the incorrect labels in each training example are picked at random as candidate labels. However, this assumption is not realistic, since the candidate labels are always instance-dependent. In this paper, we consider instance-dependent PLL and assume that each example is associated with a latent label distribution consisting of a real number for each label, representing the degree to which that label describes the feature. An incorrect label with a high degree is more likely to be annotated as a candidate label.
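The instance-dependent assumption can be sketched as a candidate-set sampler. This is a minimal illustration, not the paper's procedure: the function name, the example label distribution, and the specific rule (flip each incorrect label in with probability equal to its degree divided by the largest incorrect degree) are assumptions made here. The only properties taken from the abstract are that each example has a latent label distribution and that incorrect labels with a higher degree are more likely to become candidates.

```python
import random


def sample_candidate_set(label_dist, true_label, rng):
    """Sample a candidate label set for one example.

    label_dist: one non-negative degree per label (the latent label distribution).
    true_label: index of the correct label, always kept in the candidate set.
    Assumed rule: an incorrect label j is added with probability
    label_dist[j] / (largest degree among incorrect labels).
    """
    incorrect = [d for j, d in enumerate(label_dist) if j != true_label]
    top = max(incorrect, default=0.0)
    candidates = {true_label}
    for j, degree in enumerate(label_dist):
        if j == true_label:
            continue
        flip_prob = degree / top if top > 0 else 0.0
        if rng.random() < flip_prob:
            candidates.add(j)
    return candidates


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical label distribution for a 4-class example whose true label is 0.
    label_dist = [0.6, 0.3, 0.1, 0.0]
    for _ in range(5):
        print(sorted(sample_candidate_set(label_dist, 0, rng)))
```

Under this rule the incorrect label with the highest degree is always a candidate, and a label with degree zero never is; a uniform random-flipping model would instead treat all incorrect labels alike.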





Supplementary Material for Paper "Universal Semi-Supervised Learning"

Neural Information Processing Systems

Moreover, we will conduct additional experiments to further evaluate our method in Section C. Furthermore, we provide the standard deviation results that correspond to the main paper in Section D. Finally, we will discuss the limitations and social impact of our method in Section E.

VisDA2017 datasets, we set the batch size to 64. Other implementation details are presented below. It contains 3 domains: "Amazon" (A), "DSLR" (D), and "Webcam" (W), and each domain is composed of 31 classes.

Shared
    Learning rate decay factor: 0.2
    Training iteration at which learning rate decay starts: 400,000
    Training iteration at which consistency coefficient ramp-up starts: 200,000
Supervised
    Initial learning rate: 0.003
Π-Model [6, 10]
    Initial learning rate: 3 10

CAFA framework, which includes class-sharing data detection and feature adaptation. Here we use PI as the backbone method.


Driving Accurate Allergen Prediction with Protein Language Models and Generalization-Focused Evaluation

arXiv.org Artificial Intelligence

Allergens, typically proteins capable of triggering adverse immune responses, represent a significant public health challenge. To accurately identify allergen proteins, we introduce Applm (Allergen Prediction with Protein Language Models), a computational framework that leverages the 100-billion parameter xTrimoPGLM protein language model. We show that Applm consistently outperforms seven state-of-the-art methods in a diverse set of tasks that closely resemble difficult real-world scenarios. These include identifying novel allergens that lack similar examples in the training set, differentiating between allergens and non-allergens among homologs with high sequence similarity, and assessing functional consequences of mutations that create few changes to the protein sequences. Our analysis confirms that xTrimoPGLM, originally trained on one trillion tokens to capture general protein sequence characteristics, is crucial for Applm's performance by detecting important differences among protein sequences. In addition to providing Applm as open-source software, we also provide our carefully curated benchmark datasets to facilitate future research.