South America
Bayesian active learning for optimization and uncertainty quantification in protein docking
Motivation: Ab initio protein docking represents a major challenge for optimizing a noisy and costly "black box"-like function in a high-dimensional space. Despite progress in this field, there is no docking method available for rigorous uncertainty quantification (UQ) of its solution quality (e.g. interface RMSD or iRMSD). Results: We introduce a novel algorithm, Bayesian Active Learning (BAL), for optimization and UQ of such black-box functions and flexible protein docking. BAL directly models the posterior distribution of the global optimum (or native structures for protein docking) with active sampling and posterior estimation iteratively feeding each other. Furthermore, we use complex normal modes to represent a homogeneous Euclidean conformation space suitable for high-dimension optimization and construct funnel-like energy models for encounter complexes. Over a protein docking benchmark set and a CAPRI set including homology docking, we establish that BAL significantly improves against both starting points by rigid docking and refinements by particle swarm optimization, providing a top-3 near-native prediction for one third of the targets. BAL also generates tight confidence intervals with half range around 25% of iRMSD and confidence level at 85%. Its estimated probability of a prediction being native or not achieves binary classification AUROC at 0.93 and AUPRC over 0.60 (compared to 0.14 by chance), and is also found to help rank predictions. To the best of our knowledge, this study represents the first uncertainty quantification solution for protein docking, with theoretical rigor and comprehensive assessment. Source codes are available at https://github.com/Shen-Lab/BAL.
Probabilistic Discriminative Learning with Layered Graphical Models
Shen, Yuesong, Wu, Tao, Domokos, Csaba, Cremers, Daniel
Probabilistic graphical models are traditionally known for their successes in generative modeling. In this work, we advocate layered graphical models (LGMs) for probabilistic discriminative learning. To this end, we design LGMs in close analogy to neural networks (NNs), that is, they have deep hierarchical structures and convolutional or local connections between layers. Equipped with tensorized truncated variational inference, our LGMs can be efficiently trained via backpropagation on mainstream deep learning frameworks such as PyTorch. To deal with continuous valued inputs, we use a simple yet effective soft-clamping strategy for efficient inference. Through extensive experiments on image classification over MNIST and FashionMNIST datasets, we demonstrate that LGMs are capable of achieving competitive results comparable to NNs of similar architectures, while preserving transparent probabilistic modeling.
Distributionally Robust Removal of Malicious Nodes from Networks
Yu, Sixie, Vorobeychik, Yevgeniy
An important problem in networked systems is detection and removal of suspected malicious nodes. A crucial consideration in such settings is the uncertainty endemic in detection, coupled with considerations of network connectivity, which impose indirect costs from mistakenly removing benign nodes as well as failing to remove malicious nodes. A recent approach proposed to address this problem directly tackles these considerations, but has a significant limitation: it assumes that the decision maker has accurate knowledge of the joint maliciousness probability of the nodes on the network. This is clearly not the case in practice, where such a distribution is at best an estimate from limited evidence. To address this problem, we propose a distributionally robust framework for optimal node removal. While the problem is NP-Hard, we propose a principled algorithmic technique for solving it approximately based on duality combined with Semidefinite Programming relaxation. A combination of both theoretical and empirical analysis, the latter using both synthetic and real data, provides strong evidence that our algorithmic approach is highly effective and, in particular, is significantly more robust than the state of the art.
An Evaluation of the Human-Interpretability of Explanation
Lage, Isaac, Chen, Emily, He, Jeffrey, Narayanan, Menaka, Kim, Been, Gershman, Sam, Doshi-Velez, Finale
Recent years have seen a boom in interest in machine learning systems that can provide a human-understandable rationale for their predictions or decisions. However, exactly what kinds of explanation are truly human-interpretable remains poorly understood. This work advances our understanding of what makes explanations interpretable under three specific tasks that users may perform with machine learning systems: simulation of the response, verification of a suggested response, and determining whether the correctness of a suggested response changes under a change to the inputs. Through carefully controlled human-subject experiments, we identify regularizers that can be used to optimize for the interpretability of machine learning systems. Our results show that the type of complexity matters: cognitive chunks (newly defined concepts) affect performance more than variable repetitions, and these trends are consistent across tasks and domains. This suggests that there may exist some common design principles for explanation systems.
Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition
de Amorim, Cleison Correia, Macêdo, David, Zanchettin, Cleber
Abstract--The recognition of sign language is a challenging task with an important role in society to facilitate the communication of deaf persons. We propose a new Spatial-Temporal Graph Convolutional Network approach to sign language recognition based on human skeletal movements. The method uses graphs to capture the sign dynamics in two dimensions, spatial and temporal, considering the complex aspects of the language. Additionally, we present a new dataset of human skeletons for sign language based on ASLLVD to contribute to future related studies. I. INTRODUCTION Sign language is a visual communication skill that enables individuals with different types of hearing impairment to communicate in society. It is the language used by most deaf people in their daily lives and, moreover, it is the symbol of identification between the members of that community and the main force that unites them. Sign language has a very close relationship with the culture of the country or even region, and for this reason, each nation has its own language [1]. According to the World Health Organization, the number of deaf people is about 466 million, and the organization estimates that by 2050 this number will exceed 900 million, which is equivalent to a forecast of 1 in 10 individuals around the world [2].
Natural Analysts in Adaptive Data Analysis
Adaptive data analysis is frequently criticized for its pessimistic generalization guarantees. The source of these pessimistic bounds is a model that permits arbitrary, possibly adversarial analysts that optimally use information to bias results. Although this is a central issue in the field, notions of natural analysts that allow for more optimistic bounds, faithful to the reality that typical analysts aren't adversarial, are still lacking. In this work, we propose notions of natural analysts that smoothly interpolate between the optimal non-adaptive bounds and the best-known adaptive generalization bounds. To accomplish this, we model the analyst's knowledge as evolving according to the rules of an unknown dynamical system that takes in revealed information and outputs new statistical queries to the data. This allows us to restrict the analyst through different natural control-theoretic notions. One such notion corresponds to a recency bias, formalizing an inability to arbitrarily use distant information. Another complementary notion formalizes an anchoring bias, a tendency to weight initial information more strongly. Both notions come with quantitative parameters that smoothly interpolate between the non-adaptive case and the fully adaptive case, allowing for a rich spectrum of intermediate analysts that are neither non-adaptive nor adversarial. These notions are natural not only from a cognitive perspective: we show that they also capture standard optimization methods, like gradient descent, in various settings. This gives a new interpretation to the fact that gradient descent tends to overfit much less than its adaptive nature might suggest.
Weak-lensing shear measurement with machine learning: teaching artificial neural networks about feature noise
Tewes, Malte, Kuntzer, Thibault, Nakajima, Reiko, Courbin, Frédéric, Hildebrandt, Hendrik, Schrabback, Tim
Cosmic shear is a primary cosmological probe for several present and upcoming surveys investigating dark matter and dark energy, such as Euclid or WFIRST. The probe requires an extremely accurate measurement of the shapes of millions of galaxies based on imaging data. Crucially, the shear measurement must address and compensate for a range of interwoven nuisance effects related to the instrument optics and detector, noise, unknown galaxy morphologies, colors, blending of sources, and selection effects. This paper explores the use of supervised machine learning (ML) as a tool to solve this inverse problem. We present a simple architecture that learns to regress shear point estimates and weights via shallow artificial neural networks. The networks are trained on simulations of the forward observing process, and take combinations of moments of the galaxy images as inputs. A challenging peculiarity of this ML application is the combination of the noisiness of the input features and the requirements on the accuracy of the inverse regression. To address this issue, the proposed training algorithm minimizes bias over multiple realizations of individual source galaxies, reducing the sensitivity to properties of the overall sample of source galaxies. Importantly, an observational selection function of these source galaxies can be straightforwardly taken into account via the weights. We first introduce key aspects of our approach using toy-model simulations, and then demonstrate its potential on images mimicking Euclid data. Finally, we analyze images from the GREAT3 challenge, obtaining competitively low shear biases despite the use of a simple training set. We conclude that the further development of ML approaches is of high interest to meet the stringent requirements on the shear measurement in current and future surveys. A demonstration implementation of our technique is publicly available.
- AI What a surprise to find a comic strip on...
What a surprise to find a comic strip on Artificial Intelligence written by somebody called Montaigne! Marion Montaigne présente l'Intelligence artificielle: https://youtu.be/DtdoNksCtmE Indeed, Montaigne discusses in the Essays whether the savages found mainly in Brazil are human beings or not and whether they should be considered humans, the way the barbarians in Ancient Greece were finally recognised as such. As it is growing more and more difficult to tell an artificial intelligence from a human being, I think reconsidering this classic is relevant, as it again challenges human identity. From the conquistadors until now, the concept of humanity has evolved. On the other hand, what is common to humanity at each step, from the cannibals until now, is some human beings' will to build weapons to kill others.
Learning Choice Functions
Pfannschmidt, Karlson, Gupta, Pritha, Hüllermeier, Eyke
We study the problem of learning choice functions, which play an important role in various domains of application, most notably in the field of economics. Formally, a choice function is a mapping from sets to sets: Given a set of choice alternatives as input, a choice function identifies a subset of most preferred elements. Learning choice functions from suitable training data comes with a number of challenges. For example, the sets provided as input and the subsets produced as output can be of any size. Moreover, since the order in which alternatives are presented is irrelevant, a choice function should be symmetric. Perhaps most importantly, choice functions are naturally context-dependent, in the sense that the preference in favor of an alternative may depend on what other options are available. We formalize the problem of learning choice functions and present two general approaches based on two representations of context-dependent utility functions. Both approaches are instantiated by means of appropriate neural network architectures, and their performance is demonstrated on suitable benchmark tasks.
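As a small illustration of the definition above (not the authors' learned models), the following sketch implements one hand-written choice function in Python: it selects the Pareto-optimal points of a set. The function names `dominates` and `pareto_choice` are hypothetical and chosen only for this example. The rule takes input sets of any size, ignores the order of the alternatives, and is context-dependent, since whether a point is chosen depends on which other points are offered.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """True if a is at least as good as b in every coordinate
    and strictly better in at least one (larger is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_choice(alternatives: List[Point]) -> List[Point]:
    """Illustrative choice function: map a set of alternatives to the
    subset that is not dominated by any other alternative in the set."""
    return [
        a for a in alternatives
        if not any(dominates(b, a) for b in alternatives)
    ]


if __name__ == "__main__":
    offered = [(1.0, 2.0), (2.0, 1.0), (0.0, 0.0)]
    print(pareto_choice(offered))
    # Adding a stronger alternative changes which points are chosen,
    # which is the context dependence described in the abstract.
    print(pareto_choice(offered + [(3.0, 3.0)]))
```

A learned choice function would replace the fixed dominance rule with a model fitted to observed choices; the sketch only makes the input and output types and the symmetry requirement concrete.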
Evaluating Bregman Divergences for Probability Learning from Crowd
Crowdsourcing scenarios are a good example of settings that yield a probability distribution over categories, showing what people, taken as a whole, think. Learning a predictive model of this probability distribution can be much more valuable than learning only a discriminative model that gives the most likely category of the data. Here we present different models that adapt to having a probability distribution as the target for training a machine learning model. We focus on the Bregman divergences framework for the objective function to minimize. The results show that special care must be taken when building an objective function and when considering equal optimization on neural networks in the Keras framework.
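For readers unfamiliar with the framework, a Bregman divergence generated by a differentiable convex function φ is D_φ(p, q) = φ(p) − φ(q) − ⟨∇φ(q), p − q⟩. The following is a minimal Python sketch of that standard definition, not the paper's Keras implementation; the helper names (`bregman`, `neg_entropy`, `sq_norm`) are hypothetical. It shows that negative entropy recovers the KL divergence on probability vectors and that the squared norm recovers the squared Euclidean distance. It assumes every entry of `q` is strictly positive when negative entropy is used.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def bregman(phi: Callable[[Vector], float],
            grad_phi: Callable[[Vector], List[float]],
            p: Vector, q: Vector) -> float:
    """Bregman divergence D_phi(p, q) = phi(p) - phi(q) - <grad phi(q), p - q>."""
    inner = sum(g * (pi - qi) for g, pi, qi in zip(grad_phi(q), p, q))
    return phi(p) - phi(q) - inner


def neg_entropy(p: Vector) -> float:
    """phi(p) = sum_i p_i log p_i, with the convention 0 log 0 = 0."""
    return sum(x * math.log(x) for x in p if x > 0)


def neg_entropy_grad(p: Vector) -> List[float]:
    """Gradient of negative entropy; requires strictly positive entries."""
    return [math.log(x) + 1.0 for x in p]


def sq_norm(p: Vector) -> float:
    """phi(p) = sum_i p_i^2."""
    return sum(x * x for x in p)


def sq_norm_grad(p: Vector) -> List[float]:
    return [2.0 * x for x in p]


if __name__ == "__main__":
    target = [0.7, 0.2, 0.1]      # e.g. normalized crowd votes (made-up numbers)
    predicted = [0.5, 0.3, 0.2]   # e.g. a model's softmax output (made-up numbers)
    print(bregman(neg_entropy, neg_entropy_grad, target, predicted))  # KL(target || predicted)
    print(bregman(sq_norm, sq_norm_grad, target, predicted))          # squared Euclidean distance
```

Different generators φ penalize the same prediction error differently, which is one reason the choice of objective matters when the training target is a full distribution rather than a single label.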