The scarcity of data annotated at the desired level of granularity is a recurring issue in many applications. Significant amounts of effort have been devoted to developing weakly supervised methods tailored to each individual setting, which are often carefully designed to take advantage of the particular properties of weak supervision regimes, form of available data and prior knowledge of the task at hand. Unfortunately, it is difficult to adapt these methods to new tasks and/or forms of data, which often require different weak supervision regimes or models. We present a general-purpose method that can solve any weakly supervised learning problem irrespective of the weak supervision regime or the model. The proposed method turns any off-the-shelf strongly supervised classifier into a weakly supervised classifier and allows the user to specify any arbitrary weakly supervision regime via a loss function. We apply the method to several different weak supervision regimes and demonstrate competitive results compared to methods specifically engineered for those settings.
Contextual bandits are a common problem faced by machine learning practitioners in domains as diverse as hypothesis testing to product recommendations. There have been a lot of approaches in exploiting rich data representations for contextual bandit problems with varying degree of success. Self-supervised learning is a promising approach to find rich data representations without explicit labels. In a typical self-supervised learning scheme, the primary task is defined by the problem objective (e.g. clustering, classification, embedding generation etc.) and the secondary task is defined by the self-supervision objective (e.g. rotation prediction, words in neighborhood, colorization, etc.). In the usual self-supervision, we learn implicit labels from the training data for a secondary task. However, in the contextual bandit setting, we don't have the advantage of getting implicit labels due to lack of data in the initial phase of learning. We provide a novel approach to tackle this issue by combining a contextual bandit objective with a self supervision objective. By augmenting contextual bandit learning with self-supervision we get a better cumulative reward. Our results on eight popular computer vision datasets show substantial gains in cumulative reward. We provide cases where the proposed scheme doesn't perform optimally and give alternative methods for better learning in these cases.
We consider an extension of the contextual bandit setting, motivated by several practical applications, where an unlabeled history of contexts can become available for pre-training before the online decision-making begins. We propose an approach for improving the performance of contextual bandit in such setting, via adaptive, dynamic representation learning, which combines offline pre-training on unlabeled history of contexts with online selection and modification of embedding functions. Our experiments on a variety of datasets and in different nonstationary environments demonstrate clear advantages of our approach over the standard contextual bandit.
Deep artificial neural networks, trained with labeled data sets are widely used in numerous vision and robotics applications today. In terms of AI, these are called reflex models, referring to the fact that they do not self-evolve or actively adapt to environmental changes. As demand for intelligent robot control expands to many high level tasks, reinforcement learning and state based models play an increasingly important role. Herein, in computer vision and robotics domain, we study a novel approach to add reinforcement controls onto the image recognition reflex models to attain better overall performance, specifically to a wider environment range beyond what is expected of the task reflex models. Follow a common infrastructure with environment sensing and AI based modeling of self-adaptive agents, we implement multiple types of AI control agents. To the end, we provide comparative results of these agents with baseline, and an insightful analysis of their benefit to improve overall image recognition performance in real world.
Predictive models deployed in the real world may assign incorrect labels to instances with high confidence. Such errors or unknown unknowns are rooted in model incompleteness, and typically arise because of the mismatch between training data and the cases encountered at test time. As the models are blind to such errors, input from an oracle is needed to identify these failures. In this paper, we formulate and address the problem of informed discovery of unknown unknowns of any given predictive model where unknown unknowns occur due to systematic biases in the training data.We propose a model-agnostic methodology which uses feedback from an oracle to both identify unknown unknowns and to intelligently guide the discovery. We employ a two-phase approach which first organizes the data into multiple partitions based on the feature similarity of instances and the confidence scores assigned by the predictive model, and then utilizes an explore-exploit strategy for discovering unknown unknowns across these partitions. We demonstrate the efficacy of our framework by varying the underlying causes of unknown unknowns across various applications. To the best of our knowledge, this paper presents the first algorithmic approach to the problem of discovering unknown unknowns of predictive models.