Goto

Collaborating Authors

 Inductive Learning


ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution

arXiv.org Artificial Intelligence

Appendix A. Relational MDs and the UCI Property Here, we formally extend the class of matching dependencies (MDs) introduced in Section 2.1, which we will call classical MDs, to the larger class of relational MDs. This extension is motivated by the application of MDs to blocking for entity resolution, but applications can be easily foreseen in other areas where declarative relational knowledge may be useful in combination with matching and merging. We also identify classes of relational MDs for which a single clean instance exists, no matter how the MDs are enforced, that can be computed through the chase procedure in polynomial time in the size of the database on which the MDs are enforced. We say that the MDs (in some cases in combination with an initial instance) have the unique clean instance property (UCI property). More details can be found in [11, 6, 7]. Definition 1. the form: Given a relational schema R, a relational MD is a formula of ฯ•: t


Semi-Supervised Radio Signal Identification

arXiv.org Machine Learning

Radio signal recognition in dense and complex multi-user spectrum environments is an important tool for optimizing spectrum utilization, identifying and minimizing interference, enforcing spectrum policy, and implementing effective radio sensing and coordination systems. Classical approaches to the problem focus on energy detection and the use of expert features and decision criteria to identify and categorize specific modulation types [2] [1]. These approaches rely on prior knowledge of signal properties, features, and decision statistics to separate known modulations and are typically derived under simplified analytic hardware, propagation, radio environment models. We recently demonstrated the viability of naive feature learning for supervised radio classification systems [14] which allows for joint feature and classifier learning given labeled datasets and examples. In this case we were able to outperform traditional expert decision statistic based classification in sensitivity and accuracy by a significant margin. This was a powerful result, providing significant performance improvements against current day solutions, but it still relied entirely on supervised learning and well curated training data. In the real world, and especially in the radio domain, we are faced with vast amounts of unlabeled example data available to our sensor and incomplete knowledge of class labels comprising ground truth. To address this problem we investigate alternative strategies for radio identification learning which rely less heavily on labeled training data and are capable of making sense of radio signals with either no or less labeled examples, potentially drastically reducing the burden of data curation on such a machine learning system for developers and maintainers, and allowing systems to recognize new signals and scale to to understand new environments over time.


inejc/painters

#artificialintelligence

This repository contains a 1st place solution for the Painter by Numbers competition on Kaggle. Below is a brief description of the dataset and approaches I've used to build and validate a predictive model. The challenge of the competition was to examine pairs of paintings and determine whether they were painted by the same artist. The training set consists of artwork images and their corresponding class labels (painters). Examples in the test set were split into 13 groups and all possible pairs within each group needed to be examined for the submission.


Stochastic Online AUC Maximization

Neural Information Processing Systems

Area under ROC (AUC) is a metric which is widely used for measuring the classification performance for imbalanced data. It is of theoretical and practical interest to develop online learning algorithms that maximizes AUC for large-scale data. A specific challenge in developing online AUC maximization algorithm is that the learning objective function is usually defined over a pair of training examples of opposite classes, and existing methods achieves on-line processing with higher space and time complexity. In this work, we propose a new stochastic online algorithm for AUC maximization. In particular, we show that AUC optimization can be equivalently formulated as a convex-concave saddle point problem. From this saddle representation, a stochastic online algorithm (SOLAM) is proposed which has time and space complexity of one datum. We establish theoretical convergence of SOLAM with high probability and demonstrate its effectiveness and efficiency on standard benchmark datasets.


A Consistent Regularization Approach for Structured Prediction

Neural Information Processing Systems

We propose and analyze a regularization approach for structured prediction problems. We characterize a large class of loss functions that allows to naturally embed structured outputs in a linear space. We exploit this fact to design learning algorithms using a surrogate loss approach and regularization techniques. We prove universal consistency and finite sample bounds characterizing the generalization properties of the proposed method. Experimental results are provided to demonstrate the practical usefulness of the proposed approach.


beta-risk: a New Surrogate Risk for Learning from Weakly Labeled Data

Neural Information Processing Systems

During the past few years, the machine learning community has paid attention to developping new methods for learning from weakly labeled data. This field covers different settings like semi-supervised learning, learning with label proportions, multi-instance learning, noise-tolerant learning, etc. This paper presents a generic framework to deal with these weakly labeled scenarios. We introduce the beta-risk as a generalized formulation of the standard empirical risk based on surrogate margin-based loss functions. This risk allows us to express the reliability on the labels and to derive different kinds of learning algorithms. We specifically focus on SVMs and propose a soft margin beta-svm algorithm which behaves better that the state of the art.


Adaptive Smoothed Online Multi-Task Learning

Neural Information Processing Systems

This paper addresses the challenge of jointly learning both the per-task model parameters and the inter-task relationships in a multi-task online learning setting. The proposed algorithm features probabilistic interpretation, efficient updating rules and flexible modulation on whether learners focus on their specific task or on jointly address all tasks. The paper also proves a sub-linear regret bound as compared to the best linear predictor in hindsight. Experiments over three multi-task learning benchmark datasets show advantageous performance of the proposed approach over several state-of-the-art online multi-task learning baselines.


Discriminative Gaifman Models

Neural Information Processing Systems

We present discriminative Gaifman models, a novel family of relational machine learning models. Gaifman models learn feature representations bottom up from representations of locally connected and bounded-size regions of knowledge bases (KBs). Considering local and bounded-size neighborhoods of knowledge bases renders logical inference and learning tractable, mitigates the problem of overfitting, and facilitates weight sharing. Gaifman models sample neighborhoods of knowledge bases so as to make the learned relational models more robust to missing objects and relations which is a common situation in open-world KBs. We present the core ideas of Gaifman models and apply them to large-scale relational learning problems. We also discuss the ways in which Gaifman models relate to some existing relational machine learning approaches.


Launch and Iterate: Reducing Prediction Churn

Neural Information Processing Systems

Practical applications of machine learning often involve successive training iterations with changes to features and training examples. Ideally, changes in the output of any new model should only be improvements (wins) over the previous iteration, but in practice the predictions may change neutrally for many examples, resulting in extra net-zero wins and losses, referred to as unnecessary churn. These changes in the predictions are problematic for usability for some applications, and make it harder and more expensive to measure if a change is statistically significant positive. In this paper, we formulate the problem and present a stabilization operator to regularize a classifier towards a previous classifier. We use a Markov chain Monte Carlo stabilization operator to produce a model with more consistent predictions without adversely affecting accuracy. We investigate the properties of the proposal with theoretical analysis. Experiments on benchmark datasets for different classification algorithms demonstrate the method and the resulting reduction in churn.


Supervised learning through the lens of compression

Neural Information Processing Systems

This work continues the study of the relationship between sample compression schemes and statistical learning, which has been mostly investigated within the framework of binary classification. We first extend the investigation to multiclass categorization: we prove that in this case learnability is equivalent to compression of logarithmic sample size and that the uniform convergence property implies compression of constant size. We use the compressibility-learnability equivalence to show that (i) for multiclass categorization, PAC and agnostic PAC learnability are equivalent, and (ii) to derive a compactness theorem for learnability. We then consider supervised learning under general loss functions: we show that in this case, in order to maintain the compressibility-learnability equivalence, it is necessary to consider an approximate variant of compression. We use it to show that PAC and agnostic PAC are not equivalent, even when the loss function has only three values.