
 Inductive Learning



The Pessimistic Limits of Margin-based Losses in Semi-supervised Learning

arXiv.org Machine Learning

We show that for linear classifiers defined by convex margin-based surrogate losses that are monotonically decreasing, it is impossible to construct any semi-supervised approach that is able to guarantee an improvement over the supervised classifier measured by this surrogate loss. For non-monotonically decreasing loss functions, we demonstrate that safe improvements are possible.

Key words and phrases: Semi-supervised Learning, Margin-based loss, Surrogate loss, Logistic Loss, Hinge Loss, Quadratic Loss, Absolute Loss.

1. INTRODUCTION

Semi-supervised learning has delivered encouraging results in various settings, e.g. for object detection in computer vision [1], protein function prediction from sequence data [2] or prediction of cancer recurrence [3] in the biomedical domain, and part-of-speech tagging in natural language processing [4]. In other settings, however, using unlabeled data has been shown to lead to a decrease in performance when compared to the supervised solution [4, 5]. For semi-supervised classifiers to be used safely in practice, we may at least want some guarantee that they improve performance over their supervised alternatives.
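To make the distinction between monotonically decreasing and non-monotonic losses concrete, the sketch below evaluates the four losses named in the keywords on a grid of margins. The formulas are the standard textbook forms written as functions of the margin m = y * f(x); they are assumed here for illustration and are not taken from the paper, and the helper names are hypothetical.

```python
import math

# Margin-based surrogate losses as functions of the margin m = y * f(x).
# Standard textbook forms, assumed here for illustration only.
LOSSES = {
    "logistic": lambda m: math.log(1.0 + math.exp(-m)),
    "hinge": lambda m: max(0.0, 1.0 - m),
    "quadratic": lambda m: (1.0 - m) ** 2,
    "absolute": lambda m: abs(1.0 - m),
}

# Grid of margins from -3.0 to 3.0 in steps of 0.1 (an arbitrary range).
MARGINS = [i / 10.0 for i in range(-30, 31)]


def never_increases(loss, margins):
    """Return True if the loss does not increase as the margin grows."""
    values = [loss(m) for m in margins]
    return all(b <= a + 1e-12 for a, b in zip(values, values[1:]))


def classify_losses():
    """Map each loss name to whether it is non-increasing on the grid."""
    return {name: never_increases(fn, MARGINS) for name, fn in LOSSES.items()}
```

On this grid the logistic and hinge losses never increase with the margin, while the quadratic and absolute losses rise again once the margin exceeds 1, which appears to be the distinction the abstract draws between the two cases.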


RSSL: Semi-supervised Learning in R

arXiv.org Machine Learning

In this paper, we introduce a package for semi-supervised learning research in the R programming language called RSSL. We cover the purpose of the package, the methods it includes and comment on their use and implementation. We then show, using several code examples, how the package can be used to replicate well-known results from the semi-supervised learning literature.


A real quick snooze! New record set for the world's fastest BED - as modified vehicle clocks 84mph on the race track

Daily Mail - Science & tech

New record set for the world's fastest BED with motorised mattress clocking 84mph on a race track. Engineers were commissioned by a hotel booking site to build a motorised bed. British racing driver Tom Onslow-Cole, 29, took the piece of furniture for a spin and broke the Guinness World Record for the World's Fastest Bed at 83.8mph. Crossing the finish line: Adjudicators clocked it whooshing forwards at 83.8 mph. A wheely great sleep: Onslow-Cole said his speedy snooze was an 'unforgettable experience'. He added: 'I hope it'll stand the test of time - it'll take some beating!'


3D Generative Adversarial Network

#artificialintelligence

We study the problem of 3D object generation. We propose a novel framework, namely 3D Generative Adversarial Network (3D-GAN), which generates 3D objects from a probabilistic space by leveraging recent advances in volumetric convolutional networks and generative adversarial nets. The benefits of our model are three-fold: first, the use of an adversarial criterion, instead of traditional heuristic criteria, enables the generator to capture object structure implicitly and to synthesize high-quality 3D objects; second, the generator establishes a mapping from a low-dimensional probabilistic space to the space of 3D objects, so that we can sample objects without a reference image or CAD models, and explore the 3D object manifold; third, the adversarial discriminator provides a powerful 3D shape descriptor which, learned without supervision, has wide applications in 3D object recognition. Experiments demonstrate that our method generates high-quality 3D objects, and our unsupervisedly learned features achieve impressive performance on 3D object recognition, comparable with those of supervised learning methods.


Improving Predictions with Ensemble Model

@machinelearnbot

"Alone we can do so little; together we can do so much." This phrase, attributed to Helen Keller, reflects decades of real-world achievements and success stories. The same holds for many high-impact innovations in advanced technology. Machine learning follows the same principle: ensemble methods aim to make predictions and classifications more accurate, and ensemble modeling has proven to be one of the most convincing ways to build highly accurate predictive models. Ensemble methods are learning models that achieve their performance by combining the opinions of multiple learners. Typically, an ensemble model is a supervised learning technique that combines multiple weak learners or models into a strong learner, using bagging and boosting for data sampling.

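As a rough illustration of the ensemble idea described above, the following sketch implements bagging only (boosting is not shown): each weak learner is a one-feature threshold classifier trained on a bootstrap sample, and the ensemble predicts by majority vote. All function names, the toy data, and the parameter defaults are hypothetical choices made for this example, not part of the original article.

```python
import random
from collections import Counter


def train_stump(points):
    """Fit a weak learner: a threshold on a single feature.

    points is a list of (x, label) pairs with label in {0, 1}.
    Returns (threshold, label_above) with the fewest training errors.
    """
    best = None
    for threshold in sorted({x for x, _ in points}):
        for label_above in (0, 1):
            errors = sum(
                (label_above if x >= threshold else 1 - label_above) != y
                for x, y in points
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, label_above)
    return best[1], best[2]


def predict_stump(stump, x):
    """Predict label_above at or above the threshold, the other label below."""
    threshold, label_above = stump
    return label_above if x >= threshold else 1 - label_above


def bagging_fit(points, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(points) examples with replacement.
        sample = [rng.choice(points) for _ in points]
        models.append(train_stump(sample))
    return models


def bagging_predict(models, x):
    """Combine the weak learners by majority vote."""
    votes = Counter(predict_stump(model, x) for model in models)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): label 0 for x < 5, label 1 for x >= 5.
DATA = [(x, 0 if x < 5 else 1) for x in range(10)]
MODELS = bagging_fit(DATA)
```

Calling `bagging_predict(MODELS, 0)` and `bagging_predict(MODELS, 9)` queries the ensemble at the two ends of the toy range. Using an odd number of models avoids tied votes in the two-class case.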

Hierarchical Partitioning of the Output Space in Multi-label Data

arXiv.org Machine Learning

Hierarchy Of Multi-label classifiers (HOMER) is a multi-label learning algorithm that breaks the initial learning task into several easier sub-tasks by first constructing a hierarchy of labels from a given label set and then applying a given base multi-label classifier (MLC) to the resulting sub-problems. The primary goal is to effectively address class imbalance and scalability issues that often arise in real-world multi-label classification problems. In this work, we present the general setup for a HOMER model and a simple extension of the algorithm that is suited for MLCs that output rankings. Furthermore, we provide a detailed analysis of the properties of the algorithm, in terms of both effectiveness and computational complexity. A secondary contribution is a balanced variant of the k-means algorithm, which is used in the first step of the label hierarchy construction. We conduct extensive experiments on six real-world datasets, empirically studying HOMER's parameters and providing examples of instantiations of the algorithm with different clustering approaches and MLCs. The empirical results demonstrate a significant improvement over the given base MLC.


Causal Discovery as Semi-Supervised Learning

arXiv.org Machine Learning

In this short report, we discuss an approach to estimating causal graphs in which indicators of causal influence between variables are treated as labels in a machine learning formulation. Available data on the variables of interest are used as "inputs" to estimate the labels. We frame the problem as one of semi-supervised learning: available interventional data or background knowledge provide labels on some edges in the graph and the remaining edges are treated as unlabelled objects. To illustrate the key ideas, we consider a simple approach to feature construction (rooted in bivariate kernel density estimation) and embed this within a semi-supervised manifold framework. Results on yeast knockout data demonstrate that the proposed approach can identify causal relationships as validated by unseen interventional experiments. An advantage of the formulation we propose is that by reframing causal discovery as semi-supervised learning, it allows a range of data-driven approaches to be brought to bear on causal discovery, without demanding specification of full probability models or explicit models of underlying mechanisms.


Graph-based semi-supervised learning for relational networks

arXiv.org Machine Learning

We address the problem of semi-supervised learning in relational networks, networks in which nodes are entities and links are the relationships or interactions between them. Typically this problem is confounded with the problem of graph-based semi-supervised learning (GSSL), because both problems represent the data as a graph and predict the missing class labels of nodes. However, not all graphs are created equally. In GSSL a graph is constructed, often from independent data, based on similarity. As such, edges tend to connect instances with the same class label. Relational networks, however, can be more heterogeneous and edges do not always indicate similarity. For instance, instead of links being more likely to connect nodes with the same class label, they may occur more frequently between nodes with different class labels (link-heterogeneity). Or nodes with the same class label do not necessarily have the same type of connectivity across the whole network (class-heterogeneity), e.g. in a network of sexual interactions we may observe links between opposite genders in some parts of the graph and links between the same genders in others. Performing classification in networks with different types of heterogeneity is a hard problem that is made harder still when we do not know a-priori the type or level of heterogeneity. Here we present two scalable approaches for graph-based semi-supervised learning for the more general case of relational networks. We demonstrate these approaches on synthetic and real-world networks that display different link patterns within and between classes. Compared to state-of-the-art approaches, ours give better classification performance without prior knowledge of how classes interact. In particular, our two-step label propagation algorithm gives consistently good accuracy and runs on networks of over 1.6 million nodes and 30 million edges in around 12 seconds.
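For orientation, here is a minimal sketch of the generic similarity-based label propagation that the abstract contrasts with the relational case. It assumes that edges connect nodes of the same class, and it is not the authors' two-step algorithm. The function name, the graph encoding, and the iteration count are assumptions made for this example.

```python
def propagate_labels(neighbors, seeds, n_iters=100):
    """Basic label propagation on an undirected graph.

    neighbors: dict mapping each node to a list of adjacent nodes.
    seeds: dict mapping labelled nodes to a score of +1.0 or -1.0.
    Returns a dict of scores; the sign of a score is the predicted class.
    """
    scores = {node: seeds.get(node, 0.0) for node in neighbors}
    for _ in range(n_iters):
        updated = {}
        for node, nbrs in neighbors.items():
            if node in seeds:
                # Labelled nodes are clamped to their known label.
                updated[node] = seeds[node]
            elif nbrs:
                # Unlabelled nodes take the mean score of their neighbours.
                updated[node] = sum(scores[n] for n in nbrs) / len(nbrs)
            else:
                updated[node] = 0.0
        scores = updated
    return scores


# Path graph 0 - 1 - 2 - 3 with the two end nodes labelled (toy example).
PATH = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
SCORES = propagate_labels(PATH, {0: 1.0, 3: -1.0})
```

On the toy path graph, node 1 ends up with a positive score and node 2 with a negative one, so each unlabelled node adopts the class of its nearer labelled neighbour. Under link-heterogeneity, where edges tend to join different classes, this averaging rule would pull nodes toward the wrong label, which is the failure mode the abstract motivates.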


What makes ImageNet good for transfer learning?

arXiv.org Artificial Intelligence

The tremendous success of ImageNet-trained deep features on a wide range of transfer tasks begs the question: what are the properties of the ImageNet dataset that are critical for learning good, general-purpose features? This work provides an empirical investigation of various facets of this question: Is more pre-training data always better? How does feature quality depend on the number of training examples per class? Does adding more object classes improve performance? For the same data budget, how should the data be split into classes? Is fine-grained recognition necessary for learning good features? Given the same number of training classes, is it better to have coarse classes or fine-grained classes? Which is better: more classes or more examples per class? To answer these and related questions, we pre-trained CNN features on various subsets of the ImageNet dataset and evaluated transfer performance on PASCAL detection, PASCAL action classification, and SUN scene classification tasks. Our overall findings suggest that most changes in the choice of pre-training data long thought to be critical do not significantly affect transfer performance.