Goto

Collaborating Authors

 Inductive Learning


How do machine learning professionals use structured prediction?

#artificialintelligence

Justin Stoltzfus is a freelance writer for various Web and print publications. His work has appeared in online magazines including Preservation Online, a project of the National Historic Trust, and many other venues.


Introducing Neural Structured Learning in TensorFlow

#artificialintelligence

We are excited to introduce Neural Structured Learning in TensorFlow, an easy-to-use framework that both novice and advanced developers can use for training neural networks with structured signals. Neural Structured Learning (NSL) can be applied to construct accurate and robust models for vision, language understanding, and prediction in general. Many machine learning tasks benefit from using structured data which contains rich relational information among the samples. For example, modeling citation networks, Knowledge Graph inference and reasoning on linguistic structure of sentences, and learning molecular fingerprints all require a model to learn from structured inputs, as opposed to just individual samples. These structures can be explicitly given (e.g., as a graph), or implicitly inferred (e.g., as an adversarial example).


A beginner's guide to supervised learning with Python

#artificialintelligence

Why is artificial intelligence (AI) and machine learning (ML) so important? Anyone who doesn't understand this will soon be left behind. There are many kinds of implementations and techniques that carry out AI and ML to solve real-time problems, and supervised learning is one of the most used approaches. "The key to artificial intelligence has always been the representation." In supervised learning, we start by importing a dataset containing training attributes and the target attributes.


Distributionally Robust Language Modeling

arXiv.org Machine Learning

Language models are generally trained on data spanning a wide range of topics (e.g., news, reviews, fiction), but they might be applied to an a priori unknown target distribution (e.g., restaurant reviews). In this paper, we first show that training on text outside the test distribution can degrade test performance when using standard maximum likelihood (MLE) training. To remedy this without the knowledge of the test distribution, we propose an approach which trains a model that performs well over a wide range of potential test distributions. In particular, we derive a new distributionally robust optimization (DRO) procedure which minimizes the loss of the model over the worst-case mixture of topics with sufficient overlap with the training distribution. Our approach, called topic conditional value at risk (topic CVaR), obtains a 5.5 point perplexity reduction over MLE when the language models are trained on a mixture of Yelp reviews and news and tested only on reviews.


Google launches TensorFlow machine learning framework for graphical data

#artificialintelligence

Google today introduced Neural Structured Learning (NSL), an open source framework that uses the Neural Graph Learning method for training neural networks with graphs and structured data. NSL works with with the TensorFlow machine learning platform and is made to work for both experienced and inexperienced machine learning practitioners. NSL can make models for computer vision, perform NLP, and run predictions from graphical datasets like medical records or knowledge graphs. "Leveraging structured signals during training allows developers to achieve higher model accuracy, particularly when the amount of labeled data is relatively small," TensorFlow engineers said in a blog post today. "Training with structured signals also leads to more robust models. These techniques have been widely used in Google for improving model performance, such as learning image semantic embedding."


To Detect Irregular Trade Behaviors In Stock Market By Using Graph Based Ranking Methods

arXiv.org Machine Learning

To detect the irregular trade behaviors in the stock market is the important problem in machine learning field. These irregular trade behaviors are obviously illegal. To detect these irregular trade behaviors in the stock market, data scientists normally employ the supervised learning techniques. In this paper, we employ the three graph Laplacian based semi-supervised ranking methods to solve the irregular trade behavior detection problem. Experimental results show that that the un-normalized and symmetric normalized graph Laplacian based semi-supervised ranking methods outperform the random walk Laplacian based semi-supervised ranking method.


Dual Student: Breaking the Limits of the Teacher in Semi-supervised Learning

arXiv.org Machine Learning

Recently, consistency-based methods have achieved state-of-the-art results in semi-supervised learning (SSL). These methods always involve two roles, an explicit or implicit teacher model and a student model, and penalize predictions under different perturbations by a consistency constraint. However, the weights of these two roles are tightly coupled since the teacher is essentially an exponential moving average (EMA) of the student. In this work, we show that the coupled EMA teacher causes a performance bottleneck. To address this problem, we introduce Dual Student, which replaces the teacher with another student. We also define a novel concept, stable sample, following which a stabilization constraint is designed for our structure to be trainable. Further, we discuss two variants of our method, which produce even higher performance. Extensive experiments show that our method improves the classification performance significantly on several main SSL benchmarks. Specifically, it reduces the error rate of the 13-layer CNN from 16.84% to 12.39% on CIFAR-10 with 1k labels and from 34.10% to 31.56% on CIFAR-100 with 10k labels. In addition, our method also achieves a clear improvement in domain adaptation.


Understanding Supervised Learning In One Article

#artificialintelligence

As you might know, supervised machine learning is one of the most commonly used and successful types of machine learning. In this article, we will describe supervised learning in more detail and explain several popular supervised learning algorithms. Remember that supervised learning is used whenever we want to predict a certain outcome from a given input, and we have examples of input/output pairs. We build a machine learning model from these input/output pairs, which comprise our training set. Our goal is to make accurate predictions for new, never-before-seen data. Supervised learning often requires human effort to build the training set, but afterwards automates and often speeds up an otherwise laborious or infeasible task. There are two major types of supervised machine learning problems, called classification and regression. In classification, the goal is to predict a class label, which is a choice from a predefined list of possibilities.


An Adjusted Nearest Neighbor Algorithm Maximizing the F-Measure from Imbalanced Data

arXiv.org Machine Learning

In this paper, we address the challenging problem of learning from imbalanced data using a Nearest-Neighbor (NN) algorithm. In this setting, the minority examples typically belong to the class of interest requiring the optimization of specific criteria, like the F-Measure. Based on simple geometrical ideas, we introduce an algorithm that reweights the distance between a query sample and any positive training example. This leads to a modification of the Voronoi regions and thus of the decision boundaries of the NN algorithm. We provide a theoretical justification about the weighting scheme needed to reduce the False Negative rate while controlling the number of False Positives. We perform an extensive experimental study on many public imbalanced datasets, but also on large scale non public data from the French Ministry of Economy and Finance on a tax fraud detection task, showing that our method is very effective and, interestingly, yields the best performance when combined with state of the art sampling methods.


Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT

#artificialintelligence

At Hugging Face, we experienced first-hand the growing popularity of these models as our NLP library -- which encapsulates most of them -- got installed more than 400,000 times in just a few months. However, as these models were reaching a larger NLP community, an important and challenging question started to emerge. How should we put these monsters in production? How can we use such large models under low latency constraints? Do we need (costly) GPU servers to serve at scale?