AITopics

2105.14095

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Neural Information Processing SystemsFeb-15-2020, 04:27:56 GMT

Learning Bounds for Domain Adaptation

Blitzer, John, Crammer, Koby, Kulesza, Alex, Pereira, Fernando, Wortman, Jennifer

Empirical risk minimization offers well-known learning guarantees when training and test data come from the same domain. In the real world, though, we often wish to adapt a classifier from a source domain with a large amount of training data to different target domain with very little training data. In this work we give uniform convergence bounds for algorithms that minimize a convex combination of source and target empirical risk. The bounds explicitly model the inherent trade-off between training on a large but inaccurate source data set and a small but accurate target training set. Our theory also gives results when we have multiple source domains, each of which may have a different number of instances, and we exhibit cases in which minimizing a non-uniform combination of source risks can achieve much lower target error than standard empirical risk minimization.

artificial intelligence, domain adaptation, machine learning, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine LearningJun-13-2019

Variance Estimation For Online Regression via Spectrum Thresholding

Kozdoba, Mark, Moroshko, Edward, Mannor, Shie, Crammer, Koby

We consider the online linear regression problem, where the predictor vector may vary with time. This problem can be modelled as a linear dynamical system, where the parameters that need to be learned are the variance of both the process noise and the observation noise. The classical approach to learning the variance is via the maximum likelihood estimator -- a non-convex optimization problem prone to local minima and with no finite sample complexity bounds. In this paper we study the global system operator: the operator that maps the noises vectors to the output. In particular, we obtain estimates on its spectrum, and as a result derive the first known variance estimators with sample complexity guarantees for online regression problems. We demonstrate the approach on a number of synthetic and real-world benchmarks.

artificial intelligence, machine learning, spectrum, (16 more...)

1906.05591

Country: North America (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Neural Information Processing SystemsDec-31-2018

Efficient Loss-Based Decoding on Graphs for Extreme Classification

Evron, Itay, Moroshko, Edward, Crammer, Koby

In extreme classification problems, learning algorithms are required to map instances to labels from an extremely large label set. We build on a recent extreme classification framework with logarithmic time and space (LTLS), and on a general approach for error correcting output coding (ECOC) with loss-based decoding, and introduce a flexible and efficient approach accompanied by theoretical bounds. Our framework employs output codes induced by graphs, for which we show how to perform efficient loss-based decoding to potentially improve accuracy. In addition, our framework offers a tradeoff between accuracy, model size and prediction time. We show how to find the sweet spot of this tradeoff using only the training data. Our experimental study demonstrates the validity of our assumptions and claims, and shows that our method is competitive with state-of-the-art algorithms.

accuracy, artificial intelligence, machine learning, (17 more...)

Country:

North America > Canada > Quebec (0.14)
Oceania > Australia > New South Wales > Sydney (0.14)
North America > United States > Virginia (0.14)

Genre:

Research Report > Experimental Study (0.66)
Research Report > New Finding (0.48)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Neural Information Processing SystemsDec-31-2018

Efficient Loss-Based Decoding on Graphs for Extreme Classification

Evron, Itay, Moroshko, Edward, Crammer, Koby

In extreme classification problems, learning algorithms are required to map instances tolabels from an extremely large label set. We build on a recent extreme classification framework with logarithmic time and space [19], and on a general approach for error correcting output coding (ECOC) with loss-based decoding [1], and introduce a flexible and efficient approach accompanied by theoretical bounds. Our framework employs output codes induced by graphs, for which we show how to perform efficient loss-based decoding to potentially improve accuracy. In addition, ourframework offers a tradeoff between accuracy, model size and prediction time. We show how to find the sweet spot of this tradeoff using only the training data. Our experimental study demonstrates the validity of our assumptions and claims, and shows that our method is competitive with state-of-the-art algorithms.

artificial intelligence, graph, machine learning, (19 more...)

Country:

North America > Canada > Quebec (0.14)
Oceania > Australia > New South Wales > Sydney (0.14)
North America > United States > Virginia (0.14)

Genre:

Research Report > Experimental Study (0.66)
Research Report > New Finding (0.48)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

arXiv.org Machine LearningDec-17-2018

Multi Instance Learning For Unbalanced Data

Kozdoba, Mark, Moroshko, Edward, Shani, Lior, Takagi, Takuya, Katoh, Takashi, Mannor, Shie, Crammer, Koby

In the context of Multi Instance Learning, we analyze the Single Instance (SI) learning objective. We show that when the data is unbalanced and the family of classifiers is sufficiently rich, the SI method is a useful learning algorithm. In particular, we show that larger data imbalance, a quality that is typically perceived as negative, in fact implies a better resilience of the algorithm to the statistical dependencies of the objects in bags. In addition, our results shed new light on some known issues with the SI method in the setting of linear classifiers, and we show that these issues are significantly less likely to occur in the setting of neural networks. We demonstrate our results on a synthetic dataset, and on the COCO dataset for the problem of patch classification with weak image level labels derived from captions.

classifier, neural network, survey article, (21 more...)

1812.0701

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.35)

arXiv.org Machine LearningMar-28-2018

A Better Resource Allocation Algorithm with Semi-Bandit Feedback

Dagan, Yuval, Crammer, Koby

We study a sequential resource allocation problem between a fixed number of arms. On each iteration the algorithm distributes a resource among the arms in order to maximize the expected success rate. Allocating more of the resource to a given arm increases the probability that it succeeds, yet with a cut-off. We follow Lattimore et al. (2014) and assume that the probability increases linearly until it equals one, after which allocating more of the resource is wasteful. These cut-off values are fixed and unknown to the learner. We present an algorithm for this problem and prove a regret upper bound of $O(\log n)$ improving over the best known bound of $O(\log^2 n)$. Lower bounds we prove show that our upper bound is tight. Simulations demonstrate the superiority of our algorithm.

algorithm, artificial intelligence, machine learning, (17 more...)

1803.10415

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.67)

arXiv.org Machine LearningMar-8-2018

Efficient Loss-Based Decoding On Graphs For Extreme Classification

Evron, Itay, Moroshko, Edward, Crammer, Koby

In extreme classification problems, learning algorithms are required to map instances to labels from an extremely large label set. We build on a recent extreme classification framework with logarithmic time and space, and on a general approach for error correcting output coding (ECOC), and introduce a flexible and efficient approach accompanied by bounds. Our framework employs output codes induced by graphs, and offers a tradeoff between accuracy and model size. We show how to find the sweet spot of this tradeoff using only the training data. Our experimental study demonstrates the validity of our assumptions and claims, and shows the superiority of our method compared with state-of-the-art algorithms.

artificial intelligence, graph, machine learning, (15 more...)

1803.03319

Country:

Asia > Middle East > Israel (0.14)
North America > United States (0.14)

Genre:

Research Report > Experimental Study (0.66)
Research Report > New Finding (0.48)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Neural Information Processing SystemsDec-31-2017

Rotting Bandits

Levine, Nir, Crammer, Koby, Mannor, Shie

The Multi-Armed Bandits (MAB) framework highlights the trade-off between acquiring new knowledge (Exploration) and leveraging available knowledge (Exploitation). In the classical MAB problem, a decision maker must choose an arm at each time step, upon which she receives a reward. The decision maker's objective is to maximize her cumulative expected reward over the time horizon. The MAB problem has been studied extensively, specifically under the assumption of the arms' rewards distributions being stationary, or quasi-stationary, over time. We consider a variant of the MAB framework, which we termed Rotting Bandits, where each arm's expected reward decays as a function of the number of times it has been pulled. We are motivated by many real-world scenarios such as online advertising, content recommendation, crowdsourcing, and more. We present algorithms, accompanied by simulations, and derive theoretical guarantees.

algorithm, big data, health & medicine, (19 more...)

Country:

Asia > Middle East > Israel (0.14)
North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Industry: Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.69)
Information Technology > Data Science > Data Mining > Big Data (0.50)

arXiv.org Machine LearningNov-2-2017

Rotting Bandits

Levine, Nir, Crammer, Koby, Mannor, Shie

The Multi-Armed Bandits (MAB) framework highlights the tension between acquiring new knowledge (Exploration) and leveraging available knowledge (Exploitation). In the classical MAB problem, a decision maker must choose an arm at each time step, upon which she receives a reward. The decision maker's objective is to maximize her cumulative expected reward over the time horizon. The MAB problem has been studied extensively, specifically under the assumption of the arms' rewards distributions being stationary, or quasi-stationary, over time. We consider a variant of the MAB framework, which we termed Rotting Bandits, where each arm's expected reward decays as a function of the number of times it has been pulled. We are motivated by many real-world scenarios such as online advertising, content recommendation, crowdsourcing, and more. We present algorithms, accompanied by simulations, and derive theoretical guarantees.

algorithm, artificial intelligence, big data, (17 more...)

1702.07274

Country:

Asia > Middle East > Israel (0.14)
North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)