Episodic Memory in Lifelong Language Learning
d'Autume, Cyprien de Masson, Ruder, Sebastian, Kong, Lingpeng, Yogatama, Dani
We introduce a lifelong language learning setup where a model needs to learn from a stream of text examples without any dataset identifier. We propose an episodic memory model that performs sparse experience replay and local adaptation to mitigate catastrophic forgetting in this setup. Experiments on text classification and question answering demonstrate the complementary benefits of sparse experience replay and local adaptation to allow the model to continuously learn from new datasets. We also show that the space complexity of the episodic memory module can be reduced significantly (50-90%) by randomly choosing which examples to store in memory with a minimal decrease in performance. We consider an episodic memory component as a crucial building block of general linguistic intelligence and see our model as a first step in that direction.
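As a concrete illustration of the two roles the memory plays, the following minimal sketch (not the paper's architecture: the text encoder is replaced by random vectors, and the capacity, write probability and neighbourhood size are arbitrary placeholders) implements random writes for space reduction, sparse replay sampling, and k-nearest-neighbour retrieval for local adaptation.

```python
import numpy as np

rng = np.random.default_rng(0)

class EpisodicMemory:
    """Key-value store of (encoded example, label) pairs with random writes."""
    def __init__(self, capacity, write_prob=0.5):
        self.keys, self.values = [], []
        self.capacity = capacity          # fixed memory budget
        self.write_prob = write_prob      # random subsampling reduces space use

    def write(self, key, value):
        # Randomly decide whether to store the example at all (space saving).
        if rng.random() > self.write_prob or len(self.keys) >= self.capacity:
            return
        self.keys.append(key)
        self.values.append(value)

    def sample(self, n):
        # Sparse experience replay: draw a small batch of stored examples.
        idx = rng.choice(len(self.keys), size=min(n, len(self.keys)), replace=False)
        return [(self.keys[i], self.values[i]) for i in idx]

    def neighbours(self, query, k=4):
        # Local adaptation: retrieve the k stored keys nearest to the query.
        keys = np.stack(self.keys)
        dist = np.linalg.norm(keys - query, axis=1)
        order = np.argsort(dist)[:k]
        return [(self.keys[i], self.values[i]) for i in order]

# Toy usage with random 8-d "encodings" standing in for text representations.
memory = EpisodicMemory(capacity=100)
for _ in range(200):
    memory.write(rng.normal(size=8), rng.integers(0, 3))

replay_batch = memory.sample(4)                             # mixed into ordinary training steps
local_batch = memory.neighbours(rng.normal(size=8), k=4)    # fine-tune on these at test time
print(len(memory.keys), len(replay_batch), len(local_batch))
```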
Joint quantile regression in vector-valued RKHSs
Sangnier, Maxime, Fercoq, Olivier, d'Alché-Buc, Florence
To give a more complete picture than the average relationship provided by standard regression, a novel framework for simultaneously estimating and predicting several conditional quantiles is introduced. The proposed methodology leverages kernel-based multi-task learning to curb the embarrassing phenomenon of quantile crossing, with a one-step estimation procedure and no post-processing. Moreover, this framework comes with theoretical guarantees and an efficient coordinate descent learning algorithm. Numerical experiments on benchmark and real datasets highlight the improvements our approach brings in prediction error, crossing occurrences and training time.
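For readers unfamiliar with quantile estimation, the minimal sketch below shows only the pinball (check) loss that each conditional quantile minimises; the joint estimation in a vector-valued RKHS and the non-crossing coupling are the paper's contribution and are summarised here only in a comment. The quantile levels and residuals are arbitrary toy values.

```python
import numpy as np

def pinball_loss(residual, tau):
    """Check loss for quantile level tau: tau*r if r >= 0, else (tau - 1)*r."""
    residual = np.asarray(residual, dtype=float)
    return np.where(residual >= 0, tau * residual, (tau - 1.0) * residual)

# The joint objective sums pinball losses over several levels and adds a
# multi-task (vector-valued kernel) penalty coupling the quantile functions,
# which discourages the predicted quantile curves from crossing.
taus = np.array([0.1, 0.5, 0.9])
residuals = np.array([-0.3, 0.0, 0.7])       # y - predicted quantile, one per level
print(pinball_loss(residuals, taus))         # per-level contributions to the objective
```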
A Structured Prediction Approach for Label Ranking
Korba, Anna, Garcia, Alexandre, d'Alché-Buc, Florence
We propose to solve a label ranking problem as a structured output regression task. In this view, we adopt a least-squares surrogate loss approach that solves a supervised learning problem in two steps: a regression step in a well-chosen feature space and a pre-image (or decoding) step. We use specific feature maps/embeddings for ranking data, which convert any ranking/permutation into a vector representation. These embeddings are all well-tailored for our approach, either by resulting in consistent estimators or by making the pre-image problem, often the bottleneck in structured prediction, trivial to solve. Their extension to the case of incomplete or partial rankings is also discussed. Finally, we provide empirical results on synthetic and real-world datasets showing the relevance of our method.
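The two-step recipe can be illustrated with a toy sketch: a regression step into a ranking embedding, followed by a pre-image (decoding) step back to a permutation. The embedding and decoder below are deliberately simplistic placeholders (the rank vector itself and an argsort-based decoder), not the specific ranking embeddings studied in the paper, and the data is synthetic.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

def embed(ranking):
    # Toy embedding: the rank vector itself; the paper studies richer
    # ranking embeddings (e.g. pairwise-comparison based) with the same recipe.
    return np.asarray(ranking, dtype=float)

def decode(vector):
    # Pre-image step: map a predicted vector back to a permutation.
    # Double argsort turns any real vector into a rank vector (Borda-like decoder).
    return np.argsort(np.argsort(vector))

# Synthetic data: 3 labels whose true ranking depends on the first three features.
X = rng.normal(size=(200, 5))
Y = np.stack([decode(X[i, :3]) for i in range(200)])          # ground-truth rankings

reg = Ridge(alpha=1.0).fit(X, np.stack([embed(y) for y in Y]))  # regression step
pred = np.array([decode(v) for v in reg.predict(X[:5])])        # decoding step
print(pred)
```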
Nonlinear Acceleration of Stochastic Algorithms
Scieur, Damien, Bach, Francis, d'Aspremont, Alexandre
Extrapolation methods use the last few iterates of an optimization algorithm to produce a better estimate of the optimum. They were shown to achieve optimal convergence rates in a deterministic setting using simple gradient iterates. Here, we study extrapolation methods in a stochastic setting, where the iterates are produced by either a simple or an accelerated stochastic gradient algorithm. We first derive convergence bounds for arbitrary, potentially biased perturbations, then produce asymptotic bounds using the ratio between the variance of the noise and the accuracy of the current point. Finally, we apply this acceleration technique to stochastic algorithms such as SGD, SAGA, SVRG and Katyusha in different settings, and show significant performance gains.
Integration Methods and Optimization Algorithms
Scieur, Damien, Roulet, Vincent, Bach, Francis, d'Aspremont, Alexandre
We show that accelerated optimization methods can be seen as particular instances of multi-step integration schemes from numerical analysis, applied to the gradient flow equation. Compared with recent advances in this vein, the differential equation considered here is the basic gradient flow, and we derive a class of multi-step schemes which includes accelerated algorithms, using classical conditions from numerical analysis. Multi-step schemes integrate the differential equation using larger step sizes, which intuitively explains the acceleration phenomenon.
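The link between discretisation and acceleration can be illustrated numerically. In the sketch below (an illustration, not the paper's derivation), forward Euler applied to the gradient flow of a quadratic recovers plain gradient descent, while a momentum-style two-step scheme uses the two previous iterates; the quadratic, step size and momentum coefficient are arbitrary choices.

```python
import numpy as np

# Gradient flow x'(t) = -grad f(x) for the quadratic f(x) = 0.5 * x^T A x.
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x
x0 = np.array([1.0, 1.0])

def euler(steps, h):
    # One-step (forward Euler) discretisation = plain gradient descent.
    x = x0.copy()
    for _ in range(steps):
        x = x - h * grad(x)
    return x

def two_step(steps, h, beta=0.8):
    # A two-step linear scheme built on the two previous iterates; momentum-style
    # methods belong to this family and tolerate effectively larger steps.
    x_prev, x = x0.copy(), x0 - h * grad(x0)
    for _ in range(steps - 1):
        x_next = x - h * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

# Distance to the optimum (the origin) after the same number of steps.
print(np.linalg.norm(euler(50, 0.05)), np.linalg.norm(two_step(50, 0.05)))
```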
Sharpness, Restart and Acceleration
Roulet, Vincent, d'Aspremont, Alexandre
The Łojasiewicz inequality shows that Hölderian error bounds on the minimum of convex optimization problems hold almost generically. Here, we clarify results of \citet{Nemi85}, who show that Hölderian error bounds directly control the performance of restart schemes. The constants quantifying these error bounds are of course unobservable, but we show that optimal restart strategies are robust, and that searching for the best scheme only increases the complexity by a logarithmic factor compared to the optimal bound. Overall then, restart schemes generically accelerate accelerated methods.
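A toy version of such a restart scheme is sketched below: a standard accelerated gradient loop, restarted every fixed number of iterations, with the restart period searched over a logarithmic grid. The objective, step size and grid are illustrative choices, not those analysed in the paper.

```python
import numpy as np

# Ill-conditioned quadratic; error bounds of this kind make restarts effective.
A = np.diag(np.linspace(0.01, 1.0, 10))
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

def accelerated(x0, iters, step=1.0):
    # Plain Nesterov-style accelerated gradient iterations.
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        x_new = y - step * grad(y)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x

def restarted(x0, total_iters, period):
    # Restart the accelerated method every `period` iterations from the current point.
    x = x0.copy()
    for _ in range(total_iters // period):
        x = accelerated(x, period)
    return x

x0 = np.ones(10)
# Searching the restart period over a logarithmic grid is what costs only an
# extra log factor relative to the (unobservable) optimal schedule.
for period in [10, 20, 40, 80, 160]:
    print(period, f(restarted(x0, 160, period)))
```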
Regularized Nonlinear Acceleration
Scieur, Damien, d'Aspremont, Alexandre, Bach, Francis
We describe a convergence acceleration technique for generic optimization problems. Our scheme computes estimates of the optimum from a nonlinear average of the iterates produced by any optimization method. The weights in this average are computed via a simple and small linear system, whose solution can be updated online. This acceleration scheme runs in parallel to the base algorithm, providing improved estimates of the solution on the fly, while the original optimization method is running. Numerical experiments are detailed on classical classification problems.
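A minimal sketch of the averaging step, as we read it from the abstract, is given below: the weights come from a small regularised linear system built on differences of consecutive iterates, and the extrapolated point is computed on the side while the base method (here plain gradient descent on a toy quadratic) keeps running. The window size and regularisation level are arbitrary.

```python
import numpy as np

def nonlinear_average(iterates, lam=1e-6):
    """Combine the last few iterates with weights from a small regularised linear system."""
    X = np.stack(iterates, axis=1)            # columns x_0, ..., x_k
    R = np.diff(X, axis=1)                    # residuals r_i = x_{i+1} - x_i
    RR = R.T @ R
    RR /= np.linalg.norm(RR)                  # rescale before regularising
    k = RR.shape[0]
    c = np.linalg.solve(RR + lam * np.eye(k), np.ones(k))
    c /= c.sum()                              # weights sum to one
    return X[:, :k] @ c                       # nonlinear average of the iterates

# Toy usage: accelerate plain gradient descent on an ill-conditioned quadratic.
A = np.diag(np.linspace(0.01, 1.0, 20))
x = np.ones(20)
history = [x.copy()]
for _ in range(10):
    x = x - 1.0 * (A @ x)                     # base method keeps running unchanged
    history.append(x.copy())

x_extra = nonlinear_average(history[-6:])     # improved estimate computed on the side
print(np.linalg.norm(x), np.linalg.norm(x_extra))
```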
SerialRank: Spectral Ranking using Seriation
Fogel, Fajwel, d'Aspremont, Alexandre, Vojnovic, Milan
We describe a seriation algorithm for ranking a set of n items given pairwise comparisons between these items. Intuitively, the algorithm assigns similar rankings to items that compare similarly with all others. It does so by constructing a similarity matrix from pairwise comparisons, using seriation methods to reorder this matrix and construct a ranking. We first show that this spectral seriation algorithm recovers the true ranking when all pairwise comparisons are observed and consistent with a total order. We then show that ranking reconstruction is still exact even when some pairwise comparisons are corrupted or missing, and that seriation-based spectral ranking is more robust to noise than other scoring methods. An additional benefit of the seriation formulation is that it allows us to solve semi-supervised ranking problems. Experiments on both synthetic and real datasets demonstrate that seriation-based spectral ranking achieves competitive, and in some cases superior, performance compared to classical ranking methods.
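A compact sketch of this pipeline, under simplifying assumptions (complete and consistent comparisons, and a simplified similarity formula), is given below: build a similarity matrix from the agreement between comparison patterns, then order items along the Fiedler vector of the associated graph Laplacian. The recovered order is only defined up to a global flip.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
scores = rng.normal(size=n)                       # latent item strengths (higher is better)

# Complete, consistent pairwise comparisons: C[i, j] = +1 if i beats j, -1 otherwise.
C = np.sign(scores[:, None] - scores[None, :])

# Items that compare similarly against everyone else get a high similarity
# (simplified agreement count; the paper gives the exact construction).
S = (n + C @ C.T) / 2.0

# Seriation step: order items by the Fiedler vector (second-smallest eigenvector)
# of the graph Laplacian of the similarity matrix.
L = np.diag(S.sum(axis=1)) - S
eigvals, eigvecs = np.linalg.eigh(L)
fiedler = eigvecs[:, 1]
ranking = np.argsort(fiedler)                     # recovered order, up to a global flip

print(ranking)
print(np.argsort(-scores))                        # order by true scores, for comparison
```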