AITopics | Gradient Descent

Collaborating Authors

Gradient Descent

News Overviews Instructional Materials AI-Alerts Classics

Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning

Islam, Riashat, Seraj, Raihan, Arnob, Samin Yeasar, Precup, Doina

arXiv.org Machine LearningDec-10-2019

We study the problem of off-policy critic evaluation in several variants of value-based off-policy actor-critic algorithms. Off-policy actor-critic algorithms require an off-policy critic evaluation step, to estimate the value of the new policy after every policy gradient update. Despite enormous success of off-policy policy gradients on control tasks, existing general methods suffer from high variance and instability, partly because the policy improvement depends on gradient of the estimated value function. In this work, we present a new way of off-policy policy evaluation in actor-critic, based on the doubly robust estimators. We extend the doubly robust estimator from off-policy policy evaluation (OPE) to actor-critic algorithms that consist of a reward estimator performance model. We find that doubly robust estimation of the critic can significantly improve performance in continuous control tasks. Furthermore, in cases where the reward function is stochastic that can lead to high variance, doubly robust critic estimation can improve performance under corrupted, stochastic reward signals, indicating its usefulness for robust and safe reinforcement learning.

algorithm, estimator, evaluation, (12 more...)

arXiv.org Machine Learning

1912.05109

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > Canada > Quebec > Montreal (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(7 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

Add feedback

Statistically Robust Neural Network Classification

Wang, Benjie, Webb, Stefan, Rainforth, Tom

arXiv.org Machine LearningDec-10-2019

Recently there has been much interest in quantifying the robustness of neural network classifiers through adversarial risk metrics. However, for problems where test-time corruptions occur in a probabilistic manner, rather than being generated by an explicit adversary, adversarial metrics typically do not provide an accurate or reliable indicator of robustness. To address this, we introduce a statistically robust risk (SRR) framework which measures robustness in expectation over both network inputs and a corruption distribution. Unlike many adversarial risk metrics, which typically require separate applications on a point-by-point basis, the SRR can easily be directly estimated for an entire network and used as a training objective in a stochastic gradient scheme. Furthermore, we show both theoretically and empirically that it can scale to higher-dimensional networks by providing superior generalization performance compared with comparable adversarial risks.

neural network, proceedings, robustness, (16 more...)

arXiv.org Machine Learning

1912.04884

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Analysis of the Optimization Landscapes for Overcomplete Representation Learning

Qu, Qing, Zhai, Yuexiang, Li, Xiao, Zhang, Yuqian, Zhu, Zhihui

arXiv.org Machine LearningDec-10-2019

We study nonconvex optimization landscapes for learning overcomplete representations, including learning (i) sparsely used overcomplete dictionaries and (ii) convolutional dictionaries, where these unsupervised learning problems find many applications in high-dimensional data analysis. Despite the empirical success of simple nonconvex algorithms, theoretical justifications of why these methods work so well are far from satisfactory. In this work, we show these problems can be formulated as $\ell^4$-norm optimization problems with spherical constraint, and study the geometric properties of their nonconvex optimization landscapes. For both problems, we show the nonconvex objectives have benign (global) geometric structures, in the sense that every local minimizer is close to one of the target solutions and every saddle point exhibits negative curvature. This discovery enables the development of guaranteed global optimization methods using simple initializations. For both problems, we show the nonconvex objectives have benign geometric structures -- every local minimizer is close to one of the target solutions and every saddle point exhibits negative curvature -- either in the entire space or within a sufficiently large region. This discovery ensures local search algorithms (such as Riemannian gradient descent) with simple initializations approximately find the target solutions. Finally, numerical experiments justify our theoretical discoveries.

equation, pa 0, probability, (13 more...)

arXiv.org Machine Learning

1912.02427

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Hong Kong (0.04)
North America > United States > New York (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback

How to implement Gradient Descent in Python

#artificialintelligenceDec-9-2019, 20:47:12 GMT

We will try to build a single neuron network, which can predict the admissions of a graduate school. The data we will use is shared above in google drive. The first 5 rows of data are shown below. The first column admit indicates whether the student is getting admitted to the school or not, this will be the target for our model; the second column gre and the third column gpa are numerical features for the student; the fourth column rank is a categorical feature. We will apply one-hot encoding to the categorical feature to add dummy columns.

cross-entropy loss function, implement gradient descent, loss function, (6 more...)

#artificialintelligence

Industry: Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)

Add feedback

Learn Electronic Health Records by Fully Decentralized Federated Learning

Lu, Songtao, Zhang, Yawen, Wang, Yunlong, Mack, Christina

arXiv.org Machine LearningDec-9-2019

Federated learning opens a number of research opportunities due to its high communication efficiency in distributed training problems within a star network. In this paper, we focus on improving the communication efficiency for fully decentralized federated learning over a graph, where the algorithm performs local updates for several iterations and then enables communications among the nodes. In such a way, the communication rounds of exchanging the common interest of parameters can be saved significantly without loss of optimality of the solutions. Multiple numerical simulations based on large, real-world electronic health record databases showcase the superiority of the decentralized federated learning compared with classic methods.

algorithm, communication round, hospital, (12 more...)

arXiv.org Machine Learning

1912.01792

Country:

North America > United States > Minnesota (0.04)
North America > Canada (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.32)

Add feedback

Multi-Gradient Descent for Multi-Objective Recommender Systems

Milojkovic, Nikola, Antognini, Diego, Bergamin, Giancarlo, Faltings, Boi, Musat, Claudiu

arXiv.org Artificial IntelligenceDec-9-2019

Recommender systems need to mirror the complexity of the environment they are applied in. The more we know about what might benefit the user, the more objectives the recommender system has. In addition there may be multiple stakeholders - sellers, buyers, shareholders - in addition to legal and ethical constraints. Simultaneously optimizing for a multitude of objectives, correlated and not correlated, having the same scale or not, has proven difficult so far. We introduce a stochastic multi-gradient descent approach to recommender systems (MGDRec) to solve this problem. We show that this exceeds state-of-the-art methods in traditional objective mixtures, like revenue and recall. Not only that, but through gradient normalization we can combine fundamentally different objectives, having diverse scales, into a single coherent framework. We show that uncorrelated objectives, like the proportion of quality products, can be improved alongside accuracy. Through the use of stochasticity, we avoid the pitfalls of calculating full gradients and provide a clear setting for its applicability.

objective, recommendation, recommender system, (13 more...)

arXiv.org Artificial Intelligence

2001.00846

Country:

Europe > Switzerland (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.66)

Industry: Media > Film (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.72)

Add feedback

Winning the Lottery with Continuous Sparsification

Savarese, Pedro, Silva, Hugo, Maire, Michael

arXiv.org Machine LearningDec-9-2019

The Lottery Ticket Hypothesis from Frankle & Carbin (2019) conjectures that, for typically-sized neural networks, it is possible to find small sub-networks which train faster and yield superior performance than their original counterparts. The proposed algorithm to search for "winning tickets", Iterative Magnitude Pruning, consistently finds sub-networks with $90-95\%$ less parameters which train faster and better than the overparameterized models they were extracted from, creating potential applications to problems such as transfer learning. In this paper, we propose Continuous Sparsification, a new algorithm to search for winning tickets which continuously removes parameters from a network during training, and learns the sub-network's structure with gradient-based methods instead of relying on pruning strategies. We show empirically that our method is capable of finding tickets that outperforms the ones learned by Iterative Magnitude Pruning, and at the same time providing faster search, when measured in number of training epochs or wall-clock time.

iteration, sparsity, ticket, (12 more...)

arXiv.org Machine Learning

1912.04427

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Genre:

Research Report (0.82)
Contests & Prizes (0.79)

Industry: Leisure & Entertainment (0.79)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Add feedback

Optimizing Rank-based Metrics with Blackbox Differentiation

Rolínek, Michal, Musil, Vít, Paulus, Anselm, Vlastelica, Marin, Michaelis, Claudio, Martius, Georg

arXiv.org Machine LearningDec-7-2019

Rank-based metrics are some of the most widely used criteria for performance evaluation of computer vision models. Despite years of effort, direct optimization for these metrics remains a challenge due to their non-differentiable and non-decomposable nature. We present an efficient, theoretically sound, and general method for differentiating rank-based metrics with mini-batch gradient descent. In addition, we address optimization instability and sparsity of the supervision signal that both arise from using rank-based metrics as optimization targets. Resulting losses based on recall and Average Precision are applied to image retrieval and object detection tasks. We obtain performance that is competitive with state-of-the-art on standard image retrieval datasets and consistently improve performance of near state-of-the-art object detectors.

computer vision, computer vision and pattern recognition, ieee conference, (13 more...)

arXiv.org Machine Learning

1912.035

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California (0.04)
Europe > Italy (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback

A Hybrid Approach Towards Two Stage Bengali Question Classification Utilizing Smart Data Balancing Technique

Rahman, Md. Hasibur, Rahman, Chowdhury Rafeed, Amin, Ruhul, Sifat, Md. Habibur Rahman, Anika, Afra

arXiv.org Machine LearningDec-7-2019

Question classification (QC) is the primary step of the Question Answering (QA) system. Question Classification (QC) system classifies the questions in particular classes so that Question Answering (QA) System can provide correct answers for the questions. Our system categorizes the factoid type questions asked in natural language after extracting features of the questions. We present a two stage QC system for Bengali. It utilizes one dimensional convolutional neural network for classifying questions into coarse classes in the first stage. Word2vec representation of existing words of the question corpus have been constructed and used for assisting 1D CNN. A smart data balancing technique has been employed for giving data hungry convolutional neural network the advantage of a greater number of effective samples to learn from. For each coarse class, a separate Stochastic Gradient Descent (SGD) based classifier has been used in order to differentiate among the finer classes within that coarse class. TF-IDF representation of each word has been used as feature for the SGD classifiers implemented as part of second stage classification. Experiments show the effectiveness of our proposed method for Bengali question classification.

classification, coarse class, question classification, (14 more...)

arXiv.org Machine Learning

1912.00127

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States (0.04)
Asia > China (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)

Add feedback

A Quasi-Newton Method Based Vertical Federated Learning Framework for Logistic Regression

#artificialintelligenceDec-5-2019, 04:10:45 GMT

Data privacy and security becomes a major concern in building machine learning models from different data providers. Federated learning shows promise by leaving data at providers locally and exchanging encrypted information. This paper studies the vertical federated learning structure for logistic regression where the data sets at two parties have the same sample IDs but own disjoint subsets of features. Existing frameworks adopt the first-order stochastic gradient descent algorithm, which requires large number of communication rounds. To address the communication challenge, we propose a quasi-Newton method based vertical federated learning framework for logistic regression under the additively homomorphic encryption scheme.

logistic regression, quasi-newton method, vertical federated learning framework, (1 more...)

#artificialintelligence

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.67)

Add feedback