

Supplement to "Maximum Average Randomly Sampled: A Scale Free and Non-parametric Algorithm for Stochastic Bandits"

Neural Information Processing Systems

The following lemma given in [2] is useful for the proof of Theorem 1. Lemma 1. [2] Given a stochastic matrix H = 0 0 0 h The following propositions are used to prove this theorem. In this case, there are not enough observations to achieve an upper confidence bound using Proposition 2. The randomized UCB for this case also has an exact confidence as illustrated below: Pr{UCB In the second equality, the boundedness of the means of the arms and Proposition 1 were utilized. The steps in this proof closely follow the proof of Theorem 7.1 in [3]. Let us define a 'good' event as G We are going to show 1. The next step is to bound the probability of the second set in (3).


LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning

Neural Information Processing Systems

This paper presents a new class of gradient methods for distributed machine learning that adaptively skip the gradient calculations to learn with reduced communication and computation. Simple rules are designed to detect slowly-varying gradients and, therefore, trigger the reuse of outdated gradients. The resultant gradient-based algorithms are termed Lazily Aggregated Gradient -- justifying our acronym LAG used henceforth. Theoretically, the merits of this contribution are: i) the convergence rate is the same as batch gradient descent in strongly-convex, convex, and nonconvex cases; and, ii) if the distributed datasets are heterogeneous (quantified by certain measurable constants), the communication rounds needed to achieve a targeted accuracy are reduced thanks to the adaptive reuse of lagged gradients. Numerical experiments on both synthetic and real data corroborate a significant communication reduction compared to alternatives.
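The idea of reusing lagged gradients can be illustrated with a small sketch. This is not the paper's algorithm: the trigger rule below (upload only when a worker's gradient has moved by more than `c` times the latest parameter step) is a simplified assumption made for illustration, and the function name `lag_sketch`, the scalar quadratic losses, and all parameter values are hypothetical.

```python
def lag_sketch(a, b, lr=0.5, c=0.15, rounds=200):
    """Minimise sum_m 0.5 * a[m] * (theta - b[m])**2 over a scalar theta.

    Each index m plays the role of one worker. The server keeps the last
    gradient each worker uploaded and reuses it when the worker skips.
    """
    theta, theta_prev = 0.0, 0.0
    stored = [None] * len(a)  # last uploaded gradient per worker
    uploads = 0               # number of worker-to-server communications
    for _ in range(rounds):
        step = abs(theta - theta_prev)
        for m in range(len(a)):
            fresh = a[m] * (theta - b[m])  # worker m's current gradient
            # Assumed trigger (not the paper's rule): upload only if the
            # gradient changed by more than c times the latest parameter step.
            if stored[m] is None or abs(fresh - stored[m]) > c * step:
                stored[m] = fresh
                uploads += 1
        # Server update using a mix of fresh and lagged gradients.
        theta_prev, theta = theta, theta - lr * sum(stored)
    return theta, uploads


a = [1.0, 0.1, 0.05]   # per-worker curvature (heterogeneous on purpose)
b = [1.0, -2.0, 3.0]   # per-worker minimisers
theta, uploads = lag_sketch(a, b)
print(theta, uploads, len(a) * 200)
```

In this toy setting the workers with small curvature change their gradients slowly, so they are the ones that skip uploads, which loosely mirrors the abstract's point that heterogeneity across workers is what makes lazy aggregation pay off.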




Mean-field theory of graph neural networks in graph partitioning

Neural Information Processing Systems

A theoretical performance analysis of the graph neural network (GNN) is presented. For classification tasks, the neural network approach has the advantage in terms of flexibility that it can be employed in a data-driven manner, whereas Bayesian inference requires the assumption of a specific model. A fundamental question is then whether GNN achieves high accuracy in addition to this flexibility. Moreover, whether the achieved performance is predominantly a result of the backpropagation or the architecture itself is a matter of considerable interest. To gain a better insight into these questions, a mean-field theory of a minimal GNN architecture is developed for the graph partitioning problem. This demonstrates a good agreement with numerical experiments.



AI bubble fears return as Wall Street falls back from short-lived rally

The Guardian

Fears of a growing bubble around the artificial intelligence frenzy resurfaced on Thursday as leading US stock markets fell, less than 24 hours after strong results from chipmaker Nvidia sparked a rally. Wall Street initially rose after Nvidia, the world's largest public company, reassured investors of strong demand for its advanced data center chips. But the relief dissipated, and technology stocks at the heart of the AI boom came under pressure. The benchmark S&P 500 closed down 1.6%, and the Dow Jones industrial average closed down 0.8% in New York. The tech-focused Nasdaq Composite closed down 2.2%.