

Appendix to " Adam with Bandit Sampling for Deep Learning "

Neural Information Processing Systems

According to Theorem 4.1 in [1], the convergence rate of Adam is as stated there. We prove Lemma 1 using the framework of online learning with bandit feedback. We then consider a special case, for which the result follows simply by plugging Lemma 3 into Theorem 2. In the main paper, we compared our method with Adam and Adam with importance sampling, and showed plots of loss value vs. wall-clock time. Here, we include some plots of error rate vs. wall-clock time.


Adam with Bandit Sampling for Deep Learning

Neural Information Processing Systems

Adam is a widely used optimization method for training deep learning models. It computes individual adaptive learning rates for different parameters. In this paper, we propose a generalization of Adam, called Adambs, that allows us to also adapt to different training examples based on their importance in the model's convergence. To achieve this, we maintain a distribution over all examples, selecting a mini-batch in each iteration by sampling according to this distribution, which we update using a multi-armed bandit algorithm. This ensures that examples that are more beneficial to the model training are sampled with higher probabilities.
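The sampling mechanism described above can be sketched in a few lines. The snippet below is a hypothetical illustration, not the authors' implementation: it assumes an EXP3-style multi-armed bandit update (uniform exploration mixed into a softmax over per-example weights, with importance-weighted reward estimates), and treats the per-example loss as the "reward" signal; all names and constants are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n, batch_size, eta = 1000, 32, 0.01   # n examples; eta: bandit step size (assumed)

# One weight per training example; log-space for numerical stability.
log_w = np.zeros(n)

def sampling_distribution(log_w, gamma=0.1):
    """Softmax over example weights, mixed with uniform exploration (EXP3-style)."""
    w = np.exp(log_w - log_w.max())   # subtract max to avoid overflow
    p = w / w.sum()
    return (1 - gamma) * p + gamma / n

def bandit_step(losses_fn):
    """Sample a mini-batch by the current distribution, then update weights."""
    p = sampling_distribution(log_w)
    batch = rng.choice(n, size=batch_size, replace=False, p=p)
    rewards = losses_fn(batch)        # e.g. per-example loss as importance signal
    # Importance-weighted (unbiased) reward estimate, as in EXP3.
    log_w[batch] += eta * rewards / (p[batch] * n)
    return batch, p
```

Examples that repeatedly yield high rewards accumulate weight and are sampled more often in later iterations, which is the behavior the abstract describes.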



Review for NeurIPS paper: Adam with Bandit Sampling for Deep Learning

Neural Information Processing Systems

Additional Feedback: This work seems to propose an approach for sampling mini-batches that can perhaps be applied to procedures other than ADAM. Therefore, apart from ADAM, was this approach (or suitable variants) explored, perhaps empirically, for other optimization procedures that involve mini-batches? It can also be used to produce desired mini-batches for better training. How does this approach compare to the state of the art in curriculum learning? In Algorithm 2, Line 2, what is L?


Review for NeurIPS paper: Adam with Bandit Sampling for Deep Learning

Neural Information Processing Systems

The authors propose a method for adaptive selection of data points for SGD. Specifically, they extend the ADAM method to an adaptive sampling setting using a multi-armed bandit. The proposed method is further analyzed, and the improvement in convergence speed is quantified. Extensive empirical results also support the proposed method. All reviewers unanimously recommend acceptance.




Reviews: Coordinate Descent with Bandit Sampling

Neural Information Processing Systems

The paper introduces a coordinate descent algorithm with adaptive sampling à la Gauss-Southwell. Based on a descent lemma that quantifies the decrease of the objective function when a coordinate is selected, the authors propose the "max_r" strategy, which iteratively chooses the coordinate that yields the largest decrease. The paper follows recent developments on coordinate descent, notably (Csiba et al., 2015), (Nutini et al., 2015), and (Perekrestenko et al., 2017), with improved convergence bounds. As with previous adaptive sampling schemes, the proposed method requires a computational complexity equivalent to full gradient descent, which can be prohibitive in large-scale optimization problems. To overcome this issue, the authors propose to learn the best coordinate by approximating the "max_r" strategy.
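The exact "max_r" selection rule the review describes can be sketched on a simple quadratic objective. This is an illustrative sketch, not the paper's algorithm: it assumes the guaranteed decrease for coordinate i is r_i = g_i^2 / (2 L_i) with L_i the coordinate-wise curvature (a Gauss-Southwell-Lipschitz rule), and all names are invented.

```python
import numpy as np

def max_r_coordinate_descent(A, b, iters=100):
    """Greedy coordinate descent on f(x) = 0.5 x^T A x - b^T x, A symmetric PD.

    Each step picks the coordinate whose exact update yields the largest
    guaranteed decrease r_i = g_i^2 / (2 L_i), where L_i = A[i, i].
    """
    n = A.shape[0]
    x = np.zeros(n)
    L = np.diag(A)                       # per-coordinate curvature (Lipschitz constants)
    for _ in range(iters):
        g = A @ x - b                    # full gradient: the costly part the paper avoids
        i = np.argmax(g ** 2 / (2 * L))  # "max_r": largest guaranteed decrease
        x[i] -= g[i] / L[i]              # exact minimization along coordinate i
    return x
```

Computing the full gradient at every step is exactly the cost the review flags as prohibitive at scale, which motivates the paper's learned approximation of the rule.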


Google & J.P. Morgan Propose Advanced Bandit Sampling for Multiplex Networks

#artificialintelligence

Graph neural networks (GNNs) have gained popularity in the AI research community due to their impressive performance in high-impact applications such as drug discovery and social network analysis. Most existing studies on GNNs, however, have focused on "monoplex" settings (networks with only a single type of connection between entities) rather than multiplex settings (multiple types of connections between entities), which reflect many real-world scenarios. In the new paper Bandit Sampling for Multiplex Networks, a team from Google Research and J.P. Morgan AI Research explores the problem of computationally efficient link prediction in the multiplex setting, introducing an algorithm for scalable learning on multiplex networks with a large number of layers. In evaluations, the proposed method is shown to improve efficiency over prior work such as Multiplex Network Embedding (MNE, Zhang et al., 2018) and the DEEPLEX layer-sampling approach (Potluru et al., 2020). A multiplex network can be viewed as a graph with many layers, where each layer's nodes have neighbours in other layers.


Adam with Bandit Sampling for Deep Learning

Liu, Rui, Wu, Tianyi, Mozafari, Barzan

arXiv.org Machine Learning

Adam is a widely used optimization method for training deep learning models. It computes individual adaptive learning rates for different parameters. In this paper, we propose a generalization of Adam, called Adambs, that allows us to also adapt to different training examples based on their importance in the model's convergence. To achieve this, we maintain a distribution over all examples, selecting a mini-batch in each iteration by sampling according to this distribution, which we update using a multi-armed bandit algorithm. This ensures that examples that are more beneficial to the model training are sampled with higher probabilities. We theoretically show that Adambs improves the convergence rate of Adam---$O(\sqrt{\frac{\log n}{T} })$ instead of $O(\sqrt{\frac{n}{T}})$ in some cases. Experiments on various models and datasets demonstrate Adambs's fast convergence in practice.
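To make the stated rate improvement concrete, the snippet below compares the two bounds numerically, ignoring constants. The dataset size n = 60,000 and step count T = 10,000 are illustrative values chosen for this sketch, not figures from the paper.

```python
import math

# Illustrative comparison of the stated convergence bounds (constants ignored)
# for a hypothetical run: n = 60,000 examples, T = 10,000 iterations.
n, T = 60_000, 10_000

adam_bound = math.sqrt(n / T)              # Adam:   O(sqrt(n / T))
adambs_bound = math.sqrt(math.log(n) / T)  # Adambs: O(sqrt(log n / T))

print(f"Adam   bound ~ {adam_bound:.4f}")
print(f"Adambs bound ~ {adambs_bound:.4f}")
print(f"ratio  ~ {adam_bound / adambs_bound:.1f}x")
```

The gap between the bounds scales as sqrt(n / log n), so it widens as the dataset grows, which is why the improvement matters most for large n.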