Collaborating Authors

 Berariu, Tudor


Finding Increasingly Large Extremal Graphs with AlphaZero and Tabu Search

arXiv.org Artificial Intelligence

This work studies a central extremal graph theory problem inspired by a 1975 conjecture of Erdős, which asks for graphs of a given size (number of nodes) that maximize the number of edges while containing no 3- or 4-cycles. We formulate this problem as a sequential decision-making problem and compare AlphaZero, a neural-network-guided tree search, with tabu search, a heuristic local search method. With either method, introducing a curriculum -- jump-starting the search for larger graphs from good graphs found at smaller sizes -- improves the state-of-the-art lower bounds for several sizes. We also propose a flexible graph-generation environment and a permutation-invariant network architecture for learning to search in the space of graphs.
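
The abstract does not spell out the search procedure, so the snippet below is only an illustrative sketch of the underlying setup: grow a graph and greedily insert edges that keep it free of 3- and 4-cycles, jump-starting from a good smaller graph in the spirit of the described curriculum. The helper names (creates_short_cycle, greedy_extend) are hypothetical, and plain greedy insertion stands in for the tabu search and AlphaZero agents actually used in the paper.

    import itertools
    import networkx as nx

    def creates_short_cycle(G, u, v):
        # Adding edge (u, v) closes a cycle of length dist(u, v) + 1, so a
        # 3- or 4-cycle appears exactly when the current distance is <= 3.
        try:
            return nx.shortest_path_length(G, u, v) <= 3
        except nx.NetworkXNoPath:
            return False

    def greedy_extend(G, n_nodes):
        # Grow G to n_nodes nodes, then greedily insert every edge that keeps
        # the graph free of 3- and 4-cycles (girth >= 5).
        G = G.copy()
        G.add_nodes_from(range(n_nodes))
        added = True
        while added:
            added = False
            for u, v in itertools.combinations(G.nodes, 2):
                if not G.has_edge(u, v) and not creates_short_cycle(G, u, v):
                    G.add_edge(u, v)
                    added = True
        return G

    seed = nx.cycle_graph(5)            # a small girth-5 graph to jump-start from
    larger = greedy_extend(seed, 10)    # curriculum-style extension to 10 nodes
    print(larger.number_of_edges())     # a (weak) lower bound for 10 nodes

Since any new cycle must pass through the newly inserted edge, checking that the two endpoints are currently at distance at least 4 is enough to preserve the no-3-or-4-cycle property.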


A study on the plasticity of neural networks

arXiv.org Artificial Intelligence

One aim shared by multiple settings, such as continual learning or transfer learning, is to leverage previously acquired knowledge to converge faster on the current task. Usually this is done through fine-tuning, where an implicit assumption is that the network maintains its plasticity, meaning that the performance it can reach on any given task is not affected negatively by previously seen tasks. For example, PackNet (Mallya & Lazebnik, 2017) eventually gets to a point where all neurons are frozen and learning is not possible anymore. In the same fashion, accumulating constraints in EWC (Kirkpatrick et al., 2017) might lead to a strongly regularised objective that does not allow for the new task's loss to be minimised. Alternatively, learning might become less data efficient, referred to as negative forward transfer, an effect often noticed for regularisation.
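
The EWC constraint accumulation referred to above has a simple form: each completed task contributes a quadratic penalty anchored at that task's solution and weighted by a (diagonal) Fisher information estimate, so the effective objective grows more constrained with every task. A minimal PyTorch sketch, with hypothetical names (ewc_penalty, anchors) and assuming the Fisher diagonals have already been estimated:

    import torch

    def ewc_penalty(named_params, anchors, lam=1.0):
        # anchors: one (theta_star, fisher) pair of dicts per previously learned
        # task. Each finished task adds another quadratic term, which is how the
        # constraints accumulate over a task sequence.
        penalty = torch.zeros(())
        for theta_star, fisher in anchors:
            for name, p in named_params.items():
                penalty = penalty + (fisher[name] * (p - theta_star[name]) ** 2).sum()
        return 0.5 * lam * penalty

    # Hypothetical usage while training on the current task:
    # loss = task_loss + ewc_penalty(dict(model.named_parameters()), anchors)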


When Does Re-initialization Work?

arXiv.org Artificial Intelligence

Re-initializing a neural network during training has been observed to improve generalization in recent works. Yet it is neither widely adopted in deep learning practice nor is it often used in state-of-the-art training protocols. This raises the question of when re-initialization works, and whether it should be used together with regularization techniques such as data augmentation, weight decay and learning rate schedules. In this work, we conduct an extensive empirical comparison of standard training with a selection of re-initialization methods to answer this question, training over 15,000 models on a variety of image classification benchmarks. We first establish that such methods are consistently beneficial for generalization in the absence of any other regularization. However, when deployed alongside other carefully tuned regularization techniques, re-initialization methods offer little to no added benefit for generalization, although optimal generalization performance becomes less sensitive to the choice of learning rate and weight decay hyperparameters. To investigate the impact of re-initialization methods on noisy data, we also consider learning under label noise. Surprisingly, in this case, re-initialization significantly improves upon standard training, even in the presence of other carefully tuned regularization techniques.
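
The paper compares several re-initialization methods rather than prescribing one, so the following is only a generic sketch of the pattern: periodically resetting part of the network between training rounds. The helper name reinit and the choice to reset only the final layer are assumptions for illustration, not the exact schemes evaluated in the paper.

    import torch.nn as nn

    def reinit(module):
        # Reset every sub-module that defines reset_parameters() (Linear, Conv, ...).
        for m in module.modules():
            if hasattr(m, "reset_parameters"):
                m.reset_parameters()

    model = nn.Sequential(
        nn.Flatten(),
        nn.Linear(3 * 32 * 32, 256), nn.ReLU(),
        nn.Linear(256, 10),                  # the "head" we periodically reset
    )

    n_rounds, epochs_per_round = 4, 20
    for r in range(n_rounds):
        for epoch in range(epochs_per_round):
            ...                              # ordinary training epoch goes here
        if r < n_rounds - 1:
            reinit(model[-1])                # re-initialize the head between rounds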


Spectral Normalisation for Deep Reinforcement Learning: an Optimisation Perspective

arXiv.org Artificial Intelligence

Most of the recent deep reinforcement learning advances take an RL-centric perspective and focus on refinements of the training objective. We diverge from this view and show that we can recover the performance of these developments not by changing the objective, but by regularising the value-function estimator. Constraining the Lipschitz constant of a single layer using spectral normalisation is sufficient to elevate the performance of a Categorical-DQN agent to that of the more elaborate Rainbow agent on the challenging Atari domain. We conduct ablation studies to disentangle the various effects normalisation has on the learning dynamics and show that modulating the parameter updates alone is sufficient to recover most of the performance of spectral normalisation. These findings point to the need to also focus on the neural component and its learning dynamics to tackle the peculiarities of Deep Reinforcement Learning.
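
As a concrete illustration of the intervention described above, the sketch below wraps a single layer of a small value head with PyTorch's built-in spectral normalisation, which constrains that layer's spectral norm (and hence its Lipschitz constant) to roughly 1. The layer sizes are placeholders, not the Categorical-DQN architecture used in the paper.

    import torch.nn as nn
    from torch.nn.utils import spectral_norm

    n_actions, n_atoms = 6, 51               # placeholder categorical-value sizes

    value_head = nn.Sequential(
        nn.Flatten(),
        spectral_norm(nn.Linear(3136, 512)), # single spectrally-normalised layer
        nn.ReLU(),
        nn.Linear(512, n_actions * n_atoms), # logits of the return distribution
    )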