herding
TEAL: New Selection Strategy for Small Buffers in Experience Replay Class Incremental Learning
Shaul-Ariel, Shahar, Weinshall, Daphna
Continual Learning is an unresolved challenge, whose relevance increases when considering modern applications. Unlike the human brain, trained deep neural networks suffer from a phenomenon called Catastrophic Forgetting, where they progressively lose previously acquired knowledge upon learning new tasks. To mitigate this problem, numerous methods have been developed, many relying on replaying past exemplars during new task training. However, as the memory allocated for replay decreases, the effectiveness of these approaches diminishes. On the other hand, maintaining a large memory for the purpose of replay is inefficient and often impractical. Here we introduce TEAL, a novel approach for populating the memory with exemplars, which can be integrated with various experience-replay methods and significantly enhances their performance with small memory buffers. We show that TEAL improves the average accuracy of the SOTA method XDER, as well as ER and ER-ACE, on several image recognition benchmarks with a small memory buffer of 1-3 exemplars per class in the final task. This confirms the hypothesis that when memory is scarce, it is best to prioritize the most typical data.
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
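The abstract above argues that with 1-3 slots per class, the buffer should hold the most typical exemplars. A minimal sketch of one natural proxy for "typicality", assuming exemplars are ranked by distance to their class feature mean (the function name and the mean-distance criterion are illustrative, not TEAL's published procedure):

```python
import numpy as np

def select_typical_exemplars(features, m):
    """Pick the m exemplars whose feature vectors lie closest to the class
    mean, i.e. the most 'typical' samples (a hypothetical proxy criterion)."""
    mean = features.mean(axis=0)
    dists = np.linalg.norm(features - mean, axis=1)
    return np.argsort(dists)[:m]

# toy usage: 5 two-dimensional feature vectors, keep the 2 most typical;
# the outlier at (5, 5) is never selected
feats = np.array([[0.0, 0.0], [0.1, 0.1], [5.0, 5.0], [0.2, 0.0], [-0.1, 0.1]])
print(select_typical_exemplars(feats, 2))
```

Note that such a criterion deliberately discards boundary samples; the bet, per the abstract, is that with only 1-3 slots per class a representative prototype preserves more knowledge than a hard example.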
Multi-robot Implicit Control of Massive Herds
Sebastian, Eduardo, Montijano, Eduardo, Sagues, Carlos
This paper solves the problem of herding countless evaders by means of a few robots. The objective is to steer all the evaders towards a desired tracking reference while avoiding escapes. The problem is very challenging due to the evaders' highly complex repulsive dynamics and the underdetermined states to control. We propose a solution based on Implicit Control and a novel dynamic assignment strategy to select the evaders to be directly controlled. The former is a general technique that explicitly computes control inputs even for highly complex input-nonaffine dynamics. The latter is built upon a convex-hull dynamic clustering inspired by the Voronoi tessellation problem. The combination of both makes it possible to choose the best evaders to control directly, while the others are controlled indirectly by exploiting the repulsive interactions among them. Simulations show that massive herds can be herded along complex patterns by means of a few herders.
- Europe > Spain (0.04)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
Entropic Herding
Yamashita, Hiroshi, Suzuki, Hideyuki, Aihara, Kazuyuki
Herding is a deterministic algorithm used to generate data points that can be regarded as random samples satisfying input moment conditions. The algorithm is based on the complex behavior of a high-dimensional dynamical system and is inspired by the maximum entropy principle of statistical inference. In this paper, we propose an extension of the herding algorithm, called entropic herding, which generates a sequence of distributions instead of points. Entropic herding is derived as the optimization of the target function obtained from the maximum entropy principle. Using the proposed entropic herding algorithm as a framework, we discuss a closer connection between herding and the maximum entropy principle. Specifically, we interpret the original herding algorithm as a tractable version of entropic herding, whose ideal output distribution is mathematically represented. We further discuss how the complex behavior of the herding algorithm contributes to optimization. We argue that the proposed entropic herding algorithm extends the application of herding to probabilistic modeling. In contrast to the original herding, entropic herding can generate a smooth distribution, so that both efficient probability density calculation and sample generation become possible. To demonstrate the viability of these arguments, we conducted numerical experiments on both synthetic and real data, including a comparison with other conventional methods.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
- North America > United States > New York > New York County > New York City (0.04)
Herding as a Learning System with Edge-of-Chaos Dynamics
Herding defines a deterministic dynamical system at the edge of chaos. It generates a sequence of model states and parameters by alternating parameter perturbations with state maximizations, where the sequence of states can be interpreted as "samples" from an associated MRF model. Herding differs from maximum likelihood estimation in that the sequence of parameters does not converge to a fixed point, and differs from an MCMC posterior sampling approach in that the sequence of states is generated deterministically. Herding may be interpreted as a "perturb-and-map" method where the parameter perturbations are generated using a deterministic nonlinear dynamical system rather than randomly from a Gumbel distribution. This chapter studies the distinct statistical characteristics of the herding algorithm and shows that the fast convergence rate of the controlled moments may be attributed to edge-of-chaos dynamics. The herding algorithm can also be generalized to models with latent variables and to a discriminative learning setting. The perceptron cycling theorem ensures that the fast moment matching property is preserved in the more general framework.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Oregon > Benton County > Corvallis (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- (7 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- (3 more...)
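The alternation described above (state maximization, then a parameter perturbation by the moment mismatch) can be sketched in a few lines. This is a minimal toy instance, assuming identity features over binary states and a hand-picked target moment vector; the function and variable names are illustrative, not from the chapter:

```python
import itertools
import numpy as np

def herding(phi, states, mu, T):
    """Basic herding dynamics: alternate a state maximization with a
    parameter perturbation so that sample moments track the targets mu."""
    w = mu.copy()
    samples = []
    for _ in range(T):
        # state maximization: pick the state whose features best align with w
        s = max(states, key=lambda state: w @ phi(state))
        samples.append(s)
        # parameter perturbation: move w by the moment mismatch mu - phi(s)
        w = w + mu - phi(s)
    return samples

# toy model: binary states in {0,1}^2, features are the states themselves,
# target moments mu = (0.75, 0.25)
states = list(itertools.product([0.0, 1.0], repeat=2))
phi = np.asarray
mu = np.array([0.75, 0.25])
out = herding(phi, states, mu, 100)
print(np.mean(out, axis=0))  # empirical moments track mu
```

Note how the weights never converge: they cycle, and it is precisely this non-convergent trajectory that makes the empirical moments of the emitted states match the targets quickly.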
An Average Classification Algorithm
van Rooyen, Brendan, Menon, Aditya Krishna, Williamson, Robert C.
Many classification algorithms produce a classifier that is a weighted average of kernel evaluations. When working with a high- or infinite-dimensional kernel, it is imperative, for both evaluation speed and storage, that as few training samples as possible are used in the kernel expansion. Popular existing approaches focus on altering standard learning algorithms, such as the Support Vector Machine, to induce sparsity, as well as on post-hoc procedures for sparse approximations. Here we adopt the latter approach. We begin with a very simple classifier, given by the kernel mean $$ f(x) = \frac{1}{n} \sum\limits_{i=1}^{n} y_i K(x_i,x). $$ We then find a sparse approximation to this kernel mean via herding. The result is an accurate, easily parallelized algorithm for learning classifiers.
- Oceania > Australia > Queensland (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
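The kernel-mean classifier in the formula above is simple enough to state directly in code. A minimal sketch, assuming an RBF kernel and labels in {-1, +1} (the kernel choice and toy data are illustrative, not from the paper):

```python
import numpy as np

def kernel_mean_classifier(X_train, y_train, K):
    """f(x) = (1/n) * sum_i y_i K(x_i, x): classify a point by the sign of
    the label-weighted kernel mean over the full training set."""
    def f(x):
        return np.mean([y * K(xi, x) for xi, y in zip(X_train, y_train)])
    return f

# toy 1-D example with an RBF kernel
rbf = lambda a, b: np.exp(-np.sum((np.asarray(a) - np.asarray(b)) ** 2))
X = [(-2.0,), (-1.5,), (1.5,), (2.0,)]
y = [-1, -1, +1, +1]
f = kernel_mean_classifier(X, y, rbf)
print(np.sign(f((1.8,))), np.sign(f((-1.7,))))  # → 1.0 -1.0
```

Every training point appears in this expansion, which is exactly the cost the abstract targets: herding is then used to replace the uniform average over all n points with a short, greedily chosen subset.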
Herding the Crowd: Automated Planning for Crowdsourced Planning
Talamadupula, Kartik (Arizona State University) | Kambhampati, Subbarao (Arizona State University) | Hu, Yuheng (Arizona State University) | Nguyen, Tuan Anh (Arizona State University) | Zhuo, Hankz Hankui (Sun Yat-sen University, Guangzhou, China)
An important application of human computation is crowdsourced planning and scheduling. In this paper, we present an architecture for an automated system that can significantly improve the effectiveness of the crowd in collaborating and coming up with effective plans by herding it. We define two main problems that have to be solved when designing such automated crowd-herding systems: interpretation, and steering; and discuss how automated planning techniques can be used to solve these problems.
Super-Samples from Kernel Herding
Chen, Yutian, Welling, Max, Smola, Alex
We extend the herding algorithm to continuous spaces by using the kernel trick. The resulting "kernel herding" algorithm is an infinite-memory deterministic process that learns to approximate a PDF with a collection of samples. We show that kernel herding decreases the error of expectations of functions in the Hilbert space at a rate of $O(1/T)$, which is much faster than the usual $O(1/\sqrt{T})$ for iid random samples. We illustrate kernel herding by approximating Bayesian predictive distributions.
- North America > United States > California > Orange County > Irvine (0.14)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)
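The greedy form of kernel herding described above can be sketched directly: each step selects the candidate that best matches the target mean embedding while being penalized for similarity to earlier picks. A minimal sketch, assuming the mean embedding is estimated on the candidate pool itself and an RBF kernel is used (the function name and these choices are illustrative, not from the paper):

```python
import numpy as np

def kernel_herding(candidates, K, T):
    """Greedy kernel herding: pick x_{t+1} maximizing the estimated mean
    embedding minus the average similarity to previously picked samples."""
    G = np.array([[K(a, b) for b in candidates] for a in candidates])
    mean_embed = G.mean(axis=1)  # estimate of E_p[K(x, .)] on the pool
    picked = []
    for t in range(T):
        penalty = G[:, picked].sum(axis=1) / (t + 1) if picked else 0.0
        picked.append(int(np.argmax(mean_embed - penalty)))
    return [candidates[i] for i in picked]

# usage: 10 "super-samples" from 200 draws of a 2-D standard normal
rng = np.random.default_rng(0)
data = list(rng.normal(size=(200, 2)))
rbf = lambda a, b: np.exp(-np.sum((a - b) ** 2) / 2.0)
supers = kernel_herding(data, rbf, 10)
```

The repulsion term is what distinguishes the output from iid draws: successive picks spread out over the distribution, which is the mechanism behind the faster $O(1/T)$ decay of expectation errors claimed in the abstract.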