
Supplementary Materials for: Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State

Neural Information Processing Systems

Figure 1 illustrates our feedback models with single-layer and multi-layer structure, as indicated in Sections 4.1 and 4.3. We present the pseudocode of one iteration of IDE training in Algorithm 1 to better illustrate our training method.

Input: network parameters θ; input data x; label y; time steps T; other hyperparameters.
Output: trained network parameters θ.
1. Simulate the SNN for T time steps with input x based on Eq. (2) and calculate the final (weighted) average firing rate a[T].
2. Calculate the output o and the loss L based on o and y.
3. Update θ based on the gradient-based optimizer.
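The iteration above can be sketched in code. This is a hedged, self-contained illustration, not the paper's implementation: the leak, threshold, reset rule, and especially the gradient (a simple rate-error surrogate rather than true implicit differentiation through the equilibrium state) are all simplifying assumptions made for brevity.

```python
import numpy as np

def ide_training_step(theta, x, y, T, lr=0.1, leak=0.9, v_th=1.0):
    """One IDE-style training iteration (illustrative sketch, not the paper's code).

    Simulates a single-layer feedback SNN for T steps, computes the average
    firing rate a[T], evaluates a squared loss, and updates theta. The real
    method obtains the gradient by implicit differentiation on the equilibrium
    state; here a crude rate-error surrogate stands in for it.
    """
    n = x.shape[0]
    v = np.zeros(n)            # membrane potentials
    spike_count = np.zeros(n)
    a = np.zeros(n)            # running average firing rate
    for t in range(1, T + 1):
        v = leak * v + theta @ a + x      # feedback through the rate a (simplified Eq. (2))
        s = (v >= v_th).astype(float)     # spike when the potential crosses threshold
        v -= v_th * s                     # soft reset after spiking
        spike_count += s
        a = spike_count / t               # average firing rate a[t]
    o = a                                  # readout: output = final average rate
    loss = 0.5 * np.sum((o - y) ** 2)
    grad = np.outer(o - y, a)              # surrogate for the implicit gradient (assumption)
    theta = theta - lr * grad
    return theta, loss
```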



Minimax Optimal Online Imitation Learning via Replay Estimation

Neural Information Processing Systems

Online imitation learning is the problem of how best to mimic expert demonstrations, given access to the environment or an accurate simulator. Prior work has shown that in the infinite sample regime, exact moment matching achieves value equivalence to the expert policy.


Your GAN is Secretly an Energy-based Model and You Should Use Discriminator Driven Latent Sampling

Neural Information Processing Systems

We show that samples can be generated from this modified density by sampling in latent space according to an energy-based model induced by the sum of the latent prior log-density and the discriminator output score. We call this process of running Markov Chain Monte Carlo in the latent space, and then applying the generator function, Discriminator Driven Latent Sampling (DDLS). We show that DDLS is highly efficient compared to previous methods which work in the high-dimensional pixel space, and can be applied to improve on previously trained GANs of many types. We evaluate DDLS on both synthetic and real-world datasets qualitatively and quantitatively. On CIFAR-10, DDLS substantially improves the Inception Score of an off-the-shelf pre-trained SN-GAN [1] from 8.22 to 9.09 which is comparable to the class-conditional BigGAN [2] model. This achieves a new state-of-the-art in the unconditional image synthesis setting without introducing extra parameters or additional training.
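The DDLS procedure described above (Langevin MCMC in latent space on an energy given by the latent prior log-density plus the discriminator score, followed by one pass through the generator) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are placeholders, and gradients of the discriminator score are taken by finite differences, whereas a real implementation would use autodiff through the GAN.

```python
import numpy as np

def ddls_sample(generator, discriminator, log_prior_grad, z0,
                steps=100, step_size=0.01, rng=None):
    """Discriminator Driven Latent Sampling (illustrative sketch).

    Runs unadjusted Langevin dynamics in latent space on the induced density
    whose log-gradient is  grad log p(z) + grad d(G(z)),  where d is the
    discriminator logit and G the generator, then maps the final latent
    through the generator. Finite differences stand in for autodiff here.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = z0.astype(float).copy()
    eps = 1e-4
    for _ in range(steps):
        # finite-difference gradient of the discriminator score w.r.t. z
        g = np.zeros_like(z)
        base = discriminator(generator(z))
        for i in range(z.size):
            zp = z.copy()
            zp[i] += eps
            g[i] = (discriminator(generator(zp)) - base) / eps
        grad_logp = log_prior_grad(z) + g          # gradient of the log density
        z = (z + 0.5 * step_size * grad_logp
             + np.sqrt(step_size) * rng.standard_normal(z.shape))
    return generator(z)
```

With a standard normal prior, `log_prior_grad` is simply `lambda z: -z`; the chain then concentrates latents where the discriminator assigns high scores before decoding them.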




A Bandit Learning Algorithm and Applications to Auction Design

Neural Information Processing Systems

We consider online bandit learning in which, at every time step, an algorithm makes a decision and then observes only its own reward. The goal is to design efficient (polynomial-time) algorithms whose total reward is approximately that of the best fixed decision in hindsight. In this paper, we introduce a new notion of (λ, µ)-concave functions and present a bandit learning algorithm whose performance guarantee is characterized as a function of the concavity parameters λ and µ. The algorithm is based on mirror descent, with update directions following the gradient of the multilinear extensions of the reward functions. The regret bound induced by our algorithm is Õ(√T), which is nearly optimal.
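The bandit-feedback structure described above can be illustrated with a generic one-point gradient estimator inside a mirror-descent-style loop. This is a simplified sketch under assumptions: it uses a Euclidean (projected-gradient) update on the box [0, 1]^n rather than the paper's mirror map, and a spherical one-point estimator rather than gradients of the multilinear extension; the (λ, µ)-concavity machinery is omitted entirely.

```python
import numpy as np

def bandit_gradient_ascent(reward_fn, n, T, eta=0.05, delta=0.1, rng=None):
    """One-point bandit optimization sketch (illustrative, not the paper's algorithm).

    At each step the learner queries a single perturbed point, observes only
    that reward, forms a one-point gradient estimate, and takes a projected
    gradient (Euclidean mirror-descent) step on [0, 1]^n.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.full(n, 0.5)
    total_reward = 0.0
    for _ in range(T):
        u = rng.standard_normal(n)
        u /= np.linalg.norm(u)                            # uniform direction on the sphere
        r = reward_fn(np.clip(x + delta * u, 0.0, 1.0))   # single bandit observation
        total_reward += r
        grad_est = (n / delta) * r * u                    # one-point gradient estimator
        x = np.clip(x + eta * grad_est, 0.0, 1.0)         # ascent step projected to the box
    return x, total_reward
```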



A FineWeb Datasheet: Dataset Details: Purpose of the dataset

Neural Information Processing Systems

We released FineWeb to make large language model training more accessible to the machine learning community at large. The dataset was curated by Hugging Face. The dataset was funded by Hugging Face. The dataset is released under the Open Data Commons Attribution (ODC-By) v1.0 license. The use of this dataset is also subject to Common Crawl's Terms of Use.


No-Regret Learning in Dynamic Competition with Reference Effects Under Logit Demand

Neural Information Processing Systems

We consider the dynamic price competition between two firms operating within an opaque marketplace, where each firm lacks information about its competitor. The demand follows the multinomial logit (MNL) choice model, which depends on the consumers'