

253f7b5d921338af34da817c00f42753-AuthorFeedback.pdf

Neural Information Processing Systems

Summary: We would like to thank the entire review team for their efforts and insightful comments. ([DZPS18] refers to arXiv:1810.02054.) ... approach zero as the sample size n ... The ImageNet dataset has 14 million images. For those applications, a non-diminishing convergence rate is more desirable. By Eq. (4), we know ŷ ... Response to the concern on the fixed second layer.



Rethinking Score Distillation as a Bridge Between Image Distributions
David McAllister, Songwei Ge, Jia-Bin Huang, David W. Jacobs

Neural Information Processing Systems

Score distillation sampling (SDS) has proven to be an important tool, enabling the use of large-scale diffusion priors for tasks operating in data-poor domains. Unfortunately, SDS has a number of characteristic artifacts that limit its usefulness in general-purpose applications. In this paper, we make progress toward understanding the behavior of SDS and its variants by viewing them as solving an optimal-cost transport path from a source distribution to a target distribution. Under this new interpretation, these methods seek to transport corrupted images (source) to the natural image distribution (target). We argue that current methods' characteristic artifacts are caused by (1) linear approximation of the optimal path and (2) poor estimates of the source distribution. We show that calibrating the text conditioning of the source distribution can produce high-quality generation and translation results with little extra overhead. Our method can be easily applied across many domains, matching or beating the performance of specialized methods. We demonstrate its utility in text-to-2D, text-based NeRF optimization, translating paintings to real images, optical illusion generation, and 3D sketch-to-real. We compare our method to existing approaches for score distillation sampling and show that it can produce high-frequency details with realistic colors.
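To make the transport view concrete, the sketch below shows the general shape of a score-distillation-style update in which the usual added-noise term is replaced by a source-conditioned prediction, so the optimization pushes an image from a (calibrated) source distribution toward the target text's distribution. This is a minimal illustration under stated assumptions, not the paper's exact procedure; eps_pred, text_source, text_target, and the toy noise predictor are placeholders introduced here for the example.

```python
# Minimal sketch of a score-distillation-style update (assumptions: a
# pretrained noise predictor `eps_pred(x_t, t, text)` is available; the
# stand-in below is a toy so the snippet runs end to end).
import torch

def sds_grad(x, eps_pred, text_target, text_source, t, alpha_bar):
    """One SDS-like gradient on image x.

    Classic SDS uses (eps_pred(x_t, t, target) - eps) as the gradient;
    source-conditioning variants instead subtract a source-conditioned
    prediction, i.e. transport from a calibrated source distribution
    toward the target. Both are sketched here under that interpretation.
    """
    noise = torch.randn_like(x)
    x_t = alpha_bar.sqrt() * x + (1 - alpha_bar).sqrt() * noise  # forward diffusion
    eps_target = eps_pred(x_t, t, text_target)
    eps_source = eps_pred(x_t, t, text_source)   # calibrated source conditioning
    return eps_target - eps_source               # gradient direction w.r.t. x

# Toy stand-in noise predictor so the sketch executes.
toy_eps = lambda x_t, t, text: x_t * 0.1 + 0.01 * len(text)
x = torch.zeros(3, 64, 64, requires_grad=True)
g = sds_grad(x, toy_eps, "a photo of a dog", "a blurry photo", t=500,
             alpha_bar=torch.tensor(0.5))
with torch.no_grad():
    x -= 0.1 * g  # gradient step on the image (or on renderer parameters)
```

In practice eps_pred would be a large pretrained text-conditioned diffusion model, and x would be the output of a differentiable renderer such as a NeRF rather than a raw pixel grid.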


DeepSITH: Efficient Learning via Decomposition of What and When Across Time Scales

Neural Information Processing Systems

In other words, if the input f(t) is composed of discrete events (as in the top panel of Figure 1), the memory representation of a particular event stored in f becomes more "fuzzy" as time elapses. After enough time has elapsed, events that were presented close together in time gradually blend together, as illustrated in the bottom panel of Figure 1. The top panel shows the one-dimensional input signal, consisting of a long, then short, then long pulse, with the activity of f shown at three different points in time in the panels below. This gives each SITH layer access to the entire compressed history at every time step, without having to learn how long to maintain information from the past. A fifth parameter, dt, is the rate at which the input is "presented" to the network; we set it to 1 in all our experiments to indicate that the input signal is presented at a rate of 1 Hz. Figure 1: A SITH layer compresses the history leading up to the present. Top: a signal featuring a long, then short, then long pulse, separated by moments of no activation, is the input to a SITH layer.
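A rough sense of what "compressed history" means can be conveyed with a bank of leaky integrators whose time constants are geometrically spaced, so recent events stay sharp while older events blur together. This is a simplified stand-in for the SITH layer's Laplace / inverse-Laplace construction, not a reimplementation of it; the function name and parameter values below are illustrative assumptions.

```python
# Minimal sketch (assumption: geometrically spaced leaky integrators as a
# simplified stand-in for the SITH layer's compressed memory; tau values,
# the number of scales, and dt=1 follow the spirit, not the exact equations,
# of the paper).
import numpy as np

def compressed_history(signal, n_taus=8, tau_min=1.0, tau_max=100.0, dt=1.0):
    """Return a (time, n_taus) array: a 'fuzzy', log-compressed memory of signal.

    Older events live in the slow (large-tau) channels and blur together;
    recent events stay sharp in the fast (small-tau) channels.
    """
    taus = np.geomspace(tau_min, tau_max, n_taus)   # log-spaced time scales
    memory = np.zeros(n_taus)
    out = []
    for x in signal:                                 # present input at 1/dt Hz
        memory += dt * (-memory / taus + x)          # leaky integration per scale
        out.append(memory.copy())
    return np.array(out)

# Long pulse, gap, short pulse, gap, long pulse (as in Figure 1).
sig = np.concatenate([np.ones(20), np.zeros(30), np.ones(5),
                      np.zeros(30), np.ones(20), np.zeros(50)])
hist = compressed_history(sig)
print(hist.shape)  # (155, 8)
```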


I just watched Gmail generate AI responses for me - and they were scarily accurate

ZDNet

The Google I/O keynote took place earlier this week, and the company took the stage to unveil new features across all of its product offerings. This included AI upgrades to the Google Workspace suite of applications, which millions of users rely on every day to get their work done, including Google Docs, Meet, Slides, Gmail, and Vids. The features unveiled this year focused on practicality: they embed AI into the Google apps you already use every day, speeding up your daily workflow by performing tedious and time-consuming tasks such as cleaning out your inbox. Everyone can relate to being bombarded with emails.




Conditional score-based diffusion models for Bayesian inference in infinite dimensions

Neural Information Processing Systems

Since their initial introduction, score-based diffusion models (SDMs) have been successfully applied to solve a variety of linear inverse problems in finite-dimensional vector spaces, due to their ability to efficiently approximate the posterior distribution. However, using SDMs for inverse problems in infinite-dimensional function spaces has only been addressed recently, primarily through methods that learn the unconditional score. While this approach is advantageous for some inverse problems, it is mostly heuristic and involves numerous computationally costly forward-operator evaluations during posterior sampling. To address these limitations, we propose a theoretically grounded method for sampling from the posterior of infinite-dimensional Bayesian linear inverse problems based on amortized conditional SDMs. In particular, we prove that one of the most successful approaches for estimating the conditional score in finite dimensions, the conditional denoising estimator, can also be applied in infinite dimensions. A significant part of our analysis is dedicated to demonstrating that extending infinite-dimensional SDMs to the conditional setting requires careful consideration, as the conditional score typically blows up for small times, in contrast to the unconditional score. We conclude by presenting stylized and large-scale numerical examples that validate our approach, offer additional insights, and demonstrate that our method enables large-scale, discretization-invariant Bayesian inference.
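As a rough illustration of the conditional denoising estimator idea, the sketch below trains a score network on (noisy sample, observation, time) triples with a standard denoising score-matching target. The time weighting and the small-time cutoff are illustrative choices motivated by the conditional score blowing up as t approaches 0; they are not the paper's infinite-dimensional, discretization-invariant construction, and score_net, the noise schedule, and the toy data are placeholders.

```python
# Minimal sketch of a conditional denoising score-matching loss (assumptions:
# `score_net(x_t, y, t)` is any network taking the noisy sample, the
# observation y, and time t; lambda(t) and t_min are illustrative choices).
import torch

def cdse_loss(score_net, x0, y, sigma_min=1e-2, sigma_max=1.0, t_min=1e-3):
    t = torch.rand(x0.shape[0], device=x0.device).clamp_min(t_min)  # avoid t ~ 0
    sigma = sigma_min * (sigma_max / sigma_min) ** t                # noise schedule
    sigma = sigma.view(-1, *([1] * (x0.dim() - 1)))
    noise = torch.randn_like(x0)
    x_t = x0 + sigma * noise
    target = -noise / sigma                      # score of the perturbation kernel
    pred = score_net(x_t, y, t)
    weight = sigma.pow(2)                        # lambda(t): tames the blow-up
    return (weight * (pred - target) ** 2).mean()

# Toy stand-in network and data so the sketch runs.
net = lambda x_t, y, t: -(x_t - y)               # pretends y is a denoised guess
x0 = torch.randn(4, 16)
y = x0 + 0.1 * torch.randn_like(x0)              # noisy linear observation
print(cdse_loss(net, x0, y).item())
```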


Improved Convergence in High Probability of Clipped Gradient Methods with Heavy Tailed Noise
Thien Hang Nguyen, Khoury College of Computer Sciences, Northeastern University

Neural Information Processing Systems

In this work, we study the convergence in high probability of clipped gradient methods when the noise distribution has heavy tails, i.e., bounded p-th moments for some 1 < p ≤ 2. Prior works in this setting follow the same recipe of using concentration inequalities and an inductive argument with a union bound to control the iterates across all iterations. This approach inflates the failure probability by a factor of T, where T is the number of iterations. We instead propose a new analysis based on bounding the moment generating function of a well-chosen supermartingale sequence. We improve the dependency on T in the convergence guarantee for a wide range of algorithms with clipped gradients, including stochastic (accelerated) mirror descent for convex objectives and stochastic gradient descent for nonconvex objectives. Our high-probability bounds achieve the optimal convergence rates and match the best currently known in-expectation bounds. Our approach naturally allows the algorithms to use time-varying step sizes and clipping parameters when the time horizon is unknown, which appears difficult or even impossible with existing techniques from prior works. Furthermore, we show that in the case of clipped stochastic mirror descent, several problem constants, including the initial distance to the optimum, are not required when setting step sizes and clipping parameters.
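For intuition, the sketch below shows the basic clipped-gradient update with time-varying step sizes and clipping thresholds, applied to a toy objective whose stochastic gradients carry heavy-tailed (Student-t) noise. The specific schedules eta_t = eta0 / sqrt(t) and lambda_t = lam0 * t^(1/4) are illustrative stand-ins, not the parameter settings whose high-probability guarantees the paper derives.

```python
# Minimal sketch of clipped SGD with time-varying step sizes and clipping
# levels (assumptions: the schedules below are illustrative, not the paper's
# constants; the heavy-tailed noise is a toy Student-t perturbation).
import numpy as np

def clipped_sgd(grad_fn, x0, T=1000, eta0=0.1, lam0=1.0):
    x = np.asarray(x0, dtype=float)
    for t in range(1, T + 1):
        g = grad_fn(x)
        lam = lam0 * t ** 0.25                   # growing clipping threshold
        norm = np.linalg.norm(g)
        if norm > lam:
            g = g * (lam / norm)                 # clip: keep direction, cap norm
        x = x - (eta0 / np.sqrt(t)) * g          # diminishing step size
    return x

# Toy quadratic with heavy-tailed gradient noise: Student-t with df=1.5 has a
# finite mean but infinite variance, matching the bounded p-th moment regime.
rng = np.random.default_rng(0)
noisy_grad = lambda x: 2 * x + rng.standard_t(df=1.5, size=x.shape)
print(clipped_sgd(noisy_grad, np.ones(10)))
```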