Subject-driven Text-to-Image Generation via Apprenticeship Learning: Supplementary Materials

Neural Information Processing Systems

These challenges have spawned the new task of 'Subject-Driven Text-to-Image Generation', which is the core task our paper aims to solve. As a result, we are left with 10M image clusters. We then apply the pre-trained CLIP ViT-L/14 model [39] to filter out 81.1% of the clusters. Though the mined clusters already contain (image, alt-text) information, the alt-text is noisy; for example, the generation model believes a 'teapot' should contain ... Figure: SuTI's in-context generation, demonstrating its skill set. Results are generated from a single model; the subject (image, text) and editing keywords are annotated, with the detailed template given in the Appendix.
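
A minimal sketch of the kind of CLIP-based cluster filtering described above, using the Hugging Face transformers CLIP ViT-L/14 checkpoint; the cluster format, scoring rule (mean image/alt-text cosine similarity), and threshold are illustrative assumptions, not the paper's actual pipeline.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Hypothetical cluster format: a list of (image_path, alt_text) pairs for one subject.
Cluster = list[tuple[str, str]]

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

@torch.no_grad()
def cluster_alignment(cluster: Cluster) -> float:
    """Mean CLIP image-text cosine similarity over a cluster's (image, alt-text) pairs."""
    images = [Image.open(path).convert("RGB") for path, _ in cluster]
    texts = [alt for _, alt in cluster]
    inputs = processor(text=texts, images=images, return_tensors="pt", padding=True)
    out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return (img * txt).sum(dim=-1).mean().item()

def filter_clusters(clusters: list[Cluster], threshold: float = 0.28) -> list[Cluster]:
    # Keep only clusters whose average image/alt-text alignment clears the (assumed) threshold.
    return [c for c in clusters if cluster_alignment(c) >= threshold]
```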


Subject-driven Text-to-Image Generation via Apprenticeship Learning
Wenhu Chen, Yandong Li, Nataniel Ruiz

Neural Information Processing Systems

Recent text-to-image generation models like DreamBooth have made remarkable progress in generating highly customized images of a target subject, by fine-tuning an "expert model" for a given subject from a few examples. However, this process is expensive, since a new expert model must be learned for each subject. In this paper, we present SuTI, a Subject-driven Text-to-Image generator that replaces subject-specific fine-tuning with in-context learning. Given a few demonstrations of a new subject, SuTI can instantly generate novel renditions of the subject in different scenes, without any subject-specific optimization. SuTI is powered by apprenticeship learning, where a single apprentice model is learned from data generated by a massive number of subject-specific expert models. Specifically, we mine millions of image clusters from the Internet, each centered around a specific visual subject.
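
Schematically, the apprenticeship recipe in the abstract is a two-stage loop: tune a throwaway expert per mined cluster, keep its good generations, and train one in-context apprentice on (demonstrations, prompt) -> image pairs. The sketch below only shows that data flow; fine_tune_expert, quality_filter, and train_step are hypothetical placeholders, not SuTI's actual components.

```python
from dataclasses import dataclass

@dataclass
class SubjectCluster:
    subject_images: list   # demonstration images of one subject
    subject_texts: list    # their alt-texts / captions
    target_prompts: list   # prompts placing the subject in new scenes

def build_apprentice_dataset(clusters, base_model, fine_tune_expert, quality_filter):
    """Stage 1: each cluster trains a throwaway expert whose filtered outputs
    become (demonstrations, prompt) -> image supervision for the apprentice."""
    dataset = []
    for cluster in clusters:
        expert = fine_tune_expert(base_model, cluster.subject_images, cluster.subject_texts)
        for prompt in cluster.target_prompts:
            image = expert.generate(prompt)
            if quality_filter(image, prompt, cluster.subject_images):
                dataset.append((cluster.subject_images, cluster.subject_texts, prompt, image))
    return dataset

def train_apprentice(apprentice, dataset, train_step, epochs=1):
    """Stage 2: one in-context model learns to map demonstrations + prompt directly
    to the expert's output, so no per-subject tuning is needed at inference time."""
    for _ in range(epochs):
        for demo_images, demo_texts, prompt, target in dataset:
            train_step(apprentice, demo_images, demo_texts, prompt, target)
    return apprentice
```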


Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts

Neural Information Processing Systems

Leveraging the model's outputs, specifically the logits, is a common approach to estimating the test accuracy of a pre-trained neural network on out-of-distribution (OOD) samples without requiring access to the corresponding ground-truth labels. Despite their ease of implementation and computational efficiency, current logit-based methods are vulnerable to overconfidence, leading to prediction bias, especially under natural distribution shifts.
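
For reference, the simplest estimator in this logit-based family scores unlabeled OOD accuracy by the model's own mean maximum softmax confidence; the sketch below implements that generic baseline (and shows how sharper logits inflate it), not the matrix-norm method proposed in the paper.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stabilization
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def average_confidence_estimate(logits: np.ndarray) -> float:
    """Estimate accuracy on unlabeled OOD data as the mean max-softmax probability.
    Overconfident logits inflate this estimate -- the bias the abstract refers to."""
    probs = softmax(logits)
    return float(probs.max(axis=1).mean())

# Toy usage with random logits standing in for a classifier's OOD outputs.
rng = np.random.default_rng(0)
ood_logits = rng.normal(size=(1000, 10))
print(average_confidence_estimate(ood_logits))         # moderate "accuracy"
print(average_confidence_estimate(ood_logits * 3.0))   # sharper logits -> inflated estimate
```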


Higher-Order Uncoupled Dynamics Do Not Lead to Nash Equilibrium -- Except When They Do

Neural Information Processing Systems

The framework of multi-agent learning explores the dynamics of how an agent's strategies evolve in response to the evolving strategies of other agents. Of particular interest is whether or not agent strategies converge to well-known solution concepts such as Nash Equilibrium (NE). In "higher-order" learning, agent dynamics include auxiliary states that can capture phenomena such as path dependencies. We introduce higher-order gradient play dynamics that resemble projected gradient ascent with auxiliary states. The dynamics are "payoff-based" and "uncoupled" in that each agent's dynamics depend on its own evolving payoff and have no explicit dependence on the utilities of other agents. We first show that for any polymatrix game with an isolated completely mixed-strategy NE, there exist higher-order gradient play dynamics that lead (locally) to that NE, both for the specific game and nearby games with perturbed utility functions. Conversely, we show that for any higher-order gradient play dynamics, there exists a game with a unique isolated completely mixed-strategy NE for which the dynamics do not lead to NE. Finally, we show that convergence to the mixed-strategy equilibrium in coordination games can come at the expense of the dynamics being inherently internally unstable.
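
A toy example can make "higher-order gradient play with auxiliary states" concrete. The sketch below runs projected gradient play on Matching Pennies, with an optional auxiliary filtered-gradient state that adds a derivative-action style correction; this anticipatory variant (with assumed gains k and lam) is only a generic illustration of higher-order dynamics, not the construction used in the paper.

```python
import numpy as np

def payoff_grads(p: float, q: float):
    """Matching Pennies: u1 = (2p-1)(2q-1), u2 = -u1.
    Each agent uses only the gradient of its own payoff (uncoupled)."""
    return 2.0 * (2.0 * q - 1.0), -2.0 * (2.0 * p - 1.0)

def run(lr=0.01, k=0.0, lam=0.1, steps=20000):
    """Projected gradient play with an auxiliary low-pass state per agent.
    k = 0 recovers first-order gradient play; k > 0 adds a derivative-action
    style correction g + k*(g - z), one generic way to realize higher-order dynamics."""
    p, q = 0.9, 0.2            # initial mixed strategies (probability of first action)
    z1 = z2 = 0.0              # auxiliary states (filtered payoff gradients)
    for _ in range(steps):
        g1, g2 = payoff_grads(p, q)
        p = float(np.clip(p + lr * (g1 + k * (g1 - z1)), 0.0, 1.0))  # projection onto [0, 1]
        q = float(np.clip(q + lr * (g2 + k * (g2 - z2)), 0.0, 1.0))
        z1 += lam * (g1 - z1)  # auxiliary-state dynamics
        z2 += lam * (g2 - z2)
    return p, q

print("first-order :", run(k=0.0))   # does not settle at the mixed NE (0.5, 0.5)
print("higher-order:", run(k=5.0))   # the extra state can pull play toward the mixed NE
```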



Proportional Response: Contextual Bandits for Simple and Cumulative Regret Minimization

Neural Information Processing Systems

In many applications, e.g., in healthcare and e-commerce, the goal of a contextual bandit may be to learn an optimal treatment assignment policy at the end of the experiment, that is, to minimize simple regret. However, this objective remains understudied. We propose a new family of computationally efficient bandit algorithms for the stochastic contextual bandit setting, where a tuning parameter determines the weight placed on cumulative regret minimization (where we establish near-optimal minimax guarantees) versus simple regret minimization (where we establish state-of-the-art guarantees). Our algorithms work with any function class, are robust to model misspecification, and can be used in continuous arm settings. This flexibility comes from constructing and relying on "conformal arm sets" (CASs). CASs provide a set of arms for every context, encompassing the context-specific optimal arm with a certain probability across the context distribution. Our positive results on simple and cumulative regret guarantees are contrasted with a negative result, which shows that no algorithm can achieve instance-dependent simple regret guarantees while simultaneously achieving minimax optimal cumulative regret guarantees.
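
The two objectives can be made concrete with a toy simulation: cumulative regret adds up the per-round gap to the context-wise best arm during the experiment, while simple regret is the gap of the policy returned at the end, measured on fresh contexts. The sketch below contrasts the two for a naive linear epsilon-greedy learner; it only illustrates the metrics and is not the paper's CAS-based algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
K, T, D = 5, 5000, 3
theta = rng.normal(size=(K, D))        # unknown per-arm linear reward parameters (toy environment)

def reward_mean(x, arm):
    return float(theta[arm] @ x)

def run_epsilon_greedy(eps=0.1):
    """Naive linear epsilon-greedy learner, used only to illustrate the two regret notions."""
    A = np.stack([np.eye(D) for _ in range(K)])   # per-arm ridge Gram matrices
    b = np.zeros((K, D))
    cum_regret = 0.0
    for _ in range(T):
        x = rng.normal(size=D)
        theta_hat = np.array([np.linalg.solve(A[a], b[a]) for a in range(K)])
        arm = int(rng.integers(K)) if rng.random() < eps else int((theta_hat @ x).argmax())
        r = reward_mean(x, arm) + rng.normal(scale=0.1)
        A[arm] += np.outer(x, x)
        b[arm] += r * x
        best = max(reward_mean(x, a) for a in range(K))
        cum_regret += best - reward_mean(x, arm)           # cumulative regret: paid while learning
    # Simple regret: quality of the policy returned at the end, evaluated on fresh contexts.
    theta_hat = np.array([np.linalg.solve(A[a], b[a]) for a in range(K)])
    gaps = []
    for _ in range(2000):
        x = rng.normal(size=D)
        means = theta @ x
        gaps.append(means.max() - means[int((theta_hat @ x).argmax())])
    return cum_regret, float(np.mean(gaps))

cum, simple = run_epsilon_greedy()
print(f"cumulative regret: {cum:.1f}   simple regret: {simple:.4f}")
```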



LithoBench: Benchmarking AI Computational Lithography for Semiconductor Manufacturing Supplementary Materials

Neural Information Processing Systems

In addition to the data and data loaders, LithoBench also provides functionalities that can facilitate the development of DNN-based and traditional ILT algorithms. Based on PyTorch [1] and OpenILT [2], we implement the reference lithography simulation model as a PyTorch module, which can be used like a DNN layer. The GPU-based fast Fourier transform (FFT) boosts the speed of lithography simulation. PyTorch optimizers can be directly employed to optimize the masks according to ILT loss functions, significantly simplifying the development of ILT algorithms. To evaluate ILT results, LithoBench provides a simple interface to measure the L2 loss, process variation band (PVB), edge placement error (EPE), and shot count of the output masks.
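
As a rough, self-contained picture of that workflow, the sketch below optimizes a mask with a stock PyTorch optimizer through a toy differentiable stand-in for the lithography model (an FFT low-pass "optics" step plus a sigmoid "resist" threshold); the stand-in simulator, loss, and hyperparameters are assumptions for illustration, not LithoBench's reference model or its PVB/EPE/shot metrics.

```python
import torch

def toy_litho_forward(mask: torch.Tensor, cutoff: float = 0.08) -> torch.Tensor:
    """Stand-in for a lithography simulator: FFT low-pass (optical blur)
    followed by a sigmoid 'resist' threshold. Not the LithoBench reference model."""
    spectrum = torch.fft.fft2(mask)
    freq = torch.fft.fftfreq(mask.shape[-1])
    fx, fy = torch.meshgrid(freq, freq, indexing="ij")
    lowpass = ((fx**2 + fy**2).sqrt() < cutoff).to(mask.dtype)
    aerial = torch.fft.ifft2(spectrum * lowpass).real
    return torch.sigmoid(50.0 * (aerial - 0.5))        # printed pattern in [0, 1]

def optimize_mask(target: torch.Tensor, steps: int = 200, lr: float = 1.0) -> torch.Tensor:
    """ILT-style loop: treat the mask as a trainable tensor and minimize an
    L2 loss between the simulated print and the target layout."""
    mask_logits = target.clone().detach().requires_grad_(True)   # initialize from the target
    opt = torch.optim.Adam([mask_logits], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        mask = torch.sigmoid(mask_logits)              # keep mask values in [0, 1]
        loss = ((toy_litho_forward(mask) - target) ** 2).mean()
        loss.backward()
        opt.step()
    return (torch.sigmoid(mask_logits) > 0.5).float()  # binarized output mask

# Toy usage: a square target pattern on a 128x128 grid.
target = torch.zeros(128, 128)
target[40:88, 40:88] = 1.0
final_mask = optimize_mask(target)
```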