Learning Action and Reasoning-Centric Image Editing from Videos and Simulations

Neural Information Processing Systems

An image editing model should be able to perform diverse edits, ranging from object replacement and changing attributes or style to performing actions or movement, which require many forms of reasoning. Current general instruction-guided editing models have significant shortcomings with action- and reasoning-centric edits. Object, attribute, or stylistic changes can be learned from visually static datasets. High-quality data for action- and reasoning-centric edits, on the other hand, is scarce and has to come from entirely different sources, such as videos and simulations.


Randomized Truthful Auctions with Learning Agents

Neural Information Processing Systems

We study a setting where agents use no-regret learning algorithms to participate in repeated auctions. Kolumbus and Nisan (2022a) showed, rather surprisingly, that when bidders participate in second-price auctions using no-regret bidding algorithms, no matter how large the number of interactions T is, the runner-up bidder may not converge to bidding truthfully. Our first result shows that this holds for general deterministic truthful auctions. We also show that the ratio of the learning rates of the bidders can qualitatively affect the convergence of the bidders. Next, we consider the problem of revenue maximization in this environment. In the setting with fully rational bidders, Myerson (1981) showed that revenue can be maximized by using a second-price auction with reserves. We show that, in stark contrast, in our setting with learning bidders, randomized auctions can have strictly better revenue guarantees than second-price auctions with reserves, when T is large enough. Finally, we study revenue maximization in the non-asymptotic regime. We define a notion of auctioneer regret that compares the revenue generated to the revenue of a second-price auction with truthful bids.
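As a concrete illustration of the setting (not of the paper's constructions), here is a minimal simulation of two Hedge-style no-regret bidders in a repeated second-price auction; the values, bid grid, and learning rates are illustrative choices of our own:

```python
import numpy as np

rng = np.random.default_rng(0)

def hedge_second_price(values=(0.8, 0.6), etas=(0.1, 0.1), T=20000, grid=21):
    """Two Hedge (multiplicative-weights) bidders in a repeated second-price
    auction over a discrete bid grid. Full-information updates: each bidder
    scores every candidate bid against the opponent's realized bid."""
    bids = np.linspace(0.0, 1.0, grid)
    logw = [np.zeros(grid), np.zeros(grid)]   # log-weights, kept normalized
    avg_bid = [0.0, 0.0]
    for _ in range(T):
        probs = []
        for lw in logw:
            p = np.exp(lw - lw.max())
            probs.append(p / p.sum())
        b = [rng.choice(bids, p=p) for p in probs]
        for i in range(2):
            opp = b[1 - i]
            # counterfactual second-price utility of each grid bid (ties split evenly)
            util = np.where(bids > opp, values[i] - opp, 0.0)
            util = np.where(bids == opp, 0.5 * (values[i] - opp), util)
            logw[i] += etas[i] * util
            logw[i] -= logw[i].max()          # renormalize for numerical stability
            avg_bid[i] += b[i] / T
    return avg_bid

# The runner-up's time-averaged bid need not approach its value (0.6 here).
print(hedge_second_price())
```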


Ensemble of Averages: Improving Model Selection and Boosting Performance in Domain Generalization

Neural Information Processing Systems

In Domain Generalization (DG) settings, models trained independently on a given set of training domains have notoriously chaotic performance on distribution-shifted test domains, and stochasticity in optimization (e.g., the choice of random seed) compounds this instability.
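A minimal sketch of the two ingredients suggested by the title, assuming a uniform moving average over training checkpoints and probability-level ensembling; the function names and averaging scheme are our own illustrative choices, not the paper's exact recipe:

```python
import torch

@torch.no_grad()
def update_moving_average(avg_model, model, n_seen):
    """Uniform moving average of parameters: avg <- avg + (p - avg) / (n_seen + 1),
    applied after each checkpoint of a single training run."""
    for p_avg, p in zip(avg_model.parameters(), model.parameters()):
        p_avg.add_(p - p_avg, alpha=1.0 / (n_seen + 1))

@torch.no_grad()
def ensemble_of_averages(avg_models, x):
    """Average the predicted class probabilities of independently trained,
    weight-averaged models (one averaged model per training run)."""
    probs = torch.stack([m(x).softmax(dim=-1) for m in avg_models])
    return probs.mean(dim=0)
```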


We would like to thank all the reviewers for writing the insightful comments, especially during this difficult time

Neural Information Processing Systems

Table 1: Starting from one auxiliary task (Exemplar-MT), we keep increasing the number of auxiliary tasks from one to four by adding one task at a time. Table 2: Effect of ARML on harmful tasks.
We would like to thank all the reviewers for writing the insightful comments, especially during this difficult time. By 'covered', we mean p(T
R1.2 - Our contribution and difference from previous works: Thanks. However, lowering training loss may cause overfitting, especially when training data is scarce. The superiority of ARML is verified in the experiments.


Soft ascent-descent as a stable and flexible alternative to flooding

Neural Information Processing Systems

As a heuristic for improving test accuracy in classification, the "flooding" method proposed by Ishida et al. (2020) sets a threshold for the average surrogate loss at training time; above the threshold, gradient descent is run as usual, but below the threshold, a switch to gradient ascent is made. While setting the threshold is nontrivial and is usually done with validation data, this simple technique has proved remarkably effective in terms of accuracy. On the other hand, what if we are also interested in other metrics such as model complexity or average surrogate loss at test time? As an attempt to achieve better overall performance with less fine-tuning, we propose a softened, pointwise mechanism called SoftAD (soft ascent-descent) that downweights points on the borderline, limits the effects of outliers, and retains the ascent-descent effect of flooding, with no additional computational overhead. We contrast formal stationarity guarantees with those for flooding, and empirically demonstrate how SoftAD can realize classification accuracy competitive with flooding (and the more expensive alternative SAM) while enjoying a much smaller loss generalization gap and model norm.
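For concreteness, here is the flooding objective of Ishida et al. (2020) together with one possible pointwise softening in the spirit of SoftAD; the smooth surrogate below is a hypothetical stand-in of our own, not the paper's exact mechanism:

```python
import torch
import torch.nn.functional as F

def flooding_loss(logits, targets, b=0.05):
    """Flooding (Ishida et al., 2020): gradient ascent whenever the *average*
    surrogate loss dips below the flood level b."""
    avg = F.cross_entropy(logits, targets)
    return (avg - b).abs() + b   # gradient flips sign below the threshold b

def soft_ascent_descent_loss(logits, targets, theta=0.05, eps=1e-2):
    """Hypothetical pointwise softened variant (NOT the paper's exact SoftAD):
    a smooth |.| surrogate applied per example, whose gradient shrinks toward
    zero for points near the threshold theta, downweighting borderline points
    and limiting the pull of outliers."""
    per_example = F.cross_entropy(logits, targets, reduction="none")
    soft_abs = torch.sqrt((per_example - theta) ** 2 + eps)  # smooth |x|
    return soft_abs.mean() + theta
```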


Learnability Matters: Active Learning for Video Captioning

Neural Information Processing Systems

This work focuses on active learning for video captioning. In particular, we propose to address the learnability problem in active learning, which is brought about by collective outliers in video captioning and has been neglected in the literature. To start with, we conduct a comprehensive study of collective outliers, exploring their hard-to-learn property and concluding that ground-truth inconsistency is one of the main causes. Motivated by this, we design a novel active learning algorithm that takes three complementary aspects, namely learnability, diversity, and uncertainty, into account. Ideally, learnability is reflected by ground-truth consistency. In the active learning scenario, where ground truths are not available until humans are involved, we measure consistency on estimated ground truths, using predictions from off-the-shelf models as approximations. These predictions are further used to estimate sample frequency and reliability, reflecting diversity and uncertainty, respectively. With the help of our novel caption-wise active learning protocol, our algorithm is capable of leveraging knowledge from humans in a more effective and intelligent manner. Results on publicly available video captioning datasets with diverse video captioning models demonstrate that our algorithm outperforms SOTA active learning methods by a large margin; e.g., we achieve about 103% of full performance on CIDEr with 25% of human annotations on MSR-VTT.
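A schematic sketch of how the three aspects could be combined into a single acquisition score; the linear combination, weights, and names below are illustrative assumptions of ours, not the paper's exact formulation:

```python
import numpy as np

def acquisition_scores(consistency, frequency, uncertainty,
                       alpha=1.0, beta=1.0, gamma=1.0):
    """Schematic acquisition score over unlabeled samples.

    consistency : agreement among captions predicted by off-the-shelf models
                  (proxy for learnability; higher = more learnable)
    frequency   : estimated sample frequency (lower = more diverse)
    uncertainty : estimated unreliability of predictions (higher = more informative)
    """
    learnability = alpha * consistency
    diversity = beta * (1.0 - frequency)
    return learnability + diversity + gamma * uncertainty

def select_batch(scores, k):
    """Pick the top-k unlabeled samples for caption-wise human annotation."""
    return np.argsort(scores)[::-1][:k]
```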


4f20f7f5d2e7a1b640ebc8244428558c-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their valuable comments and suggestions, which will help us to improve the paper. While the usual "twin-delay" only involves minimization of two scalars, the one in IDAC
Besides, we introduce SIA and an asymptotic lower bound for entropy estimation. These details will be added.
R2: 1) We will clarify that sorting is applied to each individual vector.
To help address R3's concern, we
This task requires extensive exploration to reach the optimal solution.


GC-Bench: An Open and Unified Benchmark for Graph Condensation Appendix

Neural Information Processing Systems

We also provide public access to the official algorithm implementations. "KRR" is short for Kernel Ridge Regression, "CTC" is short for computation tree compression, "GNN" is short for graph neural network, "GNTK" is short for graph neural tangent kernel, and "SD" is short for spectral decomposition. "NC" is short for node classification, "LP" is short for link prediction, "AD" is short for anomaly detection, and "GC" is short for graph classification.


GC-Bench: An Open and Unified Benchmark for Graph Condensation

Neural Information Processing Systems

Graph condensation (GC) has recently garnered considerable attention due to its ability to reduce large-scale graph datasets while preserving their essential properties. The core concept of GC is to create a smaller, more manageable graph that retains the characteristics of the original graph. Despite the proliferation of graph condensation methods developed in recent years, there is no comprehensive evaluation and in-depth analysis, which creates a great obstacle to understanding the progress in this field. To fill this gap, we develop a comprehensive Graph Condensation Benchmark (GC-Bench) to systematically analyze the performance of graph condensation in different scenarios. Specifically, GC-Bench investigates the characteristics of graph condensation along the following dimensions: effectiveness, transferability, and complexity. We comprehensively evaluate 12 state-of-the-art graph condensation algorithms on node-level and graph-level tasks and analyze their performance on 12 diverse graph datasets. Further, we have developed an easy-to-use library for training and evaluating different GC methods to facilitate reproducible research.
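A minimal sketch of the common effectiveness protocol for graph condensation (train on the small condensed graph, test on the original graph); the dense two-layer GCN, hyperparameters, and all names are our own simplifications, not GC-Bench's API:

```python
import torch
import torch.nn.functional as F

def gcn_forward(X, A_hat, W1, W2):
    """Two-layer GCN with a precomputed (dense) normalized adjacency A_hat."""
    return A_hat @ torch.relu(A_hat @ X @ W1) @ W2

def evaluate_condensed(X_syn, A_syn, y_syn, X_full, A_full, y_full, test_mask,
                       hidden=64, epochs=200, lr=0.01):
    """Train a small GCN on the condensed graph, then measure accuracy on the
    original graph's test nodes -- the usual effectiveness protocol for GC."""
    d, c = X_syn.shape[1], int(y_syn.max()) + 1
    W1 = (0.1 * torch.randn(d, hidden)).requires_grad_()
    W2 = (0.1 * torch.randn(hidden, c)).requires_grad_()
    opt = torch.optim.Adam([W1, W2], lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = F.cross_entropy(gcn_forward(X_syn, A_syn, W1, W2), y_syn)
        loss.backward()
        opt.step()
    with torch.no_grad():
        pred = gcn_forward(X_full, A_full, W1, W2).argmax(dim=1)
        return (pred[test_mask] == y_full[test_mask]).float().mean().item()
```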


Tractable Latent State Inference for Hidden Continuous-Time Semi-Markov Chains (Supplement). Source code at: https://anonymous.4open.science/r/BFghJAGII31

Neural Information Processing Systems

We will first derive an analogue of equation (20) for the backward case. The derivation parallels that of the forward equation, in that it combines equations (16), (18), and (19) while leaving out the observation likelihood function. The combination is again carried out using the Laplace transform.