Perceptual Kalman Filters: Online State Estimation under a Perfect Perceptual-Quality Constraint
Many practical settings call for the reconstruction of temporal signals from corrupted or missing data. Classic examples include decoding, tracking, signal enhancement, and denoising. Since the reconstructed signals are ultimately viewed by humans, it is desirable to achieve reconstructions that are pleasing to human perception. Mathematically, perfect perceptual quality is achieved when the distribution of restored signals matches that of natural signals, a requirement that has been studied extensively in static estimation settings (i.e., when a whole signal is processed at once). Here, we study the problem of optimal causal filtering under a perfect perceptual-quality constraint, a task of a fundamentally different nature.
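As a toy illustration only (not the paper's algorithm), the sketch below contrasts the classical MMSE Kalman update with a naive posterior-sampling reconstruction in a scalar linear-Gaussian model with made-up parameters. The MMSE output is over-smooth, while sampling the filtering posterior recovers the signal's per-step marginal distribution; matching the full joint temporal law, which the perceptual constraint above asks for, is the harder problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy scalar linear-Gaussian model (parameters are illustrative only):
#   x_t = a * x_{t-1} + w_t,  w_t ~ N(0, q)
#   y_t = x_t + v_t,          v_t ~ N(0, r)
a, q, r, T = 0.95, 0.1, 0.5, 10_000

# Simulate the signal and its noisy observations.
x = np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + rng.normal(scale=np.sqrt(q))
y = x + rng.normal(scale=np.sqrt(r), size=T)

# Causal (filtering) estimates.
m, p = 0.0, 1.0              # posterior mean and variance
x_mmse = np.zeros(T)         # classical Kalman (MMSE) output
x_samp = np.zeros(T)         # naive perceptual baseline: sample the posterior
for t in range(T):
    # Predict.
    m, p = a * m, a * a * p + q
    # Update with observation y_t.
    k = p / (p + r)
    m, p = m + k * (y[t] - m), (1 - k) * p
    x_mmse[t] = m
    x_samp[t] = m + rng.normal(scale=np.sqrt(p))

# The MMSE output has too little variance; the sampled output matches the
# signal's marginal variance, but not necessarily its joint temporal law,
# which is why causal perceptual filtering is nontrivial.
print("var(x)      ", x.var().round(3))
print("var(x_mmse) ", x_mmse.var().round(3))
print("var(x_samp) ", x_samp.var().round(3))
```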
Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification
Recent work has shown that the prompt-based learning capabilities of language models (LMs) make them well suited for automating data labeling in domains where manual annotation is expensive. The challenge is that while writing an initial prompt is cheap, improving a prompt is costly: practitioners often require significant labeled data to evaluate the impact of prompt modifications. Our work asks whether it is possible to improve prompt-based learning without additional labeled data. We approach this problem by attempting to modify the predictions of a prompt, rather than the prompt itself. Our intuition is that accurate predictions should also be consistent: samples that are similar under some feature representation should receive the same prompt prediction.
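A minimal sketch of this kind of neighborhood-based prediction smoothing, written as a generic k-nearest-neighbor majority vote (not necessarily Embroid's exact procedure); the embeddings, labels, and k below are placeholders.

```python
import numpy as np

def smooth_predictions(embeddings: np.ndarray, preds: np.ndarray, k: int = 5) -> np.ndarray:
    """Replace each prompt prediction with the majority vote of its k nearest
    neighbors (cosine similarity) in the feature space, including itself."""
    # Normalize so that dot products are cosine similarities.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = z @ z.T
    smoothed = preds.copy()
    for i in range(len(preds)):
        neighbors = np.argsort(-sims[i])[:k]      # i itself is its own nearest neighbor
        votes = np.bincount(preds[neighbors])
        smoothed[i] = votes.argmax()
    return smoothed

# Toy usage with made-up data: 6 samples, 2 classes, one inconsistent prediction.
emb = np.array([[1, 0], [0.9, 0.1], [0.8, 0.2], [0, 1], [0.1, 0.9], [0.2, 0.8]])
raw = np.array([0, 1, 0, 1, 1, 1])          # sample 1 disagrees with its neighborhood
print(smooth_predictions(emb, raw, k=3))    # -> [0 0 0 1 1 1]
```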
Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback
Exploration and reward specification are fundamental and intertwined challenges for reinforcement learning. Solving sequential decision-making tasks that require expansive exploration demands either careful design of reward functions or the use of novelty-seeking exploration bonuses. Human supervisors in the loop can provide effective guidance to direct the exploration process, but prior methods for leveraging this guidance require constant, synchronous, high-quality human feedback, which is expensive and impractical to obtain. In this work, we present a technique called Human Guided Exploration (HuGE), which uses low-quality feedback from non-expert users that may be sporadic, asynchronous, and noisy.
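A rough sketch of the general idea as we read it: exploration goals are selected from previously visited states using a scorer fitted to occasional, noisy pairwise human comparisons, and the frontier is expanded from the selected goal. The environment, the Bradley-Terry-style scorer, and all parameters below are our own placeholders, not HuGE's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: states are 2-D points and the true (hidden) target is goal_true.
# A noisy, sporadic annotator occasionally says which of two visited states looks closer to it.
goal_true = np.array([8.0, 8.0])

def noisy_preference(s_a, s_b, noise=0.3):
    """1 if the annotator (noisily) prefers s_a over s_b, else 0."""
    better = np.linalg.norm(s_a - goal_true) < np.linalg.norm(s_b - goal_true)
    return int(better) if rng.random() > noise else 1 - int(better)

# Linear Bradley-Terry-style goal scorer trained from accumulated comparisons.
w = np.zeros(2)
def score(s):
    return float(s @ w)

frontier = [np.zeros(2)]   # states reached so far (candidate exploration goals)
comparisons = []           # (state_a, state_b, preference) triples from the annotator

for episode in range(300):
    # Pick the exploration goal the learned scorer likes best ...
    goal = max(frontier, key=score)
    # ... pretend a goal-conditioned policy reaches it, then expand the frontier
    # with a short random walk from there (the "breadcrumbs").
    state = goal.copy()
    for _ in range(5):
        state = state + rng.normal(scale=0.3, size=2)
        frontier.append(state.copy())

    # Sporadic, asynchronous feedback: only some episodes yield a comparison.
    if rng.random() < 0.3:
        i, j = rng.choice(len(frontier), size=2, replace=False)
        comparisons.append((frontier[i], frontier[j], noisy_preference(frontier[i], frontier[j])))

    # Refit the scorer with SGD steps on the logistic (Bradley-Terry) loss.
    for s_a, s_b, pref in comparisons:
        p = 1.0 / (1.0 + np.exp(-np.clip(score(s_a) - score(s_b), -30, 30)))
        w += 0.05 * (pref - p) * (s_a - s_b)

closest = min(frontier, key=lambda s: np.linalg.norm(s - goal_true))
print("closest visited state to the hidden target:", closest.round(2))
```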
A Proof of Proposition 2.5
Proposition 2.5 is a direct consequence of the following lemma (recall that ∇h(θ) = [∂h(θ)]⊤).

Assume that ∂h(θ)[...] = 0 for all θ ∈ Θ. Let us first show the direct inclusion. [...] Now let us show the converse inclusion. [...] Let us show that [∂h(θ)...]. By taking X = x and Y = y (i.e., a data set of one feature and one target), one has, still by the chain rule, ∇E[...].

We recall (cf. Example 2.10 and Example 2.11) that linear and 2-layer ReLU neural networks satisfy Assumption 2.9, which we recall reads as:

Assumption 2.9 (Local reparameterization). For each parameter [...]

To proceed further we rely on the following lemma, a direct consequence of (9) (in addition to Assumption 2.9 on the model g(θ, ·)): under Assumption 2.9, considering a loss ℓ(z, y) such that ℓ(·, y) is C[...]. Before proceeding to the proofs of Lemma 2.13 and Theorem 2.14, let us show that (9) holds for standard ML losses.
Abide by the Law and Follow the Flow: Conservation Laws for Gradient Flows
Understanding the geometric properties of gradient descent dynamics is a key ingredient in deciphering the recent success of very large machine learning models. A striking observation is that trained over-parameterized models retain some properties of their initialization. This "implicit bias" is believed to be responsible for some favorable properties of the trained models and could explain their good generalization. The purpose of this article is threefold. First, we rigorously present the definition and basic properties of "conservation laws", which are quantities conserved during gradient flows of a given model (e.g., a ReLU network with a given architecture) with any training data and any loss.
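As a concrete textbook instance of such a conserved quantity (not this paper's general machinery): for the scalar two-layer linear model f(x) = u·v·x trained by gradient flow on any differentiable loss, the balancedness u² − v² is conserved, since ∂L/∂u = v·x·ℓ′ and ∂L/∂v = u·x·ℓ′, so u·du/dt − v·dv/dt = 0. The sketch below checks this numerically with small gradient-descent steps on made-up data.

```python
import numpy as np

# Toy data and loss (made up): fit f(x) = u * v * x to targets with squared error.
rng = np.random.default_rng(0)
x, y = rng.normal(size=32), rng.normal(size=32)

u, v, lr = 1.5, 0.2, 1e-3
balance0 = u**2 - v**2

for _ in range(20_000):
    resid = u * v * x - y                   # dl/df for the squared loss (up to a factor 2)
    grad_u = np.mean(2 * resid * v * x)     # dL/du by the chain rule
    grad_v = np.mean(2 * resid * u * x)     # dL/dv by the chain rule
    u, v = u - lr * grad_u, v - lr * grad_v

# With a small step size (approximating the gradient flow),
# the balancedness u^2 - v^2 stays essentially constant during training.
print(f"u^2 - v^2 at init: {balance0:.6f}, after training: {u**2 - v**2:.6f}")
```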
Explore to Generalize in Zero-Shot RL
We study zero-shot generalization in reinforcement learning: optimizing a policy on a set of training tasks so that it performs well on a similar but unseen test task. To mitigate overfitting, previous work explored different notions of invariance to the task. However, on problems such as the ProcGen Maze, an adequate solution that is invariant to the task visualization does not exist, and therefore invariance-based approaches fail. Our insight is that a policy that effectively explores the domain is harder to memorize than a policy that maximizes reward for a specific task, and we therefore expect such learned behavior to generalize well; we indeed demonstrate this empirically on several domains that are difficult for invariance-based approaches. Our Explore to Generalize algorithm (ExpGen) builds on this insight: we train an additional ensemble of agents that optimize reward. At test time, either the ensemble agrees on an action, and we generalize well, or we take exploratory actions, which generalize well and drive us to a novel part of the state space, where the ensemble may potentially agree again. We show that our approach achieves state-of-the-art performance on tasks of the ProcGen challenge that have thus far eluded effective generalization, yielding a success rate of 83% on the Maze task and 74% on Heist with 200 training levels. ExpGen can also be combined with an invariance-based approach to gain the best of both worlds, setting new state-of-the-art results on ProcGen.
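A minimal sketch of the test-time decision rule described above (ensemble agreement gating), assuming discrete actions; the policies below are placeholder functions, not trained ProcGen agents.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

def expgen_act(obs, reward_policies, explore_policy, agreement=1.0):
    """ExpGen-style action selection at test time: if enough of the ensemble of
    reward-maximizing policies agrees on an action, take it; otherwise fall back
    to the exploration policy to reach a new part of the state space."""
    actions = [pi(obs) for pi in reward_policies]
    action, count = Counter(actions).most_common(1)[0]
    if count / len(actions) >= agreement:
        return action                  # ensemble agrees -> exploit
    return explore_policy(obs)         # ensemble disagrees -> explore

# Toy usage with placeholder policies over 4 discrete actions.
ensemble = [lambda obs, b=b: int((obs.sum() + b) % 4) for b in (0, 0, 0, 1)]
explorer = lambda obs: int(rng.integers(4))

obs = np.array([1.0, 2.0])
print(expgen_act(obs, ensemble, explorer, agreement=0.75))   # 3 of 4 agree -> exploit
```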
Going beyond persistent homology using persistent homology
Representational limits of message-passing graph neural networks (MP-GNNs), e.g., in terms of the Weisfeiler-Leman (WL) test for isomorphism, are well understood. Augmenting these graph models with topological features via persistent homology (PH) has gained prominence, but identifying the class of attributed graphs that PH can recognize remains open. We introduce a novel concept of color-separating sets to provide a complete resolution to this important problem. Specifically, we establish the necessary and sufficient conditions for distinguishing graphs based on the persistence of their connected components, obtained from filter functions on vertex and edge colors. Our constructions expose the limits of vertex- and edge-level PH, proving that neither category subsumes the other. Leveraging these theoretical insights, we propose RePHINE for learning topological features on graphs. RePHINE efficiently combines vertex- and edge-level PH, achieving a scheme that is provably more powerful than both. Integrating RePHINE into MP-GNNs boosts their expressive power, resulting in gains over standard PH on several benchmarks for graph classification.
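For readers unfamiliar with the persistence of connected components, the sketch below computes 0-dimensional persistence pairs for a graph with vertex filter values using a standard union-find sweep; this is the generic construction, not RePHINE itself, and the filter values and edges are made up.

```python
def zeroth_persistence(filter_values, edges):
    """0-dim persistence pairs (birth, death) of the sublevel filtration of a graph
    given by vertex filter values; an edge enters at the max of its endpoint values.
    Components that never merge get death = inf (elder rule)."""
    n = len(filter_values)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    pairs = []
    # Process edges in order of appearance in the filtration.
    for u, v in sorted(edges, key=lambda e: max(filter_values[e[0]], filter_values[e[1]])):
        t = max(filter_values[u], filter_values[v])
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                        # edge creates a cycle, no 0-dim event
        # Elder rule: the younger component (larger birth value) dies at time t.
        if filter_values[ru] > filter_values[rv]:
            ru, rv = rv, ru
        pairs.append((filter_values[rv], t))
        parent[rv] = ru
    # Surviving components never die.
    roots = {find(i) for i in range(n)}
    pairs += [(filter_values[r], float("inf")) for r in roots]
    return sorted(pairs)

# Toy graph: a path 0-1-2 plus an isolated edge 3-4, with made-up filter values
# (e.g., derived from vertex colors/attributes).
f = [0.0, 0.5, 0.2, 0.1, 0.3]
E = [(0, 1), (1, 2), (3, 4)]
print(zeroth_persistence(f, E))
```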
Open Compound Domain Adaptation with Object Style Compensation for Semantic Segmentation - Supplementary Material
In our implementation, we adopt a warm-up strategy and pretrain the backbone network for 50,000 iterations. After the warm-up phase, we generate pseudo annotations for the training data in the target domain and determine the initial values of the category-key features from source images. Next, we initialize the representative-key features and discrepancy features of each set in OLDM using a First-Input, First-Output (FIFO) queue. To ensure comprehensive initialization of all sets in OLDM, this phase extends over 4,000 iterations. Upon completion, our method can carry out Discrepancy Memorization and Style Compensation. In this section, we provide an extensive and comprehensive set of experiments.
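To make the FIFO initialization concrete, here is a minimal sketch of such a fixed-capacity feature queue; the class name OLDMSet, the feature dimension, and the capacity are our own placeholders, not the paper's code.

```python
from collections import deque
import numpy as np

class OLDMSet:
    """Hypothetical fixed-capacity memory for one set of OLDM: newly extracted
    features push out the oldest ones (FIFO)."""
    def __init__(self, capacity: int = 32, dim: int = 256):
        self.keys = deque(maxlen=capacity)           # representative-key features
        self.discrepancies = deque(maxlen=capacity)  # discrepancy features
        self.dim = dim

    def push(self, key_feat: np.ndarray, disc_feat: np.ndarray) -> None:
        self.keys.append(key_feat)
        self.discrepancies.append(disc_feat)

    def representative(self) -> np.ndarray:
        """A simple summary of the set: the mean of the stored key features."""
        return np.mean(self.keys, axis=0)

# During the initialization iterations, each batch's features are pushed in;
# once the queues are full, the oldest entries are discarded automatically.
memory = OLDMSet(capacity=4)
for _ in range(10):
    memory.push(np.random.rand(256), np.random.rand(256))
print(len(memory.keys), memory.representative().shape)   # -> 4 (256,)
```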