Figure: (a) contribution of each component in RPS-Net; (b) # parameters vs. tasks (Progressive Nets, RPS-Net); (c) RPS-Net vs. iCARL for different numbers of exemplars.
We thank the reviewers for the constructive feedback. Code will be made public. Fig. (a, b, c) best viewed with zoom.

R2.1: Difference from PathNet: Our RPS-Net is inspired by PathNet, yet there are notable differences. 1) Architecture: … However, for our case, i.e., 10+ tasks, PathNet is not feasible due to the large number … See R3.1 for a comparison between random selection and genetic algorithms.

R2.2: Impact of Varying Exemplars: Fig. (c) compares RPS-Net with the best existing method (iCARL) for various exemplar budgets. Our proposed RPS-Net consistently performs better across all budgets.
We appreciate the reviewers' efforts and suggestions (in blue)! We will answer the shared question and then reply to each reviewer. Tasks may prefer different distance metrics, but most physical systems have their own predefined ones, e.g., for … We will add a discussion of those works. For a fair comparison, we need to modify either HER/CHER or the baselines, like [Nair et al.] … How sensitive is the performance to this parameter?
StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving
Most existing prompting methods suffer from issues of generalizability and consistency, as they often rely on instance-specific solutions that may not be applicable to other instances and lack task-level consistency across the selected few-shot examples. To address these limitations, we propose a comprehensive framework, StrategyLLM, allowing LLMs to perform inductive reasoning, deriving general strategies from specific task instances, and deductive reasoning, applying these general strategies to particular task examples, for constructing generalizable and consistent few-shot prompts. It employs four LLM-based agents: strategy generator, executor, optimizer, and evaluator, working together to generate, evaluate, and select promising strategies for a given task. Experimental results demonstrate that StrategyLLM outperforms the competitive baseline CoT-SC, which requires human-annotated solutions, on 13 datasets across 4 challenging tasks without human involvement, including math reasoning (34.2% → 38.8%), commonsense reasoning (70.3% → 72.5%), algorithmic reasoning (73.7% → 85.0%), and symbolic reasoning (30.0% →
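The four-agent loop in the abstract can be sketched in plain Python. Everything below is a hypothetical stand-in: the agents are simple callables rather than prompted LLMs, and the strategy names, the dev-set format, and the scoring rule are illustrative assumptions, not the paper's actual design.

```python
def strategy_generator(task_examples):
    """Inductive step: propose candidate general strategies from task instances.
    Hypothetical: a real system would prompt an LLM here."""
    return [{"name": "add-then-check"}, {"name": "decompose"}]

def strategy_executor(strategy, instance):
    """Deductive step: apply one general strategy to one instance.
    Hypothetical executor: only 'add-then-check' solves these sum problems."""
    if strategy["name"] == "add-then-check":
        return sum(instance["numbers"])
    return None

def strategy_evaluator(strategy, dev_set):
    """Score a strategy by accuracy on a small development set."""
    correct = sum(
        strategy_executor(strategy, inst) == inst["answer"] for inst in dev_set
    )
    return correct / len(dev_set)

def strategy_optimizer(strategies, dev_set, top_k=1):
    """Keep only the most promising strategies for prompt construction."""
    scored = sorted(
        strategies, key=lambda s: strategy_evaluator(s, dev_set), reverse=True
    )
    return scored[:top_k]

# Toy development set of arithmetic instances (illustrative only).
dev = [{"numbers": [1, 2], "answer": 3}, {"numbers": [4, 5], "answer": 9}]
best = strategy_optimizer(strategy_generator(dev), dev)
```

The selected strategies would then be turned into few-shot prompt exemplars, which is the step the sketch omits.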
Latent Intrinsics Emerge from Training to Relight William Gao
Image relighting is the task of showing what a scene from a source image would look like if illuminated differently. Inverse graphics schemes recover an explicit representation of geometry and a set of chosen intrinsics, then relight with some form of renderer. However, error control for inverse graphics is difficult, and inverse graphics methods can represent only the effects of the chosen intrinsics. This paper describes a relighting method that is entirely data-driven, where intrinsics and lighting are each represented as latent variables. Our approach produces SOTA relightings of real scenes, as measured by standard metrics. We show that albedo can be recovered from our latent intrinsics without using any example albedos, and that the albedos recovered are competitive with SOTA methods.
Categorized Bandits
We introduce a new stochastic multi-armed bandit setting where arms are grouped inside "ordered" categories. The motivating example comes from e-commerce, where a customer typically has a greater appetence for items of a specific well-identified but unknown category than any other one. We introduce three concepts of ordering between categories, inspired by stochastic dominance between random variables, which are gradually weaker so that more and more bandit scenarios satisfy at least one of them. We first prove instance-dependent lower bounds on the cumulative regret for each of these models, indicating how the complexity of the bandit problems increases with the generality of the ordering concept considered. We also provide algorithms that fully leverage the structure of the model with their associated theoretical guarantees. Finally, we have conducted an analysis on real data to highlight that those ordered categories actually exist in practice.
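The strongest of the ordering concepts mentioned, first-order stochastic dominance, can be checked empirically from reward samples. This is a minimal sketch of that check, not one of the paper's algorithms; the sample lists and the function name are illustrative assumptions.

```python
def empirical_cdf(samples, x):
    """Fraction of samples at or below x."""
    return sum(s <= x for s in samples) / len(samples)

def dominates_fo(a, b):
    """Empirical first-order stochastic dominance: A dominates B when
    F_A(x) <= F_B(x) at every point, i.e. A's CDF never sits above B's."""
    grid = sorted(set(a) | set(b))
    return all(empirical_cdf(a, x) <= empirical_cdf(b, x) for x in grid)

# Toy reward samples: one clearly preferred category, one not.
high = [0.9, 1.0, 1.1]
low = [-0.1, 0.0, 0.1]
```

Dominance is a partial order, which is why the paper needs progressively weaker notions to cover more bandit scenarios.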
The Power of Hard Attention Transformers on Data Sequences: A Formal Language Theoretic Perspective Chris Köcher RPTU Kaiserslautern-Landau
Formal language theory has recently been successfully employed to unravel the power of transformer encoders. This setting is primarily applicable in Natural Language Processing (NLP), as a token embedding function (where a bounded number of tokens is admitted) is first applied before feeding the input to the transformer.
Lightweight Frequency Masker for Cross-Domain Few-Shot Semantic Segmentation
Cross-domain few-shot segmentation (CD-FSS) is proposed to first pre-train the model on a large-scale source-domain dataset, and then transfer the model to data-scarce target-domain datasets for pixel-level segmentation. The significant domain gap between the source and target datasets leads to a sharp decline in the performance of existing few-shot segmentation (FSS) methods in cross-domain scenarios. In this work, we discover an intriguing phenomenon: simply filtering different frequency components for target domains can lead to a significant performance improvement, sometimes even as high as 14% mIoU. We then delve into this phenomenon for an interpretation, and find such improvements stem from the reduced inter-channel correlation in feature maps, which benefits CD-FSS with enhanced robustness against domain gaps and larger activated regions for segmentation. Based on this, we propose a lightweight frequency masker, which further reduces channel correlations by an Amplitude-Phase Masker (APM) module and an Adaptive Channel Phase Attention (ACPA) module. Notably, APM introduces only 0.01% additional parameters but improves the average performance by over 10%, and ACPA introduces only 2.5% additional parameters but further improves the performance by over 1.5%, which significantly surpasses the state-of-the-art CD-FSS methods.
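The "filtering different frequency components" idea can be illustrated with a fixed low-pass amplitude mask on a feature map. This is a hand-written sketch of amplitude masking with preserved phase, assuming NumPy; the learned APM/ACPA modules in the paper are not reproduced here, and `keep_ratio` and the radial mask are illustrative choices.

```python
import numpy as np

def amplitude_mask(feat, keep_ratio=0.25):
    """Zero high-frequency amplitudes per channel, keeping phase intact.
    feat: (C, H, W) feature map; keep_ratio: fraction of the spectrum kept."""
    c, h, w = feat.shape
    spec = np.fft.fft2(feat)                     # per-channel 2-D FFT
    spec = np.fft.fftshift(spec, axes=(-2, -1))  # move low freqs to center
    amp, phase = np.abs(spec), np.angle(spec)
    yy, xx = np.mgrid[:h, :w]
    cy, cx = h // 2, w // 2
    radius = keep_ratio * min(h, w) / 2
    mask = ((yy - cy) ** 2 + (xx - cx) ** 2) <= radius ** 2
    amp = amp * mask                             # drop amplitudes outside the disc
    spec = np.fft.ifftshift(amp * np.exp(1j * phase), axes=(-2, -1))
    return np.fft.ifft2(spec).real

rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 32, 32))          # stand-in feature map
filtered = amplitude_mask(feat)
```

Since zeroing spectral amplitudes removes energy (Parseval), the filtered features carry strictly less signal energy than the input, which is the knob the paper's masker learns to turn per channel.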
Max-value Entropy Search for Multi-Objective Bayesian Optimization
Syrine Belakaria, Aryan Deshwal, Janardhan Rao Doppa
We consider the problem of multi-objective (MO) blackbox optimization using expensive function evaluations, where the goal is to approximate the true Pareto set of solutions while minimizing the number of function evaluations. For example, in hardware design optimization, we need to find the designs that trade off performance, energy, and area overhead using expensive computational simulations. In this paper, we propose a novel approach referred to as Max-value Entropy Search for Multi-objective Optimization (MESMO) to solve this problem. MESMO employs an output-space entropy based acquisition function to efficiently select the sequence of inputs for evaluation to quickly uncover high-quality Pareto-set solutions. We also provide theoretical analysis to characterize the efficacy of MESMO. Our experiments on several synthetic and real-world benchmark problems show that MESMO consistently outperforms the state-of-the-art algorithms.