FreqMark: Invisible Image Watermarking via Frequency Based Optimization in Latent Space

Neural Information Processing Systems

However, existing watermarking methods fall short in robustness against regeneration attacks. In this paper, we propose FreqMark, a novel method that performs unconstrained optimization in the frequency space of the image latent obtained after VAE encoding.
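
A minimal sketch of this pipeline, assuming a pretrained VAE exposing encode()/decode() and a caller-supplied loss that scores whether the watermark message survives in the decoded image; all names and interfaces here are illustrative, not the authors' implementation:

    import torch

    def embed_watermark(image, vae, message_loss_fn, steps=100, lr=1e-2, fidelity_weight=0.1):
        z = vae.encode(image).detach()                         # image latent, e.g. (B, C, H, W)
        z_freq = torch.fft.fft2(z)                             # latent in frequency space
        delta = torch.zeros_like(z_freq, requires_grad=True)   # unconstrained frequency perturbation
        opt = torch.optim.Adam([delta], lr=lr)
        for _ in range(steps):
            z_wm = torch.fft.ifft2(z_freq + delta).real        # back to latent space
            x_wm = vae.decode(z_wm)                            # candidate watermarked image
            # trade off message recoverability against visual fidelity
            loss = message_loss_fn(x_wm) + fidelity_weight * ((x_wm - image) ** 2).mean()
            opt.zero_grad(); loss.backward(); opt.step()
        return vae.decode(torch.fft.ifft2(z_freq + delta.detach()).real)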


Spectral Modification of Graphs for Improved Spectral Clustering

Neural Information Processing Systems

Spectral clustering algorithms provide approximate solutions to hard optimization problems that formulate graph partitioning in terms of the graph conductance. It is well understood that the quality of these approximate solutions is negatively affected by a possibly significant gap between the conductance and the second eigenvalue of the graph. In this paper, we show that for any graph G, there exists a 'spectral maximizer' graph H which is cut-similar to G, but has eigenvalues that are near the theoretical limit implied by the cut structure of G. Applying spectral clustering to H then has the potential to produce improved cuts that also exist in G, due to the cut similarity. This leads to the second contribution of this work: we describe a practical spectral modification algorithm that raises the eigenvalues of the input graph while preserving its cuts. Combined with spectral clustering on the modified graph, this yields demonstrably improved cuts.
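
For context, the gap referred to here is quantified by Cheeger's inequality, which bounds the conductance φ(G) in terms of the second-smallest eigenvalue λ₂ of the normalized Laplacian:

    \frac{\lambda_2}{2} \;\le\; \phi(G) \;\le\; \sqrt{2\lambda_2}

A spectral maximizer H pushes λ₂ toward the upper end permitted by the cut structure, shrinking the slack in the lower bound and hence the approximation loss incurred by the spectral relaxation.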


Author feedback

Neural Information Processing Systems

We sincerely thank all the reviewers for their insightful comments to help us improve the paper. We will clarify the unclear points and update the paper accordingly in the final version. To Reviewer #2: 1. Are multiple sources more beneficial? This is largely due to the fact that a domain gap also exists among different source domains. 2. We will reorganize the layout of Figure 1 in the main paper to make it clearer; we thank the reviewer for pointing this out. We adopt the architectures from CycleGAN [38]: 9 residual blocks for the generator and 4 convolution layers for the discriminator.


Evidential Mixture Machines: Deciphering Multi-Label Correlations for Active Learning Sensitivity

Neural Information Processing Systems

Multi-label active learning is a crucial yet challenging area in contemporary machine learning, often complicated by a large and sparse label space. This challenge is further exacerbated in active learning scenarios where labeling resources are constrained. Drawing inspiration from existing mixture of Bernoulli models, which efficiently compress the label space into a more manageable weight coefficient space by learning correlated Bernoulli components, we propose a novel model called Evidential Mixture Machines (EMM). Our model leverages mixture components derived from unsupervised learning in the label space and improves prediction accuracy by predicting weight coefficients following the evidential learning paradigm. These coefficients are aggregated as proxy pseudo counts to enhance component offset predictions. The evidential learning approach provides an uncertainty-aware connection between input features and the predicted coefficients and components. Additionally, our method combines evidential uncertainty with predicted label embedding covariances for active sample selection, creating a richer, multi-source uncertainty metric beyond traditional uncertainty scores. Experiments on synthetic datasets show the effectiveness of evidential uncertainty prediction and EMM's capability to capture label correlations through predicted components. Further testing on real-world datasets demonstrates improved performance compared to existing multi-label active learning methods.
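
A hedged sketch of the coefficient-prediction step, assuming pre-learned Bernoulli components and a standard evidential (Dirichlet) head; the architecture and uncertainty form are illustrative assumptions, not the paper's exact model:

    import torch.nn as nn
    import torch.nn.functional as F

    class EMMHead(nn.Module):
        """Evidential weights over K pre-learned Bernoulli components
        B (a K x L matrix of per-label probabilities)."""
        def __init__(self, feat_dim, B):
            super().__init__()
            self.register_buffer("B", B)                 # component Bernoulli means
            self.evidence = nn.Linear(feat_dim, B.shape[0])

        def forward(self, x):
            alpha = F.softplus(self.evidence(x)) + 1.0   # Dirichlet concentration parameters
            w = alpha / alpha.sum(-1, keepdim=True)      # expected mixture weights
            p = w @ self.B                               # per-label probabilities, (batch, L)
            u = self.B.shape[0] / alpha.sum(-1)          # evidential (vacuity) uncertainty
            return p, u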


Adversarial Moment-Matching Distillation of Large Language Models

Neural Information Processing Systems

Knowledge distillation (KD) has been shown to be highly effective in guiding a student model with a larger teacher model and achieving practical benefits in improving the computational and memory efficiency for large language models (LLMs). State-of-the-art KD methods for LLMs mostly rely on minimizing explicit distribution distance between teacher and student probability predictions. Instead of optimizing these mandatory behavior cloning objectives, we explore an imitation learning strategy for KD of LLMs. In particular, we minimize the imitation gap by matching the action-value moments of the teacher's behavior from both on- and off-policy perspectives. To achieve this action-value moment-matching goal, we propose an adversarial training algorithm that jointly estimates the moment-matching distance and optimizes the student policy to minimize it. Results from both task-agnostic instruction-following experiments and task-specific experiments demonstrate the effectiveness of our method, which achieves new state-of-the-art performance.
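
A hedged sketch of the moment-matching distance for next-token policies; the value network f and the alternating two-player loop are illustrative assumptions, not the paper's exact algorithm:

    def moment_matching_distance(f_values, teacher_logits, student_logits):
        """f_values: (B, T, V) scores from a learned action-value ("moment") network;
        compares the expectation of f under the teacher vs. student token policies."""
        p_teacher = teacher_logits.softmax(dim=-1)
        p_student = student_logits.softmax(dim=-1)
        return ((p_teacher - p_student) * f_values).sum(dim=-1).mean()

    # Adversarial training alternates two players:
    #   f-step:       maximize moment_matching_distance (tighten the distance estimate)
    #   student-step: minimize moment_matching_distance (close the imitation gap)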


Budgeted stream-based active learning via adaptive submodular maximization

Neural Information Processing Systems

Active learning enables us to reduce the annotation cost by adaptively selecting unlabeled instances to be labeled. For pool-based active learning, several effective methods with theoretical guarantees have been developed through maximizing some utility function satisfying adaptive submodularity. In contrast, there have been few methods for stream-based active learning based on adaptive submodularity. In this paper, we propose a new class of utility functions, policy-adaptive submodular functions, which includes many existing adaptive submodular functions appearing in real-world problems. We provide a general framework based on policy-adaptive submodularity that makes it possible to convert existing pool-based methods to stream-based methods and gives theoretical guarantees on their performance. In addition, we empirically demonstrate their effectiveness by comparing with existing heuristics on common benchmark datasets.
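
A hedged sketch of how such a conversion can look on the stream side, with an assumed marginal-gain oracle and a fixed threshold (both illustrative, not the paper's exact procedure):

    def stream_select(stream, gain, query_label, budget, threshold):
        """Label an arriving instance when its expected marginal utility, conditioned
        on the labels observed so far, clears the threshold; stop at the budget."""
        observed = {}
        for x in stream:
            if len(observed) >= budget:
                break
            if gain(x, observed) >= threshold:   # adaptive-submodular marginal gain
                observed[x] = query_label(x)     # spend one unit of labeling budget
        return observed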


db98dc0dbafde48e8f74c0de001d35e4-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their useful comments. The reference for the complexity of joint alignment of N time series via DTW is [4]. Re Eq. (3): indeed, we define θ; as suggested, we will drop the LHS. Eq. (4) is correct as written. The complexity of each gradient component is similar, and the components are parallelized over. DTAN has 3 hyperparameters: 1 for Ω and 2 for the prior.


Convolutional Neural Fabrics

Neural Information Processing Systems

Despite the success of CNNs, selecting the optimal architecture for a given task remains an open problem. Instead of aiming to select a single optimal architecture, we propose a "fabric" that embeds an exponentially large number of architectures. The fabric consists of a 3D trellis that connects response maps at different layers, scales, and channels with a sparse homogeneous local connectivity pattern.
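
A minimal sketch of one trellis node, where channel counts, kernel sizes, and the aggregation rule are illustrative assumptions rather than the paper's exact parameterization:

    import torch.nn as nn
    import torch.nn.functional as F

    class FabricNode(nn.Module):
        """Aggregates same-, coarser-, and finer-scale responses from the previous layer."""
        def __init__(self, channels):
            super().__init__()
            self.from_same   = nn.Conv2d(channels, channels, 3, padding=1)
            self.from_coarse = nn.Conv2d(channels, channels, 3, padding=1)
            self.from_fine   = nn.Conv2d(channels, channels, 3, stride=2, padding=1)

        def forward(self, same, coarser, finer):
            out = self.from_same(same)                         # same scale
            out = out + F.interpolate(self.from_coarse(coarser), size=out.shape[-2:])  # upsample coarse
            out = out + self.from_fine(finer)                  # strided downsampling of fine
            return out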


Private Online Learning via Lazy Algorithms

Neural Information Processing Systems

We study the problem of private online learning, focusing on online prediction from experts (OPE) and online convex optimization (OCO). We propose a new transformation that translates lazy, low-switching online learning algorithms into private algorithms. We apply our transformation to differentially private OPE and OCO using existing lazy algorithms for these problems.
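
A toy illustration of why low switching helps privacy, not the paper's transformation: if the learner re-releases its state at only a few switch points, noise needs to be added only there. The noise calibration below is schematic.

    import numpy as np

    def lazy_noisy_mean(stream, block=100, eps=1.0, sensitivity=1.0):
        history, current, outputs = [], 0.0, []
        for t, x in enumerate(stream, start=1):
            history.append(x)
            if t % block == 0:  # rare switch point -> one noisy release
                current = float(np.mean(history)) + np.random.laplace(scale=sensitivity / eps)
            outputs.append(current)
        return outputs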


Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs

Neural Information Processing Systems

Multimodal large language models (MLLMs) have shown impressive success across modalities such as image, video, and audio in a variety of understanding and generation tasks. However, current MLLMs are surprisingly poor at understanding webpage screenshots and generating their corresponding HTML code. To address this problem, we propose Web2Code, a benchmark consisting of a new large-scale webpage-to-code dataset for instruction tuning and an evaluation framework for the webpage understanding and HTML code translation abilities of MLLMs. For dataset construction, we leverage pretrained LLMs to enhance existing webpage-to-code datasets as well as generate a diverse pool of new webpages rendered into images. Specifically, the inputs are webpage images and instructions, while the responses are the webpage's HTML code. We further include diverse natural language QA pairs about the webpage content in the responses to enable a more comprehensive understanding of the web content. To evaluate model performance in these tasks, we develop an evaluation framework for testing MLLMs' abilities in webpage understanding and web-to-code generation. Extensive experiments show that our proposed dataset is beneficial not only for our proposed tasks but also in the general visual domain. We hope our work will contribute to the development of general MLLMs suitable for web-based content generation and task automation.
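
A hypothetical shape for one instruction-tuning record, inferred from the abstract; field names and file names are assumptions, not the released dataset schema:

    # One webpage-to-code training example: screenshot in, HTML plus QA out.
    record = {
        "image": "webpage_00042.png",                  # rendered webpage screenshot
        "instruction": "Generate the HTML code for this webpage.",
        "response": "<html>...</html>",                # ground-truth HTML for the page
        "qa_pairs": [                                  # natural-language QA on page content
            {"question": "What is the page's main heading?", "answer": "..."},
        ],
    }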