

27aa3aeff0f8460a7b43d30fa6c5c032-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems

Large Language Models (LLMs) are transforming search engines into Conversational Search Engines (CSE). Consequently, Search Engine Optimization (SEO) is being shifted into Conversational Search Engine Optimization (C-SEO). We are beginning to see dedicated C-SEO methods for modifying web documents to increase their visibility in CSE responses. However, they are often tested only for a limited breadth of application domains; we do not know whether certain C-SEO methods would be effective for a broad range of domains. Moreover, existing evaluations consider only a single-actor scenario where only one web document adopts a C-SEO method; in reality, multiple players are likely to competitively adopt the cutting-edge C-SEO techniques, drawing an analogy from the dynamics we have seen in SEO.


Private Training Large-scale Models with Efficient DP-SGD

Neural Information Processing Systems

As large language models (LLMs) increasingly underpin technological advancements, the privacy of their training data emerges as a critical concern. Differential Privacy (DP) serves as a rigorous mechanism to protect this data, yet its integration via Differentially Private Stochastic Gradient Descent (DP-SGD) introduces substantial challenges, primarily due to the complexities of per-sample gradient clipping.
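As a rough illustration of why per-sample gradient clipping is costly, the following minimal sketch shows one DP-SGD gradient step in plain Python. It is not the method proposed in the paper: the function names `clip_gradient` and `dp_sgd_gradient`, and the parameters `max_norm` and `noise_multiplier`, are assumptions chosen for illustration. The point is that every sample's gradient must be materialized and clipped individually before aggregation.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale one per-sample gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_gradient(per_sample_grads, max_norm, noise_multiplier, rng):
    """Clip each sample's gradient, sum, add Gaussian noise, then average.

    per_sample_grads: list of gradients, one list of floats per sample.
    noise_multiplier: noise standard deviation as a multiple of max_norm
                      (hypothetical parameter name for this sketch).
    """
    dim = len(per_sample_grads[0])
    total = [0.0] * dim
    for grad in per_sample_grads:
        clipped = clip_gradient(grad, max_norm)
        for i in range(dim):
            total[i] += clipped[i]
    sigma = noise_multiplier * max_norm
    n = len(per_sample_grads)
    # Noise is added to the clipped sum, whose sensitivity is max_norm.
    return [(t + rng.gauss(0.0, sigma)) / n for t in total]
```

In this sketch the memory and compute cost grows with the number of samples times the number of parameters, which is the overhead that efficient DP-SGD implementations try to reduce.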


Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals

Neural Information Processing Systems

Estimating motion primitives from video (e.g., optical flow and occlusion) is a critically important computer vision problem with many downstream applications, including controllable video generation and robotics. Current solutions are primarily supervised on synthetic data or require tuning of situation-specific heuristics, which inherently limits these models' capabilities in real-world contexts. A natural solution to transcend these limitations would be to deploy large-scale, self-supervised video models, which can be trained scalably on unrestricted real-world video datasets. However, despite recent progress, motion-primitive extraction from large pretrained video models remains relatively underexplored. In this work, we describe Opt-CWM, a self-supervised flow and occlusion estimation technique from a pretrained video prediction model. Opt-CWM uses "counterfactual probes" to extract motion information from a base video model in a zero-shot fashion. The key problem we solve is optimizing the quality of these probes, using a combination of an efficient parameterization of the space of counterfactual probes, together with a novel generic sparse-prediction principle for learning the probe-generation parameters in a self-supervised fashion. Opt-CWM achieves state-of-the-art performance for motion estimation on real-world videos while requiring no labeled data.


Fast Rank-1 Lattice Targeted Sampling for Black-box Optimization

Neural Information Processing Systems

Black-box optimization has gained great attention for its success in recent applications. However, scaling up to high-dimensional problems with good query efficiency remains challenging. This paper proposes a novel Rank-1 Lattice Targeted Sampling (RLTS) technique to address this issue. Our RLTS benefits from random rank-1 lattice Quasi-Monte Carlo, which enables us to perform fast local exact Gaussian processes (GP) training and inference with O(n log n) complexity w.r.t.


Optimal Neural Compressors for the Rate-Distortion-Perception Tradeoff

Neural Information Processing Systems

Recent efforts in neural compression have focused on the rate-distortion-perception (RDP) tradeoff, where the perception constraint ensures the source and reconstruction distributions are close in terms of a statistical divergence. Theoretical work on RDP describes properties of RDP-optimal compressors without providing constructive and low complexity solutions. While classical rate-distortion theory shows that optimal compressors should efficiently pack space, RDP theory additionally shows that infinite randomness shared between the encoder and decoder may be necessary for RDP optimality. In this paper, we propose neural compressors that are low complexity and benefit from high packing efficiency through lattice coding and shared randomness through shared dithering over the lattice cells. For two important settings, namely infinite shared and zero shared randomness, we analyze the RDP tradeoff achieved by our proposed neural compressors and show optimality in both cases. Experimentally, we investigate the roles that these two components of our design, lattice coding and randomness, play in the performance of neural compressors on synthetic and real-world data. We observe that performance improves with more shared randomness and better lattice packing.
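To make the idea of shared dithering over lattice cells concrete, here is a minimal sketch of dithered quantization on the integer lattice Z^n. This is a simplified stand-in, not the neural compressor from the paper: the integer lattice, the function names (`shared_dither`, `encode`, `decode`), and the use of a common seed to model shared randomness are all assumptions made for illustration.

```python
import math
import random


def shared_dither(dim, seed):
    """Dither vector, uniform over one cell [-0.5, 0.5)^dim of the integer lattice.

    Encoder and decoder both call this with the same seed, which models
    shared randomness in this sketch.
    """
    rng = random.Random(seed)
    return [rng.random() - 0.5 for _ in range(dim)]


def encode(x, dither):
    """Add the dither, then round each coordinate to the nearest integer."""
    return [math.floor(xi + ui + 0.5) for xi, ui in zip(x, dither)]


def decode(indices, dither):
    """Subtract the same dither from the received lattice point."""
    return [qi - ui for qi, ui in zip(indices, dither)]
```

With a dither that is uniform over the lattice cell, the reconstruction error in each coordinate stays within half a cell width and is statistically independent of the input, which is the classical motivation for subtractive dithering. Lattices with better packing than Z^n would replace the rounding step with a nearest-lattice-point search.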


SpaceX IPO raised $10bn more than thought

BBC News

SpaceX raised $10bn (£7.5bn) more than initially thought when it sold shares to the public on Friday - bringing in a total of $85.7bn. Elon Musk's rocket and Artificial Intelligence (AI) company pulled off the biggest initial public offering (IPO) in history when it joined New York's Nasdaq stock exchange last week. The listing had raised $75bn from investors, which Musk told employees will be spent funding a significant growth phase. But the banks which backed the IPO exercised a so-called greenshoe clause, which let them purchase an extra $10bn of SpaceX shares. The extra $10bn raised, revealed in a statement by SpaceX announcing the completion of the listing, would by itself rank as one of the biggest IPOs in history.


Imitation Beyond Expectation Using Pluralistic Stochastic Dominance

Neural Information Processing Systems

Imitation learning seeks to estimate policies reflecting the values of demonstrated behaviors. Prevalent approaches learn to match or exceed the demonstrator's performance in expectation without knowing the demonstrator's reward function. Unfortunately, this does not induce pluralistic imitators that learn to support distinct demonstrations.


Training-Free Constrained Generation With Stable Diffusion Models

Neural Information Processing Systems

Stable diffusion models represent the state of the art in data synthesis across diverse domains and hold transformative potential for applications in science and engineering, e.g., by facilitating the discovery of novel solutions and simulating systems that are computationally intractable to model explicitly. While there is increasing effort to incorporate physics-based constraints into generative models, existing techniques are either limited in their applicability to latent diffusion frameworks or lack the capability to strictly enforce domain-specific constraints. To address this limitation, this paper proposes a novel integration of stable diffusion models with constrained optimization frameworks, enabling the generation of outputs satisfying stringent physical and functional requirements.


Non-Asymptotic Analysis Of Data Augmentation For Precision Matrix Estimation

Neural Information Processing Systems

This paper addresses the problem of inverse covariance (also known as precision matrix) estimation in high-dimensional settings. Specifically, we focus on two classes of estimators: linear shrinkage estimators with a target proportional to the identity matrix, and estimators derived from data augmentation (DA). Here, DA refers to the common practice of enriching a dataset with artificial samples--typically generated via a generative model or through random transformations of the original data--prior to model fitting. For both classes of estimators, we derive estimators and provide concentration bounds for their quadratic error. This allows for both method comparison and hyperparameter tuning, such as selecting the optimal proportion of artificial samples. On the technical side, our analysis relies on tools from random matrix theory. We introduce a novel deterministic equivalent for generalized resolvent matrices, accommodating dependent samples with specific structure. We support our theoretical results with numerical experiments.


Differentially Private Bilevel Optimization: Efficient Algorithms with Near-Optimal Rates

Neural Information Processing Systems

Bilevel optimization, in which one optimization problem is nested inside another, underlies many machine learning applications with a hierarchical structure--such as meta-learning and hyperparameter optimization. Such applications often involve sensitive training data, raising pressing concerns about individual privacy. Motivated by this, we study differentially private bilevel optimization. We first focus on settings where the outer-level objective is convex, and provide novel upper and lower bounds on the excess empirical risk for both pure and approximate differential privacy. These bounds are nearly tight and essentially match the optimal rates for standard single-level differentially private ERM, up to additional terms that capture the intrinsic complexity of the nested bilevel structure.
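For readers unfamiliar with the nested structure, a generic bilevel problem can be written as follows. The symbols F (outer objective), G (inner objective), and the feasible sets X and Y are standard textbook notation assumed here for illustration; they are not taken from the abstract and may differ from the paper's own notation.

```latex
\min_{x \in \mathcal{X}} \; F\bigl(x, y^*(x)\bigr)
\quad \text{subject to} \quad
y^*(x) \in \arg\min_{y \in \mathcal{Y}} G(x, y)
```

In hyperparameter optimization, for example, x would play the role of the hyperparameters, y the model parameters, G the training loss, and F the validation loss.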