Fast Rank-1 Lattice Targeted Sampling for Black-box Optimization

Neural Information Processing Systems

Black-box optimization has gained great attention for its success in recent applications. However, scaling up to high-dimensional problems with good query efficiency remains challenging. This paper proposes a novel Rank-1 Lattice Targeted Sampling (RLTS) technique to address this issue. Our RLTS benefits from random rank-1 lattice Quasi-Monte Carlo, which enables us to perform fast local exact Gaussian processes (GP) training and inference with O(n log n) complexity w.r.t.


Optimal Neural Compressors for the Rate-Distortion-Perception Tradeoff

Neural Information Processing Systems

Recent efforts in neural compression have focused on the rate-distortion-perception (RDP) tradeoff, where the perception constraint ensures the source and reconstruction distributions are close in terms of a statistical divergence. Theoretical work on RDP describes properties of RDP-optimal compressors without providing constructive and low complexity solutions. While classical rate-distortion theory shows that optimal compressors should efficiently pack space, RDP theory additionally shows that infinite randomness shared between the encoder and decoder may be necessary for RDP optimality. In this paper, we propose neural compressors that are low complexity and benefit from high packing efficiency through lattice coding and shared randomness through shared dithering over the lattice cells. For two important settings, namely infinite shared and zero shared randomness, we analyze the RDP tradeoff achieved by our proposed neural compressors and show optimality in both cases. Experimentally, we investigate the roles that these two components of our design, lattice coding and randomness, play in the performance of neural compressors on synthetic and real-world data. We observe that performance improves with more shared randomness and better lattice packing.
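The idea of shared dithering over lattice cells can be illustrated with a minimal sketch. It assumes the simplest lattice, the integer lattice Z^n, and stands in for shared randomness with a seed known to both encoder and decoder. The function names `encode` and `decode` are hypothetical, and this is not the neural compressor proposed in the paper, only the classical dithered-quantization building block.

```python
import numpy as np

def encode(x, seed):
    # Shared dither: encoder and decoder regenerate the same u from the seed.
    # u is uniform over the cell [-0.5, 0.5)^n of the integer lattice (assumed lattice).
    u = np.random.default_rng(seed).uniform(-0.5, 0.5, size=x.shape)
    # Quantize the dithered source to the nearest lattice point.
    return np.round(x + u).astype(np.int64)

def decode(q, seed):
    # Regenerate the identical dither and subtract it from the lattice point.
    u = np.random.default_rng(seed).uniform(-0.5, 0.5, size=q.shape)
    return q - u

x = np.random.default_rng(0).normal(size=8)  # toy source vector
q = encode(x, seed=123)
x_hat = decode(q, seed=123)
```

With subtractive dither of this kind, each coordinate of the reconstruction error stays within half a cell width regardless of the source value.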


SpaceX IPO raised $10bn more than thought

BBC News

SpaceX raised $10bn (£7.5bn) more than initially thought when it sold shares to the public on Friday - bringing in a total of $85.7bn. Elon Musk's rocket and Artificial Intelligence (AI) company pulled off the biggest initial public offering (IPO) in history when it joined New York's Nasdaq stock exchange last week. The listing had raised $75bn from investors, which Musk told employees will be spent funding a significant growth phase. But the banks which backed the IPO exercised a so-called greenshoe clause, which let them purchase an extra $10bn of SpaceX shares. The extra $10bn raised, revealed in a statement by SpaceX announcing the completion of the listing, would by itself rank as one of the biggest IPOs in history.


Training-Free Constrained Generation With Stable Diffusion Models

Neural Information Processing Systems

Stable diffusion models represent the state-of-the-art in data synthesis across diverse domains and hold transformative potential for applications in science and engineering, e.g., by facilitating the discovery of novel solutions and simulating systems that are computationally intractable to model explicitly. While there is increasing effort to incorporate physics-based constraints into generative models, existing techniques are either limited in their applicability to latent diffusion frameworks or lack the capability to strictly enforce domain-specific constraints. To address this limitation, this paper proposes a novel integration of stable diffusion models with constrained optimization frameworks, enabling the generation of outputs satisfying stringent physical and functional requirements.


Non-Asymptotic Analysis Of Data Augmentation For Precision Matrix Estimation

Neural Information Processing Systems

This paper addresses the problem of inverse covariance (also known as precision matrix) estimation in high-dimensional settings. Specifically, we focus on two classes of estimators: linear shrinkage estimators with a target proportional to the identity matrix, and estimators derived from data augmentation (DA). Here, DA refers to the common practice of enriching a dataset with artificial samples--typically generated via a generative model or through random transformations of the original data--prior to model fitting. For both classes of estimators, we derive estimators and provide concentration bounds for their quadratic error. This allows for both method comparison and hyperparameter tuning, such as selecting the optimal proportion of artificial samples. On the technical side, our analysis relies on tools from random matrix theory. We introduce a novel deterministic equivalent for generalized resolvent matrices, accommodating dependent samples with specific structure. We support our theoretical results with numerical experiments.


Differentially Private Bilevel Optimization: Efficient Algorithms with Near-Optimal Rates

Neural Information Processing Systems

Bilevel optimization, in which one optimization problem is nested inside another, underlies many machine learning applications with a hierarchical structure--such as meta-learning and hyperparameter optimization. Such applications often involve sensitive training data, raising pressing concerns about individual privacy. Motivated by this, we study differentially private bilevel optimization. We first focus on settings where the outer-level objective is convex, and provide novel upper and lower bounds on the excess empirical risk for both pure and approximate differential privacy. These bounds are nearly tight and essentially match the optimal rates for standard single-level differentially private ERM, up to additional terms that capture the intrinsic complexity of the nested bilevel structure.


Why do South Koreans love AI so much?

MIT Technology Review

From eldercare robots to humanoid monks, South Koreans just can't get enough of AI. When I landed in Seoul after a grueling 12-hour flight from San Francisco, I walked through an unmanned immigration checkpoint, where a machine scanned my face and passport. On the subway home, people were glued to their phones (powered by flawless 5G even underground), as we raced past platforms lined with LED screens of ads celebrating K-pop idols' birthdays. When I got off the station in Gangnam, a cartoon-eyed robot on wheels was waiting patiently at a crosswalk to deliver someone's dinner. Internet cafés dotted the sidewalks, crammed with teenagers playing computer games, maybe hoping to become the next legendary pro gamer.


Anthropic to meet White House over AI tool suspension

BBC News

Bosses at the artificial intelligence (AI) firm Anthropic are set to meet senior White House officials amid fresh national security concerns over the company's latest release. The meeting is set to take place on Monday in Washington DC between executives at Anthropic and the US Department of Commerce, a government department led by Secretary Howard Lutnick, according to two people familiar with the matter. It comes after Anthropic blocked all public access to the recent release of its latest AI tool on Friday, which it has previously said is too powerful. The firm made the decision after the US government prohibited Anthropic from allowing any foreign national access to the technology. The AI tool at issue is named Fable 5 or Mythos 5. Fable 5 is a version of the tool with extra safeguards made available to the public, while Mythos 5 has different controls and is only available to a select group of organisations.


DeepKD: A Deeply Decoupled and Denoised Knowledge Distillation Trainer

Neural Information Processing Systems

Recent advances in knowledge distillation have emphasized the importance of decoupling different knowledge components. While existing methods utilize momentum mechanisms to separate task-oriented and distillation gradients, they overlook the inherent conflict between target-class and non-target-class knowledge flows. Furthermore, low-confidence dark knowledge in non-target classes introduces noisy signals that hinder effective knowledge transfer. To address these limitations, we propose DeepKD, a novel training framework that integrates dual-level decoupling with adaptive denoising. First, through theoretical analysis of gradient signal-to-noise ratio (GSNR) characteristics in task-oriented and non-task-oriented knowledge distillation, we design independent momentum updaters for each component to prevent mutual interference. We observe that the optimal momentum coefficients for task-oriented gradient (TOG), target-class gradient (TCG), and non-target-class gradient (NCG) should be positively related to their GSNR. Second, we introduce a dynamic top-k mask (DTM) mechanism that gradually increases K from a small initial value to incorporate more non-target classes as training progresses, following curriculum learning principles. The DTM jointly filters low-confidence logits from both teacher and student models, effectively purifying dark knowledge during early training. Extensive experiments on CIFAR-100, ImageNet, and MS-COCO demonstrate DeepKD's effectiveness.
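A rough sketch of how a dynamic top-k mask might work is given here for illustration. The abstract does not state the schedule or the ranking rule, so the linear growth of K, the ranking of non-target classes by teacher logit, and the names `k_schedule` and `top_k_mask` are all assumptions rather than the DeepKD implementation.

```python
import numpy as np

def k_schedule(epoch, total_epochs, k_min, k_max):
    # Assumed schedule: K grows linearly from k_min to k_max over training.
    frac = epoch / max(total_epochs - 1, 1)
    return int(round(k_min + frac * (k_max - k_min)))

def top_k_mask(teacher_logits, target, k):
    # Assumed ranking rule: keep the k non-target classes with the
    # largest teacher logits; the target class is always excluded.
    scores = teacher_logits.astype(float).copy()
    scores[target] = -np.inf
    keep = np.argsort(scores)[::-1][:k]
    mask = np.zeros(scores.shape, dtype=bool)
    mask[keep] = True
    return mask

teacher = np.array([2.0, 0.1, 1.5, -1.0, 0.7])  # toy logits, class 0 is the target
student = np.array([1.0, 0.3, 0.9, -0.5, 0.2])
k = k_schedule(epoch=0, total_epochs=10, k_min=2, k_max=4)
mask = top_k_mask(teacher, target=0, k=k)
# The same mask is applied to both models, so low-confidence
# non-target logits are dropped jointly from teacher and student.
teacher_kept, student_kept = teacher[mask], student[mask]
```

Early in training only the most confident non-target classes pass the mask; later epochs admit more of them as K grows.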


Efficient Part-level 3D Object Generation via Dual Volume Packing

Neural Information Processing Systems

Recent progress in 3D object generation has greatly improved both quality and efficiency. However, most existing methods generate a single mesh with all parts fused together, which limits the ability to edit or manipulate individual parts. A key challenge is that different objects may have a varying number of parts. To address this, we propose a new end-to-end framework for part-level 3D object generation. Given a single input image, our method generates high-quality 3D objects with an arbitrary number of complete and semantically meaningful parts. We introduce a dual volume packing strategy that organizes all parts into two complementary volumes, allowing for the creation of complete and interleaved parts that assemble into the final object. Experiments show that our model achieves better quality, diversity, and generalization than previous image-based part-level generation methods. Our project page is at https://research.nvidia.com/