Technology
Learning Theory for Kernel Bilevel Optimization
Bilevel optimization has emerged as a technique for addressing a wide range of machine learning problems that involve an outer objective implicitly determined by the minimizer of an inner problem. While prior works have primarily focused on the parametric setting, a learning-theoretic foundation for bilevel optimization in the nonparametric case remains relatively unexplored. In this paper, we take a first step toward bridging this gap by studying Kernel Bilevel Optimization (KBO), where the inner objective is optimized over a reproducing kernel Hilbert space. This setting enables rich function approximation while providing a foundation for rigorous theoretical analysis. In this context, we derive novel finite-sample generalization bounds for KBO, leveraging tools from empirical process theory. These bounds further allow us to assess the statistical accuracy of gradient-based methods applied to the empirical discretization of KBO. We numerically illustrate our theoretical findings on a synthetic instrumental variable regression task.
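As a rough illustration of the bilevel structure described above, the following sketch uses kernel ridge regression as the inner problem and treats the regularization strength as the outer variable. This is an assumed toy instance, not the paper's experimental setup (which is instrumental variable regression); the names `gaussian_kernel`, `outer_loss_and_grad`, and `lam` are hypothetical. The gradient of the outer loss is obtained by implicit differentiation of the closed-form inner solution.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def outer_loss_and_grad(lam, X_in, y_in, X_out, y_out):
    """Outer loss and its derivative with respect to lam (assumed toy setup).

    Inner problem: min_h (1/n) sum_i (h(x_i) - y_i)^2 + lam * ||h||^2 over the RKHS,
    whose minimizer is h = sum_i alpha_i k(x_i, .) with alpha = (K + n*lam*I)^{-1} y.
    Outer loss: mean squared error of that minimizer on held-out data.
    """
    n, m = len(y_in), len(y_out)
    A = gaussian_kernel(X_in, X_in) + n * lam * np.eye(n)
    alpha = np.linalg.solve(A, y_in)            # inner minimizer (closed form)
    K_out = gaussian_kernel(X_out, X_in)
    resid = K_out @ alpha - y_out
    loss = (resid ** 2).mean()
    # Implicit differentiation: d(alpha)/d(lam) = -n * A^{-1} alpha
    dalpha = -n * np.linalg.solve(A, alpha)
    grad = (2.0 / m) * resid @ (K_out @ dalpha)
    return loss, grad
```

A gradient-based method on the outer variable would then repeatedly call `outer_loss_and_grad` and update `lam` along the negative gradient, keeping `lam` positive.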
Localizing Knowledge in Diffusion Transformers
Understanding how knowledge is distributed across the layers of generative models is crucial for improving interpretability, controllability, and adaptation. While prior work has explored knowledge localization in UNet-based architectures, Diffusion Transformer (DiT)-based models remain underexplored in this context. In this paper, we propose a model- and knowledge-agnostic method to localize where specific types of knowledge are encoded within the DiT blocks. We evaluate our method on state-of-the-art DiT-based models, including PixArt-α, FLUX, and SANA, across six diverse knowledge categories. We show that the identified blocks are both interpretable and causally linked to the expression of knowledge in generated outputs. Building on these insights, we apply our localization framework to two key applications: model personalization and knowledge unlearning. In both settings, our localized fine-tuning approach enables efficient and targeted updates, reducing computational cost, improving task-specific performance, and better preserving general model behavior with minimal interference to unrelated or surrounding content. Overall, our findings offer new insights into the internal structure of DiTs and introduce a practical pathway for more interpretable, efficient, and controllable model editing.
UniHG: A Large-scale Universal Heterogeneous Graph Dataset and Benchmark for Representation Learning and Cross-Domain Transferring
Irregular data in the real world are usually organized as heterogeneous graphs consisting of multiple types of nodes and edges. However, current heterogeneous graph research confronts three fundamental challenges: i) Benchmark Deficiency, ii) Semantic Disalignment, and iii) Propagation Degradation. In this paper, we construct a large-scale, universal, and joint multi-domain heterogeneous graph dataset named UniHG to facilitate heterogeneous graph representation learning and cross-domain knowledge mining. Overall, UniHG contains 77.31 million nodes and 564 million directed edges with thousands of labels and attributes, which is, to the best of our knowledge, currently the largest universal heterogeneous graph dataset available. To perform effective learning and provide comprehensive benchmarks on UniHG, we take two key measures: i) a semantic alignment strategy for multi-attribute entities, which projects the feature descriptions of multi-attribute nodes and edges into a common embedding space to facilitate information aggregation; and ii) a novel Heterogeneous Graph Decoupling (HGD) framework with a specifically designed Anisotropy Feature Propagation (AFP) module for learning effective multi-hop anisotropic propagation kernels. These two strategies enable efficient information propagation among a tremendous number of multi-attribute entities while adaptively mining multi-attribute associations through multi-hop aggregation in large-scale heterogeneous graphs. Comprehensive benchmark results demonstrate that our model significantly outperforms existing methods with an accuracy improvement of 28.93%. UniHG can also facilitate downstream tasks, achieving NDCG@20 improvement rates of 11.48% and 11.71%.
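For readers unfamiliar with the NDCG@20 metric reported above, here is a minimal sketch of the standard definition with a logarithmic discount. It assumes the ranked list is given as relevance scores in ranked order; the abstract does not state which NDCG variant or relevance scale the authors use, and the function names are illustrative.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items of a ranked list."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=20):
    """NDCG@k: DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that places the most relevant items first scores 1.0, and relevant items pushed further down the list contribute progressively less.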
Estimating cognitive biases with attention-aware inverse planning
People's goal-directed behaviors are influenced by their cognitive biases, and autonomous systems that interact with people should be aware of this. For example, people's attention to objects in their environment will be biased in a way that systematically affects how they perform everyday tasks such as driving to work. Here, building on recent work in computational cognitive science, we formally articulate the attention-aware inverse planning problem, in which the goal is to estimate a person's attentional biases from their actions. We demonstrate how attention-aware inverse planning systematically differs from standard inverse reinforcement learning and how cognitive biases can be inferred from behavior. Finally, we present an approach to attention-aware inverse planning that combines deep reinforcement learning with computational cognitive modeling. We use this approach to infer the attentional strategies of RL agents in real-life driving scenarios selected from the Waymo Open Dataset, demonstrating the scalability of estimating cognitive biases with attention-aware inverse planning.
Intermediate Domain Alignment and Morphology Analogy for Patent-Product Image Retrieval
Recent advances in artificial intelligence have significantly impacted image retrieval tasks, yet Patent-Product Image Retrieval (PPIR) has received limited attention. PPIR, which retrieves patent images based on product images to identify potential infringements, presents unique challenges: (1) both product and patent images often contain numerous categories of artificial objects, but models pre-trained on standard datasets exhibit limited discriminative power to recognize some of those unseen objects; and (2) the significant domain gap between binary patent line drawings and colorful RGB product images further complicates similarity comparisons for product-patent pairs. To address these challenges, we formulate it as an open-set image retrieval task and introduce a comprehensive Patent-Product Image Retrieval Dataset (PPIRD) including a test set with 439 product-patent pairs, a retrieval pool of 727,921 patents, and an unlabeled pre-training set of 3,799,695 images. We further propose a novel Intermediate Domain Alignment and Morphology Analogy (IDAMA) strategy. IDAMA maps both image types to an intermediate sketch domain using edge detection to minimize the domain discrepancy, and employs a Morphology Analogy Filter to select discriminative patent images based on visual features via analogical reasoning. Extensive experiments on PPIRD demonstrate that IDAMA significantly outperforms baseline methods (+7.58 mAR) and offers valuable insights into domain mapping and representation learning for PPIR.
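The idea of mapping both image types to an intermediate sketch domain via edge detection can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the edge detector, so a plain 3x3 Sobel filter with a fixed threshold is swapped in here, and `to_gray`, `sobel_edges`, and the threshold value are hypothetical choices rather than the IDAMA implementation.

```python
def to_gray(rgb_image):
    """Convert an H x W grid of (r, g, b) tuples to grayscale floats (BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb_image]

def sobel_edges(gray, threshold=100.0):
    """Binary edge map (1 = edge) from 3x3 Sobel gradients; border pixels stay 0."""
    h, w = len(gray), len(gray[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]) \
               - (gray[y - 1][x - 1] + 2 * gray[y][x - 1] + gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]) \
               - (gray[y - 1][x - 1] + 2 * gray[y - 1][x] + gray[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges
```

Applying the same edge map to a color product photo and to a binary patent drawing yields two line-like images, which is the sense in which the sketch domain narrows the gap between the two sources.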
LCDB 1.1: A Database Illustrating Learning Curves Are More Ill-Behaved Than Previously Thought
For the actual appendices, please see the main paper submission. Here, we would like to make a few notes regarding the dataset hosting.
Self-Hosting Platform: Our dataset is self-hosted on the 4TU.ResearchData platform, a trusted institutional repository based in the Netherlands, which guarantees long-term preservation of research data for a minimum of 15 years.
Data Access Note: We provide a public access link (also attached in the main submission).
Machine Access via Croissant Metadata: For machine access, a Croissant metadata file can be found in our GitHub repository.
Appendices
A Limitations
As described in Sections 4 and 6, users would tailor attacks to image clusters. In the beige-box track, we outright provided these clusters by disclosing which image indices corresponded to which general watermark type. In the black-box track, several winning teams clustered images into groups by artifact varieties and did so by hand. For the latter, this was made possible because (1) our dataset was relatively small, enabling this type of manual data labeling, and (2) they were made aware that the dataset contained mixtures of several watermarks. A database owner who uses only one type of watermark is unlikely to produce such variation in artifacts. Additionally, we use the watermark models and settings provided in the original papers and do not calibrate the strength of watermarks.
A Technical Report on "Erasing the Invisible": The 2024 NeurIPS Competition on Stress Testing Image Watermarks
AI-generated images have become pervasive, raising critical concerns around content authenticity, intellectual property, and the spread of misinformation. Invisible watermarks offer a promising solution for identifying AI-generated images, preserving content provenance without degrading visual quality. However, their real-world robustness remains uncertain due to the lack of standardized evaluation protocols and large-scale stress testing. To bridge this gap, we organized "Erasing the Invisible," a NeurIPS 2024 competition and newly established benchmark designed to systematically stress test the resilience of watermarking techniques. The competition introduced two attack tracks--Black-box and Beige-box--that simulate practical scenarios with varying levels of attacker knowledge on watermarks, providing a comprehensive assessment of watermark robustness.
Starmer to confirm social media ban for U.K. teens ahead of G7 meet
U.K. Prime Minister Keir Starmer will start a crucial week for his premiership by announcing a package of strong restrictions designed to protect British teenagers from online threats. Starmer is expected Monday morning to confirm a ban on children under 16 using major social media platforms, as well as other measures including curfews on older teenagers and tough regulations on chatbots. He will then depart for a Group of Seven summit at Evian-les-Bains, France, where he faces awkward questions following last week's resignation of his defense secretary and uncertainty around the U.K.'s military budget. A ban on young teenagers using social media is popular with the U.K. public despite concerns around how effectively it can be enforced. According to a person familiar with the situation, the Labour government's new range of restrictions -- including some against chatbots and online games -- will go further than laws in Australia, where a ban on social media for teens came into effect last year.