Geo-Diverse Safety Alignment
Da Yin

Neural Information Processing Systems

Content Warning: This paper may contain examples of harmful content by nature. In the rapidly evolving field of Large Language Models (LLMs), ensuring safety is a crucial and widely discussed topic. However, existing works often overlook the geo-diversity of cultural and legal standards across the world.


Symmetric Linear Bandits with Hidden Symmetry
Nam Phuong Tran, Long Tran-Thanh
Department of Computer Science, University of Warwick

Neural Information Processing Systems

High-dimensional linear bandits with low-dimensional structure have received considerable attention in recent studies due to their practical significance. The most common structure in the literature is sparsity; however, it may not hold in practice. Symmetry, where the reward is invariant under certain groups of transformations on the set of arms, is another important inductive bias in the high-dimensional case that covers many standard structures, including sparsity. In this work, we study high-dimensional symmetric linear bandits where the symmetry is hidden from the learner and the correct symmetry needs to be learned in an online setting. We examine the structure of a collection of hidden symmetries and provide a method based on model selection within the associated collection of low-dimensional subspaces.
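
As a toy illustration of this kind of structure (a hypothetical sketch in Python, not the paper's algorithm), the reward below is invariant under permutations of a block of coordinates, and averaging each arm over the group orbit projects the problem onto the low-dimensional invariant subspace; a learner facing hidden symmetry must select among many such candidate subspaces online.

import numpy as np
from itertools import permutations

rng = np.random.default_rng(0)
d, k = 6, 3  # ambient dimension; reward is symmetric in the first k coordinates

# Permutation-invariant reward: equal weights on the first k coordinates.
theta = np.zeros(d)
theta[:k] = 1.0 / k

def orbit_average(x, block=k):
    # Project an arm onto the invariant subspace by averaging over the
    # group orbit (all permutations of the first `block` coordinates).
    xs = []
    for p in permutations(range(block)):
        y = x.copy()
        y[:block] = x[list(p)]
        xs.append(y)
    return np.mean(xs, axis=0)

x = rng.normal(size=d)
# The expected reward is unchanged by the projection, so once the symmetry is
# known the bandit effectively lives in a much smaller subspace.
print(theta @ x, theta @ orbit_average(x))  # equal up to floating-point error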


Long-term Causal Effects via Behavioral Game Theory

Neural Information Processing Systems

Planned experiments are the gold standard for reliably comparing the causal effect of switching from a baseline policy to a new policy. One critical shortcoming of classical experimental methods, however, is that they typically do not take into account the dynamic nature of responses to policy changes. For instance, in an experiment where we seek to understand the effects of a new ad pricing policy on auction revenue, agents may adapt their bidding in response to the experimental pricing changes. Thus, the causal effects of the new pricing policy after such an adaptation period, the long-term causal effects, are not captured by the classical methodology, even though they are clearly more indicative of the value of the new policy. Here, we formalize a framework to define and estimate the long-term causal effects of policy changes in multiagent economies. Central to our approach is behavioral game theory, which we leverage to formulate the ignorability assumptions necessary for causal inference. Under these assumptions, we estimate long-term causal effects through a latent-space approach, in which a behavioral model of how agents act conditional on their latent behaviors is combined with a temporal model of how those behaviors evolve over time.
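
A minimal simulation of the latent-space idea (illustrative Python only; the dynamics, names, and parameters are our assumptions, not the paper's model): each agent carries a latent behavior that adapts after a policy switch, actions are drawn conditional on that behavior, and the long-term effect is read off after behaviors settle rather than immediately after the switch.

import numpy as np

rng = np.random.default_rng(1)
n_agents, horizon = 100, 50

def simulate(policy_fee, adapt_rate=0.2):
    b = np.full(n_agents, 0.9)           # latent behavior: bid-shading factor
    b_star = 0.9 - 0.5 * policy_fee      # assumed equilibrium response to the fee
    revenues = []
    for _ in range(horizon):
        b += adapt_rate * (b_star - b)   # temporal model: behaviors evolve over time
        values = rng.uniform(0.0, 1.0, n_agents)
        bids = b * values                # behavioral model: actions given latent behavior
        revenues.append(policy_fee * bids.sum())
    return np.array(revenues)

rev = simulate(policy_fee=0.10)
print("short-run effect:", rev[0])    # right after the switch, before adaptation
print("long-term effect:", rev[-1])   # after agents have adapted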


InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models

Neural Information Processing Systems

Large Language Models for code (code LLMs) have witnessed tremendous progress in recent years. With the rapid development of code LLMs, many popular evaluation benchmarks, such as HumanEval, DS-1000, and MBPP, have emerged to measure their performance, with a particular focus on code generation tasks. However, they are insufficient to cover the full range of expected capabilities of code LLMs, which span beyond code generation to answering diverse coding-related questions. To fill this gap, we propose InfiBench, to our knowledge the first large-scale free-form question-answering (QA) benchmark for code, comprising 234 carefully selected high-quality Stack Overflow questions spanning 15 programming languages. InfiBench uses four types of model-free automatic metrics to evaluate response correctness, with domain experts carefully concretizing the criteria for each question. We conduct a systematic evaluation of over 100 recent code LLMs on InfiBench, leading to a series of novel and insightful findings.
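
As a rough sketch of what a model-free automatic metric can look like (a hypothetical keyword-matching scorer in Python, not InfiBench's implementation), response correctness can be scored against expert-specified keywords or patterns:

import re

def keyword_score(response: str, required_patterns: list[str]) -> float:
    # Fraction of expert-specified patterns found in the model's response;
    # each pattern may be a plain string or a regular expression.
    hits = sum(bool(re.search(p, response, flags=re.IGNORECASE))
               for p in required_patterns)
    return hits / len(required_patterns)

# Hypothetical question: "How do I read a file line by line in Python?"
response = "Use `with open(path) as f:` and iterate over f to read lines lazily."
print(keyword_score(response, [r"with\s+open", r"iterate|for line"]))  # 1.0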


Neural Krylov Iteration for Accelerating Linear System Solving
Jian Luo, Jie Wang, Hong Wang, Huanshuo Dong

Neural Information Processing Systems

Solving large-scale sparse linear systems is essential in fields like mathematics, science, and engineering. Traditional numerical solvers, mainly based on Krylov subspace iteration algorithms, suffer from low efficiency, which primarily arises from iterating over a less-than-ideal subspace. To tackle this problem, we propose a novel method, namely Neural Krylov Iteration (NeurKItt), for accelerating linear system solving. Specifically, NeurKItt employs a neural operator to predict the invariant subspace of the linear system and then leverages the predicted subspace to accelerate solving. To enhance subspace prediction accuracy, we apply QR decomposition to the neural operator outputs and introduce a novel projection loss function for training. NeurKItt benefits the solving process by using the predicted subspace to guide the iteration, significantly reducing the number of iterations. We provide extensive experiments and comprehensive theoretical analyses to demonstrate the feasibility and efficiency of NeurKItt. In our main experiments, NeurKItt accelerates the solving of linear systems across various settings and datasets, achieving up to a 5.5× speedup in computation time and a 16.1× speedup in the number of iterations.
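
The two ingredients named above are easy to sketch (a simplified PyTorch rendition under our own assumptions, not the authors' code): QR decomposition orthonormalizes the neural operator's raw output into a subspace basis, and a projection loss compares the predicted and true subspaces through their orthogonal projectors, which makes the comparison independent of the particular basis chosen.

import torch

def orthonormalize(raw):
    # QR decomposition turns a raw n-by-k output into an orthonormal basis Q.
    q, _ = torch.linalg.qr(raw)
    return q

def projection_loss(q_pred, q_true):
    # Compare subspaces through their projectors P = Q Q^T (basis-independent).
    p_pred = q_pred @ q_pred.T
    p_true = q_true @ q_true.T
    return torch.linalg.norm(p_pred - p_true, ord="fro") ** 2

n, k = 64, 8
raw = torch.randn(n, k, requires_grad=True)     # stand-in for a neural operator output
q_true, _ = torch.linalg.qr(torch.randn(n, k))  # stand-in for the true invariant subspace
loss = projection_loss(orthonormalize(raw), q_true)
loss.backward()                                 # the whole pipeline is differentiable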


Adversarial Multiclass Classification: A Risk Minimization Perspective

Neural Information Processing Systems

Recently proposed adversarial classification methods have shown promising results for cost-sensitive and multivariate losses. In contrast with empirical risk minimization (ERM) methods, which use convex surrogate losses to approximate the desired non-convex target loss function, adversarial methods minimize non-convex losses by treating the properties of the training data as uncertain and worst-case within a minimax game. Despite this difference in formulation, we recast adversarial classification under the zero-one loss as an ERM method with a novel prescribed loss function. We demonstrate a number of theoretical and practical advantages over the closely related hinge-loss ERM methods. This establishes adversarial classification under the zero-one loss as a method that fills the long-standing gap in multiclass hinge-loss classification, simultaneously guaranteeing Fisher consistency and universal consistency, while also providing dual parameter sparsity and high-accuracy predictions in practice.
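
For context, losses of this family take the form of a maximum over nonempty subsets S of classes of an averaged potential, AL(f, y) = max_S [ (Σ_{j∈S} (f_j − f_y) + |S| − 1) / |S| ]; the Python sketch below should be read as illustrative of this structure rather than as the paper's exact prescribed loss.

from itertools import combinations

def adversarial_zero_one_loss(scores, y):
    # Maximize (sum_{j in S}(f_j - f_y) + |S| - 1) / |S| over nonempty subsets S.
    # Brute force over subsets for clarity; since the best subset of each size
    # consists of the top-scoring classes, a sorted scan evaluates this in
    # O(K log K).
    K = len(scores)
    psi = [scores[j] - scores[y] for j in range(K)]
    best = float("-inf")
    for size in range(1, K + 1):
        for S in combinations(range(K), size):
            best = max(best, (sum(psi[j] for j in S) + size - 1) / size)
    return best

# The loss reaches zero only once the correct class wins by a margin of 1 over
# every other class, mirroring hinge-like behavior.
print(adversarial_zero_one_loss([0.2, 1.0, 0.1], y=1))  # small but positive: 0.1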


Private Algorithms for Stochastic Saddle Points and Variational Inequalities: Beyond Euclidean Geometry

Neural Information Processing Systems

In this work, we conduct a systematic study of stochastic saddle point problems (SSP) and stochastic variational inequalities (SVI) under the constraint of (ε, δ)-differential privacy (DP) in both Euclidean and non-Euclidean setups.
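
For reference, the two notions being combined are standard. A randomized algorithm $\mathcal{A}$ is $(\epsilon, \delta)$-differentially private if, for all neighboring datasets $D, D'$ and all events $E$,
$$\Pr[\mathcal{A}(D) \in E] \le e^{\epsilon} \, \Pr[\mathcal{A}(D') \in E] + \delta,$$
and the accuracy of a candidate solution $(\hat{x}, \hat{y})$ to the saddle point problem $\min_{x \in \mathcal{X}} \max_{y \in \mathcal{Y}} F(x, y)$ is typically measured by the duality gap
$$\mathrm{Gap}(\hat{x}, \hat{y}) = \max_{y \in \mathcal{Y}} F(\hat{x}, y) - \min_{x \in \mathcal{X}} F(x, \hat{y}).$$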


Slice-100K: A Multimodal Dataset for Extrusion-based 3D Printing
Supplementary Material

Neural Information Processing Systems

Figure 1 (caption): The Slice-100K dataset consists of STL files and their G-code counterparts; each pair shows an STL model (left) and its slices for G-code (right).

Slice-100K will provide a foundational platform that enables broader research at the intersection of manufacturing and artificial intelligence, particularly given recent advances in multimodal large language models (LLMs). Furthermore, openly available data is essential for the democratization of knowledge in the scientific community. We believe Slice-100K will benefit the economy and will be a net positive contribution. However, we do foresee some potential negative societal impacts.


Slice-100K: A Multimodal Dataset for Extrusion-based 3D Printing
Kelly O. Marshall

Neural Information Processing Systems

G-code (Geometric code), or RS-274, is the most widely used programming language for computer numerical control (CNC) and 3D printing. G-code provides machine instructions for the movement of the 3D printer, in particular for the nozzle and stage and for the extrusion of material in extrusion-based additive manufacturing. Currently, there is no large repository of curated CAD models along with their corresponding G-code files for additive manufacturing. To address this issue, we present Slice-100K, a first-of-its-kind dataset of over 100,000 G-code files, along with their tessellated CAD models, LVIS (Large Vocabulary Instance Segmentation) categories, geometric properties, and renderings. We build our dataset from triangulated meshes derived from the Objaverse-XL and Thingi10K datasets. We demonstrate the utility of this dataset by finetuning GPT-2 on a subset of it for G-code translation from a legacy G-code format (Sailfish) to a more modern, widely used format (Marlin). Our dataset can be found here. Slice-100K will be the first step in developing a multimodal foundation model for digital manufacturing.
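
To make the data format concrete, the snippet below (hypothetical example lines, not drawn from the dataset) parses the core extrusion-move command that both the Sailfish and Marlin flavors share; flavor differences in other commands are what the translation task addresses.

import re

# A G1 command moves the nozzle/stage (X, Y, Z), extrudes E mm of filament,
# and sets the feedrate F in mm/min.
sample = "G1 X10.5 Y20.0 Z0.3 E0.0412 F1800"

def parse_g1(line):
    if not line.startswith("G1"):
        return None
    return {m.group(1): float(m.group(2))
            for m in re.finditer(r"([XYZEF])(-?\d+\.?\d*)", line)}

print(parse_g1(sample))
# {'X': 10.5, 'Y': 20.0, 'Z': 0.3, 'E': 0.0412, 'F': 1800.0}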


A hierarchical decomposition for explaining ML performance discrepancies

Neural Information Processing Systems

Machine learning (ML) algorithms often differ in performance across domains. Understanding why their performance differs is crucial for determining which types of interventions (e.g., algorithmic or operational) are most effective at closing the performance gaps. Aggregate decompositions express the total performance gap as the gap due to a shift in the feature distribution p(X) plus the gap due to a shift in the outcome's conditional distribution p(Y|X). While this coarse explanation is helpful for guiding root-cause analyses, it provides limited detail and can only suggest coarse fixes involving all variables in an ML system. Detailed decompositions quantify the importance of each variable to each term in the aggregate decomposition, which can provide a deeper understanding and suggest more targeted interventions. Although parametric methods exist for conducting a full hierarchical decomposition of an algorithm's performance gap at the aggregate and detailed levels, current nonparametric methods cover only parts of the hierarchy; many also require knowledge of the entire causal graph. We introduce a nonparametric hierarchical framework for explaining why the performance of an ML algorithm differs across domains, without requiring causal knowledge. Furthermore, we derive debiased, computationally efficient estimators and statistical inference procedures to construct confidence intervals for the explanations.
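
The aggregate decomposition referenced above can be written as a telescoping identity by inserting the intermediate quantity that pairs the target covariate distribution with the source conditional (notation ours; $S$ and $T$ index the source and target domains and $\ell$ is the loss):
$$\mathbb{E}_{p_T(X)\,p_T(Y|X)}[\ell] - \mathbb{E}_{p_S(X)\,p_S(Y|X)}[\ell] = \underbrace{\mathbb{E}_{p_T(X)\,p_S(Y|X)}[\ell] - \mathbb{E}_{p_S(X)\,p_S(Y|X)}[\ell]}_{\text{shift in } p(X)} + \underbrace{\mathbb{E}_{p_T(X)\,p_T(Y|X)}[\ell] - \mathbb{E}_{p_T(X)\,p_S(Y|X)}[\ell]}_{\text{shift in } p(Y|X)}.$$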