Learning Cuts via Enumeration Oracles

Neural Information Processing Systems

Cutting planes are one of the most important building blocks for solving large-scale integer programming (IP) problems to (near) optimality. The majority of cutting-plane approaches rely on explicit rules to derive valid inequalities that can separate the target point from the feasible set. Local cuts, on the other hand, seek to directly derive the facets of the underlying polyhedron and use them as cutting planes. However, current approaches rely on solving Linear Programming (LP) problems in order to derive such a hyperplane. In this paper, we present a novel generic approach for learning the facets of the underlying polyhedron by accessing it implicitly via an enumeration oracle in a reduced dimension. This is achieved by embedding the oracle in a variant of the Frank-Wolfe algorithm which is capable of generating strong cutting planes, effectively turning the enumeration oracle into a separation oracle. We demonstrate the effectiveness of our approach with a case study targeting the multidimensional knapsack problem (MKP).
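To make the oracle-to-separator idea concrete, here is a minimal illustrative sketch (not the paper's algorithm, whose oracle and step rules differ): vanilla Frank-Wolfe minimizes the squared distance from a target point to a polytope that is accessed only through an enumeration oracle over its vertices. If the target lies outside the polytope, the gradient at the near-optimal iterate yields a separating hyperplane, i.e. a cutting plane. All function names below are hypothetical.

```python
def enumeration_oracle(grad, vertices):
    """Linear minimization oracle: return the vertex minimizing <grad, v>."""
    return min(vertices, key=lambda v: sum(g * vi for g, vi in zip(grad, v)))

def frank_wolfe_separate(target, vertices, iters=500):
    """Minimize ||x - target||^2 over conv(vertices) via Frank-Wolfe.

    Returns the final iterate x and a hyperplane <a, y> = b with
    <a, v> >= b (approximately) for all polytope points v, while
    <a, target> < b when target lies strictly outside the polytope.
    """
    x = list(vertices[0])                       # start at an arbitrary vertex
    for t in range(iters):
        grad = [2 * (xi - ti) for xi, ti in zip(x, target)]
        s = enumeration_oracle(grad, vertices)  # oracle call, no explicit H-rep
        step = 2.0 / (t + 2)                    # standard Frank-Wolfe step size
        x = [(1 - step) * xi + step * si for xi, si in zip(x, s)]
    a = [xi - ti for xi, ti in zip(x, target)]  # normal of the separator
    b = sum(ai * xi for ai, xi in zip(a, x))
    return x, a, b
```

For instance, with the unit square's vertices and the outside point (2, 0.5), the returned hyperplane has normal close to (-1, 0) and cuts the target off from the square.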


Rigor in AI: Doing Rigorous AI Work Requires a Broader, Responsible AI-Informed Conception of Rigor

Olteanu, Alexandra, Blodgett, Su Lin, Balayn, Agathe, Wang, Angelina, Diaz, Fernando, Calmon, Flavio du Pin, Mitchell, Margaret, Ekstrand, Michael, Binns, Reuben, Barocas, Solon

arXiv.org Artificial Intelligence

In AI research and practice, rigor remains largely understood in terms of methodological rigor -- such as whether mathematical, statistical, or computational methods are correctly applied. We argue that this narrow conception of rigor has contributed to the concerns raised by the responsible AI community, including overblown claims about the capabilities of AI systems. Our position is that a broader conception of what rigorous AI research and practice should entail is needed. We believe such a conception -- in addition to a more expansive understanding of (1) methodological rigor -- should include aspects related to (2) what background knowledge informs what to work on (epistemic rigor); (3) how disciplinary, community, or personal norms, standards, or beliefs influence the work (normative rigor); (4) how clearly articulated the theoretical constructs under use are (conceptual rigor); (5) what is reported and how (reporting rigor); and (6) how well-supported the inferences from existing evidence are (interpretative rigor). In doing so, we also provide useful language and a framework for much-needed dialogue about the AI community's work by researchers, policymakers, journalists, and other stakeholders.





Provable Certificates for Adversarial Examples: Fitting a Ball in the Union of Polytopes

Matt Jordan, Justin Lewis, Alexandros G. Dimakis

Neural Information Processing Systems

We relate the problem of computing pointwise robustness of neural networks to that of computing the maximum norm ball with a fixed center that can be contained in a non-convex polytope. This is a challenging problem in general; however, we show that there exists an efficient algorithm to compute this for polyhedral complexes.
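The convex building block of this problem has a closed form: for a single polytope {y : Ay <= b}, the largest l2 ball centered at a fixed point x that fits inside has radius min_i (b_i - <a_i, x>) / ||a_i||. The sketch below implements only this single-polytope primitive (the paper's setting, a non-convex union of such cells, requires more machinery); the function name is hypothetical.

```python
import math

def ball_radius_in_polytope(A, b, x):
    """Radius of the largest l2 ball centered at x inside {y : Ay <= b}.

    A is a list of constraint normals a_i, b the corresponding offsets.
    Returns 0.0 if x itself violates some constraint.
    """
    r = math.inf
    for a_i, b_i in zip(A, b):
        norm = math.sqrt(sum(c * c for c in a_i))
        slack = b_i - sum(c * xi for c, xi in zip(a_i, x))  # distance numerator
        if slack < 0:
            return 0.0                  # center lies outside this polytope
        r = min(r, slack / norm)        # tightest constraint wins
    return r
```

As a sanity check, for the unit square written as four halfspaces, the center (0.5, 0.5) admits a ball of radius 0.5, and an off-center point gives the distance to its nearest face.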


Enhancing the Medical Context-Awareness Ability of LLMs via Multifaceted Self-Refinement Learning

Zhou, Yuxuan, Wang, Yubin, Wang, Bin, Ning, Chen, Liu, Xien, Wu, Ji, Hao, Jianye

arXiv.org Artificial Intelligence

Large language models (LLMs) have shown great promise in the medical domain, achieving strong performance on several benchmarks. However, they continue to underperform in real-world medical scenarios, which often demand stronger context-awareness, i.e., the ability to recognize missing or critical details (e.g., user identity, medical history, risk factors) and provide safe, helpful, and contextually appropriate responses. To address this issue, we propose Multifaceted Self-Refinement (MuSeR), a data-driven approach that enhances LLMs' context-awareness along three key facets (decision-making, communication, and safety) through self-evaluation and refinement. Specifically, we first design an attribute-conditioned query generator that simulates diverse real-world user contexts by varying attributes such as role, geographic region, intent, and degree of information ambiguity. An LLM then responds to these queries, self-evaluates its answers along the three key facets, and refines its responses to better align with the requirements of each facet. Finally, the queries and refined responses are used for supervised fine-tuning to reinforce the model's context-awareness ability. Evaluation results on the latest HealthBench dataset demonstrate that our method significantly improves LLM performance across multiple aspects, with particularly notable gains on the context-awareness axis. Furthermore, by incorporating knowledge distillation with the proposed method, the performance of a smaller backbone LLM (e.g., Qwen3-32B) surpasses that of its teacher model, achieving a new SOTA across all open-source LLMs on HealthBench (63.8%) and its hard subset (43.1%). Code and dataset will be released at https://muser-llm.github.io.
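The generate/self-evaluate/refine loop described above can be sketched schematically as follows. This is only an illustrative outline, not the released MuSeR code: `llm` is a stub standing in for a real model call, and all function names, prompts, and attribute keys here are hypothetical.

```python
FACETS = ("decision-making", "communication", "safety")

def generate_query(llm, attrs):
    """Attribute-conditioned query generation (role, region, intent, ambiguity)."""
    return llm("Write a medical question from this user profile: " + str(attrs))

def self_refine(llm, query):
    """Answer a query, then critique and revise it once per facet."""
    answer = llm("Answer: " + query)
    for facet in FACETS:
        critique = llm(f"Critique the answer for {facet}: " + answer)
        answer = llm(f"Revise the answer given this critique ({critique}): " + answer)
    return answer

def build_sft_dataset(llm, profiles):
    """Pair each generated query with its refined answer for fine-tuning."""
    pairs = []
    for profile in profiles:
        query = generate_query(llm, profile)
        pairs.append((query, self_refine(llm, query)))
    return pairs
```

With a real model behind `llm`, the resulting (query, refined answer) pairs would form the supervised fine-tuning set the abstract describes.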


Appendix

Neural Information Processing Systems

Appendix A contains details on the experiments conducted throughout this paper. Details regarding the datasets used in the experiments are included in Table 2. All of the attacks are run on an Ubuntu (16.04) cluster. Yang et al. [2020] uses Gurobi as the solver. For GeoAdEx, we choose to compute distance to cell, set m to 20, and apply a time limit of 100 seconds per test point. Wang et al. [2019] is generally the fastest, and GeoAdEx is faster than Yang et al.



Appendices A Proofs

Neural Information Processing Systems

This part contains the proofs of Lemma 3.1 and Theorem 3.2. We also restate both results using the new notation introduced here, so that this part is self-contained. Under the new notation, Equation (2) in Section 3.2 becomes f(x) = tnull xnull. By Lemma A.2, the optimal solution of (7) can be found on the vertices of the feasible region. We then consider the discreteness constraint.

The detailed parameters used in the training and pruning stages of our method are listed in Table 4. The main building block of ResNet-50 is the bottleneck block [He et al., 2016], as shown in Figure 5. The scaling factors of a bottleneck block are already 0 or very close to 0, so we do not apply any extra treatment there. We visualize the layer-wise distribution of scaling factors in Figure 6, which compares the baseline ResNet-50 model with the model trained with our polarization regularizer on the ImageNet dataset.