positivity
Supplementary Material for VDE and GCFN: Theoretical Details and Proofs
We use the expectation operator in different contexts in the proof. Here, we show the full derivation of the lower bound for negative mutual-information. We derive the lower bound for the general case where there are both observed and unobserved confounders. The VDE optimization involves the expectations of distributions with parameters with respect to a distribution that also has parameters. In our experiments, we let the control function be a categorical variable.
- North America > United States > Virginia (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > United States > Indiana > Marion County > Indianapolis (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Optimal and Fair Encouragement Policy Evaluation and Learning
In consequential domains, it is often impossible to compel individuals to take treatment, so optimal policy rules are merely suggestions in the presence of human non-adherence to treatment recommendations. In these same domains, there may be heterogeneity both in who takes up treatment and in treatment efficacy. For example, in social services, a persistent puzzle is the gap in take-up of beneficial services among those who may benefit from them the most. When, in addition, the decision-maker has distributional preferences over both access and average outcomes, the optimal decision rule changes. We study identification, doubly-robust estimation, and robust estimation under potential violations of positivity. We consider fairness constraints such as demographic parity in treatment take-up, and other constraints, via constrained optimization. Our framework can be extended to handle algorithmic recommendations under an often-reasonable covariate-conditional exclusion restriction, using our robustness checks for lack of positivity in the recommendation. We develop a two-stage, online learning-based algorithm for solving over parametrized policy classes under general constraints to obtain variance-sensitive regret bounds. We assess improved recommendation rules in a stylized case study of optimizing recommendation of supervised release in the PSA-DMF pretrial risk-assessment tool while reducing surveillance disparities.
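To make the term "doubly-robust estimation" in the abstract above concrete, the following is a minimal sketch of the textbook augmented inverse-propensity-weighted (AIPW) estimator of a mean potential outcome. It is not the paper's estimator for encouragement designs: the simulated data, the function names aipw_mean and simulate, and the deliberately misspecified nuisance models are all assumptions made for illustration.

```python
import numpy as np


def aipw_mean(y, t, e_hat, mu_hat):
    """Textbook AIPW estimate of E[Y(1)].

    y      : observed outcomes
    t      : binary treatment indicator
    e_hat  : estimated propensity P(T=1 | X), must be > 0 (positivity)
    mu_hat : estimated outcome regression E[Y | T=1, X]
    """
    return float(np.mean(mu_hat + t * (y - mu_hat) / e_hat))


def simulate(n=20000, seed=0):
    """Hypothetical data: one binary covariate, confounded treatment."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)
    e = np.where(x == 1, 0.8, 0.3)           # true propensity
    t = rng.binomial(1, e)
    y1 = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=n)
    y = t * y1                                # only treated outcomes are used
    return x, t, y, e


x, t, y, e = simulate()
truth = 2.0  # E[Y(1)] = 1 + 2 * P(X=1) = 1 + 2 * 0.5

# Correct propensity, wrong outcome model (all zeros).
est_right_propensity = aipw_mean(y, t, e, np.zeros_like(y))
# Wrong propensity (constant 0.5), correct outcome model.
est_right_outcome = aipw_mean(y, t, np.full_like(e, 0.5), 1.0 + 2.0 * x)
# Both nuisance models wrong: no robustness guarantee.
est_both_wrong = aipw_mean(y, t, np.full_like(e, 0.5), np.zeros_like(y))

print(est_right_propensity, est_right_outcome, est_both_wrong)
```

In this simulation the estimate stays close to the true value when either nuisance model is correct and drifts away when both are wrong, which is the property the name "doubly robust" refers to. The division by e_hat also shows why positivity matters: propensities near zero inflate the correction term.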
Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving on Inequalities
Zhao, Haoyu, Geng, Yihan, Tang, Shange, Lin, Yong, Lyu, Bohan, Lin, Hongzhou, Jin, Chi, Arora, Sanjeev
LLM-based formal proof assistants (e.g., in Lean) hold great promise for automating mathematical discovery. But beyond syntactic correctness, do these systems truly understand mathematical structure as humans do? We investigate this question in the context of mathematical inequalities -- specifically the prover's ability to recognize that the given problem simplifies by applying a known inequality such as AM/GM. In particular, we are interested in their ability to do this in a compositional setting where multiple inequalities must be applied as part of a solution. We introduce Ineq-Comp, a benchmark built from elementary inequalities through systematic transformations, including variable duplication, algebraic rewriting, and multi-step composition. Although these problems remain easy for humans, we find that most provers -- including Goedel, STP, and Kimina-7B -- struggle significantly. DeepSeek-Prover-V2-7B shows relative robustness, but still suffers a 20% performance drop (pass@32). Even for the DeepSeek-Prover-V2-671B model, the gap between compositional variants and seed problems persists, implying that scaling up the model size alone does not fully solve the compositional weakness. Strikingly, performance remains poor for all models even when formal proofs of the constituent parts are provided in context, revealing that the source of weakness is indeed in compositional reasoning. Our results expose a persisting gap between the generalization behavior of current AI provers and human mathematical intuition. All data and evaluation code can be found at https://github.com/haoyuzhao123/LeanIneqComp.
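As a rough illustration of what a seed problem and a compositional variant might look like, here is a small Lean 4 sketch assuming Mathlib is available. The two statements and the names amgm_seed and amgm_duplicated are hypothetical and written for this example; they are not taken from the Ineq-Comp benchmark.

```lean
import Mathlib

-- Hypothetical seed problem: two-variable AM-GM in squared form.
theorem amgm_seed (a b : ℝ) : a * b ≤ (a ^ 2 + b ^ 2) / 2 := by
  nlinarith [sq_nonneg (a - b)]

-- Hypothetical "variable duplication" variant: two independent copies
-- of the seed added together. The proof simply composes the seed twice.
theorem amgm_duplicated (a b c d : ℝ) :
    a * b + c * d ≤ (a ^ 2 + b ^ 2) / 2 + (c ^ 2 + d ^ 2) / 2 := by
  exact add_le_add (amgm_seed a b) (amgm_seed c d)
```

A human sees at once that the second statement is the first one applied twice; the abstract's claim is that provers often fail to exploit exactly this kind of structure.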
- North America > United States (0.04)
- Europe > Germany > Berlin (0.04)
Delay Independent Safe Control with Neural Networks: Positive Lur'e Certificates for Risk Aware Autonomy
Hedesh, Hamidreza Montazeri, Siami, Milad
We present a risk-aware safety certification method for autonomous, learning-enabled control systems. Focusing on two realistic risks, state/input delays and interval matrix uncertainty, we model the neural network (NN) controller with local sector bounds and exploit positivity structure to derive linear, delay-independent certificates that guarantee local exponential stability across admissible uncertainties. To benchmark performance, we adopt and implement a state-of-the-art IQC NN verification pipeline. On representative cases, our positivity-based tests run orders of magnitude faster than SDP-based IQC while certifying regimes the latter cannot, providing scalable safety guarantees that complement risk-aware control.
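To show why positivity structure yields cheap, linear, delay-independent tests, the sketch below checks the classical condition for a positive linear delay system x'(t) = A x(t) + A_d x(t - tau): when A is Metzler and A_d is entrywise nonnegative, a vector v > 0 with (A + A_d) v < 0 certifies asymptotic stability for every constant delay tau >= 0. This is only the linear special case, with no neural network or sector bounds, so it is not the paper's Lur'e certificate; the function names and the example matrices are assumptions made for illustration.

```python
import numpy as np


def is_metzler(a):
    """True if all off-diagonal entries of a are nonnegative."""
    off_diag = a - np.diag(np.diag(a))
    return bool(np.all(off_diag >= 0))


def delay_independent_certificate(a, a_d):
    """Return v > 0 with (a + a_d) @ v < 0, or None if no such v exists.

    For Metzler M = a + a_d, M is Hurwitz exactly when the solution of
    M v = -1 is entrywise positive, so one linear solve suffices.
    """
    a = np.asarray(a, dtype=float)
    a_d = np.asarray(a_d, dtype=float)
    if not is_metzler(a) or np.any(a_d < 0):
        raise ValueError("positivity structure violated")
    m = a + a_d
    try:
        v = np.linalg.solve(m, -np.ones(m.shape[0]))
    except np.linalg.LinAlgError:
        return None  # singular M cannot be Hurwitz
    return v if np.all(v > 0) else None


# Hypothetical example matrices.
A = np.array([[-3.0, 1.0], [0.5, -2.0]])
A_d_small = np.array([[0.5, 0.2], [0.3, 0.4]])   # mild delayed coupling
A_d_large = np.array([[3.0, 0.2], [0.3, 0.4]])   # strong delayed coupling

cert_small = delay_independent_certificate(A, A_d_small)
cert_large = delay_independent_certificate(A, A_d_large)
print(cert_small, cert_large)
```

The test costs one linear solve rather than a semidefinite program, which gives some intuition for the speed gap over SDP-based methods that the abstract reports.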
- North America > United States > Virginia (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > United States > Indiana > Marion County > Indianapolis (0.04)
3e883840fee4384dd3d2afea5e822517-AuthorFeedback.pdf
We thank all reviewers for their comments and acknowledgement of our contribution. Theorem 3 and Corollary 4, as Reviewer 3 suggested. How to choose the proper Bregman divergence? It is yet unclear whether there exist ways to systematically design the "best Bregman divergence in a (k 1) This is also commonly adopted in the literature. Is continuity of the intensity function restrictive?
Incongruent Positivity: When Miscalibrated Positivity Undermines Online Supportive Conversations
In emotionally supportive conversations, well-intended positivity can sometimes misfire, leading to responses that feel dismissive, minimizing, or unrealistically optimistic. We examine this phenomenon of incongruent positivity as miscalibrated expressions of positive support in both human and LLM-generated responses. To this end, we collected real user-assistant dialogues from Reddit across a range of emotional intensities and generated additional responses using large language models for the same context. We categorize these conversations by intensity into two levels: Mild, which covers relationship tension and general advice, and Severe, which covers grief and anxiety conversations. This categorization enables a comparative analysis of how supportive responses vary across lower-stakes and higher-stakes contexts. Our analysis reveals that LLMs are more prone to unrealistic positivity through a dismissive and minimizing tone, particularly in high-stakes contexts. To further study the underlying dimensions of this phenomenon, we finetune LLMs on datasets with strong and weak emotional reactions. Moreover, we developed a weakly supervised multilabel classifier ensemble (DeBERTa and MentalBERT) that shows improved detection of incongruent positivity types across the two levels of concern (Mild and Severe). Our findings shed light on the need to move beyond merely generating generic positive responses and instead study congruent support measures that balance positive affect with emotional acknowledgment. This approach offers insights into aligning large language models with affective expectations in online supportive dialogue, paving the way toward context-aware and trust-preserving online conversation systems.
- North America > United States (0.06)
- Europe > Croatia > Zagreb County > Zagreb (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)