Goto

Collaborating Authors

 temp


Foraprobabilityspace (ΩBox,E,PBox),withΩBox Rd,theGaussian-boxprocessisgeneratedas µi ΩBox, σi Rd+, r Rd+ Ci N(µi,σi), Xi, =Ci+ri, Xi, =Ci ri, Box(Xi) = dY

Neural Information Processing Systems

All coordinates will be modeled by independent Gumbel distributions, and thus it is enough to calculate the expected side-length of a box as the expected volume will simply be the product of the expected side-lengths. To properly restrict the Gumbel distributions to[0,1], we can either formcensoredortruncated distributions. Thetruncateddistribution,ontheotherhand,multipliesthe densities with the indicator function for[0,1]and renormalizes them to integrate to 1. The higher the temperature of the boxes, the more the true integral will tend to provide larger conditional probabilities. Monte Carlo experiments support this conclusion.


How Sampling Affects the Detectability of Machine-written texts: A Comprehensive Study

Dubois, Matthieu, Yvon, François, Piantanida, Pablo

arXiv.org Artificial Intelligence

As texts generated by Large Language Models (LLMs) are ever more common and often indistinguishable from human-written content, research on automatic text detection has attracted growing attention. Many recent detectors report near-perfect accuracy, often boasting AUROC scores above 99\%. However, these claims typically assume fixed generation settings, leaving open the question of how robust such systems are to changes in decoding strategies. In this work, we systematically examine how sampling-based decoding impacts detectability, with a focus on how subtle variations in a model's (sub)word-level distribution affect detection performance. We find that even minor adjustments to decoding parameters - such as temperature, top-p, or nucleus sampling - can severely impair detector accuracy, with AUROC dropping from near-perfect levels to 1\% in some settings. Our findings expose critical blind spots in current detection methods and emphasize the need for more comprehensive evaluation protocols. To facilitate future research, we release a large-scale dataset encompassing 37 decoding configurations, along with our code and evaluation framework https://github.com/BaggerOfWords/Sampling-and-Detection



Uncertainty-aware Reward Design Process

Yang, Yang, Zhou, Xiaolu, Ding, Bosong, Xin, Miao

arXiv.org Artificial Intelligence

Designing effective reward functions is a cornerstone of reinforcement learning (RL), yet it remains a challenging process due to the inefficiencies and inconsistencies inherent in conventional reward engineering methodologies. Recent advances have explored leveraging large language models (LLMs) to automate reward function design. However, their suboptimal performance in numerical optimization often yields unsatisfactory reward quality, while the evolutionary search paradigm demonstrates inefficient utilization of simulation resources, resulting in prohibitively lengthy design cycles with disproportionate computational overhead. To address these challenges, we propose the Uncertainty-aware Reward Design Process (URDP), a novel framework that integrates large language models to streamline reward function design and evaluation in RL environments. URDP quantifies candidate reward function uncertainty based on self-consistency analysis, enabling simulation-free identification of ineffective reward components while discovering novel reward components. Furthermore, we introduce uncertainty-aware Bayesian optimization (UABO), which incorporates uncertainty estimation to significantly enhance hyperparameter configuration efficiency. Finally, we construct a bi-level optimization architecture by decoupling the reward component optimization and the hyperparameter tuning. URDP orchestrates synergistic collaboration between the reward logic reasoning of the LLMs and the numerical optimization strengths of the Bayesian Optimization. We conduct a comprehensive evaluation of URDP across 35 diverse tasks spanning three benchmark environments. Our experimental results demonstrate that URDP not only generates higher-quality reward functions but also achieves significant improvements in the efficiency of automated reward design compared to existing approaches.


Reviews: A flexible model for training action localization with varying levels of supervision

Neural Information Processing Systems

Paper Summary: The paper describes a method for spatio-temporal human action localization in temporally untrimmed videos based on discriminative clustering [3, 47]. The main contribution of this paper is a new action detection approach which is flexible in the sense that it can be trained with various levels and amounts of supervision. For example, the model can be trained with very weak level of supervision, i.e., train the model for action detection only using ground truth video-level action labels; and also it can be trained with full supervision i.e. with dense per frame bounding box and their class labels. Experimental results demonstrate the strengths and weaknesses for a wide range of supervisory signals such as, video level action labels, single temporal point, one GT bounding box, temporal bounds etc. The method is experimentally evaluated on the UCF-101-24 and DALY action detection datasets.


GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick

Fu, Jiayi, Zhao, Xuandong, Yang, Ruihan, Zhang, Yuansen, Chen, Jiangjie, Xiao, Yanghua

arXiv.org Artificial Intelligence

Large language models (LLMs) excellently generate human-like text, but also raise concerns about misuse in fake news and academic dishonesty. Decoding-based watermark, particularly the GumbelMax-trick-based watermark(GM watermark), is a standout solution for safeguarding machine-generated texts due to its notable detectability. However, GM watermark encounters a major challenge with generation diversity, always yielding identical outputs for the same prompt, negatively impacting generation diversity and user experience. To overcome this limitation, we propose a new type of GM watermark, the Logits-Addition watermark, and its three variants, specifically designed to enhance diversity. Among these, the GumbelSoft watermark (a softmax variant of the Logits-Addition watermark) demonstrates superior performance in high diversity settings, with its AUROC score outperforming those of the two alternative variants by 0.1 to 0.3 and surpassing other decoding-based watermarking methods by a minimum of 0.1.


Automated fault tree learning from continuous-valued sensor data: a case study on domestic heaters

Verkuil, Bart, Budde, Carlos E., Bucur, Doina

arXiv.org Artificial Intelligence

Many industrial sectors have been collecting big sensor data. With recent technologies for processing big data, companies can exploit this for automatic failure detection and prevention. We propose the first completely automated method for failure analysis, machine-learning fault trees from raw observational data with continuous variables. Our method scales well and is tested on a real-world, five-year dataset of domestic heater operations in The Netherlands, with 31 million unique heater-day readings, each containing 27 sensor and 11 failure variables. Our method builds on two previous procedures: the C4.5 decision-tree learning algorithm, and the LIFT fault tree learning algorithm from Boolean data. C4.5 pre-processes each continuous variable: it learns an optimal numerical threshold which distinguishes between faulty and normal operation of the top-level system. These thresholds discretise the variables, thus allowing LIFT to learn fault trees which model the root failure mechanisms of the system and are explainable. We obtain fault trees for the 11 failure variables, and evaluate them in two ways: quantitatively, with a significance score, and qualitatively, with domain specialists. Some of the fault trees learnt have almost maximum significance (above 0.95), while others have medium-to-low significance (around 0.30), reflecting the difficulty of learning from big, noisy, real-world sensor data. The domain specialists confirm that the fault trees model meaningful relationships among the variables.