 pseudolabel


065e259a1d2d955e63b99aac6a3a3081-Paper-Conference.pdf

Neural Information Processing Systems

In the adversarial training framework of Carmon et al. (2019) and Gowal et al. (2021), generated or real unlabeled data with pseudolabels are used to improve adversarial robustness. We provide statistical insights to explain why artificially generated data improve adversarial training. In particular, we study how the attack strength and the quality of the unlabeled data affect adversarial robustness in this framework. Our results show that with a high-quality unlabeled data generator, adversarial training can benefit greatly from this framework under large attack strength, while a poor generator can still help to some extent. To adapt to the quality of the generated data, we propose an algorithm that adjusts the weight between the labeled real data and the generated data online, aiming to optimize the adversarial risk. Numerical studies are conducted to verify our theories and show the effectiveness of the proposed algorithm.
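As a rough illustration of the weighting idea in this abstract, the sketch below combines the mean loss on labeled real data with the mean loss on pseudolabeled generated data through a single weight. The function names `pseudolabel` and `weighted_risk`, the score threshold, and the convex-combination form are assumptions made for illustration; the abstract does not specify the paper's objective or its online update rule, so this is not the authors' algorithm.

```python
from statistics import fmean
from typing import Sequence


def pseudolabel(score: float, threshold: float = 0.0) -> int:
    """Turn a (hypothetical) model score into a hard pseudolabel in {-1, +1}."""
    return 1 if score >= threshold else -1


def weighted_risk(
    real_losses: Sequence[float],
    generated_losses: Sequence[float],
    weight: float,
) -> float:
    """Convex combination of two mean losses.

    `weight` is placed on the labeled real data and `1 - weight` on the
    pseudolabeled generated data (an assumed form, for illustration only).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * fmean(real_losses) + (1.0 - weight) * fmean(generated_losses)
```

Under this assumed form, lowering `weight` shifts trust toward the generated data, which is the quantity an online adjustment scheme would tune as the quality of the generator becomes apparent.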



Lifting Weak Supervision To Structured Prediction

Neural Information Processing Systems

For labels taking values in a finite metric space, we introduce techniques new to weak supervision based on pseudo-Euclidean embeddings and tensor decompositions, providing a nearly-consistent noise rate estimator.





Lifting Weak Supervision To Structured Prediction

Neural Information Processing Systems

Weak supervision (WS) is a rich set of techniques that produce pseudolabels by aggregating easily obtained but potentially noisy label estimates from various sources. WS is theoretically well-understood for binary classification, where simple approaches enable consistent estimation of pseudolabel noise rates. Using this result, it has been shown that downstream models trained on the pseudolabels have generalization guarantees nearly identical to those trained on clean labels. While this is exciting, users often wish to use WS for structured prediction, where the output space consists of more than a binary or multi-class label set: e.g.




Unsupervised Hallucination Detection by Inspecting Reasoning Processes

arXiv.org Artificial Intelligence

Unsupervised hallucination detection aims to identify hallucinated content generated by large language models (LLMs) without relying on labeled data. While unsupervised methods have gained popularity by eliminating labor-intensive human annotations, they frequently rely on proxy signals unrelated to factual correctness. This misalignment biases detection probes toward superficial or non-truth-related aspects, limiting generalizability across datasets and scenarios. To overcome these limitations, we propose IRIS, an unsupervised hallucination detection framework that leverages internal representations intrinsic to factual correctness. IRIS prompts the LLM to carefully verify the truthfulness of a given statement and obtains its contextualized embedding as an informative feature for training. Meanwhile, the uncertainty of each response is treated as a soft pseudolabel for truthfulness. Experimental results demonstrate that IRIS consistently outperforms existing unsupervised methods. Our approach is fully unsupervised, computationally inexpensive, and works well even with little training data, making it suitable for real-time detection.
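To make the idea of training on soft pseudolabels concrete, the following minimal sketch fits a logistic probe to feature vectors using targets in [0, 1] rather than hard labels. Everything specific here is an assumption for illustration: the toy 2-D "embeddings", the mapping `1 - uncertainty` from uncertainty to a soft label, the names `train_probe` and `predict`, and the choice of a logistic probe with plain gradient descent. The abstract does not state which probe or loss IRIS uses.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Probability of 'truthful' under the probe."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_probe(
    features: Sequence[Sequence[float]],
    soft_labels: Sequence[float],
    lr: float = 0.5,
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Fit a logistic probe by gradient descent on soft-target cross-entropy."""
    dim, n = len(features[0]), len(features)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, t in zip(features, soft_labels):
            err = predict(w, b, x) - t  # gradient of the loss w.r.t. the logit
            for j in range(dim):
                grad_w[j] += err * x[j] / n
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Toy stand-ins for contextualized embeddings and per-response uncertainty.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
uncertainties = [0.1, 0.2, 0.9, 0.8]
# Assumed mapping: low uncertainty -> soft label close to 1 (truthful).
soft_labels = [1.0 - u for u in uncertainties]

weights, bias = train_probe(embeddings, soft_labels)
```

Because the targets are probabilities, a statement the model is unsure about pulls the probe only weakly toward either class, which is the practical appeal of soft over hard pseudolabels.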