 conditioning event


Malliavin Calculus with Weak Derivatives for Counterfactual Stochastic Optimization

arXiv.org Artificial Intelligence

We study counterfactual stochastic optimization of conditional loss functionals under misspecified and noisy gradient information. The difficulty is that when the conditioning event has vanishing or zero probability, naive Monte Carlo estimators are prohibitively inefficient; kernel smoothing, though common, suffers from slow convergence. We propose a two-stage kernel-free methodology. First, we show using Malliavin calculus that the conditional loss functional of a diffusion process admits an exact representation as a Skorohod integral, yielding variance comparable to classical Monte Carlo variance. Second, we establish that a weak derivative estimate of the conditional loss functional with respect to model parameters can be evaluated with constant variance, in contrast to the widely used score function method, whose variance grows linearly in the sample path length. Together, these results yield an efficient framework for counterfactual conditional stochastic gradient algorithms in rare-event regimes.


Multicalibration for Confidence Scoring in LLMs

arXiv.org Machine Learning

This paper proposes the use of "multicalibration" to yield interpretable and reliable confidence scores for outputs generated by large language models (LLMs). Multicalibration asks for calibration not just marginally, but simultaneously across various intersecting groupings of the data. We show how to form groupings for prompt/completion pairs that are correlated with the probability of correctness via two techniques: clustering within an embedding space, and "self-annotation", that is, querying the LLM by asking it various yes-or-no questions about the prompt. We also develop novel variants of multicalibration algorithms that offer performance improvements by reducing their tendency to overfit. Through systematic benchmarking across various question answering datasets and LLMs, we show how our techniques can yield confidence scores that provide substantial improvements in fine-grained measures of both calibration and accuracy compared to existing methods.
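To make the idea of calibration "simultaneously across intersecting groupings" concrete, the following is a minimal sketch of a per-group binned calibration error. It is not the paper's algorithm: the function name `groupwise_calibration_error`, the equal-width binning, and the toy inputs are assumptions made for illustration only. A set of scores is closer to multicalibrated when the error is small for every group, not just on average.

```python
from collections import defaultdict


def groupwise_calibration_error(scores, correct, groups, n_bins=5):
    """Binned calibration error computed separately for each group.

    scores  : confidence scores in [0, 1]
    correct : 0/1 correctness labels
    groups  : for each example, the set of group names it belongs to
              (groups may overlap, as in multicalibration)
    Returns {group: size-weighted mean |avg confidence - accuracy| over bins}.
    """
    # Collect (score, label) pairs per group; one example can feed several groups.
    by_group = defaultdict(list)
    for s, y, gs in zip(scores, correct, groups):
        for g in gs:
            by_group[g].append((s, y))

    errors = {}
    for g, pairs in by_group.items():
        # Equal-width bins over [0, 1]; a score of exactly 1.0 goes in the last bin.
        bins = defaultdict(list)
        for s, y in pairs:
            bins[min(int(s * n_bins), n_bins - 1)].append((s, y))
        err = 0.0
        for members in bins.values():
            conf = sum(s for s, _ in members) / len(members)
            acc = sum(y for _, y in members) / len(members)
            err += len(members) / len(pairs) * abs(conf - acc)
        errors[g] = err
    return errors


# Hypothetical toy data: two overlapping groups "a" and "b".
toy_errors = groupwise_calibration_error(
    scores=[0.9, 0.9, 0.1, 0.1],
    correct=[1, 1, 0, 0],
    groups=[{"a"}, {"a", "b"}, {"b"}, {"a"}],
)
```

In this sketch the groups are given as explicit sets; in the setting described above they would instead come from embedding clusters or from the LLM's answers to yes-or-no questions.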


Exact Selective Inference with Randomization

arXiv.org Machine Learning

The polyhedral method by Lee et al. (2016) introduced confidence intervals for exact selective inference in Gaussian regression models. This method provides valid inferences for selected parameters by conditioning on the outcome of selection. A pivot is obtained for each selected parameter from a truncated Gaussian distribution, provided the outcome of selection can be described by linear constraints, also known as polyhedral constraints. However, as shown by Kivaranovic and Leeb (2021), confidence intervals based on this pivot can have infinite length in expectation. Randomizing data at the time of selection and conditioning on the outcome of randomized selection produces narrower confidence intervals than the polyhedral method.
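The truncated Gaussian pivot mentioned above can be sketched as follows. This is an illustrative version of the standard truncated-normal CDF, not code from either paper: the function names and the scalar interface (a single statistic `x` truncated to `[lower, upper]`) are assumptions, and the naive difference of CDFs shown here is numerically unstable far in the tails.

```python
import math


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_gaussian_pivot(x, mu, sigma, lower, upper):
    """CDF at x of N(mu, sigma^2) truncated to [lower, upper].

    When mu is the true mean and x is drawn from the truncated law,
    this quantity is Uniform(0, 1), which is what makes it a pivot.
    """
    if not lower <= x <= upper:
        raise ValueError("x must lie inside the truncation interval")
    a = norm_cdf((lower - mu) / sigma)
    b = norm_cdf((upper - mu) / sigma)
    return (norm_cdf((x - mu) / sigma) - a) / (b - a)


# With no truncation the pivot reduces to the ordinary Gaussian CDF.
untruncated = truncated_gaussian_pivot(0.3, 0.0, 1.0, -math.inf, math.inf)
# With symmetric truncation around mu, the midpoint maps to one half.
midpoint = truncated_gaussian_pivot(0.0, 0.0, 1.0, -1.0, 1.0)
```

A confidence interval for `mu` would be obtained by inverting this pivot in `mu`; when `x` sits close to a truncation boundary that inversion can produce very wide intervals, which is the behavior the abstract attributes to the polyhedral method.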


Testing for equality between conditional copulas given discretized conditioning events

arXiv.org Machine Learning

Several procedures have recently been proposed to test the simplifying assumption for conditional copulas. Instead of considering pointwise conditioning events, we study the constancy of the conditional dependence structure when some covariates belong to general Borel conditioning subsets. Several test statistics based on the equality of conditional Kendall's taus are introduced, and we derive their asymptotic distributions under the null. When such conditioning events are not fixed ex ante, we propose a data-driven procedure to recursively build such relevant subsets. It is based on decision trees that maximize the differences between the conditional Kendall's taus corresponding to the leaves of the trees. The performance of these tests is illustrated in a simulation experiment. Moreover, a study of the conditional dependence between financial stock returns is carried out, given some clustering of their past values. The last application deals with the conditional dependence between coverage amounts in an insurance dataset.
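As a rough illustration of comparing Kendall's tau across conditioning subsets, the sketch below computes the empirical tau of (X, Y) separately for observations whose covariate Z falls in each of several intervals. It is not the paper's test statistic or tree-building procedure: the function names, the use of half-open intervals as the conditioning subsets, and the toy data are all assumptions.

```python
def kendall_tau(xs, ys):
    """Empirical Kendall's tau (tau-a): (concordant - discordant) / number of pairs."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1  # tied pairs count as neither
    return (concordant - discordant) / (n * (n - 1) / 2)


def conditional_taus(xs, ys, zs, boxes):
    """Kendall's tau of (x, y) restricted to each conditioning subset of z.

    boxes maps a label to a half-open interval (low, high), meaning low <= z < high.
    """
    taus = {}
    for label, (low, high) in boxes.items():
        idx = [i for i, z in enumerate(zs) if low <= z < high]
        taus[label] = kendall_tau([xs[i] for i in idx], [ys[i] for i in idx])
    return taus


# Hypothetical toy data: dependence flips sign between the two subsets of z.
taus = conditional_taus(
    xs=[1, 2, 3, 1, 2, 3],
    ys=[1, 2, 3, 3, 2, 1],
    zs=[0.1, 0.2, 0.3, 0.7, 0.8, 0.9],
    boxes={"low": (0.0, 0.5), "high": (0.5, 1.0)},
)
```

Under the hypothesis of constant conditional dependence, the per-subset taus would agree up to sampling noise; a large gap between them, as in the toy data, is the kind of signal a formal test would quantify.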


On conditional versus marginal bias in multi-armed bandits

arXiv.org Machine Learning

The bias of the sample means of the arms in multi-armed bandits is an important issue in adaptive data analysis that has recently received considerable attention in the literature. Existing results relate in precise ways the sign and magnitude of the bias to various sources of data adaptivity, but do not apply to the conditional inference setting in which the sample means are computed only if some specific conditions are satisfied. In this paper, we characterize the sign of the conditional bias of monotone functions of the rewards, including the sample mean. Our results hold for arbitrary conditioning events and leverage natural monotonicity properties of the data collection policy. We further demonstrate, through several examples from sequential testing and best arm identification, that the sign of the conditional and unconditional bias of the sample mean of an arm can be different, depending on the conditioning event. Our analysis offers new and interesting perspectives on the subtleties of assessing the bias in data-adaptive settings.


Unforeseen Evidence

arXiv.org Artificial Intelligence

In this note, I propose a normative updating rule, extended Bayesianism, for the incorporation of probabilistic information arising from the process of becoming more aware. Extended Bayesianism generalizes standard Bayesian updating to allow the posterior to reside on a richer probability space than the prior. I then provide an observable criterion on prior and posterior beliefs under which they are consistent with extended Bayesianism. Key words: extended Bayesianism; reverse Bayesianism; conditional expectations. Conditioning on Unforeseen Evidence: Decision makers (DMs) who are unaware cannot conceive of, nor articulate, the decision-relevant contingencies they are unaware of.


Anticipatory Thinking: A Metacognitive Capability

arXiv.org Artificial Intelligence

Anticipatory thinking is a complex cognitive process for assessing and managing risk in many contexts. Humans use anticipatory thinking to identify potential future issues and proactively take actions to manage their risks. In this paper we define a cognitive systems approach to anticipatory thinking as a metacognitive goal reasoning mechanism. The contributions of this paper include (1) defining anticipatory thinking in the MIDCA cognitive architecture, (2) operationalizing anticipatory thinking as a three-step process for managing risk in plans, and (3) a numeric risk assessment calculating an expected cost-benefit ratio for modifying a plan with anticipatory actions.


Controlling Global Statistics in Recurrent Neural Network Text Generation

AAAI Conferences

Recurrent neural network language models (RNNLMs) are an essential component for many language generation tasks such as machine translation, summarization, and automated conversation. Often, we would like to subject the text generated by the RNNLM to constraints, in order to overcome systemic errors (e.g. word repetition) or achieve application-specific goals (e.g. more positive sentiment). In this paper, we present a method for training RNNLMs to simultaneously optimize likelihood and follow a given set of statistical constraints on text generation. The problem is challenging because the statistical constraints are defined over aggregate model behavior, rather than model parameters, meaning that a straightforward parameter regularization approach is insufficient. We solve this problem using a dynamic regularizer that updates as training proceeds, based on the generative behavior of the RNNLMs. Our experiments show that the dynamic regularizer outperforms both generic training and a static regularization baseline. The approach is successful at improving word-level repetition statistics by a factor of four in RNNLMs on a definition modeling task. It also improves model perplexity when the statistical constraints are $n$-gram statistics taken from a large corpus.