Goto

Collaborating Authors

 statistic



A tensor network formalism for neuro-symbolic AI

Goessmann, Alex, Schütte, Janina, Fröhlich, Maximilian, Eigel, Martin

arXiv.org Machine Learning

The unification of neural and symbolic approaches to artificial intelligence remains a central open challenge. In this work, we introduce a tensor network formalism, which captures sparsity principles originating in the different approaches in tensor decompositions. In particular, we describe a basis encoding scheme for functions and model neural decompositions as tensor decompositions. The proposed formalism can be applied to represent logical formulas and probability distributions as structured tensor decompositions. This unified treatment identifies tensor network contractions as a fundamental inference class and formulates efficiently scaling reasoning algorithms, originating from probability theory and propositional logic, as contraction message passing schemes. The framework enables the definition and training of hybrid logical and probabilistic models, which we call Hybrid Logic Network. The theoretical concepts are accompanied by the python library tnreason, which enables the implementation and practical use of the proposed architectures.


Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking

Neural Information Processing Systems

Stochastic dominance is an important concept in probability theory, econometrics and social choice theory for robustly modeling agents' preferences between random outcomes. While many works have been dedicated to the univariate case,little has been done in the multivariate scenario, wherein an agent has to decide between different multivariate outcomes. By exploiting a characterization of multivariate first stochastic dominance in terms of couplings, we introduce a statistic that assesses multivariate almost stochastic dominance under the framework of Optimal Transport with a smooth cost. Further, we introduce an entropic regularization of this statistic, and establish a central limit theorem (CLT) and consistency of the bootstrap procedure for the empirical statistic. Armed with this CLT, we propose a hypothesis testing framework as well as an efficient implementation using the Sinkhorn algorithm. We showcase our method in comparing and benchmarking Large Language Models that are evaluated on multiple metrics. Our multivariate stochastic dominance test allows us to capture the dependencies between the metrics in order to make an informed and statistically significant decision on the relative performance of the models.


Semiparametric KSD test: unifying score and distance-based approaches for goodness-of-fit testing

Huang, Zhihan, Niu, Ziang

arXiv.org Machine Learning

Goodness-of-fit (GoF) tests are fundamental for assessing model adequacy. Score-based tests are appealing because they require fitting the model only once under the null. However, extending them to powerful nonparametric alternatives is difficult due to the lack of suitable score functions. Through a class of exponentially tilted models, we show that the resulting score-based GoF tests are equivalent to the tests based on integral probability metrics (IPMs) indexed by a function class. When the class is rich, the test is universally consistent. This simple yet insightful perspective enables reinterpretation of classical distance-based testing procedures-including those based on Kolmogorov-Smirnov distance, Wasserstein-1 distance, and maximum mean discrepancy-as arising from score-based constructions. Building on this insight, we propose a new nonparametric score-based GoF test through a special class of IPM induced by kernelized Stein's function class, called semiparametric kernelized Stein discrepancy (SKSD) test. Compared with other nonparametric score-based tests, the SKSD test is computationally efficient and accommodates general nuisance-parameter estimators, supported by a generic parametric bootstrap procedure. The SKSD test is universally consistent and attains Pitman efficiency. Moreover, SKSD test provides simple GoF tests for models with intractable likelihoods but tractable scores with the help of Stein's identity and we use two popular models, kernel exponential family and conditional Gaussian models, to illustrate the power of our method. Our method achieves power comparable to task-specific normality tests such as Anderson-Darling and Lilliefors, despite being designed for general nonparametric alternatives.


Neural Tangent Kernel Maximum Mean Discrepancy

Neural Information Processing Systems

We present a novel neural network Maximum Mean Discrepancy (MMD) statistic by identifying a new connection between neural tangent kernel (NTK) and MMD. This connection enables us to develop a computationally efficient and memory-efficient approach to compute the MMD statistic and perform NTK based two-sample tests towards addressing the long-standing challenge of memory and computational complexity of the MMD statistic, which is essential for online implementation to assimilating new samples. Theoretically, such a connection allows us to understand the NTK test statistic properties, such as the Type-I error and testing power for performing the two-sample test, by adapting existing theories for kernel MMD. Numerical experiments on synthetic and real-world datasets validate the theory and demonstrate the effectiveness of the proposed NTK-MMD statistic.


Inference for Batched Adaptive Experiments

Kemper, Jan, Rostam-Afschar, Davud

arXiv.org Artificial Intelligence

The advantages of adaptive experiments have led to their rapid adoption in economics, other fields, as well as among practitioners. However, adaptive experiments pose challenges for causal inference. This note suggests a BOLS (batched ordinary least squares) test statistic for inference of treatment effects in adaptive experiments. The statistic provides a precision-equalizing aggregation of per-period treatment-control differences under heteroskedasticity. The combined test statistic is a normalized average of heteroskedastic per-period z-statistics and can be used to construct asymptotically valid confidence intervals. We provide simulation results comparing rejection rates in the typical case with few treatment periods and few (or many) observations per batch.


Machine Learning-based Unfolding for Cross Section Measurements in the Presence of Nuisance Parameters

Zhu, Huanbiao, Desai, Krish, Kuusela, Mikael, Mikuni, Vinicius, Nachman, Benjamin, Wasserman, Larry

arXiv.org Machine Learning

Statistically correcting measured cross sections for detector effects is an important step across many applications. In particle physics, this inverse problem is known as \textit{unfolding}. In cases with complex instruments, the distortions they introduce are often known only implicitly through simulations of the detector. Modern machine learning has enabled efficient simulation-based approaches for unfolding high-dimensional data. Among these, one of the first methods successfully deployed on experimental data is the \textsc{OmniFold} algorithm, a classifier-based Expectation-Maximization procedure. In practice, however, the forward model is only approximately specified, and the corresponding uncertainty is encoded through nuisance parameters. Building on the well-studied \textsc{OmniFold} algorithm, we show how to extend machine learning-based unfolding to incorporate nuisance parameters. Our new algorithm, called Profile \textsc{OmniFold}, is demonstrated using a Gaussian example as well as a particle physics case study using simulated data from the CMS Experiment at the Large Hadron Collider.


Empirical Decision Theory

Jansen, Christoph, Schollmeyer, Georg, Augustin, Thomas, Rodemann, Julian

arXiv.org Machine Learning

Analyzing decision problems under uncertainty commonly relies on idealizing assumptions about the describability of the world, with the most prominent examples being the closed world and the small world assumption. Most assumptions are operationalized by introducing states of the world, conditional on which the decision situation can be analyzed without any remaining uncertainty. Conversely, most classical decision-theoretic approaches are not applicable if the states of the world are inaccessible. We propose a decision model that retains the appeal and simplicity of the original theory, but completely overcomes the need to specify the states of the world explicitly. The main idea of our approach is to address decision problems in a radically empirical way: instead of specifying states and consequences prior to the decision analysis, we only assume a protocol of observed act--consequence pairs as model primitives. We show how optimality in such empirical decision problems can be addressed by using protocol-based empirical choice functions and discuss three approaches for deriving inferential guarantees: (I) consistent statistical estimation of choice sets, (II) consistent statistical testing of choice functions with robustness guarantees, and (III) direct inference for empirical choice functions using credal sets. We illustrate our theory with a proof-of-concept application comparing different prompting strategies in generative AI models.


Fusion of classical and quantum kernels enables accurate and robust two-sample tests

Terada, Yu, Ogio, Yugo, Arai, Ken, Tezuka, Hiroyuki, Tanaka, Yu

arXiv.org Artificial Intelligence

Two-sample tests have been extensively employed in various scientific fields and machine learning such as evaluation on the effectiveness of drugs and A/B testing on different marketing strategies to discriminate whether two sets of samples come from the same distribution or not. Kernel-based procedures for hypothetical testing have been proposed to efficiently disentangle high-dimensional complex structures in data to obtain accurate results in a model-free way by embedding the data into the reproducing kernel Hilbert space (RKHS). While the choice of kernels plays a crucial role for their performance, little is understood about how to choose kernel especially for small datasets. Here we aim to construct a hypothetical test which is effective even for small datasets, based on the theoretical foundation of kernel-based tests using maximum mean discrepancy, which is called MMD-FUSE. To address this, we enhance the MMD-FUSE framework by incorporating quantum kernels and propose a novel hybrid testing strategy that fuses classical and quantum kernels. This approach creates a powerful and adaptive test by combining the domain-specific inductive biases of classical kernels with the unique expressive power of quantum kernels. We evaluate our method on various synthetic and real-world clinical datasets, and our experiments reveal two key findings: 1) With appropriate hyperparameter tuning, MMD-FUSE with quantum kernels consistently improves test power over classical counterparts, especially for small and high-dimensional data. 2) The proposed hybrid framework demonstrates remarkable robustness, adapting to different data characteristics and achieving high test power across diverse scenarios. These results highlight the potential of quantum-inspired and hybrid kernel strategies to build more effective statistical tests, offering a versatile tool for data analysis where sample sizes are limited.