AITopics

2409.04313

Country:

Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Austria (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Timonen, Juho, Lähdesmäki, Harri

Scalable mixed-domain Gaussian process modeling and model reduction for longitudinal data

arXiv.org Artificial IntelligenceSep-6-2024

Gaussian process (GP) models that combine both categorical and continuous input variables have found use in longitudinal data analysis of and computer experiments. However, standard inference for these models has the typical cubic scaling, and common scalable approximation schemes for GPs cannot be applied since the covariance function is non-continuous. In this work, we derive a basis function approximation scheme for mixed-domain covariance functions, which scales linearly with respect to the number of observations and total number of basis functions. The proposed approach is naturally applicable to also Bayesian GP regression with discrete observation models. We demonstrate the scalability of the approach and compare model reduction techniques for additive GP models in a longitudinal data context. We confirm that we can approximate the exact GP model accurately in a fraction of the runtime compared to fitting the corresponding exact model. In addition, we demonstrate a scalable model reduction workflow for obtaining smaller and more interpretable models when dealing with a large number of candidate predictors.

experiment, kernel, snr, (16 more...)

2111.02019

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(12 more...)

Genre: Research Report (0.64)

Industry:

Government > Voting & Elections (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

arXiv.org Artificial IntelligenceSep-6-2024

Exploiting the Data Gap: Utilizing Non-ignorable Missingness to Manipulate Model Learning

Koyuncu, Deniz, Gittens, Alex, Yener, Bülent, Yung, Moti

Missing data is commonly encountered in practice, and when the missingness is non-ignorable, effective remediation depends on knowledge of the missingness mechanism. Learning the underlying missingness mechanism from the data is not possible in general, so adversaries can exploit this fact by maliciously engineering non-ignorable missingness mechanisms. Such Adversarial Missingness (AM) attacks have only recently been motivated and introduced, and then successfully tailored to mislead causal structure learning algorithms into hiding specific cause-and-effect relationships. However, existing AM attacks assume the modeler (victim) uses full-information maximum likelihood methods to handle the missing data, and are of limited applicability when the modeler uses different remediation strategies. In this work we focus on associational learning in the context of AM attacks. We consider (i) complete case analysis, (ii) mean imputation, and (iii) regression-based imputation as alternative strategies used by the modeler. Instead of combinatorially searching for missing entries, we propose a novel probabilistic approximation by deriving the asymptotic forms of these methods used for handling the missing entries. We then formulate the learning of the adversarial missingness mechanism as a bi-level optimization problem. Experiments on generalized linear models show that AM attacks can be used to change the p-values of features from significant to insignificant in real datasets, such as the California-housing dataset, while using relatively moderate amounts of missingness (<20%). Additionally, we assess the robustness of our attacks against defense strategies based on data valuation.

imputation, mechanism, missingness mechanism, (16 more...)

2409.04407

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New York > Rensselaer County > Troy (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.89)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
(3 more...)

Mora, Carlos, Yousefpour, Amin, Hosseinmardi, Shirin, Owhadi, Houman, Bostanabad, Ramin

Operator Learning with Gaussian Processes

arXiv.org Machine LearningSep-6-2024

Operator learning focuses on approximating mappings $\mathcal{G}^\dagger:\mathcal{U} \rightarrow\mathcal{V}$ between infinite-dimensional spaces of functions, such as $u: \Omega_u\rightarrow\mathbb{R}$ and $v: \Omega_v\rightarrow\mathbb{R}$. This makes it particularly suitable for solving parametric nonlinear partial differential equations (PDEs). While most machine learning methods for operator learning rely on variants of deep neural networks (NNs), recent studies have shown that Gaussian Processes (GPs) are also competitive while offering interpretability and theoretical guarantees. In this paper, we introduce a hybrid GP/NN-based framework for operator learning that leverages the strengths of both methods. Instead of approximating the function-valued operator $\mathcal{G}^\dagger$, we use a GP to approximate its associated real-valued bilinear form $\widetilde{\mathcal{G}}^\dagger: \mathcal{U}\times\mathcal{V}^*\rightarrow\mathbb{R}.$ This bilinear form is defined by $\widetilde{\mathcal{G}}^\dagger(u,\varphi) := [\varphi,\mathcal{G}^\dagger(u)],$ which allows us to recover the operator $\mathcal{G}^\dagger$ through $\mathcal{G}^\dagger(u)(y)=\widetilde{\mathcal{G}}^\dagger(u,\delta_y).$ The GP mean function can be zero or parameterized by a neural operator and for each setting we develop a robust training mechanism based on maximum likelihood estimation (MLE) that can optionally leverage the physics involved. Numerical benchmarks show that (1) it improves the performance of a base neural operator by using it as the mean function of a GP, and (2) it enables zero-shot data-driven models for accurate predictions without prior training. Our framework also handles multi-output operators where $\mathcal{G}^\dagger:\mathcal{U} \rightarrow\prod_{s=1}^S\mathcal{V}^s$, and benefits from computational speed-ups via product kernel structures and Kronecker product matrix representations.

gaussian process, operator learning

arXiv.org Machine Learning

2409.04538

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.53)

Dellaporta, Charita, O'Hara, Patrick, Damoulas, Theodoros

Distributionally Robust Optimisation with Bayesian Ambiguity Sets

Decision making under uncertainty is challenging since the data-generating process (DGP) is often unknown. Bayesian inference proceeds by estimating the DGP through posterior beliefs about the model's parameters. However, minimising the expected risk under these posterior beliefs can lead to sub-optimal decisions due to model uncertainty or limited, noisy observations. To address this, we introduce Distributionally Robust Optimisation with Bayesian Ambiguity Sets (DRO-BAS) which hedges against uncertainty in the model by optimising the worst-case risk over a posterior-informed ambiguity set. We show that our method admits a closed-form dual representation for many exponential family members and showcase its improved out-of-sample robustness against existing Bayesian DRO methodology in the Newsvendor problem.

ambiguity, bdro, normal-gamma distribution, (13 more...)

2409.03492

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Pérez, Aritz, Echegoyen, Carlos, Santafé, Guzmán

Risk-based Calibration for Probabilistic Classifiers

We introduce a general iterative procedure called risk-based calibration (RC) designed to minimize the empirical risk under the 0-1 loss (empirical error) for probabilistic classifiers. These classifiers are based on modeling probability distributions, including those constructed from the joint distribution (generative) and those based on the class conditional distribution (conditional). RC can be particularized to any probabilistic classifier provided a specific learning algorithm that computes the classifier's parameters in closed form using data statistics. RC reinforces the statistics aligned with the true class while penalizing those associated with other classes, guided by the 0-1 loss. The proposed method has been empirically tested on 30 datasets using na\"ive Bayes, quadratic discriminant analysis, and logistic regression classifiers. RC improves the empirical error of the original closed-form learning algorithms and, more notably, consistently outperforms the gradient descent approach with the three classifiers.

algorithm, classifier, statistics, (17 more...)

2409.03542

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Spain > Navarre > Pamplona (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

Painful intelligence: What AI can tell us about human suffering

Hyvärinen, Aapo

buddhist meditation manual visuddhimagga, pattern analysis and machine intelligence, unpredictability and uncontrollability, (15 more...)

This book uses the modern theory of artificial intelligence (AI) to understand human suffering or mental pain. Both humans and sophisticated AI agents process information about the world in order to achieve goals and obtain rewards, which is why AI can be used as a model of the human brain and mind. This book intends to make the theory accessible to a relatively general audience, requiring only some relevant scientific background. The book starts with the assumption that suffering is mainly caused by frustration. Frustration means the failure of an agent (whether AI or human) to achieve a goal or a reward it wanted or expected. Frustration is inevitable because of the overwhelming complexity of the world, limited computational resources, and scarcity of good data. In particular, such limitations imply that an agent acting in the real world must cope with uncontrollability, unpredictability, and uncertainty, which all lead to frustration. Fundamental in such modelling is the idea of learning, or adaptation to the environment. While AI uses machine learning, humans and animals adapt by a combination of evolutionary mechanisms and ordinary learning. Even frustration is fundamentally an error signal that the system uses for learning. This book explores various aspects and limitations of learning algorithms and their implications regarding suffering. At the end of the book, the computational theory is used to derive various interventions or training methods that will reduce suffering in humans. The amount of frustration is expressed by a simple equation which indicates how it can be reduced. The ensuing interventions are very similar to those proposed by Buddhist and Stoic philosophy, and include mindfulness meditation. Therefore, this book can be interpreted as an exposition of a computational theory justifying why such philosophies and meditation reduce human suffering.

2205.15409

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.13)
North America > United States > New York (0.04)
(14 more...)

Genre:

Workflow (1.00)
Summary/Review (1.00)
Research Report (1.00)
(4 more...)

Industry:

Transportation (1.00)
Media (1.00)
Leisure & Entertainment > Games (1.00)
(13 more...)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(16 more...)

arXiv.org Machine LearningSep-5-2024

Average Causal Effect Estimation in DAGs with Hidden Variables: Extensions of Back-Door and Front-Door Criteria

Guo, Anna, Nabi, Razieh

The identification theory for causal effects in directed acyclic graphs (DAGs) with hidden variables is well-developed, but methods for estimating and inferring functionals beyond the g-formula remain limited. Previous studies have proposed semiparametric estimators for identifiable functionals in a broad class of DAGs with hidden variables. While demonstrating double robustness in some models, existing estimators face challenges, particularly with density estimation and numerical integration for continuous variables, and their estimates may fall outside the parameter space of the target estimand. Their asymptotic properties are also underexplored, especially when using flexible statistical and machine learning models for nuisance estimation. This study addresses these challenges by introducing novel one-step corrected plug-in and targeted minimum loss-based estimators of causal effects for a class of DAGs that extend classical back-door and front-door criteria (known as the treatment primal fixability criterion in prior literature). These estimators leverage machine learning to minimize modeling assumptions while ensuring key statistical properties such as asymptotic linearity, double robustness, efficiency, and staying within the bounds of the target parameter space. We establish conditions for nuisance functional estimates in terms of L2(P)-norms to achieve root-n consistent causal effect estimates. To facilitate practical application, we have developed the flexCausal package in R.

estimation, estimator, loss function, (17 more...)

arXiv.org Machine Learning

2409.03962

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.45)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Brouillon, Jean-Sébastien, Dörfler, Florian, Ferrari-Trecate, Giancarlo

Maximum likelihood inference for high-dimensional problems with multiaffine variable relations

Maximum Likelihood Estimation of continuous variable models can be very challenging in high dimensions, due to potentially complex probability distributions. The existence of multiple interdependencies among variables can make it very difficult to establish convergence guarantees. This leads to a wide use of brute-force methods, such as grid searching and Monte-Carlo sampling and, when applicable, complex and problem-specific algorithms. In this paper, we consider inference problems where the variables are related by multiaffine expressions. We propose a novel Alternating and Iteratively-Reweighted Least Squares (AIRLS) algorithm, and prove its convergence for problems with Generalized Normal Distributions. We also provide an efficient method to compute the variance of the estimates obtained using AIRLS. Finally, we show how the method can be applied to graphical statistical models. We perform numerical experiments on several inference problems, showing significantly better performance than state-of-the-art approaches in terms of scalability, robustness to noise, and convergence speed due to an empirically observed super-linear convergence rate.

airl, algorithm 1, assumption 1, (15 more...)

2409.03495

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Ohio (0.04)
North America > United States > Michigan (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)

Genre: Research Report (0.84)

Industry: Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Semi-Supervised Sparse Gaussian Classification: Provable Benefits of Unlabeled Data

Azar, Eyar, Nadler, Boaz

The premise of semi-supervised learning (SSL) is that combining labeled and unlabeled data yields significantly more accurate models. Despite empirical successes, the theoretical understanding of SSL is still far from complete. In this work, we study SSL for high dimensional sparse Gaussian classification. To construct an accurate classifier a key task is feature selection, detecting the few variables that separate the two classes. % For this SSL setting, we analyze information theoretic lower bounds for accurate feature selection as well as computational lower bounds, assuming the low-degree likelihood hardness conjecture. % Our key contribution is the identification of a regime in the problem parameters (dimension, sparsity, number of labeled and unlabeled samples) where SSL is guaranteed to be advantageous for classification. Specifically, there is a regime where it is possible to construct in polynomial time an accurate SSL classifier. However, % any computationally efficient supervised or unsupervised learning schemes, that separately use only the labeled or unlabeled data would fail. Our work highlights the provable benefits of combining labeled and unlabeled data for {classification and} feature selection in high dimensions. We present simulations that complement our theoretical analysis.

classification, classifier, theorem 2, (16 more...)

2409.03335

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)