AITopics | i-projection

Collaborating Authors

i-projection

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

FRESH: Information-Geometric Calibration of Patient-Level Models to Aggregate Evidence

Fuller, Franklin, Bertolini, Daniele, Liang, Samantha, Christopher, Jason, Smith, Aaron M.

arXiv.org Machine LearningMay-18-2026

Many decision in clinical science and epidemiology -- estimating probability of technical success for a clinical trial, assessing comparative effectiveness of two therapies, imputing a placebo effect onto natural history data -- rely on combining sources of information about a clinical cohort that comes from different kinds of studies. Specifically we contrast patient-level sources that provide granular pictures of individual disease course (clinical trial, registries, or electronic health records) with aggregate sources such as published clinical trial results and the TFLs (tables figures and listings). One strategy for combining aggregate with patient-level data sources is to bring each into a common format for a unified analysis. If one wants to maintain the analytic flexibility of patient-level data, then a natural solution is to convert the aggregate data information into a simulated patient-level dataset that recapitulate those aggregate statistics. This is an under-determined inverse problem in that there are many such datasets, and it cannot be well specified without further constraints. FRESH(Fusion of Recent Evidence with Subject Histories) provides a well-defined method for solving this problem, and therefore providing maximal analytic flexibility.

artificial intelligence, constraint, machine learning, (18 more...)

arXiv.org Machine Learning

2605.16246

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

f86890095c957e9b949d11d15f0d0cd5-Supplemental.pdf

Neural Information Processing SystemsFeb-12-2026, 00:06:26 GMT

denote, distribution shift, information, (13 more...)

Neural Information Processing Systems

Country:

North America (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany (0.04)

Genre: Research Report (0.47)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.68)

Add feedback

f86890095c957e9b949d11d15f0d0cd5-Paper.pdf

Neural Information Processing SystemsFeb-12-2026, 00:06:22 GMT

distribution shift, estimator, information, (13 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany (0.04)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.68)

Add feedback

f86890095c957e9b949d11d15f0d0cd5-Supplemental.pdf

Neural Information Processing SystemsAug-19-2025, 00:03:04 GMT

data mining, distribution shift, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany (0.04)

Genre: Research Report (0.47)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.68)

Add feedback

Robust Generalization despite Distribution Shift via Minimum Discriminating Information

Neural Information Processing SystemsAug-19-2025, 00:03:00 GMT

Consider building a model for diagnosing a specific heart disease, and suppose that most participants of the study are middle to high-aged men.

data mining, distribution shift, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany (0.04)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.68)

Add feedback

Discrete Copula Diffusion

Liu, Anji, Broadrick, Oliver, Niepert, Mathias, Broeck, Guy Van den

arXiv.org Artificial IntelligenceOct-2-2024

Discrete diffusion models have recently shown significant progress in modeling complex data, such as natural languages and DNA sequences. However, unlike diffusion models for continuous data, which can generate high-quality samples in just a few denoising steps, modern discrete diffusion models still require hundreds or even thousands of denoising steps to perform well. In this paper, we identify a fundamental limitation that prevents discrete diffusion models from achieving strong performance with fewer steps -- they fail to capture dependencies between output variables at each denoising step. To address this issue, we provide a formal explanation and introduce a general approach to supplement the missing dependency information by incorporating another deep generative model, termed the copula model. Our method does not require fine-tuning either the diffusion model or the copula model, yet it enables high-quality sample generation with significantly fewer denoising steps. When we apply this approach to autoregressive copula models, the combined model outperforms both models individually in unconditional and conditional text generation. Specifically, the hybrid model achieves better (un)conditional text generation using 8 to 32 times fewer denoising steps than the diffusion model alone. In addition to presenting an effective discrete diffusion generation algorithm, this paper emphasizes the importance of modeling inter-variable dependencies in discrete diffusion.

diffusion model, discrete diffusion model, univariate marginal, (12 more...)

arXiv.org Artificial Intelligence

2410.01949

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Switzerland (0.04)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
(8 more...)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment > Sports (1.00)
Government (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

Probabilistic Control and Majorization of Optimal Control

Lefebvre, Tom

arXiv.org Artificial IntelligenceNov-15-2023

Probabilistic control design is founded on the principle that a rational agent attempts to match modelled with an arbitrary desired closed-loop system trajectory density. The framework was originally proposed as a tractable alternative to traditional optimal control design, parametrizing desired behaviour through fictitious transition and policy densities and using the information projection as a proximity measure. In this work we introduce an alternative parametrization of desired closed-loop behaviour and explore alternative proximity measures between densities. It is then illustrated how the associated probabilistic control problems solve into uncertain or probabilistic policies. Our main result is to show that the probabilistic control objectives majorize conventional, stochastic and risk sensitive, optimal control objectives. This observation allows us to identify two probabilistic fixed point iterations that converge to the deterministic optimal control policies establishing an explicit connection between either formulations. Further we demonstrate that the risk sensitive optimal control formulation is also technically equivalent to a Maximum Likelihood estimation problem on a probabilistic graph model where the notion of costs is directly encoded into the model. The associated treatment of the estimation problem is then shown to coincide with the moment projected probabilistic control formulation. That way optimal decision making can be reformulated as an iterative inference problem. Based on these insights we discuss directions for algorithmic development.

formulation, objective, optimal control, (17 more...)

arXiv.org Artificial Intelligence

2205.03279

Country:

Europe > Belgium > Flanders > East Flanders > Ghent (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Industry: Energy (0.55)

Technology:

Information Technology > Control Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Expected Information Maximization: Using the I-Projection for Mixture Density Estimation

Becker, Philipp, Arenz, Oleg, Neumann, Gerhard

arXiv.org Machine LearningJan-23-2020

Modelling highly multi-modal data is a challenging problem in machine learning. Most algorithms are based on maximizing the likelihood, which corresponds to the M(oment)-projection of the data distribution to the model distribution. The M-projection forces the model to average over modes it cannot represent. In contrast, the I(information)-projection ignores such modes in the data and concentrates on the modes the model can represent. Such behavior is appealing whenever we deal with highly multi-modal data where modelling single modes correctly is more important than covering all the modes. Despite this advantage, the I-projection is rarely used in practice due to the lack of algorithms that can efficiently optimize it based on data. In this work, we present a new algorithm called Expected Information Maximization (EIM) for computing the I-projection solely based on samples for general latent variable models, where we focus on Gaussian mixtures models and Gaussian mixtures of experts. Our approach applies a variational upper bound to the I-projection objective which decomposes the original objective into single objectives for each mixture component as well as for the coefficients, allowing an efficient optimization. Similar to GANs, our approach employs discriminators but uses a more stable optimization procedure, using a tight upper bound. We show that our algorithm is much more effective in computing the I-projection than recent GAN approaches and we illustrate the effectiveness of our approach for modelling multi-modal behavior on two pedestrian and traffic prediction datasets.

eim, i-projection, objective, (12 more...)

arXiv.org Machine Learning

2001.08682

Country:

Asia > Middle East > Jordan (0.04)
Europe > Russia (0.04)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Asia > Russia (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.64)

Add feedback

A geometric characterisation of sensitivity analysis in monomial models

Leonelli, Manuele, Riccomagno, Eva

arXiv.org Artificial IntelligenceDec-18-2018

Sensitivity analysis in probabilistic discrete graphical models is usually conducted by varying one probability value at a time and observing how this affects output probabilities of interest. When one probability is varied then others are proportionally covaried to respect the sum-to-one condition of probability laws. The choice of proportional covariation is justified by a variety of optimality conditions, under which the original and the varied distributions are as close as possible under different measures of closeness. For variations of more than one parameter at a time proportional covariation is justified in some special cases only. In this work, for the large class of discrete statistical models entertaining a regular monomial parametrisation, we demonstrate the optimality of newly defined proportional multi-way schemes with respect to an optimality criterion based on the notion of I-divergence. We demonstrate that there are varying parameters choices for which proportional covariation is not optimal and identify the sub-family of model distributions where the distance between the original distribution and the one where probabilities are covaried proportionally is minimum. This is shown by adopting a new formal, geometric characterization of sensitivity analysis in monomial models, which include a wide array of probabilistic graphical models. We also demonstrate the optimality of proportional covariation for multi-way analyses in Naive Bayes classifiers.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1901.02058

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Incorporating Type II Error Probabilities from Independence Tests into Score-Based Learning of Bayesian Network Structure

Brenner, Eliot, Sontag, David

arXiv.org Machine LearningMay-12-2015

We give a new consistent scoring function for structure learning of Bayesian networks. In contrast to traditional approaches to score-based structure learning, such as BDeu or MDL, the complexity penalty that we propose is data-dependent and is given by the probability that a conditional independence test correctly shows that an edge cannot exist. What really distinguishes this new scoring function from earlier work is that it has the property of becoming computationally easier to maximize as the amount of data increases. We prove a polynomial sample complexity result, showing that maximizing this score is guaranteed to correctly learn a structure with no false edges and a distribution close to the generating distribution, whenever there exists a Bayesian network which is a perfect map for the data generating distribution. Although the new score can be used with any search algorithm, in our related UAI 2013 paper [BS13], we have given empirical results showing that it is particularly effective when used together with a linear programming relaxation approach to Bayesian network structure learning. The present paper contains all details of the proofs of the finite-sample complexity results in [BS13] as well as detailed explanation of the computation of the certain error probabilities called beta-values, whose precomputation and tabulation is necessary for the implementation of the algorithm in [BS13].

algorithm, bayesian network, probability, (13 more...)

arXiv.org Machine Learning

1505.0287

Country:

North America > United States > Oregon > Benton County > Corvallis (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback