 observational model


State-observation augmented diffusion model for nonlinear assimilation

Li, Zhuoyuan, Dong, Bin, Zhang, Pingwen

arXiv.org Machine Learning

Data assimilation has become a crucial technique for combining physical models with observational data to estimate state variables. Traditional assimilation algorithms often struggle with the strong nonlinearity introduced by both the physical and observational models. In this work, we propose a novel data-driven assimilation algorithm based on generative models to address such concerns. Our State-Observation Augmented Diffusion (SOAD) model is designed to handle nonlinear physical and observational models more effectively. The marginal posterior associated with SOAD has been derived and then proved to match the real posterior under mild assumptions, which shows theoretical superiority over previous score-based assimilation works. Experimental results also indicate that our SOAD model may offer improved accuracy over existing data-driven methods.


Efficient Data-Driven Optimization with Noisy Data

Van Parys, Bart P. G.

arXiv.org Artificial Intelligence

Classical Kullback-Leibler or entropic distances are known to enjoy certain desirable statistical properties in the context of decision-making with noiseless data. However, in most practical situations the data available to a decision maker is subject to a certain amount of measurement noise. We hence study here data-driven prescription problems in which the data is corrupted by a known noise source. We derive efficient data-driven formulations in this noisy regime and indicate that they enjoy an entropic optimal transport interpretation. Finally, we show that these efficient robust formulations are tractable in several interesting settings by exploiting a classical representation result by Strassen.


Counterfactual Risk Assessments, Evaluation, and Fairness

Coston, Amanda, Chouldechova, Alexandra, Kennedy, Edward H.

arXiv.org Machine Learning

Algorithmic risk assessments are increasingly used to help humans make decisions in high-stakes settings, such as medicine, criminal justice and education. In each of these cases, the purpose of the risk assessment tool is to inform actions, such as medical treatments or release conditions, often with the aim of reducing the likelihood of an adverse event such as hospital readmission or recidivism. Problematically, most tools are trained and evaluated on historical data in which the outcomes observed depend on the historical decision-making policy. These tools thus reflect risk under the historical policy, rather than under the different decision options that the tool is intended to inform. Even when tools are constructed to predict risk under a specific decision, they are often improperly evaluated as predictors of the target outcome. Focusing on the evaluation task, in this paper we define counterfactual analogues of common predictive performance and algorithmic fairness metrics that we argue are better suited for the decision-making context. We introduce a new method for estimating the proposed metrics using doubly robust estimation. We provide theoretical results that show that only under strong conditions can fairness according to the standard metric and the counterfactual metric simultaneously hold. Consequently, fairness-promoting methods that target parity in a standard fairness metric may (and, as we show empirically, do) induce greater imbalance in the counterfactual analogue. We provide empirical comparisons on both synthetic data and a real-world child welfare dataset to demonstrate how the proposed method improves upon standard practice.
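
As a rough illustration of the doubly robust idea mentioned in the abstract above, the sketch below estimates the mean outcome that would be observed under one fixed decision by combining an outcome model with an inverse-propensity correction. This is the generic textbook (augmented inverse-propensity) form, not the paper's estimator for its counterfactual metrics; the function name `doubly_robust_mean` and the arguments `propensity` and `outcome_model` are hypothetical and chosen only for this example.

```python
from typing import Callable, Sequence


def doubly_robust_mean(
    outcomes: Sequence[float],
    decisions: Sequence[int],
    covariates: Sequence[float],
    propensity: Callable[[float], float],
    outcome_model: Callable[[float], float],
    target: int = 0,
) -> float:
    """Generic doubly robust estimate of the mean outcome under `target`.

    propensity(x)    -- assumed estimate of P(decision == target | x)
    outcome_model(x) -- assumed estimate of E[outcome | x, decision == target]
    """
    total = 0.0
    for y, a, x in zip(outcomes, decisions, covariates):
        mu = outcome_model(x)
        pi = propensity(x)
        if pi <= 0.0:
            raise ValueError("propensity must be positive for every case")
        # Model prediction plus a reweighted residual, which is nonzero
        # only for cases that actually received the target decision.
        indicator = 1.0 if a == target else 0.0
        total += mu + indicator / pi * (y - mu)
    return total / len(outcomes)


# Toy data: every case that received decision 0 had outcome 1.
outcomes = [1.0, 0.0, 1.0, 0.0]
decisions = [0, 1, 0, 1]
covariates = [0.0, 0.0, 1.0, 1.0]

# Correct propensity (0.5) paired with a deliberately poor outcome model.
estimate = doubly_robust_mean(
    outcomes, decisions, covariates,
    propensity=lambda x: 0.5,
    outcome_model=lambda x: 0.5,
)
```

In this toy setup the outcome model is wrong but the propensity is right, and the residual correction still recovers the mean outcome among cases that received decision 0. That is the sense in which the estimate is robust to one of the two nuisance models being misspecified.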


An Event-Based Framework for Process Inference

Joya, Michael (Department of Computing Science, University of Alberta)

AAAI Conferences

We focus on a class of models used for representing the dynamics between a discrete set of probabilistic events in a continuous-time setting. The proposed framework offers tractable learning and inference procedures and provides compact state representations for processes which exhibit variable delays between events. The approach is applied to a heart sound labeling task that exhibits long-range dependencies on previous events, and in which explicit modeling of the rhythm timings is justifiable by cardiological principles.