dcem
From Biased Selective Labels to Pseudo-Labels: An Expectation-Maximization Framework for Learning from Biased Decisions
Selective labels occur when label observations are subject to a decision-making process; e.g., diagnoses that depend on the administration of laboratory tests. We study a clinically inspired selective label problem called disparate censorship, where labeling biases vary across subgroups and unlabeled individuals are imputed as "negative" (i.e., no diagnostic test = no illness). Machine learning models naively trained on such labels could amplify labeling bias. Inspired by causal models of selective labels, we propose Disparate Censorship Expectation-Maximization (DCEM), an algorithm for learning in the presence of disparate censorship. We theoretically analyze how DCEM mitigates the effects of disparate censorship on model performance. We validate DCEM on synthetic data, showing that it improves bias mitigation (area between ROC curves) without sacrificing discriminative performance (AUC) compared to baselines. We achieve similar results in a sepsis classification task using clinical data.
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Austria > Vienna (0.14)
- (6 more...)
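The abstract above describes an expectation-maximization loop over pseudo-labels for censored (untested) individuals. The sketch below is not the authors' DCEM algorithm; it is a generic illustration of the idea under assumed names and a simulated biased testing process: observed labels are kept fixed, untested points receive the model's posterior as a soft pseudo-label (E-step), and the classifier is refit on those soft targets (M-step).

```python
# Hypothetical EM-style pseudo-labeling sketch for selective labels.
# All modeling choices here are illustrative assumptions, not the paper's method.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data: feature x0 drives true risk; feature x1 drives who gets tested,
# so "untested imputed as negative" labels are biased with respect to x1.
n = 2000
x = rng.normal(size=(n, 2))
p_true = 1 / (1 + np.exp(-(1.5 * x[:, 0] - 0.5)))
y_true = rng.binomial(1, p_true)
tested = rng.binomial(1, 1 / (1 + np.exp(-2 * x[:, 1])))
y_obs = np.where(tested == 1, y_true, 0)  # no test = no illness

q = y_obs.astype(float)  # initialize pseudo-labels from the biased labels
for _ in range(20):
    # M-step: fit to soft targets by counting each point once as a positive
    # (weight q) and once as a negative (weight 1 - q).
    X2 = np.vstack([x, x])
    y2 = np.concatenate([np.ones(n), np.zeros(n)])
    w2 = np.concatenate([q, 1.0 - q])
    clf = LogisticRegression().fit(X2, y2, sample_weight=w2)
    # E-step: keep observed labels where tested; otherwise use the model's
    # posterior as a soft pseudo-label.
    q = np.where(tested == 1, y_obs, clf.predict_proba(x)[:, 1])

print("mean pseudo-label among untested:", round(q[tested == 0].mean(), 3))
```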
A deep complementary energy method for solid mechanics using minimum complementary energy principle
Wang, Yizheng, Sun, Jia, Rabczuk, Timon, Liu, Yinghua
In recent years, the rapid advancement of deep learning has significantly impacted various fields, particularly the solution of partial differential equations (PDEs) in solid mechanics, benefiting greatly from the remarkable approximation capabilities of neural networks. For solving PDEs, Physics-Informed Neural Networks (PINNs) and the Deep Energy Method (DEM) have garnered substantial attention. The principles of minimum potential energy and minimum complementary energy are two important variational principles in solid mechanics. However, the well-known DEM is based on the principle of minimum potential energy; a counterpart based on the principle of minimum complementary energy has been lacking. To bridge this gap, we propose the deep complementary energy method (DCEM), based on the principle of minimum complementary energy. The output function of DCEM is the stress function, which inherently satisfies the equilibrium equation. We present numerical results using the Prandtl and Airy stress functions, and compare DCEM with existing PINN and DEM algorithms on representative mechanical problems. The results demonstrate that DCEM outperforms DEM in stress accuracy and efficiency and has an advantage in handling complex displacement boundary conditions, which is supported by theoretical analyses and numerical simulations. We extend DCEM to DCEM-Plus (DCEM-P) by adding terms that satisfy the partial differential equations. Furthermore, we propose a deep complementary energy operator method (DCEM-O) by combining operator learning with the physical equations: we first train DCEM-O on high-fidelity numerical results and then incorporate the complementary energy. DCEM-P and DCEM-O further enhance the accuracy and efficiency of DCEM.
- North America > United States (0.27)
- Asia > China (0.14)
- Europe > Germany (0.14)
- Health & Medicine (0.92)
- Energy > Oil & Gas > Upstream (0.67)
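To make the minimum-complementary-energy idea concrete, here is a minimal sketch in the spirit of the abstract above, assuming the classical Prandtl torsion problem on a unit square: a network outputs the stress function phi, the boundary condition phi = 0 is hard-enforced, and a discretized complementary-energy functional is minimized by gradient descent. The architecture, constants, and functional form are illustrative assumptions, not the paper's configuration.

```python
# Hedged DEM/DCEM-style sketch for Prandtl torsion (assumed problem setup).
import torch

torch.manual_seed(0)
G, theta = 1.0, 1.0  # shear modulus and twist rate (assumed values)

net = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)

def phi(xy):
    # Hard-enforce phi = 0 on the boundary of the unit square, so only the
    # energy (not a boundary penalty) needs to be minimized.
    x, y = xy[:, :1], xy[:, 1:]
    return x * (1 - x) * y * (1 - y) * net(xy)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):
    xy = torch.rand(1024, 2, requires_grad=True)  # Monte Carlo quadrature
    p = phi(xy)
    g = torch.autograd.grad(p.sum(), xy, create_graph=True)[0]
    # Complementary energy density for torsion: (1/2G)|grad phi|^2 - 2*theta*phi;
    # its minimizer solves laplacian(phi) = -2*G*theta.
    energy = ((0.5 / G) * (g ** 2).sum(dim=1) - 2.0 * theta * p.squeeze()).mean()
    opt.zero_grad()
    energy.backward()
    opt.step()

# Stresses derived from the stress function satisfy equilibrium identically:
# sigma_xz = d(phi)/dy, sigma_yz = -d(phi)/dx.
```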
The Differentiable Cross-Entropy Method
Amos, Brandon (Facebook AI Research), Yarats, Denis (Facebook AI Research, New York University)
We study the Cross-Entropy Method (CEM) for the non-convex optimization of a continuous and parameterized objective function and introduce a differentiable variant (DCEM) that enables us to differentiate the output of CEM with respect to the objective function's parameters. In the machine learning setting this brings CEM inside of the end-to-end learning pipeline where this has otherwise been impossible. We show applications in a synthetic energy-based structured prediction task and in non-convex continuous control. In this paper we focus on the setting of optimizing an unconstrained, non-convex, and continuous objective function $f_\theta(x): \mathbb{R}^n \times \Theta \to \mathbb{R}$ as $\hat{x} \in \arg\min_x f_\theta(x)$, where $f$ is parameterized by $\theta \in \Theta$ and has inputs $x \in \mathbb{R}^n$. If it exists, some (sub-)derivative $\nabla_\theta \hat{x}$ is useful in the machine learning setting to make the output of the optimization procedure end-to-end learnable. For example, $\theta$ could parameterize a predictive model that is generating potential outcomes conditional on $x$ happening that you want to optimize over. End-to-end learning in these settings can be done by defining a loss function $\mathcal{L}$ on top of $\hat{x}$ and taking gradient steps $\nabla_\theta \mathcal{L}$. If $f_\theta$ were convex this gradient is easy to analyze and compute when it exists and is unique (Gould et al., 2016; Johnson et al., 2016; Amos et al., 2017; Amos & Kolter, 2017). Unfortunately, analyzing and computing a "derivative" through the non-convex $\arg\min$ here is not as easy and is challenging in theory and practice. No such derivative may exist in theory, it might not be unique, and even if it uniquely exists, the numerical solver being used to compute the solution may not find a global or even local optimum of $f$. One promising direction to sidestep these issues is to approximate the $\arg\min$ operation with an explicit optimization procedure that is interpreted as just another compute graph and unrolled through.
- North America > United States (0.04)
- North America > Canada > British Columbia (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > Middle East > Jordan (0.04)
- Instructional Material > Course Syllabus & Notes (0.46)
- Research Report > New Finding (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
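A minimal unrolled sketch of the mechanism described above: run CEM's sampling loop inside the autograd graph and replace the hard elite selection with a soft, differentiable weighting, so that $\nabla_\theta \hat{x}$ can be taken through the unroll. The paper's DCEM uses a Limited Multi-Label projection for its soft top-k; the temperature-scaled softmax below is a simpler stand-in, so this illustrates the idea rather than the paper's exact method.

```python
# Hedged sketch of differentiable CEM via unrolling with soft elite weights.
import torch

def soft_cem(f, dim, iters=10, n_samples=100, temp=1.0):
    """Unrolled CEM over R^dim; the output stays on f's autograd graph."""
    mu = torch.zeros(dim)
    sigma = torch.ones(dim)
    for _ in range(iters):
        eps = torch.randn(n_samples, dim)
        xs = mu + sigma * eps                    # reparameterized samples
        w = torch.softmax(-f(xs) / temp, dim=0)  # soft "top-k" weights
        mu = (w.unsqueeze(1) * xs).sum(dim=0)
        sigma = ((w.unsqueeze(1) * (xs - mu) ** 2).sum(dim=0) + 1e-6).sqrt()
    return mu

# f is parameterized by theta; gradients flow through the unrolled solver.
theta = torch.tensor([2.0, -1.0], requires_grad=True)
f = lambda x: ((x - theta) ** 2).sum(dim=1)
x_hat = soft_cem(f, dim=2)
x_hat.sum().backward()
print(theta.grad)  # finite: d x_hat / d theta exists through the unroll
```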