Luo, Yucen
Learning Counterfactually Invariant Predictors
Quinzan, Francesco, Casolo, Cecilia, Muandet, Krikamol, Luo, Yucen, Kilbertus, Niki
Invariance, or equivariance to certain data transformations, has proven essential in numerous applications of machine learning (ML), since it can lead to better generalization capabilities [Arjovsky et al., 2019, Bloem-Reddy and Teh, 2020, Chen et al., 2020]. For instance, in image recognition, predictions ought to remain unchanged under scaling, translation, or rotation of the input image. Data augmentation, an early heuristic to promote such invariances, has become indispensable for successfully training deep neural networks (DNNs) [Shorten and Khoshgoftaar, 2019, Xie et al., 2020]. Well-known examples of "invariance by design" include convolutional neural networks (CNNs) for translation invariance [Krizhevsky et al., 2012], group equivariant NNs for general group transformations [Cohen and Welling, 2016], recurrent neural networks (RNNs) and transformers for sequential data [Vaswani et al., 2017], DeepSet [Zaheer et al., 2017] for sets, and graph neural networks (GNNs) for different types of geometric structures [Battaglia et al., 2018]. Many applications in modern ML, however, call for arguably stronger notions of invariance based on causality. This case has been made for image classification, algorithmic fairness [Hardt et al., 2016, Mitchell et al., 2021], robustness [Bühlmann, 2020], and out-of-distribution generalization [Lu et al., 2021]. The goal is invariance with respect to hypothetical manipulations of the data generating process (DGP). Various works develop methods that assume observational distributions (across environments or between training and test) to be governed by shared causal mechanisms, but to differ due to various types of distribution shifts encoded by the causal model [Arjovsky et al., 2019, Bühlmann, 2020, Heinze-Deml et al., 2018, Makar et al., 2022].
Iterative Teaching by Data Hallucination
Qiu, Zeju, Liu, Weiyang, Xiao, Tim Z., Liu, Zhen, Bhatt, Umang, Luo, Yucen, Weller, Adrian, Schölkopf, Bernhard
We consider the problem of iterative machine teaching, where a teacher sequentially provides examples based on the status of a learner. Existing work assumes a discrete input space (i.e., a pool of finitely many samples), which greatly limits the teacher's capability. To address this issue, we study iterative teaching under a continuous input space, where the input example (e.g., an image) can either be generated by solving an optimization problem or be drawn directly from a continuous distribution. Specifically, we propose data hallucination teaching (DHT), in which the teacher generates input data intelligently based on the labels, the learner's status, and the target concept. We study a number of challenging teaching setups (e.g., linear/neural learners in omniscient and black-box settings). Extensive empirical results verify the effectiveness of DHT.
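As a rough illustration of the idea (not the paper's algorithm), the following Python sketch shows an omniscient teacher "hallucinating" an input for a fixed label by optimizing it so that a logistic-regression learner's one-step SGD update lands as close as possible to the target concept. The learner model, learning rates, and label schedule are all assumptions made for the example.

# Illustrative sketch only: a gradient-based teacher that synthesizes an input x
# so that the learner's single SGD step moves toward the target parameters w_star.
import torch

def hallucinate_example(w, w_star, y, lr_learner=0.1, steps=200, lr_teacher=0.05, dim=10):
    """Optimize a synthetic input x for a fixed label y in {-1, +1}."""
    x = torch.randn(dim, requires_grad=True)          # hallucinated input, optimized below
    opt = torch.optim.Adam([x], lr=lr_teacher)
    for _ in range(steps):
        opt.zero_grad()
        # Learner's logistic-loss gradient at (x, y): -y * sigmoid(-y w^T x) * x
        margin = y * (w @ x)
        grad_w = -y * torch.sigmoid(-margin) * x
        w_next = w - lr_learner * grad_w               # learner's one-step SGD update
        teacher_loss = ((w_next - w_star) ** 2).sum()  # distance to target concept
        teacher_loss.backward()
        opt.step()
    return x.detach()

# Usage: iteratively teach until the learner approaches w_star.
torch.manual_seed(0)
dim = 10
w_star = torch.randn(dim)
w = torch.zeros(dim)
for t in range(50):
    y = 1.0 if t % 2 == 0 else -1.0                    # alternate labels (a simple choice)
    x = hallucinate_example(w, w_star, y, dim=dim)
    margin = y * (w @ x)
    w = w - 0.1 * (-y * torch.sigmoid(-margin) * x)    # learner performs its SGD step
print(float(((w - w_star) ** 2).sum()))                # squared distance to the target concept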
Spectral Representation Learning for Conditional Moment Models
Wang, Ziyu, Luo, Yucen, Li, Yueru, Zhu, Jun, Schölkopf, Bernhard
Many problems in causal inference and economics can be formulated in the framework of conditional moment models, which characterize the target function through a collection of conditional moment restrictions. For nonparametric conditional moment models, efficient estimation often relies on preimposed conditions on various measures of ill-posedness of the hypothesis space, which are hard to validate when flexible models are used. In this work, we address this issue by proposing a procedure that automatically learns representations with controlled measures of ill-posedness. Our method approximates a linear representation defined by the spectral decomposition of a conditional expectation operator, which can be used for kernelized estimators and is known to facilitate minimax optimal estimation in certain settings. We show this representation can be efficiently estimated from data, and establish L2 consistency for the resulting estimator. We evaluate the proposed method on proximal causal inference tasks, exhibiting promising performance on high-dimensional, semi-synthetic data.
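The sketch below gives one crude finite-dimensional approximation of the spectral decomposition of a conditional expectation operator: canonical correlation analysis between fixed random-feature maps of X and Z, whose singular directions serve as a learned linear representation. The feature maps, ridge parameter, and synthetic data are assumptions for illustration; this is not the estimator proposed in the paper.

# Illustrative sketch only: approximate the spectrum of (T f)(z) = E[f(X) | Z = z]
# via CCA on fixed random Fourier features of X and Z.
import numpy as np

def rff(a, n_feat=128, bandwidth=1.0, seed=0):
    """Random Fourier features approximating an RBF kernel (one choice of feature map)."""
    rng = np.random.default_rng(seed)
    a = np.atleast_2d(a).T if a.ndim == 1 else a
    W = rng.normal(scale=1.0 / bandwidth, size=(a.shape[1], n_feat))
    b = rng.uniform(0, 2 * np.pi, size=n_feat)
    return np.sqrt(2.0 / n_feat) * np.cos(a @ W + b)

def spectral_representation(X, Z, k=5, ridge=1e-3):
    """Top-k singular directions of the whitened cross-covariance of phi(X), psi(Z)."""
    Phi, Psi = rff(X, seed=0), rff(Z, seed=1)
    Phi = Phi - Phi.mean(0)
    Psi = Psi - Psi.mean(0)
    n = len(Phi)
    Cxx = Phi.T @ Phi / n + ridge * np.eye(Phi.shape[1])
    Czz = Psi.T @ Psi / n + ridge * np.eye(Psi.shape[1])
    Cxz = Phi.T @ Psi / n
    # Whiten with Cholesky factors; the SVD of the whitened cross-covariance
    # approximates the operator's singular functions and values.
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx))
    Wz = np.linalg.inv(np.linalg.cholesky(Czz))
    U, s, Vt = np.linalg.svd(Wx @ Cxz @ Wz.T)
    # Evaluations of the estimated singular functions at the data points.
    return Phi @ Wx.T @ U[:, :k], Psi @ Wz.T @ Vt[:k].T, s[:k]

# Toy usage on synthetic data.
rng = np.random.default_rng(0)
Z = rng.normal(size=(2000, 1))
X = Z + 0.5 * rng.normal(size=(2000, 1))
fX, gZ, sing_vals = spectral_representation(X, Z)
print(sing_vals)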
Conditional Generative Moment-Matching Networks
Ren, Yong, Zhu, Jun, Li, Jialian, Luo, Yucen
Maximum mean discrepancy (MMD) has been successfully applied to learn deep generative models for characterizing a joint distribution of variables via kernel mean embedding. In this paper, we present conditional generative moment-matching networks (CGMMN), which learn a conditional distribution given some input variables based on a conditional maximum mean discrepancy (CMMD) criterion. The learning is performed by stochastic gradient descent with the gradient calculated by back-propagation. We evaluate CGMMN on a wide range of tasks, including predictive modeling, contextual generation, and Bayesian dark knowledge, which distills knowledge from a Bayesian model by learning a relatively small CGMMN student network. Our results demonstrate competitive performance in all the tasks.
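The sketch below computes one common empirical form of the squared CMMD based on kernel conditional embedding operators; the RBF kernels, bandwidth, and ridge parameter lam are illustrative assumptions rather than the paper's exact configuration. Because the estimate is differentiable, it could in principle serve as a training loss for a conditional generator.

# Illustrative sketch only: squared CMMD between the empirical conditional
# embeddings of Y|X estimated from data samples (X_s, Y_s) and generated
# samples (X_d, Y_d).
import torch

def rbf(a, b, gamma=1.0):
    """RBF kernel matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    return torch.exp(-gamma * torch.cdist(a, b) ** 2)

def cmmd2(X_s, Y_s, X_d, Y_d, gamma=1.0, lam=1e-3):
    n_s, n_d = len(X_s), len(X_d)
    K_s, K_d = rbf(X_s, X_s, gamma), rbf(X_d, X_d, gamma)
    L_s, L_d = rbf(Y_s, Y_s, gamma), rbf(Y_d, Y_d, gamma)
    K_sd, L_sd = rbf(X_s, X_d, gamma), rbf(Y_s, Y_d, gamma)
    C_s = torch.linalg.inv(K_s + lam * torch.eye(n_s))   # regularized inverse Gram matrices
    C_d = torch.linalg.inv(K_d + lam * torch.eye(n_d))
    term_s = torch.trace(K_s @ C_s @ L_s @ C_s)
    term_d = torch.trace(K_d @ C_d @ L_d @ C_d)
    cross = torch.trace(K_sd.T @ C_s @ L_sd @ C_d)
    return term_s + term_d - 2.0 * cross

# Toy usage: two paired samples with similar conditionals give a small CMMD.
torch.manual_seed(0)
X_s = torch.randn(64, 2)
Y_s = X_s.sum(dim=1, keepdim=True) + 0.1 * torch.randn(64, 1)
X_d = torch.randn(64, 2)
Y_d = X_d.sum(dim=1, keepdim=True) + 0.1 * torch.randn(64, 1)
print(float(cmmd2(X_s, Y_s, X_d, Y_d)))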
Semi-crowdsourced Clustering with Deep Generative Models
Luo, Yucen, Tian, Tian, Shi, Jiaxin, Zhu, Jun, Zhang, Bo
We consider the semi-supervised clustering problem where crowdsourcing provides noisy information about the pairwise comparisons on a small subset of data, i.e., whether a sample pair is in the same cluster. We propose a new approach that includes a deep generative model (DGM) to characterize low-level features of the data, and a statistical relational model for noisy pairwise annotations on its subset. The two parts share the latent variables. To let the model automatically trade off between its complexity and fitting the data, we also develop its fully Bayesian variant. The challenge of inference is addressed by fast (natural-gradient) stochastic variational inference algorithms, where we effectively combine variational message passing for the relational part and amortized learning of the DGM under a unified framework. Empirical results on synthetic and real-world datasets show that our model outperforms previous crowdsourced clustering methods.
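As a toy illustration of how noisy pairwise annotations can enter a variational objective, the sketch below computes the expected log-likelihood of "same cluster" labels under soft cluster assignments with a simple two-parameter noise model. This mean-field term and its parameters are assumptions for the example, not the paper's relational model or inference algorithm.

# Illustrative sketch only: expected log-likelihood of binary pairwise
# annotations under a two-coin noise model and mean-field responsibilities q.
import math
import torch

def pairwise_annotation_ll(q, pairs, obs, alpha=0.9, beta=0.1):
    """
    q:     (n, K) cluster responsibilities (rows sum to 1)
    pairs: (m, 2) annotated index pairs
    obs:   (m,)   observed labels, 1 = "same cluster", 0 = "different"
    alpha: P(annotator says "same" | truly same), beta: P("same" | truly different)
    """
    qi, qj = q[pairs[:, 0]], q[pairs[:, 1]]
    p_same = (qi * qj).sum(dim=1)                  # probability the pair shares a cluster under q
    la, l1a = math.log(alpha), math.log(1 - alpha)
    lb, l1b = math.log(beta), math.log(1 - beta)
    log_same = torch.where(obs == 1, torch.full_like(p_same, la), torch.full_like(p_same, l1a))
    log_diff = torch.where(obs == 1, torch.full_like(p_same, lb), torch.full_like(p_same, l1b))
    return (p_same * log_same + (1 - p_same) * log_diff).sum()

# Usage: this term would be added to the DGM's ELBO for the annotated subset.
torch.manual_seed(0)
q = torch.softmax(torch.randn(10, 3), dim=1)
pairs = torch.tensor([[0, 1], [2, 3], [4, 5]])
obs = torch.tensor([1, 0, 1])
print(float(pairwise_annotation_ll(q, pairs, obs)))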
A Simple yet Effective Baseline for Robust Deep Learning with Noisy Labels
Luo, Yucen, Zhu, Jun, Pfister, Tomas
Deep neural networks have recently been shown to memorize training data, even when labels are noisy, which hurts generalization performance. To mitigate this issue, we propose a simple but effective baseline that is robust to noisy labels, even under severe noise. Our objective involves a variance regularization term that implicitly penalizes the Jacobian norm of the neural network on the whole training set (including the noisily labeled data), which encourages generalization and prevents overfitting to the corrupted labels. Experiments on both synthetically generated incorrect labels and realistic large-scale noisy datasets demonstrate that our approach achieves state-of-the-art performance with a high tolerance to severe noise.
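A minimal sketch in the spirit of the objective described above: cross-entropy on the (possibly noisy) labels plus a penalty on the discrepancy between two stochastic forward passes of the same inputs, which discourages large prediction variance. The architecture, the exact form of the regularizer, and the weight lam are assumptions for illustration, not the paper's precise objective.

# Illustrative sketch only: noisy-label training with a prediction-variance penalty.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(),
                      nn.Dropout(0.5), nn.Linear(256, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

def training_step(x, y_noisy, lam=10.0):
    model.train()                          # keep dropout stochastic for both passes
    logits1 = model(x)                     # first stochastic pass
    logits2 = model(x)                     # second stochastic pass (different dropout mask)
    p1, p2 = F.softmax(logits1, dim=1), F.softmax(logits2, dim=1)
    ce = F.cross_entropy(logits1, y_noisy)
    var_reg = ((p1 - p2) ** 2).sum(dim=1).mean()   # penalize prediction variance
    loss = ce + lam * var_reg
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage with random data standing in for a noisy-label dataset.
x = torch.randn(32, 1, 28, 28)
y_noisy = torch.randint(0, 10, (32,))
print(training_step(x, y_noisy))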
Smooth Neighbors on Teacher Graphs for Semi-supervised Learning
Luo, Yucen, Zhu, Jun, Li, Mengxi, Ren, Yong, Zhang, Bo
The recently proposed self-ensembling methods have achieved promising results in deep semi-supervised learning; they penalize inconsistent predictions on unlabeled data under different perturbations. However, they only consider adding perturbations to each single data point, while ignoring the connections between data samples. In this paper, we propose a novel method called Smooth Neighbors on Teacher Graphs (SNTG). In SNTG, a graph is constructed based on the predictions of the teacher model, i.e., the implicit self-ensemble of models. The graph then serves as a similarity measure with respect to which the representations of "similar" neighboring points are learned to be smooth on the low-dimensional manifold. We achieve state-of-the-art results on semi-supervised learning benchmarks: the error rates are 9.89% for CIFAR-10 with 4,000 labels and 3.99% for SVHN with 500 labels. The improvements are particularly significant when fewer labels are available; for non-augmented MNIST with only 20 labels, the error rate is reduced from the previous 4.81% to 1.36%. Our method also shows robustness to noisy labels.
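A minimal sketch of an SNTG-style graph regularizer, assuming a contrastive form: the graph is built from the teacher's predictions within a mini-batch (an edge when two points receive the same predicted class), and student features are pulled together along edges and pushed apart up to a margin otherwise. The margin, feature choice, and batch-wise pairing are illustrative assumptions.

# Illustrative sketch only: batch-wise graph regularizer driven by teacher predictions.
import torch

def sntg_loss(features, teacher_logits, margin=1.0):
    """features: (B, d) student features; teacher_logits: (B, C) teacher outputs."""
    labels = teacher_logits.argmax(dim=1)                        # teacher's hard pseudo-labels
    same = (labels[:, None] == labels[None, :]).float()          # graph W_ij in {0, 1}
    dist = torch.cdist(features, features)                       # pairwise feature distances
    pull = same * dist ** 2                                      # neighbors: stay close
    push = (1 - same) * torch.clamp(margin - dist, min=0) ** 2   # non-neighbors: keep a margin
    off_diag = 1 - torch.eye(len(features))                      # ignore self-pairs
    return ((pull + push) * off_diag).mean()

# Toy usage within a mini-batch.
torch.manual_seed(0)
features = torch.randn(8, 16)
teacher_logits = torch.randn(8, 10)
print(float(sntg_loss(features, teacher_logits)))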