Imitation Learning from Imperfection: Theoretical Justifications and Algorithms
Imitation learning (IL) algorithms excel in acquiring high-quality policies from expert data for sequential decision-making tasks. However, their effectiveness is hampered when expert data is limited. To tackle this challenge, a novel framework called (offline) IL with supplementary data has been proposed, which enhances learning by incorporating an additional yet imperfect dataset obtained inexpensively from sub-optimal policies. Nonetheless, learning becomes challenging due to the potential inclusion of out-of-expert-distribution samples. In this work, we propose a mathematical formalization of this framework, uncovering its limitations.
Robust Optimization for Fairness with Noisy Protected Groups
Many existing fairness criteria for machine learning involve equalizing some metric across protected groups such as race or gender. However, practitioners trying to audit or enforce such group-based criteria can easily face the problem of noisy or biased protected group information. First, we study the consequences of naively relying on noisy protected group labels: we provide an upper bound on the fairness violations on the true groups $G$ when the fairness criteria are satisfied on noisy groups $\hat{G}$. Second, we introduce two new approaches using robust optimization that, unlike the naive approach of only relying on $\hat{G}$, are guaranteed to satisfy fairness criteria on the true protected groups $G$ while minimizing a training objective. We provide theoretical guarantees that one such approach converges to an optimal feasible solution. Using two case studies, we show empirically that the robust approaches achieve better true group fairness guarantees than the naive approach.
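A toy simulation (not the paper's construction; the 20% symmetric flip rate, equal group priors, and per-group positive rates are all assumed) illustrates why auditing fairness on noisy labels is risky: the demographic-parity gap measured on $\hat{G}$ understates the gap on $G$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

G = rng.integers(0, 2, n)                  # true protected group (0/1)
flip = rng.random(n) < 0.2                 # assumed 20% symmetric label noise
G_hat = np.where(flip, 1 - G, G)           # noisy group labels

# A classifier whose positive-prediction rate depends on the true group.
yhat = (rng.random(n) < np.where(G == 1, 0.7, 0.5)).astype(int)

def dp_gap(groups, preds):
    """Demographic-parity gap |P(yhat=1 | g=0) - P(yhat=1 | g=1)|."""
    return abs(preds[groups == 0].mean() - preds[groups == 1].mean())

print(f"gap measured on true groups G:      {dp_gap(G, yhat):.3f}")
print(f"gap measured on noisy groups G_hat: {dp_gap(G_hat, yhat):.3f}")
```

In this symmetric toy setup the measured gap shrinks by a factor of (1 - 2*gamma), so a criterion that looks satisfied on $\hat{G}$ can hide a larger violation on $G$, which is what motivates the robust approaches.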
Reviews: Conditional Generative Moment-Matching Networks
The naive approach to extending GMMNs to the conditional setting is to estimate a GMMN for each conditional distribution, with all conditional distributions sharing parameters through the same neural network. The problem with this approach is that each conditional distribution has only very few examples; when the conditioning variables have a continuous domain, each conditional distribution may have only a single example, causing a data sparsity problem. The proposed approach instead treats all the conditional distributions as a family and matches the model to the conditional embedding operator directly, rather than matching each conditional distribution individually. The advantage of the proposed approach seems clear, but in some cases I can still see the naive approach doing a reasonable job, for example in conditional generation where the conditioning variable takes one of 10 values, as in MNIST. It would be interesting to compare against such a naive approach as a baseline.
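For concreteness, the moment-matching criterion that GMMNs minimize is a kernel MMD between samples; the sketch below is a generic illustration (RBF bandwidth and data are arbitrary, and it shows the unconditional case, not the conditional embedding operator the paper proposes).

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    """Biased (V-statistic) estimate of squared MMD under an RBF kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
same = rbf_mmd2(rng.normal(0, 1, (500, 2)), rng.normal(0, 1, (500, 2)))
diff = rbf_mmd2(rng.normal(0, 1, (500, 2)), rng.normal(2, 1, (500, 2)))
# Same-distribution MMD^2 is near zero; the mean-shifted pair is clearly larger.
print(same, diff)
```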
Fault Identification Enhancement with Reinforcement Learning (FIERL)
Zaccaria, Valentina, Sartor, Davide, Del Favero, Simone, Susto, Gian Antonio
This letter presents a novel approach in the field of Active Fault Detection (AFD) by explicitly separating the task into two parts: Passive Fault Detection (PFD) and control input design. This formulation is very general, and most existing AFD literature can be viewed through this lens. By recognizing this separation, PFD methods can be leveraged to provide components that make efficient use of the available information, while the control input is designed to optimize the gathering of information. The core contribution of this work is FIERL, a general simulation-based approach for the design of such control strategies, using Constrained Reinforcement Learning (CRL) to optimize the performance of arbitrary passive detectors. The control policy is learned without needing to know the passive detector's inner workings, making FIERL broadly applicable. However, it is especially useful when paired with the design of an efficient passive component. Unlike most AFD approaches, FIERL can handle fairly complex scenarios such as continuous sets of fault modes. The effectiveness of FIERL is tested on a benchmark problem for actuator fault diagnosis, where FIERL is shown to be fairly robust, being able to generalize to fault dynamics not seen in training.
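The separation the letter describes can be sketched with a deliberately simple stand-in: a residual-threshold passive detector for a toy actuator-gain fault, plus a search over a constrained excitation input. FIERL learns that input policy with CRL; here brute force stands in for it, and the plant model, noise level, threshold, and budget are all assumptions, not the paper's benchmark.

```python
import numpy as np

rng = np.random.default_rng(0)

def plant(u, fault, noise):
    gain = 0.7 if fault else 1.0          # assumed 30% gain loss under fault
    return gain * u + noise

def passive_detector(u, y, thresh=0.25):
    """The 'PFD' component: a residual test against the nominal model y = u."""
    return abs(y - u) > thresh

def detection_rate(u, fault, trials=5000):
    noise = rng.normal(0, 0.1, trials)
    return np.mean([passive_detector(u, plant(u, fault, nz)) for nz in noise])

# The 'active' component: pick the excitation input under an effort constraint.
budget = 1.5                               # assumed bound on input amplitude
candidates = [u for u in np.linspace(0, 3, 13) if u <= budget]
best_u = max(candidates, key=lambda u: detection_rate(u, fault=True))
print(best_u)                              # largest feasible excitation wins here
```

The point of the sketch is only the interface: the input designer treats the passive detector as a black box and optimizes its detection rate subject to a constraint, which is the role CRL plays in FIERL.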
It is assumed that all the training outputs are observed at all inputs, which leads to a Kronecker product covariance. Noisy observations are modeled via a structured process and this is the main contribution of the paper. While previous work on multi-task GP approaches with Kronecker covariances has considered iid noise in order to carry out efficient computations, this paper shows that it is possible to consider a noise process with Kronecker structure, while maintaining efficient computations. In other words, as in the iid noise case, one never has to compute a Kronecker product and hence computations are O(N^3 + T^3) instead of O(N^3 T^3). This is achieved by whitening the noise process and projecting the (noiseless) covariance of the system into the eigenbasis of the noise covariance (scaled by the eigenvalues). Their experiments show that the proposed structured-noise multi-task GP approach outperforms the baseline iid-noise multi-task GP method and independent GPs on synthetic data and real applications.
Scaling Team Coordination on Graphs with Reinforcement Learning
Limbu, Manshi, Hu, Zechen, Wang, Xuan, Shishika, Daigo, Xiao, Xuesu
This paper studies Reinforcement Learning (RL) techniques to enable team coordination behaviors in graph environments with support actions among teammates to reduce the costs of traversing certain risky edges in a centralized manner. While classical approaches can solve this non-standard multi-agent path planning problem by converting the original Environment Graph (EG) into a Joint State Graph (JSG) to implicitly incorporate the support actions, those methods do not scale well to large graphs and teams. To address this curse of dimensionality, we propose to use RL to enable agents to learn such graph traversal and teammate supporting behaviors in a data-driven manner. Specifically, by formulating the team coordination on graphs with risky edges problem as a Markov Decision Process (MDP) with a novel state and action space, we investigate how RL can solve it in two paradigms: First, we use RL for a team of agents to learn how to coordinate and reach the goal with minimal cost on a single EG. We show that RL efficiently solves problems with up to 20/4 or 25/3 nodes/agents, using a fraction of the time needed for JSG to solve such complex problems; Second, we learn a general RL policy for any $N$-node EG to produce efficient supporting behaviors. We present extensive experiments and compare our RL approaches against their classical counterparts.
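The JSG baseline the paper scales past can be sketched for two agents: joint states are position pairs, one agent moves per step, and a risky edge loses its surcharge when the teammate occupies its support node. The graph, costs, surcharge, and support rule below are invented for illustration; they are not the paper's benchmark instances.

```python
import heapq

# Toy environment graph. Node 'S' supports the risky edge ('a', 'G'):
# if the other agent sits at 'S', the risk surcharge is waived.
edges = {('s', 'a'): 1, ('a', 'G'): 1, ('s', 'S'): 1, ('S', 'G'): 4}
risky = {('a', 'G')}
supports = {('a', 'G'): 'S'}
RISK = 10                                  # surcharge for unsupported traversal

graph = {}
for (u, v), c in edges.items():            # undirected adjacency
    graph.setdefault(u, []).append((v, (u, v), c))
    graph.setdefault(v, []).append((u, (u, v), c))

def step_cost(edge, cost, other_pos):
    if edge in risky and other_pos != supports[edge]:
        return cost + RISK
    return cost

def joint_plan(start, goal):
    """Dijkstra on the joint state graph (built implicitly): states are
    (pos1, pos2), both agents start at `start`, one moves per step."""
    pq = [(0, (start, start))]
    dist = {(start, start): 0}
    while pq:
        d, (p1, p2) = heapq.heappop(pq)
        if p1 == goal and p2 == goal:
            return d
        if d > dist[(p1, p2)]:
            continue
        for mover in (0, 1):
            here, other = (p1, p2) if mover == 0 else (p2, p1)
            for nxt, edge, c in graph.get(here, []):
                nd = d + step_cost(edge, c, other)
                state = (nxt, other) if mover == 0 else (other, nxt)
                if nd < dist.get(state, float('inf')):
                    dist[state] = nd
                    heapq.heappush(pq, (nd, state))
    return float('inf')

print(joint_plan('s', 'G'))
```

Even on this 4-node graph the optimal plan is the coordinated one (one agent waits at the support node while the other crosses the risky edge), and the joint state space already has |V|^2 states per team of two, which is the blow-up that motivates learning a policy instead.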