Goto

Collaborating Authors

 controlling



Estimating and Controlling for Equalized Odds via Sensitive Attribute Predictors

Neural Information Processing Systems

As the use of machine learning models in real world high-stakes decision settings continues to grow, it is highly important that we are able to audit and control for any potential fairness violations these models may exhibit towards certain groups. To do so, one naturally requires access to sensitive attributes, such as demographics, biological sex, or other potentially sensitive features that determine group membership. Unfortunately, in many settings, this information is often unavailable. In this work we study the well known equalized odds (EOD) definition of fairness. In a setting without sensitive attributes, we first provide tight and computable upper bounds for the EOD violation of a predictor. These bounds precisely reflect the worst possible EOD violation. Second, we demonstrate how one can provably control the worst-case EOD by a new post-processing correction method. Our results characterize when directly controlling for EOD with respect to the predicted sensitive attributes is -- and when is not -- optimal when it comes to controlling worst-case EOD. Our results hold under assumptions that are milder than previous works, and we illustrate these results with experiments on synthetic and real datasets.


Controlling the image generation process with parametric activation functions

Pavlov, Ilia

arXiv.org Artificial Intelligence

As image generative models continue to increase not only in their fidelity but also in their ubiquity the development of tools that leverage direct interaction with their internal mechanisms in an interpretable way has received little attention In this work we introduce a system that allows users to develop a better understanding of the model through interaction and experimentation By giving users the ability to replace activation functions of a generative network with parametric ones and a way to set the parameters of these functions we introduce an alternative approach to control the networks output We demonstrate the use of our method on StyleGAN2 and BigGAN networks trained on FFHQ and ImageNet respectively.



Safe and Efficient In-Context Learning via Risk Control

Wynn, Andrea, Jazbec, Metod, Peris, Charith, Khaziev, Rinat, Liu, Anqi, Khashabi, Daniel, Nalisnick, Eric

arXiv.org Artificial Intelligence

Large language models (LLMs) demonstrate a remarkable ability to learn new tasks from a few in-context examples. However, this flexibility introduces safety concerns: LLMs can be influenced by incorrect or malicious demonstrations -- for example, if an adversary tampers with or injects harmful examples without a human supervisor noticing. This motivates principled designs in which the system itself includes built-in mechanisms to guard against such attacks. We propose a novel approach to limit the degree to which harmful demonstrations can degrade model performance. First, we define a baseline ``safe'' behavior for the model -- the model's performance given no in-context demonstrations (zero-shot). Next, we apply distribution-free risk control (DFRC) to control the extent to which in-context samples can decay performance below zero-shot. We achieve this by leveraging dynamic early exit prediction, ignoring later attention heads that attend the most to the unsafe inputs. Finally, we propose modifications to DFRC that allow it to both control risk for harmful inputs \textit{and} leverage performance and efficiency gains on helpful inputs. We present both theoretical and empirical results showing that our approach can effectively control risk for harmful in-context demonstrations while simultaneously achieving substantial computational efficiency gains with helpful demonstrations.


InterControl: Zero-shot Human Interaction Generation by Controlling Every Joint

Neural Information Processing Systems

Text-conditioned motion synthesis has made remarkable progress with the emergence of diffusion models. However, the majority of these motion diffusion models are primarily designed for a single character and overlook multi-human interactions. In our approach, we strive to explore this problem by synthesizing human motion with interactions for a group of characters of any size in a zero-shot manner. The key aspect of our approach is the adaptation of human-wise interactions as pairs of human joints that can be either in contact or separated by a desired distance. In contrast to existing methods that necessitate training motion generation models on multi-human motion datasets with a fixed number of characters, our approach inherently possesses the flexibility to model human interactions involving an arbitrary number of individuals, thereby transcending the limitations imposed by the training data. We introduce a novel controllable motion generation method, InterControl, to encourage the synthesized motions maintaining the desired distance between joint pairs.


Estimating and Controlling for Equalized Odds via Sensitive Attribute Predictors

Neural Information Processing Systems

As the use of machine learning models in real world high-stakes decision settings continues to grow, it is highly important that we are able to audit and control for any potential fairness violations these models may exhibit towards certain groups. To do so, one naturally requires access to sensitive attributes, such as demographics, biological sex, or other potentially sensitive features that determine group membership. Unfortunately, in many settings, this information is often unavailable. In this work we study the well known equalized odds (EOD) definition of fairness. In a setting without sensitive attributes, we first provide tight and computable upper bounds for the EOD violation of a predictor.


Controlling the Misinformation Diffusion in Social Media by the Effect of Different Classes of Agents

Yalabadi, Ali Khodabandeh, Yazdani-Jahromi, Mehdi, Abdidizaji, Sina, Garibay, Ivan, Garibay, Ozlem Ozmen

arXiv.org Artificial Intelligence

The rapid and widespread dissemination of misinformation through social networks is a growing concern in today's digital age. This study focused on modeling fake news diffusion, discovering the spreading dynamics, and designing control strategies. A common approach for modeling the misinformation dynamics is SIR-based models. Our approach is an extension of a model called 'SBFC' which is a SIR-based model. This model has three states, Susceptible, Believer, and Fact-Checker. The dynamics and transition between states are based on neighbors' beliefs, hoax credibility, spreading rate, probability of verifying the news, and probability of forgetting the current state. Our contribution is to push this model to real social networks by considering different classes of agents with their characteristics. We proposed two main strategies for confronting misinformation diffusion. First, we can educate a minor class, like scholars or influencers, to improve their ability to verify the news or remember their state longer. The second strategy is adding fact-checker bots to the network to spread the facts and influence their neighbors' states. Our result shows that both of these approaches can effectively control the misinformation spread.


Offline Imitation Learning by Controlling the Effective Planning Horizon

Ahn, Hee-Jun, Shim, Seong-Woong, Lee, Byung-Jun

arXiv.org Artificial Intelligence

In offline imitation learning (IL), we generally assume only a handful of expert trajectories and a supplementary offline dataset from suboptimal behaviors to learn the expert policy. While it is now common to minimize the divergence between state-action visitation distributions so that the agent also considers the future consequences of an action, a sampling error in an offline dataset may lead to erroneous estimates of state-action visitations in the offline case. In this paper, we investigate the effect of controlling the effective planning horizon (i.e., reducing the discount factor) as opposed to imposing an explicit regularizer, as previously studied. Unfortunately, it turns out that the existing algorithms suffer from magnified approximation errors when the effective planning horizon is shortened, which results in a significant degradation in performance. We analyze the main cause of the problem and provide the right remedies to correct the algorithm. We show that the corrected algorithm improves on popular imitation learning benchmarks by controlling the effective planning horizon rather than an explicit regularization.


Momentum Provably Improves Error Feedback!

Fatkhullin, Ilyas, Tyurin, Alexander, Richtárik, Peter

arXiv.org Artificial Intelligence

Due to the high communication overhead when training machine learning models in a distributed environment, modern algorithms invariably rely on lossy communication compression. However, when untreated, the errors caused by compression propagate, and can lead to severely unstable behavior, including exponential divergence. Almost a decade ago, Seide et al [2014] proposed an error feedback (EF) mechanism, which we refer to as EF14, as an immensely effective heuristic for mitigating this issue. However, despite steady algorithmic and theoretical advances in the EF field in the last decade, our understanding is far from complete. In this work we address one of the most pressing issues. In particular, in the canonical nonconvex setting, all known variants of EF rely on very large batch sizes to converge, which can be prohibitive in practice. We propose a surprisingly simple fix which removes this issue both theoretically, and in practice: the application of Polyak's momentum to the latest incarnation of EF due to Richt\'{a}rik et al. [2021] known as EF21. Our algorithm, for which we coin the name EF21-SGDM, improves the communication and sample complexities of previous error feedback algorithms under standard smoothness and bounded variance assumptions, and does not require any further strong assumptions such as bounded gradient dissimilarity. Moreover, we propose a double momentum version of our method that improves the complexities even further. Our proof seems to be novel even when compression is removed from the method, and as such, our proof technique is of independent interest in the study of nonconvex stochastic optimization enriched with Polyak's momentum.