Reward Machines for Deep RL in Noisy and Uncertain Environments

Neural Information Processing Systems

Reward Machines provide an automaton-inspired structure for specifying instructions, safety constraints, and other temporally extended reward-worthy behaviour. By exposing the underlying structure of a reward function, they enable the decomposition of an RL task, leading to impressive gains in sample efficiency.
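The automaton structure described above can be illustrated with a minimal sketch. This is a hypothetical toy example of the reward-machine idea, not the paper's implementation: the machine's states track task progress, and transitions keyed on observed propositions carry the reward.

```python
class RewardMachine:
    """Minimal reward-machine sketch: a finite automaton whose
    transitions are triggered by propositions observed in the
    environment and annotated with rewards."""

    def __init__(self, transitions, initial_state):
        # transitions: {(state, proposition): (next_state, reward)}
        self.transitions = transitions
        self.state = initial_state

    def step(self, proposition):
        """Advance on an observed proposition; return the reward."""
        key = (self.state, proposition)
        if key in self.transitions:
            self.state, reward = self.transitions[key]
            return reward
        return 0.0  # no matching transition: stay put, no reward

# Hypothetical task "get coffee, then deliver it to the office":
# reward is only given once the full sequence is completed.
rm = RewardMachine(
    transitions={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("u2", 1.0),
    },
    initial_state="u0",
)
```

Because the automaton exposes which subgoal the agent is in, an RL algorithm can learn a separate policy or value function per machine state, which is the decomposition the abstract refers to.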



On the Utility Gain of Iterative Bayesian Update for Locally Differentially Private Mechanisms

Arcolezi, Héber H., Cerna, Selene, Palamidessi, Catuscia

arXiv.org Artificial Intelligence

This paper investigates the utility gain of using Iterative Bayesian Update (IBU) for private discrete distribution estimation using data obfuscated with Locally Differentially Private (LDP) mechanisms. We compare the performance of IBU to Matrix Inversion (MI), a standard estimation technique, for seven LDP mechanisms designed for one-time data collection and for seven other LDP mechanisms designed for multiple data collections (e.g., RAPPOR). To broaden the scope of our study, we also varied the utility metric, the number of users n, the domain size k, and the privacy parameter ε, using both synthetic and real-world data. Our results suggest that IBU can be a useful post-processing tool for improving the utility of LDP mechanisms in different scenarios without any additional privacy cost. For instance, our experiments show that IBU can provide better utility than MI, especially in high-privacy regimes (i.e., when ε is small). Our paper provides insights for practitioners to use IBU in conjunction with existing LDP mechanisms for more accurate and privacy-preserving data analysis. Finally, we implemented IBU for all fourteen LDP mechanisms in the state-of-the-art multi-freq-ldpy Python package (https://pypi.org/project/multi-freq-ldpy/) and open-sourced all the code used for our experiments as tutorials.
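IBU is, in essence, an expectation-maximization iteration that re-estimates the true value distribution from the distribution of obfuscated reports. Below is a minimal sketch under the assumption that the obfuscation channel matrix is known; the function names and the example channel are illustrative and are not the multi-freq-ldpy API.

```python
def ibu(channel, report_counts, iters=100):
    """Iterative Bayesian Update (EM-style) estimate of the true
    distribution from LDP-obfuscated reports.

    channel[x][y]    = P(report y | true value x), row-stochastic
    report_counts[y] = number of users who reported y
    """
    k = len(channel)
    n = sum(report_counts)
    q = [c / n for c in report_counts]  # empirical report distribution
    p = [1.0 / k] * k                   # start from a uniform prior
    for _ in range(iters):
        new_p = [0.0] * k
        for y in range(k):
            # Posterior-weighted reallocation of the mass observed at y.
            denom = sum(p[x] * channel[x][y] for x in range(k))
            if denom == 0.0:
                continue
            for x in range(k):
                new_p[x] += q[y] * p[x] * channel[x][y] / denom
        p = new_p
    return p

# Example: k = 3, a symmetric randomized-response-style channel that
# reports the truth with probability 0.8, and simulated report counts.
channel = [[0.8, 0.1, 0.1],
           [0.1, 0.8, 0.1],
           [0.1, 0.1, 0.8]]
est = ibu(channel, [59, 24, 17])
```

Unlike Matrix Inversion, each IBU iterate is a proper probability distribution (non-negative, summing to one), which is one reason it can outperform MI in high-privacy regimes where MI's unconstrained inverse often produces negative estimates.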


What's That Beer Style? Ask a Neighbor, or Two

@machinelearnbot

Beer is delicious, but it is not one thing. If you disagree with the former part of the previous sentence, please keep the latter in mind[1]. Think of sports, for instance. Many would agree with the blanket statement "sports are fun," but depending on what you have in mind, two people can easily have opposite reactions to being offered the chance to play ping-pong. Sports are not one thing, music is not one thing, and neither is beer. Presented with a finely crafted brew in a style you prefer, it is difficult to have a more pleasurable gastronomical experience.


Selecting Informative Universum Sample for Semi-Supervised Learning

Chen, Shuo (Tsinghua University) | Zhang, Changshui (Tsinghua University)

AAAI Conferences

The Universum sample, defined as a sample that does not belong to any of the classes the learning task concerns, has been shown to be helpful in both supervised and semi-supervised settings. Previous works treat all Universum samples equally. Our research found that not all Universum samples are helpful, and we propose a method to pick the informative ones, i.e., in-between Universum samples. We also set up a new semi-supervised framework to incorporate the in-between Universum samples. Experiments show that our method outperforms previous ones.