advisee
BRNES: Enabling Security and Privacy-aware Experience Sharing in Multiagent Robotic and Autonomous Systems
Hossain, Md Tamjid, La, Hung Manh, Badsha, Shahriar, Netchaev, Anton
Although experience sharing (ES) accelerates multiagent reinforcement learning (MARL) in an advisor-advisee framework, attempts to apply ES to decentralized multiagent systems have so far relied on trusted environments and overlooked the possibility of adversarial manipulation and inference. In a real-world setting, however, Byzantine attackers disguised as advisors may provide false advice to the advisee and catastrophically degrade the overall learning performance. Likewise, an inference attacker disguised as an advisee may issue repeated queries to infer the advisors' private information, calling the privacy of the entire ES process into question. To address these issues, we propose a novel MARL framework (BRNES) that heuristically selects a dynamic neighbor zone for each advisee at each learning step and adopts a weighted experience aggregation technique to reduce the impact of Byzantine attacks. Furthermore, to keep agents' private information safe from adversarial inference attacks, we leverage local differential privacy (LDP)-induced noise during the ES process. Our experiments show that our framework outperforms the state of the art on the steps-to-goal, obtained-reward, and time-to-goal metrics. In particular, our evaluation shows that the proposed framework is 8.32x faster than current non-private frameworks and 1.41x faster than private frameworks in an adversarial setting.
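A minimal sketch of the two mechanisms the abstract names: weighted experience aggregation and advisor-side LDP noise. The function names, the distance-based neighbor-zone heuristic, the Laplace mechanism, and all parameters here are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def select_neighbor_zone(advisee_pos, advisor_pos, radius):
    """Dynamic neighbor zone (illustrative heuristic): only advisors
    within `radius` of the advisee at the current step may advise."""
    dists = np.linalg.norm(advisor_pos - advisee_pos, axis=1)
    return np.where(dists <= radius)[0]

def ldp_perturb(q_row, epsilon, sensitivity=1.0):
    """Advisor-side Laplace perturbation of shared Q-values: the standard
    Laplace mechanism used for local differential privacy."""
    return q_row + np.random.laplace(0.0, sensitivity / epsilon, q_row.shape)

def aggregate_advice(advice_rows, weights):
    """Weighted aggregation of (noisy) advice: advisors with low trust
    weights contribute little, bounding Byzantine influence."""
    w = np.asarray(weights, dtype=float) / np.sum(weights)
    return np.average(np.stack(advice_rows), axis=0, weights=w)

# Toy usage: 3 advisors, Q-values over 4 actions.
rng = np.random.default_rng(0)
advisee, advisors = np.zeros(2), rng.uniform(-2, 2, (3, 2))
zone = select_neighbor_zone(advisee, advisors, radius=3.0)
advice = [ldp_perturb(rng.uniform(0, 1, 4), epsilon=1.0) for _ in zone]
trust = np.ones(len(zone))  # would be lowered for suspected Byzantine advisors
print(aggregate_advice(advice, trust))
```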
- North America > United States > Nevada > Washoe County > Reno (0.14)
- Europe (0.04)
Should Young Computer Scientists Stop Collaborating with Their Doctoral Advisors?
Shortly after the first author started his tenure-track position at Bar-Ilan University, he published a few additional papers with his doctoral advisor. These papers were mostly "lingering" results from his Ph.D. or direct extensions thereof. He was very surprised when his department chair reprimanded him for this, claiming it could be harmful to his career. Surprisingly, until now, we have been unable to find any support for that claim in the literature. The benefits and importance of mentoring have long been established and span a wide variety of vocational fields both inside and outside academia.2,7 In the academic realm, the benefits of supervision are commonly mutual:6 the advisor extends her capacity to conduct research through delegation and broadens her influence network, while the advisee learns the skills needed to conduct scientific research, receives various types of academic support, and so on.
- Asia > Middle East > Israel (0.04)
- North America > United States > California > Alameda County > Oakland (0.04)
- Asia > Middle East > Jordan (0.04)
Chess as a Testing Grounds for the Oracle Approach to AI Safety
Miller, James D., Yampolskiy, Roman, Häggström, Olle, Armstrong, Stuart
To reduce the danger of powerful super-intelligent AIs, we might make the first such AIs oracles that can only send and receive messages. This paper proposes a possibly practical means of using machine learning to create two classes of narrow AI oracles that would provide chess advice: those aligned with the player's interest, and those that want the player to lose and give deceptively bad advice. The player would be uncertain which type of oracle it was interacting with. As the oracles would be vastly more intelligent than the player in the domain of chess, experience with these oracles might help us prepare for future artificial general intelligence oracles.

Introduction. A few years before the term artificial intelligence (AI) was coined, Turing (1951) suggested that once a sufficiently capable AI has been created, we can "expect the machines to take control". This ominous prediction was almost entirely ignored by the research community for half a century, and only in the last couple of decades have academics begun to address the question of what happens when we build a so-called artificial general intelligence (AGI), i.e., a machine with human-level or superhuman intelligence across the full range of relevant cognitive skills. An increasing number of scientists and scholars have pointed out the crucial importance of making sure that the AGI's goal or utility function is sufficiently aligned with ours, and of doing so before the machine takes control; see, e.g., Yudkowsky (2008), Bostrom (2014), and Russell (2019) for influential accounts of this problem, which today goes by the name AI alignment. Unfortunately, the standard trial-and-error approach to software development, under which we write code with the intention of doing some of the debugging after development, would go disastrously wrong if an AGI took control before we determined how to align its utility function with our values. An alternative to AI alignment, or more likely a complement, that has sometimes been suggested is to initially limit the AGI's ability to interact with the environment until we have verified that the AGI is aligned.
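The interaction the abstract describes, a player consulting an oracle of unknown type, can be made concrete with a toy Bayesian model. Everything below (the class name, the advice-quality probabilities, the posterior update) is an illustrative assumption, not the paper's construction.

```python
import random

class ChessOracle:
    """Toy oracle: aligned oracles usually give good advice, adversarial
    ones usually give deceptively bad advice. Probabilities are assumed
    for illustration only."""
    def __init__(self, aligned, p_good_aligned=0.9, p_good_adversarial=0.3):
        self.p_good = p_good_aligned if aligned else p_good_adversarial

    def advise(self):
        # True = the advice works out well for the player on this move.
        return random.random() < self.p_good

def posterior_aligned(outcomes, prior=0.5, p_a=0.9, p_b=0.3):
    """Player's Bayesian posterior that the oracle is aligned,
    given observed advice outcomes (True = good)."""
    la = lb = 1.0
    for good in outcomes:
        la *= p_a if good else 1.0 - p_a
        lb *= p_b if good else 1.0 - p_b
    return prior * la / (prior * la + (1.0 - prior) * lb)

# The player interacts with an oracle of unknown type and tracks belief.
oracle = ChessOracle(aligned=random.random() < 0.5)
history = [oracle.advise() for _ in range(10)]
print(f"P(aligned | history) = {posterior_aligned(history):.2f}")
```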
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States > New York (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
Beliefs and Expertise in Sequential Decision Making
Seo, Daewon, Raman, Ravi Kiran, Rhim, Joong Bum, Goyal, Vivek K, Varshney, Lav R
This work explores a sequential decision-making problem with agents having diverse expertise and mismatched beliefs. We consider an $N$-agent sequential binary hypothesis test in which each agent sequentially makes a decision based not only on a private observation, but also on previous agents' decisions. In addition, the agents have their own beliefs instead of the true prior, and have varying expertise in terms of the noise variance in the private signal. We focus on the risk of the last-acting agent, where the preceding agents are selfish; thus, we call this advisor(s)-advisee sequential decision making. We first derive the optimal decision rule by recursive belief update and conclude, counterintuitively, that beliefs deviating from the true prior could be optimal in this setting. The impact of diverse noise levels (i.e., diverse expertise levels) in the two-agent case is also considered, and the analytical properties of the optimal belief curves are given. These curves, for certain cases, resemble probability weighting functions from cumulative prospect theory, so we also discuss the choice of Prelec weighting functions as an approximation for the optimal beliefs, and the possible psychophysical optimality of human beliefs. Next, we consider an advisor selection problem wherein an advisee with a certain belief chooses an advisor from a set of candidates with varying beliefs. We characterize the decision region for choosing such an advisor and argue that an advisee whose beliefs deviate from the true prior often ends up selecting a suboptimal advisor, indicating the need for a social planner. We close with a discussion of the study's implications for designing artificial intelligence systems that augment human intelligence.
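A simplified numerical sketch of the setting: Gaussian private signals, a subjective prior in place of the true one, and earlier decisions folded in as votes flipped with a known probability. That last step is a simplification of the paper's exact recursive belief update, and all names and parameter values here are illustrative assumptions.

```python
import math, random

def gauss_pdf(y, mean, sigma):
    return math.exp(-((y - mean) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def decide(y, sigma, belief, prev_decisions, flip=0.2):
    """MAP decision for one agent: combine the private signal (mean +1
    under H=1, -1 under H=0), the agent's subjective prior `belief` on
    H=1, and earlier agents' decisions treated as votes flipped with
    probability `flip` -- a simplification of the exact recursive update."""
    l1 = belief * gauss_pdf(y, +1.0, sigma)
    l0 = (1.0 - belief) * gauss_pdf(y, -1.0, sigma)
    for d in prev_decisions:
        l1 *= (1.0 - flip) if d == 1 else flip
        l0 *= flip if d == 1 else (1.0 - flip)
    return 1 if l1 >= l0 else 0

# One run: N agents with varying expertise (sigmas) and mismatched
# beliefs decide in sequence; the last-acting agent is the advisee.
random.seed(1)
H, sigmas, beliefs = 1, [2.0, 1.0, 0.5], [0.4, 0.6, 0.5]
decisions = []
for sigma, belief in zip(sigmas, beliefs):
    y = (1.0 if H == 1 else -1.0) + random.gauss(0.0, sigma)
    decisions.append(decide(y, sigma, belief, decisions))
print("decisions:", decisions, "advisee correct:", decisions[-1] == H)
```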
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)