ad7ed5d47b9baceb12045a929e7e2f66-Supplemental.pdf

Neural Information Processing Systems 

A.1 Cost for incentivization

We justify the way in which LIO accounts for the cost of incentivization as follows. However, both the reward-giver and recipients require sufficient time to learn the effect of incentives, which means that too large an α would lead to the degenerate result of r_{η^i} = 0. On the other extreme, α = 0 means there is no penalty and may result in profligate incentivization that serves no useful purpose.

Let θ^i for i ∈ {1,2} denote each agent's probability of taking the cooperative action. Each plot has a fixed value for the incentive given for the other action.

Each agent observes all agents' positions and can move among the three available states: lever, start, and door.
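Returning to the cost term in A.1, the trade-off governed by α can be illustrated with a minimal sketch. It assumes, purely for illustration, that the reward-giver's objective is its extrinsic return minus α times the total magnitude of the incentives it hands out. The function name `penalized_objective` and all numbers are hypothetical and are not taken from the supplement.

```python
def penalized_objective(extrinsic_return, incentives_given, alpha):
    """Hypothetical reward-giver objective: return minus alpha-weighted incentive cost.

    Assumption: the cost is the sum of absolute incentive values. The
    supplement's exact cost term is not reproduced here.
    """
    cost = sum(abs(r) for r in incentives_given)
    return extrinsic_return - alpha * cost


# With alpha = 0 the incentives are free, so nothing discourages giving them.
free = penalized_objective(10.0, [1.0, 2.0], alpha=0.0)

# With a moderate alpha the same incentives reduce the objective.
moderate = penalized_objective(10.0, [1.0, 2.0], alpha=1.0)

# With a large alpha, giving nothing scores higher than giving anything,
# assuming (for this toy case only) the extrinsic return is unchanged.
large_give = penalized_objective(10.0, [1.0, 2.0], alpha=5.0)
large_none = penalized_objective(10.0, [0.0, 0.0], alpha=5.0)

print(free, moderate, large_give, large_none)
```

In this toy setting the large-α case favors zero incentives, mirroring the degenerate outcome described in A.1, while α = 0 leaves incentive-giving unpenalized. In practice the extrinsic return itself depends on the incentives given, which this sketch deliberately holds fixed.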
