A Further discussion
–Neural Information Processing Systems
A.1 Cost for incentivization

We justify how LIO accounts for the cost of incentivization as follows. Recall that this cost is incurred in the objective for LIO's incentive function (see (5) and (6)), rather than being included in the total reward (1) that LIO's policy maximizes. The fundamental reason is that the cost should be borne only by the part of the agent that is directly responsible for incentivization. In LIO, the policy and the incentive function are separate modules: the former takes regular actions to maximize external rewards, while only the latter produces incentives that directly and actively shape the behavior of other agents. Because the policy is decoupled from incentivization, it would be incorrect to penalize it for the behavior of the incentive function. Instead, the cost must be attributed directly to the incentive function parameters via (6).
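The separation argued for here can be illustrated with a minimal sketch. Equations (1), (5), and (6) are not reproduced in this section, so the functional forms below are assumptions: the cost is taken to be a discounted sum of L1 norms of the incentives given, weighted by a hypothetical coefficient `alpha`, and all function names are illustrative rather than taken from the LIO implementation.

```python
from typing import Sequence


def policy_reward(env_reward: float, incentives_received: Sequence[float]) -> float:
    """Reward seen by the policy at one step (assumed form of (1)).

    Environment reward plus incentives received from other agents.
    No cost term appears: the policy is not charged for incentives given.
    """
    return env_reward + sum(incentives_received)


def incentive_cost(incentives_given: Sequence[Sequence[float]], gamma: float) -> float:
    """Assumed form of the cost (6): discounted sum of per-step L1 norms.

    incentives_given[t] holds the incentives this agent gave to the
    other agents at time step t.
    """
    return sum(
        (gamma ** t) * sum(abs(x) for x in step)
        for t, step in enumerate(incentives_given)
    )


def incentive_objective(
    env_rewards: Sequence[float],
    incentives_given: Sequence[Sequence[float]],
    gamma: float,
    alpha: float,
) -> float:
    """Assumed form of the incentive function's objective (5).

    Discounted environment return minus alpha times the cost, so the
    cost is charged only to the incentive function's parameters.
    """
    discounted_env = sum((gamma ** t) * r for t, r in enumerate(env_rewards))
    return discounted_env - alpha * incentive_cost(incentives_given, gamma)
```

Under these assumptions, giving larger incentives lowers `incentive_objective` but leaves `policy_reward` unchanged, which is the attribution the paragraph above describes.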
May-31-2025, 11:07:31 GMT