Towards the Scalable Evaluation of Cooperativeness in Language Models
Chan, Alan, Riché, Maxime, Clifton, Jesse
It is likely that AI systems driven by pre-trained language models (PLMs) will increasingly be used to assist humans in high-stakes interactions with other agents, such as negotiation or conflict resolution. Consistent with the goals of Cooperative AI \citep{dafoe_open_2020}, we wish to understand and shape the multi-agent behaviour of PLMs in a pro-social manner. An important first step is the evaluation of model behaviour across diverse cooperation problems. Since desired behaviour in an interaction depends upon its precise game-theoretic structure, we focus on generating scenarios with particular structures, using both crowdworkers and a language model. Our work proceeds as follows. First, we discuss key methodological issues in the generation of scenarios corresponding to particular game-theoretic structures. Second, we employ both crowdworkers and a language model to generate such scenarios, and find that the quality of generations tends to be mediocre in both cases. We additionally have both crowdworkers and a language model judge whether given scenarios align with their intended game-theoretic structure, finding mixed results depending on the game. Third, we provide a dataset of scenarios based on our generated data, along with quantitative and qualitative evaluations of UnifiedQA and GPT-3 on this dataset. We find that instruct-tuned models tend to act in ways that could be perceived as cooperative as they are scaled up, while other models seem to have flat scaling trends.
- Asia > Pakistan (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Asia > India (0.04)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Law (1.00)
- Government (1.00)
- Leisure & Entertainment > Games (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
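The abstract above notes that desired behaviour depends on a scenario's precise game-theoretic structure, which means a candidate scenario's payoffs can be checked mechanically against canonical 2x2 games. A minimal sketch of such a check, using the standard R/S/T/P payoff labels; the function name and threshold conditions are illustrative, not taken from the paper:

```python
# Hypothetical sketch: classify the symmetric 2x2 game implied by a
# scenario's payoffs. R = reward (mutual cooperation), S = sucker's
# payoff, T = temptation to defect, P = punishment (mutual defection).

def classify_game(R: float, S: float, T: float, P: float) -> str:
    """Return the canonical name of a symmetric 2x2 game, if any."""
    if T > R > P > S:
        # Defection strictly dominates, yet mutual defection is worst-but-one.
        return "Prisoner's Dilemma"
    if R > T and P > S:
        # Mutual cooperation is the payoff-dominant equilibrium.
        return "Stag Hunt"
    if T > R > S > P:
        # Mutual defection is the worst outcome for both players.
        return "Chicken"
    return "Other"

print(classify_game(R=3, S=0, T=5, P=1))  # Prisoner's Dilemma
```

A validator like this could flag crowdworker- or model-generated scenarios whose stated payoffs fail to realise the intended structure.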
Legible Normativity for AI Alignment: The Value of Silly Rules
Hadfield-Menell, Dylan, Andrus, McKane, Hadfield, Gillian K.
It has become commonplace to assert that autonomous agents will have to be built to follow human rules of behavior: social norms and laws. But human laws and norms are complex and culturally varied systems; in many cases agents will have to learn the rules. This requires autonomous agents to have models of how human rule systems work so that they can make reliable predictions about rules. In this paper we contribute to the building of such models by analyzing an overlooked distinction between important rules and what we call silly rules: rules with no discernible direct impact on welfare. We show that silly rules render a normative system both more robust and more adaptable in response to shocks to perceived stability. They make normativity more legible for humans, and can increase legibility for AI systems as well. For AI systems to integrate into human normative systems, we suggest, it may be important for them to have models that include representations of silly rules.
- South America > Brazil (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Everything You Need to Watch on TV This Fall, From 'Orville' to 'Punisher'
A wonderful time of football, things inexplicably getting pumpkin spice flavoring, and way more new TV than anyone could ever possibly watch. Seriously, there are a gajillion channels and streaming networks now, how can anyone dream of knowing what to turn on? Between all the superheroes, strictly-for-adults animated programs, and 1990s reboots out there it's impossible to keep up. But we have some ideas. Below are WIRED's picks for what you should watch (or at least DVR) this season, and one or two suggestions for what you can easily skip. By far the funniest part of this science fiction adventure comedy is when the opening credits say "created by Seth MacFarlane," because longtime Star Trek fans will immediately recognize everything else as the DNA (and proteins, bones, musculature, and central nervous system) of Star Trek: The Next Generation.
- North America > United States > New York (0.05)
- North America > United States > Hawaii (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- Europe > Portugal > Braga > Braga (0.04)
- Media > Television (1.00)
- Leisure & Entertainment (1.00)
'The Punisher' Will Feature Less Supernatural Elements, More 'Basic Human Emotions'
Netflix's next Marvel TV show is going to be a little different. While The Defenders all have super powers, Frank Castle is a little more ordinary. That means that many parts of the other Marvel shows will not cross over into "The Punisher." "We are stripping down every supernatural element," star Jon Bernthal told Entertainment Weekly. Frank is a character rooted in the most basic human emotions… He's a comic-book character, but he doesn't fly, he doesn't have X-ray vision.
- Media > Television (1.00)
- Leisure & Entertainment (1.00)
Emergence of Social Punishment and Cooperation through Prior Commitments
Han, The Anh (Teesside University)
Social punishment, whereby cooperators punish defectors, has been suggested as an important mechanism that promotes the emergence of cooperation or the maintenance of social norms in the context of one-shot (i.e., non-repeated) interactions. However, whenever antisocial punishment, whereby defectors punish cooperators, is available, this antisocial behavior outperforms social punishment, leading to the destruction of cooperation. In this paper, we use evolutionary game theory to show that this antisocial behavior can be efficiently restrained by relying on prior commitments, wherein agents can arrange, prior to an interaction, agreements regarding subsequent compensation by those who dishonor them. We show that, although the commitment mechanism by itself can guarantee a notable level of cooperation, a significantly higher level is achieved when both mechanisms, proposing prior commitments and punishment, are available together. Interestingly, social punishment prevails and dominates in this system, as it can take advantage of the commitment mechanism to cope with antisocial behaviors. That is, establishing a commitment system helps to pave the way for the evolution of social punishment and abundant cooperation, even in the presence of antisocial punishment.
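The commitment mechanism in the abstract above has a simple payoff logic: a player who agrees to cooperate and then defects owes compensation, which, if large enough, removes the temptation to defect. A minimal sketch in a one-shot Donation game; the parameter names (b, c, delta) and function are my own illustration, not the paper's notation or model:

```python
# Hypothetical sketch of commitment-backed cooperation in a one-shot
# Donation game: a cooperator pays cost c to give the other player
# benefit b; a player who committed and then defects pays compensation delta.

b, c = 2.0, 1.0      # benefit received and cost paid for cooperating
delta = 3.0          # compensation owed for dishonoring a commitment

def payoff(me_coop: bool, you_coop: bool, committed: bool) -> float:
    p = (b if you_coop else 0.0) - (c if me_coop else 0.0)
    if committed and not me_coop:
        p -= delta   # dishonoring the agreement is costly
    return p

# Without commitment, defection against a cooperator pays more (the dilemma):
print(payoff(False, True, committed=False))  # 2.0
print(payoff(True, True, committed=False))   # 1.0
# With delta > b - c, honoring the commitment becomes the better choice:
print(payoff(False, True, committed=True))   # -1.0
print(payoff(True, True, committed=True))    # 1.0
```

In the paper's fuller evolutionary setting, strategies that propose commitments and punish dishonoring agents can then invade and stabilise cooperation; the sketch only shows the static incentive shift.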