Goto

Collaborating Authors

 Edelman, Shimon


Polarize, Catalyze, Stabilize: How a minority of norm internalizers amplify group selection and punishment

arXiv.org Artificial Intelligence

Many mechanisms behind the evolution of cooperation, such as reciprocity, indirect reciprocity, and altruistic punishment, require group knowledge of individual actions. But what keeps people cooperating when no one is looking? Conformist norm internalization, the tendency to abide by the behavior of the majority of the group, even when it is individually harmful, could be the answer. In this paper, we analyze a world where (1) there is group selection and punishment by indirect reciprocity but (2) many actions (half) go unobserved, and therefore unpunished. Can norm internalization fill this "observation gap" and lead to high levels of cooperation, even when agents may in principle cooperate only when likely to be caught and punished? Specifically, we seek to understand whether adding norm internalization to the strategy space in a public goods game can lead to higher levels of cooperation when both norm internalization and cooperation start out rare. We found the answer to be positive, but, interestingly, not because norm internalizers end up making up a substantial fraction of the population, nor because they cooperate much more than other agent types. Instead, norm internalizers, by polarizing, catalyzing, and stabilizing cooperation, can increase levels of cooperation of other agent types, while only making up a minority of the population themselves.


On the ethics of constructing conscious AI

arXiv.org Artificial Intelligence

In its pragmatic turn, the new discipline of AI ethics came to be dominated by humanity's collective fear of its creatures, as reflected in an extensive and perennially popular literary tradition. Dr. Frankenstein's monster in the novel by Mary Shelley rising against its creator; the unorthodox golem in H. Leivick's 1920 play going on a rampage; the rebellious robots of Karel \v{C}apek -- these and hundreds of other examples of the genre are the background against which the preoccupation of AI ethics with preventing robots from behaving badly towards people is best understood. In each of these three fictional cases (as well as in many others), the miserable artificial creature -- mercilessly exploited, or cornered by a murderous mob, and driven to violence in self-defense -- has its author's sympathy. In real life, with very few exceptions, things are different: theorists working on the ethics of AI completely ignore the possibility of robots needing protection from their creators. The present book chapter takes up this, less commonly considered, ethical angle of AI.


Verbal behavior without syntactic structures: beyond Skinner and Chomsky

arXiv.org Artificial Intelligence

What does it mean to know language? Since the Chomskian revolution, one popular answer to this question has been: to possess a generative grammar that exclusively licenses certain syntactic structures. Decades later, not even an approximation to such a grammar, for any language, has been formulated; the idea that grammar is universal and innately specified has proved barren; and attempts to show how it could be learned from experience invariably come up short. To move on from this impasse, we must rediscover the extent to which language is like any other human behavior: dynamic, social, multimodal, patterned, and purposive, its purpose being to promote desirable actions (or thoughts) in others and self. Recent psychological, computational, neurobiological, and evolutionary insights into the shaping and structure of behavior may then point us toward a new, viable account of language.


Unsupervised Context Sensitive Language Acquisition from a Large Corpus

Neural Information Processing Systems

We describe a pattern acquisition algorithm that learns, in an unsupervised fashion, a streamlined representation of linguistic structures from a plain natural-language corpus. This paper addresses the issues of learning structured knowledge from a large-scale natural language data set, and of generalization to unseen text. The implemented algorithm represents sentences as paths on a graph whose vertices are words (or parts of words). Significant patterns, determined by recursive context-sensitive statistical inference, form new vertices. Linguistic constructions are represented by trees composed of significant patterns and their associated equivalence classes. An input module allows the algorithm to be subjected to a standard test of English as a Second Language (ESL) proficiency. The results are encouraging: the model attains a level of performance considered to be "intermediate" for 9th-grade students, despite having been trained on a corpus (CHILDES) containing transcribed speech of parents directed to small children.


Unsupervised Context Sensitive Language Acquisition from a Large Corpus

Neural Information Processing Systems

We describe a pattern acquisition algorithm that learns, in an unsupervised fashion,a streamlined representation of linguistic structures from a plain natural-language corpus. This paper addresses the issues of learning structuredknowledge from a large-scale natural language data set, and of generalization to unseen text. The implemented algorithm represents sentencesas paths on a graph whose vertices are words (or parts of words). Significant patterns, determined by recursive context-sensitive statistical inference, form new vertices. Linguistic constructions are represented bytrees composed of significant patterns and their associated equivalence classes. An input module allows the algorithm to be subjected toa standard test of English as a Second Language (ESL) proficiency. Theresults are encouraging: the model attains a level of performance consideredto be "intermediate" for 9th-grade students, despite having been trained on a corpus (CHILDES) containing transcribed speech of parents directed to small children.


Automatic Acquisition and Efficient Representation of Syntactic Structures

Neural Information Processing Systems

The distributional principle according to which morphemes that occur in identical contexts belong, in some sense, to the same category [1] has been advanced as a means for extracting syntactic structures from corpus data. We extend this principle by applying it recursively, and by using mutual information for estimating category coherence. The resulting model learns, in an unsupervised fashion, highly structured, distributed representations of syntactic knowledge from corpora. It also exhibits promising behavior in tasks usually thought to require representations anchored in a grammar, such as systematicity.


Automatic Acquisition and Efficient Representation of Syntactic Structures

Neural Information Processing Systems

The distributional principle according to which morphemes that occur in identical contexts belong, in some sense, to the same category [1] has been advanced as a means for extracting syntactic structures from corpus data. We extend this principle by applying it recursively, and by using mutualinformation for estimating category coherence. The resulting model learns, in an unsupervised fashion, highly structured, distributed representations of syntactic knowledge from corpora. It also exhibits promising behavior in tasks usually thought to require representations anchored in a grammar, such as systematicity.


Probabilistic principles in unsupervised learning of visual structure: human data and a model

Neural Information Processing Systems

To find out how the representations of structured visual objects depend on the co-occurrence statistics of their constituents, we exposed subjects to a set of composite images with tight control exerted over (1) the conditional probabilities of the constituent fragments, and (2) the value of Barlow's criterion of "suspicious coincidence" (the ratio of joint probability to the product of marginals). We then compared the part verification response times for various probe/target combinations before and after the exposure. For composite probes, the speedup was much larger for targets that contained pairs of fragments perfectly predictive of each other, compared to those that did not. This effect was modulated by the significance of their co-occurrence as estimated by Barlow's criterion. For lone-fragment probes, the speedup in all conditions was generally lower than for composites. These results shed light on the brain's strategies for unsupervised acquisition of structural information in vision.


Probabilistic principles in unsupervised learning of visual structure: human data and a model

Neural Information Processing Systems

To find out how the representations of structured visual objects depend on the co-occurrence statistics of their constituents, we exposed subjects to a set of composite images with tight control exerted over (1) the conditional probabilities of the constituent fragments, and (2) the value of Barlow's criterion of "suspicious coincidence" (the ratio of joint probability to the product of marginals). We then compared the part verification response times for various probe/target combinations before and after the exposure. For composite probes, the speedup was much larger for targets that contained pairs of fragments perfectly predictive of each other, compared to those that did not. This effect was modulated by the significance of their co-occurrence as estimated by Barlow's criterion. For lone-fragment probes, the speedup in all conditions was generally lower than for composites. These results shed light on the brain's strategies for unsupervised acquisition of structural information in vision.


Probabilistic principles in unsupervised learning of visual structure: human data and a model

Neural Information Processing Systems

To find out how the representations of structured visual objects depend on the co-occurrence statistics of their constituents, we exposed subjects to a set of composite images with tight control exerted over (1) the conditional probabilitiesof the constituent fragments, and (2) the value of Barlow's criterion of "suspicious coincidence" (the ratio of joint probability to the product of marginals). We then compared the part verification response timesfor various probe/target combinations before and after the exposure. For composite probes, the speedup was much larger for targets thatcontained pairs of fragments perfectly predictive of each other, compared to those that did not. This effect was modulated by the significance oftheir co-occurrence as estimated by Barlow's criterion. For lone-fragment probes, the speedup in all conditions was generally lower than for composites. These results shed light on the brain's strategies for unsupervised acquisition of structural information in vision.