The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains
Large language models can generate text that mimics patterns in their inputs. We introduce a simple Markov chain sequence modeling task in order to study how this in-context learning capability emerges. In our setting, each example is sampled from a Markov chain drawn from a prior distribution over Markov chains. Transformers trained on this task form \emph{statistical induction heads} which compute accurate next-token probabilities given the bigram statistics of the context. During the course of training, models pass through multiple phases: after an initial stage in which predictions are uniform, they learn to sub-optimally predict using in-context single-token statistics (unigrams); then, there is a rapid phase transition to the correct in-context bigram solution. We conduct an empirical and theoretical investigation of this multi-phase process, showing how successful learning results from the interaction between the transformer's layers, and uncovering evidence that the presence of the simpler unigram solution may delay formation of the final bigram solution. We examine how learning is affected by varying the prior distribution over Markov chains, and consider the generalization of our in-context learning of Markov chains (ICL-MC) task to $n$-grams for $n > 2$.
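The task setup described above can be sketched concretely. The following is a minimal illustration, not the authors' implementation: it samples a transition matrix from a symmetric Dirichlet prior (a natural choice of prior; the abstract does not fix one), generates a sequence from it, and contrasts the two in-context estimators the abstract mentions, unigram statistics versus bigram statistics conditioned on the last token. Add-1 smoothing is an assumption for illustration.

```python
import random
from collections import Counter

def sample_markov_chain(k, alpha=1.0, rng=random):
    """Draw a k-state transition matrix, each row from a symmetric Dirichlet(alpha)."""
    P = []
    for _ in range(k):
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
        total = sum(draws)
        P.append([d / total for d in draws])  # normalized gammas = Dirichlet sample
    return P

def generate_sequence(P, length, rng=random):
    """Sample a length-`length` trajectory from transition matrix P."""
    k = len(P)
    seq = [rng.randrange(k)]
    for _ in range(length - 1):
        seq.append(rng.choices(range(k), weights=P[seq[-1]])[0])
    return seq

def unigram_predictor(context, k):
    """In-context unigram statistics: next-token distribution ignoring the last token."""
    counts = Counter(context)
    total = len(context)
    return [(counts[t] + 1) / (total + k) for t in range(k)]  # add-1 smoothing

def bigram_predictor(context, k):
    """In-context bigram statistics: condition on the last token of the context."""
    prev = context[-1]
    counts = Counter(b for a, b in zip(context, context[1:]) if a == prev)
    total = sum(counts.values())
    return [(counts[t] + 1) / (total + k) for t in range(k)]  # add-1 smoothing

rng = random.Random(0)
P = sample_markov_chain(k=3, rng=rng)
seq = generate_sequence(P, length=200, rng=rng)
uni = unigram_predictor(seq, 3)   # the sub-optimal intermediate solution
bi = bigram_predictor(seq, 3)     # the correct in-context bigram solution
```

The bigram predictor approximates the row of $P$ indexed by the last token, which is why it dominates the unigram predictor once enough context has been seen.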
On the ethics of constructing conscious AI
In its pragmatic turn, the new discipline of AI ethics came to be dominated by humanity's collective fear of its creatures, as reflected in an extensive and perennially popular literary tradition. Dr. Frankenstein's monster in the novel by Mary Shelley rising against its creator; the unorthodox golem in H. Leivick's 1920 play going on a rampage; the rebellious robots of Karel \v{C}apek -- these and hundreds of other examples of the genre are the background against which the preoccupation of AI ethics with preventing robots from behaving badly towards people is best understood. In each of these three fictional cases (as well as in many others), the miserable artificial creature -- mercilessly exploited, or cornered by a murderous mob, and driven to violence in self-defense -- has its author's sympathy. In real life, with very few exceptions, things are different: theorists working on the ethics of AI completely ignore the possibility of robots needing protection from their creators. The present book chapter takes up this less commonly considered ethical angle of AI.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (6 more...)
Verbal behavior without syntactic structures: beyond Skinner and Chomsky
What does it mean to know language? Since the Chomskian revolution, one popular answer to this question has been: to possess a generative grammar that exclusively licenses certain syntactic structures. Decades later, not even an approximation to such a grammar, for any language, has been formulated; the idea that grammar is universal and innately specified has proved barren; and attempts to show how it could be learned from experience invariably come up short. To move on from this impasse, we must rediscover the extent to which language is like any other human behavior: dynamic, social, multimodal, patterned, and purposive, its purpose being to promote desirable actions (or thoughts) in others and self. Recent psychological, computational, neurobiological, and evolutionary insights into the shaping and structure of behavior may then point us toward a new, viable account of language.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- (12 more...)
Startup simplifies artificial intelligence for small businesses
An Ohio tech startup is helping other small businesses leverage artificial intelligence to their advantage. Cleveland-based Pandata uses digital tools and data scientists to analyze information and establish systems for other small-scale companies. Pandata's team of experts is able to help businesses that don't have the resources to employ their own staff of specialized experts. "Data scientists are expensive, hard to find, don't exist in high concentrations, especially in markets like Cleveland," said Pandata CEO/Founder Cal Al-Dhubaib. "And so, when we can pair these tools with data science experts, we can deliver the same power of the technology that bigger organizations have access to at a fraction of the cost."
- Information Technology > Artificial Intelligence (1.00)
- Information Technology > Data Science (0.85)
The brain as a probabilistic transducer: an evolutionarily plausible network architecture for knowledge representation, computation, and behavior
Halpern, Joseph Y., Lotem, Arnon
We offer a general theoretical framework for brain and behavior that is evolutionarily and computationally plausible. The brain in our abstract model is a network of nodes and edges. Although it has some similarities to standard neural network models, as we show, there are some significant differences. Both nodes and edges in our network have weights and activation levels. They act as probabilistic transducers that use a set of relatively simple rules to determine how activation levels and weights are affected by input, generate output, and affect each other. We show that these simple rules enable a learning process that allows the network to represent increasingly complex knowledge, and simultaneously to act as a computing device that facilitates planning, decision-making, and the execution of behavior. By specifying the innate (genetic) components of the network, we show how evolution could endow the network with initial adaptive rules and goals that are then enriched through learning. We demonstrate how the developing structure of the network (which determines what the brain can do and how well) is critically affected by the co-evolved coordination between the mechanisms affecting the distribution of data input and those determining the learning parameters (used in the programs run by nodes and edges). Finally, we consider how the model accounts for various findings in the field of learning and decision making, how it can address some challenging problems in mind and behavior, such as those related to setting goals and self-control, and how it can help understand some cognitive disorders.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (3 more...)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Consumer Health (0.93)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.92)
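The abstract above describes the architecture (nodes and edges that both carry weights and activation levels, acting as probabilistic transducers governed by simple local rules) without specifying the rules themselves. The following is a purely illustrative sketch of that flavor of model, not the authors' actual rules: here an edge transmits activation with probability equal to its weight, and a hypothetical Hebbian-style rule strengthens edges joining co-active nodes.

```python
import random

class Node:
    """A node carries both a weight and an activation level."""
    def __init__(self, name, weight=0.5):
        self.name = name
        self.weight = weight
        self.activation = 0.0

class Edge:
    """Edges, like nodes, carry both a weight and an activation level."""
    def __init__(self, src, dst, weight=0.5):
        self.src, self.dst = src, dst
        self.weight = weight
        self.activation = 0.0

def step(nodes, edges, inputs, lr=0.1, rng=random):
    """One update: inject input, transduce probabilistically, adapt weights."""
    for name, value in inputs.items():
        nodes[name].activation = value
    # Probabilistic transduction: an edge passes activation with
    # probability equal to its current weight.
    for e in edges:
        if rng.random() < e.weight:
            e.activation = nodes[e.src].activation
            nodes[e.dst].activation = max(nodes[e.dst].activation,
                                          e.activation * nodes[e.dst].weight)
    # Illustrative Hebbian-style learning: edges between co-active
    # nodes grow stronger (capped at 1.0).
    for e in edges:
        co = nodes[e.src].activation * nodes[e.dst].activation
        e.weight = min(1.0, e.weight + lr * co)

# A toy three-node pathway from sensory input toward motor output.
nodes = {n: Node(n) for n in ("sensor", "hidden", "motor")}
edges = [Edge("sensor", "hidden"), Edge("hidden", "motor")]
rng = random.Random(1)
for _ in range(20):
    step(nodes, edges, {"sensor": 1.0}, rng=rng)
```

Repeated correlated input strengthens the pathway over time, which is one simple way local rules could let such a network accumulate increasingly structured knowledge.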
The Pentagon's Army of Nerds
The Pentagon is not the most inviting place for first-time visitors, and it was no different for Chris Lynch. When he rode the escalator out of the Pentagon metro station, Lynch was greeted by guard dogs and security personnel wearing body armor and toting machine guns. He lost cell service upon entering the building and was forced to run through more than a half mile of hallways to make his meeting in the office of the secretary of defense. He showed up late and out of breath, his hoodie and gym shoes soaked with sweat. It was a surreal experience, Lynch told me, and it marked the beginning of "the most delightful detour of my entire life." Lynch had just completed a 45-day posting in the United States Digital Service, an organization formed in 2014 to fill what many officials viewed as a crucial gap in the government's technology expertise. That year, the White House had launched HealthCare.gov to help enroll Americans in government health insurance, but it had been a technological debacle that almost derailed the Affordable Care Act. The website was so buggy that on its first day, only six people were able to sign up through the site. In response, and to prevent similar flops from occurring in the future, the White House created the USDS.
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military (1.00)
Edelman's Steps Toward a Conscious Artifact
In February of 2020, I participated in the "On Consciousness" podcast with Bernie Baars and David Edelman. We talked about my work at The Neurosciences Institute (NSI) in La Jolla, California on the Darwin series of Brain-Based Devices, as well as my current research in neurorobotics. Unsurprisingly, the conversation turned to consciousness. I happened to mention that a page from my old lab notebook, which is pinned to a bulletin board in my office at UC Irvine, outlines a roadmap towards the creation of a Conscious Artifact. The key steps in this roadmap were laid out by Gerald Edelman, who was the director of the NSI at the time I was a research fellow there.
- Research Report (0.50)
- Personal (0.48)
Hitting the Books: Do we really want our robots to have consciousness?
From Star Trek's Data and 2001's HAL to Columbus Day's Skippy the Magnificent, pop culture is chock full of fully conscious AI who, in many cases, are more human than the humans they serve alongside. But is all that self-actualization really necessary for these synthetic life forms to carry out their essential duties? In his new book, How to Grow a Robot: Developing Human-Friendly, Social AI, author Mark H. Lee examines the social shortcomings of today's AI and delves into the promises and potential pitfalls surrounding deep learning techniques, currently believed to be our most effective tool for building robots capable of doing more than a handful of specialized tasks. In the excerpt below, Lee argues that the robots of tomorrow don't necessarily need -- nor should they particularly seek out -- the feelings and experiences that make up the human condition. Although I argue for self-awareness, I do not believe that we need to worry about consciousness.