 circuit


On Mechanistic Circuits for Extractive Question-Answering

Basu, Samyadeep, Morariu, Vlad, Wang, Zichao, Rossi, Ryan, Zhao, Cherry, Feizi, Soheil, Manjunatha, Varun

arXiv.org Artificial Intelligence

Large language models are increasingly used to process documents and facilitate question-answering on them. In our paper, we extract mechanistic circuits for this real-world language modeling task: context-augmented language modeling for extractive question-answering (QA) tasks and understand the potential benefits of circuits towards downstream applications such as data attribution to context information. We extract circuits as a function of internal model components (e.g., attention heads, MLPs) using causal mediation analysis techniques. Leveraging the extracted circuits, we first understand the interplay between the model's usage of parametric memory and retrieved context towards a better mechanistic understanding of context-augmented language models. We then identify a small set of attention heads in our circuit which performs reliable data attribution by default, thereby obtaining attribution for free in just the model's forward pass. Using this insight, we then introduce ATTNATTRIB, a fast data attribution algorithm which obtains state-of-the-art attribution results across various extractive QA benchmarks. Finally, we show the possibility to steer the language model towards answering from the context, instead of the parametric memory by using the attribution from ATTNATTRIB as an additional signal during the forward pass. Beyond mechanistic understanding, our paper provides tangible applications of circuits in the form of reliable data attribution and model steering.
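As a rough illustration of attribution from attention heads in a single forward pass, the sketch below averages the attention that a few selected heads place on context tokens and picks the context span that receives the most mass. It is not the ATTNATTRIB algorithm, whose details the abstract does not give. The function name `attribute_span`, the input layout, and the averaging rule are all assumptions made for this example.

```python
def attribute_span(attn_rows, spans):
    """Pick the context span that receives the most attention mass.

    attn_rows: one attention distribution per selected head, each a list of
               weights over the context tokens (hypothetical input layout).
    spans:     list of (start, end) token index ranges, end exclusive.
    Returns (index of the best span, list of per-span scores).
    """
    n_tokens = len(attn_rows[0])
    # Average the selected heads token by token.
    avg = [sum(row[t] for row in attn_rows) / len(attn_rows) for t in range(n_tokens)]
    # Score each span by the total averaged attention it receives.
    scores = [sum(avg[start:end]) for start, end in spans]
    best = max(range(len(spans)), key=scores.__getitem__)
    return best, scores


# Two hypothetical heads attending over four context tokens, two spans.
heads = [[0.1, 0.1, 0.6, 0.2], [0.2, 0.0, 0.5, 0.3]]
best, scores = attribute_span(heads, [(0, 2), (2, 4)])
```

In this toy input the second span collects most of the averaged attention, so it is the one returned as the attributed source.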


Announcing the AWS DeepRacer League 2022

#artificialintelligence

Unleash the power of machine learning (ML) through hands-on learning and compete for prizes and glory. The AWS DeepRacer League is the world's first global autonomous racing competition driven by reinforcement learning, bringing together students, professionals, and enthusiasts from almost every continent. I'm Tomasz Ptak, a senior software engineer at Duco, an AWS Machine Learning Hero, an AWS DeepRacer competitor (named Breadcentric), a hobbyist baker, and a leader of the AWS Machine Learning Community on Slack, where we learn, race, and help each other start and grow our adventures in the cloud. It's my pleasure to unveil the exciting details of the upcoming 2022 AWS DeepRacer League season. It's a complete program that has helped over 175,000 individuals from over 700 businesses, educational institutions, and organizations begin their educational journey into machine learning through fun and rivalry.


SAT-based Circuit Local Improvement

Kulikov, Alexander S., Slezkin, Nikita

arXiv.org Artificial Intelligence

Finding exact circuit size is a notoriously hard optimization problem in practice. Whereas modern computers and algorithmic techniques allow one to find a circuit of size seven in the blink of an eye, it may take more than a week to search for a circuit of size thirteen. One of the reasons for this behavior is that the search space is enormous: the number of circuits of size $s$ is $s^{\Theta(s)}$, and the number of Boolean functions on $n$ variables is $2^{2^n}$. In this paper, we explore the following natural heuristic idea for decreasing the size of a given circuit: go through all its subcircuits of moderate size and check whether any of them can be improved by reducing to SAT. This may be viewed as a local search approach: we search for a smaller circuit in a ball around a given circuit. We report the results of experiments with various symmetric functions.
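To make the size of the search space concrete, here is a minimal brute-force sketch that finds the smallest circuit for a tiny Boolean function by iterative deepening. It is not the SAT-based local improvement method of the paper. The names `input_tables`, `gate_outputs`, and `min_circuit_size` are hypothetical, and the basis is assumed to be the ten binary gates that depend on both of their inputs. Truth tables are packed into integers, one bit per input assignment.

```python
from itertools import combinations


def input_tables(n):
    """Truth table of each input variable, packed into an integer of 2**n bits."""
    rows = 1 << n
    return [sum(((r >> i) & 1) << r for r in range(rows)) for i in range(n)]


def gate_outputs(a, b, mask):
    """Truth tables of the ten binary gates that depend on both wires a and b."""
    na, nb = a ^ mask, b ^ mask
    base = {a & b, a & nb, na & b, na & nb, a ^ b}
    # Add the complements: NAND, the OR variants, and XNOR.
    return base | {t ^ mask for t in base}


def min_circuit_size(target, n, max_size=6):
    """Smallest number of gates computing `target`, or None if above max_size."""
    mask = (1 << (1 << n)) - 1
    inputs = input_tables(n)

    def search(wires, budget):
        if target in wires:
            return True
        if budget == 0:
            return False
        for a, b in combinations(wires, 2):
            for t in gate_outputs(a, b, mask):
                if t not in wires and search(wires + [t], budget - 1):
                    return True
        return False

    # Iterative deepening: the first budget that succeeds is the minimum size.
    for size in range(max_size + 1):
        if search(inputs, size):
            return size
    return None


x0, x1, x2 = input_tables(3)
parity_size = min_circuit_size(x0 ^ x1 ^ x2, 3)  # two XOR gates suffice
```

Even in this toy setting the branching factor grows with every added gate, which is why exhaustive search becomes impractical quickly and why improving small subcircuits locally is attractive.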


Qualitative Reasoning about Physical Systems with Multiple Perspectives

AI Magazine

It was motivated by two observations regarding modeling in general and work in qualitative physics in particular. First, all model-based reasoning is only as good as the model used (Davis and Hamscher 1988). Second, no single model is adequate or appropriate for a wide range of tasks (Weld 1989). A model of a real-world system is but an abstraction of some aspects of the system. To formulate a model of a physical system for a given task, we inevitably take certain perspectives of the system to capture proper scenarios by deciding what to describe and what to ignore (Hobbs 1985).


Universal Planning: An (Almost) Universally Bad Idea

AI Magazine

To present a sharp criticism of the approach known as universal planning, I begin by giving a precise definition of it. The key idea in this work is that an agent is working to achieve some goal and that to determine what to do next in the pursuit of this goal, the agent finds its current situation in a large table that prescribes the correct action to take. Of course, the action suggested by the table might simply be, "Think about your current situation and decide what to do next." This method is, in many ways, representative of the conventional approach to planning; however, what distinguishes universal plans from conventional plans is that the action suggested by a universal plan is always a primitive one that the agent can execute immediately (Agre and Chapman 1987; Drummond 1988; Kaelbling 1988; Nilsson 1989; Rosenschein and Kaelbling 1986; Schoppers 1987). Several authors have recently suggested that a possible approach to planning in uncertain domains is to analyze all possible situations beforehand and then store information about what to do in each.
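The table-lookup idea described above can be sketched on a toy problem. The example below is an illustration only and is not taken from the article. It assumes a hypothetical grid world, analyzes every reachable cell beforehand with a breadth-first search from the goal, and stores one primitive action per cell. The names `build_universal_plan` and `run` are invented for this sketch.

```python
from collections import deque

# Primitive actions and the displacement each one causes.
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def build_universal_plan(width, height, goal, walls=frozenset()):
    """Map every cell that can reach the goal to one primitive action."""
    plan = {goal: "stay"}
    queue = deque([goal])
    while queue:
        x, y = queue.popleft()
        for name, (dx, dy) in ACTIONS.items():
            # The predecessor is the cell from which `name` leads to (x, y).
            prev = (x - dx, y - dy)
            inside = 0 <= prev[0] < width and 0 <= prev[1] < height
            if inside and prev not in walls and prev not in plan:
                plan[prev] = name
                queue.append(prev)
    return plan


def run(plan, state, max_steps=100):
    """Execute the plan: look up the current situation, act, and repeat."""
    path = [state]
    while plan[state] != "stay" and len(path) <= max_steps:
        dx, dy = ACTIONS[plan[state]]
        state = (state[0] + dx, state[1] + dy)
        path.append(state)
    return path


plan = build_universal_plan(3, 3, goal=(2, 2))
path = run(plan, (0, 0))
```

The table holds one entry per situation, so its size grows with the number of distinct situations the agent can be in.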


Towards the Principled Engineering of Knowledge

AI Magazine

Toward this end, knowledge acquisition is sometimes considered a necessary burden, carried out under protest so that one can get on with the study of cognitive processes in problem solving. In this article we argue that the two activities, knowledge acquisition and cognitive modeling, are necessarily interwoven, and provide interesting opportunities when taken together. Knowledge acquisition shapes cognitive modeling because operational knowledge contains assumptions and directions for its use, that is, an implicit processing model. In return, problem solving models can profoundly shape knowledge acquisition by providing a framework for the articulation and creation of domain expertise. This introduces the theme of this article, that one can engineer bodies of knowledge for various purposes, such as learnability, ... To the knowledge engineering slogan "knowledge is power," we add "knowledge is an artifact, worthy of design." The organization of this article is as follows: We first consider the practice of "cognitive advantage" ... In the third section we suggest some ... A Shift in Viewpoint from Experts to Clans Over the past decade there have been tremendous advances in the fabrication of integrated circuits (Robinson, 1980a). Circuits have become smaller and manufacturing costs have dropped dramatically. Design is becoming the dominant cost (Robinson, 1980b) with the current round of miniaturization, which goes by the name of VLSI for very large scale integration. This is leading to a substantial interest in understanding design processes.


The VLS Tech-Assist Expert System

AI Magazine

Having convenient access to expert knowledge is important. In the past, we have seen users reinvent solutions because they did not have access to previous experience on the same fault. This lack of available information has led to wasted resources and, in some cases, has generated responses to the fleet that were not accurate enough. The development began in fiscal year 1992, and the area between the solid and dotted lines approximates the cost for development. The peak in fiscal year 1994 represents the end of the operational evaluation and the beginning of production operation.


The Age of Analog Networks

AI Magazine

A large class of systems of biological and technological relevance can be described as analog networks, that is, collections of dynamic devices interconnected by links of varying strength. Some examples of analog networks are genetic regulatory networks, metabolic networks, neural networks, analog electronic circuits, and control systems. Analog networks are typically complex systems that include nonlinear feedback loops and possess temporal dynamics at different time scales. Both the synthesis and reverse engineering of analog networks are recognized as knowledge-intensive activities, for which few systematic techniques exist. In this paper we will discuss the general relevance of the analog network concept and describe an evolutionary approach to the automatic synthesis and the reverse engineering of analog networks.


Engines of the Brain

AI Magazine

Vast information from the neurosciences may enable bottom-up understanding of human intelligence; that is, derivation of function from mechanism. This article describes such a research program: simulation and analysis of the circuits of the brain has led to derivation of a detailed set of elemental and composed operations emerging from individual and combined circuits. The specific hypothesis is forwarded that these operations constitute the "instruction set" of the brain, that is, the basic mental operations from which all complex behavioral and cognitive abilities are constructed, establishing a unified formalism for description of human faculties ranging from perception and learning to reasoning and language, and representing a novel and potentially fruitful research path for the construction of human-level intelligence. Attempts to construct intelligent systems are strongly impeded by the lack of formal specifications of natural intelligence, which is defined solely in terms of observed and measured human (or animal) abilities, so candidate computational descriptions of human-level intelligence are necessarily underconstrained. This simple fact underlies Turing's proposed test for intelligence: lacking any specification to test against, the sole measures at that time were empirical observations of behavior, even though such behaviors may be fitted by multiple different hypotheses and simulated by many different proposed architectures.


Research in Progress

AI Magazine

Research in the area of expert systems has developed from our experience in building consultation programs in a number of application domains (Weiss, Kulikowski, and Safir, 1978; Lindberg et al., 1980; Kulikowski, Weiss, and Galen, 1981; Kulikowski, 1980). The EXPERT system (Weiss and Kulikowski, 1979) is a generalized scheme for building expert reasoning models, exercising them with individual problems, testing and analyzing their performance on large numbers of problem-types, and improving them by knowledge base refinement techniques. The system has been operational on DEC 10/20 computers since 1978; versions also exist on VAX and IBM computers. This system has been used by specialists in medicine, biomedical modeling, oil exploration, and chemistry to build models that capture their expertise in problem solving. In 1981 we completed an interesting technology transfer experiment in which a model for the interpretation of serum protein electrophoresis patterns was automatically translated from its EXPERT representation into algorithmic form, and then automatically translated into assembler code for running on a microprocessor (Weiss, Kulikowski, and Galen, 1981). The EXPERT system is unusual among knowledge-based AI systems in that efficiency is a major design goal.