macaw
Ancient Andean parrot trade route stretched over 300 miles
The sophisticated network crossed mountains in Peru and pre-dates the Inca Empire. Ancient parrots really got around. A new analysis of their DNA found that humans transported living Amazonian macaw parrots across the Andes mountains to coastal Peru hundreds of years before the Inca Empire. The findings are detailed in a study published today in the journal and reveal a highly sophisticated and long-distance bird trading network across deserts, highlands, and rainforests.
MACAW: A Causal Generative Model for Medical Imaging
Vigneshwaran, Vibujithan, Ohara, Erik, Wilms, Matthias, Forkert, Nils
Although deep learning techniques show promising results for many neuroimaging tasks in research settings, they have not yet found widespread use in clinical scenarios. One of the reasons for this problem is that many machine learning models only identify correlations between the input images and the outputs of interest, which can lead to many practical problems, such as encoding of uninformative biases and reduced explainability. Thus, recent research is exploring whether integrating a priori causal knowledge into deep learning models is a potential avenue to identify these problems. This work introduces a new causal generative architecture named Masked Causal Flow (MACAW) for neuroimaging applications. Within this context, three main contributions are described. First, a novel approach that integrates complex causal structures into normalizing flows is proposed. Second, counterfactual prediction is performed to identify the changes in effect variables associated with a cause variable. Finally, an explicit Bayesian inference for classification is derived and implemented, providing an inherent uncertainty estimation. The feasibility of the proposed method was first evaluated using synthetic data and then using MRI brain data from more than 23,000 participants of the UK Biobank study. The evaluation results show that the proposed method can (1) accurately encode causal reasoning and generate counterfactuals highlighting the structural changes in the brain known to be associated with aging, (2) accurately predict a subject's age from a single 2D MRI slice, and (3) generate new samples assuming other values for subject-specific indicators such as age, sex, and body mass index. The code for a toy dataset is available at the following link: https://github.com/vibujithan/macaw-2D.git.
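The counterfactual prediction mentioned in this abstract can be illustrated with a deliberately tiny example. The sketch below is not the MACAW architecture; it assumes a single cause (say, age) and a single effect (say, a brain measurement) linked by an invertible affine mechanism, and follows one common recipe for counterfactuals with invertible models: recover the noise term from the observation, change the cause, and regenerate the effect. The class name, parameter values, and variable interpretation are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffineCausalFlow:
    """Toy two-variable invertible model: cause -> effect.

    All parameter values are illustrative assumptions, not fitted values.
    """
    slope: float = -0.5        # hypothetical change in effect per unit of cause
    intercept: float = 100.0   # hypothetical effect value when cause is zero
    noise_scale: float = 2.0   # hypothetical scale of the subject-specific noise

    def encode(self, cause: float, effect: float) -> float:
        """Recover the noise that explains the observed effect (abduction)."""
        return (effect - (self.slope * cause + self.intercept)) / self.noise_scale

    def decode(self, cause: float, noise: float) -> float:
        """Regenerate the effect from a cause value and a noise value."""
        return self.slope * cause + self.intercept + self.noise_scale * noise

    def counterfactual(self, cause: float, effect: float, new_cause: float) -> float:
        """Answer: what would the effect have been under new_cause?"""
        noise = self.encode(cause, effect)    # keep what is specific to the subject
        return self.decode(new_cause, noise)  # change only the cause


flow = AffineCausalFlow()
observed_effect = 81.0
what_if = flow.counterfactual(cause=40.0, effect=observed_effect, new_cause=60.0)
print(what_if)
```

Because the mechanism is invertible, encoding and then decoding with the original cause reproduces the observation exactly, which is the property a flow-based model relies on when it holds subject-specific noise fixed.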
Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning
Tafjord, Oyvind, Mishra, Bhavana Dalvi, Clark, Peter
Our goal is a question-answering (QA) system that can show how its answers are implied by its own internal beliefs via a systematic chain of reasoning. Such a capability would allow better understanding of why a model produced the answer it did. Our approach is to recursively combine a trained backward-chaining model, capable of generating a set of premises entailing an answer hypothesis, with a verifier that checks that the model itself believes those premises (and the entailment itself) through self-querying. To our knowledge, this is the first system to generate multistep chains that are both faithful (the answer follows from the reasoning) and truthful (the chain reflects the system's own internal beliefs). In evaluation using two different datasets, users judge that a majority (70%+) of generated chains clearly show how an answer follows from a set of facts - substantially better than a high-performance baseline - while preserving answer accuracy. By materializing model beliefs that systematically support an answer, new opportunities arise for understanding the model's system of belief, and diagnosing and correcting its misunderstandings when an answer is wrong.
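The recursive combination of a premise generator and a belief verifier described in this abstract can be sketched with lookup tables standing in for the trained models. Everything here is an assumption made for illustration: `RULES` plays the role of the backward-chaining model, `BELIEFS` plays the role of self-querying, and the table is assumed to contain only valid entailments, so the separate entailment check is omitted. It is a minimal sketch of the control flow, not the Entailer system.

```python
from typing import Dict, List, Optional, Set

# Hypothetical stand-in for the trained backward-chaining model:
# each hypothesis maps to candidate premise sets that would entail it.
RULES: Dict[str, List[List[str]]] = {
    "a nail conducts electricity": [
        ["a nail is made of iron", "iron conducts electricity"],
    ],
    "iron conducts electricity": [
        ["iron is a metal", "metals conduct electricity"],
    ],
}

# Hypothetical stand-in for self-querying: statements the model believes.
BELIEFS: Set[str] = {
    "a nail is made of iron",
    "iron is a metal",
    "metals conduct electricity",
}


def generate_premises(hypothesis: str) -> List[List[str]]:
    """Return candidate premise sets for a hypothesis (empty if none)."""
    return RULES.get(hypothesis, [])


def believes(statement: str) -> bool:
    """Return True if the statement is directly believed."""
    return statement in BELIEFS


def prove(hypothesis: str, max_depth: int = 3) -> Optional[dict]:
    """Build a proof tree whose leaves are all believed, or return None."""
    if believes(hypothesis):
        return {"claim": hypothesis, "premises": []}
    if max_depth == 0:
        return None
    for premises in generate_premises(hypothesis):
        subproofs = [prove(p, max_depth - 1) for p in premises]
        if all(sp is not None for sp in subproofs):
            return {"claim": hypothesis, "premises": subproofs}
    return None


proof = prove("a nail conducts electricity")
```

With these tables, the top-level claim is not believed directly, so the proof has two levels: the claim about iron is itself expanded into two believed premises. Limiting the depth to 1 makes the search fail, which shows why the chain has to be multistep.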
General-Purpose Question-Answering with Macaw
While OpenAI's GPT-3 system has proved to be remarkably effective at many tasks, including question-answering (QA), it is still out of reach for many organizations, being only available to approved users for a fee. While there are a few other pretrained QA systems available, none has quite matched GPT-3's few-shot QA performance -- until now. AI2 has just released Macaw (multi-angle question-answering), a versatile, generative question-answering (QA) system that exhibits strong zero-shot performance on a wide range of question types. On a suite of 300 challenge questions, Macaw outperformed GPT-3 by over 10%, even though Macaw is an order of magnitude smaller (11 billion vs. 175 billion parameters). Even better, Macaw is publicly available for free.
Radar Trends to Watch: June 2022
Is thinking of autonomous vehicles as AI systems rather than as robots the next step forward? A new wave of startups is trying techniques such as reinforcement learning to train AVs to drive safely. Generative Flow Networks may be the next major step in building better AI systems. The ethics of building AI bots that mimic real dead people seems like an academic question, until someone does it: using GPT-3, a developer created a bot based on his deceased fiancée. OpenAI objected, stating that building such a bot was a violation of its terms of service.
Global Big Data Conference
OpenAI's impressive AI language model GPT-3 has plenty of things going for it, but with 175 billion parameters no one would claim it's particularly streamlined. The Allen Institute for AI (AI2) has demonstrated a model that performs as well as or better than GPT-3 on answering questions, but is a tenth the size. Macaw, AI2's model, emerged from research being done at the nonprofit into creating an AI that performs at human levels on standardized tests. "After we got a very high score they moved on to harder questions," said AI2 head Oren Etzioni. "There's this paradox where sometimes the questions that are easiest for people are the hardest for machines -- and the biggest gap was in common sense." For instance, he said, asking "When did Tom Hanks land on the moon?" GPT-3 says 1995, since that's when the film Apollo 13 came out.
AI2 shows off an open, Q&A-focused rival to GPT3 – TechCrunch
OpenAI's impressive AI language model GPT-3 has plenty of things going for it, but with 175 billion parameters no one would claim it's particularly streamlined. The Allen Institute for AI (AI2) has demonstrated a model that performs as well as or better than GPT-3 on answering questions, but is a tenth the size. Macaw, AI2's model, emerged from research being done at the nonprofit into creating an AI that performs at human levels on standardized tests. "After we got a very high score they moved on to harder questions," said AI2 head Oren Etzioni. "There's this paradox where sometimes the questions that are easiest for people are the hardest for machines -- and the biggest gap was in common sense."
AI models are becoming better at answering questions, but they're not perfect
Late last year, the Allen Institute for AI, the research institute founded by the late Microsoft cofounder Paul Allen, quietly open-sourced a large AI language model called Macaw. Unlike other language models that've captured the public's attention recently (see OpenAI's GPT-3), Macaw is fairly limited in what it can do, only answering and generating questions. But the researchers behind Macaw claim that it can outperform GPT-3 on a set of questions, despite being an order of magnitude smaller.
DREAM: Uncovering Mental Models behind Language Models
Gu, Yuling, Mishra, Bhavana Dalvi, Clark, Peter
To what extent do language models (LMs) build "mental models" of a scene when answering situated questions (e.g., questions about a specific ethical dilemma)? While cognitive science has shown that mental models play a fundamental role in human problem-solving, it is unclear whether the high question-answering performance of existing LMs is backed by similar model building - and if not, whether that can explain their well-known catastrophic failures. We observed that Macaw, an existing T5-based LM, when probed provides somewhat useful but inadequate mental models for situational questions (estimated accuracy=43%, usefulness=21%, consistency=42%). We propose DREAM, a model that takes a situational question as input to produce a mental model elaborating the situation, without any additional task specific training data for mental models. It inherits its social commonsense through distant supervision from existing NLP resources. Our analysis shows that DREAM can produce significantly better mental models (estimated accuracy=67%, usefulness=37%, consistency=71%) compared to Macaw. Finally, mental models generated by DREAM can be used as additional context for situational QA tasks. This additional context improves the answer accuracy of a Macaw zero-shot model by between +1% and +4% (absolute) on three different datasets.
Offline Meta-Reinforcement Learning with Advantage Weighting
Mitchell, Eric, Rafailov, Rafael, Peng, Xue Bin, Levine, Sergey, Finn, Chelsea
This paper introduces the offline meta-reinforcement learning (offline meta-RL) problem setting and proposes an algorithm that performs well in this setting. Offline meta-RL is analogous to the widely successful supervised learning strategy of pretraining a model on a large batch of fixed, pre-collected data (possibly from various tasks) and fine-tuning the model to a new task with relatively little data. That is, in offline meta-RL, we meta-train on fixed, pre-collected data from several tasks and adapt to a new task with a very small amount (less than 5 trajectories) of data from the new task. By nature of being offline, algorithms for offline meta-RL can utilize the largest possible pool of training data available and eliminate potentially unsafe or costly data collection during meta-training. This setting inherits the challenges of offline RL, but it differs significantly because offline RL does not generally consider a) transfer to new tasks or b) limited data from the test task, both of which we face in offline meta-RL. Targeting the offline meta-RL setting, we propose Meta-Actor Critic with Advantage Weighting (MACAW). MACAW is an optimization-based meta-learning algorithm that uses simple, supervised regression objectives for both the inner and outer loop of meta-training. On offline variants of common meta-RL benchmarks, we empirically find that this approach enables fully offline meta-reinforcement learning and achieves notable gains over prior methods.
Meta-reinforcement learning (meta-RL) has emerged as a promising strategy for tackling the high sample complexity of reinforcement learning algorithms, when the goal is to ultimately learn many tasks. Meta-RL algorithms exploit shared structure among tasks during meta-training, amortizing the cost of learning across tasks and enabling rapid adaptation to new tasks during meta-testing from only a small amount of experience. Yet unlike in supervised learning, where large amounts of pre-collected data can be pooled from many sources to train a single model, existing meta-RL algorithms assume the ability to collect millions of environment interactions online during meta-training.
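The "advantage weighting" in the MACAW name refers to turning policy learning into weighted supervised regression on logged data. The sketch below shows the generic form of that idea as it is commonly described: each logged action is weighted by the exponential of its advantage, and the policy minimizes a weighted negative log-likelihood. It is not the authors' implementation; the function names, the temperature, and the weight clip are assumptions chosen for illustration, and the meta-learning inner and outer loops are left out entirely.

```python
import math
from typing import List, Sequence


def advantage_weights(
    returns: Sequence[float],
    values: Sequence[float],
    temperature: float = 1.0,   # assumed value; controls how sharply weights grow
    max_weight: float = 20.0,   # assumed clip to keep a few samples from dominating
) -> List[float]:
    """Weight each logged action by exp(advantage / temperature), clipped."""
    weights = []
    for ret, val in zip(returns, values):
        advantage = ret - val  # how much better the action did than expected
        weights.append(min(math.exp(advantage / temperature), max_weight))
    return weights


def weighted_nll(log_probs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted negative log-likelihood of the logged actions under the policy."""
    total = sum(w * lp for w, lp in zip(weights, log_probs))
    return -total / len(log_probs)


# Two logged transitions: the first beat its value estimate, the second matched it.
weights = advantage_weights(returns=[2.0, 1.0], values=[1.0, 1.0])
loss = weighted_nll(log_probs=[-1.0, -2.0], weights=weights)
```

Since the objective only needs logged returns, value estimates, and the policy's log-probabilities, it can be evaluated on a fixed dataset without further environment interaction, which is what makes this style of objective a natural fit for the offline setting.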