
Learning Facts at Scale with Active Reading

arXiv.org Artificial Intelligence

LLMs are known to store vast amounts of knowledge in their parametric memory. However, learning and recalling facts from this memory is unreliable: it depends largely on how prevalent a given fact is in the training data, and on other factors that are poorly understood. Practitioners lack tools for ensuring that models learn a given body of knowledge reliably and consistently. To this end, we propose Active Reading: a framework where we train models to study a given set of material with self-generated learning strategies. First, we demonstrate that models trained with Active Reading on expert domains absorb significantly more knowledge than models trained with vanilla finetuning or other data augmentations. We train expert 8B models that achieve 66% on a Wikipedia-grounded subset of SimpleQA (+313% relative over vanilla finetuning) and 26% on FinanceBench (+160% relative over vanilla finetuning) by applying Active Reading to the source documents for each benchmark. Finally, we show that Active Reading can be used at pre-training scale to build more factual models. As a demonstration, we release Meta WikiExpert-8B, a Wikipedia-expert model trained on 1 trillion generated tokens, which outperforms models with hundreds of billions of parameters on factual QA.
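
The relative gains quoted above can be turned back into approximate baseline scores. The sketch below is only a reading aid: it assumes "relative" means (score - baseline) / baseline, which the abstract does not define, and the function name `implied_baseline` is hypothetical. The resulting numbers are implied by that assumption, not figures reported in the abstract.

```python
def implied_baseline(score_pct: float, relative_gain_pct: float) -> float:
    """Baseline score implied by a final score and a relative gain.

    Assumption (not stated in the abstract):
        relative gain = (score - baseline) / baseline * 100
    """
    return score_pct / (1.0 + relative_gain_pct / 100.0)


# Figures quoted in the abstract:
# benchmark -> (final score in %, relative gain in % over vanilla finetuning)
reported = {
    "SimpleQA (Wikipedia-grounded subset)": (66.0, 313.0),
    "FinanceBench": (26.0, 160.0),
}

for name, (score, gain) in reported.items():
    baseline = implied_baseline(score, gain)
    print(f"{name}: implied vanilla-finetuning baseline of about {baseline:.1f}%")
```

Under that assumption, vanilla finetuning would score roughly 16% on the SimpleQA subset and roughly 10% on FinanceBench.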


Inquire Biology: A Textbook that Answers Questions

AI Magazine

Inquire Biology is a prototype of a new kind of intelligent textbook: one that answers students’ questions, engages their interest, and improves their understanding. Inquire Biology provides unique capabilities via a knowledge representation that captures conceptual knowledge from the textbook and uses inference procedures to answer students’ questions. Students ask questions by typing free-form natural language queries or by selecting passages of text. The system then attempts to answer the question and also generates suggested questions related to the query or selection. The questions supported by the system were chosen to be educationally useful, for example: "What is the structure of X?", "Compare X and Y", and "How does X relate to Y?" In user studies, students found this question-answering capability to be extremely useful while reading and while solving problems. In an initial controlled experiment, community college students using the Inquire Biology prototype outperformed students using either a hardcopy or a conventional e-book version of the same biology textbook. While additional research is needed to fully develop Inquire Biology, the initial prototype clearly demonstrates the promise of applying knowledge representation and question-answering technology to electronic textbooks.