Goto

Collaborating Authors

 Large Language Model


Extracting Accurate Materials Data from Research Papers with Conversational Language Models and Prompt Engineering

arXiv.org Artificial Intelligence

There has been a growing effort to replace hand extraction of data from research papers with automated data extraction based on natural language processing, language models, and recently, large language models (LLMs). Although these methods enable efficient extraction of data from large sets of research papers, they require a significant amount of up-front effort, expertise, and coding. In this work we propose the ChatExtract method that can fully automate very accurate data extraction with minimal initial effort and background, using an advanced conversational LLM. ChatExtract consists of a set of engineered prompts applied to a conversational LLM that both identify sentences with data, extract that data, and assure the data's correctness through a series of follow-up questions. These follow-up questions largely overcome known issues with LLMs providing factually inaccurate responses. ChatExtract can be applied with any conversational LLMs and yields very high quality data extraction. In tests on materials data we find precision and recall both close to 90% from the best conversational LLMs, like ChatGPT-4. We demonstrate that the exceptional performance is enabled by the information retention in a conversational model combined with purposeful redundancy and introducing uncertainty through follow-up prompts. These results suggest that approaches similar to ChatExtract, due to their simplicity, transferability, and accuracy are likely to become powerful tools for data extraction in the near future. Finally, databases for critical cooling rates of metallic glasses and yield strengths of high entropy alloys are developed using ChatExtract.


Auditing large language models: a three-layered approach

arXiv.org Artificial Intelligence

Large language models (LLMs) represent a major advance in artificial intelligence (AI) research. However, the widespread use of LLMs is also coupled with significant ethical and social challenges. Previous research has pointed towards auditing as a promising governance mechanism to help ensure that AI systems are designed and deployed in ways that are ethical, legal, and technically robust. However, existing auditing procedures fail to address the governance challenges posed by LLMs, which display emergent capabilities and are adaptable to a wide range of downstream tasks. In this article, we address that gap by outlining a novel blueprint for how to audit LLMs. Specifically, we propose a three-layered approach, whereby governance audits (of technology providers that design and disseminate LLMs), model audits (of LLMs after pre-training but prior to their release), and application audits (of applications based on LLMs) complement and inform each other. We show how audits, when conducted in a structured and coordinated manner on all three levels, can be a feasible and effective mechanism for identifying and managing some of the ethical and social risks posed by LLMs. However, it is important to remain realistic about what auditing can reasonably be expected to achieve. Therefore, we discuss the limitations not only of our three-layered approach but also of the prospect of auditing LLMs at all. Ultimately, this article seeks to expand the methodological toolkit available to technology providers and policymakers who wish to analyse and evaluate LLMs from technical, ethical, and legal perspectives.


BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

arXiv.org Artificial Intelligence

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.


Congress is reportedly limiting staff use of AI models like ChatGPT

Engadget

Congress apparently has strict limits on the use of ChatGPT and similar generative AI tools. Axios claims to have obtained a memo from House of Representatives administrative chief Catherine Szpindor setting narrow conditions for the use of ChatGPT and similar large language AI models in congressional offices. Staff are only allowed to use the paid ChatGPT Plus service due to its tighter privacy controls, and then only for "research and evaluation," Szpindor says. House offices are only allowed to use the chatbot with publicly accessible data even when using Plus, Szpindor adds. The privacy features have to be manually enabled to prevent interactions from feeding data into the AI model.


There Will Never Be Another Second Life

The Atlantic - Technology

The other night, I had an odd conversation with ChatGPT, made somewhat stranger because the AI's answers came out of a humanoid rabbit idly sucking on a juice box. He was standing alone in a virtual novelty store in Second Life, where he had recently been fired. The rabbit, the shop owner explained to me later, was meant to be a clerk, "but he kept trying to sell items that were not for sale." So the rabbit had been demoted to the role of greeter, chatting with customers about the nature of comedy, his own existence, or whatever else they cared to ask. BunnyGPT is among the first bots in the virtual world to have its "mind" wired to OpenAI's large language model.


GPT-4 Stable-Diffusion ?: Enhancing prompt understanding of text-to-image diffusion models with large language models

AIHub

Recent advancements in text-to-image generation with diffusion models have yielded remarkable results synthesizing highly realistic and diverse images. However, despite their impressive capabilities, diffusion models, such as Stable Diffusion, often struggle to accurately follow the prompts when spatial or common sense reasoning is required. The following figure lists four scenarios in which Stable Diffusion falls short in generating images that accurately correspond to the given prompts, namely negation, numeracy, and attribute assignment, spatial relationships. In contrast, our method, LLM-grounded Diffusion (LMD), delivers much better prompt understanding in text-to-image generation in those scenarios. Figure 1: LLM-grounded Diffusion enhances the prompt understanding ability of text-to-image diffusion models.


Google DeepMind CEO Demis Hassabis Says Its Next Algorithm Will Eclipse ChatGPT

WIRED

In 2016, an artificial intelligence program called AlphaGo from Google's DeepMind AI lab made history by defeating a champion player of the board game Go. Now Demis Hassabis, DeepMind's cofounder and CEO, says his engineers are using techniques from AlphaGo to make an AI system dubbed Gemini that will be more capable than that behind OpenAI's ChatGPT. DeepMind's Gemini, which is still in development, is a large language model that works with text and is similar in nature to GPT-4, which powers ChatGPT. But Hassabis says his team will combine that technology with techniques used in AlphaGo, aiming to give the system new capabilities such as planning or the ability to solve problems. "At a high level you can think of Gemini as combining some of the strengths of AlphaGo-type systems with the amazing language capabilities of the large models," Hassabis says.


Meta's new AI lets people make chatbots. They're using it for sex.

Washington Post - Technology News

As Google and OpenAI have grown more secretive about their most powerful AI models, Meta has emerged as a surprising corporate champion of open-source AI. In February it released LLaMA, a language model that's less powerful than GPT-4, but more customizable and cheaper to run. Meta initially withheld key parts of the model's code and planned to limit access to authorized researchers. But by early March those parts, known as the model's "weights," had leaked onto public forums, making LLaMA freely accessible to all.


ChatGPT-style teddy bears could read bedtime stories, toymaker claims

Daily Mail - Science & tech

Teddy bears that read your children stories sounds like a premise for a horror film โ€“ but one expert says it will become a reality in just five years. Allan Wong, co-founder of toymaker VTech, thinks teddies will be fitted with AI that will offer an alternative to parents reading to their kids. Like a cross between ChatGPT and Furby, the toy would listen to everything the child says and use the data to create personalised bedtime tales just for them. AI-enabled teddies will likely be available in 2028, Wong said, although he admitted the possibilities of smart tech are'a little scary'. Smart toys by created Wong's firm have already been the subject of a Which?


AI-powered personalised medicine could revolutionise healthcare (and no, we're not putting ChatGPT in charge) Mihaela van der Schaar

The Guardian

From the soaring costs of US healthcare to the recurrent NHS crisis, it can often seem that effective and affordable healthcare is impossible. This will only get worse as chronic conditions grow in prevalence and we discover new ways to treat previously fatal diseases. These new treatments tend to be costly, while new approaches can be hard to introduce into healthcare systems that are either resistant to change or fatigued by too much of it. Meanwhile, growing demand for social care is compounding funding pressure and making the allocation of resources even more complicated. Artificial intelligence (AI) is often glibly posed as the answer for services that are already forced to do more with less.