 Large Language Model


Estimating the Personality of White-Box Language Models

arXiv.org Artificial Intelligence

Technology for open-ended language generation, a key application of artificial intelligence, has advanced greatly in recent years. Large-scale language models, which are trained on large corpora of text, are used in a wide range of applications, from virtual assistants to conversational bots. While these language models output fluent text, existing research shows that they can and do capture human biases. Many of these biases, especially those that could potentially cause harm, are well investigated. On the other hand, studies that infer and change the human personality traits inherited by these models have been scarce or non-existent. Our work seeks to address this gap by exploring the personality traits of several large-scale language models designed for open-ended text generation and of the datasets used for training them. We build on the popular Big Five factors and develop robust methods that quantify the personality traits of these models and their underlying datasets. In particular, we trigger the models with a questionnaire designed for personality assessment and subsequently classify the text responses into quantifiable traits using a zero-shot classifier. Our estimation scheme sheds light on an important anthropomorphic element found in such AI models and can help stakeholders decide how they should be applied as well as how society could perceive them. Additionally, we examined approaches to alter these personalities, adding to our understanding of how AI models can be adapted to specific contexts.


Generating medically-accurate summaries of patient-provider dialogue: A multi-stage approach using large language models

arXiv.org Artificial Intelligence

A medical provider's summary of a patient visit serves several critical purposes: supporting clinical decision-making, facilitating hand-offs between providers, and serving as a reference for the patient. An effective summary must be coherent and accurately capture all the medically relevant information in the dialogue, despite the complexity of patient-generated language. Even minor inaccuracies in visit summaries (for example, summarizing "patient does not have a fever" when a fever is present) can be detrimental to the outcome of care for the patient. This paper tackles the problem of medical conversation summarization by discretizing the task into several smaller dialogue-understanding tasks that build on one another sequentially. First, we identify medical entities and their affirmations within the conversation to serve as building blocks. We study dynamically constructing few-shot prompts for tasks by conditioning on relevant patient information and use GPT-3 as the backbone for our experiments. We also develop GPT-derived summarization metrics to measure performance against reference summaries quantitatively. Both our human evaluation study and metrics for medical correctness show that summaries generated using this approach are clinically accurate and outperform the baseline approach of summarizing the dialogue in a zero-shot, single-prompt setting.
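The idea of dynamically constructing a few-shot prompt can be illustrated with a minimal sketch. It assumes a small pool of worked examples and picks the ones most similar to the incoming dialogue using word-overlap (Jaccard) similarity. The similarity measure, the field names `dialogue` and `summary`, and the prompt layout are all assumptions made for illustration; the abstract does not say how the authors select or format their examples.

```python
def jaccard(a, b):
    """Word-overlap similarity between two strings (0.0 to 1.0)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def select_demos(pool, query, k=2):
    """Return the k pool entries whose dialogue is most similar to the query."""
    ranked = sorted(pool, key=lambda ex: jaccard(ex["dialogue"], query), reverse=True)
    return ranked[:k]


def build_prompt(pool, query, k=2):
    """Assemble a few-shot prompt: selected demonstrations, then the open query."""
    parts = [
        f"Dialogue: {ex['dialogue']}\nSummary: {ex['summary']}"
        for ex in select_demos(pool, query, k)
    ]
    parts.append(f"Dialogue: {query}\nSummary:")
    return "\n\n".join(parts)


# Hypothetical example pool, invented for this sketch.
POOL = [
    {"dialogue": "Patient: I have had a fever since Monday.",
     "summary": "Fever for several days."},
    {"dialogue": "Patient: My knee hurts when I climb stairs.",
     "summary": "Knee pain on exertion."},
    {"dialogue": "Patient: No fever, but I have a cough.",
     "summary": "Cough; denies fever."},
]
```

Calling `build_prompt(POOL, "Patient: I have a fever and a cough.")` yields a prompt whose demonstrations are the two fever/cough dialogues, while the unrelated knee-pain example is left out. The resulting string would then be sent to the language model for completion.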


Can GPT-3 Perform Statutory Reasoning?

arXiv.org Artificial Intelligence

Statutory reasoning is the task of reasoning with facts and statutes, which are rules written in natural language by a legislature. It is a basic legal skill. In this paper we explore the capabilities of the most capable GPT-3 model, text-davinci-003, on an established statutory-reasoning dataset called SARA. We consider a variety of approaches, including dynamic few-shot prompting, chain-of-thought prompting, and zero-shot prompting. While we achieve results with GPT-3 that are better than the previous best published results, we also identify several types of clear errors it makes. We investigate why these errors happen. We discover that GPT-3 has imperfect prior knowledge of the actual U.S. statutes on which SARA is based. More importantly, we create simple synthetic statutes, which GPT-3 is guaranteed not to have seen during training. We find GPT-3 performs poorly at answering straightforward questions about these simple synthetic statutes.


CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors

arXiv.org Artificial Intelligence

Large language models (LLMs) pre-trained on massive corpora have demonstrated impressive few-shot learning ability on many NLP tasks. A common practice is to recast the task into a text-to-text format such that generative LLMs of natural language (NL-LLMs) like GPT-3 can be prompted to solve it. However, it is nontrivial to perform information extraction (IE) tasks with NL-LLMs, since the output of an IE task is usually structured and therefore hard to convert into plain text. In this paper, we propose to recast the structured output in the form of code instead of natural language and to use generative LLMs of code (Code-LLMs) such as Codex to perform IE tasks, in particular named entity recognition and relation extraction. In contrast to NL-LLMs, we show that Code-LLMs can be well aligned with these IE tasks by designing code-style prompts and formulating the IE tasks as code generation tasks. Experimental results on seven benchmarks show that our method consistently outperforms both fine-tuning moderate-size pre-trained models specially designed for IE tasks (e.g., UIE) and prompting NL-LLMs under few-shot settings. We further conduct a series of in-depth analyses to demonstrate the merits of leveraging Code-LLMs for IE tasks.
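To make "formulating IE as code generation" concrete, here is a minimal sketch for named entity recognition. Each demonstration is rendered as a small Python function body that appends entities to a list, and the query is rendered the same way but left unfinished so that a code model would continue it. The function name `named_entity_recognition`, the `entity_list.append(...)` layout, and the helper names are illustrative assumptions, not necessarily the exact prompt format used in the paper.

```python
import json
import re


def render_example(text, entities=None):
    """Render one NER instance as a code-style snippet.

    With entities=None the snippet stops after the comment line, leaving
    the entity_list.append(...) calls for the model to generate.
    """
    lines = [
        "def named_entity_recognition(input_text):",
        '    """Extract named entities from input_text."""',
        f"    input_text = {json.dumps(text)}",
        "    entity_list = []",
        "    # extracted named entities",
    ]
    for ent_text, ent_type in entities or []:
        lines.append(
            f'    entity_list.append({{"text": {json.dumps(ent_text)}, '
            f'"type": {json.dumps(ent_type)}}})'
        )
    return "\n".join(lines)


def build_prompt(demos, query):
    """Concatenate completed demonstrations followed by the open query."""
    blocks = [render_example(text, ents) for text, ents in demos]
    blocks.append(render_example(query))
    return "\n\n".join(blocks) + "\n"


APPEND_RE = re.compile(r"entity_list\.append\((\{.*\})\)")


def parse_completion(completion):
    """Recover (text, type) pairs from generated append(...) lines."""
    entities = []
    for line in completion.splitlines():
        match = APPEND_RE.search(line)
        if not match:
            continue
        try:
            record = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip lines the model produced in a malformed shape
        if "text" in record and "type" in record:
            entities.append((record["text"], record["type"]))
    return entities
```

In use, `build_prompt` would be given a handful of labeled sentences plus the new sentence, the resulting string sent to a code model, and the returned continuation passed to `parse_completion`. Because the output is structured as code, recovering the entities reduces to parsing a few regular lines rather than free-form text.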


Solving Regularized Exp, Cosh and Sinh Regression Problems

arXiv.org Artificial Intelligence

In modern machine learning, attention computation is a fundamental task for training large language models such as Transformer, GPT-4 and ChatGPT. In this work, we study the exponential regression problem, which is inspired by the softmax/exp unit in the attention mechanism of large language models. The standard exponential regression problem is non-convex. We study the regularized version of the exponential regression problem, which is convex, and use an approximate Newton method to solve it in input sparsity time. Formally, in this problem, one is given a matrix $A \in \mathbb{R}^{n \times d}$, vectors $b \in \mathbb{R}^n$ and $w \in \mathbb{R}^n$, and any one of the functions $\exp$, $\cosh$ and $\sinh$, denoted $f$. The goal is to find the optimal $x$ that minimizes $0.5 \| f(Ax) - b \|_2^2 + 0.5 \| \mathrm{diag}(w) A x \|_2^2$. The straightforward method is to use the naive Newton's method. Let $\mathrm{nnz}(A)$ denote the number of nonzero entries in the matrix $A$. Let $\omega$ denote the exponent of matrix multiplication. Currently, $\omega \approx 2.373$. Let $\epsilon$ denote the accuracy error. In this paper, we make use of the input sparsity and propose an algorithm that uses $\log ( \|x_0 - x^*\|_2 / \epsilon)$ iterations and $\widetilde{O}(\mathrm{nnz}(A) + d^{\omega} )$ time per iteration to solve the problem.
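For intuition, the objective with $f = \exp$ can be minimized on a tiny dense instance with a plain damped Newton iteration. The sketch below is the naive baseline only: it forms the exact Hessian and does not implement the paper's approximate, input-sparsity-time algorithm. All function names are hypothetical. Writing $u = Ax$, each row contributes $\exp(u_i)(\exp(u_i) - b_i) + w_i^2 u_i$ to the gradient and $2\exp(2u_i) - b_i \exp(u_i) + w_i^2$ to the Hessian weight. The second expression is a quadratic in $\exp(u_i)$ with minimum $w_i^2 - b_i^2/8$, so the sketch assumes $w_i^2 > b_i^2/8$, which keeps every weight positive; this sufficient condition is derived here and is not quoted from the paper.

```python
import math


def objective(A, b, w, x):
    """0.5*||exp(Ax) - b||^2 + 0.5*||diag(w) A x||^2, with A as a list of rows."""
    total = 0.0
    for row, bi, wi in zip(A, b, w):
        u = sum(a * xj for a, xj in zip(row, x))
        total += 0.5 * (math.exp(u) - bi) ** 2 + 0.5 * (wi * u) ** 2
    return total


def grad_and_hessian(A, b, w, x):
    """Exact gradient and Hessian of the objective at x."""
    d = len(x)
    g = [0.0] * d
    H = [[0.0] * d for _ in range(d)]
    for row, bi, wi in zip(A, b, w):
        u = sum(a * xj for a, xj in zip(row, x))
        e = math.exp(u)
        g_coef = e * (e - bi) + wi * wi * u      # first derivative in u
        h_coef = 2 * e * e - bi * e + wi * wi    # second derivative in u
        for j in range(d):
            g[j] += g_coef * row[j]
            for k in range(d):
                H[j][k] += h_coef * row[j] * row[k]
    return g, H


def solve(H, g):
    """Solve H s = g by Gaussian elimination with partial pivoting."""
    n = len(g)
    M = [H[i][:] + [g[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    s = [0.0] * n
    for i in reversed(range(n)):
        s[i] = (M[i][n] - sum(M[i][k] * s[k] for k in range(i + 1, n))) / M[i][i]
    return s


def newton(A, b, w, x0, iters=50, tol=1e-10):
    """Damped Newton: full step, halved until the objective does not increase."""
    x = list(x0)
    for _ in range(iters):
        g, H = grad_and_hessian(A, b, w, x)
        if math.sqrt(sum(v * v for v in g)) < tol:
            break
        step = solve(H, g)
        fx = objective(A, b, w, x)
        t = 1.0
        while t > 1e-8:
            cand = [xi - t * si for xi, si in zip(x, step)]
            if objective(A, b, w, cand) <= fx:
                x = cand
                break
            t *= 0.5
        else:
            break  # no acceptable step found; stop iterating
    return x
```

Forming the Hessian this way costs on the order of $n d^2$ operations per iteration, which is the expense the sparsity-aware method in the abstract is designed to avoid.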


Visual Tuning

arXiv.org Artificial Intelligence

Fine-tuning visual models has been widely shown to give promising performance on many downstream visual tasks. With the surprising development of pre-trained visual foundation models, visual tuning has moved beyond the standard modus operandi of fine-tuning either the whole pre-trained model or just the fully connected layer. Instead, recent advances can achieve better performance than full tuning of all pre-trained parameters while updating far fewer parameters, enabling edge devices and downstream applications to reuse the increasingly large foundation models deployed on the cloud. With the aim of helping researchers get the full picture and future directions of visual tuning, this survey characterizes a large and thoughtful selection of recent works, providing a systematic and comprehensive overview of existing work and models. Specifically, it provides a detailed background of visual tuning and categorizes recent visual tuning techniques into five groups: fine-tuning, prompt tuning, adapter tuning, parameter tuning, and remapping tuning. It also offers some exciting research directions for prospective pre-training and various interactions in visual tuning.


GPT Models Meet Robotic Applications: Co-Speech Gesturing Chat System

arXiv.org Artificial Intelligence

This technical paper introduces a chatting robot system that utilizes recent advancements in large-scale language models (LLMs) such as GPT-3 and ChatGPT (Fig.1). The system is integrated with a co-speech gesture generation system, which selects appropriate gestures based on the conceptual meaning of speech. Our motivation is to explore ways of utilizing the recent progress in LLMs for practical robotic applications, which benefits the development of both chatbots and LLMs. Specifically, it enables the development of highly responsive chatbot systems by leveraging LLMs and adds visual effects to the user interface of LLMs as an additional value. The source code for the system is available on GitHub for our in-house robot and GitHub for Toyota HSR.


Microsoft: Copilot AI helps you skip meetings, zoom through email

PCWorld

The AI-powered Microsoft 365 Copilot could allow you to skip or even double-book meetings without missing out on what was discussed, "hopscotch" through priority email, and more. Microsoft 365 Copilot, announced in March, unfortunately remains in preview. Microsoft is confident enough of what it can do, though, that it said today that it's charging 600 worldwide customers to try it out as part of a Microsoft 365 Copilot Early Access Pass. In March, corporate vice president Jared Spataro said that Copilot would come to basically all Microsoft 365 apps: Word, PowerPoint, Excel, Teams and more. The company released a number of video demonstrations of how Microsoft 365 Copilot will work in its various apps.


'Death of an Author' Prophesies the Future of AI Novels

WIRED

The first time I played the tabletop game Fiasco, it wasn't the story my friends and I made that blew me away. It was the realization that I had just experienced the limitless possibilities of collaborative writing, that the novels I loved featured just one way their narratives could have played out. Alice could have transformed the Mad Tea Party into Wonderland's first organic tea shop. Don Quixote could have devolved into a windmill-killer for hire. Later I realized the similarities between tabletop games and ways novelists challenge their narrative choices, from literary constraints to automatic writing to William Burroughs' cutup method.


Google's behind in AI. Its big event this week could change that.

Washington Post - Technology News

Showing off new tech to customers, the media and investors is key given the perception from analysts and industry observers that Google fumbled its March launch of the "Bard" chatbot, four months after OpenAI debuted ChatGPT and after Microsoft rebooted its Bing search engine with ChatGPT. For most of its two decades, Google has enjoyed a reputation as the undisputed leader in its core business areas. Google Search has no serious competitors, and Google Maps, Gmail, and the Chrome web browser dominate their product categories so deeply that antitrust authorities in multiple countries have launched investigations or filed lawsuits alleging that the company is breaking competition laws. That dominance allowed the company to grow ever bigger, hiring thousands of new employees in the past few years and expanding into new product areas.