 Large Language Model


Here is what ChatGPT thinks of people in every US state

Daily Mail - Science & tech

ChatGPT has been accused of being woke and shying away from offensive feedback -- but not when it comes to negative stereotypes about Americans. ChatGPT stated that people in Alabama are 'hillbillies', Idahoans are 'gun-touting survivalists', Wisconsinites are 'heavy drinkers' and people in Iowa are just plain 'boring'. When it came to the most populous states, the AI said New Yorkers are rude, Californians are superficial, Texans are pro-gun, Floridians are crazy and Pennsylvanians are unwelcoming to outsiders. However, not all stereotypes were offensive. The bot described Ohioans as down-to-earth, New Mexicans as spiritual, residents in Oregon as hipsters and Nebraskans as friendly.


Goodbye to the Dried Office Mangos

The Atlantic - Technology

Even as the whole of Silicon Valley grapples with historic inflation, a bank crash, and mass layoffs, Google's woes stand apart. The explosion of ChatGPT and artificial intelligence more broadly has produced something of an existential crisis for the company, a "code red" moment for the business. Yes," Sundar Pichai, Google's CEO, told The New York Times. But Google employees are encountering another problem: "They took away the dried mango," says a project manager at Google's San Francisco office, whom I agreed not to name to protect the employee from reprisal. At least at that office, the project manager said, workers are seeing less of long-cherished food items--not just the mango, but also the Maui-onion chips and the fun-size bags of M&Ms.


How Microsoft's Bing Chatbot Came to Be--and Where It's Going Next

WIRED

Jordi Ribas hasn't taken a day off since last September. That month, the Microsoft search and AI chief got the keys to GPT-4, a then-secret version of OpenAI's text-generation technology that now powers ChatGPT. As Ribas had with GPT-4's predecessors, the Barcelona native wrote in Spanish and Catalan to test the AI's knowledge of cities like his hometown and nearby Manresa. When quizzed about history, churches, and museums, its responses hit the mark. Then he asked GPT-4 to solve an electronics problem about the current flowing through a circuit.


Cheat Codex

MIT Technology Review

And then I did what an increasing number of us are doing: I turned to ChatGPT, OpenAI's massively mind-blowing generative AI software, to help me out. After training it on some of my previous work, I asked about the use of AI in education. AI is already doing big things in education. By crunching massive amounts of data on student performance, AI algorithms can tailor instruction to fit the needs of individual learners, which can mean big improvements in student outcomes. Chatbots and virtual assistants can provide students with on-the-spot assistance and feedback.


IT firm taps power of ChatGPT with tech-led Tokyo bookstore

The Japan Times

As bookstores struggle to survive across the country, a Tokyo-based IT firm has decided to go against the stream by entering the sector. For Freee, an IT firm that provides cloud-based applications to manage back-office tasks, opening Tomei Shoten (transparent bookstore) in Tokyo's Taito Ward last week marked an opportunity to experiment with an unconventional business strategy of disclosing real-time sales while also learning more about running a small-scale business. Due to the rise of e-books and online shopping, the number of bookstores in Japan has been falling for the past decade or so. There were 11,495 such outlets as of March, down 30% from 16,371 in the same month in 2013, according to the Japan Publishing Organization for Information Infrastructure Development.


GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models

arXiv.org Artificial Intelligence

Providing natural language instructions in prompts is a useful new paradigm for improving task performance of large language models in a zero-shot setting. Recent work has aimed to improve such prompts via manual rewriting or gradient-based tuning. However, manual rewriting is time-consuming and requires subjective interpretation, while gradient-based tuning can be extremely computationally demanding for large models and may not be feasible for API-based models. In this work, we introduce Gradient-free Instructional Prompt Search (GrIPS), a gradient-free, edit-based search approach for improving task instructions for large language models. GrIPS takes in instructions designed for humans and automatically returns an improved, edited prompt, while allowing for API-based tuning. With InstructGPT models, GrIPS improves the average task performance by up to 4.30 percentage points on eight classification tasks from the Natural Instructions dataset (with similar improvements for OPT, BLOOM, and FLAN-T5). We see improvements for both instruction-only prompts and instruction + k-shot examples prompts. Notably, GrIPS outperforms manual rewriting and purely example-based prompts while controlling for the available compute and data budget. Further, performance of GrIPS is comparable to select gradient-based tuning approaches. Qualitatively, we show our edits can simplify instructions and at times make them incoherent but nonetheless improve accuracy. Our code is available at: https://github.com/archiki/GrIPS
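
The gradient-free, edit-based search described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (that lives in the linked repository). The edit operations (delete a phrase, swap two phrases), the greedy acceptance rule, and the names `candidate_edits`, `edit_search`, and `toy_score` are all assumptions made for illustration. In practice `score` would query a model through its API and measure accuracy on a small scoring set.

```python
import random
from typing import Callable, List, Tuple

Phrases = List[str]


def candidate_edits(phrases: Phrases, rng: random.Random) -> List[Phrases]:
    """Return edited copies of the instruction (hypothetical edit set).

    Two illustrative phrase-level edits are produced: deleting one phrase
    and swapping two phrases. No gradients are involved.
    """
    if len(phrases) < 2:
        return []
    candidates: List[Phrases] = []
    i = rng.randrange(len(phrases))
    candidates.append(phrases[:i] + phrases[i + 1:])  # delete phrase i
    i, j = rng.sample(range(len(phrases)), 2)
    swapped = list(phrases)
    swapped[i], swapped[j] = swapped[j], swapped[i]  # swap phrases i and j
    candidates.append(swapped)
    return candidates


def edit_search(
    phrases: Phrases,
    score: Callable[[Phrases], float],
    steps: int = 10,
    seed: int = 0,
) -> Tuple[Phrases, float]:
    """Greedy search: keep an edited instruction only if its score improves."""
    rng = random.Random(seed)
    best, best_score = list(phrases), score(phrases)
    for _ in range(steps):
        candidates = candidate_edits(best, rng)
        if not candidates:
            break
        top = max(candidates, key=score)
        top_score = score(top)
        if top_score > best_score:
            best, best_score = top, top_score
    return best, best_score


def toy_score(phrases: Phrases) -> float:
    """Stand-in for task accuracy (assumption): favors short instructions
    that still contain the key phrase."""
    bonus = 10.0 if "answer yes or no" in phrases else 0.0
    return bonus - len(phrases)


initial = ["read the sentence", "think carefully", "answer yes or no", "be brief"]
best, best_score = edit_search(initial, toy_score, steps=20, seed=0)
```

Because the search only needs a scalar score per candidate instruction, it works with models that are reachable only through an API, which is the setting the abstract highlights.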


Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models

arXiv.org Artificial Intelligence

Pretrained Language Models (PLMs) learn rich cross-lingual knowledge and can be finetuned to perform well on diverse tasks such as translation and multilingual word sense disambiguation (WSD). However, they often struggle at disambiguating word sense in a zero-shot setting. To better understand this contrast, we present a new study investigating how well PLMs capture cross-lingual word sense with Contextual Word-Level Translation (C-WLT), an extension of word-level translation that prompts the model to translate a given word in context. We find that as the model size increases, PLMs encode more cross-lingual word sense knowledge and better use context to improve WLT performance. Building on C-WLT, we introduce a zero-shot approach for WSD, tested on 18 languages from the XL-WSD dataset. Our method outperforms fully supervised baselines on recall for many evaluation languages without additional training or finetuning. This study presents a first step towards understanding how to best leverage the cross-lingual knowledge inside PLMs for robust zero-shot reasoning in any language.


Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System

arXiv.org Artificial Intelligence

Large-scale Language Models (LLMs) are constrained by their inability to process lengthy inputs. To address this limitation, we propose the Self-Controlled Memory (SCM) system to unleash infinite-length input capacity for large-scale language models. Our SCM system is composed of three key modules: the language model agent, the memory stream, and the memory controller. The language model agent iteratively processes ultra-long inputs and stores all historical information in the memory stream. The memory controller provides the agent with both long-term memory (archived memory) and short-term memory (flash memory) to generate precise and coherent responses. The controller determines which memories from archived memory should be activated and how to incorporate them into the model input. Our SCM system can be integrated with any LLM to enable it to process ultra-long texts without any modification or fine-tuning. Experimental results show that our SCM system enables LLMs, which are not optimized for multi-turn dialogue, to achieve multi-turn dialogue capabilities that are comparable to ChatGPT, and to outperform ChatGPT in scenarios involving ultra-long document summarization or long-term conversations. Additionally, we will supply a test set, which covers common long-text input scenarios, for evaluating the abilities of LLMs in processing long documents. (Work in progress. Code: https://github.com/wbbeyourself/SCM4LLMs)
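
The division of labor between the memory stream and the memory controller can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the SCM implementation: the word-overlap relevance measure, the `flash_size` and `top_k` parameters, and the names `MemoryStream` and `build_model_input` are hypothetical, and the abstract does not say how the real controller ranks or activates memories.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MemoryStream:
    """Stores every past turn in order (the 'memory stream')."""
    turns: List[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.turns.append(text)


def _overlap(a: str, b: str) -> int:
    """Assumed relevance measure: number of shared lowercase tokens."""
    return len(set(a.lower().split()) & set(b.lower().split()))


def build_model_input(
    stream: MemoryStream, query: str, flash_size: int = 2, top_k: int = 1
) -> str:
    """Toy controller: combine activated archived memory with flash memory.

    Flash memory is taken to be the most recent `flash_size` turns and
    archived memory everything before that. Only archived turns that share
    at least one token with the query are activated.
    """
    if flash_size > 0:
        flash = stream.turns[-flash_size:]
        archived = stream.turns[:-flash_size]
    else:
        flash, archived = [], list(stream.turns)
    ranked = sorted(archived, key=lambda t: _overlap(t, query), reverse=True)
    activated = [t for t in ranked[:top_k] if _overlap(t, query) > 0]
    parts = ["[archived memory]", *activated, "[flash memory]", *flash,
             "[query]", query]
    return "\n".join(parts)


stream = MemoryStream()
for turn in ["My dog is named Rex", "I like tea",
             "The weather is nice", "Let us talk about work"]:
    stream.add(turn)

prompt = build_model_input(stream, "What is my dog named?")
```

The resulting `prompt` would then be passed to any language model unchanged, which mirrors the abstract's claim that no modification or fine-tuning of the model is required.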


The Closeness of In-Context Learning and Weight Shifting for Softmax Regression

arXiv.org Artificial Intelligence

Large language models (LLMs) are known for their exceptional performance in natural language processing, making them highly effective in many human life-related or even job-related tasks. The attention mechanism in the Transformer architecture is a critical component of LLMs, as it allows the model to selectively focus on specific input parts. The softmax unit, which is a key part of the attention mechanism, normalizes the attention scores. Hence, the performance of LLMs in various NLP tasks depends significantly on the crucial role played by the attention mechanism with the softmax unit. In-context learning, as one of the celebrated abilities of recent LLMs, is an important concept in querying LLMs such as ChatGPT. Without further parameter updates, Transformers can learn to predict based on a few in-context examples. However, the reason why Transformers become in-context learners is not well understood. Recently, several works [ASA+22,GTLV22,ONR+22] have studied in-context learning from a mathematical perspective based on a linear regression formulation $\min_x\| Ax - b \|_2$, which show Transformers' capability of learning linear functions in context. In this work, we study in-context learning based on a softmax regression formulation $\min_{x} \| \langle \exp(Ax), {\bf 1}_n \rangle^{-1} \exp(Ax) - b \|_2$ of the Transformer's attention mechanism. We show the upper bounds of the data transformations induced by a single self-attention layer and by gradient descent on an $\ell_2$ regression loss for the softmax prediction function, which imply that when training self-attention-only Transformers for fundamental regression tasks, the models learned by gradient descent and Transformers show great similarity.
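
As a concrete reading of the softmax regression objective $\| \langle \exp(Ax), {\bf 1}_n \rangle^{-1} \exp(Ax) - b \|_2$, the short Python sketch below evaluates it for a given $A$, $x$, and $b$. The function name `softmax_regression_loss` and the example inputs are illustrative assumptions. Subtracting the maximum before exponentiating is a standard numerical-stability step that leaves the normalized vector unchanged.

```python
import math
from typing import List


def softmax_regression_loss(
    A: List[List[float]], x: List[float], b: List[float]
) -> float:
    """Evaluate || <exp(Ax), 1_n>^{-1} exp(Ax) - b ||_2 for an n x d matrix A."""
    # z = Ax, one entry per row of A
    z = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    # Shift by max(z) for stability; the shift cancels after normalization.
    m = max(z)
    u = [math.exp(v - m) for v in z]
    total = sum(u)  # plays the role of <exp(Ax), 1_n>
    p = [v / total for v in u]  # the softmax vector
    return math.sqrt(sum((p_i - b_i) ** 2 for p_i, b_i in zip(p, b)))


A = [[1.0, 0.0], [0.0, 1.0]]
x = [0.0, 0.0]

# With x = 0 the softmax vector is uniform, so b = [0.5, 0.5] is matched exactly.
loss_uniform = softmax_regression_loss(A, x, [0.5, 0.5])
# Against b = [1, 0] the residual is [-0.5, 0.5], whose norm is sqrt(0.5).
loss_onehot = softmax_regression_loss(A, x, [1.0, 0.0])
```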


Boosting Theory-of-Mind Performance in Large Language Models via Prompting

arXiv.org Artificial Intelligence

Large language models (LLMs) excel in many tasks in 2023, but they still face challenges in complex reasoning. Theory-of-mind (ToM) tasks, which require understanding agents' beliefs, goals, and mental states, are essential for common-sense reasoning involving humans, making it crucial to enhance LLM performance in this area. This study measures the ToM performance of GPT-4 and three GPT-3.5 variants (Davinci-2, Davinci-3, GPT-3.5-Turbo), and investigates the effectiveness of in-context learning in improving their ToM comprehension. We evaluated prompts featuring two-shot chain of thought reasoning and step-by-step thinking instructions. We found that LLMs trained with Reinforcement Learning from Human Feedback (RLHF) (all models excluding Davinci-2) improved their ToM accuracy via in-context learning. GPT-4 performed best in zero-shot settings, reaching nearly 80% ToM accuracy, but still fell short of the 87% human accuracy on the test set. However, when supplied with prompts for in-context learning, all RLHF-trained LLMs exceeded 80% ToM accuracy, with GPT-4 reaching 100%. These results demonstrate that appropriate prompting enhances LLM ToM reasoning, and they underscore the context-dependent nature of LLM cognitive capacities.