 Large Language Model


Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling

arXiv.org Artificial Intelligence

Ensembling BERT models often significantly improves accuracy, but at the cost of significantly more computation and memory footprint. In this work, we propose Multi-CLS BERT, a novel ensembling method for CLS-based prediction tasks that is almost as efficient as a single BERT model. Multi-CLS BERT uses multiple CLS tokens with a parameterization and objective that encourages their diversity. Thus instead of fine-tuning each BERT model in an ensemble (and running them all at test time), we need only fine-tune our single Multi-CLS BERT model (and run the one model at test time, ensembling just the multiple final CLS embeddings). To test its effectiveness, we build Multi-CLS BERT on top of a state-of-the-art pretraining method for BERT (Aroca-Ouellette and Rudzicz, 2020). In experiments on GLUE and SuperGLUE we show that our Multi-CLS BERT reliably improves both overall accuracy and confidence estimation. When only 100 training samples are available in GLUE, the Multi-CLS BERT_Base model can even outperform the corresponding BERT_Large model. We analyze the behavior of our Multi-CLS BERT, showing that it has many of the same characteristics and behavior as a typical BERT 5-way ensemble, but with nearly 4-times less computation and memory.
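The idea of "ensembling just the multiple final CLS embeddings" can be illustrated with a minimal sketch. It assumes each of the K CLS embeddings has its own linear classification head and that the per-CLS class probabilities are simply averaged. The names `linear`, `softmax`, and `ensemble_cls_predictions` are hypothetical, and the aggregation rule is an assumption for illustration; the paper's actual parameterization and objective may differ.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, bias, x):
    """Apply a linear head: one row of weights (plus a bias) per class."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def ensemble_cls_predictions(cls_embeddings, heads):
    """Average class probabilities over K CLS embeddings.

    cls_embeddings: list of K embedding vectors (lists of floats), all
        assumed to come from ONE forward pass of a single encoder.
    heads: list of K (weights, bias) pairs, one head per CLS token
        (an assumption of this sketch, not a detail from the abstract).
    """
    probs = [softmax(linear(w, b, emb))
             for emb, (w, b) in zip(cls_embeddings, heads)]
    k = len(probs)
    num_classes = len(probs[0])
    return [sum(p[c] for p in probs) / k for c in range(num_classes)]

# Toy usage: two CLS embeddings that "disagree", with identity heads.
identity_head = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
toy_cls = [[1.0, 0.0], [0.0, 1.0]]
averaged = ensemble_cls_predictions(toy_cls, [identity_head, identity_head])
```

The point of the sketch is the cost profile: the encoder runs once, and only the small per-CLS heads are repeated, which is why this style of ensembling can be much cheaper than running several full models.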


Experimental results from applying GPT-4 to an unpublished formal language

arXiv.org Artificial Intelligence

Can large language models be used to complete mathematical tasks that are traditionally performed either manually or with the aid of theorem provers? To answer this question, a state-of-the-art system, GPT-4, was provided with a concise natural language specification for a previously unpublished formal system and asked to complete a number of tasks, from stating function and type definitions to proving simple theorems and verifying user-supplied proofs. The system completed all tasks successfully, showed extensive domain knowledge, invented helpful new syntax and semantics, and exhibited generalization and inference abilities. So the answer seems to be: yes.


Autoregressive Modeling with Lookahead Attention

arXiv.org Artificial Intelligence

To predict the next token, autoregressive models ordinarily examine the past. Could they also benefit from also examining hypothetical futures? We consider a novel Transformer-based autoregressive architecture that estimates the next-token distribution by extrapolating multiple continuations of the past, according to some proposal distribution, and attending to these extended strings. This architecture draws insights from classical AI systems such as board game players: when making a local decision, a policy may benefit from exploring possible future trajectories and analyzing them. On multiple tasks including morphological ...

However, those NP-hard distributions are artificial. For naturally occurring sequences, why might one expect lookahead to help autoregressive modeling? We argue that when the sequences represent an agent's behavior, an autoregressive parameterization is not always the simplest description. If the behavior is goal-directed--for example, an agent trying to achieve high reward in a Markov Decision Process--then the simplest description may include a characterization of the agent's environment and goals. Even if the agent explicitly consults an autoregressive policy p(action | state) at each step, that policy is not arbitrary: while it may appear complex, it was shaped by reinforcement learning or by natural selection so as to achieve high-reward trajectories.


VNHSGE: VietNamese High School Graduation Examination Dataset for Large Language Models

arXiv.org Artificial Intelligence

The VNHSGE (VietNamese High School Graduation Examination) dataset, developed exclusively for evaluating large language models (LLMs), is introduced in this article. The dataset, which covers nine subjects, was generated from the Vietnamese National High School Graduation Examination and comparable tests. It includes 300 literary essays and over 19,000 multiple-choice questions on a range of topics. By including both textual data and accompanying images, the dataset assesses LLMs on multiple tasks, among them question answering, text generation, reading comprehension, and visual question answering. Using ChatGPT and BingChat, we evaluated LLMs on the VNHSGE dataset and compared their performance with that of Vietnamese students. The results show that ChatGPT and BingChat both perform at a human level in a number of areas, including literature, English, history, geography, and civics education. They still have room to improve, though, especially in mathematics, physics, chemistry, and biology. With its wide-ranging coverage and variety of tasks, the VNHSGE dataset seeks to provide an adequate benchmark for assessing the abilities of LLMs. By making this dataset available to the scientific community, we intend to promote future developments in LLMs, especially in addressing their limitations in disciplines involving mathematics and the natural sciences.


ChatGPT has an official app now. You can even talk to it.

Washington Post - Technology News

This week, OpenAI released a new ChatGPT app meant for use on iPhones and iPads. It's free, syncs with your existing chat history and you can even talk to it -- sort of. While you can blurt out things for the app to transcribe and respond to, the experience is pretty basic; in other words, don't expect it to talk back at you like Siri or Alexa.


Many AI tools are a distraction, but you'd better pay attention

Washington Post - Technology News

On Thursday, OpenAI, the company behind ChatGPT, launched a mobile app on iOS that integrates Whisper, an open-source speech-recognition system, enabling voice input. Workers can use ChatGPT for tasks such as idea generation, note summarization and technical topic assistance. In the last couple of months, Microsoft also announced new AI features for its apps in Microsoft Office, including its email provider Outlook, word processor Word and presentation maker PowerPoint. Similarly, Google released its vision and very first features for its workplace suite of tools called Google Workspace. Other workplace software providers that have recently announced AI integrations include Salesforce and Salesforce-owned Slack, Zoom, Box, Adobe and HubSpot, to name a few.


ChatGPT Is Already Obsolete

The Atlantic - Technology

Last week, at Google's annual conference dedicated to new products and technologies, the company announced a change to its premier AI product: The Bard chatbot, like OpenAI's GPT-4, will soon be able to describe images. Although it may seem like a minor update, the enhancement is part of a quiet revolution in how companies, researchers, and consumers develop and use AI--pushing the technology not only beyond remixing written language and into different media, but toward the loftier goal of a rich and thorough comprehension of the world. ChatGPT is six months old, and it's already starting to look outdated. That program and its cousins, known as large language models, mime intelligence by predicting what words are statistically likely to follow one another in a sentence. Researchers have trained these models on ever more text--at this point, every book ever and then some--with the premise that force-feeding machines more words in different configurations will yield better predictions and smarter programs.


Chuck Schumer courts bipartisan lawmakers for AI regulation

FOX News

Senate Majority Leader Chuck Schumer is attempting to spearhead legislation that will set parameters around the development and use of artificial intelligence. The arch-Democrat is meeting with Sens. Mike Rounds, Martin Heinrich and Todd Young as part of a bipartisan exploratory group, NPR reported Thursday. "Congress must move quickly," Schumer said Thursday from the Senate floor.


Engadget Podcast: How Apple and Google are highlighting accessibility

Engadget

This week, we're focusing on Global Accessibility Awareness Day (GAAD), an annual event meant to promote the need for accessible tech solutions. Cherlynn returns to tell us what Apple, Google, Adobe and others are doing to make their products more useful for people with disabilities (and, it turns out, many general users too). We also discuss Sam Altman's trip to Congress, and why we're not entirely impressed with the OpenAI CEO's calls for AI regulation. Finally, we explain why the BlackBerry movie is one of the best films about tech ever made (take that, Tetris!). Listen below or subscribe on your podcast app of choice. If you've got suggestions or topics you'd like covered on the show, be sure to email us or drop a note in the comments!


The Morning After: ChatGPT has an official iPhone app

Engadget

It's the first official smartphone app for the chatbot, joining a crowded field of third-party mobile AI software that also taps into the GPT-3.5 and GPT-4 APIs powering ChatGPT. It also allows switching between standard and GPT-4 language models for ChatGPT Plus subscribers, as well as conversation history (synced from the desktop if you sign in with the same account) and the ability to export data and delete or rename conversations. It's only available in the US for now, but the company says it will expand to additional countries "in the coming weeks." At the same time, there are reports some tech companies are wary of staff using AI chatbots. In early April, The Economist Korea reported three Samsung employees shared confidential information with ChatGPT.