AITopics | Large Language Model

Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints

Jawahar, Ganesh, Mukherjee, Subhabrata, Dey, Debadeepta, Abdul-Mageed, Muhammad, Lakshmanan, Laks V. S., Mendes, Caio Cesar Teodoro, de Rosa, Gustavo Henrique, Shah, Shital

arXiv.org Artificial Intelligence, Jun-7-2023

Autocomplete is a task where the user inputs a piece of text, termed the prompt, on which the model conditions to generate a semantically coherent continuation. Existing work on this task has primarily focused on datasets (e.g., email, chat) with high-frequency user prompt patterns (or focused prompts), where word-based language models have been quite effective. In this work, we study the more challenging open-domain setting consisting of low-frequency user prompt patterns (or broad prompts, e.g., a prompt about the 93rd Academy Awards) and demonstrate the effectiveness of character-based language models. We study this problem under memory-constrained settings (e.g., edge devices and smartphones), where character-based representation is effective in reducing the overall model size (in terms of parameters). We use the WikiText-103 benchmark to simulate broad prompts and demonstrate that character models rival word models in exact match accuracy for the autocomplete task when controlled for model size. For instance, we show that a 20M parameter character model performs similarly to an 80M parameter word model in the vanilla setting. We further propose novel methods to improve character models by incorporating inductive bias in the form of compositional information and representation transfer from large word models. Datasets and code used in this work are available at https://github.com/UBC-NLP/char_autocomplete.
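The abstract reports exact match accuracy but does not define it. As a rough illustration only, the sketch below assumes the metric is the fraction of prompts whose first `n_words` predicted words equal those of the reference continuation. The function name `exact_match_accuracy` and the `n_words` parameter are hypothetical and are not taken from the paper or its repository.

```python
from typing import Sequence


def exact_match_accuracy(
    predictions: Sequence[str],
    references: Sequence[str],
    n_words: int = 3,
) -> float:
    """Fraction of prompts whose first n_words predicted words match the reference.

    Assumption: whitespace tokenization and a fixed word horizon; the paper
    may define the metric differently.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = 0
    for pred, ref in zip(predictions, references):
        # Compare only the leading n_words words of each continuation.
        if pred.split()[:n_words] == ref.split()[:n_words]:
            hits += 1
    return hits / len(predictions)
```

Because the comparison is on whole words, a character model and a word model can be scored the same way once their outputs are decoded to text, which is what makes a size-controlled comparison between them possible.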

large language model, machine learning, simulation of human behavior, (21 more...)

2210.03251

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Marshall Islands (0.04)
North America > United States > Texas > Andrews County > Andrews (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.48)
Media > Film (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (1.00)
(2 more...)

Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization

He, Pengcheng, Peng, Baolin, Lu, Liyang, Wang, Song, Mei, Jie, Liu, Yang, Xu, Ruochen, Awadalla, Hany Hassan, Shi, Yu, Zhu, Chenguang, Xiong, Wayne, Zeng, Michael, Gao, Jianfeng, Huang, Xuedong

arXiv.org Artificial Intelligence, Jun-7-2023

This paper presents Z-Code++, a new pre-trained language model optimized for abstractive text summarization. The model extends the state-of-the-art encoder-decoder model using three techniques. First, we use a two-phase pre-training process to improve the model's performance on low-resource summarization tasks. The model is first pre-trained using text corpora for language understanding, and is then continually pre-trained on summarization corpora for grounded text generation. Second, we replace self-attention layers in the encoder with disentangled attention layers, where each word is represented using two vectors that encode its content and position, respectively. Third, we use fusion-in-encoder, a simple yet effective method of encoding long sequences in a hierarchical manner. Z-Code++ sets a new state of the art on 9 out of 13 text summarization tasks across 5 languages. Our model is parameter-efficient in that it outperforms the 600x larger PaLM-540B on XSum, and the fine-tuned 200x larger GPT3-175B on SAMSum. In zero-shot and few-shot settings, our model substantially outperforms the competing models.

large language model, machine learning, natural language, (20 more...)

2208.0977

Country: South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)

The AI Mona Lisa Explains Everything

The Atlantic - Technology, Jun-6-2023, 17:46:40 GMT

The Mona Lisa is small. Less than three feet tall and about two feet wide, it hangs tiny in the biggest exhibition room at France's Louvre Museum. And in the past two or so weeks, some vigilante AI artists have decided that it should be bigger--much bigger. They're making that happen using a beta tool in Adobe Photoshop called "generative fill." It launched late last month and allows users to fill in, augment, or expand an image using AI--think ChatGPT but for Photoshop.

generative ai, mona lisa, museum, (11 more...)

Country: Europe > France (0.25)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.51)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.36)

Black Mirror written by ChatGPT: creator asked AI to write an episode of his hit Netflix show

Daily Mail - Science & tech, Jun-6-2023, 17:37:51 GMT

The creator of the darkly addictive sci-fi series Black Mirror saw it fitting to ask ChatGPT to conjure up an episode for Season 6, only to find the chatbot 'is sh***.' Charlie Brooker, 52, said he typed in 'generate Black Mirror episode' and received a story 'that sorta mushed' all the other ones together. The first thing Black Mirror creator Charlie Brooker did, when everyone was trying ChatGPT for the first time, was to type in 'generate Black Mirror episode.' Speaking to Empire, Brooker found there was no real thought behind the AI-generated script, only that it read 'plausibly.' Brooker -- who has been writing most episodes of the haunting, Twilight Zone-esque series since its first 2011 season on UK's Channel 4 -- said that his brush with an AI-generated doppelgänger of his own show did teach him to be less robotic himself. The Black Mirror creator's experience with ChatGPT has encouraged him to make bolder creative choices with future seasons of the dystopian anthology series. One upcoming episode, 'Beyond The Sea,' starring Josh Hartnett, takes place in an alternate 1969. ChatGPT was first unleashed in November, sparking excitement and alarm at its ability to generate convincingly human-like essays, poems, form letters and conversational answers to almost any question. 'I was aware that I had written lots of episodes where someone goes "Oh, I was inside a computer the whole time!"'

black mirror, brooker, chatgpt, (14 more...)

Country: Europe > United Kingdom > England (0.16)

Industry:

Media > Television (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

We Asked AI to Generate News Quizzes Based on TIME's Archives. Test Yourself With the Results

TIME - Tech, Jun-6-2023, 17:04:19 GMT

The news quiz is a tradition at TIME that dates back to 1935. Iterations of the test were used in schools across the country to examine current-affairs knowledge, and it even came in a crossword version. Now, the recent removal of TIME's digital paywall has opened up a century of journalism for everyone, ripe for testing your knowledge about the people who shaped history. Since TIME's archive contains 200 million words, it's a task that's well-suited for the new generation of AI technology, which is able to analyze huge amounts of human-generated text in seconds. So what happens when you turn the power of cutting-edge AI to the task of generating news quizzes based on magazine articles?

instruction, news quiz, quiz, (9 more...)

Country: North America > United States > Tennessee (0.05)

Industry:

Education (0.51)
Media (0.38)
Leisure & Entertainment (0.33)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.56)

Don't Want Students to Rely on ChatGPT? Have Them Use It

WIRED, Jun-6-2023, 13:00:00 GMT

When I first caught students attempting to use ChatGPT to write their essays, it felt like an inevitability. My initial reaction was frustration and irritation--not to mention gloom and doom about the slow collapse of higher education--and I suspect most educators feel the same way. But as I thought about how to respond, I realized there could be a teaching opportunity. Many of these essays used sources incorrectly, either quoting from books that did not exist or misrepresenting those that did. When students were starting to use ChatGPT, they seemed to have no idea that it could be wrong.

chatgpt, educator, student, (6 more...)

Country: North America > United States > North Carolina (0.05)

Industry:

Education > Educational Setting (0.36)
Media > News (0.31)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Apple says new iPhone autocorrect will know you don't mean 'duck'

Washington Post - Technology News, Jun-6-2023, 12:46:25 GMT

"In those moments where you just want to type a ducking word, well, the keyboard will learn it, too," Craig Federighi, Apple's software chief, said during the event in Cupertino, Calif., explaining that the keyboard will fill in the blanks using the same technology that powers ChatGPT and that suggestions would become more personalized. That means, as The Post has reported, the automatic suggestions will be based on the words and phrases you use most and it will also apply to voice dictation.

apple, new iphone autocorrect, suggestion, (1 more...)

Country: North America > United States > California > Santa Clara County > Cupertino (0.36)

Technology:

Information Technology > Communications > Mobile (0.40)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.36)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Judges likely to take AI rules into their own hands as lawmakers slow to act: experts

FOX News, Jun-6-2023, 11:00:14 GMT

Center for AI Safety Director Dan Hendrycks explains concerns about how the rapid growth of artificial intelligence could impact society. Judges are likely to take concerns over artificial intelligence into their own hands and create their own rules for the tech in courtrooms, experts say. U.S. District Judge Brantley Starr of the Northern District of Texas may have been a pioneer last week when he required lawyers who appear in his courtroom to certify they did not use artificial intelligence programs, such as ChatGPT, to draft their filings without a human checking for accuracy. "We're at least putting lawyers on notice, who might not otherwise be on notice, that they can't just trust those databases," Starr, a Trump-appointed judge, told Reuters. "They've got to actually verify it themselves through a traditional database."

artificial intelligence, courtroom, lawyer, (14 more...)

Country:

North America > United States > Texas (0.28)
Europe > United Kingdom (0.06)
North America > United States > New York (0.05)

Industry:

Law > Litigation (0.75)
Government > Regional Government > North America Government > United States Government (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.37)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.37)

To avoid AI doom, learn from nuclear safety

MIT Technology Review, Jun-6-2023, 08:52:52 GMT

Last week, a group of tech company leaders and AI experts pushed out another open letter, declaring that mitigating the risk of human extinction due to AI should be as much of a global priority as preventing pandemics and nuclear war. So how do companies themselves propose we avoid AI ruin? One suggestion comes from a new paper by researchers from Oxford, Cambridge, the University of Toronto, the University of Montreal, Google DeepMind, OpenAI, Anthropic, several AI research nonprofits, and Turing Award winner Yoshua Bengio. They suggest that AI developers should evaluate a model's potential to cause "extreme" risks at the very early stages of development, even before starting any training. These risks include the potential for AI models to manipulate and deceive humans, gain access to weapons, or find cybersecurity vulnerabilities to exploit.

large language model, machine learning, natural language, (10 more...)

Country:

North America > Canada > Ontario > Toronto (0.58)
North America > Canada > Quebec > Montreal (0.26)

Industry:

Information Technology (0.61)
Government > Military (0.58)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.63)

A Watermark for Large Language Models

Kirchenbauer, John, Geiping, Jonas, Wen, Yuxin, Katz, Jonathan, Miers, Ian, Goldstein, Tom

arXiv.org Artificial Intelligence, Jun-6-2023

Potential harms of large language models can be mitigated by watermarking model output, i.e., embedding signals into generated text that are invisible to humans but algorithmically detectable from a short span of tokens. We propose a watermarking framework for proprietary language models. The watermark can be embedded with negligible impact on text quality, and can be detected using an efficient open-source algorithm without access to the language model API or parameters. The watermark works by selecting a randomized set of "green" tokens before a word is generated, and then softly promoting use of green tokens during sampling. We propose a statistical test for detecting the watermark with interpretable p-values, and derive an information-theoretic framework for analyzing the sensitivity of the watermark. We test the watermark using a multi-billion parameter model from the Open Pretrained Transformer (OPT) family, and discuss robustness and security.
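The mechanism described in the abstract (a pseudo-random "green" set chosen before each token, a soft boost to green tokens during sampling, and a statistical test at detection time) can be sketched in a few lines. This is a toy illustration, not the authors' implementation: the flat-logit "model", the integer vocabulary, the secret `key`, and the constants `GAMMA` (green fraction) and `DELTA` (logit bonus) are all assumptions made for the example, and the test shown is a plain one-proportion z-test.

```python
import hashlib
import math
import random
from typing import List, Set

VOCAB_SIZE = 1000   # toy vocabulary of integer token ids (assumption)
GAMMA = 0.5         # fraction of the vocabulary marked green (assumption)
DELTA = 4.0         # logit bonus added to green tokens (assumption)


def green_set(prev_token: int, key: str = "demo-key") -> Set[int]:
    """Pseudo-randomly pick the green tokens, seeded by the previous token."""
    digest = hashlib.sha256(f"{key}:{prev_token}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(range(VOCAB_SIZE), int(GAMMA * VOCAB_SIZE)))


def sample_token(logits: List[float], prev_token: int,
                 rng: random.Random, watermark: bool) -> int:
    """Sample from softmax(logits), optionally boosting green-token logits."""
    if watermark:
        green = green_set(prev_token)
        logits = [x + DELTA if i in green else x for i, x in enumerate(logits)]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    return rng.choices(range(VOCAB_SIZE), weights=weights, k=1)[0]


def generate(length: int, seed: int, watermark: bool) -> List[int]:
    """Stand-in for a language model: flat logits, so only the watermark adds signal."""
    rng = random.Random(seed)
    tokens = [0]  # arbitrary start token
    for _ in range(length):
        flat_logits = [0.0] * VOCAB_SIZE
        tokens.append(sample_token(flat_logits, tokens[-1], rng, watermark))
    return tokens


def z_score(tokens: List[int]) -> float:
    """One-proportion z-test: is the green-token count higher than chance?"""
    scored = len(tokens) - 1  # the start token has no predecessor
    hits = sum(tok in green_set(prev) for prev, tok in zip(tokens, tokens[1:]))
    return (hits - GAMMA * scored) / math.sqrt(scored * GAMMA * (1 - GAMMA))


def p_value(z: float) -> float:
    """One-sided p-value under the null hypothesis of no watermark."""
    return 0.5 * math.erfc(z / math.sqrt(2))
```

Note that `z_score` needs only the token sequence and the key, never the model's logits, which mirrors the abstract's claim that detection works without access to the language model API or parameters. Calling `z_score(generate(200, seed=1, watermark=True))` should give a large positive value, while the same call with `watermark=False` should stay near zero.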

large language model, machine learning, natural language, (19 more...)

2301.10226

Country:

Africa > Middle East > Egypt (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > UAE (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (1.00)
Leisure & Entertainment > Sports > Skiing (0.67)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.96)
