Large Language Model
Learning Non-linguistic Skills without Sacrificing Linguistic Proficiency
Sharma, Mandar, Muralidhar, Nikhil, Ramakrishnan, Naren
The field of Math-NLP has witnessed significant growth in recent years, motivated by the desire to expand LLM performance to the learning of non-linguistic notions (numerals, and subsequently, arithmetic reasoning). However, non-linguistic skill injection typically comes at a cost for LLMs: it leads to catastrophic forgetting of core linguistic skills, a consequence that often remains unaddressed in the literature. Although Math-NLP has been able to create LLMs that can closely approximate the mathematical skills of a grade-schooler or the arithmetic reasoning skills of a calculator, the practicality of these models fails if they concomitantly shed their linguistic capabilities. In this work, we take a closer look into the phenomenon of catastrophic forgetting as it pertains to LLMs and subsequently offer a novel framework for non-linguistic skill injection for LLMs based on information theoretic interventions and skill-specific losses that enable the learning of strict arithmetic reasoning. Our model outperforms the state-of-the-art both on injected non-linguistic skills and on linguistic knowledge retention, and does so with a fraction of the non-linguistic training data (1/4) and zero additional synthetic linguistic training data.
$SmartProbe$: A Virtual Moderator for Market Research Surveys
Seltzer, Josh, Pan, Jiahua, Cheng, Kathy, Sun, Yuxiao, Kolagati, Santosh, Lin, Jimmy, Zong, Shi
Market research surveys are a powerful methodology for understanding consumer perspectives at scale, but are limited by depth of understanding and insights. A virtual moderator can introduce elements of qualitative research into surveys, developing a rapport with survey participants and dynamically asking probing questions, ultimately to elicit more useful information for market researchers. In this work, we introduce ${\tt SmartProbe}$, an API which leverages the adaptive capabilities of large language models (LLMs), and incorporates domain knowledge from market research, in order to generate effective probing questions in any market research survey. We outline the modular processing flow of $\tt SmartProbe$, and evaluate the quality and effectiveness of its generated probing questions. We believe our efforts will inspire industry practitioners to build real-world applications based on the latest advances in LLMs. Our demo is publicly available at https://nexxt.in/smartprobe-demo
STORYWARS: A Dataset and Instruction Tuning Baselines for Collaborative Story Understanding and Generation
Collaborative stories, which are texts created through the collaborative efforts of multiple authors with different writing styles and intentions, pose unique challenges for NLP models. Understanding and generating such stories remains an underexplored area due to the lack of open-domain corpora. To address this, we introduce STORYWARS, a new dataset of over 40,000 collaborative stories written by 9,400 different authors from an online platform. We design 12 task types, comprising 7 understanding and 5 generation task types, on STORYWARS, deriving 101 diverse story-related tasks in total as a multi-task benchmark covering all fully-supervised, few-shot, and zero-shot scenarios. Furthermore, we present our instruction-tuned model, INSTRUCTSTORY, for the story tasks showing that instruction tuning, in addition to achieving superior results in zero-shot and few-shot scenarios, can also obtain the best performance on the fully-supervised tasks in STORYWARS, establishing strong multi-task benchmark performances on STORYWARS.
Make Prompt-based Black-Box Tuning Colorful: Boosting Model Generalization from Three Orthogonal Perspectives
Sun, Qiushi, Han, Chengcheng, Chen, Nuo, Zhu, Renyu, Gong, Jingyang, Li, Xiang, Gao, Ming
Large language models (LLMs) have shown increasing power on various natural language processing (NLP) tasks. However, tuning these models for downstream tasks usually incurs exorbitant costs or is unavailable due to commercial considerations. Recently, black-box tuning has been proposed to address this problem by optimizing task-specific prompts without accessing the gradients and hidden representations. However, most existing works have yet to fully exploit the potential of gradient-free optimization under the scenario of few-shot learning. In this paper, we describe BBT-RGB, a suite of straightforward and complementary techniques for enhancing the efficiency and performance of black-box optimization. Specifically, our method includes three plug-and-play components: (1) a two-stage derivative-free optimization strategy that facilitates fast convergence and mitigates overfitting; (2) automatic verbalizer construction with its novel usage under few-shot settings; (3) a better prompt initialization policy based on instruction search and auto-selected demonstration. Extensive experiments across various tasks on natural language understanding and inference demonstrate the effectiveness of our method. Our code is publicly available at https://github.com/QiushiSun/BBT-RGB.
Watermarking Text Generated by Black-Box Language Models
Yang, Xi, Chen, Kejiang, Zhang, Weiming, Liu, Chang, Qi, Yuang, Zhang, Jie, Fang, Han, Yu, Nenghai
LLMs now exhibit human-like skills in various fields, leading to worries about misuse. Thus, detecting generated text is crucial. However, passive detection methods are limited by domain specificity and weak adversarial robustness. To achieve reliable detection, a watermark-based method was proposed for white-box LLMs, allowing them to embed watermarks during text generation. The method involves randomly dividing the model vocabulary to obtain a special list and adjusting the probability distribution to promote the selection of words in the list. A detection algorithm aware of the list can identify the watermarked text. However, this method is not applicable in many real-world scenarios where only black-box language models are available. For instance, third parties that develop API-based vertical applications cannot watermark text themselves because API providers only supply generated text and withhold probability distributions to shield their commercial interests. To allow third parties to autonomously inject watermarks into generated text, we develop a watermarking framework for black-box language model usage scenarios. Specifically, we first define a binary encoding function to compute a random binary encoding corresponding to a word. The encodings computed for non-watermarked text conform to a Bernoulli distribution, wherein the probability of a word representing bit-1 is approximately 0.5. To inject a watermark, we alter the distribution by selectively replacing words representing bit-0 with context-based synonyms that represent bit-1. A statistical test is then used to identify the watermark. Experiments demonstrate the effectiveness of our method on both Chinese and English datasets. Furthermore, results under re-translation, polishing, word deletion, and synonym substitution attacks reveal that it is arduous to remove the watermark without compromising the original semantics.
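The encode, replace, and test steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyed SHA-256 hash used as the binary encoding function, the static synonym dictionary (standing in for context-based synonym generation), the function names, and the z-score threshold of 4.0 are all assumptions made for illustration.

```python
import hashlib
import math


def word_bit(word, key="demo-key"):
    """Hypothetical binary encoding: map a word to bit 0 or 1 via a keyed hash."""
    digest = hashlib.sha256((key + ":" + word.lower()).encode("utf-8")).digest()
    return digest[0] & 1


def inject_watermark(words, synonyms, key="demo-key"):
    """Replace bit-0 words with a bit-1 synonym when one is available.

    `synonyms` is an assumed static dict {word: [candidates]}; the abstract
    describes context-based synonyms, which this sketch does not model.
    """
    out = []
    for w in words:
        if word_bit(w, key) == 0:
            candidates = synonyms.get(w.lower(), [])
            repl = next((s for s in candidates if word_bit(s, key) == 1), None)
            out.append(repl if repl is not None else w)
        else:
            out.append(w)
    return out


def z_score(words, key="demo-key"):
    """One-proportion z statistic against the null hypothesis P(bit-1) = 0.5."""
    n = len(words)
    if n == 0:
        return 0.0
    ones = sum(word_bit(w, key) for w in words)
    return (ones - 0.5 * n) / math.sqrt(0.25 * n)


def is_watermarked(words, key="demo-key", threshold=4.0):
    """Flag text whose share of bit-1 words is improbably high (threshold assumed)."""
    return z_score(words, key) > threshold
```

Under the null hypothesis, the bit-1 count over n words is roughly Binomial(n, 0.5), so an unmarked text should score near zero while a text in which most bit-0 words were replaced scores close to the square root of n. A detector needs only the key and the text, not access to the language model.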
AI creator on the risks, opportunities and how it may make humans 'boring'
The entrepreneur is convinced that the scale of what's coming is enormous. He reckons that in 10 years' time, his company and fellow AI leaders, ChatGPT and DeepMind, will be even bigger than Google and Facebook. Predictions about technology are as tricky as predictions about politics - educated guesses that could turn out to be totally wrong. But what is clear is that a public conversation about the risks and realities of AI is now underway. We might be on the cusp of sweeping changes too big for any one company, country or politician to manage.
Toyota Leaked Vehicle Data of 2 Million Customers
SafeGraph, the data broker famous for selling location data linked to abortion clinic visits, is now a US military contractor. Documents obtained by WIRED reveal that the company landed an initial contract with the US Air Force and is hoping the Pentagon will buy a tool that SafeGraph says will pinpoint locations not to bomb, like schools and hospitals. Your data is, of course, everywhere--likely including in the training data of generative AI tools like ChatGPT. Fortunately, at least some users can request that OpenAI, which created the tool, delete their data. It's also possible to delete your chat history with ChatGPT.
The ultimate Premier League football team... according to ChatGPT
AI bot ChatGPT has named its ultimate Premier League line-up – but many fans may be surprised by some controversial omissions. MailOnline asked the tool, 'Can you give me your ultimate Premier League football team?' and it gave 11 Premier League winners in a 4-3-3 formation. But some big names are missing from the lineup, including Frank Lampard, Wayne Rooney and Gareth Bale. It even omitted Ryan Giggs – who has more Premier League winners' medals than any other player. Also missing are modern greats including Harry Kane, Mohamed Salah, Kevin De Bruyne and Erling Haaland, who now holds the record for most goals in a single Premier League season.
Ministers not doing enough to control AI, says UK professor
One of the professors at the forefront of artificial intelligence has said ministers are not doing enough to protect against the dangers of super-intelligent machines in the future. In the latest contribution to the debate about the safety of the ever-quickening development of AI, Prof Stuart Russell told the Times that the government was reluctant to regulate the industry despite the concerns that the technology could get out of control and threaten the future of humanity. Russell, a lecturer at the University of California in Berkeley and former adviser to the US and UK governments, told the Times he was concerned that ChatGPT, which was released in November, could become part of a super-intelligent machine that could not be constrained. "How do you maintain power over entities more powerful than you – for ever?" he asked. "If you don't have an answer, then stop doing the research. The stakes couldn't be higher: if we don't control our own civilisation, we have no say in whether we continue to exist." After the release of ChatGPT to the public last year, which has been used to write prose and has already worried lecturers and teachers over its use in universities and schools, the debate has intensified over its safety in the long term. Elon Musk, the Tesla founder and Twitter owner, and the Apple co-founder Steve Wozniak, along with 1,000 AI experts, wrote a letter to warn that there was an "out-of-control race" going on at AI labs and called for a pause on the creation of giant-scale AI. The letter warned the labs were developing "ever more powerful digital minds that no one, not even their creators, can understand, predict or reliably control". There is also concern about its wider application. A House of Lords committee this week heard evidence from Sir Lawrence Freedman, a war studies professor, who spoke about the concerns on how AI might be used in future wars. Google's rival, Bard, is due to be released in the EU later this year.
Russell himself previously worked for the UN on how to monitor the nuclear test-ban treaty, and was asked to work with Whitehall earlier this year. He said: "The Foreign Office … talked to a lot of people and they concluded that loss of control was a plausible and extremely high-significance outcome." "And then the government came out with a regulatory approach that says: 'Nothing to see here … we'll welcome the AI industry as if we were talking about making cars or something like that'."
Developer creates pro-First Amendment AI to counter ChatGPT's 'political motivations'
ChatGPT has political biases when answering questions, opening the door for competition whose models provide objectivity in their answers, an AI developer said. LOS ANGELES – An AI researcher developed a free speech alternative to ChatGPT and argued that the mainstream model has a liberal bias that prevents it from answering certain questions. "ChatGPT has political motivations, and it's seen through the product," said Arvin Bhangu, who founded the AI model Superintelligence. "We've seen where you can ask it give me 10 things Joe Biden has done well and give me 10 things Donald Trump has done well and it refuses to give quality answers for Donald Trump." "Superintelligence is much more in line with the freedom to ask any type of question, so it's much more in line with the First Amendment than ChatGPT," Bhangu said. ChatGPT, an AI chatbot that can write essays, code and more, has been criticized for having politically biased responses.