Large Language Model
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
Laurençon, Hugo, Saulnier, Lucile, Wang, Thomas, Akiki, Christopher, del Moral, Albert Villanova, Scao, Teven Le, Von Werra, Leandro, Mou, Chenghao, Ponferrada, Eduardo González, Nguyen, Huu, Frohberg, Jörg, Šaško, Mario, Lhoest, Quentin, McMillan-Major, Angelina, Dupont, Gerard, Biderman, Stella, Rogers, Anna, Allal, Loubna Ben, De Toni, Francesco, Pistilli, Giada, Nguyen, Olivier, Nikpoor, Somaieh, Masoud, Maraim, Colombo, Pierre, de la Rosa, Javier, Villegas, Paulo, Thrush, Tristan, Longpre, Shayne, Nagel, Sebastian, Weber, Leon, Muñoz, Manuel, Zhu, Jian, Van Strien, Daniel, Alyafeai, Zaid, Almubarak, Khalid, Vu, Minh Chien, Gonzalez-Dios, Itziar, Soroa, Aitor, Lo, Kyle, Dey, Manan, Suarez, Pedro Ortiz, Gokaslan, Aaron, Bose, Shamik, Adelani, David, Phan, Long, Tran, Hieu, Yu, Ian, Pai, Suhas, Chim, Jenny, Lepercq, Violette, Ilic, Suzana, Mitchell, Margaret, Luccioni, Sasha Alexandra, Jernite, Yacine
As language models grow ever larger, the need for large-scale high-quality text datasets has never been more pressing, especially in multilingual settings. The BigScience workshop, a 1-year international and multidisciplinary initiative, was formed with the goal of researching and training large language models as a values-driven undertaking, putting issues of ethics, harm, and governance in the foreground. This paper documents the data creation and curation efforts undertaken by BigScience to assemble the Responsible Open-science Open-collaboration Text Sources (ROOTS) corpus, a 1.6TB dataset spanning 59 languages that was used to train the 176-billion-parameter BigScience Large Open-science Open-access Multilingual (BLOOM) language model (BigScience Workshop, 2022). We further release a large initial subset of the corpus and analyses thereof, and hope to empower large-scale monolingual and multilingual modeling projects with both the data and the processing tools, as well as stimulate research around this large multilingual corpus.
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Wang, Xuezhi, Wei, Jason, Schuurmans, Dale, Le, Quoc, Chi, Ed, Narang, Sharan, Chowdhery, Aakanksha, Zhou, Denny
Chain-of-thought prompting combined with pre-trained large language models has achieved encouraging results on complex reasoning tasks. In this paper, we propose a new decoding strategy, self-consistency, to replace the naive greedy decoding used in chain-of-thought prompting. It first samples a diverse set of reasoning paths instead of only taking the greedy one, and then selects the most consistent answer by marginalizing out the sampled reasoning paths. Self-consistency leverages the intuition that a complex reasoning problem typically admits multiple different ways of thinking leading to its unique correct answer. Our extensive empirical evaluation shows that self-consistency boosts the performance of chain-of-thought prompting with a striking margin on a range of popular arithmetic and commonsense reasoning benchmarks, including GSM8K (+17.9%), ... Although language models have demonstrated remarkable success across a range of NLP tasks, their ability to demonstrate reasoning is ...
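The sample-then-aggregate procedure described in this abstract can be illustrated with a minimal sketch. It assumes that each sampled reasoning path ends with a sentence of the form "The answer is N." and that aggregation is an unweighted majority vote over the extracted answers. The names `extract_answer`, `self_consistency`, and `make_fake_sampler`, as well as the canned reasoning paths, are hypothetical and stand in for a real stochastic language model call; none of them come from the paper.

```python
import itertools
import re
from collections import Counter
from typing import Callable, Optional


def extract_answer(reasoning_path: str) -> Optional[str]:
    """Return the number after 'The answer is', or None if the path has none."""
    match = re.search(r"The answer is\s+(-?\d+(?:\.\d+)?)", reasoning_path)
    return match.group(1) if match else None


def self_consistency(
    sample_path: Callable[[str], str],
    prompt: str,
    num_samples: int = 5,
) -> Optional[str]:
    """Sample several reasoning paths and return the most frequent final answer.

    sample_path is assumed to return one stochastically decoded
    chain-of-thought completion per call (hypothetical interface).
    """
    answers = []
    for _ in range(num_samples):
        answer = extract_answer(sample_path(prompt))
        if answer is not None:  # skip paths with no parsable answer
            answers.append(answer)
    if not answers:
        return None
    # Unweighted majority vote over final answers (assumption of this sketch).
    return Counter(answers).most_common(1)[0][0]


def make_fake_sampler() -> Callable[[str], str]:
    """Hypothetical stand-in for a model: cycles through made-up paths."""
    canned_paths = itertools.cycle([
        "16 eggs minus 3 minus 4 leaves 9. 9 times 2 is 18. The answer is 18.",
        "She sells 16 - 7 = 9 eggs at $2 each. The answer is 18.",
        "16 - 3 = 13 eggs, 13 times 2 is 26. The answer is 26.",
        "Nine eggs remain, each sells for $2. The answer is 18.",
        "She has 9 eggs left to sell. The answer is 9.",
    ])
    return lambda prompt: next(canned_paths)


if __name__ == "__main__":
    question = "Q: (made-up arithmetic word problem) A:"
    print(self_consistency(make_fake_sampler(), question, num_samples=5))
```

With the five canned paths, three agree on one answer and two disagree with it and with each other, so the vote returns the answer shared by the three agreeing paths. A real setup would replace the fake sampler with temperature-based sampling from a model.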
Deconstructed Generation-Based Zero-Shot Model
Chen, Dubing, Shen, Yuming, Zhang, Haofeng, Torr, Philip H. S.
Recent research on Generalized Zero-Shot Learning (GZSL) has focused primarily on generation-based methods. However, current literature has overlooked the fundamental principles of these methods and has made limited progress in a complex manner. In this paper, we aim to deconstruct the generator-classifier framework and provide guidance for its improvement and extension. We begin by breaking down the generator-learned unseen class distribution into class-level and instance-level distributions. Through our analysis of the role of these two types of distributions in solving the GZSL problem, we generalize the focus of the generation-based approach, emphasizing the importance of (i) attribute generalization in generator learning and (ii) independent classifier learning with partially biased data. We present a simple method based on this analysis that outperforms SotAs on four public GZSL datasets, demonstrating the validity of our deconstruction. Furthermore, our proposed method remains effective even without a generative model, representing a step towards simplifying the generator-classifier structure. Our code is available at \url{https://github.com/cdb342/DGZ}.
Can large language models build causal graphs?
Long, Stephanie, Schuster, Tibor, Piché, Alexandre (Department of Family Medicine, McGill University; Mila; Université de Montreal; ServiceNow Research)
Building causal graphs can be a laborious process. To ensure all relevant causal pathways have been captured, researchers often have to consult clinicians and experts while also reviewing extensive medical literature. By encoding common and medical knowledge, large language models (LLMs) represent an opportunity to ease this process by automatically scoring edges (i.e., connections between two variables) in potential graphs. LLMs, however, have been shown to be brittle to the choice of probing words, context, and prompts that the user employs. In this work, we evaluate whether LLMs can be a useful tool in complementing causal graph development.
Is ChatGPT Overhyped, or a New Era of AI?
What areas of the broader market are likely to be affected? What products are likely to emerge? Is it just the newest hype cycle similar to VR and crypto? It is a good time to present some of the events since the ChatGPT story broke into the mainstream, as well as to test the technology, and to provide a market angle. "I am ChatGPT, a large language model trained by OpenAI. I am designed to answer a wide range of questions and engage in conversations on a variety of topics. I use natural language processing and machine learning algorithms to understand the context and meaning behind the words you input and provide relevant and accurate responses. While I am not a sentient being and do not have emotions or beliefs, I strive to provide helpful and informative responses to the best of my abilities."
Scammers exploit interest in ChatGPT with sophisticated investment scams - SiliconANGLE
The rise of predictive artificial intelligence and chatbots such as OpenAI Inc.'s ChatGPT has been well-documented, but not so well-documented is a concurrent rise in scams trying to take advantage of the hype in the sector. A new report from researchers at S.C. Bitdefender SRL today shines some light on the rise of highly sophisticated investment scams and how they're trying to use the excitement around ChatGPT to suck in potential victims. The "AI-powered" fraudulent campaigns typically begin with unsolicited emails that have subject lines such as "ChatGPT: New AI bot has everyone going crazy about it" and "New ChatGPTchatbot is make [sic] everyone crazy now – but it'll very soon be as mundane a tool as Google." The emails typically include fake OpenAI and ChatGPT graphics to make them appear to be legitimate emails. Upon accessing the link in the email, users are directed to a copycat version of ChatGPT, luring them with financial opportunities that pay up to $10,000 per month "on the unique ChatGPT platform."
An AI primer: machine learning, federated learning and more
OpenAI's ChatGPT system has sent interest in artificial intelligence through the roof. But many professionals across industries, including healthcare, do not truly understand how AI works – especially how the different forms of AI work. Further, there are a variety of acronyms floating around in the tech space: AI (artificial intelligence), ML (machine learning) and now FL (federated learning). But what's the difference between them, and how does each relate to healthcare? To get a primer on this important subject, Healthcare IT News talked with Ittai Dayan, CEO and cofounder of Rhino Health. Rhino Health is a vendor of a platform designed to enable developers and researchers to analyze data, create AI models and deploy them.
Apple blocks ChatGPT-powered email app from the App Store over content rules
Apple has blocked an email app update from entering the App Store because it includes support for ChatGPT, according to a report. While the inclusion of AI isn't strictly the issue at hand, Apple says that the email app's use of it could lead it to generate content that isn't suitable for kids. As a result, the app will need to be given a 17+ age rating before it can be released into the App Store. The email app, called BlueMail, currently has a 4+ age rating. The WSJ reports that Apple's concerns are that the use of ChatGPT is unpredictable and that the developer of the BlueMail app would either need to remove it, change the app's age rating, or build in filters that can deal with whatever the AI spits out. "BlueMail's new AI feature uses OpenAI's latest ChatGPT chatbot to help automate the writing of emails using the contents of prior emails and calendar events," the WSJ report notes.
People Fear Being Replaced By AI And ChatGPT: 3 Ways To Lead Well Amidst Anxiety
Will AI replace your job? Work is in a state of upheaval--and beyond shifts in where and when people work, changes are occurring in the content of work itself--literally in responsibilities, tasks and assignments. This is fueled by AI and, most recently, ChatGPT. People are uncertain about whether they'll be replaced by technology--and at the same time, they're looking for greater meaning from work and more flexibility in how they go about it. But it is possible to reimagine the work experience in the new digital landscape, emphasizing what humans do best and ensuring the work experience is engaging, challenging and secure.
What is Google AI Bard? - KDnuggets
As everybody was going mad about ChatGPT, out of nowhere - Google released their very own experimental AI-powered chatbot - Google Bard. You could see that the competition was heavy, and Google needed a response. But was it a response to ChatGPT, or was Google Bard in the making? So now we know that Google Bard is Google's response to OpenAI's ChatGPT. Let's learn more about it.