Large Language Model
Co-training Improves Prompt-based Learning for Large Language Models
Lang, Hunter, Agrawal, Monica, Kim, Yoon, Sontag, David
We demonstrate that co-training (Blum & Mitchell, 1998) can improve the performance of prompt-based learning by using unlabeled data. While prompting has emerged as a promising paradigm for few-shot and zero-shot learning, it is often brittle and requires much larger models compared to the standard supervised setup. We find that co-training makes it possible to improve the original prompt model and at the same time learn a smaller, downstream task-specific model. In the case where we only have partial access to a prompt model (e.g., output probabilities from GPT-3 (Brown et al., 2020)) we learn a calibration model over the prompt outputs. When we have full access to the prompt model's gradients but full finetuning remains prohibitively expensive (e.g., T0 (Sanh et al., 2021)), we learn a set of soft prompt continuous vectors to iteratively update the prompt model. We find that models trained in this manner can significantly improve performance on challenging datasets where there is currently a large gap between prompt-based learning and fully-supervised models.
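As a rough illustration of the co-training idea referenced above (in the style of Blum & Mitchell, 1998), the following is a minimal sketch on toy data, not the authors' method. Two simple nearest-centroid learners, one per "view" of the data, take turns pseudo-labeling the unlabeled examples they are most confident about and add them to a shared pool. The data generator, the function names (`make_toy_data`, `fit_centroids`, `predict`, `co_train`), and all parameters are hypothetical; in the abstract's setting the two learners would be a prompt-based model and a smaller task-specific model rather than these stand-ins.

```python
import random


def make_toy_data(n=200, seed=0):
    """Toy two-view dataset (an assumption): one noisy feature per view."""
    rng = random.Random(seed)
    truth = [i % 2 for i in range(n)]
    view_a = [rng.gauss(6.0 * y, 1.0) for y in truth]
    view_b = [rng.gauss(6.0 * y, 1.0) for y in truth]
    return view_a, view_b, truth


def fit_centroids(xs, ys):
    """Stand-in 'model' for one view: the mean feature value of each class."""
    cents = {}
    for label in (0, 1):
        vals = [x for x, y in zip(xs, ys) if y == label]
        cents[label] = sum(vals) / len(vals)
    return cents


def predict(cents, x):
    """Return (label, confidence); confidence is the gap between distances."""
    d0, d1 = abs(x - cents[0]), abs(x - cents[1])
    return (0 if d0 <= d1 else 1), abs(d0 - d1)


def co_train(view_a, view_b, seed_labels, rounds=5, per_round=10):
    """Grow a shared pool of (pseudo-)labels, alternating between the views."""
    pool = dict(seed_labels)
    for _ in range(rounds):
        for view in (view_a, view_b):
            # Retrain this view's learner on everything labeled so far.
            idx = list(pool)
            cents = fit_centroids([view[i] for i in idx], [pool[i] for i in idx])
            # Pseudo-label the unlabeled examples this learner is surest about.
            unlabeled = [i for i in range(len(view)) if i not in pool]
            unlabeled.sort(key=lambda i: predict(cents, view[i])[1], reverse=True)
            for i in unlabeled[:per_round]:
                pool[i] = predict(cents, view[i])[0]
    return pool


view_a, view_b, truth = make_toy_data()
seed_labels = {i: truth[i] for i in range(4)}  # tiny labeled seed set
pool = co_train(view_a, view_b, seed_labels)
accuracy = sum(pool[i] == truth[i] for i in pool) / len(pool)
print(f"{len(pool)} labeled examples, pseudo-label accuracy {accuracy:.2f}")
```

The shared pool is what lets each learner benefit from the other's confident predictions; the toy classes are deliberately well separated so the sketch stays stable.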
Causal Inference Principles for Reasoning about Commonsense Causality
Zhang, Jiayao, Zhang, Hongming, Roth, Dan, Su, Weijie J.
Commonsense causality reasoning (CCR) aims at identifying plausible causes and effects in natural language descriptions that are deemed reasonable by an average person. Although it is of great academic and practical interest, this problem is still shadowed by the lack of a well-posed theoretical framework; existing work usually relies wholeheartedly on deep language models and is potentially susceptible to confounding co-occurrences. Motivated by classical causal principles, we articulate the central question of CCR and draw parallels between human subjects in observational studies and natural languages to adapt CCR to the potential-outcomes framework, which is the first such attempt for commonsense tasks. We propose a novel framework, ROCK, to Reason O(A)bout Commonsense K(C)ausality, which utilizes temporal signals as incidental supervision and balances confounding effects using temporal propensities that are analogous to propensity scores. The ROCK implementation is modular and zero-shot, and demonstrates good CCR capabilities on various datasets.
2021: DeepMind's Miracle Year
This essay is a lightly edited version of the 37th issue of the ML4Sci newsletter. Year-in-reviews always seemed like an appropriate mix of vogue and cliche, so I figured I'd take my stab at a slightly late two-part series. Google's DeepMind spent 2021 continuing to push out ground-breaking work applying AI to a variety of important scientific problems. This is perhaps one of the best groups to watch for Manhattan-style AI projects. Unrestrained by the incentives and limits of scientific academia (publish or perish, tenuous and short-term federal funding, high labor turnover i.e. grad students), DeepMind is quickly becoming the Bell Labs for AI.
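InstructGPT, an improved version of GPT-3, the AI that writes text so natural it is indistinguishable from a human's, is released to the public and can also write poetry
The text-generation AI GPT-3 is known for producing text natural enough that it conversed on an online message board for a week without being recognized as non-human, and it has drawn wide attention, including adoption in Microsoft's platform. At the same time, GPT-3's output is known to be biased; for example, an anti-Islam bias has been pointed out. InstructGPT, a text-generation AI that refines GPT-3's trained model to reduce bias while improving generation accuracy, became generally available on January 27, 2022.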
Andrea Rios Escudel on LinkedIn: AI in Metaverse, New GPT3 is less toxic, Fresh Examples of AI
First, financial regulators need to ensure that regulatory oversight delivers on the inclusion and intermediation-enhancing benefits of digital finance without compromising traditional regulatory goals such as financial stability, adequate competition, consumer protection and market integrity. Second, there is a pressing need for a system of data governance that allows consumers and business to exercise control over their data through the granting and withholding of consent to the use and transfer of their data. Developing a user-friendly granular consent-based data governance system with low transaction costs is a challenge that, when successfully addressed, will promote the development of virtual banking worldwide. Hong Kong SAR offers one example of an integrated regulatory framework for virtual banks. The licensing and regulatory regime, also applicable to incumbent banks, aims to manage the full spectrum of risks arising from any source, including the ownership structure, without compromising development objectives that often rest on technological innovation.
Artificial intelligence (AI): The most trending companies on Twitter in Q4 2021
Verdict has listed five companies that trended the most in Twitter discussions related to AI, using research from GlobalData's Technology Influencer platform. The top companies are the most mentioned companies among Twitter discussions of more than 150 AI experts tracked by GlobalData's Technology Influencer platform during the fourth quarter (Q4) of 2021. Google discovering that AI becomes more aggressive as it becomes more advanced, Google using machine learning to improve chip design, and a new AI developed by Google interpreting and reading sign language aloud were some of the popular discussions on Alphabet Inc in Q4 2021. Mario Pawlowski, CEO of trucking industry news and technology website iTrucker, shared a video on how AI becomes more aggressive as it becomes more advanced. Researchers at technology company Google's AI research company DeepMind conducted a study on AI by developing an AI video game called Gathering.
Can Wikipedia Help Offline Reinforcement Learning?
Reid, Machel, Yamada, Yutaro, Gu, Shixiang Shane
Fine-tuning reinforcement learning (RL) models has been challenging because of a lack of large scale off-the-shelf datasets as well as high variance in transferability among different environments. Recent work has looked at tackling offline RL from the perspective of sequence modeling, with improved results as a result of the introduction of the Transformer architecture. However, when the model is trained from scratch, it suffers from slow convergence speeds. In this paper, we look to take advantage of this formulation of reinforcement learning as sequence modeling and investigate the transferability of pre-trained sequence models on other domains (vision, language) when finetuned on offline RL tasks (control, games). To this end, we also propose techniques to improve transfer between these domains. Results show consistent performance gains in terms of both convergence speed and reward on a variety of environments, accelerating training by 3-6x and achieving state-of-the-art performance in a variety of tasks using Wikipedia-pretrained and GPT2 language models. We hope that this work not only brings light to the potentials of leveraging generic sequence modeling techniques and pre-trained models for RL, but also inspires future work on sharing knowledge between generative modeling tasks of completely different domains.
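To make the phrase "reinforcement learning as sequence modeling" more concrete, here is a minimal sketch of one common way to do it, assumed here rather than taken from the abstract: an offline trajectory is flattened into a sequence of (return-to-go, state, action) items that a sequence model could then be trained to continue. The function names and the tuple-based token format are hypothetical, and the states and actions are placeholder strings.

```python
def returns_to_go(rewards):
    """For each timestep, the sum of rewards from that step to the end."""
    rtg, total = [], 0.0
    for r in reversed(rewards):
        total += r
        rtg.append(total)
    return rtg[::-1]


def to_sequence(states, actions, rewards):
    """Interleave return-to-go, state and action into one flat sequence."""
    seq = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        seq.extend([("rtg", g), ("state", s), ("action", a)])
    return seq


# A three-step toy trajectory with placeholder states and actions.
sequence = to_sequence(
    states=["s0", "s1", "s2"],
    actions=["left", "right", "left"],
    rewards=[1.0, 0.0, 2.0],
)
print(sequence[:3])
```

Conditioning on the return-to-go is what lets a purely predictive model be steered toward high-reward behavior at inference time; how the items are embedded for a pretrained language model is left out of this sketch.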
GitHub - deepmind/xmanager
XManager is a platform for packaging, running and keeping track of machine learning experiments. It currently enables one to launch experiments locally or on Google Cloud Platform (GCP). Interaction with experiments is done via XManager's APIs through Python launch scripts. To get started, install XManager and, if needed, its prerequisites, then follow the tutorial or codelab.ipynb to create and run a launch script. Alternatively, a PyPI project is available.
The new version of GPT-3 is much better behaved (and should be less toxic)
Large language models like GPT-3 are trained using vast bodies of text, much of it taken from the internet, in which they encounter the best and worst of what people put down in words. That is a problem for today's chatbots and text-generation tools. The models soak up toxic language--from text that is racist and misogynistic or that contains more insidious, baked-in prejudices--as well as falsehoods. OpenAI has made InstructGPT the default model for users of its application programming interface (API)--a service that gives access to the company's language models for a fee. GPT-3 will still be available but OpenAI does not recommend using it.
OpenAI rolls out new text-generating models that it claims are less toxic
Large language models (LLMs) such as OpenAI's GPT-3, which can "write" sentences that read nearly like they were written by a human, can be prompted to perform a range of writing tasks given only a few examples of the tasks. For example, LLMs have been used to create marketing materials and video game levels in addition to recipes, poetry, and movie scripts. But because LLMs learn to write from examples taken from sometimes toxic communities, they can fall victim to parroting misinformation, sexism, ageism, racism, and conspiracies. Efforts have been made to combat toxicity in LLMs -- with mixed results.