Wei, Jason
Language Models are Multilingual Chain-of-Thought Reasoners
Shi, Freda, Suzgun, Mirac, Freitag, Markus, Wang, Xuezhi, Srivats, Suraj, Vosoughi, Soroush, Chung, Hyung Won, Tay, Yi, Ruder, Sebastian, Zhou, Denny, Das, Dipanjan, Wei, Jason
We evaluate the reasoning abilities of large language models in multilingual settings. We introduce the Multilingual Grade School Math (MGSM) benchmark, created by manually translating 250 grade-school math problems from the GSM8K dataset (Cobbe et al., 2021) into ten typologically diverse languages. We find that the ability to solve MGSM problems via chain-of-thought prompting emerges with increasing model scale, and that models have strikingly strong multilingual reasoning abilities, even in underrepresented languages such as Bengali and Swahili. Finally, we show that the multilingual reasoning abilities of language models extend to other tasks such as commonsense reasoning and word-in-context semantic judgment. The MGSM benchmark is publicly available at https://github.com/google-research/url-nlp.
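Scoring a benchmark like MGSM requires pulling a final numeric answer out of a free-form chain of thought. A minimal sketch of such an answer extractor, assuming the convention that the chain ends with a statement like "The answer is 11." (the actual MGSM evaluation harness may differ):

```python
import re
from typing import Optional

def extract_answer(completion: str) -> Optional[float]:
    """Take the last number in a generated chain of thought as the answer.

    Commas are stripped first so thousands separators (e.g. '1,000')
    parse as a single number.
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return float(matches[-1]) if matches else None

print(extract_answer("5 + 6 = 11. The answer is 11."))  # 11.0
print(extract_answer("Jibu ni 1,000."))                 # 1000.0
```

Taking the last number is a deliberately simple heuristic; it works because chain-of-thought completions typically restate the final answer at the end.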
Chain of Thought Prompting Elicits Reasoning in Large Language Models
Wei, Jason, Wang, Xuezhi, Schuurmans, Dale, Bosma, Maarten, Chi, Ed, Le, Quoc, Zhou, Denny
Although scaling up language model size has reliably improved performance on a range of NLP tasks, even the largest models currently struggle with certain reasoning tasks such as math word problems, symbolic manipulation, and commonsense reasoning. This paper explores the ability of language models to generate a coherent chain of thought -- a series of short sentences that mimic the reasoning process a person might have when responding to a question. Experiments show that inducing a chain of thought via prompting can enable sufficiently large language models to better perform reasoning tasks that otherwise have flat scaling curves.
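The prompting technique itself is simple: each few-shot exemplar pairs a question with a worked-out reasoning chain rather than a bare answer. A minimal sketch, using the tennis-ball exemplar from the paper and assuming some generic text-completion API consumes the resulting prompt:

```python
# One few-shot exemplar: a question followed by an explicit reasoning chain.
COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "6 tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend reasoning exemplars so the model imitates step-by-step reasoning
    before stating its answer."""
    return COT_EXEMPLAR + f"Q: {question}\nA:"

prompt = build_cot_prompt(
    "A cafeteria had 23 apples. They used 20 and bought 6 more. "
    "How many apples do they have?"
)
print(prompt.endswith("A:"))  # True; the model continues with its own chain
```

In practice several exemplars are concatenated, but the structure is the same: the trailing "A:" invites the model to produce a chain of thought before the final answer.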
A Survey of Data Augmentation Approaches for NLP
Feng, Steven Y., Gangal, Varun, Wei, Jason, Chandar, Sarath, Vosoughi, Soroush, Mitamura, Teruko, Hovy, Eduard
Data augmentation has recently seen increased interest in NLP due to more work in low-resource domains, new tasks, and the popularity of large-scale neural networks that require large amounts of training data. Despite this recent upsurge, this area is still relatively underexplored, perhaps due to the challenges posed by the discrete nature of language data. In this paper, we present a comprehensive and unifying survey of data augmentation for NLP by summarizing the literature in a structured manner. We first introduce and motivate data augmentation for NLP, and then discuss major methodologically representative approaches. Next, we highlight techniques that are used for popular NLP applications and tasks. We conclude by outlining current challenges and directions for future research. Overall, our paper aims to clarify the landscape of existing literature in data augmentation for NLP and motivate additional work in this area.
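As a concrete instance of the token-level methods such a survey covers, here is a minimal sketch of random-swap augmentation, a simple label-preserving perturbation (the survey covers many richer techniques, from back-translation to generative models):

```python
import random

def random_swap(sentence: str, n_swaps: int = 1, seed: int = 0) -> str:
    """Token-level augmentation: swap the positions of n random word pairs.

    The augmented sentence keeps the same vocabulary and, for many
    classification tasks, the same label, while perturbing word order.
    """
    rng = random.Random(seed)  # seeded for reproducible augmentation
    words = sentence.split()
    for _ in range(n_swaps):
        if len(words) < 2:
            break
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)

print(random_swap("the movie was surprisingly good", n_swaps=1))
```

The appeal of such methods is that they need no external resources; their limitation, as the discrete nature of language suggests, is that even small edits can sometimes change meaning, which is why augmentation strength is usually tuned per task.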
Mitigating Political Bias in Language Models Through Reinforced Calibration
Liu, Ruibo, Jia, Chenyan, Wei, Jason, Xu, Guangxuan, Wang, Lili, Vosoughi, Soroush
Current large-scale language models can be politically biased as a result of the data they are trained on, potentially causing serious problems when they are deployed in real-world settings. In this paper, we describe metrics for measuring political bias in GPT-2 generation and propose a reinforcement learning (RL) framework for mitigating political biases in generated text. By using rewards from word embeddings or a classifier, our RL framework guides debiased generation without having access to the training data or requiring the model to be retrained. In empirical experiments on three attributes sensitive to political bias (gender, location, and topic), our methods reduced bias according to both our metrics and human evaluation, while maintaining readability and semantic coherence.
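To illustrate the general idea of reward-guided debiasing without retraining, here is a toy reranking stand-in: candidates are scored by a bias estimator and the reward 1 - bias is maximized. This is not the paper's RL framework, and both the lexicon and the scorer below are hypothetical placeholders for a real classifier or embedding-based reward:

```python
def bias_score(text: str) -> float:
    """Stand-in bias estimator (hypothetical): fraction of flagged words.
    A real system would use a trained classifier or word embeddings."""
    flagged = {"always", "never"}  # toy lexicon, for illustration only
    words = text.lower().split()
    return sum(w in flagged for w in words) / max(len(words), 1)

def pick_least_biased(candidates):
    """Rerank candidate generations by reward = 1 - bias_score."""
    return max(candidates, key=lambda c: 1.0 - bias_score(c))

best = pick_least_biased([
    "politicians always lie",
    "some politicians have been found to mislead",
])
print(best)  # the lower-bias candidate
```

The key property shared with the RL approach is that the reward signal steers generation from the outside: neither the training data nor the model weights are touched.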