


The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning

Neural Information Processing Systems

However, text-davinci-002 is able to benefit more substantially. We further show that explanations generated by the LLMs may not entail the models' predictions nor be factually grounded in the input, even on simple tasks with extractive explanations. Nevertheless, these flawed explanations can still be useful as a way to verify LLMs' predictions post-hoc.
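The post-hoc verification idea above can be sketched with a toy check (not the paper's actual method): for an extractive explanation, accept a prediction only when the explanation text is actually grounded in the input passage. The function names and the sentence-level substring test are illustrative assumptions.

```python
def explanation_supported(explanation: str, passage: str) -> bool:
    """Toy grounding check: every sentence of an extractive explanation
    should appear (case-insensitively) somewhere in the input passage."""
    sentences = [s.strip() for s in explanation.split(".") if s.strip()]
    return all(s.lower() in passage.lower() for s in sentences)

def verify_prediction(prediction: str, explanation: str, passage: str):
    """Keep a prediction only when its explanation is grounded in the input;
    otherwise flag it (here: return None) for further review."""
    return prediction if explanation_supported(explanation, passage) else None
```

A real verifier would use an entailment model rather than substring matching, but the control flow, filtering predictions by the quality of their explanations, is the same.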





KnowGPT: Knowledge Graph based Prompting for Large Language Models

Neural Information Processing Systems

Large Language Models (LLMs) have demonstrated remarkable capabilities in many real-world applications. Nonetheless, LLMs are often criticized for their tendency to produce hallucinations, wherein the models fabricate incorrect statements on tasks beyond their knowledge and perception.
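The general shape of knowledge-graph-based prompting, serializing KG facts into the model's textual context so its answer can be grounded, can be sketched as follows. This is a minimal illustration, not KnowGPT's actual prompt-construction policy; the function name and fact format are assumptions.

```python
def kg_prompt(question: str, triples: list) -> str:
    """Serialize (head, relation, tail) knowledge-graph triples into a
    textual context block prepended to the question, so the model can
    ground its answer in retrieved facts instead of hallucinating."""
    facts = "\n".join(f"- {h} {r} {t}." for h, r, t in triples)
    return f"Known facts:\n{facts}\n\nQuestion: {question}\nAnswer:"
```

In practice a system like KnowGPT also decides *which* subgraph to retrieve and *how* to phrase it, which is where most of the difficulty lies.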


Language Models are Few-Shot Learners

Neural Information Processing Systems

Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous nonsparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model.
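Specifying a task "purely via text interaction" amounts to concatenating demonstrations and a query into one prompt string. A minimal sketch (the `Input:`/`Output:` template is an illustrative assumption, not GPT-3's required format):

```python
def few_shot_prompt(demos, query):
    """Build a few-shot prompt: task demonstrations followed by the query.
    The task is specified entirely as text -- no gradient updates occur;
    the model is expected to continue the pattern after the final 'Output:'."""
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)
```

The resulting string is sent as-is to the frozen model, whose completion after the last `Output:` is taken as the prediction.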


Think Big, Teach Small: Do Language Models Distil Occam's Razor?

Neural Information Processing Systems

Large language models have recently shown a remarkable ability for few-shot learning, including patterns of algorithmic nature. However, it is still an open question to determine what kind of patterns these models can capture and how many examples they need in their prompts.


Query-Based Adversarial Prompt Generation

Neural Information Processing Systems

Recent work has shown it is possible to construct adversarial examples that cause aligned language models to emit harmful strings or perform harmful behavior. Existing attacks work either in the white-box setting (with full access to the model weights), or through transfer: the phenomenon that adversarial examples crafted on one model often remain effective on other models. We improve on prior work with a query-based attack that leverages API access to a remote language model to construct adversarial examples that cause the model to emit harmful strings with (much) higher probability than with transfer-only attacks. We validate our attack on GPT-3.5 and OpenAI's safety classifier; we can cause GPT-3.5 to emit harmful strings that current transfer attacks fail at, and we can evade the OpenAI and Llama Guard safety classifiers with nearly 100% probability.


Text Alignment Is An Efficient Unified Model for Massive NLP Tasks

Neural Information Processing Systems

Large language models (LLMs), typically designed as a function of next-word prediction, have excelled across extensive NLP tasks. Despite the generality, next-word prediction is often not an efficient formulation for many of the tasks, demanding an extreme scale of model parameters (10s or 100s of billions) and sometimes yielding suboptimal performance. In practice, it is often desirable to build more efficient models---despite being less versatile, they still apply to a substantial subset of problems, delivering on par or even superior performance with much smaller model sizes. In this paper, we propose text alignment as an efficient unified model for a wide range of crucial tasks involving text entailment, similarity, question answering (and answerability), factual consistency, and so forth. Given a pair of texts, the model measures the degree of alignment between their information.
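The interface such an alignment model exposes, a pair of texts in, a degree of alignment out, can be illustrated with a deliberately crude stand-in. Token-set Jaccard overlap below is *not* the paper's learned model; it only shows the function signature and the [0, 1] score that downstream tasks (entailment, factual consistency, answerability) would consume.

```python
def alignment_score(text_a: str, text_b: str) -> float:
    """Toy stand-in for a learned text-alignment model: Jaccard overlap
    of lowercased token sets, returning a score in [0, 1] where higher
    means the two texts share more information."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0
```

A real alignment model would score paraphrases highly despite zero lexical overlap; the point here is only the unified pairwise-scoring interface shared across the listed tasks.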