AITopics | Iyer, Srini

Collaborating Authors

Iyer, Srini

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

LEVER: Learning to Verify Language-to-Code Generation with Execution

Ni, Ansong, Iyer, Srini, Radev, Dragomir, Stoyanov, Ves, Yih, Wen-tau, Wang, Sida I., Lin, Xi Victoria

arXiv.org Artificial IntelligenceSep-1-2023

The advent of large language models trained on code (code LLMs) has led to significant progress in language-to-code generation. State-of-the-art approaches in this area combine LLM decoding with sample pruning and reranking using test cases or heuristics based on the execution results. However, it is challenging to obtain test cases for many real-world language-to-code applications, and heuristics cannot well capture the semantic features of the execution results, such as data type and value range, which often indicates the correctness of the program. In this work, we propose LEVER, a simple approach to improve language-to-code generation by learning to verify the generated programs with their execution results. Specifically, we train verifiers to determine whether a program sampled from the LLMs is correct or not based on the natural language input, the program itself and its execution results. The sampled programs are reranked by combining the verification score with the LLM generation probability, and marginalizing over programs with the same execution results. On four datasets across the domains of table QA, math QA and basic Python programming, LEVER consistently improves over the base code LLMs(4.6% to 10.9% with code-davinci-002) and achieves new state-of-the-art results on all of them.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2302.08468

Country:

Europe (1.00)
Asia (0.67)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report (1.00)
Overview > Innovation (0.34)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection

AlKhamissi, Badr, Ladhak, Faisal, Iyer, Srini, Stoyanov, Ves, Kozareva, Zornitsa, Li, Xian, Fung, Pascale, Mathias, Lambert, Celikyilmaz, Asli, Diab, Mona

arXiv.org Artificial IntelligenceMay-20-2023

Hate speech detection is complex; it relies on commonsense reasoning, knowledge of stereotypes, and an understanding of social nuance that differs from one culture to the next. It is also difficult to collect a large-scale hate speech annotated dataset. In this work, we frame this problem as a few-shot learning task, and show significant gains with decomposing the task into its "constituent" parts. In addition, we see that infusing knowledge from reasoning datasets (e.g. Atomic2020) improves the performance even further. Moreover, we observe that the trained models generalize to out-of-distribution datasets, showing the superiority of task decomposition and knowledge infusion compared to previously used methods. Concretely, our method outperforms the baseline by 17.83% absolute gain in the 16-shot case.

artificial intelligence, few-shot hate speech detection, task decomposition and knowledge infusion, (1 more...)

arXiv.org Artificial Intelligence

2205.12495

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.53)

Add feedback

LIMA: Less Is More for Alignment

Zhou, Chunting, Liu, Pengfei, Xu, Puxin, Iyer, Srini, Sun, Jiao, Mao, Yuning, Ma, Xuezhe, Efrat, Avia, Yu, Ping, Yu, Lili, Zhang, Susan, Ghosh, Gargi, Lewis, Mike, Zettlemoyer, Luke, Levy, Omer

arXiv.org Artificial IntelligenceMay-18-2023

Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences. We measure the relative importance of these two stages by training LIMA, a 65B parameter LLaMa language model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling. LIMA demonstrates remarkably strong performance, learning to follow specific response formats from only a handful of examples in the training data, including complex queries that range from planning trip itineraries to speculating about alternate history. Moreover, the model tends to generalize well to unseen tasks that did not appear in the training data. In a controlled human study, responses from LIMA are either equivalent or strictly preferred to GPT-4 in 43% of cases; this statistic is as high as 58% when compared to Bard and 65% versus DaVinci003, which was trained with human feedback. Taken together, these results strongly suggest that almost all knowledge in large language models is learned during pretraining, and only limited instruction tuning data is necessary to teach models to produce high quality output.

machine learning, natural language, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2305.11206

Country:

North America > United States (0.69)
Africa (0.46)

Genre: Research Report (1.00)

Industry:

Government (1.00)
Banking & Finance > Economy (1.00)
Energy (0.68)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.90)

Add feedback

Demystifying Prompts in Language Models via Perplexity Estimation

Gonen, Hila, Iyer, Srini, Blevins, Terra, Smith, Noah A., Zettlemoyer, Luke

arXiv.org Artificial IntelligenceDec-7-2022

Language models can be prompted to perform a wide variety of zero- and few-shot learning problems. However, performance varies significantly with the choice of prompt, and we do not yet understand why this happens or how to pick the best prompts. In this work, we analyze the factors that contribute to this variance and establish a new empirical hypothesis: the performance of a prompt is coupled with the extent to which the model is familiar with the language it contains. Over a wide range of tasks, we show that the lower the perplexity of the prompt is, the better the prompt is able to perform the task. As a result, we devise a method for creating prompts: (1) automatically extend a small seed set of manually written prompts by paraphrasing using GPT3 and backtranslation and (2) choose the lowest perplexity prompts to get significant gains in performance.

correlation, machine learning, translation, (19 more...)

arXiv.org Artificial Intelligence

2212.04037

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.68)
Media > Film (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)

Add feedback