AITopics | Ivgi, Maor

Plotting

Ivgi, Maor

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty

Ivgi, Maor, Yoran, Ori, Berant, Jonathan, Geva, Mor

arXiv.org Artificial IntelligenceJul-8-2024

Large language models (LLMs) often exhibit undesirable behaviors, such as hallucinations and sequence repetitions. We propose to view these behaviors as fallbacks that models exhibit under uncertainty, and investigate the connection between them. We categorize fallback behaviors -- sequence repetitions, degenerate text, and hallucinations -- and extensively analyze them in models from the same family that differ by the amount of pretraining tokens, parameter count, or the inclusion of instruction-following training. Our experiments reveal a clear and consistent ordering of fallback behaviors, across all these axes: the more advanced an LLM is (i.e., trained on more tokens, has more parameters, or instruction-tuned), its fallback behavior shifts from sequence repetitions, to degenerate text, and then to hallucinations. Moreover, the same ordering is observed throughout a single generation, even for the best-performing models; as uncertainty increases, models shift from generating hallucinations to producing degenerate text and then sequence repetitions. Lastly, we demonstrate that while common decoding techniques, such as random sampling, might alleviate some unwanted behaviors like sequence repetitions, they increase harder-to-detect hallucinations.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2407.06071

Country:

Europe > Spain (0.14)
Europe > Ireland (0.14)
North America > United States (0.14)
(2 more...)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.69)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

In-Context Learning with Long-Context Models: An In-Depth Exploration

Bertsch, Amanda, Ivgi, Maor, Alon, Uri, Berant, Jonathan, Gormley, Matthew R., Neubig, Graham

arXiv.org Artificial IntelligenceApr-30-2024

As model context lengths continue to increase, the number of demonstrations that can be provided in-context approaches the size of entire training datasets. We study the behavior of in-context learning (ICL) at this extreme scale on multiple datasets and models. We show that, for many datasets with large label spaces, performance continues to increase with hundreds or thousands of demonstrations. We contrast this with example retrieval and finetuning: example retrieval shows excellent performance at low context lengths but has diminished gains with more demonstrations; finetuning is more data hungry than ICL but can sometimes exceed long-context ICL performance with additional data. We use this ICL setting as a testbed to study several properties of both in-context learning and long-context models. We show that long-context ICL is less sensitive to random input shuffling than short-context ICL, that grouping of same-label examples can negatively impact performance, and that the performance boosts we see do not arise from cumulative gain from encoding many examples together. We conclude that although long-context ICL can be surprisingly effective, most of this gain comes from attending back to similar examples rather than task learning.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2405.002

Country:

North America > United States (0.68)
Europe (0.67)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding

Shaham, Uri, Ivgi, Maor, Efrat, Avia, Berant, Jonathan, Levy, Omer

arXiv.org Machine LearningDec-17-2023

We introduce ZeroSCROLLS, a zero-shot benchmark for natural language understanding over long texts, which contains only test and small validation sets, without training data. We adapt six tasks from the SCROLLS benchmark, and add four new datasets, including two novel information fusing tasks, such as aggregating the percentage of positive reviews. Using ZeroSCROLLS, we conduct a comprehensive evaluation of both open-source and closed large language models, finding that Claude outperforms ChatGPT, and that GPT-4 achieves the highest average score. However, there is still room for improvement on multiple open challenges in ZeroSCROLLS, such as aggregation tasks, where models struggle to pass the naive baseline. As the state of the art is a moving target, we invite researchers to evaluate their ideas on the live ZeroSCROLLS leaderboard.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2305.14196

Country:

North America > United States (0.93)
Asia > Middle East > UAE (0.14)

Genre: Research Report (0.64)

Industry:

Media (0.46)
Government > Regional Government > North America Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Add feedback

DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule

Ivgi, Maor, Hinder, Oliver, Carmon, Yair

arXiv.org Artificial IntelligenceJul-16-2023

We propose a tuning-free dynamic SGD step size formula, which we call Distance over Gradients (DoG). The DoG step sizes depend on simple empirical quantities (distance from the initial point and norms of gradients) and have no ``learning rate'' parameter. Theoretically, we show that a slight variation of the DoG formula enjoys strong parameter-free convergence guarantees for stochastic convex optimization assuming only \emph{locally bounded} stochastic gradients. Empirically, we consider a broad range of vision and language transfer learning tasks, and show that DoG's performance is close to that of SGD with tuned learning rate. We also propose a per-layer variant of DoG that generally outperforms tuned SGD, approaching the performance of tuned Adam. A PyTorch implementation is available at https://github.com/formll/dog

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Artificial Intelligence

2302.12022

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Add feedback

Efficient Long-Text Understanding with Short-Text Models

Ivgi, Maor, Shaham, Uri, Berant, Jonathan

arXiv.org Artificial IntelligenceDec-27-2022

Transformer-based pretrained language models (LMs) are ubiquitous across natural language understanding, but cannot be applied to long sequences such as stories, scientific articles and long documents, due to their quadratic complexity. While a myriad of efficient transformer variants have been proposed, they are typically based on custom implementations that require expensive pretraining from scratch. In this work, we propose SLED: SLiding-Encoder and Decoder, a simple approach for processing long sequences that re-uses and leverages battle-tested short-text pretrained LMs. Specifically, we partition the input into overlapping chunks, encode each with a short-text LM encoder and use the pretrained decoder to fuse information across chunks (fusion-in-decoder). We illustrate through controlled experiments that SLED offers a viable strategy for long text understanding and evaluate our approach on SCROLLS, a benchmark with seven datasets across a wide range of language understanding tasks. We find that SLED is competitive with specialized models that are up to 50x larger and require a dedicated and expensive pretraining step.

artificial intelligence, natural language, sled, (17 more...)

arXiv.org Artificial Intelligence

2208.00748

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre:

Research Report > Strength High (0.55)
Research Report > Experimental Study (0.55)
Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

SCROLLS: Standardized CompaRison Over Long Language Sequences

Shaham, Uri, Segal, Elad, Ivgi, Maor, Efrat, Avia, Yoran, Ori, Haviv, Adi, Gupta, Ankit, Xiong, Wenhan, Geva, Mor, Berant, Jonathan, Levy, Omer

arXiv.org Artificial IntelligenceJan-10-2022

NLP benchmarks have largely focused on short texts, such as sentences and paragraphs, even though long texts comprise a considerable amount of natural language in the wild. We introduce SCROLLS, a suite of tasks that require reasoning over long texts. We examine existing long-text datasets, and handpick ones where the text is naturally long, while prioritizing tasks that involve synthesizing information across the input. SCROLLS contains summarization, question answering, and natural language inference tasks, covering multiple domains, including literature, science, business, and entertainment. Initial baselines, including Longformer Encoder-Decoder, indicate that there is ample room for improvement on SCROLLS. We make all datasets available in a unified text-to-text format and host a live leaderboard to facilitate research on model architecture and pretraining methods.

artificial intelligence, dataset, natural language, (19 more...)

arXiv.org Artificial Intelligence

2201.03533

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.50)

Industry:

Law (0.68)
Government (0.68)
Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

Beyond Importance Scores: Interpreting Tabular ML by Visualizing Feature Semantics

Ghorbani, Amirata, Berenbaum, Dina, Ivgi, Maor, Dafna, Yuval, Zou, James

arXiv.org Machine LearningNov-30-2021

Interpretability is becoming an active research topic as machine learning (ML) models are more widely used to make critical decisions. Tabular data is one of the most commonly used modes of data in diverse applications such as healthcare and finance. Much of the existing interpretability methods used for tabular data only report feature-importance scores -- either locally (per example) or globally (per model) -- but they do not provide interpretation or visualization of how the features interact. We address this limitation by introducing Feature Vectors, a new global interpretability method designed for tabular datasets. In addition to providing feature-importance, Feature Vectors discovers the inherent semantic relationship among features via an intuitive feature visualization technique. Our systematic experiments demonstrate the empirical utility of this new method by applying it to several real-world datasets. We further provide an easy-to-use Python package for Feature Vectors.

artificial intelligence, information, machine learning, (18 more...)

arXiv.org Machine Learning

2111.05898

Country:

Europe (0.46)
North America > United States (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback