AITopics | Feder, Amir

An Invariant Learning Characterization of Controlled Text Generation

Zheng, Carolina, Shi, Claudia, Vafa, Keyon, Feder, Amir, Blei, David M.

arXiv.org Artificial Intelligence, May-31-2023

Controlled generation refers to the problem of creating text that contains stylistic or semantic attributes of interest. Many approaches reduce this problem to training a predictor of the desired attribute. For example, researchers hoping to deploy a large language model to produce non-toxic content may use a toxicity classifier to filter generated text. In practice, the generated text to classify, which is determined by user prompts, may come from a wide range of distributions. In this paper, we show that the performance of controlled generation may be poor if the distributions of text in response to user prompts differ from the distribution the predictor was trained on. To address this problem, we cast controlled generation under distribution shift as an invariant learning problem: the most effective predictor should be invariant across multiple text environments. We then discuss a natural solution that arises from this characterization and propose heuristics for selecting natural environments. We study this characterization and the proposed method empirically using both synthetic and real data. Experiments demonstrate both the challenge of distribution shift in controlled generation and the potential of invariance methods in this setting.

machine learning, natural language, predictor, (19 more...)

arXiv:2306.00198

Country:

Europe > Spain (0.14)
Europe > Netherlands (0.14)

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety (0.93)
Education > Focused Education > Special Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)


Measuring Causal Effects of Data Statistics on Language Model's 'Factual' Predictions

Elazar, Yanai, Kassner, Nora, Ravfogel, Shauli, Feder, Amir, Ravichander, Abhilasha, Mosbach, Marius, Belinkov, Yonatan, Schütze, Hinrich, Goldberg, Yoav

arXiv.org Artificial Intelligence, Mar-24-2023

Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models. But what exactly in the training data causes a model to make a certain prediction? We seek to answer this question by providing a language for describing how training data influences predictions, through a causal framework. Importantly, our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone. Addressing the problem of extracting factual knowledge from pretrained language models (PLMs), we focus on simple data statistics such as co-occurrence counts and show that these statistics do influence the predictions of PLMs, suggesting that such models rely on shallow heuristics. Our causal framework and our results demonstrate the importance of studying datasets and the benefits of causality for understanding NLP models.
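As a rough illustration of the "simple data statistics" the abstract mentions, the sketch below counts how often two tokens appear in the same sentence. The function name `cooccurrence_counts`, the toy corpus, and the whitespace tokenization are assumptions made for this example; the paper's actual counting procedure and its causal estimation step are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count how often each unordered pair of distinct tokens shares a sentence."""
    counts = Counter()
    for sentence in sentences:
        # Lowercase, split on whitespace, and deduplicate so a pair is
        # counted at most once per sentence. Sorting gives a stable key order.
        tokens = sorted(set(sentence.lower().split()))
        for pair in combinations(tokens, 2):
            counts[pair] += 1
    return counts


# Hypothetical toy corpus, not data from the paper.
toy_corpus = [
    "Dante was born in Florence",
    "Florence is the city of Dante",
    "Dante wrote in Italian",
]
counts = cooccurrence_counts(toy_corpus)
print(counts[("dante", "florence")])
```

A count like this is only a descriptive statistic; relating it to a model's predictions in a causal sense requires the additional assumptions the abstract alludes to.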

artificial intelligence, machine learning, natural language, (17 more...)

arXiv:2207.14251

Country:

Europe (1.00)
North America > Canada > Alberta (0.14)
Asia > Middle East > Republic of Türkiye (0.14)
North America > United States > New Mexico (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment (0.93)
Media > Television (0.68)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond

Feder, Amir, Keith, Katherine A., Manzoor, Emaad, Pryzant, Reid, Sridhar, Dhanya, Wood-Doughty, Zach, Eisenstein, Jacob, Grimmer, Justin, Reichart, Roi, Roberts, Margaret E., Stewart, Brandon M., Veitch, Victor, Yang, Diyi

arXiv.org Artificial Intelligence, Jul-30-2022

A fundamental goal of scientific research is to learn about causal relationships. However, despite its critical role in the life and social sciences, causality has not had the same importance in Natural Language Processing (NLP), which has traditionally placed more emphasis on predictive tasks. This distinction is beginning to fade, with an emerging area of interdisciplinary research at the convergence of causal inference and language processing. Still, research on causality in NLP remains scattered across domains without unified definitions, benchmark datasets and clear articulations of the challenges and opportunities in the application of causal inference to the textual domain, with its unique properties. In this survey, we consolidate research across academic areas and situate it in the broader NLP landscape. We introduce the statistical challenge of estimating causal effects with text, encompassing settings where text is used as an outcome, treatment, or to address confounding. In addition, we explore potential uses of causal inference to improve the robustness, fairness, and interpretability of NLP models. We thus provide a unified overview of causal inference for the NLP community.

computational linguistic, machine learning, natural language, (19 more...)

arXiv:2109.00725

Country:

Europe (0.67)
North America > United States > California (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)


Model Compression for Domain Adaptation through Causal Effect Estimation

Rotman, Guy, Feder, Amir, Reichart, Roi

arXiv.org Artificial Intelligence, Jan-18-2021

Recent improvements in the predictive quality of natural language processing systems often depend on a substantial increase in the number of model parameters. This has led to various attempts to compress such models, but existing methods have not considered the differences in the predictive power of various model components or in the generalizability of the compressed models. To understand the connection between model compression and out-of-distribution generalization, we define the task of compressing language representation models such that they perform best in a domain adaptation setting. We choose to address this problem from a causal perspective, attempting to estimate the average treatment effect (ATE) of a model component, such as a single layer, on the model's predictions. Our proposed ATE-guided Model Compression scheme (AMoC) generates many model candidates, differing by the model components that were removed. Then, we select the best candidate through a stepwise regression model that utilizes the ATE to predict the expected performance on the target domain. AMoC outperforms strong baselines on 46 of 60 domain pairs across two text classification tasks, with an average improvement of more than 3% in F1 above the strongest baseline.
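For readers unfamiliar with the term, the average treatment effect is conventionally defined as ATE = E[Y(1)] - E[Y(0)], the expected outcome with the treatment minus the expected outcome without it. The sketch below uses the simplest possible estimator, a difference in sample means, which is only valid when treatment is assigned at random. The function name and the toy scores are hypothetical, and this is not the estimator AMoC uses.

```python
from statistics import mean


def average_treatment_effect(treated_outcomes, control_outcomes):
    """Difference-in-means estimate of ATE = E[Y(1)] - E[Y(0)]."""
    return mean(treated_outcomes) - mean(control_outcomes)


# Hypothetical scores with and without a given model component.
# These numbers are invented for illustration, not taken from the paper.
with_layer = [0.81, 0.79, 0.84]
without_layer = [0.75, 0.78, 0.74]

ate = average_treatment_effect(with_layer, without_layer)
print(f"Estimated ATE: {ate:.4f}")
```

A positive estimate here would suggest that keeping the component helps on average, under the stated randomization assumption.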

artificial intelligence, natural language, target domain, (18 more...)

arXiv:2101.07086

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)


CausaLM: Causal Model Explanation Through Counterfactual Language Models

Feder, Amir, Oved, Nadav, Shalit, Uri, Reichart, Roi

arXiv.org Artificial Intelligence, Jun-14-2020

Understanding predictions made by deep neural networks is notoriously difficult, but also crucial to their dissemination. Like all ML-based methods, they are only as good as their training data, and can also capture unwanted biases. While there are tools that can help understand whether such biases exist, they do not distinguish between correlation and causation, and might be ill-suited for text-based models and for reasoning about high-level language concepts. A key problem of estimating the causal effect of a concept of interest on a given model is that this estimation requires the generation of counterfactual examples, which is challenging with existing generation technology. To bridge that gap, we propose CausaLM, a framework for producing causal model explanations using counterfactual language representation models. Our approach is based on fine-tuning of deep contextualized embedding models with auxiliary adversarial tasks derived from the causal graph of the problem. Concretely, we show that by carefully choosing auxiliary adversarial pre-training tasks, language representation models such as BERT can effectively learn a counterfactual representation for a given concept of interest, and be used to estimate its true causal effect on model performance. A byproduct of our method is a language representation model that is unaffected by the tested concept, which can be useful in mitigating unwanted bias ingrained in the data.

deep learning, neural network, representation, (24 more...)

arXiv:2005.13407

Country:

North America > United States > Oregon (0.14)
Europe > United Kingdom > England (0.14)
Europe > Middle East > Malta (0.14)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Government > Regional Government > North America Government > United States Government (0.92)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)


Predicting In-game Actions From the Language of NBA Players

Oved, Nadav, Feder, Amir, Reichart, Roi

arXiv.org Artificial Intelligence, Oct-24-2019

Sports competitions are widely researched in computer and social science, with the goal of understanding how players act under uncertainty. While there is an abundance of computational work on player metrics prediction based on past performance, very few attempts to incorporate out-of-game signals have been made. Specifically, it was previously unclear whether linguistic signals gathered from players' interviews can add information which does not appear in performance metrics. To bridge that gap, we define text classification tasks of predicting deviations from mean in NBA players' in-game actions, which are associated with strategic choices, player behavior and risk, using their choice of language prior to the game. We collected a dataset of transcripts from key NBA players' pre-game interviews and their in-game performance metrics, totaling 5,226 interview-metric pairs. We design neural models for players' action prediction based on increasingly complex aspects of the language signals in their open-ended interviews. Our models can make their predictions based on the textual signal alone, or on a combination with signals from past-performance metrics. Our text-based models outperform strong baselines trained on performance metrics only, demonstrating the importance of language usage for action prediction. Moreover, the models that employ both textual input and past-performance metrics produced the best results. Finally, as neural networks are notoriously difficult to interpret, we propose a method for gaining further insight into what our models have learned. In particular, we present an LDA-based analysis, where we interpret model predictions in terms of correlated topics. We find that our best performing textual model is most associated with topics that are intuitively related to each prediction task and that better models yield higher correlation with more informative topics.
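One way to picture the "deviations from mean" classification targets is to label each game by whether a player's metric exceeded that player's own average. The sketch below does this for a hypothetical list of per-game values. The function name, the metric, the choice of averaging over all listed games, and the strict greater-than threshold are all assumptions for illustration; the paper may define the baseline and labels differently.

```python
from statistics import mean


def deviation_labels(metric_by_game):
    """Label each game 1 if the metric is above the player's mean, else 0."""
    baseline = mean(metric_by_game)
    return [1 if value > baseline else 0 for value in metric_by_game]


# Hypothetical per-game shot attempts for one player (invented numbers).
shots_per_game = [18, 22, 15, 25, 20]

labels = deviation_labels(shots_per_game)
print(labels)  # [0, 1, 0, 1, 0]
```

A text classifier would then be trained to predict each label from the interview given before the corresponding game.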

deep learning, interview, neural network, (22 more...)

arXiv:1910.11292

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Sports > Basketball (1.00)
Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
