AITopics | Pavlick, Ellie

Plotting

Pavlick, Ellie

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Simfluence: Modeling the Influence of Individual Training Examples by Simulating Training Runs

Guu, Kelvin, Webson, Albert, Pavlick, Ellie, Dixon, Lucas, Tenney, Ian, Bolukbasi, Tolga

arXiv.org Artificial IntelligenceMar-14-2023

Training data attribution (TDA) methods offer to trace a model's prediction on any given example back to specific influential training examples. Existing approaches do so by assigning a scalar influence score to each training example, under a simplifying assumption that influence is additive. But in reality, we observe that training examples interact in highly non-additive ways due to factors such as inter-example redundancy, training order, and curriculum learning effects. To study such interactions, we propose Simfluence, a new paradigm for TDA where the goal is not to produce a single influence score per example, but instead a training run simulator: the user asks, ``If my model had trained on example $z_1$, then $z_2$, ..., then $z_n$, how would it behave on $z_{test}$?''; the simulator should then output a simulated training run, which is a time series predicting the loss on $z_{test}$ at every step of the simulated run. This enables users to answer counterfactual questions about what their model would have learned under different training curricula, and to directly see where in training that learning would occur. We present a simulator, Simfluence-Linear, that captures non-additive interactions and is often able to predict the spiky trajectory of individual example losses with surprising fidelity. Furthermore, we show that existing TDA methods such as TracIn and influence functions can be viewed as special cases of Simfluence-Linear. This enables us to directly compare methods in terms of their simulation accuracy, subsuming several prior TDA approaches to evaluation. In experiments on large language model (LLM) fine-tuning, we show that our method predicts loss trajectories with much higher accuracy than existing TDA methods (doubling Spearman's correlation and reducing mean-squared error by 75%) across several tasks, models, and training methods.

artificial intelligence, machine learning, simulator, (17 more...)

arXiv.org Artificial Intelligence

2303.08114

Country: Europe (0.28)

Genre:

Research Report (0.64)
Instructional Material > Course Syllabus & Notes (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Linearly Mapping from Image to Text Space

Merullo, Jack, Castricato, Louis, Eickhoff, Carsten, Pavlick, Ellie

arXiv.org Artificial IntelligenceMar-9-2023

The extent to which text-only language models (LMs) learn to represent features of the non-linguistic world is an open question. Prior work has shown that pretrained LMs can be taught to caption images when a vision model's parameters are optimized to encode images in the language space. We test a stronger hypothesis: that the conceptual representations learned by frozen text-only models and vision-only models are similar enough that this can be achieved with a linear map. We show that the image representations from vision models can be transferred as continuous prompts to frozen LMs by training only a single linear projection. Using these to prompt the LM achieves competitive performance on captioning and visual question answering tasks compared to models that tune both the image encoder and text decoder (such as the MAGMA model). We compare three image encoders with increasing amounts of linguistic supervision seen during pretraining: BEIT (no linguistic information), NF-ResNET (lexical category information), and CLIP (full natural language descriptions). We find that all three encoders perform equally well at transferring visual property information to the language model (e.g., whether an animal is large or small), but that image encoders pretrained with linguistic supervision more saliently encode category information (e.g., distinguishing hippo vs. elephant) and thus perform significantly better on benchmark language-and-vision tasks. Our results indicate that LMs encode conceptual information structurally similarly to vision-based models, even those that are solely trained on images. Code is available here: https://github.com/jmerullo/limber

artificial intelligence, natural language, question answering, (19 more...)

arXiv.org Artificial Intelligence

2209.15162

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.88)

Industry:

Leisure & Entertainment > Sports > Tennis (0.68)
Health & Medicine (0.67)
Consumer Products & Services (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.46)
(5 more...)

Add feedback

Comparing Trajectory and Vision Modalities for Verb Representation

Ebert, Dylan, Sun, Chen, Pavlick, Ellie

arXiv.org Artificial IntelligenceMar-8-2023

Three-dimensional trajectories, or the 3D position and rotation of objects over time, have been shown to encode key aspects of verb semantics (e.g., the meanings of roll vs. slide). However, most multimodal models in NLP use 2D images as representations of the world. Given the importance of 3D space in formal models of verb semantics, we expect that these 2D images would result in impoverished representations that fail to capture nuanced differences in meaning. This paper tests this hypothesis directly in controlled experiments. We train self-supervised image and trajectory encoders, and then evaluate them on the extent to which each learns to differentiate verb concepts. Contrary to our initial expectations, we find that 2D visual modalities perform similarly well to 3D trajectories. While further work should be conducted on this question, our initial findings challenge the conventional wisdom that richer environment representations necessarily translate into better representation learning for language.

artificial intelligence, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2303.12737

Country: North America > United States (0.14)

Genre:

Research Report > Experimental Study (0.69)
Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)

Add feedback

Evaluation Beyond Task Performance: Analyzing Concepts in AlphaZero in Hex

Lovering, Charles, Forde, Jessica Zosa, Konidaris, George, Pavlick, Ellie, Littman, Michael L.

arXiv.org Artificial IntelligenceNov-26-2022

AlphaZero, an approach to reinforcement learning that couples neural networks and Monte Carlo tree search (MCTS), has produced state-of-the-art strategies for traditional board games like chess, Go, shogi, and Hex. While researchers and game commentators have suggested that AlphaZero uses concepts that humans consider important, it is unclear how these concepts are captured in the network. We investigate AlphaZero's internal representations in the game of Hex using two evaluation techniques from natural language processing (NLP): model probing and behavioral tests. In doing so, we introduce new evaluation tools to the RL community, and illustrate how evaluations other than task performance can be used to provide a more complete picture of a model's strengths and weaknesses. Our analyses in the game of Hex reveal interesting patterns and generate some testable hypotheses about how such models learn in general. For example, we find that MCTS discovers concepts before the neural network learns to encode them. We also find that concepts related to short-term end-game planning are best encoded in the final layers of the model, whereas concepts related to long-term planning are encoded in the middle layers of the model.

alphazero, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2211.14673

Country:

Europe (0.46)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Chess (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Unit Testing for Concepts in Neural Networks

Lovering, Charles, Pavlick, Ellie

arXiv.org Artificial IntelligenceNov-25-2022

Many complex problems are naturally understood in terms of symbolic concepts. For example, our concept of "cat" is related to our concepts of "ears" and "whiskers" in a non-arbitrary way. Fodor (1998) proposes one theory of concepts, which emphasizes symbolic representations related via constituency structures. Whether neural networks are consistent with such a theory is open for debate. We propose unit tests for evaluating whether a system's behavior is consistent with several key aspects of Fodor's criteria. Using a simple visual concept learning task, we evaluate several modern neural architectures against this specification. We find that models succeed on tests of groundedness, modularlity, and reusability of concepts, but that important questions about causality remain open. Resolving these will require new methods for analyzing models' internal states.

artificial intelligence, machine learning, representation, (18 more...)

arXiv.org Artificial Intelligence

2208.10244

Country: Europe (0.28)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Does Vision-and-Language Pretraining Improve Lexical Grounding?

Yun, Tian, Sun, Chen, Pavlick, Ellie

arXiv.org Artificial IntelligenceSep-21-2021

Linguistic representations derived from text alone have been criticized for their lack of grounding, i.e., connecting words to their meanings in the physical world. Vision-and-Language (VL) models, trained jointly on text and image or video data, have been offered as a response to such criticisms. However, while VL pretraining has shown success on multimodal tasks such as visual question answering, it is not yet known how the internal linguistic representations themselves compare to their text-only counterparts. This paper compares the semantic representations learned via VL vs. text-only pretraining for two recent VL models using a suite of analyses (clustering, probing, and performance on a commonsense question answering task) in a language-only setting. We find that the multimodal models fail to significantly outperform the text-only variants, suggesting that future work is required if multimodal pretraining is to be pursued as a means of improving NLP in general.

artificial intelligence, representation, text processing, (18 more...)

arXiv.org Artificial Intelligence

2109.10246

Country:

Europe (0.46)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)

Add feedback

Robot Object Retrieval with Contextual Natural Language Queries

Nguyen, Thao, Gopalan, Nakul, Patel, Roma, Corsaro, Matt, Pavlick, Ellie, Tellex, Stefanie

arXiv.org Artificial IntelligenceJun-23-2020

Natural language object retrieval is a highly useful yet challenging task for robots in human-centric environments. Previous work has primarily focused on commands specifying the desired object's type such as "scissors" and/or visual attributes such as "red," thus limiting the robot to only known object classes. We develop a model to retrieve objects based on descriptions of their usage. The model takes in a language command containing a verb, for example "Hand me something to cut," and RGB images of candidate objects and selects the object that best satisfies the task specified by the verb. Our model directly predicts an object's appearance from the object's use specified by a verb phrase. We do not need to explicitly specify an object's class label. Our approach allows us to predict high level concepts like an object's utility based on the language query. Based on contextual information present in the language commands, our model can generalize to unseen object classes and unknown nouns in the commands. Our model correctly selects objects out of sets of five candidates to fulfill natural language commands, and achieves an average accuracy of 62.3% on a held-out test set of unseen ImageNet object classes and 53.0% on unseen object classes and unknown nouns. Our model also achieves an average accuracy of 54.7% on unseen YCB object classes, which have a different image distribution from ImageNet objects. We demonstrate our model on a KUKA LBR iiwa robot arm, enabling the robot to retrieve objects based on natural language descriptions of their usage. We also present a new dataset of 655 verb-object pairs denoting object usage over 50 verbs and 216 object classes.

deep learning, language command, neural network, (21 more...)

arXiv.org Artificial Intelligence

2006.13253

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Planning with State Abstractions for Non-Markovian Task Specifications

Oh, Yoonseon, Patel, Roma, Nguyen, Thao, Huang, Baichuan, Pavlick, Ellie, Tellex, Stefanie

arXiv.org Artificial IntelligenceMay-28-2019

Often times, we specify tasks for a robot using temporal language that can also span different levels of abstraction. The example command ``go to the kitchen before going to the second floor'' contains spatial abstraction, given that ``floor'' consists of individual rooms that can also be referred to in isolation ("kitchen", for example). There is also a temporal ordering of events, defined by the word "before". Previous works have used Linear Temporal Logic (LTL) to interpret temporal language (such as "before"), and Abstract Markov Decision Processes (AMDPs) to interpret hierarchical abstractions (such as "kitchen" and "second floor"), separately. To handle both types of commands at once, we introduce the Abstract Product Markov Decision Process (AP-MDP), a novel approach capable of representing non-Markovian reward functions at different levels of abstractions. The AP-MDP framework translates LTL into its corresponding automata, creates a product Markov Decision Process (MDP) of the LTL specification and the environment MDP, and decomposes the problem into subproblems to enable efficient planning with abstractions. AP-MDP performs faster than a non-hierarchical method of solving LTL problems in over 95% of tasks, and this number only increases as the size of the environment domain increases. We also present a neural sequence-to-sequence model trained to translate language commands into LTL expression, and a new corpus of non-Markovian language commands spanning different levels of abstraction. We test our framework with the collected language commands on a drone, demonstrating that our approach enables a robot to efficiently solve temporal commands at different levels of abstraction.

abstraction, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1905.12096

Country: North America > United States (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.89)

Add feedback

Extracting Structured Information via Automatic + Human Computation

Pavlick, Ellie (University of Pennsylvania) | Callison-Burch, Chris (University of Pennsylvania)

AAAI ConferencesNov-1-2015

We present a system for extracting structured information from unstructured text using a combination of information retrieval, natural language processing, machine learning, and crowdsourcing. We test our pipeline by building a structured database of gun violence incidents in the United States. The results of our pilot study demonstrate that the proposed methodology is a viable way of collecting large-scale, up-to-date data for public health, public policy, and social science research.

crowdsourcing, database, law enforcement, (22 more...)

AAAI Conferences

Third AAAI Conference on Human Computation and Crowdsourcing

Country: North America > United States (0.92)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine (1.00)
Government (0.91)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (0.38)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.35)

Add feedback

Poetry of the Crowd: A Human Computation Algorithm to Convert Prose into Rhyming Verse

Chen, Quanze (University of Pennsylvania) | Lei, Chenyang (University of Pennsylvania) | Xu, Wei (University of Pennsylvania) | Pavlick, Ellie (University of Pennsylvania) | Callison-Burch, Chris (University of Pennsylvania)

AAAI ConferencesOct-31-2014

Poetry composition is a very complex task that requires a poet to satisfy multiple constraints concurrently. We believe that the task can be augmented by combining the creative abilities of humans with computational algorithms that efficiently constrain and permute available choices. We present a hybrid method for generating poetry from prose that combines crowdsourcing with natural language processing (NLP) machinery. We test the ability of crowd workers to accomplish the technically challenging and creative task of composing poems.

artificial intelligence, human computation algorithm, natural language, (14 more...)

AAAI Conferences

Second AAAI Conference on Human Computation and Crowdsourcing

Country: North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.15)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback