
 Srivastava, Shashank


MaNtLE: Model-agnostic Natural Language Explainer

arXiv.org Artificial Intelligence

Understanding the internal reasoning behind the predictions of machine learning systems is increasingly vital, given their rising adoption and acceptance. While previous approaches, such as LIME, generate algorithmic explanations by attributing importance to input features for individual examples, recent research indicates that practitioners prefer examining language explanations that describe sub-groups of examples. In this paper, we introduce MaNtLE, a model-agnostic natural language explainer that analyzes multiple classifier predictions and generates faithful natural language explanations of classifier rationale for structured classification tasks. MaNtLE uses multi-task training on thousands of synthetic classification tasks to generate faithful explanations. Simulated user studies indicate that, on average, MaNtLE-generated explanations are at least 11% more faithful than LIME and Anchors explanations across three tasks. Human evaluations demonstrate that users can better predict model behavior using explanations from MaNtLE than with other techniques.
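As a rough illustration of the faithfulness measure mentioned above, an explanation can be scored by how often applying it to the inputs reproduces the classifier's own predictions. The sketch below uses a toy rule, classifier, and dataset of my own invention; it is not the authors' implementation.

```python
# Hypothetical sketch: "faithfulness" of a rule-like explanation, measured as
# the fraction of examples on which the explanation reproduces the classifier's
# own predictions. The rule, classifier, and data are all toy stand-ins.

def rule_explanation(example):
    # Stand-in for a natural-language explanation such as
    # "If age is over 40 and income is high, the model predicts 'approved'."
    if example["age"] > 40 and example["income"] == "high":
        return "approved"
    return "denied"

def toy_classifier(example):
    # Stand-in for the black-box model being explained.
    return "approved" if example["age"] > 35 else "denied"

def faithfulness(explanation_fn, classifier_fn, examples):
    matches = sum(explanation_fn(x) == classifier_fn(x) for x in examples)
    return matches / len(examples)

data = [{"age": a, "income": inc} for a in (25, 38, 45, 60) for inc in ("low", "high")]
print(f"faithfulness = {faithfulness(rule_explanation, toy_classifier, data):.2f}")
```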


Identifying and Manipulating the Personality Traits of Language Models

arXiv.org Artificial Intelligence

Psychology research has long explored aspects of human personality such as extroversion, agreeableness, and emotional stability. Categorizations like the 'Big Five' personality traits are commonly used to assess and diagnose personality types. In this work, we explore whether the perceived personality in language models is exhibited consistently in their language generation. For example, is a language model such as GPT2 likely to respond in a consistent way if asked to go out to a party? We also investigate whether such personality traits can be controlled. We show that when provided with different types of contexts (such as personality descriptions, or answers to diagnostic questions about personality traits), language models such as BERT and GPT2 can consistently identify and reflect personality markers in those contexts. This behavior illustrates that these models can be manipulated in a highly predictable way, and frames them as tools for identifying personality traits and controlling personas in applications such as dialog systems. We also contribute a crowd-sourced dataset of personality descriptions of human subjects paired with their 'Big Five' personality assessment data, and a dataset of personality descriptions collated from Reddit.
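A minimal sketch of this style of probe, using the Hugging Face transformers library: a personality description is prepended as context and a masked diagnostic question is scored for candidate answers. The descriptions, probe sentence, and answer words here are my own stand-ins, not items from the paper or its datasets.

```python
# Sketch: does a masked LM's answer to a diagnostic question shift with the
# personality description given as context? (Context sentences, probe, and
# target words are invented for illustration.)
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

contexts = {
    "extravert": "I am outgoing, talkative, and love meeting new people.",
    "introvert": "I am quiet, reserved, and prefer spending time alone.",
}
probe = "Asked to go to a party tonight, my answer is [MASK]."

for label, context in contexts.items():
    # Score only the candidate answers "yes" and "no" for the masked slot.
    results = fill_mask(f"{context} {probe}", targets=["yes", "no"])
    scores = {r["token_str"]: r["score"] for r in results}
    print(label, scores)
```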


LaSQuE: Improved Zero-Shot Classification from Explanations Through Quantifier Modeling and Curriculum Learning

arXiv.org Artificial Intelligence

A hallmark of human intelligence is the ability to learn new concepts purely from language. Several recent approaches have explored training machine learning models via natural language supervision. However, these approaches fall short in leveraging linguistic quantifiers (such as 'always' or 'rarely') and mimicking humans in compositionally learning complex tasks. Here, we present LaSQuE, a method that can learn zero-shot classifiers from language explanations by using three new strategies: (1) modeling the semantics of linguistic quantifiers in explanations (including exploiting ordinal strength relationships, such as 'always' > 'likely'), (2) aggregating information from multiple explanations using an attention-based mechanism, and (3) model training via curriculum learning. With these strategies, LaSQuE outperforms prior work, showing an absolute gain of up to 7% in generalizing to unseen real-world classification tasks.
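A toy sketch of strategies (1) and (2): quantifiers are grounded as probabilities that respect an ordinal ordering, and several explanations are aggregated with weights before a classification decision. The quantifier values, explanations, and weights below are invented for illustration, not LaSQuE's learned parameters.

```python
# Toy sketch: linguistic quantifiers grounded as probabilities (respecting the
# ordinal ordering 'always' > 'usually' > 'likely' > 'rarely' > 'never') and an
# attention-style weighted aggregation of several explanations.
QUANTIFIER_PROB = {"always": 0.99, "usually": 0.85, "likely": 0.70,
                   "rarely": 0.15, "never": 0.01}

def explanation_score(example, feature, value, quantifier):
    # Probability that the example belongs to the target class according to one
    # explanation such as "animals that can fly are usually of the target class".
    p = QUANTIFIER_PROB[quantifier]
    return p if example.get(feature) == value else 1.0 - p

def classify(example, explanations, weights):
    # Weighted average of per-explanation scores; a learned attention mechanism
    # would produce these weights from the explanations themselves.
    total = sum(w * explanation_score(example, *e) for e, w in zip(explanations, weights))
    return "target" if total / sum(weights) > 0.5 else "other"

explanations = [("can_fly", True, "usually"), ("has_fur", True, "rarely")]
weights = [0.7, 0.3]
print(classify({"can_fly": True, "has_fur": False}, explanations, weights))
```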


Improving and Simplifying Pattern Exploiting Training

arXiv.org Artificial Intelligence

Recently, pre-trained language models (LMs) have achieved strong performance when fine-tuned on difficult benchmarks like SuperGLUE. However, performance can suffer when there are very few labeled examples available for fine-tuning. Pattern Exploiting Training (PET) is a recent approach that leverages patterns for few-shot learning. However, PET uses task-specific unlabeled data. In this paper, we focus on few-shot learning without any unlabeled data and introduce ADAPET, which modifies PET's objective to provide denser supervision during fine-tuning. As a result, ADAPET outperforms PET on SuperGLUE without any task-specific unlabeled data. Our code can be found at https://github.com/rrmenon10/ADAPET.
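One way to picture "denser supervision" is a decoupled-label-style objective: at the masked label position, the correct verbalizer token gets a binary target of 1 and every other vocabulary token gets 0, instead of a softmax restricted to the label words. The PyTorch sketch below is my own simplification, not the released ADAPET implementation.

```python
# Rough sketch of a decoupled-label objective at the [MASK] position.
# My own simplification; not the released ADAPET code.
import torch
import torch.nn.functional as F

def decoupled_label_loss(logits_at_mask: torch.Tensor, correct_token_id: int) -> torch.Tensor:
    """logits_at_mask: (vocab_size,) logits at the masked label position."""
    targets = torch.zeros_like(logits_at_mask)
    targets[correct_token_id] = 1.0  # every other vocabulary token is a negative
    return F.binary_cross_entropy_with_logits(logits_at_mask, targets)

vocab_size = 10
logits = torch.randn(vocab_size)
print(decoupled_label_loss(logits, correct_token_id=3))
```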


PRover: Proof Generation for Interpretable Reasoning over Rules

arXiv.org Artificial Intelligence

Recent work by Clark et al. (2020) shows that transformers can act as 'soft theorem provers' by answering questions over explicitly provided knowledge in natural language. In our work, we take a step closer to emulating formal theorem provers, by proposing PROVER, an interpretable transformer-based model that jointly answers binary questions over rule-bases and generates the corresponding proofs. Our model learns to predict nodes and edges corresponding to proof graphs in an efficient constrained training paradigm. During inference, a valid proof satisfying a set of global constraints is generated. We conduct experiments on synthetic, hand-authored, and human-paraphrased rule-bases to show promising results for QA and proof generation, with strong generalization performance. First, PROVER generates proofs with an accuracy of 87%, while retaining or improving performance on the QA task, compared to RuleTakers (up to 6% improvement on zero-shot evaluation). Second, when trained on questions requiring lower depths of reasoning, it generalizes significantly better to higher depths (up to 15% improvement). Third, PROVER obtains near-perfect QA accuracy of 98% using only 40% of the training data. However, generating proofs for questions requiring higher depths of reasoning becomes challenging, and the accuracy drops to 65% for 'depth 5', indicating significant scope for future work. Our code and models are publicly available at https://github.com/swarnaHub/PRover
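To make the constrained-inference idea concrete, here is an illustrative check of two global constraints a predicted proof graph should satisfy: edges connect only predicted proof nodes, and the graph is connected. This is not the released PRover code (which enforces such constraints by solving an integer program), just a sketch of the kinds of constraints involved.

```python
# Illustrative validity check for a predicted proof graph.
from collections import deque

def is_valid_proof(nodes, edges):
    # Constraint 1: every edge connects two predicted proof nodes.
    if any(u not in nodes or v not in nodes for u, v in edges):
        return False
    # Constraint 2: the proof graph is connected (a single reasoning chain).
    if not nodes:
        return False
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    start = next(iter(nodes))
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in adjacency[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == set(nodes)

print(is_valid_proof({"rule1", "fact2", "QUESTION"},
                     [("fact2", "rule1"), ("rule1", "QUESTION")]))  # True
```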


Inferring Interpersonal Relations in Narrative Summaries

AAAI Conferences

Characterizing relationships between people is fundamental for the understanding of narratives. In this work, we address the problem of inferring the polarity of relationships between people in narrative summaries. We formulate the problem as a joint structured prediction for each narrative, and present a general model that combines evidence from linguistic and semantic features, as well as features based on the structure of the social community in the text. We additionally provide a clustering-based approach that can exploit regularities in narrative types, e.g., learning an affinity for love triangles in romantic stories. On a dataset of movie summaries from Wikipedia, our structured models provide more than 30% error-reduction over a competitive baseline that considers pairs of characters in isolation.
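A toy sketch of the joint-prediction idea: per-pair text evidence is combined with a structural-balance term over character triangles, and the highest-scoring joint assignment of polarities is selected. The characters, scores, and balance bonus below are invented for illustration and do not reproduce the paper's model.

```python
# Toy joint inference of relationship polarity (+1 cooperative, -1 hostile).
from itertools import product

pairs = [("Alice", "Bob"), ("Alice", "Carol"), ("Bob", "Carol")]

# Hypothetical per-pair scores from linguistic/semantic features:
# positive values favor a cooperative relation, negative favor a hostile one.
text_evidence = {("Alice", "Bob"): 2.0, ("Alice", "Carol"): -1.5, ("Bob", "Carol"): 0.2}

def score(assignment):
    pairwise = sum(text_evidence[p] * assignment[p] for p in pairs)
    # Structural-balance bonus: the product of the three signs should be +1
    # (e.g., "the enemy of my friend is my enemy").
    triangle = assignment[pairs[0]] * assignment[pairs[1]] * assignment[pairs[2]]
    return pairwise + (1.0 if triangle > 0 else -1.0)

best = max(product([-1, 1], repeat=len(pairs)),
           key=lambda signs: score(dict(zip(pairs, signs))))
print(dict(zip(pairs, best)))
```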


Modeling Evolving Relationships Between Characters in Literary Novels

AAAI Conferences

Studying characters plays a vital role in computationally representing and interpreting narratives. Unlike previous work, which has focused on inferring character roles, we focus on the problem of modeling their relationships. Rather than assuming a fixed relationship for a character pair, we hypothesize that relationships temporally evolve with the progress of the narrative, and formulate the problem of relationship modeling as a structured prediction problem. We propose a semi-supervised framework to learn relationship sequences from fully as well as partially labeled data. We present a Markovian model capable of accumulating historical beliefs about the relationship and status changes. We use a set of rich linguistic and semantically motivated features that incorporate world knowledge to investigate the textual content of the narrative. We empirically demonstrate that such a framework outperforms competitive baselines.
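A minimal sketch of the Markovian idea: the relationship for a character pair is a sequence of states, one per narrative segment, and a transition prior discourages abrupt flips; Viterbi decoding then recovers the most plausible trajectory. All probabilities below are toy numbers, not learned parameters from the paper.

```python
# Toy Viterbi decoding of an evolving relationship between two characters.
import math

STATES = ["cooperative", "hostile"]
# Transition prior between adjacent narrative segments (log-probabilities):
# staying in the same state is cheaper than flipping.
TRANSITION = {("cooperative", "cooperative"): math.log(0.8),
              ("cooperative", "hostile"): math.log(0.2),
              ("hostile", "cooperative"): math.log(0.2),
              ("hostile", "hostile"): math.log(0.8)}

def viterbi(emission_logprobs):
    """emission_logprobs: one dict per segment, {state: log p(segment text | state)}."""
    best = {s: emission_logprobs[0][s] for s in STATES}
    backpointers = []
    for emit in emission_logprobs[1:]:
        step, ptr = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[p] + TRANSITION[(p, s)])
            step[s] = best[prev] + TRANSITION[(prev, s)] + emit[s]
            ptr[s] = prev
        best = step
        backpointers.append(ptr)
    # Trace back the highest-scoring relationship trajectory.
    state = max(STATES, key=best.get)
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    return list(reversed(path))

# Toy per-segment evidence: starts clearly cooperative, then turns hostile.
segments = [{"cooperative": math.log(0.9), "hostile": math.log(0.1)},
            {"cooperative": math.log(0.4), "hostile": math.log(0.6)},
            {"cooperative": math.log(0.2), "hostile": math.log(0.8)}]
print(viterbi(segments))  # ['cooperative', 'hostile', 'hostile']
```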