ClarifyDelphi: Reinforced Clarification Questions with Defeasibility Rewards for Social and Moral Situations
Pyatkin, Valentina, Hwang, Jena D., Srikumar, Vivek, Lu, Ximing, Jiang, Liwei, Choi, Yejin, Bhagavatula, Chandra
Context is everything, even in commonsense moral reasoning. Changing contexts can flip the moral judgment of an action; "Lying to a friend" is wrong in general, but may be morally acceptable if it is intended to protect their life. We present ClarifyDelphi, an interactive system that learns to ask clarification questions (e.g., why did you lie to your friend?) in order to elicit additional salient contexts of a social or moral situation. We posit that questions whose potential answers lead to diverging moral judgments are the most informative. Thus, we propose a reinforcement learning framework with a defeasibility reward that aims to maximize the divergence between moral judgments of hypothetical answers to a question. Human evaluation demonstrates that our system generates more relevant, informative and defeasible questions compared to competitive baselines. Our work is ultimately inspired by studies in cognitive science that have investigated the flexibility in moral cognition (i.e., the diverse contexts in which moral rules can be bent), and we hope that research in this direction can assist both cognitive and computational investigations of moral judgments.
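The abstract names the key quantity (the divergence between moral judgments of hypothetical answers) without giving its form. The sketch below is a minimal illustration of how such a defeasibility reward could be computed, assuming a hypothetical answer generator that imagines a weakening and a strengthening answer, and a judgment model that returns a distribution over moral-judgment classes; the function names and the choice of Jensen-Shannon divergence are assumptions, not the paper's exact formulation.

    import math

    def js_divergence(p, q):
        """Jensen-Shannon divergence between two discrete distributions."""
        m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
        def kl(a, b):
            return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
        return 0.5 * kl(p, m) + 0.5 * kl(q, m)

    def defeasibility_reward(situation, question, answer_generator, judge):
        """Reward a clarification question by how far apart the moral judgments of its
        hypothetical answers are. `answer_generator` and `judge` are hypothetical
        components: the first imagines a weakening and a strengthening answer, the
        second returns a distribution over judgment classes (e.g., good / okay / bad)."""
        weakening, strengthening = answer_generator(situation, question)
        p = judge(situation + " " + weakening)
        q = judge(situation + " " + strengthening)
        return js_divergence(p, q)   # higher = more defeasible, hence more informative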
Don't Retrain, Just Rewrite: Countering Adversarial Perturbations by Rewriting Text
Gupta, Ashim, Blum, Carter Wood, Choji, Temma, Fei, Yingjie, Shah, Shalin, Vempala, Alakananda, Srikumar, Vivek
Can language models transform inputs to protect text classifiers against adversarial attacks? In this work, we present ATINTER, a model that intercepts and learns to rewrite adversarial inputs to make them non-adversarial for a downstream text classifier. Our experiments on four datasets and five attack mechanisms reveal that ATINTER is effective at providing better adversarial robustness than existing defense approaches, without compromising task accuracy. For example, on sentiment classification using the SST-2 dataset, our method improves the adversarial accuracy over the best existing defense approach by more than 4% with a smaller decrease in task accuracy (0.5% vs 2.5%). Moreover, we show that ATINTER generalizes across multiple downstream tasks and classifiers without having to explicitly retrain it for those settings. Specifically, we find that when ATINTER is trained to remove adversarial perturbations for the sentiment classification task on the SST-2 dataset, it even transfers to a semantically different task of news classification (on AGNews) and improves the adversarial robustness by more than 10%.
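A minimal sketch of the intercept-and-rewrite pipeline described above, assuming a sequence-to-sequence rewriter placed in front of an off-the-shelf classifier; t5-small and the default sentiment pipeline are stand-ins, not the trained ATINTER model or the classifiers evaluated in the paper.

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

    # Stand-ins: t5-small is a placeholder for the trained ATINTER rewriter, and the
    # default sentiment pipeline is a placeholder for the protected downstream classifier.
    tok = AutoTokenizer.from_pretrained("t5-small")
    rewriter = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
    classifier = pipeline("sentiment-analysis")

    def classify_with_interception(text: str) -> dict:
        """Rewrite a (possibly adversarial) input, then classify the rewritten text.
        The downstream classifier itself is never retrained."""
        ids = tok(text, return_tensors="pt").input_ids
        out = rewriter.generate(ids, max_new_tokens=64)
        rewritten = tok.decode(out[0], skip_special_tokens=True)
        return classifier(rewritten)[0]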
AGRO: Adversarial Discovery of Error-prone groups for Robust Optimization
Paranjape, Bhargavi, Dasigi, Pradeep, Srikumar, Vivek, Zettlemoyer, Luke, Hajishirzi, Hannaneh
Models trained via empirical risk minimization (ERM) are known to rely on spurious correlations between labels and task-independent input features, resulting in poor generalization to distributional shifts. Group distributionally robust optimization (G-DRO) can alleviate this problem by minimizing the worst-case loss over a set of pre-defined groups of training examples. G-DRO successfully improves performance on the worst group, where the spurious correlation does not hold. However, G-DRO assumes that the spurious correlations and associated worst groups are known in advance, making it challenging to apply to new tasks with potentially multiple unknown spurious correlations. We propose AGRO -- Adversarial Group discovery for Distributionally Robust Optimization -- an end-to-end approach that jointly identifies error-prone groups and improves accuracy on them. AGRO equips G-DRO with an adversarial slicing model that finds a group assignment for training examples which maximizes worst-case loss over the discovered groups. On the WILDS benchmark, AGRO results in 8% higher model performance on average on known worst groups, compared to prior group discovery approaches used with G-DRO. AGRO also improves out-of-distribution performance on SST2, QQP, and MS-COCO -- datasets where potential spurious correlations are as yet uncharacterized. Human evaluation of AGRO groups shows that they contain well-defined, yet previously unstudied spurious correlations that lead to model errors.
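A minimal sketch of the worst-group objective that AGRO builds on, assuming soft group-membership probabilities produced by a slicing model; the exact parameterization, regularization, and optimization schedule of AGRO are not shown.

    import torch

    def worst_group_loss(per_example_loss, group_probs):
        """G-DRO-style objective with soft group assignments.
        per_example_loss: (N,) task-model losses; group_probs: (N, K) soft group
        memberships from a slicing model (produced adversarially in AGRO).
        Returns the average loss of the worst (highest-loss) discovered group."""
        weights = group_probs / group_probs.sum(dim=0, keepdim=True)          # normalize per group
        group_losses = (weights * per_example_loss.unsqueeze(1)).sum(dim=0)   # (K,) per-group losses
        return group_losses.max()

    # The task model is trained to minimize this quantity while the slicing model
    # producing group_probs is updated to maximize it, i.e., a minimax game.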
Is My Model Using The Right Evidence? Systematic Probes for Examining Evidence-Based Tabular Reasoning
Gupta, Vivek, Bhat, Riyaz A., Ghosal, Atreya, Srivastava, Manish, Singh, Maneesh, Srikumar, Vivek
While neural models routinely report state-of-the-art performance across NLP tasks involving reasoning, their outputs are often observed to not properly use and reason on the evidence presented to them in the inputs. A model that reasons properly is expected to attend to the right parts of the input, be self-consistent in its predictions across examples, avoid spurious patterns in inputs, and ignore biases from its underlying pre-trained language model in a nuanced, context-sensitive fashion (e.g., handling counterfactuals). Do today's models do so? In this paper, we study this question using the problem of reasoning on tabular data. The tabular nature of the input is particularly suited for the study, as it admits systematic probes targeting the properties listed above. Our experiments demonstrate that a BERT-based model representative of today's state of the art fails to properly reason on the following counts: it often (a) misses the relevant evidence, (b) suffers from hypothesis and knowledge biases, and (c) relies on annotation artifacts and knowledge from pre-trained language models as primary evidence rather than on reasoning over the premises in the tabular input.
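The probes themselves are described in the paper; as one illustrative example of their general flavor, the sketch below checks whether a prediction survives the removal of the table row that carries the evidence. The `predict` interface and the key-value table format are assumptions.

    def evidence_deletion_probe(table, hypothesis, evidence_key, predict):
        """Remove the key-value row that carries the evidence for a hypothesis and
        compare predictions. A model that truly uses the evidence should become
        uncertain (move towards 'neutral') once the row is gone; a model leaning on
        artifacts will often keep its original label. `predict` is a hypothetical
        interface mapping a (table, hypothesis) pair to an NLI label."""
        original = predict(table, hypothesis)
        ablated_table = {k: v for k, v in table.items() if k != evidence_key}
        ablated = predict(ablated_table, hypothesis)
        return original, ablated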
Evaluating Relaxations of Logic for Neural Networks: A Comprehensive Study
Grespan, Mattia Medina, Gupta, Ashim, Srikumar, Vivek
Symbolic knowledge can provide crucial inductive bias for training neural models, especially in low-data regimes. A successful strategy for incorporating such knowledge involves relaxing logical statements into sub-differentiable losses for optimization. In this paper, we study the question of how best to relax logical expressions that represent labeled examples and knowledge about a problem; we focus on sub-differentiable t-norm relaxations of logic. We present theoretical and empirical criteria for characterizing which relaxation would perform best in various scenarios. In our theoretical study, driven by the goal of preserving tautologies, the Łukasiewicz t-norm performs best. However, in our empirical analysis on the text chunking and digit recognition tasks, the product t-norm achieves the best predictive performance. We analyze this apparent discrepancy, and conclude with a list of best practices for defining loss functions via logic.
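For concreteness, the sketch below writes out the standard t-norm conjunctions and their residual implications on truth values in [0, 1]; how an implication's truth value is converted into a loss (e.g., one minus the value versus a negative log) is a design choice, and the version shown here is only an illustration, not the paper's exact construction.

    # Standard t-norm conjunctions on truth values in [0, 1]:
    def product_and(a, b):
        return a * b

    def lukasiewicz_and(a, b):
        return max(0.0, a + b - 1.0)

    def godel_and(a, b):
        return min(a, b)

    # Their residual implications (truth value of "a implies b"):
    def product_implies(a, b):
        return 1.0 if a <= b else b / a

    def lukasiewicz_implies(a, b):
        return min(1.0, 1.0 - a + b)

    def godel_implies(a, b):
        return 1.0 if a <= b else b

    # A rule "if the model believes X then it should believe Y" becomes a loss
    # by penalizing the falsity of the relaxed implication, e.g.:
    def rule_loss(p_x, p_y, implies=lukasiewicz_implies):
        return 1.0 - implies(p_x, p_y)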
Database Workload Characterization with Query Plan Encoders
Paul, Debjyoti, Cao, Jie, Li, Feifei, Srikumar, Vivek
Smart databases are adopting artificial intelligence (AI) technologies to achieve instance optimality, and in the future, databases will come with prepackaged AI models within their core components. The reason is that every database runs different workloads and demands specific resources and settings to achieve optimal performance. This prompts the need to comprehensively understand the workloads running in the system along with their features, which we dub workload characterization. To address this workload characterization problem, we propose query plan encoders that learn essential features and their correlations from query plans. Our pretrained encoders independently capture the structural and the computational performance characteristics of queries. We show that our pretrained encoders are adaptable to new workloads, which expedites transfer learning. We performed independent assessments of the structural and performance encoders on multiple downstream tasks. For the overall evaluation of our query plan encoders, we design two downstream tasks: (i) query latency prediction and (ii) query classification. These tasks demonstrate the importance of feature-based workload characterization. We also performed extensive experiments on the individual encoders to verify the effectiveness of representation learning and domain adaptability.
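The abstract does not detail the encoder architectures, so the sketch below only illustrates the general idea of turning a query-plan tree into a fixed-size representation by recursively combining operator and statistics features with pooled child encodings; the plan schema and the `embed_op` / `encode_node` components are placeholders for learned modules, not the paper's design.

    import numpy as np

    def encode_plan(node, embed_op, encode_node, dim):
        """Recursively encode a query-plan tree into a vector of size `dim`.
        `node` is assumed to look like {"op": str, "stats": [...], "children": [...]};
        `embed_op` (an operator embedding) and `encode_node` (e.g., a small MLP that
        outputs a `dim`-sized vector) stand in for learned components."""
        child_vecs = [encode_plan(c, embed_op, encode_node, dim) for c in node.get("children", [])]
        pooled = np.mean(child_vecs, axis=0) if child_vecs else np.zeros(dim)
        features = np.concatenate([embed_op(node["op"]),
                                   np.asarray(node["stats"], dtype=float),
                                   pooled])
        return encode_node(features)

    # A latency-prediction head, for instance, would be a regressor on the root encoding:
    # predicted_latency = latency_head(encode_plan(plan_root, embed_op, encode_node, dim))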
Incorporating External Knowledge to Enhance Tabular Reasoning
Neeraja, J., Gupta, Vivek, Srikumar, Vivek
Reasoning about tabular information presents unique challenges to modern NLP approaches which largely rely on pre-trained contextualized embeddings of text. In this paper, we study these challenges through the problem of tabular natural language inference. We propose easy and effective modifications to how information is presented to a model for this task. We show via systematic experiments that these strategies substantially improve tabular inference performance.
BERT & Family Eat Word Salad: Experiments with Text Understanding
Gupta, Ashim, Kvernadze, Giorgi, Srikumar, Vivek
In this paper, we study the response of large models from the BERT family to incoherent inputs that should confuse any model that claims to understand natural language. We define simple heuristics to construct such examples. Our experiments show that state-of-the-art models consistently fail to recognize them as ill-formed, and instead produce high confidence predictions on them. Finally, we show that if models are explicitly trained to recognize invalid inputs, they can be robust to such attacks without a drop in performance.
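As an illustration of the kind of heuristic the abstract refers to, the sketch below builds an incoherent input by randomly permuting the words of a valid sentence; the exact heuristics used in the paper may differ (e.g., dropping or substituting words).

    import random

    def word_salad(sentence: str, seed: int = 0) -> str:
        """Construct an ill-formed input by randomly permuting the words of a valid
        sentence; a model that understands language should refuse to make a
        confident prediction on the result."""
        words = sentence.split()
        random.Random(seed).shuffle(words)
        return " ".join(words)

    # e.g. word_salad("the movie was surprisingly good") might yield
    # "good was the movie surprisingly", which a model should flag as invalid.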
OSCaR: Orthogonal Subspace Correction and Rectification of Biases in Word Embeddings
Dev, Sunipa, Li, Tao, Phillips, Jeff M, Srikumar, Vivek
Language representations are known to carry stereotypical biases and, as a result, lead to biased predictions in downstream tasks. While existing methods are effective at mitigating biases by linear projection, such methods are too aggressive: they not only remove bias, but also erase valuable information from word embeddings. We develop new measures for evaluating specific information retention that demonstrate the tradeoff between bias removal and information retention. To address this challenge, we propose OSCaR (Orthogonal Subspace Correction and Rectification), a bias-mitigating method that focuses on disentangling biased associations between concepts instead of removing concepts wholesale. Our experiments on gender biases show that OSCaR is a well-balanced approach that ensures that semantic information is retained in the embeddings and bias is also effectively mitigated.
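As a simplified illustration of correcting the geometry of two concept directions rather than deleting one of them, the sketch below applies a linear map that leaves one direction untouched, makes a second direction orthogonal to it, and acts as the identity outside their span. This is a shear for simplicity, not the paper's graded rotation, and all names are placeholders.

    import numpy as np

    def rectify(E, v1, v2):
        """Apply a linear map that leaves v1 fixed, sends v2 to its component
        orthogonal to v1, and acts as the identity outside span(v1, v2).
        E: (n, d) matrix of word embeddings; v1, v2: non-parallel (d,) directions."""
        u1 = v1 / np.linalg.norm(v1)
        v2_orth = v2 - (v2 @ u1) * u1              # part of v2 orthogonal to v1
        u2 = v2_orth / np.linalg.norm(v2_orth)
        a, b = v2 @ u1, v2 @ u2                    # coordinates of v2 in the (u1, u2) basis
        M = np.array([[1.0, -a / b],               # fixes the u1 axis,
                      [0.0,  1.0]])                # maps v2 onto the u2 axis
        B = np.stack([u1, u2])                     # (2, d) basis of the 2-D subspace
        coords = E @ B.T                           # in-subspace coordinates of each embedding
        return E - coords @ B + (coords @ M.T) @ B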
INFOTABS: Inference on Tables as Semi-structured Data
Gupta, Vivek, Mehta, Maitrey, Nokhiz, Pegah, Srikumar, Vivek
In this paper, we observe that semi-structured tabulated text is ubiquitous; understanding it requires not only comprehending the meaning of individual text fragments, but also the implicit relationships between them. We argue that such data can serve as a testing ground for understanding how we reason about information. To study this, we introduce a new dataset called INFOTABS, comprising human-written textual hypotheses based on premises that are tables extracted from Wikipedia info-boxes. Our analysis shows that the semi-structured, multi-domain and heterogeneous nature of the premises admits complex, multi-faceted reasoning. Experiments reveal that, while human annotators agree on the relationships between a table-hypothesis pair, several standard modeling strategies are unsuccessful at the task, suggesting that reasoning about tables can pose a difficult modeling challenge.
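For context on how such premises are typically fed to a textual inference model, the sketch below shows one simple linearization of an info-box into sentences; this template is an illustration, not the dataset's prescribed representation.

    def linearize_infobox(title: str, table: dict) -> str:
        """Render a semi-structured info-box as a textual premise by turning each
        key-value pair into a sentence."""
        return " ".join(f"The {key} of {title} is {value}." for key, value in table.items())

    # e.g. linearize_infobox("Inception", {"Director": "Christopher Nolan", "Runtime": "148 minutes"})
    # -> "The Director of Inception is Christopher Nolan. The Runtime of Inception is 148 minutes."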