Goto

Collaborating Authors

 Sen, Prithviraj


Learning variant product relationship and variation attributes from e-commerce website structures

arXiv.org Artificial Intelligence

We introduce VARM, a variant relationship matcher strategy, to identify pairs of variant products in e-commerce catalogs. Traditional definitions of entity resolution are concerned with whether product mentions refer to the same underlying product. However, this fails to capture product relationships that are critical for e-commerce applications, such as having similar, but not identical, products listed on the same webpage or sharing reviews. Here, we formulate a new type of entity resolution, variant product relationships, to capture these similar e-commerce product links. In contrast with the traditional definition, the new definition requires both identifying whether two products are variant matches of each other and which attributes vary between them. To satisfy these two requirements, we developed a strategy that leverages the strengths of both encoding and generative AI models. First, we construct a dataset that captures webpage product links, and therefore variant product relationships, to train an encoding LLM to predict variant matches for any given pair of products. Second, we use RAG-prompted generative LLMs to extract variation and common attributes amongst groups of variant products. To validate our strategy, we evaluated model performance using real data from one of the world's leading e-commerce retailers. The results showed that our strategy outperforms alternative solutions and paves the way to exploiting this new type of product relationship.
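A minimal sketch of the two-stage strategy described above, assuming a generic Hugging Face encoder fine-tuned as a pair classifier and a hand-written RAG prompt for the generative step; the model name, prompt wording, and label mapping are illustrative assumptions, not the authors' actual system.

```python
# Stage 1: an encoder LLM scores whether two product listings are variants.
# Stage 2: a RAG-style prompt asks a generative LLM for varying/common attributes.
# "bert-base-uncased" and the prompt template are placeholders (assumptions).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "bert-base-uncased"  # any pair-classification encoder could stand in here
tokenizer = AutoTokenizer.from_pretrained(MODEL)
matcher = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

def variant_match_prob(title_a: str, title_b: str) -> float:
    """Probability that two product listings are variant matches (after fine-tuning)."""
    inputs = tokenizer(title_a, title_b, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = matcher(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

def variation_prompt(title_a: str, title_b: str, retrieved_context: str) -> str:
    """Compose a RAG prompt for a generative LLM to list varying vs. common attributes."""
    return (
        "Context retrieved from the product pages:\n" + retrieved_context + "\n\n"
        f"Product A: {title_a}\nProduct B: {title_b}\n"
        "List the attributes these variant products have in common, and the "
        "attributes on which they differ, as two comma-separated lists."
    )

print(variant_match_prob("Acme Mug 12oz Blue", "Acme Mug 12oz Red"))
```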


Learning Symbolic Rules over Abstract Meaning Representations for Textual Reinforcement Learning

arXiv.org Artificial Intelligence

Text-based reinforcement learning agents have predominantly been neural network-based models with embeddings-based representation, learning uninterpretable policies that often do not generalize well to unseen games. On the other hand, neuro-symbolic methods, specifically those that leverage an intermediate formal representation, are gaining significant attention in language understanding tasks. This is because of their advantages, which include inherent interpretability, a smaller training-data requirement, and better generalization to scenarios with unseen data. Therefore, in this paper, we propose a modular, NEuro-Symbolic Textual Agent (NESTA) that combines a generic semantic parser with a rule induction system to learn abstract interpretable rules as policies. Our experiments on established text-based game benchmarks show that the proposed NESTA method outperforms deep reinforcement learning-based techniques by achieving better generalization to unseen test games and learning from fewer training interactions.
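A toy sketch, under assumed predicate and action names, of what a learned symbolic rule acting as a policy looks like once a semantic parser has turned the game observation into logical facts; it is not NESTA's rule induction procedure.

```python
# Rule-as-policy lookup: fire the first learned rule whose body predicates are
# all satisfied by the parsed facts. All predicate/action names are invented.
from typing import Set

RULES = [
    ({"carrying(knife)", "at(kitchen)", "exists(carrot)"}, "cut carrot with knife"),
    ({"at(kitchen)", "exists(fridge)", "closed(fridge)"}, "open fridge"),
]

def choose_action(facts: Set[str]) -> str:
    for body, action in RULES:
        if body <= facts:  # rule body is a subset of the current facts
            return action
    return "look"  # fallback exploratory action

facts = {"at(kitchen)", "exists(fridge)", "closed(fridge)"}
print(choose_action(facts))  # -> "open fridge"
```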


Are Human Explanations Always Helpful? Towards Objective Evaluation of Human Natural Language Explanations

arXiv.org Artificial Intelligence

Human-annotated labels and explanations are critical for training explainable NLP models. However, unlike human-annotated labels whose quality is easier to calibrate (e.g., with a majority vote), human-crafted free-form explanations can be quite subjective. Before blindly using them as ground truth to train ML models, a vital question needs to be asked: How do we evaluate a human-annotated explanation's quality? In this paper, we build on the view that the quality of a human-annotated explanation can be measured based on its helpfulness (or impairment) to the ML models' performance for the desired NLP tasks for which the annotations were collected. In comparison to the commonly used Simulatability score, we define a new metric that can take into consideration the helpfulness of an explanation for model performance at both fine-tuning and inference. With the help of a unified dataset format, we evaluated the proposed metric on five datasets (e.g., e-SNLI) against two model architectures (T5 and BART), and the results show that our proposed metric can objectively evaluate the quality of human-annotated explanations, while Simulatability falls short.
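The abstract does not spell out the metric's exact form; the snippet below is only one plausible instantiation of "helpfulness at both fine-tuning and inference": the accuracy gained (or lost) relative to a no-explanation baseline, averaged over the two stages.

```python
# Illustrative helpfulness score (an assumption, not the paper's exact formula):
# positive values suggest the explanations help; negative values suggest impairment.
def explanation_helpfulness(acc_base: float,
                            acc_with_expl_finetune: float,
                            acc_with_expl_inference: float) -> float:
    """acc_base: accuracy without explanations; the other two: accuracy when
    explanations are supplied during fine-tuning or at inference, respectively."""
    return ((acc_with_expl_finetune - acc_base)
            + (acc_with_expl_inference - acc_base)) / 2.0

print(explanation_helpfulness(0.80, 0.84, 0.82))  # -> 0.03
```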


A Closer Look at the Calibration of Differentially Private Learners

arXiv.org Artificial Intelligence

Modern deep learning models tend to memorize their training data in order to generalize better [1, 2], posing great privacy challenges in the form of training data leakage or membership inference attacks [3, 4, 5]. To address these concerns, differential privacy (DP) has become a popular paradigm for providing rigorous privacy guarantees when performing data analysis and statistical modeling based on private data. In practice, a commonly used DP algorithm to train machine learning (ML) models is DP-SGD [6]. The algorithm involves clipping per-example gradients and injecting noise into parameter updates during the optimization process. Although DP-SGD can give strong privacy guarantees, prior works have identified that this privacy comes at the cost of other aspects of trustworthy ML, such as degrading accuracy and causing disparate impact [2, 7, 8]. These tradeoffs pose a challenge for privacy-preserving ML, as they force practitioners to make difficult decisions on how to weigh privacy against other key aspects of trustworthiness. In this work, we expand the study of privacy-related tradeoffs by characterizing and proposing mitigations for the privacy-calibration tradeoff. This tradeoff is significant because assessing model uncertainty is important for deploying models in safety-critical scenarios like healthcare and law, where explainability [9] and risk control [10] are needed in addition to privacy [11].
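Since the passage describes the DP-SGD update (per-example gradient clipping plus noise injection), here is a minimal numpy sketch of one such step; a real implementation would also track the (epsilon, delta) budget with a privacy accountant, e.g. via a library such as Opacus.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0, sigma=1.0,
                rng=np.random.default_rng(0)):
    """One DP-SGD update: clip each per-example gradient to L2 norm clip_norm,
    sum, add Gaussian noise with std sigma * clip_norm, average, and step."""
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    noisy_sum = np.sum(clipped, axis=0) + rng.normal(
        0.0, sigma * clip_norm, size=params.shape)
    return params - lr * noisy_sum / len(per_example_grads)

params = np.zeros(3)
grads = [np.array([2.0, 0.0, 0.0]), np.array([0.0, 0.5, 0.0])]
print(dp_sgd_step(params, grads))
```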


Neuro-Symbolic Inductive Logic Programming with Logical Neural Networks

arXiv.org Artificial Intelligence

Inductive logic programming (ILP) (Muggleton 1996) has been of long-standing interest, where the goal is to learn logical rules from labeled data. Since rules are explicitly symbolic, they provide certain advantages over black-box models. For instance, learned rules can be inspected, understood and verified, forming a convenient means of storing learned knowledge. Consequently, a number of approaches have been proposed to address ILP including, but not limited to, statistical relational learning (Getoor and Taskar 2007) and, more recently, neuro-symbolic methods. We propose first-order extensions of LNNs that can tackle ILP. Since vanilla backpropagation is insufficient for constraint optimization, we propose flexible learning algorithms capable of handling a variety of (linear) inequality and equality constraints. We experiment with diverse benchmarks for ILP, including gridworld and knowledge base completion (KBC), that call for learning different kinds of rules, and show how our approach can tackle both effectively. In fact, our KBC results represent a 4-16% relative improvement.
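As a toy illustration of the real-valued logic such first-order LNN extensions operate over (not the constrained learning algorithm itself), a candidate rule body can be scored on continuous truth values with a Lukasiewicz conjunction:

```python
# Soft truth of a rule body like grandparent(X,Z) <- parent(X,Y) AND parent(Y,Z)
# under the Lukasiewicz t-norm; atom truth values here are made up.
def lukasiewicz_and(truths):
    return max(0.0, sum(truths) - (len(truths) - 1))

body_truths = [0.9, 0.8]             # soft truth of each ground body atom
print(lukasiewicz_and(body_truths))  # -> 0.7
```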


Combining Rules and Embeddings via Neuro-Symbolic AI for Knowledge Base Completion

arXiv.org Artificial Intelligence

Recent interest in Knowledge Base Completion (KBC) has led to a plethora of approaches based on reinforcement learning, inductive logic programming and graph embeddings. In particular, rule-based KBC has led to interpretable rules while being comparable in performance with graph embeddings. Even within rule-based KBC, there exist different approaches that lead to rules of varying quality, and previous work has not always been precise in highlighting these differences. Another issue that plagues most rule-based KBC is the non-uniformity of relation paths: some relation sequences occur in very few paths while others appear very frequently. In this paper, we show that not all rule-based KBC models are the same and propose two distinct approaches that learn, in one case, a mixture of relations and, in the other, a mixture of paths. When implemented on top of neuro-symbolic AI, which learns rules by extending Boolean logic to real-valued logic, the latter model leads to superior KBC accuracy, outperforming state-of-the-art rule-based KBC by 2-10% in terms of mean reciprocal rank. Furthermore, to address the non-uniformity of relation paths, we combine rule-based KBC with graph embeddings, thus improving our results even further and achieving the best of both worlds.
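For reference, the mean reciprocal rank (MRR) figure quoted above is computed as follows: for each test query, take the reciprocal of the rank at which the correct entity appears among the scored candidates, then average over queries.

```python
def mean_reciprocal_rank(ranks):
    """ranks: 1-based rank of the true answer for each test query."""
    return sum(1.0 / r for r in ranks) / len(ranks)

print(mean_reciprocal_rank([1, 2, 5]))  # (1 + 0.5 + 0.2) / 3 = 0.5667
```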


LNN-EL: A Neuro-Symbolic Approach to Short-text Entity Linking

arXiv.org Artificial Intelligence

Entity linking (EL), the task of disambiguating mentions in text by linking them to entities in a knowledge graph, is crucial for text understanding, question answering or conversational systems. Entity linking on short text (e.g., single sentence or question) poses particular challenges due to limited context. While prior approaches use either heuristics or black-box neural methods, here we propose LNN-EL, a neuro-symbolic approach that combines the advantages of using interpretable rules based on first-order logic with the performance of neural learning. Even though constrained to using rules, LNN-EL performs competitively against SotA black-box neural approaches, with the added benefits of extensibility and transferability. In particular, we show that we can easily blend existing rule templates given by a human expert, with multiple types of features (priors, BERT encodings, box embeddings, etc.), and even scores resulting from previous EL methods, thus improving on such methods. For instance, on the LC-QuAD-1.0 dataset, we show more than 4% increase in F1 score over previous SotA. Finally, we show that the inductive bias offered by using logic results in learned rules that transfer well across datasets, even without fine-tuning, while maintaining high accuracy.
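A hedged sketch of the general idea of blending interpretable per-candidate features under learned weights in a soft logic; the feature names, fixed weights, and the simple weighted-OR below are illustrative stand-ins, not LNN-EL's actual rule templates or learning procedure.

```python
# Score each candidate entity for a mention as a soft disjunction of weighted
# features; in LNN-EL the rule structure is given and the weights are learned.
def weighted_or(features, weights):
    score = 1.0
    for f, w in zip(features, weights):
        score *= (1.0 - w * f)          # soft OR: 1 - prod(1 - w*f)
    return 1.0 - score

candidates = {                           # [name_similarity, prior, context_score]
    "dbr:Paris":        [0.95, 0.90, 0.80],
    "dbr:Paris_Hilton": [0.70, 0.40, 0.10],
}
weights = [0.5, 0.3, 0.6]                # fixed here purely for illustration
print(max(candidates, key=lambda e: weighted_or(candidates[e], weights)))
```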


Deep Indexed Active Learning for Matching Heterogeneous Entity Representations

arXiv.org Artificial Intelligence

Given two large lists of records, the task in entity resolution (ER) is to find the pairs from the Cartesian product of the lists that correspond to the same real world entity. Typically, passive learning methods on tasks like ER require large amounts of labeled data to yield useful models. Active Learning is a promising approach for ER in low resource settings. However, the search space, to find informative samples for the user to label, grows quadratically for instance-pair tasks making active learning hard to scale. Previous works, in this setting, rely on hand-crafted predicates, pre-trained language model embeddings, or rule learning to prune away unlikely pairs from the Cartesian product. This blocking step can miss out on important regions in the product space leading to low recall. We propose DIAL, a scalable active learning approach that jointly learns embeddings to maximize recall for blocking and accuracy for matching blocked pairs. DIAL uses an Index-By-Committee framework, where each committee member learns representations based on powerful transformer models. We highlight surprising differences between the matcher and the blocker in the creation of the training data and the objective used to train their parameters. Experiments on five benchmark datasets and a multilingual record matching dataset show the effectiveness of our approach in terms of precision, recall and running time. Code is available at https://github.com/ArjitJ/DIAL
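A rough numpy sketch of the Index-By-Committee intuition, assuming random vectors in place of the transformer representations each committee member would actually learn: candidate pairs on which the members' match scores disagree most are the ones sent to the user for labeling.

```python
import numpy as np

rng = np.random.default_rng(0)
n_records, dim, committee_size = 100, 16, 3
# Placeholder embeddings; DIAL would learn these with transformer encoders.
embeddings = [rng.normal(size=(n_records, dim)) for _ in range(committee_size)]

def member_score(emb, i, j):
    """One committee member's match score for record pair (i, j): cosine similarity."""
    a, b = emb[i], emb[j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def disagreement(i, j):
    """Variance of the members' scores; high variance marks an informative pair."""
    return float(np.var([member_score(e, i, j) for e in embeddings]))

pairs = [(i, j) for i in range(10) for j in range(i + 1, 10)]  # small blocked set
print("most informative pair:", max(pairs, key=lambda p: disagreement(*p)))
```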


Logic Embeddings for Complex Query Answering

arXiv.org Artificial Intelligence

Answering logical queries over incomplete knowledge bases is challenging because: 1) it calls for implicit link prediction, and 2) brute-force answering of existential first-order logic queries is exponential in the number of existential variables. Recent work on query embeddings provides fast querying, but most approaches model set logic with closed regions, so they lack negation. Query embeddings that do support negation use densities that suffer drawbacks: 1) they only improvise logic, 2) they use expensive distributions, and 3) they poorly model answer uncertainty. In this paper, we propose Logic Embeddings, a new approach to embedding complex queries that uses Skolemisation to eliminate existential variables for efficient querying. It supports negation, but improves on density approaches: 1) it integrates well-studied t-norm logic and directly evaluates satisfiability, 2) it simplifies modeling with truth values, and 3) it models uncertainty with truth bounds. Logic Embeddings are competitively fast and accurate in query answering over large, incomplete knowledge graphs, outperform on negation queries, and in particular, provide improved modeling of answer uncertainty as evidenced by a superior correlation between answer set size and embedding entropy.
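A toy illustration of evaluating a query over truth bounds with a t-norm, the style of reasoning described above; the product t-norm and interval arithmetic here are simplifications chosen for brevity, not the paper's exact operators.

```python
# Truth values are (lower, upper) bounds in [0, 1]; wider intervals mean more
# answer uncertainty.
def t_and(a, b):
    """Product t-norm conjunction applied to both bounds."""
    return (a[0] * b[0], a[1] * b[1])

def t_not(a):
    """Negation flips and swaps the bounds."""
    return (1.0 - a[1], 1.0 - a[0])

link1 = (0.6, 0.9)                  # predicted truth bounds for one edge
link2 = (0.7, 0.8)                  # and for another
print(t_and(link1, t_not(link2)))   # bounds for "link1 AND NOT link2" -> (0.12, 0.27)
```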


A Survey of the State of Explainable AI for Natural Language Processing

arXiv.org Artificial Intelligence

Recent years have seen important advances in the quality of state-of-the-art models, but this has come at the expense of models becoming less interpretable. This survey presents an overview of the current state of Explainable AI (XAI), considered within the domain of Natural Language Processing (NLP). We discuss the main categorization of explanations, as well as the various ways explanations can be arrived at and visualized. We detail the operations and explainability techniques currently available for generating explanations for NLP model predictions, to serve as a resource for model developers in the community. Finally, we point out the current gaps and encourage directions for future work in this important research area.