Exploiting the Relationship Between Kendall's Rank Correlation and Cosine Similarity for Attribution Protection

Neural Information Processing Systems

Model attributions are important in deep neural networks as they aid practitioners in understanding the models, but recent studies reveal that attributions can be easily perturbed by adding imperceptible noise to the input. The non-differentiable Kendall's rank correlation is a key performance index for attribution protection. In this paper, we first show that the expected Kendall's rank correlation is positively correlated with cosine similarity and then show that the direction of attribution is the key to attribution robustness. Based on these findings, we explore the vector space of attribution to explain the shortcomings of attribution defense methods using $\ell_p$ norm and propose the integrated gradient regularizer (IGR), which maximizes the cosine similarity between natural and perturbed attributions. Our analysis further exposes that IGR encourages neurons with the same activation states for natural samples and the corresponding perturbed samples. Our experiments on different models and datasets confirm our analysis on attribution protection and demonstrate a decent improvement in adversarial robustness.
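The link between the two measures can be illustrated numerically: when a perturbation roughly preserves the direction of the attribution vector, both the non-differentiable rank correlation and the differentiable cosine similarity stay high. A minimal sketch (variable names and the perturbation scale are illustrative, not taken from the paper):

```python
import numpy as np
from scipy.stats import kendalltau

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
attr_natural = rng.normal(size=100)          # stand-in for a natural attribution

# A small perturbation that roughly preserves the attribution direction.
attr_perturbed = attr_natural + 0.1 * rng.normal(size=100)

tau, _ = kendalltau(attr_natural, attr_perturbed)
cos = cosine_similarity(attr_natural, attr_perturbed)
# Both measures remain high when direction is preserved, which is why a
# cosine-similarity objective can serve as a differentiable surrogate.
print(tau, cos)
```

Because cosine similarity is differentiable, it can be maximized directly during training, whereas Kendall's tau cannot.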


Improving Calibration through the Relationship with Adversarial Robustness

Neural Information Processing Systems

Neural networks lack adversarial robustness, i.e., they are vulnerable to adversarial examples in which small perturbations to inputs cause incorrect predictions. Further, trust is undermined when models give miscalibrated predictions, i.e., the predicted probability is not a good indicator of how much we should trust our model. In this paper, we study the connection between adversarial robustness and calibration and find that the inputs for which the model is sensitive to small perturbations (i.e., are easily attacked) are more likely to have poorly calibrated predictions. Based on this insight, we examine whether calibration can be improved by addressing those adversarially unrobust inputs. To this end, we propose Adversarial Robustness based Adaptive Label Smoothing (AR-AdaLS), which integrates the correlation between adversarial robustness and calibration into training by adaptively softening the label for an example based on how easily it can be attacked by an adversary. We find that our method, by taking the adversarial robustness of the in-distribution data into consideration, leads to better model calibration even under distributional shifts. In addition, AR-AdaLS can also be applied to an ensemble model to further improve model calibration.
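The core idea of per-example softening can be sketched in a few lines. Note this is an illustrative assumption about the mechanism, not the paper's exact AR-AdaLS formulation: here the smoothing strength scales linearly with a per-example attack-success proxy, with `max_eps` a hypothetical cap.

```python
import numpy as np

def adaptive_smooth_labels(one_hot, attack_success_rate, max_eps=0.2):
    """Soften one-hot labels per example.

    attack_success_rate: values in [0, 1]; higher means the example is
    more easily attacked, so its label is softened more.
    """
    n_classes = one_hot.shape[1]
    eps = max_eps * attack_success_rate[:, None]       # per-example epsilon
    # Standard label-smoothing form, but with a per-example epsilon.
    return one_hot * (1.0 - eps) + eps / n_classes

labels = np.eye(3)[[0, 1, 2]]                          # three one-hot labels
rate = np.array([0.0, 0.5, 1.0])                       # robustness proxy per example
smoothed = adaptive_smooth_labels(labels, rate)
print(smoothed)
```

A robust example (rate 0.0) keeps its hard label, while an easily attacked example (rate 1.0) receives the most softening; each row still sums to one, so the result remains a valid target distribution.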


Unsupervised Topic Models are Data Mixers for Pre-training Language Models

Peng, Jiahui, Zhuang, Xinlin, Jiantao, Qiu, Ma, Ren, Yu, Jing, Bai, Tianyi, He, Conghui

arXiv.org Artificial Intelligence

The performance of large language models (LLMs) is significantly affected by the quality and composition of their pre-training data, which is inherently diverse, spanning various domains, sources, and topics. Effectively integrating these heterogeneous data sources is crucial for optimizing LLM performance. Previous research has predominantly concentrated on domain-based data mixing, often neglecting the nuanced topic-level characteristics of the data. To address this gap, we propose a simple yet effective topic-based data mixing strategy that utilizes fine-grained topics generated through our topic modeling method, DataWeave. DataWeave employs a multi-stage clustering process to group semantically similar documents and utilizes LLMs to generate detailed topics, thereby facilitating a more nuanced understanding of dataset composition. Our strategy employs heuristic methods to upsample or downsample specific topics, which significantly enhances LLM performance on downstream tasks, achieving superior results compared to previous, more complex data mixing approaches. Furthermore, we confirm that the topics Science and Relationships are particularly effective, yielding the most substantial performance improvements. We will make our code and datasets publicly available.
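The up/down-sampling step the abstract describes can be sketched as a simple reweighting of a topic-labeled corpus. This is a minimal illustration; DataWeave's actual clustering pipeline and mixing heuristics are in the paper, and the function name and weight scheme here are assumptions:

```python
import random

def resample_by_topic(docs, topic_weights, seed=0):
    """docs: list of (topic, text); topic_weights: topic -> sampling factor.

    A factor of 2.0 duplicates each document of that topic twice on
    average; 0.5 keeps it with probability one half; unlisted topics
    default to 1.0 (kept as-is).
    """
    rng = random.Random(seed)
    out = []
    for topic, text in docs:
        w = topic_weights.get(topic, 1.0)
        copies = int(w)                      # whole duplications
        if rng.random() < w - copies:        # fractional part, stochastically
            copies += 1
        out.extend([(topic, text)] * copies)
    return out

corpus = [("Science", "doc a"), ("Relationships", "doc b"), ("Other", "doc c")]
mixed = resample_by_topic(corpus, {"Science": 2.0, "Other": 0.0})
print(mixed)
```

Here the "Science" documents are upsampled, "Other" is dropped, and unlisted topics pass through unchanged, mirroring the upsample/downsample strategy at topic granularity.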


Reviews: Extracting Relationships by Multi-Domain Matching

Neural Information Processing Systems

Title: Extracting Relationships by Multi-Domain Matching

Summary: Assuming that a corpus is compiled from many sources belonging to different domains, of which only a strict subset is suitable for learning to predict in a target domain, this paper proposes a novel approach, called Multiple Domain Matching Network (MDMN), that aims at learning which domains share strong statistical relationships and which source domains best support learning the target-domain prediction task. While many approaches to multiple-domain adaptation aim to match the feature-space distribution of *every* source domain to that of the target domain, this paper suggests matching distributions not only between sources and target but also *within* source domains. The latter allows for identifying subsets of source domains that share a strong statistical relationship.

Strengths: The paper provides a theoretical analysis that yields a tighter bound on the weighted multi-source discrepancy.

Weaknesses: The tighter bound on the multi-source discrepancy depends on the assumption that source domains that are less relevant for the target domain have lower weights.


QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback-based Self-Correction

Huang, Xiang, Cheng, Sitao, Huang, Shanshan, Shen, Jiayu, Xu, Yong, Zhang, Chaoyun, Qu, Yuzhong

arXiv.org Artificial Intelligence

Employing Large Language Models (LLMs) for semantic parsing has achieved remarkable success. However, we find existing methods fall short in terms of reliability and efficiency when hallucinations are encountered. In this paper, we address these challenges with a framework called QueryAgent, which solves a question step-by-step and performs step-wise self-correction. We introduce an environmental feedback-based self-correction method called ERASER. Unlike traditional approaches, ERASER leverages rich environmental feedback in the intermediate steps to perform selective and differentiated self-correction only when necessary. Experimental results demonstrate that QueryAgent notably outperforms all previous few-shot methods using only one example on GrailQA and GraphQ by 7.0 and 15.0 F1. Moreover, our approach exhibits superiority in terms of efficiency, including runtime, query overhead, and API invocation costs. By leveraging ERASER, we further improve another baseline (i.e., AgentBench) by approximately 10 points, revealing the strong transferability of our approach.
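The step-wise, feedback-driven correction loop described above can be sketched at a high level: a step is re-attempted only when executing it against the environment produces an error, rather than correcting every step unconditionally. Function names, the error type, and the retry policy below are illustrative assumptions, not ERASER's actual design:

```python
def solve_stepwise(steps, execute, correct, max_retries=2):
    """steps: list of candidate actions; execute: runs an action against
    the environment and raises on failure; correct: maps (action, error
    message) to a revised action."""
    results = []
    for step in steps:
        attempt, action = 0, step
        while True:
            try:
                results.append(execute(action))     # success: move on
                break
            except ValueError as err:               # environmental feedback
                if attempt >= max_retries:
                    raise                           # give up on this step
                action = correct(action, str(err))  # selective self-correction
                attempt += 1
    return results

# Toy environment: only even numbers are "valid queries".
def execute(x):
    if x % 2:
        raise ValueError("odd input")
    return x * 10

out = solve_stepwise([2, 3], execute, correct=lambda a, _: a + 1)
print(out)  # the failing step 3 is corrected to 4 and then succeeds
```

The point of the selective design is efficiency: correction work (and, in the LLM setting, extra API calls) is spent only on steps where the environment actually signals a problem.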


Banking's One-to-One Future is Finally Possible

#artificialintelligence

Almost a quarter century ago, a book was written about how organizations would focus on share of customer as opposed to share of market, building personalized collaboration driven by big data. With advanced analytics, banking may finally be getting close to realizing this vision. In 1993, a then-revolutionary book, "The One to One Future: Building Relationships One Customer at a Time," was published, proposing the idea that as technology makes it affordable to track individual customers, marketing shifts from finding customers for products to finding products for customers. According to the authors, Don Peppers and Martha Rogers, Ph.D., a company could use technology to gather information about, and to communicate directly with, individuals to form a commercial bond. The book became a bestseller, and was on every marketer's bookshelf … almost a quarter century ago.


Toward Automated Discovery in the Biological Sciences

AI Magazine

Knowledge discovery programs in the biological sciences require flexibility in the use of symbolic data and semantic information. Because of the volume of nonnumeric, as well as numeric, data, the programs must be able to explore a large space of possibly interesting relationships to discover those that are novel and interesting. Thus, the framework for the discovery program must facilitate proposing and selecting the next task to perform and performing the selected tasks. The framework we describe, called the agenda- and justification-based framework, has several properties that are desirable in semiautonomous discovery systems: it provides a mechanism for estimating the plausibility of tasks, it uses heuristics to propose and perform tasks, and it facilitates the encoding of general discovery strategies and the use of background knowledge. The complexity of the data and the underlying mechanisms argue for providing computer assistance to biologists.


Qualitative Spatial Reasoning about Sketch Maps

AI Magazine

Sketch maps are an important spatial representation used in many geospatial-reasoning tasks. This article describes techniques we have developed that enable software to perform humanlike reasoning about sketch maps. We illustrate the utility of these techniques in the context of nuSketch Battlespace, a research system that has been successfully used in a variety of experiments. After an overview of the nuSketch approach and nuSketch Battlespace, we outline the representations of glyphs and sketches and the nuSketch spatial reasoning architecture. We describe the use of qualitative topology and Voronoi diagrams to construct spatial representations, and explain how these facilities are combined with analogical reasoning to provide a simple form of enemy intent hypothesis generation.


A Visual Qualitative Modeling Environment for Middle-School Students

AI Magazine

Learning how to create, test, and revise models is a central skill in scientific reasoning. We argue that qualitative modeling provides an appropriate level of representation for helping middle-school students learn to become modelers. We describe Vmodel, a system we have created that uses visual representations and that enables middle-school students to create qualitative models. Software coaches use simple analyses of model structure plus qualitative simulation to provide feedback and explanations. This system has been used in several studies in Chicago public school classrooms, using curricula developed in collaboration with teachers.


Meaning and Links

AI Magazine

This article presents some fundamental ideas about representing knowledge and dealing with meaning in computer representations. I will describe the issues as I currently understand them and describe how they came about, how they fit together, what problems they solve, and some of the things that the resulting framework can do. The ideas apply not just to graph-structured "node-and-link" representations, sometimes called semantic networks, but also to representations referred to variously as frames with slots, entities with relationships, objects with attributes, tables with columns, and records with fields and to the classes and variables of object-oriented data structures. I will start by describing some background experiences and thoughts that preceded the writing of my 1975 paper, "What's in a Link," which introduced many of these issues. After that, I will present some of the key ideas from that paper with a discussion of how some of those ideas have matured since then.