AITopics | Wei, Dennis

Collaborating Authors

Wei, Dennis

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Agentic AI Needs a Systems Theory

Miehling, Erik, Ramamurthy, Karthikeyan Natesan, Varshney, Kush R., Riemer, Matthew, Bouneffouf, Djallel, Richards, John T., Dhurandhar, Amit, Daly, Elizabeth M., Hind, Michael, Sattigeri, Prasanna, Wei, Dennis, Rawat, Ambrish, Gajcin, Jasmina, Geyer, Werner

arXiv.org Artificial IntelligenceFeb-28-2025

The endowment of AI with reasoning capabilities and some degree of agency is widely viewed as a path toward more capable and generalizable systems. Our position is that the current development of agentic AI requires a more holistic, systems-theoretic perspective in order to fully understand their capabilities and mitigate any emergent risks. The primary motivation for our position is that AI development is currently overly focused on individual model capabilities, often ignoring broader emergent behavior, leading to a significant underestimation in the true capabilities and associated risks of agentic AI. We describe some fundamental mechanisms by which advanced capabilities can emerge from (comparably simpler) agents simply due to their interaction with the environment and other agents. Informed by an extensive amount of existing literature from various fields, we outline mechanisms for enhanced agent cognition, emergent causal reasoning ability, and metacognitive awareness. We conclude by presenting some key open challenges and guidance for the development of agentic AI. We emphasize that a systems-level perspective is essential for better understanding, and purposefully shaping, agentic AI systems.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.00237

Country:

North America > United States > California (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)

Add feedback

Final-Model-Only Data Attribution with a Unifying View of Gradient-Based Methods

Wei, Dennis, Padhi, Inkit, Ghosh, Soumya, Dhurandhar, Amit, Ramamurthy, Karthikeyan Natesan, Chang, Maria

arXiv.org Machine LearningDec-5-2024

Training data attribution (TDA) is the task of attributing model behavior to elements in the training data. This paper draws attention to the common setting where one has access only to the final trained model, and not the training algorithm or intermediate information from training. To serve as a gold standard for TDA in this "final-model-only" setting, we propose further training, with appropriate adjustment and averaging, to measure the sensitivity of the given model to training instances. We then unify existing gradient-based methods for TDA by showing that they all approximate the further training gold standard in different ways. We investigate empirically the quality of these gradient-based approximations to further training, for tabular, image, and text datasets and models. We find that the approximation quality of first-order methods is sometimes high but decays with the amount of further training. In contrast, the approximations given by influence function methods are more stable but surprisingly lower in quality.

artificial intelligence, cosine similarity, machine learning, (16 more...)

arXiv.org Machine Learning

2412.03906

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.85)

Add feedback

Identifying Sub-networks in Neural Networks via Functionally Similar Representations

Gao, Tian, Dhurandhar, Amit, Ramamurthy, Karthikeyan Natesan, Wei, Dennis

arXiv.org Artificial IntelligenceOct-21-2024

Mechanistic interpretability aims to provide human-understandable insights into the inner workings of neural network models by examining their internals. Existing approaches typically require significant manual effort and prior knowledge, with strategies tailored to specific tasks. In this work, we take a step toward automating the understanding of the network by investigating the existence of distinct sub-networks. Specifically, we explore a novel automated and task-agnostic approach based on the notion of functionally similar representations within neural networks, reducing the need for human intervention. Our method identifies similar and dissimilar layers in the network, revealing potential sub-components. We achieve this by proposing, for the first time to our knowledge, the use of Gromov-Wasserstein distance, which overcomes challenges posed by varying distributions and dimensionalities across intermediate representations--issues that complicate direct layer-to-layer comparisons. Through experiments on algebraic and language tasks, we observe the emergence of sub-groups within neural network layers corresponding to functional abstractions. Additionally, we find that different training strategies influence the positioning of these sub-groups. Our approach offers meaningful insights into the behavior of neural networks with minimal human and computational cost.

artificial intelligence, machine learning, representation, (18 more...)

arXiv.org Artificial Intelligence

2410.16484

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs

Kadhe, Swanand Ravindra, Ahmed, Farhan, Wei, Dennis, Baracaldo, Nathalie, Padhi, Inkit

arXiv.org Artificial IntelligenceJun-17-2024

Large language models (LLMs) have shown to pose social and ethical risks such as generating toxic language or facilitating malicious use of hazardous knowledge. Machine unlearning is a promising approach to improve LLM safety by directly removing harmful behaviors and knowledge. In this paper, we propose "SPlit, UNlearn, MerGE" (SPUNGE), a framework that can be used with any unlearning method to amplify its effectiveness. SPUNGE leverages data attributes during unlearning by splitting unlearning data into subsets based on specific attribute values, unlearning each subset separately, and merging the unlearned models. We empirically demonstrate that SPUNGE significantly improves the performance of two recent unlearning methods on state-of-the-art LLMs while maintaining their general capabilities on standard academic benchmarks.

large language model, natural language, toxicity, (18 more...)

arXiv.org Artificial Intelligence

2406.1178

Country:

North America > Canada (0.14)
North America > United States > Minnesota (0.14)
Europe > Italy (0.14)
Asia > China (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations

Achintalwar, Swapnaja, Garcia, Adriana Alvarado, Anaby-Tavor, Ateret, Baldini, Ioana, Berger, Sara E., Bhattacharjee, Bishwaranjan, Bouneffouf, Djallel, Chaudhury, Subhajit, Chen, Pin-Yu, Chiazor, Lamogha, Daly, Elizabeth M., DB, Kirushikesh, de Paula, Rogério Abreu, Dognin, Pierre, Farchi, Eitan, Ghosh, Soumya, Hind, Michael, Horesh, Raya, Kour, George, Lee, Ja Young, Madaan, Nishtha, Mehta, Sameep, Miehling, Erik, Murugesan, Keerthiram, Nagireddy, Manish, Padhi, Inkit, Piorkowski, David, Rawat, Ambrish, Raz, Orna, Sattigeri, Prasanna, Strobelt, Hendrik, Swaminathan, Sarathkrishna, Tillmann, Christoph, Trivedi, Aashka, Varshney, Kush R., Wei, Dennis, Witherspooon, Shalisha, Zalmanovici, Marcel

arXiv.org Artificial IntelligenceJun-13-2024

Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output to biased and toxic generations. Due to several limiting factors surrounding LLMs (training cost, API access, data availability, etc.), it may not always be feasible to impose direct safety constraints on a deployed model. Therefore, an efficient and reliable alternative is required. To this end, we present our ongoing efforts to create and deploy a library of detectors: compact and easy-to-build classification models that provide labels for various harms. In addition to the detectors themselves, we discuss a wide range of uses for these detector models - from acting as guardrails to enabling effective AI governance. We also deep dive into inherent challenges in their development and discuss future work aimed at making the detectors more reliable and broadening their scope.

detector, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2403.06009

Country:

Asia (1.00)
North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(2 more...)

Genre:

Overview (0.46)
Research Report (0.40)

Industry:

Information Technology (0.69)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Interventional Causal Discovery in a Mixture of DAGs

Varıcı, Burak, Katz-Rogozhnikov, Dmitriy, Wei, Dennis, Sattigeri, Prasanna, Tajer, Ali

arXiv.org Machine LearningJun-12-2024

Causal interactions among a group of variables are often modeled by a single causal graph. In some domains, however, these interactions are best described by multiple co-existing causal graphs, e.g., in dynamical systems or genomics. This paper addresses the hitherto unknown role of interventions in learning causal interactions among variables governed by a mixture of causal systems, each modeled by one directed acyclic graph (DAG). Causal discovery from mixtures is fundamentally more challenging than single-DAG causal discovery. Two major difficulties stem from (i) inherent uncertainty about the skeletons of the component DAGs that constitute the mixture and (ii) possibly cyclic relationships across these component DAGs. This paper addresses these challenges and aims to identify edges that exist in at least one component DAG of the mixture, referred to as true edges. First, it establishes matching necessary and sufficient conditions on the size of interventions required to identify the true edges. Next, guided by the necessity results, an adaptive algorithm is designed that learns all true edges using ${\cal O}(n^2)$ interventions, where $n$ is the number of nodes. Remarkably, the size of the interventions is optimal if the underlying mixture model does not contain cycles across its components. More generally, the gap between the intervention size used by the algorithm and the optimal size is quantified. It is shown to be bounded by the cyclic complexity number of the mixture model, defined as the size of the minimal intervention that can break the cycles in the mixture, which is upper bounded by the number of cycles among the ancestors of a node.

artificial intelligence, intervention, machine learning, (15 more...)

arXiv.org Machine Learning

2406.08666

Country:

North America > United States > Maryland (0.14)
North America > United States > Hawaii (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.66)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

Trust Regions for Explanations via Black-Box Probabilistic Certification

Dhurandhar, Amit, Haldar, Swagatam, Wei, Dennis, Ramamurthy, Karthikeyan Natesan

arXiv.org Artificial IntelligenceJun-5-2024

Given the black box nature of machine learning models, a plethora of explainability methods have been developed to decipher the factors behind individual decisions. In this paper, we introduce a novel problem of black box (probabilistic) explanation certification. We ask the question: Given a black box model with only query access, an explanation for an example and a quality metric (viz. fidelity, stability), can we find the largest hypercube (i.e., $\ell_{\infty}$ ball) centered at the example such that when the explanation is applied to all examples within the hypercube, (with high probability) a quality criterion is met (viz. fidelity greater than some value)? Being able to efficiently find such a \emph{trust region} has multiple benefits: i) insight into model behavior in a \emph{region}, with a \emph{guarantee}; ii) ascertained \emph{stability} of the explanation; iii) \emph{explanation reuse}, which can save time, energy and money by not having to find explanations for every example; and iv) a possible \emph{meta-metric} to compare explanation methods. Our contributions include formalizing this problem, proposing solutions, providing theoretical guarantees for these solutions that are computable, and experimentally showing their efficacy on synthetic and real data.

data mining, explanation, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2402.11168

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.81)

Industry:

Transportation > Air (1.00)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Data Science > Data Mining (0.92)

Add feedback

Facilitating Human-LLM Collaboration through Factuality Scores and Source Attributions

Do, Hyo Jin, Ostrand, Rachel, Weisz, Justin D., Dugan, Casey, Sattigeri, Prasanna, Wei, Dennis, Murugesan, Keerthiram, Geyer, Werner

arXiv.org Artificial IntelligenceMay-30-2024

While humans increasingly rely on large language models (LLMs), they are susceptible to generating inaccurate or false information, also known as "hallucinations". Technical advancements have been made in algorithms that detect hallucinated content by assessing the factuality of the model's responses and attributing sections of those responses to specific source documents. However, there is limited research on how to effectively communicate this information to users in ways that will help them appropriately calibrate their trust toward LLMs. To address this issue, we conducted a scenario-based study (N=104) to systematically compare the impact of various design strategies for communicating factuality and source attribution on participants' ratings of trust, preferences, and ease in validating response accuracy. Our findings reveal that participants preferred a design in which phrases within a response were color-coded based on the computed factuality scores. Additionally, participants increased their trust ratings when relevant sections of the source material were highlighted or responses were annotated with reference numbers corresponding to those sources, compared to when they received no annotation in the source material. Our study offers practical design guidelines to facilitate human-LLM collaboration and it promotes a new human role to carefully evaluate and take responsibility for their use of LLM outputs.

artificial intelligence, large language model, natural language, (13 more...)

arXiv.org Artificial Intelligence

2405.20434

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.93)

Industry: Media > News (0.93)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Selective Explanations

Paes, Lucas Monteiro, Wei, Dennis, Calmon, Flavio P.

arXiv.org Artificial IntelligenceMay-29-2024

Feature attribution methods explain black-box machine learning (ML) models by assigning importance scores to input features. These methods can be computationally expensive for large ML models. To address this challenge, there has been increasing efforts to develop amortized explainers, where a machine learning model is trained to predict feature attribution scores with only one inference. Despite their efficiency, amortized explainers can produce inaccurate predictions and misleading explanations. In this paper, we propose selective explanations, a novel feature attribution method that (i) detects when amortized explainers generate low-quality explanations and (ii) improves these explanations using a technique called explanations with initial guess. Our selective explanation method allows practitioners to specify the fraction of samples that receive explanations with initial guess, offering a principled way to bridge the gap between amortized explainers and their high-quality counterparts.

artificial intelligence, explanation, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2405.19562

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

The RealHumanEval: Evaluating Large Language Models' Abilities to Support Programmers

Mozannar, Hussein, Chen, Valerie, Alsobay, Mohammed, Das, Subhro, Zhao, Sebastian, Wei, Dennis, Nagireddy, Manish, Sattigeri, Prasanna, Talwalkar, Ameet, Sontag, David

arXiv.org Artificial IntelligenceApr-3-2024

Evaluation of large language models (LLMs) for code has primarily relied on static benchmarks, including HumanEval (Chen et al., 2021), which measure the ability of LLMs to generate complete code that passes unit tests. As LLMs are increasingly used as programmer assistants, we study whether gains on existing benchmarks translate to gains in programmer productivity when coding with LLMs, including time spent coding. In addition to static benchmarks, we investigate the utility of preference metrics that might be used as proxies to measure LLM helpfulness, such as code acceptance or copy rates. To do so, we introduce RealHumanEval, a web interface to measure the ability of LLMs to assist programmers, through either autocomplete or chat support. We conducted a user study (N=213) using RealHumanEval in which users interacted with six LLMs of varying base model performance. Despite static benchmarks not incorporating humans-in-the-loop, we find that improvements in benchmark performance lead to increased programmer productivity; however gaps in benchmark versus human performance are not proportional -- a trend that holds across both forms of LLM support. In contrast, we find that programmer preferences do not correlate with their actual performance, motivating the need for better, human-centric proxy signals. We also open-source RealHumanEval to enable human-centric evaluation of new models and the study data to facilitate efforts to improve code models.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2404.02806

Country: North America > United States > California (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

Add feedback