Collaborating Authors

 Luss, Ronny


Sparsity May Be All You Need: Sparse Random Parameter Adaptation

arXiv.org Artificial Intelligence

Full fine-tuning of large language models for alignment and task adaptation has become prohibitively expensive as models have grown in size. Parameter-Efficient Fine-Tuning (PEFT) methods aim to significantly reduce the computational and memory resources needed for fine-tuning these models by training only a small number of parameters instead of all model parameters. Currently, the most popular PEFT method is Low-Rank Adaptation (LoRA), which freezes the parameters of the model to be fine-tuned and introduces a small set of trainable parameters in the form of low-rank matrices. We propose simply reducing the number of trainable parameters by randomly selecting a small proportion of the model parameters to train on. In this paper, we compare the efficiency and performance of our proposed approach with PEFT methods, including LoRA, as well as with full-parameter fine-tuning.
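The core recipe, freezing the model and letting only a small random subset of its existing weights receive gradient updates, can be sketched in a few lines of PyTorch. The masking scheme below (a per-tensor random mask applied through a gradient hook) is an illustrative assumption rather than the paper's implementation; `trainable_frac`, the toy model, and the training step are placeholders.

```python
import torch
import torch.nn as nn

def sparsify_trainable_params(model: nn.Module, trainable_frac: float = 0.001, seed: int = 0):
    """Freeze all weights except a random subset, selected per parameter tensor.

    A sketch of sparse random parameter adaptation: gradients for the
    non-selected coordinates are zeroed by a hook, so only the sampled
    coordinates are ever updated.
    """
    gen = torch.Generator().manual_seed(seed)
    masks = {}
    for name, p in model.named_parameters():
        # Sample a binary mask with roughly `trainable_frac` ones.
        mask = (torch.rand(p.shape, generator=gen) < trainable_frac).to(p.dtype)
        masks[name] = mask
        # Zero the gradient of frozen coordinates on every backward pass.
        p.register_hook(lambda grad, m=mask: grad * m)
    return masks

# Example: adapt a tiny model with roughly 0.1% of its parameters trainable.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 2))
masks = sparsify_trainable_params(model, trainable_frac=0.001)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(8, 128), torch.randint(0, 2, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
opt.step()  # only the masked coordinates receive non-zero gradients
```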


CELL your Model: Contrastive Explanation Methods for Large Language Models

arXiv.org Artificial Intelligence

The advent of black-box deep neural network classification models has sparked the need to explain their decisions. However, in the case of generative AI such as large language models (LLMs), there is no class prediction to explain. Rather, one can ask why an LLM output a particular response to a given prompt. In this paper, we answer this question by proposing, to the best of our knowledge, the first contrastive explanation methods requiring simply black-box/query access. Our explanations suggest that an LLM outputs a reply to a given prompt because, if the prompt had been slightly modified, the LLM would have given a different response that is either less preferable or contradicts the original response. The key insight is that contrastive explanations simply require a distance function that has meaning to the user and not necessarily a real-valued representation of a specific response (viz. class label). We offer two algorithms for finding contrastive explanations: i) a myopic algorithm, which, although effective in creating contrasts, requires many model calls, and ii) a budgeted algorithm, our main algorithmic contribution, which intelligently creates contrasts adhering to a query budget, necessary for longer contexts. We show the efficacy of these methods on diverse natural language tasks such as open-text generation, automated red teaming, and explaining conversational degradation.
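As a rough illustration of the myopic flavor under pure black-box access, the sketch below tries single-word deletions of the prompt and stops once the reply drifts past a distance threshold. The `query_llm` stub, the word-overlap distance, and the call budget are all assumptions standing in for the paper's distance functions and search procedure; the point is only the query-and-compare loop and why its model-call cost grows quickly.

```python
def query_llm(prompt: str) -> str:
    """Placeholder for black-box LLM access; swap in your API of choice."""
    raise NotImplementedError

def response_distance(a: str, b: str) -> float:
    """Toy distance: 1 minus Jaccard word overlap. The paper allows any
    user-meaningful distance (e.g. a preference or contradiction scorer)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return 1.0 - len(wa & wb) / max(len(wa | wb), 1)

def myopic_contrast(prompt: str, threshold: float = 0.5, max_calls: int = 50):
    """Try single-word deletions until the reply changes enough; each
    candidate edit costs one model call."""
    original = query_llm(prompt)
    words = prompt.split()
    calls = 1
    for i in range(len(words)):
        if calls >= max_calls:
            break
        edited = " ".join(words[:i] + words[i + 1:])
        reply = query_llm(edited)
        calls += 1
        if response_distance(original, reply) >= threshold:
            return {"contrastive_prompt": edited, "reply": reply, "calls": calls}
    return None
```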


NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models

arXiv.org Artificial Intelligence

Transformer-based Language Models have become ubiquitous in Natural Language Processing (NLP) due to their impressive performance on various tasks. However, expensive training as well as inference remains a significant impediment to their widespread applicability. While enforcing sparsity at various levels of the model architecture has found promise in addressing scaling and efficiency issues, there remains a disconnect between such approaches and how sparsity shapes the network topology. Inspired by brain neuronal networks, we explore sparsity approaches through the lens of network topology. Specifically, we exploit mechanisms seen in biological networks, such as preferential attachment and redundant synapse pruning, and show that principled, model-agnostic sparsity approaches are performant and efficient across diverse NLP tasks, spanning both classification (such as natural language inference) and generation (summarization, machine translation), even though optimizing performance is not our sole objective. NeuroPrune is competitive with (or sometimes superior to) baselines on performance and can be up to 10x faster in terms of training time for a given level of sparsity, while simultaneously exhibiting measurable improvements in inference time in many cases.
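To make the topology angle concrete, here is a loose, hypothetical analogue of preferential attachment applied to a single weight matrix: connections attached to neurons carrying more total weight mass are preferentially kept when pruning to a target sparsity. This is not the NeuroPrune algorithm, only a sketch of the kind of degree-aware scoring the abstract alludes to.

```python
import torch

def preferential_prune(weight: torch.Tensor, sparsity: float = 0.9) -> torch.Tensor:
    """Illustrative topology-aware pruning for one linear layer.

    Each connection is scored by its magnitude times the "degree" (L1 mass)
    of the two neurons it links, a loose analogue of preferential attachment:
    well-connected neurons keep their synapses preferentially. The
    lowest-scoring connections are zeroed until the target sparsity is met.
    """
    out_deg = weight.abs().sum(dim=1, keepdim=True)  # per output-neuron mass
    in_deg = weight.abs().sum(dim=0, keepdim=True)   # per input-neuron mass
    score = weight.abs() * out_deg * in_deg
    k = int(weight.numel() * sparsity)               # number of entries to drop
    threshold = score.flatten().kthvalue(k).values
    mask = (score > threshold).to(weight.dtype)
    return weight * mask

# Example: prune a random 256x512 weight matrix to roughly 90% sparsity.
w = torch.randn(256, 512)
w_sparse = preferential_prune(w, sparsity=0.9)
print(f"sparsity = {(w_sparse == 0).float().mean():.2%}")
```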


Multi-Level Explanations for Generative Language Models

arXiv.org Artificial Intelligence

Perturbation-based explanation methods such as LIME and SHAP are commonly applied to text classification. This work focuses on their extension to generative language models. To address the challenges of text as output and long text inputs, we propose a general framework called MExGen that can be instantiated with different attribution algorithms. To handle text output, we introduce the notion of scalarizers for mapping text to real numbers and investigate multiple possibilities. To handle long inputs, we take a multi-level approach, proceeding from coarser levels of granularity to finer ones, and focus on algorithms with linear scaling in model queries. We conduct a systematic evaluation, both automated and human, of perturbation-based attribution methods for summarization and context-grounded question answering. The results show that our framework can provide more locally faithful explanations of generated outputs.
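The two ingredients named above, a scalarizer that maps generated text to a real number and an attribution pass whose query cost scales linearly with the number of input units, can be illustrated as follows. The `generate` stub, the overlap-based scalarizer, and the leave-one-out attribution are simplifying assumptions; MExGen's actual scalarizers and attribution algorithms are more sophisticated.

```python
def generate(prompt: str) -> str:
    """Placeholder black-box generator; replace with your model call."""
    raise NotImplementedError

def scalarizer(reference: str, candidate: str) -> float:
    """Map a generated text to a real number via word overlap with a reference.

    MExGen's scalarizers can be richer (e.g. log-probabilities or semantic
    similarity); overlap is used here only to keep the sketch self-contained.
    """
    ref, cand = set(reference.lower().split()), set(candidate.lower().split())
    return len(ref & cand) / max(len(ref), 1)

def leave_one_out_attribution(units: list[str]) -> list[float]:
    """Coarse-level attribution with O(n) model queries.

    Each unit (e.g. a sentence of a long input) is dropped in turn and the
    scalarized change in the output is recorded as its importance. A second,
    finer pass over only the top-scoring units would mirror the multi-level idea.
    """
    full_output = generate(" ".join(units))
    scores = []
    for i in range(len(units)):
        reduced = " ".join(units[:i] + units[i + 1:])
        scores.append(1.0 - scalarizer(full_output, generate(reduced)))
    return scores
```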


Contextual Moral Value Alignment Through Context-Based Aggregation

arXiv.org Artificial Intelligence

Developing value-aligned AI agents is a complex undertaking and an ongoing challenge in the field of AI. Specifically within the domain of Large Language Models (LLMs), the capability to consolidate multiple independently trained dialogue agents, each aligned with a distinct moral value, into a unified system that can adapt to and be aligned with multiple moral values is of paramount importance. In this paper, we propose a system that performs contextual moral value alignment via context-based aggregation. Here, aggregation is defined as the process of integrating a subset of LLM responses that are best suited to respond to a user input, taking into account features extracted from the user's input. The proposed system shows better results in terms of alignment with human values than the state of the art.
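A minimal, purely illustrative sketch of context-based aggregation is shown below: several value-aligned agents answer a prompt, simple keyword cues score how relevant each moral value is to the input, and the best-matching responses are returned. The agents, cue lists, and selection rule are hypothetical placeholders for the learned components described in the paper.

```python
# Hypothetical value-aligned agents: each maps a prompt to a response.
AGENTS = {
    "care":     lambda prompt: f"[care-aligned reply to: {prompt}]",
    "fairness": lambda prompt: f"[fairness-aligned reply to: {prompt}]",
    "liberty":  lambda prompt: f"[liberty-aligned reply to: {prompt}]",
}

# Toy context features: keyword cues for which moral value a prompt touches.
CONTEXT_CUES = {
    "care":     {"harm", "hurt", "suffering", "help"},
    "fairness": {"fair", "equal", "cheat", "share"},
    "liberty":  {"freedom", "choice", "consent", "force"},
}

def contextual_aggregate(prompt: str) -> str:
    """Select the subset of agent responses best suited to the prompt's context.

    Each value is scored by how many of its cue words appear in the prompt,
    and the top-scoring agent(s) respond. The paper learns this mapping from
    input features rather than hard-coding keyword cues.
    """
    words = set(prompt.lower().split())
    scores = {v: len(words & cues) for v, cues in CONTEXT_CUES.items()}
    best = max(scores.values())
    chosen = [v for v, s in scores.items() if s == best and best > 0] or ["care"]
    return "\n".join(AGENTS[v](prompt) for v in chosen)

print(contextual_aggregate("Is it fair to force someone to share their data?"))
```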


Local Explanations for Reinforcement Learning

arXiv.org Artificial Intelligence

Many works in explainable AI have focused on explaining black-box classification models. Explaining deep reinforcement learning (RL) policies in a manner that could be understood by domain users has received much less attention. In this paper, we propose a novel perspective to understanding RL policies based on identifying important states from automatically learned meta-states. The key conceptual difference between our approach and many previous ones is that we form meta-states based on locality governed by the expert policy dynamics rather than based on similarity of actions, and that we do not assume any particular knowledge of the underlying topology of the state space. Theoretically, we show that our algorithm to find meta-states converges and that the objective that selects important states from each meta-state is submodular, leading to efficient, high-quality greedy selection. Experiments on four domains (four rooms, door-key, minipacman, and pong) and a carefully conducted user study illustrate that our perspective leads to better understanding of the policy. We conjecture that this is a result of our meta-states being more intuitive in that the corresponding important states are strong indicators of tractable intermediate goals that are easier for humans to interpret and follow.
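Because the selection step is submodular, a greedy sweep is the natural solver. The sketch below uses facility-location coverage over a state-similarity kernel as a stand-in objective (the paper's objective differs) and greedily picks k representative states within a single meta-state.

```python
import numpy as np

def greedy_important_states(similarity: np.ndarray, k: int) -> list[int]:
    """Greedy selection of k representative states within one meta-state.

    The objective here is facility-location coverage (each state is "covered"
    by its most similar selected state), a standard monotone submodular
    function, so greedy selection enjoys the usual (1 - 1/e) guarantee.
    """
    n = similarity.shape[0]
    selected: list[int] = []
    coverage = np.zeros(n)
    for _ in range(k):
        gains = np.array([
            np.maximum(coverage, similarity[:, j]).sum() - coverage.sum()
            for j in range(n)
        ])
        gains[selected] = -np.inf        # never re-pick an already selected state
        best = int(np.argmax(gains))
        selected.append(best)
        coverage = np.maximum(coverage, similarity[:, best])
    return selected

# Example: 30 states embedded in 2-D; similarity via a Gaussian kernel.
rng = np.random.default_rng(0)
states = rng.normal(size=(30, 2))
dists = np.linalg.norm(states[:, None] - states[None, :], axis=-1)
sim = np.exp(-dists ** 2)
print(greedy_important_states(sim, k=3))
```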


Auto-Transfer: Learning to Route Transferrable Representations

arXiv.org Artificial Intelligence

Knowledge transfer between heterogeneous source and target networks and tasks has received a lot of attention in recent times, as large amounts of quality labelled data can be difficult to obtain in many applications. Existing approaches typically constrain the target deep neural network (DNN) feature representations to be close to the source DNN's feature representations, which can be limiting. We, in this paper, propose a novel adversarial multi-armed bandit approach which automatically learns to route source representations to appropriate target representations, following which they are combined in meaningful ways to produce accurate target models. We see upwards of 5% accuracy improvements compared with state-of-the-art knowledge transfer methods on four benchmark (target) image datasets CUB200, Stanford Dogs, MIT67 and Stanford40, where the source dataset is ImageNet. We qualitatively analyze the goodness of our transfer scheme by showing individual examples of the important features our target network focuses on in different layers compared with the (closest) competitors. We also observe that our improvement over other methods is higher for smaller target datasets, making it an effective tool for small-data applications that may benefit from transfer learning.

Deep learning models have become increasingly good at learning from large amounts of labeled data. However, it is often difficult and expensive to collect a sufficient amount of labeled data for training a deep neural network (DNN). Transfer learning utilizes the knowledge from information-rich source tasks to learn a specific (often information-poor) target task. There are several ways to transfer knowledge from a source task to a target task (Pan & Yang, 2009), but the most widely used approach is fine-tuning (Sharif Razavian et al., 2014), where the target DNN being trained is initialized with the weights/representations of a source (often large) DNN (e.g. ResNet (He et al., 2016)) that has been pre-trained on a large dataset (e.g. ImageNet). In spite of its popularity, fine-tuning may not be ideal when the source and target tasks/networks are heterogeneous, i.e., have differing feature spaces or distributions (Ryu et al., 2020; Tsai et al., 2020).
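The routing idea can be illustrated with a standard adversarial-bandit learner such as EXP3, where each arm is a candidate source layer and the reward reflects how much the routed representation helped the target model (for example, a drop in validation loss). The class below is a generic EXP3 sketch under those assumptions, not the paper's algorithm or its combination scheme.

```python
import numpy as np

class Exp3Router:
    """Minimal EXP3 bandit that learns which source-layer representation to
    route to a given target layer. Arms are candidate source layers; rewards
    must lie in [0, 1]."""

    def __init__(self, n_arms: int, gamma: float = 0.1, seed: int = 0):
        self.weights = np.ones(n_arms)
        self.gamma = gamma
        self.rng = np.random.default_rng(seed)

    def probabilities(self) -> np.ndarray:
        w = self.weights / self.weights.sum()
        return (1 - self.gamma) * w + self.gamma / len(self.weights)

    def choose(self) -> int:
        return int(self.rng.choice(len(self.weights), p=self.probabilities()))

    def update(self, arm: int, reward: float) -> None:
        # Importance-weighted reward estimate, the standard EXP3 update.
        est = reward / self.probabilities()[arm]
        self.weights[arm] *= np.exp(self.gamma * est / len(self.weights))

# Example: 4 candidate source layers; arm 2 is (secretly) the most useful.
router = Exp3Router(n_arms=4)
for _ in range(500):
    arm = router.choose()
    reward = 1.0 if arm == 2 and np.random.rand() < 0.8 else 0.0
    router.update(arm, reward)
print(router.probabilities().round(2))  # mass concentrates on arm 2
```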


AI Explainability 360: Impact and Design

arXiv.org Artificial Intelligence

The increasing use of artificial intelligence (AI) systems in high stakes domains has been coupled with an increase in societal demands for these systems to provide explanations for their outputs. This societal demand has already resulted in new regulations requiring explanations (Goodman and Flaxman 2016; Wachter, Mittelstadt, and Floridi 2017; Selbst and Powles 2017; Pasternak 2019). Explanations can allow users to gain insight into the system's decision-making process, which is a key component in calibrating appropriate trust and confidence in AI systems (Doshi-Velez and Kim 2017).

We also introduced a taxonomy to navigate the space of explanation methods, not only the ten in the toolkit but also the broader literature on explainable AI. The taxonomy was intended to be usable by consumers with varied backgrounds to choose an appropriate explanation method for their application. AIX360 differs from other open source explainability toolkits (see Arya et al. (2020) for a list) in two main ways: 1) its support for a broad and diverse spectrum of explainability methods, implemented in a common architecture, and 2) its educational material as discussed below.


Let the CAT out of the bag: Contrastive Attributed explanations for Text

arXiv.org Artificial Intelligence

Contrastive explanations for understanding the behavior of black-box models have gained a lot of attention recently as they provide potential for recourse. In this paper, we propose a method, Contrastive Attributed explanations for Text (CAT), which provides contrastive explanations for natural language text data with a novel twist, as we build and exploit attribute classifiers leading to more semantically meaningful explanations. To ensure that our contrastive generated text has the fewest possible edits with respect to the original text, while also being fluent and close to a human-generated contrastive, we resort to a minimal perturbation approach regularized using a BERT language model and attribute classifiers trained on available attributes. We show through qualitative examples and a user study that our method not only conveys more insight because of these attributes, but also leads to better quality (contrastive) text. Moreover, quantitatively, we show that our method is more efficient than other state-of-the-art methods, while also scoring higher on benchmark metrics such as flip rate, (normalized) Levenshtein distance, fluency and content preservation.
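A hedged sketch of the kind of regularized objective described above is given below: a candidate edit is rewarded for flipping the black-box classifier, penalized per token edit, and rewarded for fluency and attribute-classifier confidence. The callables, weights, and crude edit count are assumptions supplied by the caller; in the paper the fluency term comes from a BERT language model and the attribute terms from trained attribute classifiers.

```python
from typing import Callable

def contrast_score(
    original: str,
    candidate: str,
    classifier: Callable[[str], int],      # black-box label predictor
    fluency: Callable[[str], float],        # e.g. a language-model score
    attribute_score: Callable[[str], float],# e.g. attribute-classifier confidence
    lam_edit: float = 0.1,
    lam_fluency: float = 1.0,
    lam_attr: float = 1.0,
) -> float:
    """Score a candidate contrastive edit of `original`: reward a label flip,
    penalize token edits, reward fluency and attribute confidence."""
    flipped = float(classifier(candidate) != classifier(original))
    orig_toks, cand_toks = original.split(), candidate.split()
    edits = sum(a != b for a, b in zip(orig_toks, cand_toks)) + abs(
        len(orig_toks) - len(cand_toks)
    )
    return (
        flipped
        - lam_edit * edits
        + lam_fluency * fluency(candidate)
        + lam_attr * attribute_score(candidate)
    )

# Toy usage with stub models (replace with a real classifier, LM, and attribute heads).
clf = lambda s: int("good" in s.lower())
flu = lambda s: 0.0
attr = lambda s: 0.0
print(contrast_score("the movie was good", "the movie was bad", clf, flu, attr))
```

Ranking candidate perturbations by such a score and returning the best label-flipping one is the simplest way to use it.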


Towards Better Model Understanding with Path-Sufficient Explanations

arXiv.org Artificial Intelligence

Feature-based local attribution methods are amongst the most prevalent in the explainable artificial intelligence (XAI) literature. Going beyond standard correlation, methods have recently been proposed that highlight what should be minimally sufficient to justify the classification of an input (viz. pertinent positives). While minimal sufficiency is an attractive property, the resulting explanations are often too sparse for a human to understand and evaluate the local behavior of the model, thus making it difficult to judge its overall quality. To overcome these limitations, we propose a novel method called the Path-Sufficient Explanations Method (PSEM) that outputs a sequence of sufficient explanations for a given input of strictly decreasing size (or value) -- from the original input to a minimally sufficient explanation -- which can be thought of as tracing the local boundary of the model in a smooth manner, thus providing better intuition about the local model behavior for the specific input. We validate these claims, both qualitatively and quantitatively, with experiments that show the benefit of PSEM across all three modalities (image, tabular and text). A user study depicts the strength of the method in communicating the local behavior, where (many) users are able to correctly determine the prediction made by a model.
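The notion of a path of progressively sparser, still-sufficient explanations can be imitated with a crude loop: repeatedly zero out the least important remaining features and keep a step only if the model's prediction is unchanged. The magnitude-based ordering and chunked schedule below are assumptions made for illustration, not the PSEM procedure.

```python
import numpy as np

def path_sufficient_explanation(x: np.ndarray, predict, steps: int = 5):
    """Produce a sequence of progressively sparser inputs that all preserve
    the model's original prediction (a toy stand-in for a sufficiency path)."""
    baseline = predict(x)
    path = [x.copy()]
    current = x.copy()
    order = np.argsort(np.abs(x))           # least important (by magnitude) first
    chunks = np.array_split(order, steps)
    for chunk in chunks:
        candidate = current.copy()
        candidate[chunk] = 0.0
        if predict(candidate) == baseline:   # still sufficient for the same label
            current = candidate
            path.append(current.copy())
    return path

# Toy example: a linear "model" over 10 features.
w = np.linspace(-1, 1, 10)
predict = lambda v: int(v @ w > 0)
x = np.random.default_rng(1).normal(size=10)
for step in path_sufficient_explanation(x, predict):
    print(np.count_nonzero(step), "features kept")
```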