AITopics | Sharma, Arnab Sen

Collaborating Authors

Sharma, Arnab Sen

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare

Ahsan, Hiba, Sharma, Arnab Sen, Amir, Silvio, Bau, David, Wallace, Byron C.

arXiv.org Artificial IntelligenceFeb-18-2025

We know from prior work that LLMs encode social biases, and that this manifests in clinical tasks. In this work we adopt tools from mechanistic interpretability to unveil sociodemographic representations and biases within LLMs in the context of healthcare. Specifically, we ask: Can we identify activations within LLMs that encode sociodemographic information (e.g., gender, race)? We find that gender information is highly localized in middle MLP layers and can be reliably manipulated at inference time via patching. Such interventions can surgically alter generated clinical vignettes for specific conditions, and also influence downstream clinical predictions which correlate with gender, e.g., patient risk of depression. We find that representation of patient race is somewhat more distributed, but can also be intervened upon, to a degree. To our knowledge, this is the first application of mechanistic interpretability methods to LLMs for healthcare.

artificial intelligence, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2502.13319

Country:

North America > United States > Minnesota (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Health Care Technology (0.93)
Health & Medicine > Therapeutic Area > Immunology (0.74)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Function Vectors in Large Language Models

Todd, Eric, Li, Millicent L., Sharma, Arnab Sen, Mueller, Aaron, Wallace, Byron C., Bau, David

arXiv.org Artificial IntelligenceOct-23-2023

We report the presence of a simple neural mechanism that represents an input-output function as a vector within autoregressive transformer language models (LMs). Using causal mediation analysis on a diverse range of in-context-learning (ICL) tasks, we find that a small number attention heads transport a compact representation of the demonstrated task, which we call a function vector (FV). FVs are robust to changes in context, i.e., they trigger execution of the task on inputs such as zero-shot and natural text settings that do not resemble the ICL contexts from which they are collected. We test FVs across a range of tasks, models, and layers and find strong causal effects across settings in middle layers. We investigate the internal structure of FVs and find while that they often contain information that encodes the output space of the function, this information alone is not sufficient to reconstruct an FV. Finally, we test semantic vector composition in FVs, and find that to some extent they can be summed to create vectors that trigger new complex tasks. Taken together, our findings suggest that LLMs contain internal abstractions of general-purpose functions that can be invoked in a variety of contexts.

artificial intelligence, function vector, large language model, (1 more...)

arXiv.org Artificial Intelligence

2310.15213

Genre: Research Report > New Finding (0.53)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Linearity of Relation Decoding in Transformer Language Models

Hernandez, Evan, Sharma, Arnab Sen, Haklay, Tal, Meng, Kevin, Wattenberg, Martin, Andreas, Jacob, Belinkov, Yonatan, Bau, David

arXiv.org Artificial IntelligenceAug-17-2023

Much of the knowledge encoded in transformer language models (LMs) may be expressed in terms of relations: relations between words and their synonyms, entities and their attributes, etc. We show that, for a subset of relations, this computation is well-approximated by a single linear transformation on the subject representation. Linear relation representations may be obtained by constructing a first-order approximation to the LM from a single prompt, and they exist for a variety of factual, commonsense, and linguistic relations. However, we also identify many cases in which LM predictions capture relational knowledge accurately, but this knowledge is not linearly encoded in their representations. Our results thus reveal a simple, interpretable, but heterogeneously deployed knowledge representation strategy in transformer LMs.

machine learning, natural language, relation, (18 more...)

arXiv.org Artificial Intelligence

2308.09124

Country:

Europe (0.46)
North America > United States > Massachusetts (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment > Sports (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Add feedback

Mass-Editing Memory in a Transformer

Meng, Kevin, Sharma, Arnab Sen, Andonian, Alex, Belinkov, Yonatan, Bau, David

arXiv.org Artificial IntelligenceAug-1-2023

Recent work has shown exciting promise in updating large language models with new memories, so as to replace obsolete information or add specialized knowledge. However, this line of work is predominantly limited to updating single associations. We develop MEMIT, a method for directly updating a language model with many memories, demonstrating experimentally that it can scale up to thousands of associations for GPT-J (6B) and GPT-NeoX (20B), exceeding prior work by orders of magnitude. Our code and data are at https://memit.baulab.info.

knowledge management, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2210.07229

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback