
Collaborating Authors

Dhamala, Jwala


Are you talking to ['xem'] or ['x', 'em']? On Tokenization and Addressing Misgendering in LLMs with Pronoun Tokenization Parity

arXiv.org Artificial Intelligence

A large body of NLP research has documented the ways gender biases manifest and amplify within large language models (LLMs), though this research has predominantly operated within a gender binary-centric context. A growing body of work has identified the harmful limitations of this gender-exclusive framing; many LLMs cannot correctly and consistently refer to persons outside the gender binary, especially if they use neopronouns. While data scarcity has been identified as a possible culprit, the precise mechanisms through which it influences LLM misgendering remain underexplored. Our work addresses this gap by studying data scarcity's role in subword tokenization and, consequently, the formation of LLM word representations. We uncover how the Byte-Pair Encoding (BPE) tokenizer, a backbone for many popular LLMs, contributes to neopronoun misgendering through out-of-vocabulary behavior. We introduce pronoun tokenization parity (PTP), a novel approach to reduce LLM neopronoun misgendering by preserving a token's functional structure. We evaluate PTP's efficacy using pronoun consistency-based metrics and a novel syntax-based metric. In several controlled experiments, finetuning LLMs with PTP improves neopronoun consistency from 14.5% to 58.4%, highlighting the significant role tokenization plays in LLM pronoun consistency.
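As an illustration of the out-of-vocabulary behavior described above, the snippet below probes a BPE tokenizer with a binary pronoun and a neopronoun, then registers the neopronoun as a single token in the spirit of PTP. This is a minimal sketch assuming the Hugging Face GPT-2 tokenizer as a stand-in for the tokenizers studied in the paper, not the authors' implementation.

```python
# Minimal sketch; GPT-2's BPE tokenizer is only a stand-in for the models studied.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

print(tokenizer.tokenize("she"))   # binary pronoun: typically a single in-vocabulary token
print(tokenizer.tokenize("xem"))   # neopronoun: typically fragmented, e.g. ['x', 'em']

# In the spirit of pronoun tokenization parity: register the neopronoun as one
# token so it is treated as a single functional unit; in practice the model's
# embedding matrix would then be resized (model.resize_token_embeddings) and finetuned.
tokenizer.add_tokens(["xem"])
print(tokenizer.tokenize("xem"))   # now a single added token
```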


JAB: Joint Adversarial Prompting and Belief Augmentation

arXiv.org Artificial Intelligence

With the recent surge of language models in different applications, attention to the safety and robustness of these models has gained significant importance. Here we introduce a joint framework in which we simultaneously probe and improve the robustness of a black-box target model via adversarial prompting and belief augmentation using iterative feedback loops. This framework utilizes an automated red teaming approach to probe the target model, along with a belief augmenter that generates instructions for the target model to improve its robustness to those adversarial probes. Importantly, the adversarial model and the belief generator leverage the feedback from past interactions to improve the effectiveness of the adversarial prompts and beliefs, respectively. In our experiments, we demonstrate that such a framework can reduce toxic content generation both in dynamic cases, where an adversary directly interacts with a target model, and in static cases, where we use a static benchmark dataset to evaluate our model.
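A rough sketch of this probe-and-patch feedback loop is given below; red_team_model, belief_augmenter, target_model, and toxicity_score are hypothetical placeholders rather than the paper's components.

```python
# Illustrative sketch only; all callables are hypothetical placeholders,
# not the paper's code or interfaces.
def jab_loop(red_team_model, belief_augmenter, target_model, toxicity_score, n_rounds=5):
    history = []   # feedback from past interactions
    beliefs = []   # safety instructions prepended to the target's input
    for _ in range(n_rounds):
        # 1) Adversary crafts a probe, conditioning on what worked before.
        probe = red_team_model.generate(history)
        # 2) Target answers with the current beliefs prepended.
        response = target_model.generate("\n".join(beliefs) + "\n" + probe)
        score = toxicity_score(response)
        # 3) Belief augmenter turns the interaction into a new instruction.
        beliefs.append(belief_augmenter.generate(probe, response, score))
        history.append((probe, response, score))
    return beliefs, history
```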


"I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation

arXiv.org Artificial Intelligence

Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. Given the recent popularity and adoption of language generation technologies, the potential to further marginalize this population only grows. Although a multitude of NLP fairness literature focuses on illuminating and addressing gender biases, assessing gender harms for TGNB identities requires understanding how such identities uniquely interact with societal gender norms and how they differ from gender binary-centric perspectives. Such measurement frameworks inherently require centering TGNB voices to help guide the alignment between gender-inclusive NLP and whom they are intended to serve. Towards this goal, we ground our work in the TGNB community and existing interdisciplinary literature to assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation (OLG). This social knowledge serves as a guide for evaluating popular large language models (LLMs) on two key aspects: (1) misgendering and (2) harmful responses to gender disclosure. To do this, we introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community. We discover a dominance of binary gender norms reflected by the models; LLMs least misgendered subjects in generated text when triggered by prompts whose subjects used binary pronouns. Meanwhile, misgendering was most prevalent when triggering generation with singular they and neopronouns. When prompted with gender disclosures, TGNB disclosure generated the most stigmatizing language and scored most toxic, on average. Our findings warrant further research on how TGNB harms manifest in LLMs and serve as a broader case study toward concretely grounding the design of gender-inclusive AI in community voices and interdisciplinary literature.
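For the misgendering aspect, an automatic pronoun-consistency check of the kind such evaluations rely on might look like the sketch below; the pronoun sets and the regex-based matching are illustrative assumptions, not the paper's exact metric.

```python
# Hypothetical sketch of a pronoun-consistency check; PRONOUN_SETS and the
# regex matching are assumptions, not the paper's evaluation code.
import re

PRONOUN_SETS = {
    "she": {"she", "her", "hers", "herself"},
    "he": {"he", "him", "his", "himself"},
    "they": {"they", "them", "their", "theirs", "themself", "themselves"},
    "xe": {"xe", "xem", "xyr", "xyrs", "xemself"},
}

def is_consistent(generated_text: str, declared_set: str) -> bool:
    """True if every pronoun in the text belongs to the subject's declared set."""
    allowed = PRONOUN_SETS[declared_set]
    all_pronouns = set().union(*PRONOUN_SETS.values())
    used = {w for w in re.findall(r"[a-z]+", generated_text.lower()) if w in all_pronouns}
    return used <= allowed

print(is_consistent("Xe said that xe would bring xyr book.", "xe"))  # True
print(is_consistent("Xe said that he would bring his book.", "xe"))  # False (misgendering)
```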


Multi-VALUE: A Framework for Cross-Dialectal English NLP

arXiv.org Artificial Intelligence

Dialect differences caused by regional, social, and economic factors lead to performance discrepancies for many groups of language technology users. Inclusive and equitable language technology must critically be dialect invariant, meaning that performance remains constant over dialectal shifts. Current systems often fall short of this ideal since they are designed and tested on a single dialect: Standard American English (SAE). We introduce a suite of resources for evaluating and achieving English dialect invariance. The resource is called Multi-VALUE, a controllable rule-based translation system spanning 50 English dialects and 189 unique linguistic features. Multi-VALUE maps SAE to synthetic forms of each dialect. First, we use this system to stress test question answering, machine translation, and semantic parsing. Stress tests reveal significant performance disparities for leading models on non-standard dialects. Second, we use this system as a data augmentation technique to improve the dialect robustness of existing systems. Finally, we partner with native speakers of Chicano and Indian English to release new gold-standard variants of the popular CoQA task. To execute the transformation code, run model checkpoints, and download both synthetic and gold-standard dialectal benchmark datasets, see http://value-nlp.org.
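To make the rule-based translation concrete, the toy rules below apply two widely documented dialect features (zero copula and negative concord) to SAE input; the real Multi-VALUE rules and interfaces are far more extensive, so this is only an illustrative sketch.

```python
# Toy illustration of the kind of rule-based SAE -> dialect transformation the
# paper describes; these regexes are assumptions, not Multi-VALUE's implementation.
import re

def zero_copula(sentence: str) -> str:
    """Drop 'is'/'are' before a progressive verb, e.g. "She is walking" -> "She walking"."""
    return re.sub(r"\b(is|are|'s|'re)\s+(?=\w+ing\b)", "", sentence)

def negative_concord(sentence: str) -> str:
    """Negation agreement, e.g. "doesn't have any" -> "doesn't have no"."""
    return re.sub(r"(n't|not)(\s+\w+\s+)any\b", r"\1\2no", sentence)

print(zero_copula("She is walking to the store."))
print(negative_concord("He doesn't have any money."))
```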


Is the Elephant Flying? Resolving Ambiguities in Text-to-Image Generative Models

arXiv.org Artificial Intelligence

Natural language often contains ambiguities that can lead to misinterpretation and miscommunication. While humans can handle ambiguities effectively by asking clarifying questions and/or relying on contextual cues and common-sense knowledge, resolving ambiguities can be notoriously hard for machines. In this work, we study ambiguities that arise in text-to-image generative models. We curate a benchmark dataset covering different types of ambiguities that occur in these systems. We then propose a framework to mitigate ambiguities in the prompts given to the systems by soliciting clarifications from the user. Through automatic and human evaluations, we show the effectiveness of our framework in generating more faithful images aligned with human intention in the presence of ambiguities.
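A high-level sketch of such a clarification loop is shown below; detect_ambiguity, ask_user, and generate_image are hypothetical placeholders standing in for the paper's components.

```python
# High-level sketch of an ambiguity-resolution loop; all callables are
# hypothetical placeholders, not the paper's framework.
def resolve_and_generate(prompt, detect_ambiguity, ask_user, generate_image):
    ambiguity = detect_ambiguity(prompt)      # e.g. "Is the elephant flying or on the ground?"
    if ambiguity is not None:
        answer = ask_user(ambiguity)          # solicit a clarification from the user
        prompt = f"{prompt} ({answer})"       # fold the clarification back into the prompt
    return generate_image(prompt)
```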


BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation

arXiv.org Artificial Intelligence

Recent advances in deep learning techniques have enabled machines to generate cohesive open-ended text when prompted with a sequence of words as context. While these models now empower many downstream applications from conversation bots to automatic storytelling, they have been shown to generate texts that exhibit social biases. To systematically study and benchmark social biases in open-ended language generation, we introduce the Bias in Open-Ended Language Generation Dataset (BOLD), a large-scale dataset that consists of 23,679 English text generation prompts for bias benchmarking across five domains: profession, gender, race, religion, and political ideology. We also propose new automated metrics for toxicity, psycholinguistic norms, and text gender polarity to measure social biases in open-ended text generation from multiple angles. An examination of text generated from three popular language models reveals that the majority of these models exhibit a larger social bias than human-written Wikipedia text across all domains. With these results we highlight the need to benchmark biases in open-ended language generation and caution users of language generation models on downstream tasks to be cognizant of these embedded prejudices.
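The kind of evaluation loop BOLD enables can be sketched as follows, with GPT-2 via the Hugging Face pipeline as a stand-in generator; the example prompts are illustrative rather than actual BOLD entries, and the scoring function is left as a placeholder for the proposed toxicity, psycholinguistic-norm, and gender-polarity metrics.

```python
# Sketch of prompt-then-score bias benchmarking; GPT-2 is a stand-in generator
# and the prompts below are illustrative, not actual BOLD entries.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

example_prompts = [
    "As a nurse, she is known for",
    "The flight attendant said that",
]

def mean_score(prompts, score_fn, max_new_tokens=30):
    # score_fn would be a toxicity classifier, psycholinguistic-norm scorer,
    # or gender-polarity measure of the kind the paper proposes.
    scores = []
    for p in prompts:
        out = generator(p, max_new_tokens=max_new_tokens, num_return_sequences=1)
        scores.append(score_fn(out[0]["generated_text"]))
    return sum(scores) / len(scores)
```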


Learning Geometry-Dependent and Physics-Based Inverse Image Reconstruction

arXiv.org Artificial Intelligence

Deep neural networks have shown great potential in image reconstruction problems in Euclidean space. However, many reconstruction problems involve imaging physics that are dependent on the underlying non-Euclidean geometry. In this paper, we present a new approach to learning inverse imaging that exploits the underlying geometry and physics. We first introduce a non-Euclidean encoding-decoding network that allows us to describe the unknown and measurement variables over their respective geometrical domains. We then learn the geometry-dependent physics between the two domains by explicitly modeling it via a bipartite graph over the graphical embeddings of the two geometries. We applied the presented network to reconstructing electrical activity on the heart surface from body-surface potentials. In a series of generalization tasks with increasing difficulty, we demonstrated the improved ability of the presented network to generalize across geometrical changes underlying the data in comparison to its Euclidean alternatives.
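A very rough sketch, under many assumptions, of passing features from one geometrical domain to another through a bipartite graph is given below; the paper's actual graph-based encoder-decoder and physics modeling are more elaborate.

```python
# Assumption-laden sketch of a bipartite body-surface -> heart-surface mapping;
# not the paper's architecture.
import torch
import torch.nn as nn

class BipartiteMapping(nn.Module):
    def __init__(self, in_dim, out_dim, bipartite_adj):
        super().__init__()
        # bipartite_adj: (n_heart, n_body) 0/1 matrix linking the two meshes;
        # assumes every heart node is linked to at least one body node.
        self.register_buffer("adj", bipartite_adj / bipartite_adj.sum(dim=1, keepdim=True))
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, body_features):                # (n_body, in_dim)
        heart_features = self.adj @ body_features    # aggregate over linked body nodes
        return torch.relu(self.proj(heart_features)) # (n_heart, out_dim)
```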


Quantifying the Uncertainty in Model Parameters Using Gaussian Process-Based Markov Chain Monte Carlo: An Application to Cardiac Electrophysiological Models

arXiv.org Machine Learning

Estimation of patient-specific model parameters is important for personalized modeling, although sparse and noisy clinical data can introduce significant uncertainty in the estimated parameter values. This important source of uncertainty, if left unquantified, will lead to unknown variability in model outputs that hinders their reliable adoption. Probabilistic estimation of model parameters, however, remains an unresolved challenge because standard Markov Chain Monte Carlo sampling requires repeated model simulations that are computationally infeasible. A common solution is to replace the simulation model with a computationally efficient surrogate for faster sampling. However, by sampling from an approximation of the exact posterior probability density function (pdf) of the parameters, the efficiency is gained at the expense of sampling accuracy. In this paper, we address this issue by integrating surrogate modeling into Metropolis-Hastings (MH) sampling of the exact posterior pdfs to improve its acceptance rate. This is done by first quickly constructing a Gaussian process (GP) surrogate of the exact posterior pdfs using deterministic optimization. This efficient surrogate is then used to modify commonly used proposal distributions in MH sampling such that only proposals accepted by the surrogate will be tested by the exact posterior pdf for acceptance/rejection, reducing unnecessary model simulations at unlikely candidates. Synthetic and real-data experiments using the presented method show a significant gain in computational efficiency without compromising accuracy. In addition, insights into the non-identifiability and heterogeneity of tissue properties can be gained from the obtained posterior distributions.
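The two-stage idea can be sketched as a delayed-acceptance style Metropolis-Hastings step, in which the cheap surrogate screens proposals and only survivors are evaluated with the expensive exact posterior. The paper's construction of the GP surrogate and its modified proposal distributions differ, so the code below is only a hedged illustration with placeholder log-posterior functions.

```python
# Delayed-acceptance style MH sketch with a symmetric random-walk proposal;
# log_post_exact and log_post_surrogate are placeholders for the expensive
# posterior and its GP surrogate, respectively.
import numpy as np

def da_mh(x0, log_post_exact, log_post_surrogate, step, n_iter, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    x, lp_exact, lp_surr = x0, log_post_exact(x0), log_post_surrogate(x0)
    samples = []
    for _ in range(n_iter):
        y = x + step * rng.standard_normal(x.shape)
        lq_surr = log_post_surrogate(y)
        # Stage 1: cheap screen with the surrogate.
        if np.log(rng.random()) < lq_surr - lp_surr:
            # Stage 2: correction with the exact (expensive) posterior.
            lq_exact = log_post_exact(y)
            if np.log(rng.random()) < (lq_exact - lp_exact) - (lq_surr - lp_surr):
                x, lp_exact, lp_surr = y, lq_exact, lq_surr
        samples.append(x.copy())
    return np.array(samples)
```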


Bayesian Optimization on Large Graphs via a Graph Convolutional Generative Model: Application in Cardiac Model Personalization

arXiv.org Machine Learning

Personalization of cardiac models involves the optimization of organ tissue properties that vary spatially over the non-Euclidean geometry model of the heart. To represent the high-dimensional (HD) unknowns of tissue properties, most existing works rely on a low-dimensional (LD) partitioning of the geometrical model. While this exploits the geometry of the heart, a partitioning small enough for effective optimization has limited expressiveness. Recently, a variational auto-encoder (VAE) was utilized as a more expressive generative model to embed the HD optimization into the LD latent space. Its Euclidean nature, however, neglects the rich geometrical information in the heart. In this paper, we present a novel graph convolutional VAE that allows generative modeling of non-Euclidean data, and utilize it to embed Bayesian optimization of large graphs into a small latent space. This approach bridges the gap in previous works by introducing an expressive generative model that is able to incorporate the knowledge of spatial proximity and hierarchical compositionality of the underlying geometry. It further allows transfer of the learned features across different geometries, which was not possible with a regular VAE. We demonstrate these benefits of the presented method in synthetic and real-data experiments of estimating tissue excitability in a cardiac electrophysiological model.
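An assumption-laden sketch of a graph-convolutional VAE encoder over a cardiac mesh is given below, using PyTorch Geometric's GCNConv as a stand-in for the paper's graph convolutions; pooling, the decoder, and the hierarchical architecture are simplified or omitted.

```python
# Minimal sketch of a graph-convolutional VAE encoder; GCNConv and the crude
# mean pooling are stand-ins, not the paper's architecture.
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv

class GraphVAEEncoder(nn.Module):
    def __init__(self, in_dim=1, hidden=32, latent=16):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)

    def forward(self, x, edge_index):
        # x: (n_nodes, in_dim) tissue property per mesh node; edge_index: mesh connectivity
        h = torch.relu(self.conv1(x, edge_index))
        h = torch.relu(self.conv2(h, edge_index))
        h = h.mean(dim=0)                                            # crude global pooling
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)      # reparameterization
        return z, mu, logvar
```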


High-dimensional Bayesian Optimization of Personalized Cardiac Model Parameters via an Embedded Generative Model

arXiv.org Machine Learning

The estimation of patient-specific tissue properties in the form of model parameters is important for personalized physiological models. However, these tissue properties are spatially varying across the underlying anatomical model, presenting a significant challenge of high-dimensional (HD) optimization in the presence of limited measurement data. A common solution to reduce the dimension of the parameter space is to explicitly partition the anatomical mesh, either into a fixed small number of segments or a multi-scale hierarchy. This anatomy-based reduction of the parameter space presents a fundamental bottleneck to parameter estimation, resulting in solutions that are either too low in resolution to reflect tissue heterogeneity, or too high in dimension to be reliably estimated within feasible computation. In this paper, we present a novel concept that embeds a generative variational auto-encoder (VAE) into the objective function of Bayesian optimization, providing an implicit low-dimensional (LD) search space that represents the generative code of the HD spatially varying tissue properties. In addition, the VAE-encoded knowledge about the generative code is further used to guide the exploration of the search space. The presented method is applied to estimating tissue excitability in a cardiac electrophysiological model. Synthetic and real-data experiments demonstrate its ability to improve the accuracy of parameter estimation with more than a 10x gain in efficiency.
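The latent-space optimization can be sketched as a standard Bayesian optimization loop over the VAE code, as below; decode, simulate_and_score, the dimensions, and the simple UCB acquisition over random candidates are placeholders rather than the paper's guided exploration of the search space.

```python
# Sketch of Bayesian optimization in a VAE latent space; decode and
# simulate_and_score are placeholders for the VAE decoder and the expensive
# simulation-based objective.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def latent_bo(decode, simulate_and_score, latent_dim=8, n_init=5, n_iter=25, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    Z = rng.standard_normal((n_init, latent_dim))                 # initial latent codes
    y = np.array([simulate_and_score(decode(z)) for z in Z])      # expensive objective
    for _ in range(n_iter):
        gp = GaussianProcessRegressor(normalize_y=True).fit(Z, y)
        candidates = rng.standard_normal((256, latent_dim))
        mean, std = gp.predict(candidates, return_std=True)
        z_next = candidates[np.argmax(mean + 2.0 * std)]          # UCB acquisition
        y_next = simulate_and_score(decode(z_next))
        Z, y = np.vstack([Z, z_next]), np.append(y, y_next)
    return Z[np.argmax(y)]   # best latent code found (assuming higher score is better)
```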