
Collaborating Authors

Dhamala, Jwala


Are you talking to ['xem'] or ['x', 'em']? On Tokenization and Addressing Misgendering in LLMs with Pronoun Tokenization Parity

arXiv.org Artificial Intelligence

A large body of NLP research has documented the ways gender biases manifest and amplify within large language models (LLMs), though this research has predominantly operated within a gender binary-centric context. A growing body of work has identified the harmful limitations of this gender-exclusive framing; many LLMs cannot correctly and consistently refer to persons outside the gender binary, especially if they use neopronouns. While data scarcity has been identified as a possible culprit, the precise mechanisms through which it influences LLM misgendering remain underexplored. Our work addresses this gap by studying data scarcity's role in subword tokenization and, consequently, the formation of LLM word representations. We uncover how the Byte-Pair Encoding (BPE) tokenizer, a backbone for many popular LLMs, contributes to neopronoun misgendering through out-of-vocabulary behavior. We introduce pronoun tokenization parity (PTP), a novel approach to reduce LLM neopronoun misgendering by preserving a token's functional structure. We evaluate PTP's efficacy using pronoun consistency-based metrics and a novel syntax-based metric. In several controlled experiments, finetuning LLMs with PTP improves neopronoun consistency from 14.5% to 58.4%, highlighting the significant role tokenization plays in LLM pronoun consistency.
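As an illustration of the out-of-vocabulary behavior described above, the snippet below probes a BPE tokenizer with a binary pronoun and a neopronoun, then registers the neopronoun as a single token in the spirit of PTP. This is a minimal sketch assuming the Hugging Face GPT-2 tokenizer as a stand-in for the tokenizers studied in the paper, not the authors' implementation.

```python
# Minimal sketch; GPT-2's BPE tokenizer is only a stand-in for the models studied.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

print(tokenizer.tokenize("she"))   # binary pronoun: typically a single in-vocabulary token
print(tokenizer.tokenize("xem"))   # neopronoun: typically fragmented, e.g. ['x', 'em']

# In the spirit of pronoun tokenization parity: register the neopronoun as one
# token so it is treated as a single functional unit; in practice the model's
# embedding matrix would then be resized (model.resize_token_embeddings) and finetuned.
tokenizer.add_tokens(["xem"])
print(tokenizer.tokenize("xem"))   # now a single added token
```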


JAB: Joint Adversarial Prompting and Belief Augmentation

arXiv.org Artificial Intelligence

With the recent surge of language models in different applications, attention to the safety and robustness of these models has gained significant importance. Here we introduce a joint framework in which we simultaneously probe and improve the robustness of a black-box target model via adversarial prompting and belief augmentation using iterative feedback loops. This framework utilizes an automated red teaming approach to probe the target model, along with a belief augmenter that generates instructions for the target model to improve its robustness to those adversarial probes. Importantly, the adversarial model and the belief generator leverage the feedback from past interactions to improve the effectiveness of the adversarial prompts and beliefs, respectively. In our experiments, we demonstrate that such a framework can reduce toxic content generation both in dynamic cases, where an adversary directly interacts with a target model, and in static cases, where we use a static benchmark dataset to evaluate our model.
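A rough sketch of this probe-and-patch feedback loop is given below; red_team_model, belief_augmenter, target_model, and toxicity_score are hypothetical placeholders rather than the paper's components.

```python
# Illustrative sketch only; all callables are hypothetical placeholders,
# not the paper's code or interfaces.
def jab_loop(red_team_model, belief_augmenter, target_model, toxicity_score, n_rounds=5):
    history = []   # feedback from past interactions
    beliefs = []   # safety instructions prepended to the target's input
    for _ in range(n_rounds):
        # 1) Adversary crafts a probe, conditioning on what worked before.
        probe = red_team_model.generate(history)
        # 2) Target answers with the current beliefs prepended.
        response = target_model.generate("\n".join(beliefs) + "\n" + probe)
        score = toxicity_score(response)
        # 3) Belief augmenter turns the interaction into a new instruction.
        beliefs.append(belief_augmenter.generate(probe, response, score))
        history.append((probe, response, score))
    return beliefs, history
```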


"I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation

arXiv.org Artificial Intelligence

Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. Given the recent popularity and adoption of language generation technologies, the potential to further marginalize this population only grows. Although a multitude of NLP fairness literature focuses on illuminating and addressing gender biases, assessing gender harms for TGNB identities requires understanding how such identities uniquely interact with societal gender norms and how they differ from gender binary-centric perspectives. Such measurement frameworks inherently require centering TGNB voices to help guide the alignment between gender-inclusive NLP and whom they are intended to serve. Towards this goal, we ground our work in the TGNB community and existing interdisciplinary literature to assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation (OLG). This social knowledge serves as a guide for evaluating popular large language models (LLMs) on two key aspects: (1) misgendering and (2) harmful responses to gender disclosure. To do this, we introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community. We discover a dominance of binary gender norms reflected by the models; LLMs least misgendered subjects in generated text when triggered by prompts whose subjects used binary pronouns. Meanwhile, misgendering was most prevalent when triggering generation with singular they and neopronouns. When prompted with gender disclosures, TGNB disclosure generated the most stigmatizing language and scored most toxic, on average. Our findings warrant further research on how TGNB harms manifest in LLMs and serve as a broader case study toward concretely grounding the design of gender-inclusive AI in community voices and interdisciplinary literature.
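For the misgendering aspect, an automatic pronoun-consistency check of the kind such evaluations rely on might look like the sketch below; the pronoun sets and the regex-based matching are illustrative assumptions, not the paper's exact metric.

```python
# Hypothetical sketch of a pronoun-consistency check; PRONOUN_SETS and the
# regex matching are assumptions, not the paper's evaluation code.
import re

PRONOUN_SETS = {
    "she": {"she", "her", "hers", "herself"},
    "he": {"he", "him", "his", "himself"},
    "they": {"they", "them", "their", "theirs", "themself", "themselves"},
    "xe": {"xe", "xem", "xyr", "xyrs", "xemself"},
}

def is_consistent(generated_text: str, declared_set: str) -> bool:
    """True if every pronoun in the text belongs to the subject's declared set."""
    allowed = PRONOUN_SETS[declared_set]
    all_pronouns = set().union(*PRONOUN_SETS.values())
    used = {w for w in re.findall(r"[a-z]+", generated_text.lower()) if w in all_pronouns}
    return used <= allowed

print(is_consistent("Xe said that xe would bring xyr book.", "xe"))  # True
print(is_consistent("Xe said that he would bring his book.", "xe"))  # False (misgendering)
```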


Multi-VALUE: A Framework for Cross-Dialectal English NLP

arXiv.org Artificial Intelligence

Dialect differences caused by regional, social, and economic factors lead to performance discrepancies for many groups of language technology users. Inclusive and equitable language technology must critically be dialect invariant, meaning that performance remains constant over dialectal shifts. Current systems often fall short of this ideal since they are designed and tested on a single dialect: Standard American English (SAE). We introduce a suite of resources for evaluating and achieving English dialect invariance. The resource is called Multi-VALUE, a controllable rule-based translation system spanning 50 English dialects and 189 unique linguistic features. Multi-VALUE maps SAE to synthetic forms of each dialect. First, we use this system to stress test question answering, machine translation, and semantic parsing. Stress tests reveal significant performance disparities for leading models on non-standard dialects. Second, we use this system as a data augmentation technique to improve the dialect robustness of existing systems. Finally, we partner with native speakers of Chicano and Indian English to release new gold-standard variants of the popular CoQA task. To execute the transformation code, run model checkpoints, and download both synthetic and gold-standard dialectal benchmark datasets, see http://value-nlp.org.
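To make the rule-based translation concrete, the toy rules below apply two widely documented dialect features (zero copula and negative concord) to SAE input; the real Multi-VALUE rules and interfaces are far more extensive, so this is only an illustrative sketch.

```python
# Toy illustration of the kind of rule-based SAE -> dialect transformation the
# paper describes; these regexes are assumptions, not Multi-VALUE's implementation.
import re

def zero_copula(sentence: str) -> str:
    """Drop 'is'/'are' before a progressive verb, e.g. "She is walking" -> "She walking"."""
    return re.sub(r"\b(is|are|'s|'re)\s+(?=\w+ing\b)", "", sentence)

def negative_concord(sentence: str) -> str:
    """Negation agreement, e.g. "doesn't have any" -> "doesn't have no"."""
    return re.sub(r"(n't|not)(\s+\w+\s+)any\b", r"\1\2no", sentence)

print(zero_copula("She is walking to the store."))
print(negative_concord("He doesn't have any money."))
```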


Is the Elephant Flying? Resolving Ambiguities in Text-to-Image Generative Models

arXiv.org Artificial Intelligence

Natural language often contains ambiguities that can lead to misinterpretation and miscommunication. While humans can handle ambiguities effectively by asking clarifying questions and/or relying on contextual cues and common-sense knowledge, resolving ambiguities can be notoriously hard for machines. In this work, we study ambiguities that arise in text-to-image generative models. We curate a benchmark dataset covering different types of ambiguities that occur in these systems. We then propose a framework to mitigate ambiguities in the prompts given to the systems by soliciting clarifications from the user. Through automatic and human evaluations, we show the effectiveness of our framework in generating more faithful images aligned with human intention in the presence of ambiguities.
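A high-level sketch of such a clarification loop is shown below; detect_ambiguity, ask_user, and generate_image are hypothetical placeholders standing in for the paper's components.

```python
# High-level sketch of an ambiguity-resolution loop; all callables are
# hypothetical placeholders, not the paper's framework.
def resolve_and_generate(prompt, detect_ambiguity, ask_user, generate_image):
    ambiguity = detect_ambiguity(prompt)      # e.g. "Is the elephant flying or on the ground?"
    if ambiguity is not None:
        answer = ask_user(ambiguity)          # solicit a clarification from the user
        prompt = f"{prompt} ({answer})"       # fold the clarification back into the prompt
    return generate_image(prompt)
```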


BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation

arXiv.org Artificial Intelligence

Recent advances in deep learning techniques have enabled machines to generate cohesive open-ended text when prompted with a sequence of words as context. While these models now empower many downstream applications from conversation bots to automatic storytelling, they have been shown to generate texts that exhibit social biases. To systematically study and benchmark social biases in open-ended language generation, we introduce the Bias in Open-Ended Language Generation Dataset (BOLD), a large-scale dataset that consists of 23,679 English text generation prompts for bias benchmarking across five domains: profession, gender, race, religion, and political ideology. We also propose new automated metrics for toxicity, psycholinguistic norms, and text gender polarity to measure social biases in open-ended text generation from multiple angles. An examination of text generated from three popular language models reveals that the majority of these models exhibit a larger social bias than human-written Wikipedia text across all domains. With these results we highlight the need to benchmark biases in open-ended language generation and caution users of language generation models on downstream tasks to be cognizant of these embedded prejudices.
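The kind of evaluation loop BOLD enables can be sketched as follows, with GPT-2 via the Hugging Face pipeline as a stand-in generator; the example prompts are illustrative rather than actual BOLD entries, and the scoring function is left as a placeholder for the proposed toxicity, psycholinguistic-norm, and gender-polarity metrics.

```python
# Sketch of prompt-then-score bias benchmarking; GPT-2 is a stand-in generator
# and the prompts below are illustrative, not actual BOLD entries.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

example_prompts = [
    "As a nurse, she is known for",
    "The flight attendant said that",
]

def mean_score(prompts, score_fn, max_new_tokens=30):
    # score_fn would be a toxicity classifier, psycholinguistic-norm scorer,
    # or gender-polarity measure of the kind the paper proposes.
    scores = []
    for p in prompts:
        out = generator(p, max_new_tokens=max_new_tokens, num_return_sequences=1)
        scores.append(score_fn(out[0]["generated_text"]))
    return sum(scores) / len(scores)
```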


Learning Geometry-Dependent and Physics-Based Inverse Image Reconstruction

arXiv.org Artificial Intelligence

Deep neural networks have shown great potential in image reconstruction problems in Euclidean space. However, many reconstruction problems involve imaging physics that are dependent on the underlying non-Euclidean geometry. In this paper, we present a new approach to learning inverse imaging that exploits the underlying geometry and physics. We first introduce a non-Euclidean encoding-decoding network that allows us to describe the unknown and measurement variables over their respective geometrical domains. We then learn the geometry-dependent physics between the two domains by explicitly modeling it via a bipartite graph over the graphical embeddings of the two geometries. We applied the presented network to reconstructing electrical activity on the heart surface from body-surface potentials. In a series of generalization tasks with increasing difficulty, we demonstrated the improved ability of the presented network to generalize across geometrical changes underlying the data in comparison to its Euclidean alternatives.
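A very rough sketch, under many assumptions, of passing features from one geometrical domain to another through a bipartite graph is given below; the paper's actual graph-based encoder-decoder and physics modeling are more elaborate.

```python
# Assumption-laden sketch of a bipartite body-surface -> heart-surface mapping;
# not the paper's architecture.
import torch
import torch.nn as nn

class BipartiteMapping(nn.Module):
    def __init__(self, in_dim, out_dim, bipartite_adj):
        super().__init__()
        # bipartite_adj: (n_heart, n_body) 0/1 matrix linking the two meshes;
        # assumes every heart node is linked to at least one body node.
        self.register_buffer("adj", bipartite_adj / bipartite_adj.sum(dim=1, keepdim=True))
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, body_features):                # (n_body, in_dim)
        heart_features = self.adj @ body_features    # aggregate over linked body nodes
        return torch.relu(self.proj(heart_features)) # (n_heart, out_dim)
```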


Quantifying the Uncertainty in Model Parameters Using Gaussian Process-Based Markov Chain Monte Carlo: An Application to Cardiac Electrophysiological Models

arXiv.org Machine Learning

Estimation of patient-specific model parameters is important for personalized modeling, although sparse and noisy clinical data can introduce significant uncertainty in the estimated parameter values. This important source of uncertainty, if left unquantified, will lead to unknown variability in model outputs that hinders their reliable adoption. Probabilistic estimation of model parameters, however, remains an unresolved challenge because standard Markov Chain Monte Carlo sampling requires repeated model simulations that are computationally infeasible. A common solution is to replace the simulation model with a computationally efficient surrogate for faster sampling. However, by sampling from an approximation of the exact posterior probability density function (pdf) of the parameters, the efficiency is gained at the expense of sampling accuracy. In this paper, we address this issue by integrating surrogate modeling into Metropolis-Hastings (MH) sampling of the exact posterior pdfs to improve its acceptance rate. This is done by first quickly constructing a Gaussian process (GP) surrogate of the exact posterior pdfs using deterministic optimization. This efficient surrogate is then used to modify commonly used proposal distributions in MH sampling such that only proposals accepted by the surrogate will be tested by the exact posterior pdf for acceptance/rejection, reducing unnecessary model simulations at unlikely candidates. Synthetic and real-data experiments using the presented method show a significant gain in computational efficiency without compromising accuracy. In addition, insights into the non-identifiability and heterogeneity of tissue properties can be gained from the obtained posterior distributions.
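The two-stage idea can be sketched as a delayed-acceptance style Metropolis-Hastings step, in which the cheap surrogate screens proposals and only survivors are evaluated with the expensive exact posterior. The paper's construction of the GP surrogate and its modified proposal distributions differ, so the code below is only a hedged illustration with placeholder log-posterior functions.

```python
# Delayed-acceptance style MH sketch with a symmetric random-walk proposal;
# log_post_exact and log_post_surrogate are placeholders for the expensive
# posterior and its GP surrogate, respectively.
import numpy as np

def da_mh(x0, log_post_exact, log_post_surrogate, step, n_iter, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    x, lp_exact, lp_surr = x0, log_post_exact(x0), log_post_surrogate(x0)
    samples = []
    for _ in range(n_iter):
        y = x + step * rng.standard_normal(x.shape)
        lq_surr = log_post_surrogate(y)
        # Stage 1: cheap screen with the surrogate.
        if np.log(rng.random()) < lq_surr - lp_surr:
            # Stage 2: correction with the exact (expensive) posterior.
            lq_exact = log_post_exact(y)
            if np.log(rng.random()) < (lq_exact - lp_exact) - (lq_surr - lp_surr):
                x, lp_exact, lp_surr = y, lq_exact, lq_surr
        samples.append(x.copy())
    return np.array(samples)
```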


Bayesian Optimization on Large Graphs via a Graph Convolutional Generative Model: Application in Cardiac Model Personalization

arXiv.org Machine Learning

Personalization of cardiac models involves the optimization of organ tissue properties that vary spatially over the non-Euclidean geometry model of the heart. To represent the high-dimensional (HD) unknowns of tissue properties, most existing works rely on a low-dimensional (LD) partitioning of the geometrical model. While this exploits the geometry of the heart, a partitioning small enough for effective optimization has limited expressiveness. Recently, a variational auto-encoder (VAE) was utilized as a more expressive generative model to embed the HD optimization into the LD latent space. Its Euclidean nature, however, neglects the rich geometrical information in the heart. In this paper, we present a novel graph convolutional VAE that allows generative modeling of non-Euclidean data, and utilize it to embed Bayesian optimization of large graphs into a small latent space. This approach bridges the gap in previous works by introducing an expressive generative model that is able to incorporate the knowledge of spatial proximity and hierarchical compositionality of the underlying geometry. It further allows transfer of the learned features across different geometries, which was not possible with a regular VAE. We demonstrate these benefits of the presented method in synthetic and real-data experiments of estimating tissue excitability in a cardiac electrophysiological model.
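An assumption-laden sketch of a graph-convolutional VAE encoder over a cardiac mesh is given below, using PyTorch Geometric's GCNConv as a stand-in for the paper's graph convolutions; pooling, the decoder, and the hierarchical architecture are simplified or omitted.

```python
# Minimal sketch of a graph-convolutional VAE encoder; GCNConv and the crude
# mean pooling are stand-ins, not the paper's architecture.
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv

class GraphVAEEncoder(nn.Module):
    def __init__(self, in_dim=1, hidden=32, latent=16):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)

    def forward(self, x, edge_index):
        # x: (n_nodes, in_dim) tissue property per mesh node; edge_index: mesh connectivity
        h = torch.relu(self.conv1(x, edge_index))
        h = torch.relu(self.conv2(h, edge_index))
        h = h.mean(dim=0)                                            # crude global pooling
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)      # reparameterization
        return z, mu, logvar
```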


High-dimensional Bayesian Optimization of Personalized Cardiac Model Parameters via an Embedded Generative Model

arXiv.org Machine Learning

The estimation of patient-specific tissue properties in the form of model parameters is important for personalized physiological models. However, these tissue properties are spatially varying across the underlying anatomical model, presenting a significant challenge of high-dimensional (HD) optimization in the presence of limited measurement data. A common solution to reduce the dimension of the parameter space is to explicitly partition the anatomical mesh, either into a fixed small number of segments or a multi-scale hierarchy. This anatomy-based reduction of the parameter space presents a fundamental bottleneck to parameter estimation, resulting in solutions that are either too low in resolution to reflect tissue heterogeneity, or too high in dimension to be reliably estimated within feasible computation. In this paper, we present a novel concept that embeds a generative variational auto-encoder (VAE) into the objective function of Bayesian optimization, providing an implicit low-dimensional (LD) search space that represents the generative code of the HD spatially varying tissue properties. In addition, the VAE-encoded knowledge about the generative code is further used to guide the exploration of the search space. The presented method is applied to estimating tissue excitability in a cardiac electrophysiological model. Synthetic and real-data experiments demonstrate its ability to improve the accuracy of parameter estimation with more than a 10x gain in efficiency.
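The latent-space optimization can be sketched as a standard Bayesian optimization loop over the VAE code, as below; decode, simulate_and_score, the dimensions, and the simple UCB acquisition over random candidates are placeholders rather than the paper's guided exploration of the search space.

```python
# Sketch of Bayesian optimization in a VAE latent space; decode and
# simulate_and_score are placeholders for the VAE decoder and the expensive
# simulation-based objective.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def latent_bo(decode, simulate_and_score, latent_dim=8, n_init=5, n_iter=25, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    Z = rng.standard_normal((n_init, latent_dim))                 # initial latent codes
    y = np.array([simulate_and_score(decode(z)) for z in Z])      # expensive objective
    for _ in range(n_iter):
        gp = GaussianProcessRegressor(normalize_y=True).fit(Z, y)
        candidates = rng.standard_normal((256, latent_dim))
        mean, std = gp.predict(candidates, return_std=True)
        z_next = candidates[np.argmax(mean + 2.0 * std)]          # UCB acquisition
        y_next = simulate_and_score(decode(z_next))
        Z, y = np.vstack([Z, z_next]), np.append(y, y_next)
    return Z[np.argmax(y)]   # best latent code found (assuming higher score is better)
```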