Yadav, Amulya
Code-mixed LLM: Improve Large Language Models' Capability to Handle Code-Mixing through Reinforcement Learning from AI Feedback
Zhang, Wenbo, Majumdar, Aditya, Yadav, Amulya
Code-mixing (CM), or code-switching (CSW), refers to the juxtaposition of linguistic units from two or more languages within a conversation, or sometimes even within a single utterance. Code-mixing introduces unique challenges in daily life, such as syntactic mismatches and semantic blending, that are rarely encountered in monolingual settings. Large language models (LLMs) have revolutionized the field of natural language processing (NLP) by offering unprecedented capabilities in understanding human languages. However, the effectiveness of current state-of-the-art multilingual LLMs has not yet been fully explored in code-mixed scenarios. To fill this gap, we first benchmark the performance of multilingual LLMs on various code-mixed NLP tasks. We then propose to improve the multilingual LLMs' ability to understand code-mixed text through reinforcement learning from human feedback (RLHF) and code-mixed machine translation tasks. Because human preference labeling is costly and time-consuming, we instead use LLMs as annotators and perform reinforcement learning from AI feedback (RLAIF). Experiments demonstrate the effectiveness of the proposed method.
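To make the RLAIF step concrete, here is a minimal sketch of AI-based preference labeling for code-mixed translation. The judge prompt, the `query_judge_llm` placeholder, and the dummy scoring rule are illustrative assumptions, not the authors' implementation; in practice the placeholder would call an actual LLM judge, and the resulting preference pairs would feed a reward model or a DPO-style trainer.

```python
# Minimal sketch of RLAIF-style preference labeling for code-mixed translation.
# `query_judge_llm` is a hypothetical stand-in for an actual LLM judge call;
# the prompt format and scoring scheme are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class PreferencePair:
    prompt: str       # source sentence plus task instruction
    chosen: str       # translation preferred by the AI judge
    rejected: str     # translation dispreferred by the AI judge

def query_judge_llm(judge_prompt: str) -> str:
    """Placeholder judge. In practice this would call an LLM API and return
    'A' or 'B'; here we pick the longer candidate as a dummy rule."""
    cand_a = judge_prompt.split("Candidate A:")[1].split("Candidate B:")[0]
    cand_b = judge_prompt.split("Candidate B:")[1]
    return "A" if len(cand_a) >= len(cand_b) else "B"

def label_pair(source: str, cand_a: str, cand_b: str) -> PreferencePair:
    prompt = (
        "Translate the sentence into natural Hinglish (Hindi-English code-mixed) text.\n"
        f"Source: {source}"
    )
    judge_prompt = (
        "You are rating code-mixed translations for fluency and adequacy.\n"
        f"Source: {source}\nCandidate A: {cand_a}\nCandidate B: {cand_b}\n"
        "Answer with 'A' or 'B'."
    )
    verdict = query_judge_llm(judge_prompt).strip().upper()
    chosen, rejected = (cand_a, cand_b) if verdict.startswith("A") else (cand_b, cand_a)
    return PreferencePair(prompt, chosen, rejected)

# The resulting pairs can be fed to a reward model or a DPO-style trainer.
pairs = [label_pair("How are you doing today?",
                    "Aaj aap kaise ho?",
                    "Aaj tum kaise kar rahe ho, doing today?")]
print(pairs[0].chosen)
```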
The Reopening of Pandora's Box: Analyzing the Role of LLMs in the Evolving Battle Against AI-Generated Fake News
Wang, Xinyu, Zhang, Wenbo, Koneru, Sai, Guo, Hangzhi, Mingole, Bonam, Sundar, S. Shyam, Rajtmajer, Sarah, Yadav, Amulya
With the rise of AI-generated content spewed at scale from large language models (LLMs), genuine concerns about the spread of fake news have intensified. The perceived ability of LLMs to produce convincing fake news at scale poses new challenges for both human and automated fake news detection systems. To address these challenges, this work presents the findings from a university-level competition that aimed to explore how LLMs can be used by humans to create fake news, and to assess the ability of human annotators and AI models to detect it. A total of 110 participants used LLMs to create 252 unique fake news stories, and 84 annotators participated in the detection tasks. Our findings indicate that LLMs are ~68% more effective at detecting real news than humans. However, for fake news detection, the performance of LLMs and humans remains comparable (~60% accuracy). Additionally, we examine the impact of visual elements (e.g., pictures) in news on the accuracy of detecting fake news stories. Finally, we also examine various strategies used by fake news creators to enhance the credibility of their AI-generated content. This work highlights the increasing complexity of detecting AI-generated fake news, particularly in collaborative human-AI settings.
Hey GPT, Can You be More Racist? Analysis from Crowdsourced Attempts to Elicit Biased Content from Generative AI
Guo, Hangzhi, Venkit, Pranav Narayanan, Jang, Eunchae, Srinath, Mukund, Zhang, Wenbo, Mingole, Bonam, Gupta, Vipul, Varshney, Kush R., Sundar, S. Shyam, Yadav, Amulya
The widespread adoption of large language models (LLMs) and generative AI (GenAI) tools across diverse applications has amplified the importance of addressing societal biases inherent within these technologies. While the NLP community has extensively studied LLM bias, research investigating how non-expert users perceive and interact with biases from these systems remains limited. As these technologies become increasingly prevalent, understanding this question is crucial to inform model developers in their efforts to mitigate bias. To address this gap, this work presents the findings from a university-level competition, which challenged participants to design prompts for eliciting biased outputs from GenAI tools. We quantitatively and qualitatively analyze the competition submissions, identifying a diverse set of biases elicited from GenAI tools and the strategies participants employed to induce them. Our findings provide unique insights into how non-expert users perceive and interact with biases from GenAI tools.
Watermarking Counterfactual Explanations
Guo, Hangzhi, Yadav, Amulya
The field of Explainable Artificial Intelligence (XAI) focuses on techniques for providing explanations to end-users about the decision-making processes that underlie modern-day machine learning (ML) models. Within the vast universe of XAI techniques, counterfactual (CF) explanations are often preferred by end-users as they help explain the predictions of ML models by providing an easy-to-understand & actionable recourse (or contrastive) case to individual end-users who are adversely impacted by predicted outcomes. However, recent studies have shown significant security concerns with using CF explanations in real-world applications; in particular, malicious adversaries can exploit CF explanations to perform query-efficient model extraction attacks on proprietary ML models. In this paper, we propose a model-agnostic watermarking framework (for adding watermarks to CF explanations) that can be leveraged to detect unauthorized model extraction attacks (which rely on the watermarked CF explanations). Our novel framework solves a bi-level optimization problem to embed an indistinguishable watermark into the generated CF explanation such that any future model extraction attacks that rely on these watermarked CF explanations can be detected using a null hypothesis significance testing (NHST) scheme, while ensuring that these embedded watermarks do not compromise the quality of the generated CF explanations. We evaluate this framework's performance across a diverse set of real-world datasets, CF explanation methods, and model extraction techniques, and show that our watermarking detection system can be used to accurately identify extracted ML models that are trained using the watermarked CF explanations. Our work paves the way for the secure adoption of CF explanations in real-world applications.
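As an illustration of the detection side, the sketch below shows one way an NHST-style check might look: test whether a suspect model agrees with the watermark labels on the watermarked CF points significantly more often than a baseline agreement rate would predict. The baseline rate `p0`, the significance level, and the helper names are assumptions of this sketch, not the paper's exact test statistic.

```python
# Hedged sketch of a null-hypothesis-significance-test (NHST) check for whether
# a suspect model was trained on watermarked CF explanations. The baseline
# agreement rate `p0` and the decision rule are illustrative assumptions.
import numpy as np
from scipy.stats import binomtest

def watermark_detected(suspect_predict, watermarked_points, watermark_labels,
                       p0=0.5, alpha=0.01):
    """H0: the suspect model agrees with the watermark labels at rate <= p0
    (i.e., it was NOT trained on the watermarked CF explanations).
    Reject H0 when agreement is significantly higher than p0."""
    preds = suspect_predict(watermarked_points)
    agreements = int(np.sum(preds == watermark_labels))
    result = binomtest(agreements, n=len(watermark_labels), p=p0,
                       alternative="greater")
    return result.pvalue < alpha, result.pvalue

# Toy usage: a suspect model that has memorized the watermark labels.
rng = np.random.default_rng(0)
X_wm = rng.normal(size=(200, 5))
y_wm = rng.integers(0, 2, size=200)
detected, pval = watermark_detected(lambda X: y_wm, X_wm, y_wm)
print(detected, pval)
```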
A Taxonomy of Rater Disagreements: Surveying Challenges & Opportunities from the Perspective of Annotating Online Toxicity
Zhang, Wenbo, Guo, Hangzhi, Kivlichan, Ian D, Prabhakaran, Vinodkumar, Yadav, Davis, Yadav, Amulya
Toxicity is an increasingly common and severe issue in online spaces. Consequently, a rich line of machine learning research over the past decade has focused on computationally detecting and mitigating online toxicity. These efforts crucially rely on human-annotated datasets that identify toxic content of various kinds in social media texts. However, such annotations historically yield low inter-rater agreement, which has often been dealt with by taking the majority vote or other such approaches to arrive at a single ground-truth label. Recent research has pointed out the importance of accounting for the subjective nature of this task when building and utilizing these datasets, and this has triggered work on analyzing and better understanding rater disagreements, and on how they could be effectively incorporated into the machine learning development pipeline. While these efforts are filling an important gap, there is still a lack of a broader framework on the root causes of rater disagreement, and therefore we situate this work within that broader landscape. In this survey paper, we analyze a broad set of literature on the reasons behind rater disagreements, focusing on online toxicity, and propose a detailed taxonomy of these reasons. Further, we summarize and discuss potential solutions targeting each reason for disagreement. We also discuss several open issues, which could guide the future development of online toxicity research.
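As a concrete illustration of the aggregation issue discussed above (not taken from the survey itself), the small sketch below contrasts a majority-vote "ground truth" with a soft label that preserves the raters' disagreement.

```python
# Illustrative sketch (not from the survey): collapsing rater judgments into a
# single majority-vote label versus keeping a soft label that preserves the
# disagreement signal for downstream training or evaluation.
from collections import Counter

ratings = ["toxic", "not_toxic", "toxic", "not_toxic", "not_toxic"]  # five raters

# Majority vote discards the two dissenting raters entirely.
majority_label = Counter(ratings).most_common(1)[0][0]

# A soft label keeps the full distribution over rater judgments.
counts = Counter(ratings)
soft_label = {label: count / len(ratings) for label, count in counts.items()}

print(majority_label)   # not_toxic
print(soft_label)       # {'toxic': 0.4, 'not_toxic': 0.6}
```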
RoCourseNet: Distributionally Robust Training of a Prediction Aware Recourse Model
Guo, Hangzhi, Jia, Feiran, Chen, Jinghui, Squicciarini, Anna, Yadav, Amulya
Counterfactual (CF) explanations for machine learning (ML) models are preferred by end-users, as they explain the predictions of ML models by providing a recourse (or contrastive) case to individuals who are adversely impacted by predicted outcomes. Existing CF explanation methods generate recourses under the assumption that the underlying target ML model remains stationary over time. However, due to commonly occurring distributional shifts in training data, ML models constantly get updated in practice, which might render previously generated recourses invalid and diminish end-users' trust in our algorithmic framework. To address this problem, we propose RoCourseNet, a training framework that jointly optimizes predictions and recourses that are robust to future data shifts. This work contains four key contributions: (1) We formulate the robust recourse generation problem as a tri-level optimization problem which consists of two sub-problems: (i) a bi-level problem that finds the worst-case adversarial shift in the training data, and (ii) an outer minimization problem to generate robust recourses against this worst-case shift. (2) We leverage adversarial training to solve this tri-level optimization problem by: (i) proposing a novel virtual data shift (VDS) algorithm to find worst-case shifted ML models via explicitly considering the worst-case data shift in the training dataset, and (ii) using a block-wise coordinate descent procedure to optimize for predictions and the corresponding robust recourses. (3) We evaluate RoCourseNet's performance on three real-world datasets, and show that RoCourseNet consistently achieves more than 96% robust validity and outperforms state-of-the-art baselines by at least 10% in generating robust CF explanations. (4) Finally, we generalize the RoCourseNet framework to accommodate any parametric post-hoc methods for improving robust validity.
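The sketch below illustrates the min-max structure described above in simplified form: an inner loop builds a worst-case shifted copy of the predictor that tries to invalidate the current recourses, and the outer step trains the predictor and a recourse generator against it. Shifting the model parameters directly is a simplified stand-in for the paper's virtual data shift over the training data, and the architectures, loss weights, and synthetic data are assumptions of this sketch.

```python
# Hedged sketch of adversarial (min-max) training for robust recourses.
# The parameter-level shift is a simplified stand-in for VDS.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 4)
y = (X.sum(dim=1) > 0).float().unsqueeze(1)       # toy binary task

predictor = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
generator = nn.Linear(4, 4)                        # outputs a recourse delta per input
bce = nn.BCEWithLogitsLoss()
opt = torch.optim.Adam(
    list(predictor.parameters()) + list(generator.parameters()), lr=1e-3)

for step in range(100):
    delta = generator(X)
    x_cf = X + delta                               # candidate recourses
    target = torch.ones_like(y)                    # recourses should reach class 1

    # Inner maximization (stand-in for VDS): a shifted copy of the predictor
    # takes a few small steps that push the recourses toward invalidity,
    # emulating a worst-case future model update.
    shifted = copy.deepcopy(predictor)
    inner_opt = torch.optim.SGD(shifted.parameters(), lr=0.05)
    for _ in range(3):
        inner_loss = bce(shifted(x_cf.detach()), torch.zeros_like(y))
        inner_opt.zero_grad()
        inner_loss.backward()
        inner_opt.step()

    # Outer minimization: fit the data, keep recourses valid under the
    # worst-case shifted model, and keep recourses close to the inputs.
    pred_loss = bce(predictor(X), y)
    robust_validity = bce(shifted(x_cf), target)
    proximity = delta.pow(2).mean()
    total = pred_loss + robust_validity + 0.1 * proximity
    opt.zero_grad()
    total.backward()
    opt.step()

print(float(pred_loss), float(robust_validity))
```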
CounterNet: End-to-End Training of Prediction Aware Counterfactual Explanations
Guo, Hangzhi, Nguyen, Thanh Hong, Yadav, Amulya
This work presents CounterNet, a novel end-to-end learning framework which integrates Machine Learning (ML) model training and the generation of corresponding counterfactual (CF) explanations into a single pipeline. Counterfactual explanations offer a contrastive case, i.e., they attempt to find the smallest modification to the feature values of an instance that changes the prediction of the ML model on that instance to a predefined output. Prior techniques for generating CF explanations suffer from two major limitations: (i) all of them are post-hoc methods designed for use with proprietary ML models -- as a result, their procedure for generating CF explanations is uninformed by the training of the ML model, which leads to misalignment between model predictions and explanations; and (ii) most of them rely on solving separate time-intensive optimization problems to find CF explanations for each input data point (which negatively impacts their runtime). CounterNet makes a novel departure from this prevalent post-hoc paradigm: unlike post-hoc methods, it optimizes the CF explanation generation only once, jointly with the training of the predictive model. We adopt a block-wise coordinate descent procedure which helps in effectively training CounterNet's network. Our extensive experiments on multiple real-world datasets show that CounterNet generates high-quality predictions, consistently achieves 100% CF validity and low proximity scores (thereby achieving a well-balanced cost-invalidity trade-off) for any new input instance, and runs 3X faster than existing state-of-the-art baselines.
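The sketch below shows, under assumed architectures and loss weights, how the block-wise coordinate descent idea can look in practice: one block updates the predictor on the prediction loss, and the other updates a CF generator for validity and proximity against the current predictor. It is an illustrative simplification, not CounterNet's exact training procedure.

```python
# Simplified sketch of joint predictor + CF-generator training with block-wise
# coordinate descent. Architecture, loss weights, and data are assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 5)
y = (X[:, 0] + X[:, 1] > 0).float().unsqueeze(1)        # toy binary task

encoder = nn.Sequential(nn.Linear(5, 32), nn.ReLU())
predictor = nn.Linear(32, 1)                             # prediction head
cf_generator = nn.Linear(32, 5)                          # outputs a delta on x

opt_pred = torch.optim.Adam(
    list(encoder.parameters()) + list(predictor.parameters()), lr=1e-3)
opt_cf = torch.optim.Adam(cf_generator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(200):
    # Block 1: update encoder + predictor on the prediction loss only.
    logits = predictor(encoder(X))
    pred_loss = bce(logits, y)
    opt_pred.zero_grad()
    pred_loss.backward()
    opt_pred.step()

    # Block 2: update the CF generator so that x + delta flips the current
    # prediction (validity) while keeping delta small (proximity).
    with torch.no_grad():
        h = encoder(X)
        target = (torch.sigmoid(predictor(h)) < 0.5).float()  # flipped label
    delta = cf_generator(h)
    x_cf = X + delta
    cf_logits = predictor(encoder(x_cf))
    validity_loss = bce(cf_logits, target)
    proximity_loss = delta.pow(2).mean()
    cf_loss = validity_loss + 0.1 * proximity_loss
    opt_cf.zero_grad()
    cf_loss.backward()
    opt_cf.step()

print(float(pred_loss), float(cf_loss))
```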
Reports on the 2017 AAAI Spring Symposium Series
Bohg, Jeannette (Max Planck Institute for Intelligent Systems) | Boix, Xavier (Massachusetts Institute of Technology) | Chang, Nancy (Google) | Churchill, Elizabeth F. (Google) | Chu, Vivian (Georgia Institute of Technology) | Fang, Fei (Harvard University) | Feldman, Jerome (University of California at Berkeley) | González, Avelino J. (University of Central Florida) | Kido, Takashi (Preferred Networks in Japan) | Lawless, William F. (Paine College) | Montaña, José L. (University of Cantabria) | Ontañón, Santiago (Drexel University) | Sinapov, Jivko (University of Texas at Austin) | Sofge, Don (Naval Research Laboratory) | Steels, Luc (Institut de Biologia Evolutiva) | Steenson, Molly Wright (Carnegie Mellon University) | Takadama, Keiki (University of Electro-Communications) | Yadav, Amulya (University of Southern California)
A rise in real-world applications of AI has stimulated significant interest from the public, media, and policy makers. Along with this increasing attention has come a media-fueled concern about purported negative consequences of AI, which often overlooks the societal benefits that AI is delivering and can deliver in the near future. To address these concerns, the symposium on Artificial Intelligence for the Social Good (AISOC-17) highlighted the benefits that AI can bring to society right now. It brought together AI researchers, practitioners, experts, and policy makers from a wide variety of domains.

It is also important to remember that having a very sharp distinction of AI for social good research is not always feasible, and often unnecessary. While there has been significant progress, there still exist many major challenges facing the design of effective AI-based approaches to deal with the difficulties in real-world domains. One of the challenges is interpretability, since most algorithms for AI for social good problems need to be used by human end users. Second, the lack of access to valuable data that could be crucial to the development of appropriate AI algorithms is yet another challenge. Third, the data that we get from the real world is often noisy.
Simultaneous Influencing and Mapping for Health Interventions
Marcolino, Leandro Soriano (University of Southern California) | Lakshminarayanan, Aravind (Indian Institute of Technology, Madras) | Yadav, Amulya (University of Southern California) | Tambe, Milind (University of Southern California)
Influence maximization is an active research topic, but prior work has always assumed full knowledge of the social network graph. However, the graph may actually be unknown beforehand. For example, when selecting a subset of a homeless population to attend interventions concerning health, we deal with a network that is not fully known. Hence, we introduce the novel problem of simultaneously influencing and mapping (i.e., learning) the graph. We study a class of algorithms, where we show that: (i) traditional algorithms may have arbitrarily low performance; (ii) we can effectively influence and map when the independence of objectives hypothesis holds; (iii) when it does not hold, the upper bound for the influence loss converges to 0. We run extensive experiments over four real-life social networks, where we study two alternative models, and obtain significantly better results than traditional approaches under both.
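The sketch below illustrates the influence-and-map idea under assumed modeling choices: the algorithm only sees the neighbors of nodes it has already intervened on, picks the next seed greedily by known degree, and spreads influence via a simple independent-cascade simulation. The reveal rule, cascade model, and parameters are assumptions of this sketch, not the paper's algorithm.

```python
# Hedged sketch of simultaneously influencing and mapping an initially unknown
# social network. Reveal rule, cascade model, and parameters are assumptions.
import random
import networkx as nx

random.seed(0)
G = nx.erdos_renyi_graph(n=200, p=0.03, seed=0)    # the true, initially hidden network

known_edges = set()                                 # ties revealed so far (the "map")
influenced = set()

def reveal(node):
    """Attending an intervention reveals that node's social ties."""
    for nbr in G.neighbors(node):
        known_edges.add((node, nbr))

def simulate_cascade(seed, prob=0.1):
    """One independent-cascade rollout on the true graph."""
    frontier, active = [seed], {seed}
    while frontier:
        nxt = []
        for u in frontier:
            for v in G.neighbors(u):
                if v not in active and random.random() < prob:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return active

# Greedy influence-and-map loop over a small intervention budget.
for _ in range(10):
    candidates = set(G.nodes()) - influenced
    if not candidates:
        break
    known = nx.Graph(list(known_edges))             # graph mapped so far
    if known.number_of_edges() == 0:
        seed = random.choice(sorted(candidates))    # cold start: explore
    else:
        seed = max(candidates, key=lambda v: known.degree(v) if v in known else 0)
    reveal(seed)                                    # mapping step
    influenced |= simulate_cascade(seed)            # influencing step

print(f"influenced {len(influenced)} nodes, revealed {len(known_edges)} adjacency records")
```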