Goto

Collaborating Authors

 Large Language Model


Help! My Political Beliefs Were Altered by a Chatbot!

WSJ.com: WSJD - Technology

When we ask ChatGPT or another bot to draft a memo, email, or presentation, we think these artificial-intelligence assistants are doing our bidding. A growing body of research shows that they also can change our thinking--without our knowing. One of the latest studies in this vein, from researchers spread across the globe, found that when subjects were asked to use an AI to help them write an essay, that AI could nudge them to write an essay either for or against a particular view, depending on the bias of the algorithm. Performing this exercise also measurably influenced the subjects' opinions on the topic, after the exercise.


Toward Artificial Empathy for Human-Centered Design: A Framework

arXiv.org Artificial Intelligence

In the early stages of the design process, designers explore opportunities by discovering unmet needs and developing innovative concepts as potential solutions. From a human-centered design perspective, designers must develop empathy with people to truly understand their needs. However, developing empathy is a complex and subjective process that relies heavily on the designer's empathic capability. Therefore, the development of empathic understanding is intuitive, and the discovery of underlying needs is often serendipitous. This paper aims to provide insights from artificial intelligence research to indicate the future direction of AI-driven human-centered design, taking into account the essential role of empathy. Specifically, we conduct an interdisciplinary investigation of research areas such as data-driven user studies, empathic understanding development, and artificial empathy. Based on this foundation, we discuss the role that artificial empathy can play in human-centered design and propose an artificial empathy framework for human-centered design. Building on the mechanisms behind empathy and insights from empathic design research, the framework aims to break down the rather complex and subjective concept of empathy into components and modules that can potentially be modeled computationally. Furthermore, we discuss the expected benefits of developing such systems and identify current research gaps to encourage future research efforts.


Bridging History with AI A Comparative Evaluation of GPT 3.5, GPT4, and GoogleBARD in Predictive Accuracy and Fact Checking

arXiv.org Artificial Intelligence

The rapid proliferation of information in the digital era underscores the importance of accurate historical representation and interpretation. While artificial intelligence has shown promise in various fields, its potential for historical fact-checking and gap-filling remains largely untapped. This study evaluates the performance of three large language models LLMs GPT 3.5, GPT 4, and GoogleBARD in the context of predicting and verifying historical events based on given data. A novel metric, Distance to Reality (DTR), is introduced to assess the models' outputs against established historical facts. The results reveal a substantial potential for AI in historical studies, with GPT 4 demonstrating superior performance. This paper underscores the need for further research into AI's role in enriching our understanding of the past and bridging historical knowledge gaps.


Patchwork Learning: A Paradigm Towards Integrative Analysis across Diverse Biomedical Data Sources

arXiv.org Artificial Intelligence

Machine learning (ML) in healthcare presents numerous opportunities for enhancing patient care, population health, and healthcare providers' workflows. However, the real-world clinical and cost benefits remain limited due to challenges in data privacy, heterogeneous data sources, and the inability to fully leverage multiple data modalities. In this perspective paper, we introduce "patchwork learning" (PL), a novel paradigm that addresses these limitations by integrating information from disparate datasets composed of different data modalities (e.g., clinical free-text, medical images, omics) and distributed across separate and secure sites. PL allows the simultaneous utilization of complementary data sources while preserving data privacy, enabling the development of more holistic and generalizable ML models. We present the concept of patchwork learning and its current implementations in healthcare, exploring the potential opportunities and applicable data sources for addressing various healthcare challenges. PL leverages bridging modalities or overlapping feature spaces across sites to facilitate information sharing and impute missing data, thereby addressing related prediction tasks. We discuss the challenges associated with PL, many of which are shared by federated and multimodal learning, and provide recommendations for future research in this field. By offering a more comprehensive approach to healthcare data integration, patchwork learning has the potential to revolutionize the clinical applicability of ML models. This paradigm promises to strike a balance between personalization and generalizability, ultimately enhancing patient experiences, improving population health, and optimizing healthcare providers' workflows. Introduction Machine learning (ML) in healthcare is a rapidly evolving field, presenting numerous opportunities for progress. Active and passive patient data collection, both during and outside medical care, can be utilized to address health challenges. As a result, ML has become an essential tool for processing and analyzing these data in various domains, including natural language processing, computer vision, and more. ML systems have demonstrated their potential to enhance patient experiences, improve population health, reduce per capita healthcare costs, and optimize healthcare providers' workflows Data privacy is a major challenge facing the use of ML in healthcare, as it restricts the potential for pooling electronic health record (EHR) data from multiple sites. While single modality models exist (e.g., clinical notes, lab tests, omics, or medical images), systems that simultaneously leverage multiple modalities are relatively scarce. MML combines disparate data sources to capitalize on complementary information, thereby improving performance.


PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices

arXiv.org Artificial Intelligence

Pipeline parallelism enables efficient training of Large Language Models (LLMs) on large-scale distributed accelerator clusters. Yet, pipeline bubbles during startup and tear-down reduce the utilization of accelerators. Although efficient pipeline schemes with micro-batching and bidirectional pipelines have been proposed to maximize utilization, a significant number of bubbles cannot be filled using synchronous forward and backward passes. To address this problem, we suggest that extra work be assigned to the bubbles to gain auxiliary benefits in LLM training. As an example in this direction, we propose PipeFisher, which assigns the work of K-FAC, a second-order optimization method based on the Fisher information matrix, to the bubbles to accelerate convergence. In Phase 1 pretraining of BERT-Base and -Large models, PipeFisher reduces the (simulated) training time to 50-75% compared to training with a first-order optimizer by greatly improving the accelerator utilization and benefiting from the improved convergence by K-FAC.


Investigating Emergent Goal-Like Behaviour in Large Language Models Using Experimental Economics

arXiv.org Artificial Intelligence

In this study, we investigate the capacity of large language models (LLMs), specifically GPT-3.5, to operationalise natural language descriptions of cooperative, competitive, altruistic, and self-interested behavior in social dilemmas. Our focus is on the iterated Prisoner's Dilemma, a classic example of a non-zero-sum interaction, but our broader research program encompasses a range of experimental economics scenarios, including the ultimatum game, dictator game, and public goods game. Using a within-subject experimental design, we instantiated LLM-generated agents with various prompts that conveyed different cooperative and competitive stances. We then assessed the agents' level of cooperation in the iterated Prisoner's Dilemma, taking into account their responsiveness to the cooperative or defection actions of their partners. Our results provide evidence that LLMs can translate natural language descriptions of altruism and selfishness into appropriate behaviour to some extent, but exhibit limitations in adapting their behavior based on conditioned reciprocity. The observed pattern of increased cooperation with defectors and decreased cooperation with cooperators highlights potential constraints in the LLM's ability to generalize its knowledge about human behavior in social dilemmas. We call upon the research community to further explore the factors contributing to the emergent behavior of LLM-generated agents in a wider array of social dilemmas, examining the impact of model architecture, training parameters, and various partner strategies on agent behavior. As more advanced LLMs like GPT-4 become available, it is crucial to investigate whether they exhibit similar limitations or are capable of more nuanced cooperative behaviors, ultimately fostering the development of AI systems that better align with human values and social norms.


Beyond the Safeguards: Exploring the Security Risks of ChatGPT

arXiv.org Artificial Intelligence

The increasing popularity of large language models (LLMs) such as ChatGPT has led to growing concerns about their safety, security risks, and ethical implications. This paper aims to provide an overview of the different types of security risks associated with ChatGPT, including malicious text and code generation, private data disclosure, fraudulent services, information gathering, and producing unethical content. We present an empirical study examining the effectiveness of ChatGPT's content filters and explore potential ways to bypass these safeguards, demonstrating the ethical implications and security risks that persist in LLMs even when protections are in place. Based on a qualitative analysis of the security implications, we discuss potential strategies to mitigate these risks and inform researchers, policymakers, and industry professionals about the complex security challenges posed by LLMs like ChatGPT. This study contributes to the ongoing discussion on the ethical and security implications of LLMs, underscoring the need for continued research in this area.


Say What You Mean! Large Language Models Speak Too Positively about Negative Commonsense Knowledge

arXiv.org Artificial Intelligence

Large language models (LLMs) have been widely studied for their ability to store and utilize positive knowledge. However, negative knowledge, such as "lions don't live in the ocean", is also ubiquitous in the world but rarely mentioned explicitly in the text. What do LLMs know about negative knowledge? This work examines the ability of LLMs to negative commonsense knowledge. We design a constrained keywords-to-sentence generation task (CG) and a Boolean question-answering task (QA) to probe LLMs. Our experiments reveal that LLMs frequently fail to generate valid sentences grounded in negative commonsense knowledge, yet they can correctly answer polar yes-or-no questions. We term this phenomenon the belief conflict of LLMs. Our further analysis shows that statistical shortcuts and negation reporting bias from language modeling pre-training cause this conflict.


Natural Language Reasoning, A Survey

arXiv.org Artificial Intelligence

This survey paper proposes a clearer view of natural language reasoning in the field of Natural Language Processing (NLP), both conceptually and practically. Conceptually, we provide a distinct definition for natural language reasoning in NLP, based on both philosophy and NLP scenarios, discuss what types of tasks require reasoning, and introduce a taxonomy of reasoning. Practically, we conduct a comprehensive literature review on natural language reasoning in NLP, mainly covering classical logical reasoning, natural language inference, multi-hop question answering, and commonsense reasoning. The paper also identifies and views backward reasoning, a powerful paradigm for multi-step reasoning, and introduces defeasible reasoning as one of the most important future directions in natural language reasoning research. We focus on single-modality unstructured natural language text, excluding neuro-symbolic techniques and mathematical reasoning.


Scalable Educational Question Generation with Pre-trained Language Models

arXiv.org Artificial Intelligence

The automatic generation of educational questions will play a key role in scaling online education, enabling self-assessment at scale when a global population is manoeuvring their personalised learning journeys. We develop \textit{EduQG}, a novel educational question generation model built by adapting a large language model. Our extensive experiments demonstrate that \textit{EduQG} can produce superior educational questions by further pre-training and fine-tuning a pre-trained language model on the scientific text and science question data.