
'A piece of performance poetry': an absurd, decade-old Twitter account can teach us a lot about AI

The Guardian

More than a decade before an AI-powered chatbot could do your homework, help you make dinner or pass the bar exam, there was @Horse_ebooks. The primitive predecessor to today's chatbot renaissance began as a Twitter account in 2010, tweeting automated excerpts from ebooks that, decontextualized, took on unexpected and strangely poetic meanings. Purportedly a spambot, the account surfaced quotes from ebooks that went viral for their absurdist fragments – phrases like "Hello saxophone," "COULD THIS BE THE", and "Today we are lucky to be talking". It amassed more than 200,000 followers at its peak and now, despite being inactive for a decade, the account still holds 131,000 followers. Its most memorable quip – "everything happens so much" – still resonates today.


Fairness Certification for Natural Language Processing and Large Language Models

arXiv.org Artificial Intelligence

Natural Language Processing (NLP) plays an important role in our daily lives, particularly due to the enormous progress of Large Language Models (LLM). However, NLP has many fairness-critical use cases, e.g., as an expert system in recruitment or as an LLM-based tutor in education. Since NLP is based on human language, potentially harmful biases can diffuse into NLP systems and produce unfair results, discriminate against minorities or generate legal issues. Hence, it is important to develop a fairness certification for NLP approaches. We follow a qualitative research approach towards a fairness certification for NLP. In particular, we have reviewed a large body of literature on algorithmic fairness, and we have conducted semi-structured expert interviews with a wide range of experts from that area. We have systematically devised six fairness criteria for NLP, which can be further refined into 18 sub-categories. Our criteria offer a foundation for operationalizing and testing processes to certify fairness, both from the perspective of the auditor and the audited organization.


Nobel Prize Winner Cautions on Rush Into STEM After Rise of AI

TIME - Tech

A Nobel Prize-winning labor market economist has cautioned younger generations against piling into studying science, technology, engineering, and mathematics (STEM) subjects, saying "empathetic" and creative skills may thrive in a world dominated by artificial intelligence. Christopher Pissarides, professor of economics at the London School of Economics, said that workers in certain IT jobs risk sowing their "own seeds of self-destruction" by advancing AI that will eventually take the same jobs in the future. While Pissarides is an optimist on AI's overall impact on the jobs market, he raised concerns for those taking STEM subjects hoping to ride the coattails of the technological advances. He said that despite rapid growth in the demand for STEM skills currently, jobs requiring more traditional face-to-face skills, such as in hospitality and healthcare, will still dominate the jobs market. "The skills that are needed now -- to collect the data, collate it, develop it, and use it to develop the next phase of AI or more to the point make AI more applicable for jobs -- will make the skills that are needed now obsolete because it will be doing the job," he said in an interview.


Cybercrime, AI supremacy and the metaverse: the tech stories that will dominate 2024

The Guardian

Partway through 2023, I caught up with a respected, high-ranking tech writer at another publication. We gossiped and nattered, and, a bit exasperated, empathised with each other: we were run ragged. The last two years have raised the stakes for what tech journalists do from serving a small niche community to covering stories that have an impact on the wider world. It's also down to the characters involved and what's at stake. Tech journalists have lived on fast-forward since Elon Musk first lodged his bid to take over Twitter – now X – in April 2022.


A Computational Framework for Behavioral Assessment of LLM Therapists

arXiv.org Artificial Intelligence

The emergence of ChatGPT and other large language models (LLMs) has greatly increased interest in utilizing LLMs as therapists to support individuals struggling with mental health challenges. However, due to the lack of systematic studies, our understanding of how LLM therapists behave, i.e., ways in which they respond to clients, is significantly limited. Understanding their behavior across a wide range of clients and situations is crucial to accurately assess their capabilities and limitations in the high-risk setting of mental health, where undesirable behaviors can lead to severe consequences. In this paper, we propose BOLT, a novel computational framework to study the conversational behavior of LLMs when employed as therapists. We develop an in-context learning method to quantitatively measure the behavior of LLMs based on 13 different psychotherapy techniques including reflections, questions, solutions, normalizing, and psychoeducation. Subsequently, we compare the behavior of LLM therapists against that of high- and low-quality human therapy, and study how their behavior can be modulated to better reflect behaviors observed in high-quality therapy. Our analysis of GPT and Llama-variants reveals that these LLMs often resemble behaviors more commonly exhibited in low-quality therapy rather than high-quality therapy, such as offering a higher degree of problem-solving advice when clients share emotions, which is against typical recommendations. At the same time, unlike low-quality therapy, LLMs reflect significantly more upon clients' needs and strengths. Our analysis framework suggests that despite the ability of LLMs to generate anecdotal examples that appear similar to human therapists, LLM therapists are currently not fully consistent with high-quality care, and thus require additional research to ensure quality care.


Uncovering Regulatory Affairs Complexity in Medical Products: A Qualitative Assessment Utilizing Open Coding and Natural Language Processing (NLP)

arXiv.org Artificial Intelligence

This study investigates the complexity of regulatory affairs in the medical device industry, a critical factor influencing market access and patient care. Through qualitative research, we sought expert insights to understand the factors contributing to this complexity. The study involved semi-structured interviews with 28 professionals from medical device companies, specializing in various aspects of regulatory affairs. These interviews were analyzed using open coding and Natural Language Processing (NLP) techniques. The findings reveal key sources of complexity within the regulatory landscape, divided into five domains: (A) Regulatory language complexity, (B) Intricacies within the regulatory process, (C) Global-level complexities, (D) Database-related considerations, and (E) Product-level issues. The participants highlighted the need for strategies to streamline regulatory compliance, enhance interactions between regulatory bodies and industry players, and develop adaptable frameworks for rapid technological advancements. Emphasizing interdisciplinary collaboration and increased transparency, the study concludes that these elements are vital for establishing coherent and effective regulatory procedures in the medical device sector.


Steering Llama 2 via Contrastive Activation Addition

arXiv.org Artificial Intelligence

We introduce Contrastive Activation Addition (CAA), an innovative method for steering language models by modifying activations during their forward passes. CAA computes "steering vectors" by averaging the difference in residual stream activations between pairs of positive and negative examples of a particular behavior such as factual versus hallucinatory responses. During inference, these steering vectors are added at all token positions after the user's prompt with either a positive or negative coefficient, allowing precise control over the degree of the targeted behavior. We evaluate CAA's effectiveness on Llama 2 Chat using both multiple-choice behavioral question datasets and open-ended generation tasks. We demonstrate that CAA significantly alters model behavior, outperforms traditional methods like finetuning and few-shot prompting, and minimally reduces capabilities. Moreover, by employing various activation space interpretation methods, we gain deeper insights into CAA's mechanisms. CAA both accurately steers model outputs and also sheds light on how high-level concepts are represented in Large Language Models (LLMs).
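
The two steps described in this abstract (averaging paired activation differences, then adding the scaled result after the prompt) can be illustrated with a toy sketch. This is not the authors' code: plain Python lists stand in for residual stream activations, and the names `steering_vector`, `apply_steering`, `coeff`, and `prompt_len` are hypothetical, chosen only for illustration.

```python
from typing import List

Vector = List[float]


def steering_vector(pos_acts: List[Vector], neg_acts: List[Vector]) -> Vector:
    """Average the (positive - negative) activation difference over matched pairs."""
    if not pos_acts or len(pos_acts) != len(neg_acts):
        raise ValueError("need a non-empty list of matched pairs")
    dim = len(pos_acts[0])
    total = [0.0] * dim
    for p, n in zip(pos_acts, neg_acts):
        for i in range(dim):
            total[i] += p[i] - n[i]
    return [t / len(pos_acts) for t in total]


def apply_steering(acts: List[Vector], vec: Vector, coeff: float,
                   prompt_len: int) -> List[Vector]:
    """Add coeff * vec at every token position after the prompt (assumed layout:
    one activation vector per token, prompt tokens first)."""
    steered = []
    for pos, a in enumerate(acts):
        if pos >= prompt_len:
            steered.append([x + coeff * v for x, v in zip(a, vec)])
        else:
            steered.append(list(a))  # prompt positions are left unchanged
    return steered


# Toy data: two contrastive pairs of 2-dimensional "activations".
pos = [[1.0, 2.0], [3.0, 0.0]]
neg = [[0.0, 1.0], [1.0, 2.0]]
vec = steering_vector(pos, neg)

# Three token positions, the first two belonging to the prompt.
acts = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]]
steered = apply_steering(acts, vec, coeff=2.0, prompt_len=2)
```

A negative `coeff` would push the activations in the opposite direction, which matches the abstract's description of steering with either sign. In a real model the addition would happen inside the forward pass at a chosen layer, a detail this sketch leaves out.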


Cooperative Federated Learning over Ground-to-Satellite Integrated Networks: Joint Local Computation and Data Offloading

arXiv.org Artificial Intelligence

While network coverage maps continue to expand, many devices located in remote areas remain unconnected to terrestrial communication infrastructures, preventing them from getting access to the associated data-driven services. In this paper, we propose a ground-to-satellite cooperative federated learning (FL) methodology to facilitate machine learning service management over remote regions. Our methodology orchestrates satellite constellations to provide the following key functions during FL: (i) processing data offloaded from ground devices, (ii) aggregating models within device clusters, and (iii) relaying models/data to other satellites via inter-satellite links (ISLs). Due to the limited coverage time of each satellite over a particular remote area, we facilitate satellite transmission of trained models and acquired data to neighboring satellites via ISL, so that the incoming satellite can continue conducting FL for the region. We theoretically analyze the convergence behavior of our algorithm, and develop a training latency minimizer which optimizes over satellite-specific network resources, including the amount of data to be offloaded from ground devices to satellites and satellites' computation speeds. Through experiments on three datasets, we show that our methodology can significantly speed up the convergence of FL compared with terrestrial-only and other satellite baseline approaches.


How Not to Be Stupid About AI, With Yann LeCun

WIRED

Do not preach doom to Yann LeCun. A pioneer of modern AI and Meta's chief AI scientist, LeCun is one of the technology's most vocal defenders. He scoffs at his peers' dystopian scenarios of supercharged misinformation and even, eventually, human extinction. He's known to fire off a vicious tweet (or whatever they're called in the land of X) to call out the fearmongers. When his former collaborators Geoffrey Hinton and Yoshua Bengio put their names at the top of a statement calling AI a "societal-scale risk," LeCun stayed away.


Backdoor Attack with Sparse and Invisible Trigger

arXiv.org Artificial Intelligence

Deep neural networks (DNNs) are vulnerable to backdoor attacks, where the adversary manipulates a small portion of training data such that the victim model predicts normally on the benign samples but classifies the triggered samples as the target class. The backdoor attack is an emerging yet threatening training-phase threat, leading to serious risks in DNN-based applications. In this paper, we revisit the trigger patterns of existing backdoor attacks. We reveal that they are either visible or not sparse and therefore are not stealthy enough. More importantly, it is not feasible to simply combine existing methods to design an effective sparse and invisible backdoor attack. To address this problem, we formulate the trigger generation as a bi-level optimization problem with sparsity and invisibility constraints and propose an effective method to solve it. The proposed method is dubbed sparse and invisible backdoor attack (SIBA). We conduct extensive experiments on benchmark datasets under different settings, which verify the effectiveness of our attack and its resistance to existing backdoor defenses. The codes for reproducing main experiments are available at https://github.com/YinghuaGao/SIBA.