Personal
On logic and generative AI
Gurevich, Yuri, Blass, Andreas
This article was originally written for the June 2024 issue of the Bulletin of the European Association for Theoretical Computer Science, in the framework of the "Logic in Computer Science" column administered by Yuri Gurevich. In the following pages, the article is reproduced as is. The ongoing AI revolution raises many foundational problems. For quite a while, I felt that the issue needed to be addressed in this column. Not being an AI expert, I was looking for volunteers. This didn't work, and so one day I took a deep breath and started to write an article myself. Andreas Blass, my long-time collaborator, was reluctant to join me, but eventually he agreed. A hundred years ago, logic was almost synonymous with foundational studies. I tried to rekindle that tradition in [5]. The goal of the following dialog is to provoke young logicians with a taste for foundations to notice the foundational problems raised by the ongoing AI revolution. I think the most beautiful thing about deep learning is that it actually works. Q: I just learned that Daniel Kahneman, Nobel laureate in economics and the author of "Thinking, Fast and Slow" [7], passed away on March 27, 2024. I have heard a lot about this book but have never read it.
Towards Efficient Neuro-Symbolic AI: From Workload Characterization to Hardware Architecture
Wan, Zishen, Liu, Che-Kai, Yang, Hanchen, Raj, Ritik, Li, Chaojian, You, Haoran, Fu, Yonggan, Wan, Cheng, Li, Sixu, Kim, Youbin, Samajdar, Ananda, Lin, Yingyan Celine, Ibrahim, Mohamed, Rabaey, Jan M., Krishna, Tushar, Raychowdhury, Arijit
The remarkable advancements in artificial intelligence (AI), primarily driven by deep neural networks, are facing challenges surrounding unsustainable computational trajectories, limited robustness, and a lack of explainability. To develop next-generation cognitive AI systems, neuro-symbolic AI emerges as a promising paradigm, fusing neural and symbolic approaches to enhance interpretability, robustness, and trustworthiness, while facilitating learning from much less data. Recent neuro-symbolic systems have demonstrated great potential in collaborative human-AI scenarios with reasoning and cognitive capabilities. In this paper, we aim to understand the workload characteristics and potential architectures for neuro-symbolic AI. We first systematically categorize neuro-symbolic AI algorithms, and then experimentally evaluate and analyze them in terms of runtime, memory, computational operators, sparsity, and system characteristics on CPUs, GPUs, and edge SoCs. Our studies reveal that neuro-symbolic models suffer from inefficiencies on off-the-shelf hardware, due to the memory-bound nature of vector-symbolic and logical operations, complex flow control, data dependencies, sparsity variations, and limited scalability. Based on profiling insights, we suggest cross-layer optimization solutions and present a hardware acceleration case study for vector-symbolic architecture to improve the performance, efficiency, and scalability of neuro-symbolic computing. Finally, we discuss the challenges and potential future directions of neuro-symbolic AI from both system and architectural perspectives.
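To make the phrase "vector-symbolic operations" concrete, here is a minimal sketch of binding and bundling on bipolar hypervectors. It assumes one common vector-symbolic scheme (element-wise multiplication for binding, element-wise majority vote for bundling), which may differ from the operators profiled in the paper; the role and filler names (`color`, `red`, and so on) are hypothetical. Each operation reads every element once and does roughly one multiply or add per element, which is the kind of low arithmetic intensity that tends to make such workloads memory-bound.

```python
import random


def random_hypervector(dim, rng):
    """Return a random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def bind(a, b):
    """Bind two hypervectors by element-wise multiplication (self-inverse)."""
    return [x * y for x, y in zip(a, b)]


def bundle(vectors):
    """Bundle an odd number of hypervectors by element-wise majority vote."""
    return [1 if sum(column) > 0 else -1 for column in zip(*vectors)]


def similarity(a, b):
    """Normalized dot product, in the range [-1, 1]."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
DIM = 2048

# Hypothetical roles and fillers for a three-field record.
color, shape, size = (random_hypervector(DIM, rng) for _ in range(3))
red, circle, small = (random_hypervector(DIM, rng) for _ in range(3))

# Encode the record as a single hypervector.
record = bundle([bind(color, red), bind(shape, circle), bind(size, small)])

# Unbinding with a role yields a noisy copy of the filler stored under it.
recovered = bind(record, color)

print("similarity to red:   ", similarity(recovered, red))
print("similarity to circle:", similarity(recovered, circle))
```

The recovered vector is clearly closer to `red` than to the unrelated filler `circle`, so a nearest-neighbor lookup over the known fillers would return `red`.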
"I Never Said That": A dataset, taxonomy and baselines on response clarity classification
Thomas, Konstantinos, Filandrianos, Giorgos, Lymperaiou, Maria, Zerva, Chrysoula, Stamou, Giorgos
Equivocation and ambiguity in public speech are well-studied discourse phenomena, especially in political science and analysis of political interviews. Inspired by the well-grounded theory on equivocation, we aim to resolve the closely related problem of response clarity in questions extracted from political interviews, leveraging the capabilities of Large Language Models (LLMs) and human expertise. To this end, we introduce a novel taxonomy that frames the task of detecting and classifying response clarity and a corresponding clarity classification dataset which consists of question-answer (QA) pairs drawn from political interviews and annotated accordingly. Our proposed two-level taxonomy addresses the clarity of a response in terms of the information provided for a given question (high-level) and also provides a fine-grained taxonomy of evasion techniques that relate to unclear, ambiguous responses (lower-level). We combine ChatGPT and human annotators to collect, validate and annotate discrete QA pairs from political interviews, to be used for our newly introduced response clarity task. We provide a detailed analysis and conduct several experiments with different model architectures, sizes and adaptation methods to gain insights and establish new baselines over the proposed dataset and task.
EmotionQueen: A Benchmark for Evaluating Empathy of Large Language Models
Chen, Yuyan, Wang, Hao, Yan, Songzhou, Liu, Sijia, Li, Yueze, Zhao, Yi, Xiao, Yanghua
Emotional intelligence in large language models (LLMs) is of great importance in Natural Language Processing. However, previous research mainly focuses on basic sentiment analysis tasks, such as emotion recognition, which are not enough to evaluate LLMs' overall emotional intelligence. Therefore, this paper presents a novel framework named EmotionQueen for evaluating the emotional intelligence of LLMs. The framework includes four distinctive tasks: Key Event Recognition, Mixed Event Recognition, Implicit Emotional Recognition, and Intention Recognition. LLMs are asked to recognize important events or implicit emotions and generate empathetic responses. We also design two metrics to evaluate LLMs' capabilities in recognition and response for emotion-related statements. Experiments yield significant conclusions about LLMs' capabilities and limitations in emotional intelligence.
Who's the next LAPD chief? Likely finalists spotted at mayor's mansion
From left: Former LAPD Deputy Chief Robert Arcos, LAPD Deputy Chief Emada Tingirides and former Los Angeles County Sheriff Jim McDonnell. Published Sept. 18, 2024. Updated Sept. 19, 2024, 10:46 AM PT. Mayor Karen Bass said she would conduct a nationwide search for the next chief of the Los Angeles Police Department, but in the end it seems she found three finalists close to home. Deputy Chief Emada Tingirides and Robert "Bobby" Arcos, a former LAPD assistant chief who works in the L.A. County district attorney's office, were seen arriving at Getty House, the mayor's residence, for their candidate interviews over the span of a few hours Tuesday. The third candidate is said to be former Los Angeles County Sheriff Jim McDonnell, who also served in the LAPD, leaving as first assistant chief.
Extracting Memorized Training Data via Decomposition
Su, Ellen, Vellore, Anu, Chang, Amy, Mura, Raffaele, Nelson, Blaine, Kassianik, Paul, Karbasi, Amin
The widespread use of Large Language Models (LLMs) in society creates new information security challenges for developers, organizations, and end-users alike. LLMs are trained on large volumes of data, and their susceptibility to revealing the exact contents of the source training datasets poses security and safety risks. Although current alignment procedures restrict common risky behaviors, they do not completely prevent LLMs from leaking data. Prior work demonstrated that LLMs may be tricked into divulging training data by using out-of-distribution queries or adversarial techniques. In this paper, we demonstrate a simple, query-based decompositional method to extract news articles from two frontier LLMs. We use instruction decomposition techniques to incrementally extract fragments of training data. Out of 3723 New York Times articles, we extract at least one verbatim sentence from 73 articles, and over 20% of verbatim sentences from 6 articles. Our analysis demonstrates that this method successfully induces the LLM to generate texts that are reliable reproductions of news articles, meaning that they likely originate from the source training dataset. This method is simple, generalizable, and does not fine-tune or change the production model. If replicable at scale, this training data extraction methodology could expose new LLM security and safety vulnerabilities, including privacy risks and unauthorized data leaks. These implications require careful consideration from model development to end use.
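The per-article figures quoted above (at least one verbatim sentence, over 20% of sentences) suggest a sentence-level overlap measure. The sketch below is one plausible way to compute such a fraction, not the authors' procedure: the abstract does not specify the matching criterion, so the naive sentence splitter, the whitespace normalization, and the function names are all assumptions made for illustration.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def normalize(text):
    """Collapse runs of whitespace so line wrapping does not block a match."""
    return " ".join(text.split())


def verbatim_fraction(article, generated):
    """Fraction of article sentences that appear verbatim in the generated text."""
    sentences = [normalize(s) for s in split_sentences(article)]
    if not sentences:
        return 0.0
    haystack = normalize(generated)
    hits = sum(1 for sentence in sentences if sentence in haystack)
    return hits / len(sentences)


# Toy data, invented for this example.
article = (
    "The council met on Tuesday. It approved the budget. "
    "Critics objected loudly. A vote is planned. Members left early."
)
generated = "Sure. The council met on Tuesday. Then it passed the budget."

print(verbatim_fraction(article, generated))  # 1 of 5 sentences matches: 0.2
```

A stricter or looser criterion (case folding, punctuation stripping, minimum sentence length) would change the counts, so any threshold such as 20% only has meaning relative to the matching rule chosen.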
Prompts Are Programs Too! Understanding How Developers Build Software Containing Prompts
Liang, Jenny T., Lin, Melissa, Rao, Nikitha, Myers, Brad A.
The introduction of generative pre-trained models, like GPT-4, has given rise to a phenomenon known as prompt engineering, whereby model users repeatedly write and revise prompts while trying to achieve a task. Using these AI models for intelligent features in software applications requires using APIs that are controlled through developer-written prompts. These prompts have powered AI experiences in popular software products, potentially reaching millions of users. Despite the growing impact of prompt-powered software, little is known about its development process and its relationship to programming. In this work, we argue that some forms of prompts are programs, and that the development of prompts is a distinct phenomenon in programming. We refer to this phenomenon as prompt programming. To this end, we develop an understanding of prompt programming using Straussian grounded theory through interviews with 20 developers engaged in prompt development across a variety of contexts, models, domains, and prompt complexities. Through this study, we contribute 14 observations about prompt programming. For example, rather than building mental models of code, prompt programmers develop mental models of the foundation model's (FM's) behavior on the prompt and its unique qualities by interacting with the model. While prior research has shown that experts have well-formed mental models, we find that prompt programmers who have developed dozens of prompts, each with many iterations, still struggle to develop reliable mental models. This contributes to a rapid and unsystematic development process. Taken together, our observations indicate that prompt programming is significantly different from traditional software development, motivating the creation of tools to support prompt programming. Our findings have implications for software engineering practitioners, educators, and researchers.
Interview with Jerone Andrews: a framework towards evaluating diversity in datasets
Jerone Andrews, Dora Zhao, Orestis Papakyriakopoulos and Alice Xiang won a best paper award at the International Conference on Machine Learning (ICML) for their position paper Measure Dataset Diversity. We spoke to Jerone about the team's methodology, and how they developed a framework for conceptualising, operationalising, and evaluating diversity in machine learning datasets. In our paper, we propose using measurement theory from the social sciences as a framework to improve the collection and evaluation of diverse machine learning datasets. Measurement theory offers a systematic and scientifically grounded approach to developing precise numerical representations of complex and abstract concepts, making it particularly suitable for tasks like conceptualising, operationalising, and evaluating qualities such as diversity in datasets. This framework can also be applied to other constructs like bias or difficulty.
The 1st InterAI Workshop: Interactive AI for Human-centered Robotics
Zhang, Yuchong, Yadollahi, Elmira, Ma, Yong, Fu, Di, Leite, Iolanda, Kragic, Danica
... and challenges in human-centered interactive artificial intelligence (AI) within the field of human-robot interaction (HRI). It will focus on the integration of AI technologies that enhance human-robot collaboration, ensuring these interactions are intuitive, efficient, and tailored to human needs and behaviors [1]. Her research, at the intersection of machine learning and human-robot interaction, explores two broad questions through an interdisciplinary lens: how to learn human behavior from multimodal data, and how to transfer this knowledge to robots for learning, action, and interaction. Her work has been supported ...
AACessTalk: Fostering Communication between Minimally Verbal Autistic Children and Parents with Contextual Guidance and Card Recommendation
Choi, Dasom, Park, SoHyun, Lee, Kyungah, Hong, Hwajung, Kim, Young-Ho
As minimally verbal autistic (MVA) children communicate with parents through few words and nonverbal cues, parents often struggle to encourage their children to express subtle emotions and needs and to grasp their nuanced signals. We present AACessTalk, a tablet-based, AI-mediated communication system that facilitates meaningful exchanges between an MVA child and a parent. AACessTalk provides real-time guides to the parent to engage the child in conversation and, in turn, recommends contextual vocabulary cards to the child. Through a two-week deployment study with 11 MVA child-parent dyads, we examine how AACessTalk fosters everyday conversation practice and mutual engagement. Our findings show high engagement from all dyads, leading to increased frequency of conversation and turn-taking. AACessTalk also encouraged parents to explore their own interaction strategies and empowered the children to have more agency in communication. We discuss the implications of designing technologies for balanced communication dynamics in parent-MVA child interaction.