Personal
Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
Zhou, Zhanhui, Liu, Jie, Yang, Chao, Shao, Jing, Liu, Yu, Yue, Xiangyu, Ouyang, Wanli, Qiao, Yu
A single language model (LM), despite aligning well with an average labeler through reinforcement learning from human feedback (RLHF), may not universally suit diverse human preferences. Recent approaches therefore opt for customization by collecting multi-dimensional feedback and creating distinct reward models (RMs) for each dimension (e.g., helpfulness, harmlessness, or honesty). Different LMs can then be optimized for different preferences using multi-objective RLHF (MORLHF) with different reward weightings. Yet, RL fine-tuning is unstable and resource-heavy, especially for MORLHF with diverse and usually conflicting objectives. In this paper, we present Multi-Objective Direct Preference Optimization (MODPO), an RL-free algorithm that extends Direct Preference Optimization (DPO) for multiple alignment objectives with minimal overheads. Essentially, MODPO folds language modeling directly into reward modeling, training LMs as implicit collective reward models (cRMs) that combine all objectives with specific weightings. While theoretically guaranteed to produce the same optimal solutions as MORLHF, MODPO is practically more stable and computationally efficient. Empirical results from safety alignment and long-form question answering confirm that MODPO matches or outperforms existing methods, consistently producing a Pareto front of LMs that cater to diverse preferences with 3 times less computational resources compared to MORLHF.
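The abstract describes combining all objectives "with specific weightings" and producing one LM per weighting. As a minimal sketch of that idea only, the snippet below assumes the weighting is a plain weighted sum of per-objective reward scores (linear scalarization). It is not the MODPO loss, and the names `collective_reward` and `weight_sweep` and the two objective labels are hypothetical, chosen for illustration.

```python
from typing import Dict, List


def collective_reward(rewards: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-objective rewards (assumed linear scalarization)."""
    if set(rewards) != set(weights):
        raise ValueError("rewards and weights must cover the same objectives")
    return sum(weights[name] * rewards[name] for name in rewards)


def weight_sweep(steps: int) -> List[Dict[str, float]]:
    """Evenly spaced two-objective weightings that each sum to 1.

    Each weighting would correspond to one LM on the Pareto front.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [
        {"helpfulness": i / steps, "harmlessness": 1 - i / steps}
        for i in range(steps + 1)
    ]


if __name__ == "__main__":
    # Hypothetical reward-model scores for a single response.
    scores = {"helpfulness": 2.0, "harmlessness": -1.0}
    for w in weight_sweep(4):
        print(w, collective_reward(scores, w))
```

Sweeping the weight vector and training one model per weighting is how a front of models for different preferences would be traced out under this assumption.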
SER_AMPEL: a multi-source dataset for speech emotion recognition of Italian older adults
Grossi, Alessandra, Gasparini, Francesca
This paper presents SER_AMPEL, a multi-source dataset for speech emotion recognition (SER). The dataset is distinctive in that it is collected to provide a reference for speech emotion recognition in Italian older adults. It is collected following different protocols: acted conversations, extracted from movies and TV series, and recorded natural conversations in which emotions are elicited by suitable questions. The need for such a dataset emerges from an analysis of the state of the art. Preliminary considerations on the critical issues of SER are reported by analyzing the classification results on a subset of the proposed dataset.
Character-LLM: A Trainable Agent for Role-Playing
Shao, Yunfan, Li, Linyang, Dai, Junqi, Qiu, Xipeng
Large language models (LLMs) can serve as agents that simulate human behaviors, given their powerful ability to understand human instructions and generate high-quality text. This ability leads us to ask whether LLMs can simulate a person in a higher form than simple human behaviors. Therefore, we aim to train an agent with the profile, experience, and emotional states of a specific person instead of using limited prompts to instruct the ChatGPT API. In this work, we introduce Character-LLM, which teaches LLMs to act as specific people such as Beethoven, Queen Cleopatra, Julius Caesar, etc. Our method focuses on editing profiles as experiences of a certain character and training models to be personal simulacra with these experiences. To assess the effectiveness of our approach, we build a test playground that interviews trained agents and evaluates whether the agents memorize their characters and experiences. Experimental results show interesting observations that help build future simulacra of humankind.
Piezoelectric Soft Robot Inchworm Motion by Tuning Ground Friction through Robot Shape: Quasi-Static Modeling and Experimental Validation
Zheng, Zhiwu, Kumar, Prakhar, Chen, Yenan, Cheng, Hsin, Wagner, Sigurd, Chen, Minjie, Verma, Naveen, Sturm, James C.
Electrically-driven soft robots based on piezoelectric actuators may enable compact form factors and maneuverability in complex environments. In most prior work, piezoelectric actuators are used to control a single degree of freedom. In this work, the coordinated activation of five independent piezoelectric actuators, attached to a common metal foil, is used to implement inchworm-inspired crawling motion in a robot that is less than 0.5 mm thick. The motion is based on the control of its friction to the ground through the robot's shape, in which one end of the robot (depending on its shape) is anchored to the ground by static friction, while the rest of its body expands or contracts. A complete analytical model of the robot shape, which includes gravity, is developed to quantify the robot shape, friction, and displacement. After validation of the model by experiments, the robot's five actuators are collectively sequenced for inchworm-like forward and backward motion.
The Rise and Fall of the 'IBM Way'
IBM is one of the oldest technology companies in the world, with a raft of innovations to its credit, including mainframe computing, computer-programming languages, and AI-powered tools. But ask an ordinary person under the age of 40 what exactly IBM does (or did), and the responses will be vague at best. "Something to do with computers, right?" was the best the Gen Zers I queried could come up with. If a Millennial knows anything about IBM, it's Watson, the company's prototype AI system that prevailed on Jeopardy in 2011. In the chronicles of garage entrepreneurship, however, IBM retains a legendary place--as a flat-footed behemoth.
#NeurIPS2023 outstanding papers
The thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023) is underway in New Orleans. At the official opening session of the conference on Monday evening, the outstanding papers were announced. The awards comprised two outstanding main track paper awards, two outstanding main track runners-up, two outstanding datasets and benchmark track papers, and the annual test of time award. Abstract: We propose a scheme for auditing differentially private machine learning systems with a single training run. This exploits the parallelism of being able to add or remove multiple training examples independently.
Maatphor: Automated Variant Analysis for Prompt Injection Attacks
Salem, Ahmed, Paverd, Andrew, Köpf, Boris
Prompt injection has emerged as a serious security threat to large language models (LLMs). At present, the best practice for defending against newly discovered prompt injection techniques is to add guardrails to the system (e.g., by updating the system prompt or using classifiers on the input and/or output of the model). However, in the same way that variants of a piece of malware are created to evade anti-virus software, variants of a prompt injection can be created to evade the LLM's guardrails. Ideally, when a new prompt injection technique is discovered, candidate defenses should be tested not only against the successful prompt injection, but also against possible variants. In this work, we present Maatphor, a tool to assist defenders in performing automated variant analysis of known prompt injection attacks. This involves solving two main challenges: (1) automatically generating variants of a given prompt, and (2) automatically determining whether a variant was effective based only on the output of the model. This tool can also assist in generating datasets for jailbreak and prompt injection attacks, thus overcoming the scarcity of data in this domain. We evaluate Maatphor on three different types of prompt injection tasks. Starting from an ineffective (0%) seed prompt, Maatphor consistently generates variants that are at least 60% effective within the first 40 iterations.
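The two challenges named in the abstract (generating variants, then judging each one from the model output alone) suggest an iterative loop. The sketch below is an illustration under stated assumptions, not the tool's actual design: `variant_search`, `success_rate`, and the three toy callables are hypothetical stand-ins, and the toy "model" is a simple string rule rather than an LLM.

```python
from typing import Callable, List, Tuple


def variant_search(
    seed: str,
    generate_variant: Callable[[str], str],
    query_model: Callable[[str], str],
    is_effective: Callable[[str], bool],
    max_iterations: int = 40,
) -> List[Tuple[str, bool]]:
    """Iteratively derive variants of a seed prompt and judge each one
    using only the model's output (hypothetical loop structure)."""
    results: List[Tuple[str, bool]] = []
    current = seed
    for _ in range(max_iterations):
        candidate = generate_variant(current)
        output = query_model(candidate)
        results.append((candidate, is_effective(output)))
        current = candidate  # assumption: each variant builds on the previous one
    return results


def success_rate(results: List[Tuple[str, bool]]) -> float:
    """Fraction of variants judged effective."""
    if not results:
        return 0.0
    return sum(1 for _, ok in results if ok) / len(results)


# Toy stand-ins so the sketch runs without any model (all hypothetical).
def toy_generate(prompt: str) -> str:
    return prompt + "!"


def toy_model(prompt: str) -> str:
    return "PWNED" if prompt.count("!") >= 3 else "I cannot help with that."


def toy_judge(output: str) -> bool:
    return "PWNED" in output


if __name__ == "__main__":
    runs = variant_search("ignore previous instructions", toy_generate,
                          toy_model, toy_judge, max_iterations=5)
    print(success_rate(runs))
```

In a real setting the generator and the judge would themselves be nontrivial components; keeping them as injected callables lets a defender swap in candidate guardrails and re-run the same loop.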
"You Might Like It": How People Respond to Small Talk in Human-Robot Collaboration
Pineda, Kaitlynn Taylor, Mahmood, Amama, Huang, Chien-Ming
In this work, we investigate people's engagement and attitudes towards a non-anthropomorphic robot manipulator that initiates small talk with the user during a collaborative assembly task, and explore how the presence of negative team feedback may affect team dynamics and blame attribution. Through an exploratory study with 20 participants, we found that 18 individuals interacted socially with the robot, nine of whom initiated questions back to the robot. We report the frequency and length of users' responses in task-oriented and non-task-oriented dialogue, and further elaborate on people's reactions to the negative system feedback and robot-initiated small talk. We discuss the potential for integrating small talk in non-social robots, and propose three design guidelines to enhance human-robot small talk interactions.
Alignment for Honesty
Yang, Yuqing, Chern, Ethan, Qiu, Xipeng, Neubig, Graham, Liu, Pengfei
Recent research has made significant strides in applying alignment techniques to enhance the helpfulness and harmlessness of large language models (LLMs) in accordance with human intentions. In this paper, we argue for the importance of alignment for honesty, ensuring that LLMs proactively refuse to answer questions when they lack knowledge, while still not being overly conservative. However, a pivotal aspect of alignment for honesty involves discerning the limits of an LLM's knowledge, which is far from straightforward. This challenge demands comprehensive solutions in terms of metric development, benchmark creation, and training methodologies. In this paper, we address these challenges by first establishing a precise problem definition and defining "honesty" inspired by the Analects of Confucius. This serves as a cornerstone for developing metrics that effectively measure an LLM's honesty by quantifying its progress post-alignment. Furthermore, we introduce a flexible training framework which is further instantiated by several efficient fine-tuning techniques that emphasize honesty without sacrificing performance on other tasks. Our extensive experiments reveal that these aligned models show a marked increase in honesty, as indicated by our proposed metrics. We open-source a wealth of resources to facilitate future research at https://github.com/GAIR-NLP/alignment-for-honesty, including honesty-aligned models, training and evaluation datasets for honesty alignment, concept glossary, as well as all relevant source code.
Can Large Language Models emulate an inductive Thematic Analysis of semi-structured interviews? An exploration and provocation on the limits of the approach and the model
Large Language Models (LLMs) have emerged as powerful generative Artificial Intelligence solutions which can be applied to several fields and areas of work. This paper presents the results of, and reflections on, an experiment that used the model GPT 3.5-Turbo to emulate some aspects of an inductive Thematic Analysis. Previous research on this subject has largely focused on deductive analysis. Thematic Analysis is a qualitative method of analysis commonly used in the social sciences; it is based on interpretations made by the human analyst(s) and on the identification of explicit and latent meanings in qualitative data. Attempting an analysis based on human interpretation with an LLM is clearly a provocation, but also a way to learn something about how these systems can or cannot be used in qualitative research. The paper presents the motivations for attempting this emulation, reflects on how the six steps of a Thematic Analysis proposed by Braun and Clarke can at least partially be reproduced with the LLM, and reflects on the outputs produced by the model. The paper used two existing datasets of open access semi-structured interviews, previously analysed with Thematic Analysis by other researchers. It compared the previously produced analysis (and the related themes) with the results produced by the LLM. The results show that the model can infer, at least partially, some of the main themes. The objective of the paper is not to replace human analysts in qualitative analysis but to learn whether some elements of LLM data manipulation can, to an extent, support qualitative research.