Summon a Demon and Bind it: A Grounded Theory of LLM Red Teaming in the Wild

arXiv.org Artificial Intelligence

Engaging in the deliberate generation of abnormal outputs from large language models (LLMs) by attacking them is a novel human activity. This paper presents a thorough exposition of how and why people perform such attacks. Using a formal qualitative methodology, we interviewed dozens of practitioners from a broad range of backgrounds, all contributors to this novel work of attempting to cause LLMs to fail. We relate and connect this activity between its practitioners' motivations and goals; the strategies and techniques they deploy; and the crucial role the community plays. As a result, this paper presents a grounded theory of how and why people attack large language models: LLM red teaming in the wild.


Direct Visual Servoing Based on Discrete Orthogonal Moments

arXiv.org Artificial Intelligence

This paper proposes a new approach to achieve direct visual servoing (DVS) based on discrete orthogonal moments (DOMs). DVS is performed in such a way that the extraction of geometric primitives, matching, and tracking steps in the conventional feature-based visual servoing pipeline can be bypassed. Although DVS enables highly precise positioning, it suffers from a limited convergence domain and poor robustness due to the extreme nonlinearity of the cost function to be minimized and the presence of redundant data between visual features. To tackle these issues, we propose a generic and augmented framework that considers DOMs as visual features. By using the Tchebichef, Krawtchouk, and Hahn moments as examples, we not only present the strategies for adaptively tuning the parameters and order of the visual features but also exhibit an analytical formulation of the associated interaction matrix. Simulations demonstrate the robustness and accuracy of our approach, as well as its advantages over the state-of-the-art. Real-world experiments have also been performed to validate the effectiveness of our approach.


Long-Horizon Dialogue Understanding for Role Identification in the Game of Avalon with Large Language Models

arXiv.org Artificial Intelligence

Deception and persuasion play a critical role in long-horizon dialogues between multiple parties, especially when the interests, goals, and motivations of the participants are not aligned. Such complex tasks pose challenges for current Large Language Models (LLMs), as deception and persuasion can easily mislead them, especially in long-horizon multi-party dialogues. To this end, we explore the game of Avalon: The Resistance, a social deduction game in which players must determine each other's hidden identities to complete their team's objective. We introduce an online testbed and a dataset containing 20 carefully collected and labeled games among human players that exhibit long-horizon deception in a cooperative-competitive setting. We discuss the capabilities of LLMs to utilize deceptive long-horizon conversations between six human players to determine each player's goal and motivation. In particular, we discuss the multimodal integration of the chat between the players and the game's state that grounds the conversation, providing further insights into the true player identities. We find that even current state-of-the-art LLMs do not reach human performance, making our dataset a compelling benchmark to investigate the decision-making and language-processing capabilities of LLMs. Our dataset and online testbed can be found at our project website: https://sstepput.github.io/Avalon-NLU/


Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations

arXiv.org Artificial Intelligence

Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks. However, many of the most important applications of language generation are interactive, where an agent has to talk to a person to reach a desired outcome. For example, a teacher might try to understand their student's current comprehension level to tailor their instruction accordingly, and a travel agent might ask questions of their customer to understand their preferences in order to recommend activities they might enjoy. LLMs trained with supervised fine-tuning or "single-step" RL, as with standard RLHF, might struggle with tasks that require such goal-directed behavior, since they are not trained to optimize for overall conversational outcomes after multiple turns of interaction. In this work, we explore a new method for adapting LLMs with RL for such goal-directed dialogue. Our key insight is that, though LLMs might not effectively solve goal-directed dialogue tasks out of the box, they can provide useful data for solving such tasks by simulating suboptimal but human-like behaviors. Given a textual description of a goal-directed dialogue task, we leverage LLMs to sample diverse synthetic rollouts of hypothetical in-domain human-human interactions. Our algorithm then utilizes this dataset with offline reinforcement learning to train an interactive conversational agent that can optimize goal-directed objectives over multiple turns. In effect, the LLM produces examples of possible interactions, and RL then processes these examples to learn to perform more optimal interactions. Empirically, we show that our proposed approach achieves state-of-the-art performance in various goal-directed dialogue tasks that include teaching and preference elicitation.


Green Resilience of Cyber-Physical Systems

arXiv.org Artificial Intelligence

A Cyber-Physical System (CPS) combines hardware and software components to perform real-time services. Maintaining the system's reliability is critical to the continuous delivery of these services. However, the CPS running environment is full of uncertainties and can easily lead to performance degradation. As a result, a recovery technique is needed to achieve resilience in the system, keeping in mind that this technique should be as green as possible. This early doctorate proposal suggests a game theory solution to achieve resilience and greenness in CPS. Game theory is known for its fast performance in decision-making, helping the system to choose what maximizes its payoffs. The proposed game model is described over a real-life collaborative artificial intelligence system (CAIS) that involves robots working with humans to achieve a common goal. It shows how the expected results of the system will achieve the resilience of CAIS with a minimized CO2 footprint.


Manipulation and Peer Mechanisms: A Survey

arXiv.org Artificial Intelligence

In peer mechanisms, the competitors for a prize also determine who wins. Each competitor may be asked to rank, grade, or nominate peers for the prize. Since the prize can be valuable, such as financial aid, course grades, or an award at a conference, competitors may be tempted to manipulate the mechanism. We survey approaches to prevent or discourage the manipulation of peer mechanisms. We conclude our survey by identifying several important research challenges.


Foreign survivors of brutal Hamas attack on Israel recall terror massacre : 'Everything was burning'

FOX News

JERUSALEM – For Mitchai Sarabon, a Thai fieldhand working on Kibbutz Alumim in southern Israel, Oct. 7 started like any other Saturday. His one day off a week, the 32-year-old said, he woke early and began doing his laundry. His friends – a mix of Thai migrant workers and Nepalese agricultural students – were also milling about the compound where they lived on the edge of the kibbutz, taking care of various personal tasks, when suddenly they heard gunshots. "Suddenly, I saw one of the Nepalese guys being shot, others ran to hide in a bomb shelter and then the terrorists arrived," Sarabon recounted to Fox News Digital in a video interview from his home in Udon Thani, Thailand, on Friday. "They threw a grenade inside, some of the people died instantly and others ran away, they were shot dead too."


AI pioneer Fei-Fei Li: 'I'm more concerned about the risks that are here and now'

The Guardian

Fei-Fei Li is a pioneer of modern artificial intelligence (AI). Her work provided a crucial ingredient – big data – for the deep learning breakthroughs that occurred in the early 2010s. Li's new memoir, The Worlds I See, tells her story of finding her calling at the vanguard of the AI revolution and charts the development of the field from the inside. Li, 47, is a professor of computer science at Stanford University, where she specialises in computer vision. She is also a founding co-director of Stanford's Institute for Human-Centered Artificial Intelligence (HAI), which focuses on AI research, education and policy to improve the human condition, and a founder of the nonprofit AI4ALL, which aims to increase the diversity of people building AI systems.


I Hate Watching My Smart, Articulate Friends Transform Into "Mommy" and "Daddy"

Slate

Care and Feeding is Slate's parenting advice column. Have a question for Care and Feeding? Submit it here or post it in the Slate Parenting Facebook group. I don't have kids but a number of my peers now have babies and toddlers, which means I've heard an awful lot of my smart, articulate friends talking about themselves in the third person, like Elmo. I understand why toddlers do this in their language acquisition journey (pronouns are hard!), but why on earth do my friends say, "Mommy loves you," and, "Mommy needs you to not touch that," when they are "Mommy"? Basically, is too much time with a toddler scrambling their brains and I'm within my rights to roll my eyes, or is there a real cognitive reason why my friends speak this way to their kids?


Evaluating Language Models for Mathematics through Interactions

arXiv.org Artificial Intelligence

There is much excitement about the opportunity to harness the power of large language models (LLMs) when building problem-solving assistants. However, the standard methodology of evaluating LLMs relies on static pairs of inputs and outputs, and is insufficient for making an informed decision about which LLMs can be sensibly used and under which assistive settings. Static assessment fails to account for the essential interactive element in LLM deployment, and therefore limits how we understand language model capabilities. We introduce CheckMate, an adaptable prototype platform for humans to interact with and evaluate LLMs. We conduct a study with CheckMate to evaluate three language models (InstructGPT, ChatGPT, and GPT-4) as assistants in proving undergraduate-level mathematics, with a mixed cohort of participants from undergraduate students to professors of mathematics. We release the resulting interaction and rating dataset, MathConverse. By analysing MathConverse, we derive a taxonomy of human behaviours and uncover that despite a generally positive correlation, there are notable instances of divergence between correctness and perceived helpfulness in LLM generations, amongst other findings. Further, we garner a more granular understanding of GPT-4 mathematical problem-solving through a series of case studies, contributed by expert mathematicians. We conclude with actionable takeaways for ML practitioners and mathematicians: models that communicate uncertainty, respond well to user corrections, and are more interpretable and concise may constitute better assistants. Interactive evaluation is a promising way to navigate the capability of these models; humans should be aware of language models' algebraic fallibility and discern where they are appropriate to use.